Bug 202893

Summary: sys-cluster/heartbeat version bump to 2.1.3
Product: Gentoo Linux
Reporter: Bas Nedermeijer <bas>
Component: [OLD] Server
Assignee: Gentoo Cluster Team <cluster>
Status: RESOLVED FIXED
Severity: enhancement
CC: barzog, bug, dertobi123, mvolaski, rodrigo, web, wschlich, xarthisius
Priority: High
Version: unspecified
Hardware: All
OS: Linux
Whiteboard:
Package list:
Runtime testing required: ---
Attachments:
- Ebuild for heartbeat-2.1.1
- pacemaker 0.6.6 ebuild for compile with heartbeat 2.1.xx
- cluster-glue-9999.ebuild
- logd-init
- resource-agents-9999.ebuild
- heartbeat-9999.ebuild
- heartbeat-init-r2
- Bumping to the last stable version using 2.0.8 build
- fixing as-needed issues

Description Bas Nedermeijer 2007-12-20 21:05:54 UTC
There is a new version of heartbeat available. Ebuild follows (old one was used as a template)

Reproducible: Always
Comment 1 Bas Nedermeijer 2007-12-20 21:06:49 UTC
Created attachment 138999 [details]
Ebuild for heartbeat-2.1.1
Comment 2 Bas Nedermeijer 2007-12-20 21:54:22 UTC
(In reply to comment #1)
> Created an attachment (id=138999) [edit]
> Ebuild for heartbeat-2.1.1
> 

My bad, there is already a heartbeat-2.1.2 available.
Comment 3 Bas Nedermeijer 2007-12-23 21:30:08 UTC
(In reply to comment #2)
> (In reply to comment #1)
> > Created an attachment (id=138999) [edit]
> > Ebuild for heartbeat-2.1.1
> > 
> 
> My bad, there is already a heartbeat-2.1.2 available.
> 

And a day later, they released 2.1.3.
http://linux-ha.org/download/index.html#2.1.3

Same ebuild still works for me.
Comment 4 Ilya Volynets (RETIRED) gentoo-dev 2008-01-29 09:34:33 UTC
It seems that the management daemon links against the system libpe_status
when built with the ebuild in this bug, which causes problems on
upgrade.
Comment 5 Wolfram Schlich (RETIRED) gentoo-dev 2008-02-06 09:19:33 UTC
Guys, can you please try out and comment on my 2.1.3 ebuild?

http://overlays.gentoo.org/dev/wschlich/browser/testing/sys-cluster/heartbeat

Thanks!
Comment 6 El Goretto 2008-03-04 15:19:43 UTC
Wolfram, is there a thread on the Gentoo forum where we can send you feedback, or do you want it here?
In doubt, I'll just say that I tried the -r2 ebuild (what are the differences between the "normal", r1 and r2 ebuilds, btw? I'm not a dev, so I only saw the USE flag differences, the r2 not dealing with the gui flag, and some additional warnings).
Anyway, I hit a failed dependency, tcp-wrappers (I disable the tcpd USE flag system-wide), which I had to correct manually.
After that, it emerged correctly.

Thanks for your overlay.
Comment 7 Wolfram Schlich (RETIRED) gentoo-dev 2008-03-04 23:17:10 UTC
(In reply to comment #6)
> Wolfram, is there a thread on the gentoo forum do return you our feedback or do
> you want them here?

Well, I'm fine with the comments here -- not sure about the
ha-cluster herd though :)

> In doubt, just to say that I tried the -r2 ebuild (what are the differences
> between "normal", r1 and r2 ebuild, btw? I'm not a dev, so I only saw the
> useflag differences, the r2 not dealing with gui flag, and some additionnal
> warnings).

Ok, let me sum it up:

(I just deleted heartbeat-2.1.3.ebuild)

- heartbeat-2.1.3-r1.ebuild:
  this is the best version to use if you want to stay with the classical
  heartbeat+crm-all-in-one version for the time being.
  this is also what I recommend until pacemaker (the new CRM) matures
  as a dedicated project. give it another 1 or 2 months.
  if you want to experiment, try the newer ebuilds + pacemaker :)

- heartbeat-2.1.3-r2.ebuild
  this is the first version to support USE=-crm to be able to use
  pacemaker (the new CRM) instead of the builtin CRM.
  it excludes the gui and the management thingy (both are broken,
  please do not use them at all).
  if you want to use the classical heartbeat+crm-all-in-one version,
  you can as well emerge this with USE=crm.

- pacemaker-0.6.2.ebuild
  this is an ebuild for the "new CRM" from the pacemaker project.
  you probably want this if you emerged heartbeat-2.1.3-r2 with USE=-crm :)

- heartbeat-2.1.3_p15-r8.ebuild
  this is an ebuild for the SuSE SRPM heartbeat-2.1.3-15.10.src.rpm
  it does not support USE=crm at all, as crm has been removed from it.

- pacemaker-0.6.2_p11-r7.ebuild
  this is an ebuild for the SuSE SRPM pacemaker-heartbeat-0.6.2-11.7.src.rpm
  you probably want this if you emerged heartbeat-2.1.3_p15-r8 :)

All those ebuilds contain one or more important patches.

> Anyway, I met a failed dependency, tcp-wrappers (I disable the tcpd useflag
> system-wide) which I had to manually correct.
> After that, it's correctly emerged.

Thanks! Have to figure out how to fix this though...

> Thanks for your overlay.

You're welcome :)
Have you also tried the drbd-8.2 ebuilds?
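
For anyone following along: the heartbeat-2.1.3-r2 + pacemaker combination described above needs the crm flag disabled for heartbeat only. A minimal sketch of one way to do that, assuming /etc/portage/package.use is a plain file (on some systems it is a directory) and using a hypothetical helper name, set_pkg_use, that is not part of Portage:

```shell
# Hypothetical helper (not part of Portage): append "<atom> <flags>" to a
# package.use file unless that exact line is already present.
set_pkg_use() {
    file=$1
    atom=$2
    flags=$3
    line="$atom $flags"
    # -x: match the whole line, -F: fixed string, --: the line may start with "-"
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Example, to be run as root (path is an assumption, see above):
#   set_pkg_use /etc/portage/package.use sys-cluster/heartbeat -crm
```

After that, emerging sys-cluster/heartbeat and sys-cluster/pacemaker from the overlay should pick up the setting.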
Comment 8 Stefan Behte (RETIRED) gentoo-dev Security 2008-04-14 13:16:40 UTC
I'd also like to see this in portage, was any progress made lately?
Comment 9 El Goretto 2008-04-23 07:55:16 UTC
(In reply to comment #7)

> - heartbeat-2.1.3-r2.ebuild
>   this is the first version to support USE=-crm to be able to use
>   pacemaker (the new CRM) instead of the builtin CRM.
>   it excludes the gui and the management thingy (both are broken,
>   please do not use them at all).
>   if you want to use the classical heartbeat+crm-all-in-one version,
>   you can as well emerge this with USE=crm.
> 
> - pacemaker-0.6.2.ebuild
>   this is an ebuild for the "new CRM" from the pacemaker project.
>   you probably want this if you emerged heartbeat-2.1.3-r2 with USE=-crm :)

Thanks for your explanations, I chose this combo (it's a testing cluster... almost :)).

> Have you also tried the drbd-8.2 ebuilds?
Sorry, I don't plan to use DRBD for this system.


Anyway, I went further and noticed a problem with this heartbeat in CRM-enabled mode (with pacemaker in my case). The CIB process complains that:
# /usr/lib/heartbeat/cib
/usr/lib/heartbeat/cib: error while loading shared libraries: libgnutls.so.13: cannot open shared object file: No such file or directory
(the symptom was that the node was rebooted when the initdead time was reached).
I first tried the very bad and dirty trick (ln -s.... I know ^^) to see what happens, and:
# /usr/lib/heartbeat/cib
/usr/lib/heartbeat/cib: /usr/lib/libgnutls.so.13: version `GNUTLS_1_3' not found (required by /usr/lib/heartbeat/cib)

It seems we need the 2.0.4 version of gnutls (tested after that, still available in portage), and not 2.2.2, which is the newest stable version available in portage and which I was using of course.

Now my first node is up (without resources, but with the CRM OK), I can clone it and go on...

See you later, if I notice something valuable concerning the ebuild during the next steps.
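
A quick way to spot this kind of breakage before the node gets rebooted is to ask ldd which libraries a binary cannot resolve. The sketch below is only an illustration; the helper name missing_libs is made up, and the binary path is the one from this comment:

```shell
# Hypothetical helper: read ldd output on stdin and print the names of
# shared libraries that the dynamic linker could not find.
missing_libs() {
    awk '/=> not found/ { print $1 }'
}

# Check the CIB binary from this comment, if it exists on this machine.
if [ -x /usr/lib/heartbeat/cib ]; then
    ldd /usr/lib/heartbeat/cib | missing_libs
fi
```

On Gentoo, revdep-rebuild from gentoolkit is the usual tool for rebuilding packages whose libraries went away after an upgrade, which is cleaner than symlinking the old soname.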
Comment 10 Dean Hall 2008-06-18 17:13:52 UTC
Any progress on this? It's June, and we're still stuck with rather old versions in portage.
Comment 11 El Goretto 2008-07-23 13:58:05 UTC
(In reply to comment #10)
> Any progress on this? It's June, and we're still stuck with rather old versions
> in portage. 

Yes, but I don't think it's a bad thing...
From what I've seen by now with my (now) 3-node 2.1.3 cluster+pacemaker (ebuild taken from Wolfram Schlich's overlay), I wouldn't recommend these "recent" versions for anything other than playing...
HB behaviour is strange (some weird interaction with the Gentoo resource init scripts when doing a HB restart), and I got a HB segfault on a node that wasn't running any resource... Ouch.
Comment 12 Wolfram Schlich (RETIRED) gentoo-dev 2008-07-23 16:05:09 UTC
(In reply to comment #11)
> (In reply to comment #10)
> > Any progress on this? It's June, and we're still stuck with rather old versions
> > in portage. 
> 
> Yes, but I don't think it's a bad thing...
> From what I've seen by now with my (now) 3 nodes 2.1.3 cluster+pacemaker
> (ebuild taken from Wolfram Schlich's overlay), I wouldn't recommend these
> "recent" versions for anything else than playing...
> HB behaviour is strange (some weird interaction with gentoo ressources init
> script when doing a HB restart), and I got a HB segfault on a node that wasn't
> running any ressource... Ouch.

Interesting.

I'm running 2.1.3-r3 from my overlay on 2 production clusters
(with USE=crm and without pacemaker) without problems (failover
tests were successful).

I'm now trying 2.1.3-r3 + pacemaker-0.6.5. I've already experienced
some strange behavior after a previously STONITHed node came back up
(all resources were stopped and started again on the same node),
but I've not yet talked to beekhof and dejanm about that...

Also, you should NOT use any Gentoo init scripts with Heartbeat!
Gentoo init scripts are simply incompatible with Heartbeat.
Instead, you might want to try those from my heartbeat-scripts
package from my testing overlay -- it contains OCF resource agents
for fcron, openssh, apache2, mysql, samba, postfix, avmailgate,
dovecot and STONITH plugins for IPMI-over-LAN and USB power outlet
devices along with some examples.
If you need an OCF resource agent for any other service/daemon,
please let me know! I'm glad to either work out one for you or
incorporate one you create based on one of those already in my
package :)
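
To illustrate what an OCF resource agent looks like compared to an init script, here is a minimal sketch. It is not taken from the heartbeat-scripts package: the daemon name mydaemon, its paths, and the ra_* function names are hypothetical, and the daemon is assumed to write its own pidfile. Only the exit codes follow the OCF convention. A real agent also has to implement the meta-data action and wait for the process to actually exit on stop.

```shell
#!/bin/sh
# Sketch of an OCF-style resource agent for a hypothetical daemon "mydaemon".
# Exit codes as defined by the OCF resource agent convention.
OCF_SUCCESS=0
OCF_ERR_GENERIC=1
OCF_ERR_UNIMPLEMENTED=3
OCF_NOT_RUNNING=7

# Hypothetical paths; both can be overridden from the environment.
: "${MYDAEMON_BIN:=/usr/sbin/mydaemon}"
: "${MYDAEMON_PID:=/var/run/mydaemon.pid}"

ra_monitor() {
    # Running only if the pidfile is readable and names a live process.
    [ -r "$MYDAEMON_PID" ] || return $OCF_NOT_RUNNING
    kill -0 "$(cat "$MYDAEMON_PID")" 2>/dev/null || return $OCF_NOT_RUNNING
    return $OCF_SUCCESS
}

ra_start() {
    # start must be idempotent: succeed if the daemon is already up.
    ra_monitor && return $OCF_SUCCESS
    "$MYDAEMON_BIN" || return $OCF_ERR_GENERIC
    ra_monitor || return $OCF_ERR_GENERIC
    return $OCF_SUCCESS
}

ra_stop() {
    # stop must be idempotent: succeed if the daemon is already down.
    ra_monitor || return $OCF_SUCCESS
    kill "$(cat "$MYDAEMON_PID")" || return $OCF_ERR_GENERIC
    return $OCF_SUCCESS
}

ra_main() {
    case "$1" in
        start)   ra_start ;;
        stop)    ra_stop ;;
        monitor) ra_monitor ;;
        *)       return $OCF_ERR_UNIMPLEMENTED ;;
    esac
}

# In an installed agent the last lines would dispatch the requested action:
#   ra_main "$@"
#   exit $?
```

The important difference from a Gentoo init script is that monitor reports "not running" with a distinct exit code (7) and that start and stop succeed when there is nothing to do, which is what the cluster manager relies on.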
Comment 13 Wolfram Schlich (RETIRED) gentoo-dev 2008-07-23 16:09:16 UTC
(In reply to comment #11)
> (In reply to comment #10)
> > Any progress on this? It's June, and we're still stuck with rather old versions
> > in portage. 
> 
> Yes, but I don't think it's a bad thing...
> From what I've seen by now with my (now) 3 nodes 2.1.3 cluster+pacemaker
> (ebuild taken from Wolfram Schlich's overlay), I wouldn't recommend these
> "recent" versions for anything else than playing...
> HB behaviour is strange (some weird interaction with gentoo ressources init
> script when doing a HB restart), and I got a HB segfault on a node that wasn't
> running any ressource... Ouch.

Also, have you tried pacemaker-0.6.5 from my overlay?!
Comment 14 El Goretto 2008-08-13 12:28:03 UTC
(In reply to comment #13)
> Also, have you tried pacemaker-0.6.5 from my overlay?!

Of course, as I said, everything comes from your overlay (thanks again for sharing your "playground" ^^)
Following your advice, I converted my "lsb" resources to OCF ones thanks to your agent scripts (apache2 and samba).
I'm still missing an OCF script for vsftp. I don't wanna appear like a lazy guy (I mean, in fact I AM, but not always ^^) but of course, if you have time to adapt one of yours (which look great, with just the options that are needed) for vsftp, I wouldn't mind :)
The main point is to have the "listen IP address" option too, just like the apache2 script, as the current lsb script is able to do "multiplexing" and start multiple vsftp instances at a time (each one has its own configuration file though).

From my short experience, I don't feel comfortable with the behaviour of the HB+pacemaker "cutting edge overlay" versions. For example, just now during HB start, an NFS resource in a group wasn't mounted, so the remaining NFS resources in that group weren't either (as expected), but resources in another group which should normally start AFTER the 1st group (with the help of constraints) were still started by HB...

I will try a pure HB setup without pacemaker, and see if my 3-node cluster gets rid of the "segfault problem after a while".
Comment 15 Wolfram Schlich (RETIRED) gentoo-dev 2008-08-13 13:34:07 UTC
(In reply to comment #14)
> (In reply to comment #13)
> > Also, have you tried pacemaker-0.6.5 from my overlay?!
> 
> Of course, as I said, everything comes from your overlay

Ok, you should *definitely* report a bug to the pacemaker
bugzilla then!
http://bugs.clusterlabs.org/cgi-bin/bugzilla/index.cgi
Alternatively, you might want to join #linux-ha on
irc.freenode.net -- Andrew Beekhof (Pacemaker/CRM guy)
and the other HA devs are hanging around there...

> Following your advice, I convert my "lsb" ressources
> to ocf ones thanks to your agents scripts (apache2 and samba).

So, do they work for you? :)

> I'm still missing one OCF script for vsftp. I don't wanna appear
> like a lazy guy (I mean, in fact I AM, but not always ^^) but of
> course, if you have time to adapt one of yours (which looks great
> with just what is needed as for options) for vsftp, I wouldn't mind :)
> The main point is to have the "IP adress listen" option too, just like apache2
> script, as the actual lsb script is able to do "multiplexing" and start
> multiple vsftp at a time (each one has his configuration file though).

I might consider that in the near future :)

> For my short experience, I don't feel comfortable with HB+pacemaker
> "cutting edge overlay" versions behaviour. For example, just now
> during HB start, a NFS ressource in a group wasn't mounted, so
> remaining NFS ressource in that group weren't (as expected) but
> ressources in another group which should normally start AFTER the
> 1st group (with help of constraints) where still started by HB...
> 
> I will try a pure HB setup without pacemaker, and see if my 3 nodes
> cluster get rids of the "segfault problem after a while".

Hmm. A 3 node cluster is a special case anyway... you should really
talk to the Linux HA devs on IRC or on the mailing list!

Good luck!
Comment 16 El Goretto 2008-08-18 09:35:57 UTC
(In reply to comment #15)
> (In reply to comment #14)
> > (In reply to comment #13)
> > > Also, have you tried pacemaker-0.6.5 from my overlay?!
> > Of course, as I said, everything comes from your overlay
> Ok, you should *definitely* report a bug to the pacemaker
> bugzilla then!
> http://bugs.clusterlabs.org/cgi-bin/bugzilla/index.cgi
> Alternatively, you might want to join #linux-ha on
> irc.freenode.net -- Andrew Beekhof (Pacemaker/CRM guy)
> and the other HA devs are hanging around there...

I already tried, but I can't: "Either no products have been defined to enter bugs against or you have not been given access to any. "
And (I hate proxies, ggr) I've no IRC access.

> > Following your advice, I convert my "lsb" ressources
> > to ocf ones thanks to your agents scripts (apache2 and samba). 
> So, do they work for you? :)
Yeah great, thank you :)

> > For my short experience, I don't feel comfortable with HB+pacemaker
> > "cutting edge overlay" versions behaviour. For example, just now
> > during HB start, a NFS ressource in a group wasn't mounted, so
> > remaining NFS ressource in that group weren't (as expected) but
> > ressources in another group which should normally start AFTER the
> > 1st group (with help of constraints) where still started by HB...
> > 
> > I will try a pure HB setup without pacemaker, and see if my 3 nodes
> > cluster get rids of the "segfault problem after a while".

More about my tests: I can confirm that a pure 2.1.3-r3 HB cluster is "stable". Mine is still up and running after 4 days, whereas the HB+PM cluster didn't last 24 hours in most cases. I sometimes observed lost or bad connectivity between nodes, but nothing that prevented HB from keeping resources up, even if they were sometimes moved.

So I definitely want to file a bug report for pacemaker... I'll try the mailing list (its huge activity frightens me sometimes... :/)

BTW: I'll probably test your DRBD ebuild, I'll keep you informed if I see something worth mentioning (something bad of course ;)).

Comment 17 Wolfram Schlich (RETIRED) gentoo-dev 2008-08-18 09:47:03 UTC
(In reply to comment #16)
> (In reply to comment #15)
> > (In reply to comment #14)
> > > (In reply to comment #13)
> > > > Also, have you tried pacemaker-0.6.5 from my overlay?!
> > > Of course, as I said, everything comes from your overlay
> > Ok, you should *definitely* report a bug to the pacemaker
> > bugzilla then!
> > http://bugs.clusterlabs.org/cgi-bin/bugzilla/index.cgi
> 
> I already tried, but I can't: "Either no products have been defined to enter
> bugs against or you have not been given access to any. "

Oh, seems that the Bugzilla at that URL is not being used
anymore: http://www.clusterlabs.org/#Bugzilla
Comment 18 Diego Elio Pettenò (RETIRED) gentoo-dev 2008-08-31 12:43:39 UTC
*** Bug 162162 has been marked as a duplicate of this bug. ***
Comment 19 Oleg Gawriloff 2008-12-01 13:55:47 UTC
Any plans to add pacemaker 1.0.1 and heartbeat 2.99.1 to your repository?
http://www.clusterlabs.org/mw/Releases
Comment 20 Oleg Gawriloff 2008-12-01 14:05:02 UTC
Also when using heartbeat 2.99.0:
falcon-cl2 ha.d # equery uses heartbeat
[ Searching for packages matching heartbeat... ]
[ Colour Code : set unset ]
[ Legend : Left column  (U) - USE flags from make.conf              ]
[        : Right column (I) - USE flags packages was installed with ]
[ Found these USE variables for sys-cluster/heartbeat-2.99.0_beta ]
 U I
 - - doc        : Adds extra documentation (API, Javadoc, etc)
 - - ldirectord : Adds support for ldiretord, use enabled because it has a lot of deps

The pacemaker ebuild needs a change in the section that complains that heartbeat must be built without the crm flag, because heartbeat 2.99.xx no longer has a crm flag.

 *
 * ERROR: sys-cluster/pacemaker-0.6.5 failed.
 * Call stack:
 *                ebuild.sh, line   49:  Called pkg_setup
 *   pacemaker-0.6.5.ebuild, line   36:  Called built_with_use 'pkg_setup' 'pkg_setup'
 *            eutils.eclass, line 1740:  Called die
 * The specific snippet of code:
 *                                      die)   die "$PKG does not actually support the $1 USE flag!";;
 *  The die message:
 *   sys-cluster/heartbeat-2.99.0_beta does not actually support the crm USE flag!
 *
 * If you need support, post the topmost build error, and the call stack if relevant.
 * A complete build log is located at '/var/tmp/portage/sys-cluster/pacemaker-0.6.5/temp/build.log'.
 * The ebuild environment file is located at '/var/tmp/portage/sys-cluster/pacemaker-0.6.5/temp/die.env'.

Comment 21 Oleg Gawriloff 2008-12-01 14:35:02 UTC
Created attachment 173969 [details]
pacemaker 0.6.6 ebuild for compile with heartbeat 2.1.xx

Here's a pacemaker 0.6.6 ebuild which compiles successfully with heartbeat 2.99.0_beta (0.6.5 complains about a missing -lstonithd).
It also includes an additional check on the heartbeat version.
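
For readers without the attachment, a version check of this kind could look roughly like the sketch below. This is not the code from the attached ebuild, only an illustration: it assumes the ebuild inherits eutils (for built_with_use), that the crm flag only exists on heartbeat releases before 2.99, and the error texts are made up.

```shell
# Sketch of a pkg_setup for a pacemaker ebuild (assumption: eutils is
# inherited, so built_with_use is available alongside has_version/die).
pkg_setup() {
    # Only ask about the crm flag on heartbeat versions that still have it;
    # on 2.99.x built_with_use would die because the flag does not exist.
    if has_version "<sys-cluster/heartbeat-2.99"; then
        if built_with_use sys-cluster/heartbeat crm; then
            eerror "sys-cluster/heartbeat was built with USE=crm."
            eerror "Rebuild it with USE=-crm before installing pacemaker."
            die "heartbeat must be built without the crm USE flag"
        fi
    fi
}
```

With heartbeat 2.99.0_beta installed, has_version returns false and the flag check is skipped entirely, which avoids the die shown in the previous comment.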
Comment 22 Peter Alfredsen (RETIRED) gentoo-dev 2009-02-15 21:05:25 UTC
*** Bug 244981 has been marked as a duplicate of this bug. ***
Comment 23 Vitali Kari 2009-03-11 13:04:43 UTC
> - heartbeat-2.1.3-r2.ebuild
>   this is the first version to support USE=-crm to be able to use
>   pacemaker (the new CRM) instead of the builtin CRM.
>   it excludes the gui and the management thingy (both are broken,
>   please do not use them at all).

What is broken in the gui?
Is there a way to use hb-gui (management) with heartbeat-2.1.4.ebuild?
Comment 24 Janos Pasztor 2009-06-05 15:02:12 UTC
Please note that you have to remove user id 65 (cluster) manually if you want to use the pacemaker ebuild.
Comment 25 INODE64 Sistemas 2009-08-28 16:29:53 UTC
Created attachment 202509 [details]
cluster-glue-9999.ebuild

New version of heartbeat 3.0.0 from mercurial
Comment 26 INODE64 Sistemas 2009-08-28 16:30:15 UTC
Created attachment 202510 [details]
logd-init
Comment 27 INODE64 Sistemas 2009-08-28 16:30:31 UTC
Created attachment 202512 [details]
resource-agents-9999.ebuild
Comment 28 INODE64 Sistemas 2009-08-28 16:30:48 UTC
Created attachment 202514 [details]
heartbeat-9999.ebuild
Comment 29 INODE64 Sistemas 2009-08-28 16:31:09 UTC
Created attachment 202516 [details]
heartbeat-init-r2
Comment 30 Kacper Kowalik (Xarthisius) (RETIRED) gentoo-dev 2009-09-18 21:25:01 UTC
Created attachment 204550 [details, diff]
Bumping to the last stable version using 2.0.8 build
Comment 31 Kacper Kowalik (Xarthisius) (RETIRED) gentoo-dev 2009-09-18 21:25:50 UTC
Created attachment 204551 [details, diff]
fixing as-needed issues
Comment 32 Patrick Lauer gentoo-dev 2009-11-07 21:20:14 UTC
*** Bug 190345 has been marked as a duplicate of this bug. ***
Comment 33 Ultrabug gentoo-dev 2011-03-14 09:47:12 UTC
fixed by bump to sys-cluster/heartbeat-2.0.8
Comment 34 Ultrabug gentoo-dev 2011-03-14 10:57:31 UTC
(In reply to comment #33)
> fixed by bump to sys-cluster/heartbeat-2.0.8

stupid me meant heartbeat-3.x