This bug is informational only: mysql-5.1, and possibly mysql-5.0 too, puts its relay log files in /var/run by default. If you don't specify an alternate location, the files end up there. This would normally not be a problem, but the nasty part is that the Gentoo-supplied /etc/init.d/bootmisc script cleans out /var/run. End result: broken replication after a forced reboot.
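For anyone else hitting this before upstream fixes it: the usual workaround is to pin the relay log paths explicitly in my.cnf so nothing ends up in /var/run. Roughly something like this in the [mysqld] section (the paths and the "hostname" prefix here are just placeholders, adjust to your own datadir):

[mysqld]
# Keep relay logs and their index/info files under datadir,
# not next to the pid file in /var/run.
relay-log           = /var/lib/mysql/hostname-relay-bin
relay-log-index     = /var/lib/mysql/hostname-relay-bin.index
relay-log-info-file = /var/lib/mysql/relay-log.info

Changing these on an existing slave normally means stopping the slave, moving the existing relay logs and restarting, so check the replication docs before doing it on a live setup.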
/etc/init.d/bootmisc doesn't clean up /var/run *subdirectories* at all. What kind of 'relay' files are you talking about here anyway?
Snippet from /etc/init.d/bootmisc:

ebegin "Cleaning /var/lock, /var/run"
rm -rf /var/run/console.lock /var/run/console/*
if [[ -z ${CDBOOT} ]] ; then
	#
	# Clean up any stale locks.
	#
	find /var/lock -type f -print0 | xargs -0 rm -f --
	#
	# Clean up /var/run and create /var/run/utmp so that we can login.
	#
	for x in $(find /var/run/ ! -type d ! -name utmp ! -name innd.pid ! -name random-seed) ; do
		local daemon=${x##*/}
		daemon=${daemon%*.pid}
		# Do not remove pidfiles of already running daemons
		if [[ -z $(ps --no-heading -C "${daemon}") ]] ; then
			if [[ -f ${x} || -L ${x} ]] ; then
				rm -f "${x}"
			fi
		fi
	done
fi

Looks pretty much like deleting to me. Relay logs are log files created by mysql in replication setups; they contain statements from the master server that still need to be executed on the slaves.
(In reply to comment #2)
> Looks pretty much like deleting to me.

Sure, it's deleting the files there, but not the subdirectories. I thought you were talking about socket/pid files, which are expected to be wiped or the service won't even start on reboot... We don't configure the location anywhere, so if upstream uses /var/run as the default location, you should complain to them, because it's a completely stupid place to put non-volatile stuff.

http://www.pathname.com/fhs/pub/fhs-2.3.html#VARRUNRUNTIMEVARIABLEDATA
<snip>
This directory contains system information data describing the system since it was booted. Files under this directory must be cleared (removed or truncated as appropriate) at the beginning of the boot process.
</snip>
I agree, but it bites you hard if you're not aware of it. Hence the way the bug was opened: "This bug is informational only". We're upgrading and have just been bitten by it, and are trying to save others the trouble. I'll complain very VERY loudly to $UPSTREAM as well.
Ok, I've traced the source of the upstream bug.

sql/log.cc:

445 const char *MYSQL_LOG::generate_name(const char *log_name,
446                                      const char *suffix,
447                                      bool strip_ext, char *buff)
448 {
449   if (!log_name || !log_name[0])
450   {
451     strmake(buff, pidfile_name, FN_REFLEN - strlen(suffix) - 1);
452     return (const char *)
453       fn_format(buff, buff, "", suffix, MYF(MY_REPLACE_EXT|MY_REPLACE_DIR));
454
455   }

If you don't provide a name for the binary log or relay log, it uses the pidfile name to create them. This goes totally against the documentation that says they are created in datadir.
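For the record, you can check where a running slave is actually putting these files with something like the following (assuming the mysql client can connect locally with enough privileges):

# Show the effective relay log and pid file locations on a running server.
mysql -e "SHOW VARIABLES LIKE 'relay_log%'; SHOW VARIABLES LIKE 'pid_file';"

If the relay_log value points somewhere under /var/run, that slave is affected, which matches what the fallback in the code above does.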
I filed an upstream bug about it.
(In reply to comment #6)
> I filed an upstream bug about it.

Would you happen to have a bug number? I haven't got round to filing one yet but would like to track the mysql bug's status.

I'd say this warrants a warning in conf.d/mysql, or an ewarn when installing. We're upgrading a large farm of mysql servers from 4.1 to 5.1 and got bitten by this when our DC provider had a short power dip. All slaves which didn't have their relay-log location defined lost their replication info, because the relay log was cleaned on reboot while the replication.info file was still around (that one does default to $DATADIR).
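As for the ewarn idea, something along these lines in the ebuild's pkg_postinst() would probably be enough (just a sketch, not actual ebuild code):

pkg_postinst() {
	ewarn "If you use replication, set relay-log, relay-log-index and"
	ewarn "relay-log-info-file explicitly in /etc/mysql/my.cnf."
	ewarn "Without them mysqld falls back to the pid file location in"
	ewarn "/var/run, which bootmisc wipes on every boot, breaking"
	ewarn "replication after a forced reboot."
}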
Umm, I put the upstream bug in the URL field here ;-) when I stated I had already filed it.
*** Bug 208324 has been marked as a duplicate of this bug. ***
No fix from upstream yet.
Still no resolution from upstream.
Upstream has fixed it in 6.0 only, and is not doing a backport to 5.0 or 5.1. Since that part of the code has changed a lot, it doesn't apply cleanly to them anyway. If you really need it backported, reopen the bug, and I can see about it.
It is fixed in the 5.1.34 builds. That version is actually very stable and could be considered for inclusion in portage. We're using it in production and will probably upgrade our 400+ database farm to that version in August/September. The actual bug was fixed in 5.1.24, I believe.