This bug is informational only: mysql-5.1, and possibly mysql-5.0 too, puts its relay log files in /var/run by default. If you don't specify an alternate location, the files end up there. This would normally not be a problem, but the nasty part is that the Gentoo-supplied /etc/init.d/bootmisc script cleans out /var/run. End result: broken replication after a forced reboot.
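For anyone else hitting this before upstream fixes it: the usual workaround is to pin the relay log paths explicitly in my.cnf so nothing ends up in /var/run. Roughly something like this in the [mysqld] section (the paths and the "hostname" prefix here are just placeholders, adjust to your own datadir):

[mysqld]
# Keep relay logs and their index/info files under datadir,
# not next to the pid file in /var/run.
relay-log           = /var/lib/mysql/hostname-relay-bin
relay-log-index     = /var/lib/mysql/hostname-relay-bin.index
relay-log-info-file = /var/lib/mysql/relay-log.info

Changing these on an existing slave normally means stopping the slave, moving the existing relay logs and restarting, so check the replication docs before doing it on a live setup.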
/etc/init.d/bootmisc doesn't clean up /var/run *subdirectories* at all. What kind of 'relay' files are you talking about here anyway?
Snippet from /etc/init.d/bootmisc:

ebegin "Cleaning /var/lock, /var/run"
rm -rf /var/run/console.lock /var/run/console/*
if [[ -z ${CDBOOT} ]] ; then
	#
	# Clean up any stale locks.
	#
	find /var/lock -type f -print0 | xargs -0 rm -f --
	#
	# Clean up /var/run and create /var/run/utmp so that we can login.
	#
	for x in $(find /var/run/ ! -type d ! -name utmp ! -name innd.pid ! -name random-seed) ; do
		local daemon=${x##*/}
		daemon=${daemon%*.pid}
		# Do not remove pidfiles of already running daemons
		if [[ -z $(ps --no-heading -C "${daemon}") ]] ; then
			if [[ -f ${x} || -L ${x} ]] ; then
				rm -f "${x}"
			fi
		fi
	done
fi

Looks pretty much like deleting to me. Relay logs are log files created by mysql in replication setups; they contain statements from the master server that still need to be executed on the slaves.
(In reply to comment #2)
> Looks pretty much like deleting to me.

Sure, it's deleting the files there, but not the subdirectories. I thought you were talking about socket/pid files, which are expected to be wiped or the service won't even start on reboot... We don't configure the location anywhere, so if upstream uses /var/run as the default location, you should complain to them, because it's a completely stupid place to put non-volatile stuff.

http://www.pathname.com/fhs/pub/fhs-2.3.html#VARRUNRUNTIMEVARIABLEDATA
<snip>
This directory contains system information data describing the system since it was booted. Files under this directory must be cleared (removed or truncated as appropriate) at the beginning of the boot process.
</snip>
I agree, but it bites you hard if you're not aware of it. Hence the way the bug was opened: "This bug is informational only". We're upgrading and have just been bitten by it, and are trying to save others the trouble. I'll complain very VERY loudly to $UPSTREAM as well.
Ok, I've traced the source of the upstream bug.

sql/log.cc:

445 const char *MYSQL_LOG::generate_name(const char *log_name,
446                                      const char *suffix,
447                                      bool strip_ext, char *buff)
448 {
449   if (!log_name || !log_name[0])
450   {
451     strmake(buff, pidfile_name, FN_REFLEN - strlen(suffix) - 1);
452     return (const char *)
453       fn_format(buff, buff, "", suffix, MYF(MY_REPLACE_EXT|MY_REPLACE_DIR));
454
455   }

If you don't provide a name for the binary log or relay log, it uses the pidfile name to create them. This goes totally against the documentation that says they are created in datadir.
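For the record, you can check where a running slave is actually putting these files with something like the following (assuming the mysql client can connect locally with enough privileges):

# Show the effective relay log and pid file locations on a running server.
mysql -e "SHOW VARIABLES LIKE 'relay_log%'; SHOW VARIABLES LIKE 'pid_file';"

If the relay_log value points somewhere under /var/run, that slave is affected, which matches what the fallback in the code above does.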
I filed an upstream bug about it.
(In reply to comment #6)
> I filed an upstream bug about it.

Would you happen to have a bug number? I haven't got round to filing one yet but would like to track the mysql bug's status.

I'd say this warrants a warning in conf.d/mysql, or an ewarn when installing. We're upgrading a large farm of mysql servers from 4.1 to 5.1 and got bitten by this when our DC provider had a short power dip. All slaves which didn't have their relay-log location defined lost their replication info, because the relay log was cleaned on reboot while the replication.info file was still around (that one does default to $DATADIR).
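As for the ewarn idea, something along these lines in the ebuild's pkg_postinst() would probably be enough (just a sketch, not actual ebuild code):

pkg_postinst() {
	ewarn "If you use replication, set relay-log, relay-log-index and"
	ewarn "relay-log-info-file explicitly in /etc/mysql/my.cnf."
	ewarn "Without them mysqld falls back to the pid file location in"
	ewarn "/var/run, which bootmisc wipes on every boot, breaking"
	ewarn "replication after a forced reboot."
}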
Umm, I put the upstream bug in the URL field here ;-) when I stated I had already filed it.
*** Bug 208324 has been marked as a duplicate of this bug. ***
No fix from upstream yet.
Still no resolution from upstream.
Upstream has fixed it in 6.0 only, and is not doing a backport to 5.0 or 5.1. Since that part of the code has changed a lot, it doesn't apply cleanly to them anyway. If you really need it backported, reopen the bug, and I can see about it.
It is fixed in the 5.1.34 builds. That version is actually very stable and could be considered for inclusion in portage. We're using it in production and will probably upgrade our 400+ database farm to that version in August/September. The actual bug was fixed in 5.1.24, I believe.