Bug 486556 - dev-db/postgresql: start up fails (9.1 and earlier), pg_timezone_name query overflows when /usr/share/zoneinfo/posix is a symlink to .
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: [OLD] Core system
Hardware: All Linux
Importance: Normal normal with 4 votes
Assignee: PgSQL Bugs
URL: http://www.postgresql.org/message-id/...
Whiteboard:
Keywords:
Duplicates: 488650
Depends on:
Blocks: 486532
Reported: 2013-09-30 08:58 UTC by Jens Rutschmann
Modified: 2015-04-08 17:43 UTC (History)
12 users (show)

See Also:
Package list:
Runtime testing required: ---


Attachments
fix for postgresql-9.1 (postgresql-9.1-timezone-fsloop.patch,1.06 KB, patch)
2015-04-07 18:33 UTC, Ian Stakenvicius (RETIRED)
fix for postgresql-9.2 through 9.4 (postgresql-9.3-timezone-fsloop.patch,695 bytes, patch)
2015-04-07 20:01 UTC, Ian Stakenvicius (RETIRED)

Description Jens Rutschmann 2013-09-30 08:58:02 UTC
The postgres server fails to start with the following error message in its log file when using sys-libs/timezone-data-2013d or later:
FATAL:  exceeded MAX_ALLOCATED_DESCS while trying to open directory "/usr/share/zoneinfo"

This is caused by the symlink in /usr/share/zoneinfo:
lrwxrwxrwx   1 root root     1 Sep 28 15:25 posix -> .

As soon as I delete the symlink the postgres server starts up normally.

It seems this symlink has been introduced with sys-libs/timezone-data-2013d.
When using 2013c there's a directory /usr/share/zoneinfo/posix instead and the postgres server starts up fine as well.
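To check whether a zoneinfo tree contains this kind of loop, something like the following sketch may help. It lists symlinks that resolve to their own parent directory or to one of its ancestors, which is the shape of `posix -> .`. The function name `find_loop_links` and the scratch tree are made up for illustration; nothing here comes from the timezone-data or postgresql packages.

```shell
# find_loop_links is a hypothetical helper: print every symlink under $1 that
# resolves to its own parent directory or to an ancestor of it.
find_loop_links() {
    # -type l lists symlinks without following them, so find cannot loop.
    find "$1" -type l | while IFS= read -r link; do
        [ -d "$link" ] || continue              # only directory symlinks can loop
        target=$(cd "$link" && pwd -P) || continue
        parent=$(cd "$(dirname "$link")" && pwd -P) || continue
        target=${target%/}                      # turn "/" into "" so it matches below
        case "$parent/" in
            "$target"/*) printf '%s\n' "$link" ;;
        esac
    done
}

# Demo on a scratch tree that mimics the reported layout.
demo=$(mktemp -d)
mkdir -p "$demo/zoneinfo/Europe"
: > "$demo/zoneinfo/Europe/Rome"
ln -s . "$demo/zoneinfo/posix"
found=$(find_loop_links "$demo/zoneinfo")
printf '%s\n' "$found"
```

On a live system the same function could be pointed at /usr/share/zoneinfo; an empty result would suggest that no self-referential directory symlink is present.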
Comment 1 Jeroen Roovers (RETIRED) gentoo-dev 2013-09-30 14:02:07 UTC
Bug #485720 comment #2 seems to mention this.
Comment 2 SpanKY gentoo-dev 2013-09-30 17:53:27 UTC
version 2013d does not have a symlink.  it is new to 2013f-r1.

that said, postgres is broken and needs fixing.
Comment 3 Aaron W. Swenson gentoo-dev 2013-10-16 18:02:46 UTC
Please check if the issue persists with the latest release (9.3.1, 9.2.5, 9.1.10, 9.0.14 and 8.4.18).
Comment 4 Erich Seifert 2013-10-16 20:49:04 UTC
I still see the same problem with dev-db/postgresql-server-9.1.10 and sys-libs/timezone-data-2013g.
Comment 5 Timo Gurr (RETIRED) gentoo-dev 2013-10-17 13:27:56 UTC
I hit the problem, too. Upstream BUG #8532: http://www.postgresql.org/message-id/E1VWO2i-0003Fk-TC@wrigleys.postgresql.org
Comment 6 Timo Gurr (RETIRED) gentoo-dev 2013-10-21 16:09:53 UTC
Upstream has answered:

http://www.postgresql.org/message-id/52652C9C.1080306@vmware.com

> In summary, I'd call this a packaging bug.

It would be great if you guys could carry the discussion further if needed as I lack the required knowledge.
Comment 7 Aaron W. Swenson gentoo-dev 2014-01-11 11:49:55 UTC
*** Bug 488650 has been marked as a duplicate of this bug. ***
Comment 8 Aaron W. Swenson gentoo-dev 2014-01-11 12:18:47 UTC
(In reply to Timo Gurr from comment #6)
> Upstreams has answered:
> 
> http://www.postgresql.org/message-id/52652C9C.1080306@vmware.com
> 
> > In summary, I'd call this a packaging bug.
> 
> It would be great if you guys could carry the discussion further if needed
> as I lack the required knowledge.

It looks like Gentoo is the one to blame for this problem, as we're reverting a change made 16 years ago, long before Gentoo became a recognized distro. Essentially, that puts the problem firmly on the individual packages that need the posix directory inside the timezone directory.

@toolchain: Why are we reverting that change?

However, PostgreSQL should be doing some kind of sanity check.
Comment 9 SpanKY gentoo-dev 2014-03-25 18:07:33 UTC
we're installing timezone-data the same way glibc has always been doing it.  they used zoneinfo/posix/ hence so do we.  it's also what other distros are using.

Debian:
https://packages.debian.org/sid/all/tzdata/filelist

Ubuntu:
http://packages.ubuntu.com/saucy/all/tzdata/filelist

Fedora:
mkdir zoneinfo/{,posix,right}
zic -y ./yearistype -d zoneinfo -L /dev/null -p America/New_York $FILES
zic -y ./yearistype -d zoneinfo/posix -L /dev/null $FILES
zic -y ./yearistype -d zoneinfo/right -L leapseconds $FILES

so no, Gentoo isn't doing anything unique here.
Comment 10 eroen 2014-03-25 22:26:35 UTC
(In reply to SpanKY from comment #9)
> we're installing timezone-data the same way glibc has always been doing it. 
> they used zoneinfo/posix/ hence so do we.  it's also what other distros are
> using.
> 
[snip]
> 
> so no, Gentoo isn't doing anything unique here.

eroen@falcon ~/test $ ls -l /usr/share/zoneinfo/posix
lrwxrwxrwx 1 root root 1 Mar 11 02:28 /usr/share/zoneinfo/posix -> .

I don't see this symlink mentioned in the package lists you linked. Possibly other distros do some magic to make it not be a symlink?
Comment 11 SpanKY gentoo-dev 2014-03-27 01:21:18 UTC
(In reply to eroen from comment #10)

the issue isn't whether posix/ is a symlink.  it's whether the data that is contained in there is in /usr/share/zoneinfo/posix or /usr/share/zoneinfo-posix.  Gentoo (and everyone else) uses the former as that is what glibc has historically done.

the symlink change btw is coming from upstream timezone-data.  it's just a good idea which is why Gentoo also has it.
Comment 12 Mike Smyth 2014-05-17 16:42:29 UTC
I'm running stable gentoo, and the recent (May 15) stabilization of timezone-data-2014a.ebuild broke my stable dev-db/postgresql-server-9.1.12 install. The server failed to start, with the message:

FATAL:  exceeded maxAllocatedDescs (16) while trying to open directory "/usr/share/zoneinfo"

Manually removing the /usr/share/zoneinfo/posix symlink (since emerge doesn't seem to clear out symbolic links) and then reverting to timezone-data-2013d fixed the problem.

I'd suggest that timezone-data-2014a shouldn't be marked stable until this issue gets resolved; you wouldn't expect this kind of breakage on stable gentoo.
Comment 13 Leho Kraav (:macmaN @lkraav) 2014-05-21 14:34:46 UTC
9.0 displays an even more obscure error message, "too many private dirs requested", which is explained on the postgres mailing list (the top Google result). Apparently only 9.1 has the patch, so 9.0 users like me are probably going to waste a few hours at minimum working out whether they broke something with the reboot or why postgres suddenly won't start.
Comment 14 Francesco Lamonica 2014-05-27 17:26:16 UTC
(In reply to SpanKY from comment #9)
> we're installing timezone-data the same way glibc has always been doing it. 
> they used zoneinfo/posix/ hence so do we.  it's also what other distros are
> using.
> 
> Debian:
> https://packages.debian.org/sid/all/tzdata/filelist
> 
> Ubuntu:
> http://packages.ubuntu.com/saucy/all/tzdata/filelist
> 
> Fedora:
> mkdir zoneinfo/{,posix,right}
> zic -y ./yearistype -d zoneinfo -L /dev/null -p America/New_York $FILES
> zic -y ./yearistype -d zoneinfo/posix -L /dev/null $FILES
> zic -y ./yearistype -d zoneinfo/right -L leapseconds $FILES
> 
> so no, Gentoo isn't doing anything unique here.

Actually Ubuntu (at least starting from 12.04 up to 14.04) has symlinks but *inside* posix dir; i.e. all the directories are real directories, the symlinks are inside e.g.
/usr/share/zoneinfo/posix/GMT -> ../GMT
and
/usr/share/zoneinfo/posix/Europe/Rome -> ../../Europe/Rome
Comment 15 Francesco Lamonica 2014-05-28 07:55:12 UTC
More about my previous comment:
I just checked CentOS 5 and Debian Lenny
they both have replicated (no symlinks) timezones info in the posix folder (that is a real folder and not a symlink)
So it appears that indeed Gentoo is doing something unique here :)
Comment 16 James Le Cuirot gentoo-dev 2014-05-28 08:13:26 UTC
(In reply to Francesco Lamonica from comment #15)
> More about my previous comment:
> I just checked CentOS 5 and Debian Lenny
> they both have replicated (no symlinks) timezones info in the posix folder
> (that is a real folder and not a symlink)
> So it appears that indeed Gentoo is doing something unique here :)

Mike has said that the symlink is not the issue, though I must admit that I'm not convinced. However, both CentOS 5 and Debian Lenny are ancient. Even CentOS 6 is showing its age so Fedora would be a better reference. Please state your arguments against more recent releases.
Comment 17 Francesco Lamonica 2014-05-28 08:16:09 UTC
(In reply to James Le Cuirot from comment #16)
> (In reply to Francesco Lamonica from comment #15)
> > More about my previous comment:
> > I just checked CentOS 5 and Debian Lenny
> > they both have replicated (no symlinks) timezones info in the posix folder
> > (that is a real folder and not a symlink)
> > So it appears that indeed Gentoo is doing something unique here :)
> 
> Mike has said that the symlink is not the issue, though I must admit that
> I'm not convinced. However, both CentOS 5 and Debian Lenny are ancient. Even
> CentOS 6 is showing its age so Fedora would be a better reference. Please
> state your arguments against more recent releases.

Hi James,
indeed CentOS 5 and Debian Lenny are ancient, but those are the only distros of that kind I had at hand (still in production use :) )
However, if you look at my first comment, I posted a reference to Ubuntu (from 12.04 to the latest 14.04), which uses symlinks, but inside the posix dir.
Comment 18 Francesco Lamonica 2014-05-28 08:52:25 UTC
Fedora 20 has the same structure as CentOS 5.
Comment 19 Lukas Turek 2014-05-28 10:13:30 UTC
Debian Wheezy has some symlinks in /usr/share/zoneinfo, but they only link to files, not directories, so they can't cause a loop.
Comment 20 SpanKY gentoo-dev 2014-06-03 15:44:50 UTC
(In reply to Francesco Lamonica from comment #14)

the complaint was about gentoo using /usr/share/zoneinfo/posix/ instead of /usr/share/zoneinfo-posix/.  as you've shown, Gentoo isn't the only one doing this.  once that path is accepted as not being incorrect, the fact that it's a symlink to . is irrelevant.

that other distros make it a dir and then symlink every single file inside of there in a bid to workaround broken packages is just that -- it's working around packages that are broken.

at this point, postgres is the only one that hasn't updated things to handle this scenario.
Comment 21 Aaron W. Swenson gentoo-dev 2014-06-05 16:27:46 UTC
Well, I expected to run into this problem, but I haven't. Please check the latest release of PostgreSQL to see if the issue persists. Upstream has updated the tz-data they're using to the 2014 edition (can't remember which in particular).
Comment 22 Leho Kraav (:macmaN @lkraav) 2014-06-05 16:28:46 UTC
This is an issue on 9.0 series only, as I wrote in comment 13
Comment 23 Francesco Lamonica 2014-06-05 16:48:47 UTC
Hi Leho,
actually the problem (at least for me) is present with 9.1 as well. I read on the postgresql mailing list that they check for recursion in 9.2 and 9.3 (unfortunately I am stuck with 9.1 for a project).

@SpanKY, I tend to disagree with your comment: only Ubuntu has symlinks to files, while CentOS and Fedora have real files (I don't know whether they are duplicates or not). The fact that the posix symlink points to . rather than to another directory is not irrelevant, since a symlink to another directory would not cause loop recursion, if I understood correctly what the problem with postgresql is.
This seems like the never-ending discussion about kernel/nvidia-drivers.
I tend to think that timezone-data should not have changed (if it breaks a stable package) or that 9.1 should be removed from stable.

those, of course, are just my 2c :)

(In reply to Leho Kraav (:macmaN @lkraav) from comment #22)
> This is an issue on 9.0 series only, as I wrote in comment 13
Comment 24 SpanKY gentoo-dev 2014-06-10 03:09:25 UTC
if it's fixed in >=pgsql-9.2, then i guess upgrade.  those are already stable.
Comment 25 sf 2014-06-10 14:15:19 UTC
Re comment #11 and comment #20:

The only thing relevant to this issue is the circular link. No one mentioned zoneinfo-posix at all.

Easy fix (in the live filesystem):

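cd /usr/share/zoneinfo
rm posix        # remove the self-referential symlink
mkdir posix
cd posix
ln -s ../* .    # link every entry of the parent directory into posix/
rm posix        # drop posix/posix -> ../posix, which would recreate the loop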

This could as well be done in the ebuild.

And please reopen this bug. It is not fixed. At least stable postgresql:8.4 does not work any more, and the upstream bug report indicates that stable postgresql:9.0 and stable postgresql:9.1 suffer from this issue, too.
Comment 26 Leho Kraav (:macmaN @lkraav) 2014-06-10 14:18:23 UTC
(In reply to SpanKY from comment #24)
> if it's fixed in >=pgsql-9.2, then i guess upgrade.  those are already
> stable.

I also felt this is not the optimal resolution. Upgrading PG is a db migration process that may not easily be done at any convenience or whatnot.

I'd be quite appreciative if all current stables found a way to justwork(tm)
Comment 27 Francesco Lamonica 2014-06-10 14:29:24 UTC
I agree with the last two comments:

switching from 9.1 to 9.2 or 9.3 is not an upgrade but a db migration which (at least for my setup) is not a feasible solution.

@spanKY: resolution reason should be WON'T FIX in this case :)
Comment 28 Fabio Bonfante 2014-06-11 09:24:18 UTC
(In reply to sf from comment #25) 
> And please reopen this bug. It is not fixed. At least stable postgresql:8.4
> does not work any more, and the upstream bug report indicates that stable
> postgresql:9.0 and stable postgresql:9.1 suffer from this issue, too.

Totally agree... and +1 for the easy fix!
Comment 29 SpanKY gentoo-dev 2014-06-15 06:05:07 UTC
(In reply to sf from comment #25)

read the full history of the bug

(In reply to Leho Kraav (:macmaN @lkraav) from comment #26)

if you don't need the symlink, then delete it until you get a chance to upgrade

i imagine if someone backported the change that went into pgsql-9.2, it'd be considered for applying to older versions
Comment 30 Navid Zamani 2014-08-02 13:58:07 UTC
This is still a problem. I just updated my system, and portage recreated the posix symlink, causing my postgresql to not come up again after a reboot.

A nightmare, since I was on a trip and my calendar and contacts were stored in that database.

I am forced to use 9.1 since anything newer won’t work with DAViCal currently.

Obviously since portage has root access, it will recreate that symlink every time, or complain in other ways when it isn’t what it expects.

So I really wish there was a nice permanent fix for 9.1 too. (Apart from installing a RSBAC system and disallowing root from creating that symlink.)
Comment 31 James Le Cuirot gentoo-dev 2014-08-02 19:48:33 UTC
chattr +i will prevent writes even by root. This might cause an error when emerging but at least you'll catch the change.
Comment 32 SpanKY gentoo-dev 2014-08-05 03:51:56 UTC
(In reply to Navid Zamani from comment #30)

you can use INSTALL_MASK and per-package env so that portage never installs that symlink ...
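A minimal sketch of what that could look like, assuming the usual Portage layout (an environment file under /etc/portage/env/ referenced from /etc/portage/package.env). The file name `no-zoneinfo-posix.conf` is an arbitrary choice, and the files are staged in a scratch directory rather than written to /etc/portage directly, so they can be reviewed first.

```shell
# Stage the two config files in a scratch directory for review.
# "no-zoneinfo-posix.conf" is a hypothetical name; any name works.
stage=$(mktemp -d)
mkdir -p "$stage/env"

# Per-package environment file: ask Portage not to install the symlink.
cat > "$stage/env/no-zoneinfo-posix.conf" <<'EOF'
INSTALL_MASK="${INSTALL_MASK} /usr/share/zoneinfo/posix"
EOF

# package.env entry: apply that environment file to timezone-data only.
cat > "$stage/package.env" <<'EOF'
sys-libs/timezone-data no-zoneinfo-posix.conf
EOF

echo "Review the files under $stage, then copy them into /etc/portage/"
```

If /etc/portage/package.env is a directory on the target system, the entry would go into a file inside it instead. A symlink that is already installed presumably still has to be removed by hand once; the mask only affects what later merges install.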
Comment 33 Fabio Bonfante 2014-08-21 11:19:35 UTC
I have to keep EOL versions of postgres, so I need to apply the quick fix at every upgrade/install of timezone-data. This works for me in /etc/portage/bashrc:

-----
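if [ "${EBUILD_PHASE}" == "postinst" ] && [ "${PN}" == "timezone-data" ];
then
        echo "zoneinfo posix compliance without breaking apps that loop in the /usr/share/zoneinfo folder (e.g. postgres-8.*)"
        # Only act if the cd succeeded and posix is still a symlink.
        if cd "${EROOT}/usr/share/zoneinfo" && [ -L posix ]; then
                rm posix
                mkdir posix
                cd posix
                ln -s ../* ./
                rm posix   # posix/posix -> ../posix would recreate the loop
        fi
fi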
Comment 34 Ian Stakenvicius (RETIRED) gentoo-dev 2015-04-07 16:34:38 UTC
I'm reopening this bug; it still exists.  Whether the fix is done in timezone-data (per bug 512880, which is currently also marked as 'cantfix') or through some sort of patch to postgresql, one way or another it needs to be fixed.

I am personally leaning towards the fix in timezone-data, as patching the postgresql server code to understand that it should not follow symlinks more than once seems rather daunting to me.  Note that, although setting 'timezone' and 'log_timezone' in postgresql.conf does allow postgresql's server to start, it doesn't actually fix the issue as a "select * from pg_timezone_names;" will still fail.
Comment 35 Ian Stakenvicius (RETIRED) gentoo-dev 2015-04-07 18:33:30 UTC
Created attachment 400770 [details, diff]
fix for postgresql-9.1

This patch is a possible solution to ensure postgresql doesn't continue to recurse when single-depth filesystem loops are present.  

In postgresql-9.1 there are two locations at issue, the first causes the startup failure and is located in "scan_available_timezones()".  The second is within "pg_tzenumerate_next()" and is what loads the timezone information into postgresql's pg_timezone_* tables.

Later versions of postgresql (only checked 9.3 so far) seem to only have issues with pg_tzenumerate_next().
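For readers who want to see the general idea outside of the C code, here is a rough shell sketch of a directory walk that refuses to descend into a directory it is already inside. It is not the attached patch and does not mirror how postgresql scans the tree; `walk_zones` is a made-up name, and the sketch assumes that no path contains a colon, since colons separate the list of visited directories.

```shell
# walk_zones is a hypothetical helper: list regular files under $1, skipping
# any directory whose physical path is already on the chain being visited.
# $2 is a colon-separated list of those physical paths (empty on first call).
walk_zones() {
    real=$(cd "$1" && pwd -P) || return 0
    case ":$2:" in
        *":$real:"*) return 0 ;;    # already on the current path: a loop
    esac
    for entry in "$1"/*; do
        if [ -d "$entry" ]; then
            # Recurse in a subshell so each level keeps its own variables.
            ( walk_zones "$entry" "$2:$real" )
        elif [ -f "$entry" ]; then
            printf '%s\n' "$entry"
        fi
    done
}

# Demo on a scratch tree with the same "posix -> ." layout.
demo=$(mktemp -d)
mkdir -p "$demo/zoneinfo/Europe"
: > "$demo/zoneinfo/UTC"
: > "$demo/zoneinfo/Europe/Rome"
ln -s . "$demo/zoneinfo/posix"
listing=$(walk_zones "$demo/zoneinfo")
printf '%s\n' "$listing"
```

In the demo the walk lists UTC and Europe/Rome once each and never enters posix/, because posix resolves to a directory that is already being scanned.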
Comment 36 Ian Stakenvicius (RETIRED) gentoo-dev 2015-04-07 20:01:44 UTC
Created attachment 400778 [details, diff]
fix for postgresql-9.2 through 9.4

Testing has confirmed that the second part of the earlier patch is both necessary and is a solution to the recursive-filesystem-loop issue we have in our timezone-data package, for all newer versions of postgresql.

I have attached that portion of the patch as a separate file, here.

On a related note, thoughts on adding an epatch_user line to src_prepare() in postgresql ebuilds??
Comment 37 Aaron W. Swenson gentoo-dev 2015-04-08 15:10:01 UTC
Thanks, Ian! I'll start working on this right away.

And, I have no objections to adding the line.
Comment 38 Aaron W. Swenson gentoo-dev 2015-04-08 17:43:39 UTC
Ian, please submit your patches upstream as well to pgsql-hackers. It's a simple fix they might like.

*postgresql-9.4.1-r1 (08 Apr 2015)
*postgresql-9.3.6-r1 (08 Apr 2015)
*postgresql-9.2.10-r1 (08 Apr 2015)
*postgresql-9.1.15-r1 (08 Apr 2015)
*postgresql-9.0.19-r1 (08 Apr 2015)

  08 Apr 2015; Aaron W. Swenson <titanofold@gentoo.org>
  +postgresql-9.0.19-r1.ebuild, +postgresql-9.1.15-r1.ebuild,
  +postgresql-9.2.10-r1.ebuild, +postgresql-9.3.6-r1.ebuild,
  +postgresql-9.4.1-r1.ebuild, postgresql-9999.ebuild,
  +files/postgresql-9.1-tz-dir-overflow.patch,
  +files/postgresql-9.2-9.4-tz-dir-overflow.patch:
  Fix bugs 486556, 534124, and 540288