Bug 197664 - dev-db/firebird-2.1.0* hangs on compile, create_db empty.db
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: New packages
Hardware: AMD64 Linux
Importance: High normal
Assignee: William L. Thomson Jr. (RETIRED)
URL:
Whiteboard:
Keywords: InVCS
Depends on:
Blocks:
 
Reported: 2007-10-31 16:35 UTC by Rui Santos
Modified: 2008-07-24 20:17 UTC
CC List: 1 user

See Also:
Package list:
Runtime testing required: ---


Attachments
Ebuild log file (build.log, 22.75 KB, text/plain)
2007-10-31 16:36 UTC, Rui Santos
strace ../gen/firebird/bin/create_db empty.fdb on x86 (borked, 187.39 KB, text/plain)
2007-11-05 21:35 UTC, William L. Thomson Jr. (RETIRED)
strace create_db empty.db on amd64 ( previously compiled and installed create_db ) (working, 935.38 KB, text/plain)
2007-11-05 21:36 UTC, William L. Thomson Jr. (RETIRED)
tail of build.log with patch to cd before executing create_db empty.db (output, 3.47 KB, text/plain)
2007-11-17 16:43 UTC, William L. Thomson Jr. (RETIRED)
patch removing hardcoded optimize flags. (firebird-cflags-fix.patch, 625 bytes, patch)
2007-12-19 15:03 UTC, Andrzej Rybczak

Description Rui Santos 2007-10-31 16:35:58 UTC
firebird-2.1.0.16780_beta2 compiles but, right after it starts to install, it hangs indefinitely (4 hours) on the command:

rm -f empty.fdb
../gen/firebird/bin/create_db empty.fdb

Reproducible: Always

Steps to Reproduce:
1. Use emerge to install that version of firebird.
Comment 1 Rui Santos 2007-10-31 16:36:19 UTC
Created attachment 134823
Ebuild log file
Comment 2 Patrizio Bassi 2007-10-31 21:42:55 UTC
I have exactly the same issue.
Comment 3 Jakub Moc (RETIRED) gentoo-dev 2007-10-31 21:55:35 UTC
Try w/ MAKEOPTS="-j1" and report back, please.
Comment 4 Patrizio Bassi 2007-10-31 22:54:27 UTC
No luck :(
Comment 5 Jakub Moc (RETIRED) gentoo-dev 2007-11-01 07:47:31 UTC
OK, then reopen the bug ;)
Comment 6 William L. Thomson Jr. (RETIRED) gentoo-dev 2007-11-04 20:12:28 UTC
I am seeing this on x86 as well, despite not having any problems emerging it on amd64. I think it's due to the make -j options: the box it fails on is set to use distcc with -j5, while my amd64 box only uses -j2. I haven't confirmed that one way or the other. I have seen it get stuck on the empty_db, with the file not growing in size on disk, so I'm not sure what's causing it to hang at that point. Will comment again when I have further info.
Comment 7 Patrizio Bassi 2007-11-04 20:20:46 UTC
I used -j1 and it still fails.

The funny part is that you cannot kill it with a plain "kill"; you need kill -9.
To me it's not a jobs problem but a problem in the application itself.

The old firebird version worked flawlessly.
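
Since the stuck process reportedly ignores a plain kill, one way to keep it from tying up a build for hours is to give the call a time budget. A minimal sketch, assuming GNU coreutils timeout is available; the function name run_with_deadline and the 5-second grace period are illustrative choices, not anything taken from the ebuild:

```shell
# Hypothetical wrapper: run a command with a time budget instead of
# letting it hang the build indefinitely.
run_with_deadline() {
    local seconds=$1
    shift
    # timeout(1) sends SIGTERM at the deadline; -k 5 follows up with
    # SIGKILL five seconds later for processes that ignore SIGTERM.
    timeout -k 5 "$seconds" "$@"
}
```

For example, run_with_deadline 600 ../gen/firebird/bin/create_db empty.fdb would abandon the call after ten minutes. timeout reports status 124 when the command stops on the SIGTERM sent at the deadline.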
Comment 8 Rui Santos 2007-11-05 11:01:46 UTC
I can also confirm that the "-j1" option on MAKEOPTS did not work. I'm also getting this error. Could mean something:

config.status: src/include/gen/autoconfig.h is unchanged
./configure: line 36820: cd: extern/icu/source: No such file or directory
chmod: cannot access `runConfigureICU': No such file or directory
chmod: cannot access `install-sh': No such file or directory
./configure: line 36822: ./runConfigureICU: No such file or directory
Comment 9 William L. Thomson Jr. (RETIRED) gentoo-dev 2007-11-05 15:14:34 UTC
It's really odd. I can emerge without problems on amd64, but on x86 it routinely hangs at create_db empty.db. Changing make options and dropping distcc doesn't make any difference as others found out already.

Really not sure what's causing this. I don't believe I am doing anything different with the ebuild or patches than I was with 2.0.x. Nor do I know why it only occurs on x86; it seems like the problem should be the other way around.
Comment 10 William L. Thomson Jr. (RETIRED) gentoo-dev 2007-11-05 15:20:11 UTC
(In reply to comment #8)
>  I'm also getting this error. Could mean something:
> 
> config.status: src/include/gen/autoconfig.h is unchanged
> ./configure: line 36820: cd: extern/icu/source: No such file or directory
> chmod: cannot access `runConfigureICU': No such file or directory
> chmod: cannot access `install-sh': No such file or directory
> ./configure: line 36822: ./runConfigureICU: No such file or directory

While that stuff is not ideal, it doesn't seem to cause any harm or matter. It's likely due to me not patching/modifying configure properly. But the goal there is mostly to not use the bundled icu, which gets accomplished despite the above errors.

I confirmed I get the same errors on amd64, where I don't have any problems building or running 2.1.x. So not sure what's up with compiling on x86.

Comment 11 Patrizio Bassi 2007-11-05 15:45:14 UTC
Could it be due to this being "beta" software?
Maybe we should wait for the final release and tell upstream.

Can you mask this in the meantime?
Comment 12 William L. Thomson Jr. (RETIRED) gentoo-dev 2007-11-05 15:59:12 UTC
(In reply to comment #11)
> Could it be due to this being "beta" software?
> Maybe we should wait for the final release and tell upstream.
>
> Can you mask this in the meantime?

I will consider masking this, but it's in ~arch, where things are known to be unstable and so on, so it's kinda par for the course. Plus I would only mask on x86. I'd really prefer that more people run into it; maybe someone will know what's causing it or how to fix it.

As for masking, either don't run ~arch, or mask locally if it bugs ya :) As for taking it upstream, I likely will if I can't find a cause for this, at least to see whether upstream is aware of it. But I would like to find a bit more info before I go upstream, just in case it's not an upstream problem :)

Comment 13 William L. Thomson Jr. (RETIRED) gentoo-dev 2007-11-05 16:18:24 UTC
Ok, it looks like the empty.db is never actually created. I did a find on the work dir and found nothing. Then I moved to the dir that contains the create_db binary and executed create_db empty.db, which completed in a few milliseconds. So something must be off with the env, or with something during merge. Again, strange that it only occurs on x86. Still looking into this. But basically, when create_db is invoked by the build system it doesn't create the empty.db file, and I guess it hangs waiting on the file to be created or something before it exits.
Comment 14 Patrizio Bassi 2007-11-05 16:29:50 UTC
I don't know why it works for you, but I have an amd64 (Core 2 Duo) profile and it doesn't work.

So it's not just x86 but x86_64 as well.

As for the hang: it's not just a hang, it's an endless loop, since the CPU is fully used.

It's not waiting... it's burning something!
Comment 15 William L. Thomson Jr. (RETIRED) gentoo-dev 2007-11-05 17:39:48 UTC
(In reply to comment #14)
> I don't know why it works for you, but I have an amd64 (Core 2 Duo) profile
> and it doesn't work.
>

Odd, compiles and merges fine here all day long on single core Turion and Opteron procs. I don't have access to any dual cores to try. Are you running pure ~arch or mixing stable and unstable?

> So it's not just x86 but x86_64 as well.

Interesting

> As for the hang: it's not just a hang, it's an endless loop, since the CPU
> is fully used.
>
> It's not waiting... it's burning something!

Yeah, it's stuck in some sort of loop, but it's not writing to disk or doing anything other than tying up the CPU, and I can't explain why atm. Invoking the same create_db that's generated by the compile process directly works, so something in the build env must be throwing it off. I'm reviewing Makefile.refDatabases, which invokes the problem command.
Comment 16 Patrizio Bassi 2007-11-05 19:03:58 UTC
I'm at work now.

Can you try to strace it? Where's the loop?

Try without the sandbox as well.
Comment 17 William L. Thomson Jr. (RETIRED) gentoo-dev 2007-11-05 21:34:02 UTC
After HOURS of looking into this, running all kinds of tests, debugging output, etc., I can't for the life of me figure out what's going on. At one point, when it failed via make, I could still invoke it from the command line, so I assumed differences in env were causing that. But now the newly compiled binary that hung under make, and that I could previously invoke by hand, hangs for me on the command line too.

Others on amd64 tried it as well, on dual core and dual proc dual core (quad) systems: no hanging, and it emerged fine. I will see what I can do to replicate the hang on amd64 or in other places where it presently doesn't hang.

When it does hang, it literally does just that: no looping, etc. Here is a strace of the hanging create_db, newly compiled and still in the work dir, followed by the same command once compiled and installed on my amd64 box. The strace output just stops on the hung one.

My assumption/gut feeling is that this is exposing some bug in the core of Fb. Not sure, but it's trying to create a temp db, which never finishes. The file is created but doesn't grow in size, and the command never terminates, completes, or does anything beyond the hung point, at least according to strace.

Attached is a strace from x86 where it hangs, then one from an installed copy of create_db, where it succeeds both during build and on the command line. The latter strace is from a command-line invocation. Not sure if they help at all, but providing them anyway.
Comment 18 William L. Thomson Jr. (RETIRED) gentoo-dev 2007-11-05 21:35:00 UTC
Created attachment 135269
strace ../gen/firebird/bin/create_db empty.fdb on x86
Comment 19 William L. Thomson Jr. (RETIRED) gentoo-dev 2007-11-05 21:36:03 UTC
Created attachment 135271
strace create_db empty.db on amd64 ( previously compiled and installed create_db )
Comment 20 William L. Thomson Jr. (RETIRED) gentoo-dev 2007-11-06 05:35:21 UTC
Please sync and emerge the new -r1 version. While stabilizing the split firebird 2.0.3, another developer (cla) discovered that some flags were hard coded. So I bumped that ebuild and added a patch to drop the hard coded flags. In the process I did the same for 2.1.0 and revbumped that as well.

For the heck of it I tried to emerge the new -r1 version, and it finally emerged on x86, where it had been hanging all day long earlier. Hopefully this will resolve others' problems as well. If so, we can close this bug and chalk it up to ricer/funky hard coded flags in the build.
Comment 21 Rui Santos 2007-11-06 14:16:48 UTC
I've tried with the -r1 version on amd64, and found no apparent difference.

Still hangs on creating empty_db.
Comment 22 William L. Thomson Jr. (RETIRED) gentoo-dev 2007-11-12 18:20:18 UTC
Good/bad news. I am experiencing this again on my single core Turion, which used to compile it just fine :(. It might be a compiler flag they are setting that I did not drop with the last patch, something about a workaround for a bug in a past gcc version. That could be the cause of this issue, since the workaround targets a past gcc version.
Comment 23 William L. Thomson Jr. (RETIRED) gentoo-dev 2007-11-14 21:44:59 UTC
Ok, some further info on this. For some reason, which looks to be a side effect of splitting up firebird, if you move to ${S}/gen/firebird/bin and execute create_db, it succeeds. But if you execute it where make is, ${S}/gen, or anywhere other than ${S}/gen/firebird/bin, it hangs. I'm testing out a hack/patch to cd into that dir before executing create_db, and cd ../../ afterward. Will report back once I have more info there.
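
The cd-before-execution workaround described here can be sketched as a small shell helper. This is only an illustration, not the patch that was tested; the name run_in_dir is made up, and it uses a subshell in place of the explicit cd ../../ so the caller's directory is restored even if the command fails:

```shell
# Hypothetical helper: run a command from inside a given directory.
run_in_dir() {
    local dir=$1
    shift
    # The subshell keeps the cd from leaking into the caller, so no
    # "cd ../../" is needed afterward.
    ( cd "$dir" && "$@" )
}

# Assumed usage from ${S}/gen, mirroring the workaround above:
#   run_in_dir firebird/bin ./create_db empty.fdb
```

With this usage the database file would be created relative to firebird/bin, so any later build step that expects it in ${S}/gen would need an adjusted path.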
Comment 24 Patrizio Bassi 2007-11-14 21:57:55 UTC
sounds really strange!

let's see :)
Comment 25 William L. Thomson Jr. (RETIRED) gentoo-dev 2007-11-17 16:42:59 UTC
Ok, so changing directory before executing create_db seems to make all the difference in the world. However, it now fails at another point just past that :( Attaching tail of build.log.
Comment 26 William L. Thomson Jr. (RETIRED) gentoo-dev 2007-11-17 16:43:46 UTC
Created attachment 136180
tail of build.log with patch to cd before executing create_db empty.db
Comment 27 Patrizio Bassi 2007-11-18 10:49:46 UTC
I have no idea...

What I can say is this:
report the two issues upstream and hardmask this release in portage.
Comment 28 William L. Thomson Jr. (RETIRED) gentoo-dev 2007-11-19 01:20:32 UTC
Ok, I am losing it. That last attachment is executing create_db from the system's previously compiled and installed firebird, not the one that was just built in the sources.

Now, thinking about this a bit: it seems like when I removed their flags, I could no longer compile on amd64 but could on x86, whereas before I could not build on x86 and could on amd64. I will see about messing with the flags a bit and confirming this.

As for masking, I am considering it, but fewer people will unmask it and/or run into this once it's masked, so fewer are likely to help me find a solution and/or cause. I would take this upstream, but with what I am doing and without more info, I doubt that would provide a solution. The makefiles already have comments about that section having problems and being picky, so I'm not sure I would get much luv from them there anyway.

But if this carries on for a week or more, I will mask to save others the headache. Just hoping I can move things forward between now and then.
Comment 29 Patrizio Bassi 2007-12-04 22:05:23 UTC
dev-db/firebird-2.1.0.16780_beta2-r1 merged now on amd64!

Comment 30 William L. Thomson Jr. (RETIRED) gentoo-dev 2007-12-04 22:40:57 UTC
See, that I can't explain. It's still not merging here. WTF, I have no idea. I will be committing an updated version soon, after syncing changes with 2.0.3. It resolves a few bugs, but surely not this one :(. I have no idea why it compiles sometimes and not others. I used to be able to compile it on my ~amd64 machine, and not on ~x86. Now it's the opposite. Plus I have two ~x86 clones, and it built on one and not on the other.

It's really fun stuff.
Comment 31 Rui Santos 2007-12-05 09:43:00 UTC
dev-db/firebird-2.1.0.16780_beta2-r1 and dev-db/firebird-2.1.0.16780_beta2-r2 now compile and install successfully on my amd64.

Thanks for all your hard work.
Comment 32 William L. Thomson Jr. (RETIRED) gentoo-dev 2007-12-05 16:53:36 UTC
Great, still hanging here for me :( Going to leave this open for a while. After all it's a beta release, so won't be a stable candidate till a final official release. Till then I suspect others will still run into this. But glad others aren't experiencing it anymore.

Don't be surprised if you do start seeing it hang again :) I have not found the cause, nor done anything to solve it directly.

Thanks for the praise, much appreciated, in this fruitless work ;)
Comment 33 Andrzej Rybczak 2007-12-19 15:01:35 UTC
firebird-2.1.0.16780_beta2-r2 overrides flags once more, and compilation with gcc-4.2.2 hangs on creating empty.db. However, it works with gcc-4.3_alpha.
Comment 34 Andrzej Rybczak 2007-12-19 15:03:37 UTC
Created attachment 138890
patch removing hardcoded optimize flags.
Comment 35 William L. Thomson Jr. (RETIRED) gentoo-dev 2007-12-19 15:34:27 UTC
When I switched from fixing paths in the patch to doing it via sed, it seems I also left out the part of the patch that modified the CFLAGS.
http://sources.gentoo.org/viewcvs.py/gentoo-x86/dev-db/firebird/files/firebird-2.1.0.16780_beta2-deps-flags-libs-paths.patch?hideattic=0&view=markup

Thanks for the patch, but seems there was another line or two I modified, per the above. Will see about adding that back ASAP. Was trying to improve my sed_check function, clean up ebuilds, and maybe move the function to an eclass.

Although from my own testing I wasn't able to confirm that the hard coded CFLAGS or dropping them for user specified made any difference on the hanging. It hung for me both ways :(

Comment 36 William L. Thomson Jr. (RETIRED) gentoo-dev 2008-01-04 17:06:15 UTC
I don't believe I have committed the fix for 2.1 yet, i.e. an ebuild without the default upstream cflags that I accidentally added back after removing them :) Anyway, it just hung on my new dual core dual Opteron box, so the problem still exists to some extent. Now that I have a bit faster machine, I can expedite testing a bit more. Hopefully will resolve ASAP. At minimum I will see about committing an ebuild without the default cflags this weekend.
Comment 37 William L. Thomson Jr. (RETIRED) gentoo-dev 2008-01-04 17:35:36 UTC
Just committed -r3 that has a correct patch which removes the default hard coded cflags. Sorry for accidentally dropping that portion of the patch. Will see if this solves my hanging problems or not.
Comment 38 Dave 2008-01-05 17:11:46 UTC
(In reply to comment #37)
> Just committed -r3 that has a correct patch which removes the default hard
> coded cflags. Sorry for accidentally dropping that portion of the patch. Will
> see if this solves my hanging problems or not.
> 

I have compiled -r3 on several x86 32-bit boxes and it seems to work.

Comment 39 William L. Thomson Jr. (RETIRED) gentoo-dev 2008-01-05 17:21:05 UTC
Unfortunately no change for me on amd64. It hangs on both my single proc Turion laptop and my dual proc dual core amd64 desktop. :( I will tweak my cflags when time permits. Likely one or more of those could be causing the hanging. Once I identify whether that is the case, I will filter those flags in the package in case other users have them set.
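
In an ebuild, filtering of this kind is normally done with filter-flags from Gentoo's flag-o-matic eclass. To show the idea outside of portage, here is a plain-shell stand-in; the function name strip_flags is hypothetical, and the -O* pattern is only an example, since no offending flag has been identified in this bug:

```shell
# Hypothetical stand-in for an ebuild-level flag filter: print the
# given flags with every flag matching the glob pattern removed.
strip_flags() {
    local pattern=$1 flag out=
    shift
    for flag in "$@"; do
        case $flag in
            $pattern) ;;                      # drop flags matching the pattern
            *) out="${out:+$out }$flag" ;;    # keep everything else
        esac
    done
    printf '%s\n' "$out"
}

# Example: drop all optimization-level flags from a flag list.
#   strip_flags '-O*' -O3 -march=native -pipe
```

Taking a glob pattern rather than a fixed string lets one call cover a whole family of flags, which is handy while bisecting which flag triggers the hang.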
Comment 40 William L. Thomson Jr. (RETIRED) gentoo-dev 2008-01-10 16:28:49 UTC
Upstream seems to have hit this recently :) Maybe that will help produce a resolution, as I am pretty clueless at this point, aside from the cflag testing that I am slacking on.
Comment 41 Freddie Witherden 2008-02-11 23:54:48 UTC
I am also getting this on ~AMD64. Attaching GDB to the create_db process and getting a backtrace gives me:
0x0000000000447973 in write_buffer (tdbb=0x7fffd764a550, bdb=0x2b1bd35ad9a0, 
    page=<value optimized out>, write_thru=<value optimized out>, status=0x7fffd764ac00, 
    write_this_page=<value optimized out>) at ../src/jrd/cch.cpp:4564
4564                    QUE_DELETE(precedence->pre_higher);
(gdb) bt
#0  0x0000000000447973 in write_buffer (tdbb=0x7fffd764a550, bdb=0x2b1bd35ad9a0, 
    page=<value optimized out>, write_thru=<value optimized out>, status=0x7fffd764ac00, 
    write_this_page=<value optimized out>) at ../src/jrd/cch.cpp:4564
#1  0x0000000000447894 in write_buffer (tdbb=0x7fffd764a550, bdb=0x2b1bd35ae220, 
    page=<value optimized out>, write_thru=false, status=0x7fffd764ac00, 
    write_this_page=<value optimized out>) at ../src/jrd/cch.cpp:6464
#2  0x00000000004480bd in get_buffer (tdbb=0x7fffd764a550, page=@0x7fffd7648280, 
    latch=Jrd::LATCH_exclusive, latch_wait=-10792) at ../src/jrd/cch.cpp:5135
#3  0x000000000044a6ce in CCH_fake (tdbb=0x7fffd764a550, window=0x2b1bd3632f30, latch_wait=-10792)
    at ../src/jrd/cch.cpp:678
#4  0x00000000004bfb06 in PAG_allocate (window=0x2b1bd3632f30) at ../src/jrd/pag.cpp:793
#5  0x000000000059196b in locate_space (tdbb=0x7fffd764a550, rpb=0x2b1bd3632ec0, size=104, 
    stack=@0x2b1bd3632880, record=0x0, type=1) at ../src/jrd/dpm.cpp:3091
#6  0x0000000000592cea in DPM_store (tdbb=0x7fffd764a550, rpb=0x2b1bd3632ec0, 
    stack=@0x2b1bd3632880, type=0) at ../src/jrd/dpm.cpp:2042
#7  0x00000000004fa54c in VIO_store (tdbb=0x7fffd764a550, rpb=0x2b1bd3632ec0, 
    transaction=0x2b1bd34944b0) at ../src/jrd/vio.cpp:2806
#8  0x00000000004720b5 in store (tdbb=0x7fffd764a550, node=0x2b1bd35990f8, which_trig=0)
    at ../src/jrd/exe.cpp:3707
#9  0x000000000046e984 in looper (tdbb=0x7fffd764a550, request=0x2b1bd3632b48, 
    in_node=0x2b1bd3636a38) at ../src/jrd/exe.cpp:2641
#10 0x00000000004708f6 in execute_looper (tdbb=0x7fffd764a550, request=0x2b1bd3632b48, 
    transaction=0x2b1bd34944b0, next_state=Jrd::jrd_req::req_proceed) at ../src/jrd/exe.cpp:1450
#11 0x0000000000470b39 in EXE_send (tdbb=0x7fffd764a550, request=0x2b1bd3632b48, msg=37568, 
    length=<value optimized out>, 
    buffer=0x7fffd7648b70 "RDB$PROCEDURE_NAME", ' ' <repeats 14 times>, "RDB$PROCEDURE_NAME", ' ' <repeats 14 times>, "RDB$PROCEDURE_PARAMETERS        \001") at ../src/jrd/exe.cpp:994
#12 0x00000000005b6d08 in store_relation_field (tdbb=0x7fffd764a550, fld=0x908a84, 
    relfld=0x908a5c, field_id=<value optimized out>, handle=0x7fffd7649ef0, 
    fmt0_flag=<value optimized out>) at ../src/jrd/ini.cpp:2402
#13 0x00000000005b75b5 in INI_format (owner=<value optimized out>, charset=<value optimized out>)
    at ../src/jrd/ini.cpp:525
#14 0x000000000049c3c2 in jrd8_create_database (user_status=0x7fffd764ac00, _file_length=9, 
    _file_name=0x7fffd764aa88 "empty.fdb", handle=0x7fffd764aab8, 
    dpb_length=<value optimized out>, dpb=0x7fffd764a810 "\001", db_type=0, 
    _expanded_filename=0x2b1bd3485fe0 "/var/tmp/portage/dev-db/firebird-2.1.0.16780_beta2-r3/work/Firebird-2.1.0.16780-Beta2/gen/empty.fdb") at ../src/jrd/jrd.cpp:2021
#15 0x00000000004351af in isc_create_database (user_status=<value optimized out>, 
    file_length=<value optimized out>, file_name=<value optimized out>, 
    public_handle=0x7fffd764acac, dpb_length=<value optimized out>, dpb=<value optimized out>, 
    db_type=0) at ../src/jrd/why.cpp:1725
#16 0x0000000000407d0f in main (argc=<value optimized out>, argv=<value optimized out>)
    at ../src/utilities/create_db.cpp:22
Comment 42 William L. Thomson Jr. (RETIRED) gentoo-dev 2008-03-25 19:55:51 UTC
Bumped to latest rc2, still no change on this issue :( I still have not tested out various CFLAGS, which I hope are causing this; if it's not a CFLAG, then I'm totally at a loss. Not sure if upstream made any progress on this or not.
Comment 43 Patrizio Bassi 2008-04-15 07:27:42 UTC
The latest rc2 merged correctly on my amd64 profile.
Comment 44 William L. Thomson Jr. (RETIRED) gentoo-dev 2008-04-19 01:37:27 UTC
Still no change here :( even with the latest release just committed to the tree. Playing with cflags now, like dropping down to very basic ones. Will follow up if there is any change, or not.
Comment 45 William L. Thomson Jr. (RETIRED) gentoo-dev 2008-07-23 22:30:21 UTC
Just committed 2.1.1, and I did not run into this hanging problem during compile with that version, as I did with all previous 2.1.x versions. So I'm going to close this bug as resolved for now. I have removed all 2.1.0.x ebuilds. If anyone runs into this, we can reopen the bug. But hopefully it's gone now for good ;)
Comment 46 Patrizio Bassi 2008-07-24 20:17:49 UTC
Yes, I confirm it works for me.