Bug#: 197664
Status: RESOLVED
Resolution: FIXED
Assigned To: William L. Thomson Jr. <wltjr@gentoo.org>
Reporter: Rui Santos <rsantos@ruisantos.com>

Filename | Description | Type | Creator | Created | Size
build.log | Ebuild log file | text/plain | Rui Santos | 2007-10-31 16:36 0000 | 22.75 KB
borked | strace ../gen/firebird/bin/create_db empty.fdb on x86 | text/plain | William L. Thomson Jr. | 2007-11-05 21:35 0000 | 187.39 KB
working | strace create_db empty.db on amd64 ( previously compiled and installed create_db ) | text/plain | William L. Thomson Jr. | 2007-11-05 21:36 0000 | 935.38 KB
output | tail of build.log with patch to cd before executing create_db empty.db | text/plain | William L. Thomson Jr. | 2007-11-17 16:43 0000 | 3.47 KB
firebird-cflags-fix.patch | patch removing hardcoded optimize flags. | patch | unK | 2007-12-19 15:03 0000 | 625 bytes



Description:   Opened: 2007-10-31 16:35 0000
firebird-2.1.0.16780_beta2 compiles, but right after it starts to install it
hangs indefinitely ( 4 hours ) on the command:

rm -f empty.fdb
../gen/firebird/bin/create_db empty.fdb

Reproducible: Always

Steps to Reproduce:
1. use emerge to get that version of firebird

------- Comment #1 From Rui Santos 2007-10-31 16:36:19 0000 -------
Created an attachment (id=134823)
Ebuild log file

------- Comment #2 From Patrizio Bassi 2007-10-31 21:42:55 0000 -------
i have exactly the same issue

------- Comment #3 From Jakub Moc 2007-10-31 21:55:35 0000 -------
Try w/ MAKEOPTS="-j1" and report back, please.

------- Comment #4 From Patrizio Bassi 2007-10-31 22:54:27 0000 -------
no way :(

------- Comment #5 From Jakub Moc 2007-11-01 07:47:31 0000 -------
OK, then reopen the bug ;)

------- Comment #6 From William L. Thomson Jr. 2007-11-04 20:12:28 0000 -------
I am seeing this on x86 as well, despite not having any problems emerging it on
amd64. I think it's due to make -j options. As the one it fails on is set to
use distcc and has -j5, while my amd64 box is only -j2. Haven't confirmed that
one way or another. But I have seen it getting stuck on the empty_db or etc and
it's not growing in size on disk. So not sure what's causing it to hang like
that at that point. Will comment again when I have further info.

------- Comment #7 From Patrizio Bassi 2007-11-04 20:20:46 0000 -------
i used -j1 and it fails.

the funny part is that you cannot kill it with simple "kill" but you need kill
-9.
for me it's not a jobs problem but application itself.

old firebird version worked flawlessly.
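
For anyone reproducing this, a bounded run avoids waiting hours on a stuck process. The sketch below is only an illustration under stated assumptions: it assumes GNU coreutils `timeout` is available, `run_bounded` is a hypothetical helper name, and `sleep 30` stands in for the real `../gen/firebird/bin/create_db empty.fdb` call. It sends SIGKILL at the deadline because a plain kill reportedly does not stop the process.

```shell
#!/bin/sh
# Sketch: bound a command that may hang and ignore SIGTERM.
# Assumption: GNU coreutils `timeout` is installed.
run_bounded() {
    limit=$1
    shift
    # -s KILL delivers SIGKILL when the limit (in seconds) expires.
    timeout -s KILL "$limit" "$@"
}

# `sleep 30` is a stand-in for the stuck create_db invocation.
status=0
run_bounded 1 sleep 30 || status=$?
echo "exit status: $status"
```

A non-zero status here means the command did not finish within the limit and was killed.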

------- Comment #8 From Rui Santos 2007-11-05 11:01:46 0000 -------
I can also confirm that the "-j1" option on MAKEOPTS did not work. I'm also
getting this error. Could mean something:

config.status: src/include/gen/autoconfig.h is unchanged
./configure: line 36820: cd: extern/icu/source: No such file or directory
chmod: cannot access `runConfigureICU': No such file or directory
chmod: cannot access `install-sh': No such file or directory
./configure: line 36822: ./runConfigureICU: No such file or directory

------- Comment #9 From William L. Thomson Jr. 2007-11-05 15:14:34 0000 -------
It's really odd. I can emerge without problems on amd64, but on x86 it
routinely hangs at create_db empty.db. Changing make options and dropping
distcc doesn't make any difference as others found out already.

Really not sure what's causing this. I don't believe I am doing anything
different with the ebuild or patches than I was with 2.0.x. Much less why it
only occurs on x86, seems like the problem should be the other way around.

------- Comment #10 From William L. Thomson Jr. 2007-11-05 15:20:11 0000 -------
(In reply to comment #8)
>  I'm also getting this error. Could mean something:
> 
> config.status: src/include/gen/autoconfig.h is unchanged
> ./configure: line 36820: cd: extern/icu/source: No such file or directory
> chmod: cannot access `runConfigureICU': No such file or directory
> chmod: cannot access `install-sh': No such file or directory
> ./configure: line 36822: ./runConfigureICU: No such file or directory

While that stuff is not ideal, it doesn't seem to cause any harm or matter.
Likely due to me not patching/modifying the configure properly. But the goal
there is mostly to not use the bundled icu, which gets accomplished despite the
above errors.

I confirmed I get the same errors on amd64, where I don't have any problems
building or running 2.1.x. So not sure what's up with compiling on x86.

------- Comment #11 From Patrizio Bassi 2007-11-05 15:45:14 0000 -------
can be due to being "beta" software?
maybe we should wait final delivery and tell upstream.

can you mask this in the meanwhile?

------- Comment #12 From William L. Thomson Jr. 2007-11-05 15:59:12 0000 -------
(In reply to comment #11)
> can be due to being "beta" software?
> maybe we should wait final delivery and tell upstream.
>
> can you mask this in the meanwhile?

I will consider masking this, but it's in ~arch where things are known to be
unstable and so on. So it's kinda par for the course. Plus I would only mask on
x86. Really would prefer more run into it, and maybe someone will know how to
fix, what's causing it, or etc.

As for masking, either don't run ~arch, or mask locally if it bugs ya :) As
for taking it upstream, I likely will if I can't find a cause for this, at
least to see if upstream is aware of it or not. But I would like to find a bit
more info before I go upstream, just in case it's not an upstream problem :)

------- Comment #13 From William L. Thomson Jr. 2007-11-05 16:18:24 0000 -------
Ok, it looks like the empty.db is never actually created. I did a find on the
work dir and found nothing. Then I moved to the dir that contains create_db
binary and executed create_db empty.db which completed in a few milliseconds.
So something must be off with env, or something during merge. Again strange it
only occurs on x86. Still looking into this. But basically when create_db is
invoked by build system it doesn't create the empty.db file, and I guess it
hangs waiting on the file to be created or something before it exits.

------- Comment #14 From Patrizio Bassi 2007-11-05 16:29:50 0000 -------
i don't know why it works but i have an amd64 (core2 duo) profile and it
doesn't work.

so it's not x86 but x86_64 as well.

for the hangs....it's not hangs only it's endless loop as the cpu is fully
used.

it's not waiting...it's burning something....!

------- Comment #15 From William L. Thomson Jr. 2007-11-05 17:39:48 0000 -------
(In reply to comment #14)
> i don't know why it works but i have an amd64 (core2 duo) profile and it
> doesn't work.
>

Odd, compiles and merges fine here all day long on single core Turion and
Opteron procs. I don't have access to any dual cores to try. Are you running
pure ~arch or mixing stable and unstable?

> so it's not x86 but x86_64 as well.

Interesting

> for the hangs....it's not hangs only it's endless loop as the cpu is fully
> used.
> 
> it's not waiting...it's burning something....!

Yeah it's stuck in some sort of loop. But it's not writing to disk or doing
anything. Other than tying up the cpu, and I can't explain why atm. Invoking
the same create_db that's generated by the compile process directly works. Must
be something in the build env throwing it off or etc. Reviewing the
Makefile.refDatabases that invokes the problem.
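
One way to tell a busy loop from a process that is merely waiting is to look at its scheduler state. The following is a minimal sketch, assuming a `ps` that accepts `-o stat=` and `-p` (as procps on Linux does); `proc_state` is a hypothetical helper, and the background `sleep` is a stand-in for the stuck create_db process. A state starting with `R` suggests the process is running (burning CPU), while `S` or `D` suggests it is sleeping or blocked.

```shell
#!/bin/sh
# Hypothetical helper: print the process state code for a PID.
# Assumption: ps supports "-o stat=" and "-p" (procps on Linux).
proc_state() {
    ps -o stat= -p "$1" | tr -d ' '
}

sleep 30 &          # stand-in for the stuck create_db process
pid=$!
sleep 1             # give the child time to settle into its sleep
echo "state of $pid: $(proc_state "$pid")"
kill "$pid"
```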

------- Comment #16 From Patrizio Bassi 2007-11-05 19:03:58 0000 -------
i'm at work now.

can you try to strace it? where's the loop?

try without the sandbox as well

------- Comment #17 From William L. Thomson Jr. 2007-11-05 21:34:02 0000 -------
After HOURS of looking into this. Running all kinds of tests, debugging output,
etc. I can't for the life of me figure out what's going on. At one point when
it failed via make I could invoke it via command line. Assuming differences of
env were causing that. But now the newly compiled one that hung make and I
could invoke, now hangs for me on command line.

Others on amd64 tried it as well with dual core, and dual proc dual core (quad)
systems. No hanging and it emerged fine. I will see what I can do to replicate
the hanging on amd64 or in places where it presently doesn't hang.

When it does hang, it literally does that, no looping, etc. Here is a strace
of the hanging create_db, newly compiled and still in the work dir. Then the
same command, once compiled and installed on my amd64 box. Strace output just
stops on the hung one.

I am assuming/gut feeling that this is exposing some bug in the core of Fb or
etc. Not sure, but it's trying to create a temp db, that never finishes. The
file is created, doesn't grow in size. But the command never terminates,
completes, or does anything beyond the hung point. At least according to
strace.

Attached is a strace from x86 where it hangs. Then one from an installed copy of
create_db, where it succeeds during build, and on command line. Strace is from
command line invocation. Not sure if they help at all, but providing them
anyway.
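
The "file is created but doesn't grow" observation can be checked mechanically. This is a rough sketch only: `size_of` and `is_growing` are hypothetical helper names, and the commented usage line assumes the empty.fdb path from the original report.

```shell
#!/bin/sh
# Hypothetical helpers: detect whether a file's size increases over an interval.
size_of() {
    # wc -c prints the byte count; tr strips padding some platforms add.
    wc -c < "$1" | tr -d ' '
}

is_growing() {
    file=$1
    interval=${2:-5}    # seconds to wait between the two samples
    first=$(size_of "$file")
    sleep "$interval"
    second=$(size_of "$file")
    # Succeeds (exit 0) only if the file got bigger during the interval.
    [ "$second" -gt "$first" ]
}

# Assumed usage while create_db is running in another terminal:
#   is_growing empty.fdb 10 && echo "still writing" || echo "no growth"
```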

------- Comment #18 From William L. Thomson Jr. 2007-11-05 21:35:00 0000 -------
Created an attachment (id=135269)
strace ../gen/firebird/bin/create_db empty.fdb on x86

------- Comment #19 From William L. Thomson Jr. 2007-11-05 21:36:03 0000 -------
Created an attachment (id=135271)
strace create_db empty.db on amd64 ( previously compiled and installed
create_db )

------- Comment #20 From William L. Thomson Jr. 2007-11-06 05:35:21 0000 -------
Please sync and emerge the new -r1 version. While stabilizing the split
firebird 2.0.3 another developer ( cla ) discovered that some flags were hard
coded. So I bumped that ebuild and added a patch to drop hard coded flags. In
the process I did the same for 2.1.0 and revbumped that as well.

For the heck of it I tried to emerge that new version -r1, and it finally
emerged on x86 where it was hanging all day long earlier. Hopefully this will
resolve others' problems as well. If so we can close this bug and chalk it up
to ricer/funky hard coded flags in build.

------- Comment #21 From Rui Santos 2007-11-06 14:16:48 0000 -------
I've tried with the -r1 version on amd64, and found no apparent difference.

Still hangs on creating empty_db.

------- Comment #22 From William L. Thomson Jr. 2007-11-12 18:20:18 0000 -------
Good/bad news. I am experiencing this again on my single core Turion, which
used to compile it just fine :(. Might be a compiler flag they are setting that
I did not drop with the last patch. Something about a workaround for a bug in a
past gcc version. That could be the cause of this issue, since the workaround
targets an older gcc.

------- Comment #23 From William L. Thomson Jr. 2007-11-14 21:44:59 0000 -------
Ok some further info on this. For some reason, looks to be side effect of
splitting up firebird, if you move to ${S}/gen/firebird/bin and execute
create_db it will succeed. But if you execute it where make is, ${S}/gen or any
other place other than in ${S}/gen/firebird/bin it hangs. Testing out a
hack/patch to cd into that dir before execution of create_db, and cd ../../
afterward. Will report back once I have more info there.
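
The cd workaround described here could be sketched as a small shell helper. This is only an illustration: `run_in_dir` is a hypothetical name, and the commented usage line assumes the paths mentioned in this comment. Using a subshell confines the directory change, so no `cd ../../` is needed afterward.

```shell
#!/bin/sh
# Hypothetical helper: run a command from inside a given directory
# without changing the caller's working directory.
run_in_dir() {
    dir=$1
    shift
    # The subshell keeps the cd local; the caller stays where it was.
    ( cd "$dir" && "$@" )
}

# Assumed usage in the build tree:
#   run_in_dir "${S}/gen/firebird/bin" ./create_db empty.fdb
```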

------- Comment #24 From Patrizio Bassi 2007-11-14 21:57:55 0000 -------
sounds really strange!

let's see :)

------- Comment #25 From William L. Thomson Jr. 2007-11-17 16:42:59 0000 -------
Ok so changing directory before executing create_db seems to make all the
difference in the world. However, it now fails at another point just past that
:( Attaching tail of build.log.

------- Comment #26 From William L. Thomson Jr. 2007-11-17 16:43:46 0000 -------
Created an attachment (id=136180)
tail of build.log with patch to cd before executing create_db empty.db

------- Comment #27 From Patrizio Bassi 2007-11-18 10:49:46 0000 -------
i have no idea...

what i can say is this:
report upstream the two issues and hardmask this release in portage

------- Comment #28 From William L. Thomson Jr. 2007-11-19 01:20:32 0000 -------
Ok I am losing it. That last attachment is from executing create_db from the
system's previously compiled and installed firebird, not the one that was just
built in the sources.

Now thinking about this a bit. Seems like when I removed their flags, I no
longer could compile on amd64, but could on x86. When before I could not build
on x86 and could on amd64. I will see about messing with the flags a bit and
confirming this.

As for masking, I am considering that, but fewer people are likely to unmask it
and/or run into this with it masked. So fewer are likely to help me out in
finding a solution and/or cause. I would take this upstream, but with what I am
doing and without more info, I doubt that would provide a solution. The
makefiles already have comments about that section having problems and being
picky. So not sure I would get much luv from them there anyway.

But if this carries on for a week or more, I will mask to save others the
headache. Just hoping I can move things forward between now and then.

------- Comment #29 From Patrizio Bassi 2007-12-04 22:05:23 0000 -------
dev-db/firebird-2.1.0.16780_beta2-r1 merged now on amd64!

------- Comment #30 From William L. Thomson Jr. 2007-12-04 22:40:57 0000 -------
See and that I can't explain. It's still not merging here? WTF I have no idea.
I will be committing an updated version soon. Syncing changes with 2.0.3 before
commit. Resolves a few bugs, but surely not this one :(. I have no idea why it
compiles sometimes and not others. I used to be able to compile it on my ~amd64
machine, and not on ~x86. Now it's the opposite. Plus I have two ~x86 clones,
and it built on one and not on the other.

It's really fun stuff.

------- Comment #31 From Rui Santos 2007-12-05 09:43:00 0000 -------
dev-db/firebird-2.1.0.16780_beta2-r1 and dev-db/firebird-2.1.0.16780_beta2-r2
now compile and install successfully on my amd64.

Thanks for all your hard work.

------- Comment #32 From William L. Thomson Jr. 2007-12-05 16:53:36 0000 -------
Great, still hanging here for me :( Going to leave this open for a while. After
all it's a beta release, so won't be a stable candidate till a final official
release. Till then I suspect others will still run into this. But glad others
aren't experiencing it anymore.

Don't be surprised if you do start seeing it hang again :) I have not found the
cause, nor done anything to solve, directly.

Thanks for the praise, much appreciated, in this fruitless work ;)

------- Comment #33 From unK 2007-12-19 15:01:35 0000 -------
firebird-2.1.0.16780_beta2-r2 overrides flags once more and compilation with
gcc-4.2.2 hangs on creating empty.db. however, it works with gcc-4.3_alpha.

------- Comment #34 From unK 2007-12-19 15:03:37 0000 -------
Created an attachment (id=138890)
patch removing hardcoded optimize flags.

------- Comment #35 From William L. Thomson Jr. 2007-12-19 15:34:27 0000 -------
When I switched from setting paths in the patch to doing it via sed, it seems I
also left out the part of the patch that modified the CFLAGS.
http://sources.gentoo.org/viewcvs.py/gentoo-x86/dev-db/firebird/files/firebird-2.1.0.16780_beta2-deps-flags-libs-paths.patch?hideattic=0&view=markup

Thanks for the patch, but seems there was another line or two I modified, per
the above. Will see about adding that back ASAP. Was trying to improve my
sed_check function, clean up ebuilds, and maybe move the function to an eclass.

Although from my own testing I wasn't able to confirm that the hard coded
CFLAGS or dropping them for user specified made any difference on the hanging.
It hung for me both ways :(
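
For illustration, dropping a hardcoded optimization flag with sed might look like the sketch below. It is not the actual patch or the ebuild's sed call: `strip_opt_flags` is a hypothetical helper, and the `SAMPLE_FLAGS` line is made up rather than taken from Firebird's build files. The expression only removes `-O` followed by a single digit or `s`.

```shell
#!/bin/sh
# Sketch: strip a hardcoded optimization level from a makefile fragment.
# Handles tokens such as " -O2", " -O3" or " -Os" only.
strip_opt_flags() {
    sed -e 's/ -O[0-9s]//g' "$1"
}

# Made-up sample input, not taken from Firebird's build files.
tmp=$(mktemp)
printf 'SAMPLE_FLAGS= -O3 -DNDEBUG -pipe\n' > "$tmp"
strip_opt_flags "$tmp"
rm -f "$tmp"
```

With the hardcoded level gone, the user's own CFLAGS decide the optimization level.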

------- Comment #36 From William L. Thomson Jr. 2008-01-04 17:06:15 0000 -------
I don't believe I have committed the fix for 2.1 that drops the default cflags
from upstream, which I accidentally added back after removing :) Anyway it just
hung on my new dual core dual opteron box. So the problem still exists to some
extent. Now that I have a bit faster machine, I can expedite testing a bit more.
Hopefully will resolve ASAP. At min will see about committing an ebuild less
the default cflags this weekend.

------- Comment #37 From William L. Thomson Jr. 2008-01-04 17:35:36 0000 -------
Just committed -r3 that has a correct patch which removes the default hard
coded cflags. Sorry for accidentally dropping that portion of the patch. Will
see if this solves my hanging problems or not.

------- Comment #38 From Dave 2008-01-05 17:11:46 0000 -------
(In reply to comment #37)
> Just committed -r3 that has a correct patch which removes the default hard
> coded cflags. Sorry for accidentally dropping that portion of the patch. Will
> see if this solves my hanging problems or not.
> 

I have compiled -r3 on several x86 32-bit boxes and it seems to work 

------- Comment #39 From William L. Thomson Jr. 2008-01-05 17:21:05 0000 -------
Unfortunately no change for me on amd64. Hangs on both my single proc turion
laptop, and my dual proc dual core amd64 desktop. :( I will tweak my cflags
when time permits. Likely one or more of those could be causing the hanging.
Once I identify if that is the case, I will filter those flags in the package in
case other users have them set.

------- Comment #40 From William L. Thomson Jr. 2008-01-10 16:28:49 0000 -------
Upstream seems to have hit this recently :) Maybe that will help produce a
resolution. As I am pretty clueless at this point. Aside from cflag testing
that I am slacking on.

------- Comment #41 From Freddie Witherden 2008-02-11 23:54:48 0000 -------
I am also getting this on ~AMD64. Attaching GDB to the create_db process and
getting a backtrace gives me:
0x0000000000447973 in write_buffer (tdbb=0x7fffd764a550, bdb=0x2b1bd35ad9a0,
    page=<value optimized out>, write_thru=<value optimized out>, status=0x7fffd764ac00,
    write_this_page=<value optimized out>) at ../src/jrd/cch.cpp:4564
4564                    QUE_DELETE(precedence->pre_higher);
(gdb) bt
#0  0x0000000000447973 in write_buffer (tdbb=0x7fffd764a550, bdb=0x2b1bd35ad9a0,
    page=<value optimized out>, write_thru=<value optimized out>, status=0x7fffd764ac00,
    write_this_page=<value optimized out>) at ../src/jrd/cch.cpp:4564
#1  0x0000000000447894 in write_buffer (tdbb=0x7fffd764a550, bdb=0x2b1bd35ae220,
    page=<value optimized out>, write_thru=false, status=0x7fffd764ac00,
    write_this_page=<value optimized out>) at ../src/jrd/cch.cpp:6464
#2  0x00000000004480bd in get_buffer (tdbb=0x7fffd764a550, page=@0x7fffd7648280,
    latch=Jrd::LATCH_exclusive, latch_wait=-10792) at ../src/jrd/cch.cpp:5135
#3  0x000000000044a6ce in CCH_fake (tdbb=0x7fffd764a550, window=0x2b1bd3632f30, latch_wait=-10792)
    at ../src/jrd/cch.cpp:678
#4  0x00000000004bfb06 in PAG_allocate (window=0x2b1bd3632f30) at ../src/jrd/pag.cpp:793
#5  0x000000000059196b in locate_space (tdbb=0x7fffd764a550, rpb=0x2b1bd3632ec0, size=104,
    stack=@0x2b1bd3632880, record=0x0, type=1) at ../src/jrd/dpm.cpp:3091
#6  0x0000000000592cea in DPM_store (tdbb=0x7fffd764a550, rpb=0x2b1bd3632ec0,
    stack=@0x2b1bd3632880, type=0) at ../src/jrd/dpm.cpp:2042
#7  0x00000000004fa54c in VIO_store (tdbb=0x7fffd764a550, rpb=0x2b1bd3632ec0,
    transaction=0x2b1bd34944b0) at ../src/jrd/vio.cpp:2806
#8  0x00000000004720b5 in store (tdbb=0x7fffd764a550, node=0x2b1bd35990f8, which_trig=0)
    at ../src/jrd/exe.cpp:3707
#9  0x000000000046e984 in looper (tdbb=0x7fffd764a550, request=0x2b1bd3632b48,
    in_node=0x2b1bd3636a38) at ../src/jrd/exe.cpp:2641
#10 0x00000000004708f6 in execute_looper (tdbb=0x7fffd764a550, request=0x2b1bd3632b48,
    transaction=0x2b1bd34944b0, next_state=Jrd::jrd_req::req_proceed) at ../src/jrd/exe.cpp:1450
#11 0x0000000000470b39 in EXE_send (tdbb=0x7fffd764a550, request=0x2b1bd3632b48, msg=37568,
    length=<value optimized out>,
    buffer=0x7fffd7648b70 "RDB$PROCEDURE_NAME", ' ' <repeats 14 times>, "RDB$PROCEDURE_NAME", ' ' <repeats 14 times>, "RDB$PROCEDURE_PARAMETERS       \001") at ../src/jrd/exe.cpp:994
#12 0x00000000005b6d08 in store_relation_field (tdbb=0x7fffd764a550, fld=0x908a84,
    relfld=0x908a5c, field_id=<value optimized out>, handle=0x7fffd7649ef0,
    fmt0_flag=<value optimized out>) at ../src/jrd/ini.cpp:2402
#13 0x00000000005b75b5 in INI_format (owner=<value optimized out>, charset=<value optimized out>)
    at ../src/jrd/ini.cpp:525
#14 0x000000000049c3c2 in jrd8_create_database (user_status=0x7fffd764ac00, _file_length=9,
    _file_name=0x7fffd764aa88 "empty.fdb", handle=0x7fffd764aab8,
    dpb_length=<value optimized out>, dpb=0x7fffd764a810 "\001", db_type=0,
    _expanded_filename=0x2b1bd3485fe0 "/var/tmp/portage/dev-db/firebird-2.1.0.16780_beta2-r3/work/Firebird-2.1.0.16780-Beta2/gen/empty.fdb") at ../src/jrd/jrd.cpp:2021
#15 0x00000000004351af in isc_create_database (user_status=<value optimized out>,
    file_length=<value optimized out>, file_name=<value optimized out>,
    public_handle=0x7fffd764acac, dpb_length=<value optimized out>, dpb=<value optimized out>,
    db_type=0) at ../src/jrd/why.cpp:1725
#16 0x0000000000407d0f in main (argc=<value optimized out>, argv=<value optimized out>)
    at ../src/utilities/create_db.cpp:22

------- Comment #42 From William L. Thomson Jr. 2008-03-25 19:55:51 0000 -------
Bumped to latest rc2, still no change on this issue :( Still have not tested
out various CFLAGs. I hope one of them is causing this; if it's not a CFLAG,
then I am totally at a loss. Not sure if upstream made any progress on this or
not.

------- Comment #43 From Patrizio Bassi 2008-04-15 07:27:42 0000 -------
last rc2 merged correctly on my amd64 profile.

------- Comment #44 From William L. Thomson Jr. 2008-04-19 01:37:27 0000 -------
still no change here :( even with the latest release just committed to tree.
Playing with cflags now, like dropping down to very basic ones. Will follow up
if there is any change, or not.

------- Comment #45 From William L. Thomson Jr. 2008-07-23 22:30:21 0000 -------
Just committed 2.1.1 and I did not run into this hanging problem during compile
with that version, as I did with all earlier 2.1.x versions. So going to close
this bug as resolved for now. I have removed all 2.1.0.x ebuilds. If anyone runs
into this we can reopen the bug. But hopefully it's gone now for good ;)

------- Comment #46 From Patrizio Bassi 2008-07-24 20:17:49 0000 -------
yes i confirm it works for me
