Bug 253451 - [new ebuild] sci-biology/consed-19
Summary: [new ebuild] sci-biology/consed-19
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: New packages (show other bugs)
Hardware: All Linux
Importance: High enhancement
Assignee: Andrey Kislyuk (RETIRED)
URL: http://bozeman.mbt.washington.edu/con...
Whiteboard:
Keywords:
Depends on: 253368
Blocks:
Reported: 2009-01-02 15:03 UTC by Martin Mokrejš
Modified: 2010-05-22 08:53 UTC (History)
1 user (show)

See Also:
Package list:
Runtime testing required: ---


Attachments
consed-bin-20080723.ebuild (consed-bin-20080723.ebuild,3.55 KB, text/plain)
2009-01-02 15:08 UTC, Martin Mokrejš
files/consed (consed,645 bytes, text/plain)
2009-01-02 15:08 UTC, Martin Mokrejš
Manifest (Manifest,590 bytes, text/plain)
2009-01-02 15:09 UTC, Martin Mokrejš
consed-bin-20080723.ebuild (consed-bin-20080723.ebuild,3.66 KB, text/plain)
2009-01-02 15:18 UTC, Martin Mokrejš
consed-bin-20080723.ebuild (consed-bin-20080723.ebuild,3.94 KB, text/plain)
2009-01-06 20:33 UTC, Martin Mokrejš
Manifest (Manifest,794 bytes, text/plain)
2009-01-06 20:34 UTC, Martin Mokrejš
consed-19.ebuild (consed-19.ebuild,924 bytes, text/plain)
2009-02-13 18:38 UTC, Andrey Kislyuk (RETIRED)
consed-19.ebuild (consed-19.ebuild,1.48 KB, text/plain)
2009-02-13 19:18 UTC, Andrey Kislyuk (RETIRED)

Description Martin Mokrejš 2009-01-02 15:03:32 UTC
It is distributed under the same licence as sci-biology/phrap.

The consed binary is built from a really huge set of C++ files, but the source currently does not compile for me. Luckily, both static and dynamic binaries are available for download to registered users (actually, from registered IP addresses). The user has to download the file, but a few tools (ANSI C) still have to be compiled. Thus, the package name "*-bin" might be slightly misleading.

The ebuild installs a nice set of example data into /usr/share/consed-bin/examples.
To test, do

$ /usr/share/consed-bin/examples/standard/edit_dir
$ consed &

Happy genome assembling. ;-)
Comment 1 Martin Mokrejš 2009-01-02 15:08:04 UTC
Created attachment 177100 [details]
consed-bin-20080723.ebuild
Comment 2 Martin Mokrejš 2009-01-02 15:08:53 UTC
Created attachment 177101 [details]
files/consed

installed as /etc/env.d/21consed
Comment 3 Martin Mokrejš 2009-01-02 15:09:10 UTC
Created attachment 177103 [details]
Manifest
Comment 4 Martin Mokrejš 2009-01-02 15:18:13 UTC
Created attachment 177109 [details]
consed-bin-20080723.ebuild

Documented that amd64 users should download a different binary. However, the ebuild still instructs them to download the 32-bit version. Somebody more skilled with Portage could fix that minor issue.
Comment 5 Martin Mokrejš 2009-01-06 20:33:09 UTC
Created attachment 177597 [details]
consed-bin-20080723.ebuild

Improved support for the different arches. Also respect CFLAGS and CC variables.
Comment 6 Martin Mokrejš 2009-01-06 20:34:21 UTC
Created attachment 177599 [details]
Manifest

Added checksum for the 64bit binary.
Comment 7 Andrey Kislyuk (RETIRED) gentoo-dev 2009-02-13 18:37:10 UTC
Looks like we'll need to split this into consed and consed-utils or something like that. The source archive doesn't seem to contain a lot of what the old ebuild handled.
Comment 8 Andrey Kislyuk (RETIRED) gentoo-dev 2009-02-13 18:38:22 UTC
Created attachment 181893 [details]
consed-19.ebuild
Comment 9 Andrey Kislyuk (RETIRED) gentoo-dev 2009-02-13 19:18:17 UTC
Created attachment 181900 [details]
consed-19.ebuild

various tweaks
Comment 10 Andrey Kislyuk (RETIRED) gentoo-dev 2009-02-18 16:55:17 UTC
Now in gentoo-x86. Please test.

Martin, I have a few comments/questions.
1. I have decided not to break out phd2fasta into its own package since it's an extra manual step for the user to obtain that file from a different person than consed.
2. What's the rationale for naming the env.d script 21consed? I changed it to 99consed (lowest priority).
3. I haven't been able to test whatever you were trying to ask people to test in your comment in the ebuild. Please test it yourself if you need to. Also, it's not a good idea to put requests like that in ebuild comments; please put them in the bug instead.
Comment 11 Martin Mokrejš 2009-02-18 17:40:30 UTC
(In reply to comment #10)
> Now in gentoo-x86. Please test.
> 
> Martin, I have a few comments/questions.
> 1. I have decided not to break out phd2fasta into its own package since
> it's an extra manual step for the user to obtain that file from a different 
> person than consed.

My feeling was that virtually everybody applies for the license to all the programs at once, so it is no problem to ask one of the three authors for the current version via email. phd2fasta is distributed via email by Brent Ewing along with phred. So everybody who has phred has, or may get, the latest phd2fasta, regardless of what is packaged with consed. Just my 2c.

> 2. What's the rationale for naming the env.d script 21consed? I changed it to
> 99consed (lowest priority).

My bad guess.
Comment 12 Martin Mokrejš 2009-02-21 22:25:41 UTC
I would go for the -DX86_GCC_LINUX disabled by default (commented in my last consed-bin-20080723.ebuild attachment). Here is what Brent Ewing (upstream) said:

<quote>
Dear Martin;

I do not have a simple answer or direct answer to your question.
I will try to explain what I do know.

The IEEE double precision floating point operation specification
requires 64-bit precision. Processors such as the SUN SPARC and
H-P PA-RISC perform the double precision floating point operations
with 64-bit precision, as a result, phred produces identical
results when run on any of these processors.

However, the x86 processors (including the AMD64) can use an x87
floating point co-processor to perform the floating point
operations and the x87 registers are 80 bits so they have greater
precision. However one can force the co-processor to use 64-bit
precision by setting bits in the x87 control register. When phred
runs on an x86 machine with the floating point operations
performed by the x87 co-processor, the results can differ
slightly when compared to the results when phred runs on a SUN
SPARC, for example. Some people who use phred want the same
results regardless of the machine on which it runs because they
have a variety of computers on which they run phred and they need
consistent results. The 'X86_GCC_LINUX' pre-processor variable
sets the x87 control register bit so that the co-processor uses
the 64-bit precision.

(I have noticed that some Linux distributions seem to force the
64-bit floating point math by default so one would not need to
set the X86_GCC_LINUX variable with such distributions.)

In addition, more recent x86 processors have SIMD extensions,
'which refers to the ability to use a Single Instruction on
Multiple Data items' (from AMD technical information), such as
MMX, SSE128, and so on. Some of these SIMD extensions support
double precision floating point operations (I am not familiar
with these extensions so I cannot tell you much about them)
so a C compiler can produce executable code that uses the
extensions for floating point operations rather than the x87
co-processor. (I have the impression that such SIMD double
precision floating point operations are made with 64-bit
precision but I have not studied this question.)

My guess is that if you run phred on an x86 machine, including
the AMD64, and you want it to produce results that are consistent
with those it produces when run on 64-bit (IEEE-conformant)
machines, you should set the 'X86_GCC_LINUX' pre-processor
variable.

So the double precision floating point operation precision
can depend upon the C compiler, the math library, the operating
system, and the options that you use when you run the C compiler.
If you need phred to produce consistent results when run on
IEEE-conformant machines and x86 machines, I suggest that you
run some test comparisons and check the operating system, math
library, and C compiler documentation.

I appreciate your consideration and patience.

		Best Wishes,
		Brent

On Thu, Jan 01, 2009 at 04:41:09PM +0100, Martin MOKREJŠ wrote:
> > Brent,
> >   does this apply also to amd64 targets?
> > 
> >    When building phred on x86 Linux machines using the GNU C compiler,
> >    define X86_GCC_LINUX in the CFLAGS Makefile variable. See the
> >    Makefile for additional information. (The GNU C compiler is the
> >    C compiler supplied with Linux distributions.) You can find
> >    additional information on the x86 FPU control register contents in
> >    the Linux system file /usr/include/fpu_control.h.
> > 
> > TIA
> > Martin
</quote>
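
One way to see which of the cases Brent describes applies to a given compiler and set of flags is to look at the __FLT_EVAL_METHOD__ macro that GCC predefines: 0 means double expressions are evaluated as double (as with SSE2 math), 2 means everything is evaluated as long double (as with the 80-bit x87 registers). The sketch below is only an illustration. The helper names flt_eval_method and describe_eval_method are made up here, and it assumes a compiler that accepts GCC's -dM -E options. Note that the macro describes how code is generated; it does not show the runtime x87 control word that X86_GCC_LINUX changes.

```shell
# Read a macro dump (as printed by `cc -dM -E -x c /dev/null`) on stdin and
# print the value of __FLT_EVAL_METHOD__; exit non-zero if it is not defined.
flt_eval_method() {
    awk '$2 == "__FLT_EVAL_METHOD__" { print $3; found = 1 }
         END { exit !found }'
}

# Translate the numeric value into a short human-readable description.
describe_eval_method() {
    case "$1" in
        0) echo "double is evaluated as double (e.g. SSE2 math)" ;;
        1) echo "float and double are evaluated as double" ;;
        2) echo "everything is evaluated as long double (e.g. x87)" ;;
        *) echo "indeterminable" ;;
    esac
}

# Example invocation (not run here, it needs a compiler; CFLAGS is left
# unquoted on purpose so that it splits into separate options):
#   ${CC:-cc} ${CFLAGS} -dM -E -x c /dev/null | flt_eval_method
```

Comparing the result for the CFLAGS in use with and without -mfpmath=sse is a cheap first check before running the numerical comparisons Brent suggests.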

Comment 13 Martin Mokrejš 2009-02-21 23:19:44 UTC
(In reply to comment #10)
> Now in gentoo-x86. Please test.

The ebuild does not fix the Makefiles to respect the user's CFLAGS or CXXFLAGS for mktrace/ and phd2fasta/, and actually not even for the main build:

# grep CFLAGS /etc/make.conf | grep -v "^#"
CFLAGS="-O2 -march=pentium4 -mmmx -msse -msse2 -pipe -fno-strict-aliasing -ggdb"
CXXFLAGS="${CFLAGS}"
#
[cut]
>>> Compiling source in /var/tmp/portage/sci-biology/consed-19/work ...
make 
g++  -DX86_GCC_LINUX  -w -DINLINE_RWTPTRORDEREDVECTOR -DINLINE_RWTVALORDEREDVECTOR -DINLINE_MBTVALVECTOR -DLINUX_COMPILE -DSOCKLEN_T_DEFINED -D__BOOL_DEFINED -DANSI_C -DOFSTREAM_OPEN_WITHOUT_PERMISSIONS -fpermissive -DNO_POUND_POUND_MACROS -DUSE_USING_IN_PUBLIC_TEMPLATE_CLASSES -DINT_CHAR_OPERATOR -D_FILE_OFFSET_BITS=64 -O -I/usr/X11R6/include -c findQueryWithinSubject.cpp
[cut]
make -C misc/mktrace 
make: Entering directory `/var/tmp/portage/sci-biology/consed-19/work/misc/mktrace'
cc -g   -c -o mktrace.o mktrace.c
[cut]
make -C misc/phd2fasta 
make: Entering directory `/var/tmp/portage/sci-biology/consed-19/work/misc/phd2fasta'
cc -O -w   -c -o phd2fasta.o phd2fasta.c
[cut]


You forgot
dobin contributions/*

But it compiles and installs. ;-)
Comment 14 Andrey Kislyuk (RETIRED) gentoo-dev 2009-02-23 14:40:01 UTC
Martin-

Thank you for your research.

I have fixed the CFLAGS in misc/. As for the main compile, are you sure you're using the right ebuild? It substitutes CFLAGS properly on my machine. If you're sure, please attach your ebuild --info and I'll check this further.

I've included contributions/*.

These changes will go into the tree in consed-19-r1.
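
For Makefiles that hard-code their flags, as those in misc/ appear to, a fix of this kind is often a one-line sed in the ebuild. The sketch below is an assumption about what such a substitution might look like, not the change that went into consed-19-r1; the helper name patch_cflags and the sample Makefile lines are invented for illustration.

```shell
# Replace a hard-coded "CFLAGS = ..." assignment in a Makefile read from stdin
# with the flags given as the first argument. "|" is used as the sed delimiter,
# so this sketch assumes the flags contain no "|", "&" or backslash.
patch_cflags() {
    sed -e "s|^CFLAGS[[:space:]]*=.*|CFLAGS = $1|"
}

# Demonstration on a made-up Makefile fragment.
printf 'CFLAGS= -g\nmktrace: mktrace.o\n' | patch_cflags "-O2 -pipe"
```

An alternative that avoids editing the Makefile at all is to pass CFLAGS on the make command line, since GNU make lets command-line variables take precedence over ordinary assignments in the Makefile.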
Comment 15 Martin Mokrejš 2009-02-23 17:26:56 UTC
(In reply to comment #14)

> I have fixed the CFLAGS in misc/. As for the main compile, are you sure you're
> using the right ebuild? It substitutes CFLAGS properly on my machine. If 
> you're sure, please attach your ebuild --info and I'll check this further.

>>> Emerging (1 of 1) sci-biology/consed-19
 * consed-19-linux.tar.gz RMD160 SHA1 SHA256 size ;-) ... [ ok ]
 * consed-19-sources.tar.gz RMD160 SHA1 SHA256 size ;-) ... [ ok ]
 * checking ebuild checksums ;-) ... [ ok ]
 * checking auxfile checksums ;-) ... [ ok ]
 * checking miscfile checksums ;-) ... [ ok ]
>>> Unpacking source...
>>> Unpacking consed-19-linux.tar.gz to /var/tmp/portage/sci-biology/consed-19/work
>>> Unpacking consed-19-sources.tar.gz to /var/tmp/portage/sci-biology/consed-19/work
>>> Source unpacked in /var/tmp/portage/sci-biology/consed-19/work
>>> Compiling source in /var/tmp/portage/sci-biology/consed-19/work ...
make 
g++ -O2 -march=pentium4 -mmmx -msse -msse2 -pipe -fno-strict-aliasing -ggdb  -w -DINLINE_RWTPTRORDEREDVECTOR -DINLINE_RWTVALORDEREDVECTOR -DINLINE_MBTVALVECTOR -DLINUX_COMPILE -DSOCKLEN_T_DEFINED -D__BOOL_DEFINED -DANSI_C -DOFSTREAM_OPEN_WITHOUT_PERMISSIONS -fpermissive -DNO_POUND_POUND_MACROS -DUSE_USING_IN_PUBLIC_TEMPLATE_CLASSES -DINT_CHAR_OPERATOR -D_FILE_OFFSET_BITS=64 -O -I/usr/X11R6/include -c findQueryWithinSubject.cpp
[cut]

You are right, it works at least as of now after "emerge --sync; layman -S; emerge --regen".
Comment 16 Martin Mokrejš 2009-02-23 17:28:27 UTC
(In reply to comment #12)
> I would go for the -DX86_GCC_LINUX disabled by default (commented in my last
> consed-bin-20080723.ebuild attachment). Here is what Brent Ewing (upstream)
> said:

Ah, wanted to say I would enable it by default!
Comment 17 Martin Mokrejš 2009-03-10 20:41:22 UTC
(In reply to comment #0)

> To test, do
> 

$ cd /usr/share/consed-bin/examples/standard/edit_dir
$ consed &
Comment 18 Andrey Kislyuk (RETIRED) gentoo-dev 2009-03-15 17:07:45 UTC
consed-19-r1 is in the main tree.

I have fixed the handling of the phredpar.dat file location and the screenLibs files.

I haven't been able to find any mention of the X86_GCC_LINUX variable anywhere in the Consed codebase so I omitted setting it. Let me know if you think it's a serious issue.

Please test.
Comment 19 Martin Mokrejš 2009-03-15 19:19:23 UTC
(In reply to comment #18)

> I haven't been able to find any mention of the X86_GCC_LINUX variable anywhere
> in the Consed codebase so I omitted setting it. Let me know if you think it's
> a serious issue.

Hmm, an issue with phred sources, not consed. Sorry for my confusing comment #12. ;-)

I will attach a test case for phred bug #253364 which could be used on different arches to test whether floating-point calculations have the same precision.
Comment 20 Martin Mokrejš 2009-03-16 18:36:16 UTC
--- consed.old  2009-03-16 12:07:23.000000000 +0100
+++ consed.new  2009-03-16 19:29:50.000000000 +0100
@@ -5,6 +5,10 @@
 /usr/bin
 /usr/bin/ace2Fasta.perl
 /usr/bin/ace2Oligos.perl
+/usr/bin/ace2OligosWithComments.perl
+/usr/bin/ace2fof
+/usr/bin/aceContigs2Phds.perl
+/usr/bin/acestatus.pl
 /usr/bin/add454Reads.perl
 /usr/bin/addReads2Consed.perl
 /usr/bin/addSolexaReads.perl
@@ -13,24 +17,37 @@
 /usr/bin/consed
 /usr/bin/countEditedBases.perl
 /usr/bin/determineReadTypes.perl
+/usr/bin/export_cons
 /usr/bin/fasta2Ace.perl
 /usr/bin/fasta2Phd.perl
 /usr/bin/filter454Reads.perl
 /usr/bin/findSequenceMatchesForConsed.perl
 /usr/bin/lib2Phd.perl
 /usr/bin/makePhdBall.perl
+/usr/bin/mergeAces.perl
 /usr/bin/mktrace
 /usr/bin/orderPrimerPairs.perl
 /usr/bin/phd2Ace.perl
 /usr/bin/phd2fasta
 /usr/bin/phredPhrap
+/usr/bin/recover_consensus_tags
 /usr/bin/removeReads
 /usr/bin/revertToUneditedRead
+/usr/bin/revert_fof
 /usr/bin/selectRegions.perl
 /usr/bin/sff2scf
 /usr/bin/tagRepeats.perl
 /usr/bin/testSocket.perl
 /usr/bin/transferConsensusTags.perl
+/usr/lib
+/usr/lib/screenLibs
+/usr/lib/screenLibs/filter454Reads.fa
+/usr/lib/screenLibs/primerCloneScreen.seq
+/usr/lib/screenLibs/primerSubcloneScreen.seq
+/usr/lib/screenLibs/repeats.fasta
+/usr/lib/screenLibs/sffLinkers.fa
+/usr/lib/screenLibs/singleVectorForRestrictionDigest.fasta
+/usr/lib/screenLibs/vector.seq
 /usr/share
 /usr/share/consed
 /usr/share/consed/examples
@@ -1100,6 +1117,6 @@
 /usr/share/consed/examples/standard/phd_dir/djs74-932.s1.phd.1
 /usr/share/consed/examples/standard/phd_dir/djs74-996.s2.phd.1
 /usr/share/doc
-/usr/share/doc/consed-19
-/usr/share/doc/consed-19/19.0_announcement.txt.bz2
-/usr/share/doc/consed-19/README.txt.bz2
+/usr/share/doc/consed-19-r1
+/usr/share/doc/consed-19-r1/19.0_announcement.txt.bz2
+/usr/share/doc/consed-19-r1/README.txt.bz2

Andrey, wouldn't it be better to place the /usr/lib/screenLibs contents under
/usr/share/consed?


# cat /etc/env.d/99consed 
CONSED_HOME=/usr
PHRED_PARAMETER_FILE=/usr/share/phred/phredpar.dat
# cat /etc/env.d/99phred 
PHRED_PARAMETER_FILE=/usr/share/phred/phredpar.dat
# 

Hope these will be kept in sync. Will changes to these files be respected
by etc-update?
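
Whether two env.d files define the same variable can be checked mechanically. The following is a minimal sketch, assuming plain VAR=value lines as in the two files shown above; the function name duplicate_env_vars is made up here. Variables such as PATH, which env-update merges across files, will show up as well, so the output is a list of candidates to look at, not a list of errors.

```shell
# Print every variable name that is assigned more than once across the files
# of an env.d-style directory (default: /etc/env.d).
duplicate_env_vars() {
    dir=${1:-/etc/env.d}
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        # Keep only the name in front of "=" on simple assignment lines.
        sed -n 's/^\([A-Za-z_][A-Za-z0-9_]*\)=.*/\1/p' "$f"
    done | sort | uniq -d
}
```

Run against a directory holding only the 99consed and 99phred files above, it would report PHRED_PARAMETER_FILE.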
Comment 21 Andreas K. Hüttel archtester gentoo-dev 2010-04-04 22:14:13 UTC
Shouldn't we close this bug since the ebuild is in the tree? 
Please reopen if you think otherwise. 
Comment 22 Martin Mokrejš 2010-04-06 08:39:16 UTC
(In reply to comment #21)
> Shouldn't we close this bug since the ebuild is in the tree? 
> Please reopen if you think otherwise. 
> 

I think /etc/env.d/99consed should not provide the PHRED_PARAMETER_FILE environment variable, as that variable depends on the phred package, which provides it.

Comment 23 Justin Lecher (RETIRED) gentoo-dev 2010-05-22 08:53:43 UTC
+*consed-19-r2 (22 May 2010)
+
+  22 May 2010; Justin Lecher <jlec@gentoo.org> consed-19-r1.ebuild,
+  +consed-19-r2.ebuild:
+  removed PHRED_PARAMETER_FILE env, #253451
+