It is distributed under the same licence as sci-biology/phrap. The consed binary is compiled from a very large set of C++ files, but it currently does not compile for me. Luckily, both static and dynamic binaries are available for download to registered users (actually, from registered IP addresses). The user has to download the file, but there are a few tools which have to be compiled (ANSI C). Thus, the package name "*-bin" might be slightly misleading. The ebuild installs a nice set of example data into /usr/share/consed-bin/examples. To test, do
$ /usr/share/consed-bin/examples/standard/edit_dir
$ consed &
Happy genome assembling. ;-)
Created attachment 177100 consed-bin-20080723.ebuild
Created attachment 177101 files/consed installed as /etc/env.d/21consed
Created attachment 177103 Manifest
Created attachment 177109 consed-bin-20080723.ebuild
Documented that amd64 users should download a different binary. However, the ebuild still instructs them to download the 32-bit version. Somebody more skilled with portage could improve that minor issue.
Created attachment 177597 consed-bin-20080723.ebuild
Improved support for the different arches. It now also respects the CFLAGS and CC variables.
Created attachment 177599 Manifest
Added checksum for the 64-bit binary.
Looks like we'll need to split this into consed and consed-utils or something like that. The source archive doesn't seem to contain a lot of what the old ebuild handled.
Created attachment 181893 consed-19.ebuild
Created attachment 181900 consed-19.ebuild
various tweaks
Now in gentoo-x86. Please test.

Martin, I have a few comments/questions.
1. I have decided not to break out phd2fasta into its own package since it's an extra manual step for the user to obtain that file from a different person than consed.
2. What's the rationale for naming the env.d script 21consed? I changed it to 99consed (lowest priority).
3. I haven't been able to test whatever you were asking people to test in your comment in the ebuild. Please test it yourself if needed. Also, it's not a good idea to put comments like that in the ebuild; please put them in the bug instead.
(In reply to comment #10)
> Now in gentoo-x86. Please test.
>
> Martin, I have a few comments/questions.
> 1. I have decided not to break out phd2fasta into its own package since
> it's an extra manual step for the user to obtain that file from a different
> person than consed.

My feeling was that virtually everybody applies for the license to all the programs at once, so it is no problem to ask one of the three authors for the current version via email. phd2fasta is distributed via email by Brent Ewing along with phred. So everybody who has phred has, or may get, the latest phd2fasta, regardless of what is packaged with consed. Just my 2c.

> 2. What's the rationale for naming the env.d script 21consed? I changed it to
> 99consed (lowest priority).

My bad guess.
I would go for the -DX86_GCC_LINUX disabled by default (commented in my last consed-bin-20080723.ebuild attachment). Here is what Brent Ewing (upstream) said:

<quote>
Dear Martin;

I do not have a simple answer or direct answer to your question. I will try to explain what I do know.

The IEEE double precision floating point operation specification requires 64-bit precision. Processors such as the SUN SPARC and H-P PA-RISC perform the double precision floating point operations with 64-bit precision, as a result, phred produces identical results when run on any of these processors.

However, the x86 processors (including the AMD64) can use an x87 floating point co-processor to perform the floating point operations and the x87 registers are 80 bits so they have greater precision. However one can force the co-processor to use 64-bit precision by setting bits in the x87 control register. When phred runs on an x86 machine with the floating point operations performed by the x87 co-processor, the results can differ slightly when compared to the results when phred runs on a SUN SPARC, for example. Some people who use phred want the same results regardless of the machine on which it runs because they have a variety of computers on which they run phred and they need consistent results. The 'X86_GCC_LINUX' pre-processor variable sets the x87 control register bit so that the co-processor uses the 64-bit precision. (I have noticed that some Linux distributions seem to force the 64-bit floating point math by default so one would not need to set the X86_GCC_LINUX variable with such distributions.)

In addition, more recent x86 processors have SIMD extensions, 'which refers to the ability to use a Single Instruction on Multiple Data items' (from AMD technical information), such as MMX, SSE128, and so on. Some of these SIMD extensions support double precision floating point operations (I am not familiar with these extensions so I cannot tell you much about them) so a C compiler can produce executable code that uses the extensions for floating point operations rather than the x87 co-processor. (I have the impression that such SIMD double precision floating point operations are made with 64-bit precision but I have not studied this question.)

My guess is that if you run phred on an x86 machine, including the AMD64, and you want it to produce results that are consistent with those it produces when run on 64-bit (IEEE-conformant) machines, you should set the 'X86_GCC_LINUX' pre-processor variable.

So the double precision floating point operation precision can depend upon the C compiler, the math library, the operating system, and the options that you use when you run the C compiler. If you need phred to produce consistent results when run on IEEE-conformant machines and x86 machines, I suggest that you run some test comparisons and check the operating system, math library, and C compiler documentation.

I appreciate your consideration and patience.

Best Wishes,
Brent

On Thu, Jan 01, 2009 at 04:41:09PM +0100, Martin MOKREJŠ wrote:
> > Brent,
> > does this apply also to amd64 targets?
> >
> > When building phred on x86 Linux machines using the GNU C compiler,
> > define X86_GCC_LINUX in the CFLAGS Makefile variable. See the
> > Makefile for additional information. (The GNU C compiler is the
> > C compiler supplied with Linux distributions.) You can find
> > additional information on the x86 FPU control register contents in
> > the Linux system file /usr/include/fpu_control.h.
> >
> > TIA
> > Martin
</quote>
(In reply to comment #10)
> Now in gentoo-x86. Please test.

The ebuild does not fix the Makefiles to respect the user's CFLAGS or CXXFLAGS for mktrace/ and phd2fasta/, and not even for the main build:

# grep CFLAGS /etc/make.conf | grep -v "^#"
CFLAGS="-O2 -march=pentium4 -mmmx -msse -msse2 -pipe -fno-strict-aliasing -ggdb"
CXXFLAGS="${CFLAGS}"
#
[cut]
>>> Compiling source in /var/tmp/portage/sci-biology/consed-19/work ...
make
g++ -DX86_GCC_LINUX -w -DINLINE_RWTPTRORDEREDVECTOR -DINLINE_RWTVALORDEREDVECTOR -DINLINE_MBTVALVECTOR -DLINUX_COMPILE -DSOCKLEN_T_DEFINED -D__BOOL_DEFINED -DANSI_C -DOFSTREAM_OPEN_WITHOUT_PERMISSIONS -fpermissive -DNO_POUND_POUND_MACROS -DUSE_USING_IN_PUBLIC_TEMPLATE_CLASSES -DINT_CHAR_OPERATOR -D_FILE_OFFSET_BITS=64 -O -I/usr/X11R6/include -c findQueryWithinSubject.cpp
[cut]
make -C misc/mktrace
make: Entering directory `/var/tmp/portage/sci-biology/consed-19/work/misc/mktrace'
cc -g -c -o mktrace.o mktrace.c
[cut]
make -C misc/phd2fasta
make: Entering directory `/var/tmp/portage/sci-biology/consed-19/work/misc/phd2fasta'
cc -O -w -c -o phd2fasta.o phd2fasta.c
[cut]

You forgot dobin contributions/*

But it compiles and installs. ;-)
Martin- Thank you for your research. I have fixed the CFLAGS in misc/. As for the main compile, are you sure you're using the right ebuild? It substitutes CFLAGS properly on my machine. If you're sure, please attach your ebuild --info and I'll check this further. I've included contributions/*. These changes will go into the tree in consed-19-r1.
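For readers following along, here is a minimal sketch of one way hard-coded compiler settings in a Makefile can be replaced with the user's values. It is not the change that went into the ebuild: the function name respect_user_flags and the sample Makefile are made up for illustration, and the real misc/mktrace and misc/phd2fasta Makefiles may be laid out differently. It assumes GNU sed and flag values that contain neither ':' nor '&'.

```shell
#!/bin/sh
# respect_user_flags MAKEFILE CC CFLAGS   (hypothetical helper)
# Rewrites top-level "CC = ..." and "CFLAGS = ..." assignments in MAKEFILE
# so that they carry the caller's values instead of the hard-coded ones.
# Assumes GNU sed (for -i) and values without ':' or '&'.
respect_user_flags() {
    makefile=$1
    user_cc=$2
    user_cflags=$3
    sed -i \
        -e "s:^CC[[:space:]]*=.*:CC = ${user_cc}:" \
        -e "s:^CFLAGS[[:space:]]*=.*:CFLAGS = ${user_cflags}:" \
        "$makefile"
}

# Demonstration on a made-up Makefile, not the real misc/mktrace/Makefile.
demo=$(mktemp)
printf 'CC = cc\nCFLAGS = -g\nmktrace: mktrace.o\n' > "$demo"
respect_user_flags "$demo" gcc "-O2 -march=pentium4 -pipe"
cat "$demo"
rm -f "$demo"
```

An alternative that needs no editing at all is to pass the variables on the make command line (make CC=... CFLAGS=...), since command-line assignments take precedence over ordinary assignments inside a Makefile.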
(In reply to comment #14)
> I have fixed the CFLAGS in misc/. As for the main compile, are you sure you're
> using the right ebuild? It substitutes CFLAGS properly on my machine. If
> you're sure, please attach your ebuild --info and I'll check this further.

>>> Emerging (1 of 1) sci-biology/consed-19
 * consed-19-linux.tar.gz RMD160 SHA1 SHA256 size ;-) ... [ ok ]
 * consed-19-sources.tar.gz RMD160 SHA1 SHA256 size ;-) ... [ ok ]
 * checking ebuild checksums ;-) ... [ ok ]
 * checking auxfile checksums ;-) ... [ ok ]
 * checking miscfile checksums ;-) ... [ ok ]
>>> Unpacking source...
>>> Unpacking consed-19-linux.tar.gz to /var/tmp/portage/sci-biology/consed-19/work
>>> Unpacking consed-19-sources.tar.gz to /var/tmp/portage/sci-biology/consed-19/work
>>> Source unpacked in /var/tmp/portage/sci-biology/consed-19/work
>>> Compiling source in /var/tmp/portage/sci-biology/consed-19/work ...
make
g++ -O2 -march=pentium4 -mmmx -msse -msse2 -pipe -fno-strict-aliasing -ggdb -w -DINLINE_RWTPTRORDEREDVECTOR -DINLINE_RWTVALORDEREDVECTOR -DINLINE_MBTVALVECTOR -DLINUX_COMPILE -DSOCKLEN_T_DEFINED -D__BOOL_DEFINED -DANSI_C -DOFSTREAM_OPEN_WITHOUT_PERMISSIONS -fpermissive -DNO_POUND_POUND_MACROS -DUSE_USING_IN_PUBLIC_TEMPLATE_CLASSES -DINT_CHAR_OPERATOR -D_FILE_OFFSET_BITS=64 -O -I/usr/X11R6/include -c findQueryWithinSubject.cpp
[cut]

You are right, it works at least as of now after "emerge --sync; layman -S; emerge --regen".
(In reply to comment #12)
> I would go for the -DX86_GCC_LINUX disabled by default (commented in my last
> consed-bin-20080723.ebuild attachment). Here is what Brent Ewing (upstream)
> said:

Ah, I meant to say that I would enable it by default!
(In reply to comment #0)
> To test, do
> $ cd /usr/share/consed-bin/examples/standard/edit_dir
$ consed &
consed-19-r1 is in the main tree. I have fixed the handling of the phredpar.dat file location and the screenLibs files. I haven't been able to find any mention of the X86_GCC_LINUX variable anywhere in the Consed codebase so I omitted setting it. Let me know if you think it's a serious issue. Please test.
(In reply to comment #18)
> I haven't been able to find any mention of the X86_GCC_LINUX variable anywhere
> in the Consed codebase so I omitted setting it. Let me know if you think it's
> a serious issue.

Hmm, that is an issue with the phred sources, not consed. Sorry for my confusing comment #12. ;-) I will attach a testcase for phred bug #253364 which could be used on different arches to test whether floating-point calculations have the same precision.
--- consed.old 2009-03-16 12:07:23.000000000 +0100
+++ consed.new 2009-03-16 19:29:50.000000000 +0100
@@ -5,6 +5,10 @@
 /usr/bin
 /usr/bin/ace2Fasta.perl
 /usr/bin/ace2Oligos.perl
+/usr/bin/ace2OligosWithComments.perl
+/usr/bin/ace2fof
+/usr/bin/aceContigs2Phds.perl
+/usr/bin/acestatus.pl
 /usr/bin/add454Reads.perl
 /usr/bin/addReads2Consed.perl
 /usr/bin/addSolexaReads.perl
@@ -13,24 +17,37 @@
 /usr/bin/consed
 /usr/bin/countEditedBases.perl
 /usr/bin/determineReadTypes.perl
+/usr/bin/export_cons
 /usr/bin/fasta2Ace.perl
 /usr/bin/fasta2Phd.perl
 /usr/bin/filter454Reads.perl
 /usr/bin/findSequenceMatchesForConsed.perl
 /usr/bin/lib2Phd.perl
 /usr/bin/makePhdBall.perl
+/usr/bin/mergeAces.perl
 /usr/bin/mktrace
 /usr/bin/orderPrimerPairs.perl
 /usr/bin/phd2Ace.perl
 /usr/bin/phd2fasta
 /usr/bin/phredPhrap
+/usr/bin/recover_consensus_tags
 /usr/bin/removeReads
 /usr/bin/revertToUneditedRead
+/usr/bin/revert_fof
 /usr/bin/selectRegions.perl
 /usr/bin/sff2scf
 /usr/bin/tagRepeats.perl
 /usr/bin/testSocket.perl
 /usr/bin/transferConsensusTags.perl
+/usr/lib
+/usr/lib/screenLibs
+/usr/lib/screenLibs/filter454Reads.fa
+/usr/lib/screenLibs/primerCloneScreen.seq
+/usr/lib/screenLibs/primerSubcloneScreen.seq
+/usr/lib/screenLibs/repeats.fasta
+/usr/lib/screenLibs/sffLinkers.fa
+/usr/lib/screenLibs/singleVectorForRestrictionDigest.fasta
+/usr/lib/screenLibs/vector.seq
 /usr/share
 /usr/share/consed
 /usr/share/consed/examples
@@ -1100,6 +1117,6 @@
 /usr/share/consed/examples/standard/phd_dir/djs74-932.s1.phd.1
 /usr/share/consed/examples/standard/phd_dir/djs74-996.s2.phd.1
 /usr/share/doc
-/usr/share/doc/consed-19
-/usr/share/doc/consed-19/19.0_announcement.txt.bz2
-/usr/share/doc/consed-19/README.txt.bz2
+/usr/share/doc/consed-19-r1
+/usr/share/doc/consed-19-r1/19.0_announcement.txt.bz2
+/usr/share/doc/consed-19-r1/README.txt.bz2

Andrey, wouldn't it be better to place the /usr/lib/screenLibs contents under /usr/share/consed?
# cat /etc/env.d/99consed
CONSED_HOME=/usr
PHRED_PARAMETER_FILE=/usr/share/phred/phredpar.dat
# cat /etc/env.d/99phred
PHRED_PARAMETER_FILE=/usr/share/phred/phredpar.dat
#

Hope these will be kept in sync. Will changes to these files be respected by etc-update?
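The duplication shown above can be spotted mechanically. The following is only a rough sketch: the function name duplicated_env_vars is hypothetical, and it assumes env.d files consist of plain NAME=value lines. It lists every variable name that is assigned in more than one file of a directory, here demonstrated on a scratch copy of the two files quoted above.

```shell
#!/bin/sh
# duplicated_env_vars DIR   (hypothetical helper)
# Prints each variable name that is assigned in more than one file under DIR.
duplicated_env_vars() {
    for f in "$1"/*; do
        [ -f "$f" ] || continue
        # Extract NAME from NAME=value lines, counting it once per file.
        sed -n 's/^\([A-Za-z_][A-Za-z0-9_]*\)=.*/\1/p' "$f" | sort -u
    done | sort | uniq -d
}

# Demonstration on a scratch directory mirroring the two files above.
scratch=$(mktemp -d)
printf 'CONSED_HOME=/usr\nPHRED_PARAMETER_FILE=/usr/share/phred/phredpar.dat\n' > "$scratch/99consed"
printf 'PHRED_PARAMETER_FILE=/usr/share/phred/phredpar.dat\n' > "$scratch/99phred"
duplicated_env_vars "$scratch"   # prints PHRED_PARAMETER_FILE
rm -rf "$scratch"
```

On a real system the same function could be pointed at /etc/env.d. A name reported there is not necessarily a problem; it only becomes one when the values in the two files diverge.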
Shouldn't we close this bug since the ebuild is in the tree? Please reopen if you think otherwise.
(In reply to comment #21)
> Shouldn't we close this bug since the ebuild is in the tree?
> Please reopen if you think otherwise.

I think /etc/env.d/99consed should not set the PHRED_PARAMETER_FILE environment variable, since consed depends on the phred package, which already provides it.
+*consed-19-r2 (22 May 2010)
+
+  22 May 2010; Justin Lecher <jlec@gentoo.org> consed-19-r1.ebuild,
+  +consed-19-r2.ebuild:
+  removed PHRED_PARAMETER_FILE env, #253451
+