* EMBOSS version (from "embossversion"): 4.0.0

* Use "seqret" to store a sequence in GenBank format, in lowercase.
* Add a title using "descseq" and convert to Fasta format. Print it
* on screen in EMBL format.

ID   PhiKZ-ORF144 standard; DNA; UNC; 783 BP.
DE   Putative endolysin, complete CDA
SQ   Sequence 783 BP; 254 A; 146 C; 151 G; 232 T; 0 other;
     atgaaagtat tacgcaaagg cgataggggt gatgaggtat gtcaactcca gacactctta        60
     aatttatgtg gctatgatgt tggaaagcca gatggtattt ttggaaataa cacctttaat       120
     caggtagtta aatttcaaaa agataattgt ctagatagtg atggtattgt aggtaagaat       180
     acttgggctg aattattcag taaatattct ccacctattc cttataaaac tatccctatg       240
     ccaactgcaa ataaatcacg tgcagctgca actccagtta tgaatgcagt agaaaatgct       300
     actggcgttc gtagccagtt gctactaaca tttgcttcta ttgaatcagc attcgattac       360
     gaaataaaag ctaagacttc atcagctact ggttggttcc aattccttac tggaacatgg       420
     aaaacaatga ttgaaaatta tggcatgaag tatggcgtac ttactgatcc aactggggca       480
     ttacgtaaag atccacgtat aagtgcttta atgggtgccg aactaattaa agagaatatg       540
     aatattcttc gtcctgtcct taaacgtgaa ccaactgata ctgatcttta tttagctcac       600
     ttctttgggc ctggtgcagc ccgtcgtttc ctgaccactg gccagaatga attagctgct       660
     acccatttcc caaaagaagc tcaggcaaac ccatctattt tttataacaa agatgggtca       720
     cctaaaacca ttcaagaagt ttataactta atggatggta aagttgcagc acatagaaaa       780
     taa                                                                     783
//

* Search for common restriction sites (using "restrict" and the Rebase
* database). We expect to find only a few since viral DNA has evolved to
* protect itself from restriction enzymes. (There is only one hit, for
* the EcoRII enzyme, starting at 610.)

* Could we mutate the sequence to make the viral DNA even more robust to
* restriction systems whilst keeping the same translation product? This
* is investigated using "recoder". (There are seven possible mutations.)

* Translate gene to protein sequence. We trim the termination "*"
* character from the end of the translation. We use "transeq". The
* translation product is printed on screen in SwissProt format.

ID   PhiKZ-ORF144_1 STANDARD; PRT; 260 AA.
DE   Putative endolysin, complete CDA
SQ   SEQUENCE 260 AA; 28816 MW; BD21361996EBBEAB CRC64;
     MKVLRKGDRG DEVCQLQTLL NLCGYDVGKP DGIFGNNTFN QVVKFQKDNC LDSDGIVGKN
     TWAELFSKYS PPIPYKTIPM PTANKSRAAA TPVMNAVENA TGVRSQLLLT FASIESAFDY
     EIKAKTSSAT GWFQFLTGTW KTMIENYGMK YGVLTDPTGA LRKDPRISAL MGAELIKENM
     NILRPVLKRE PTDTDLYLAH FFGPGAARRF LTTGQNELAA THFPKEAQAN PSIFYNKDGS
     PKTIQEVYNL MDGKVAAHRK
//

* Show a few basic statistics about the protein using "pepstats".
* Molecular weight of the protein is 28814.90 Da (in 260 residues).
* Create a hydropathy plot in PNG format to search for transmembrane
* helices. The PNG graph shows three maxima at around residues 75, 100
* and 135.

PEPSTATS of PhiKZ-ORF144_1 from 1 to 260

Molecular weight = 28814.90             Residues = 260
Average Residue Weight = 110.827        Charge = 7.5
Isoelectric Point = 9.4289
A280 Molar Extinction Coefficient = 28590
A280 Extinction Coefficient 1mg/ml = 0.99
Improbability of expression in inclusion bodies = 0.866

Residue         Number          Mole%           DayhoffStat
A = Ala         23              8.846           1.029
B = Asx         0               0.000           0.000
C = Cys         3               1.154           0.398
D = Asp         14              5.385           0.979
E = Glu         12              4.615           0.769
F = Phe         13              5.000           1.389
G = Gly         20              7.692           0.916
H = His         3               1.154           0.577
I = Ile         12              4.615           1.026
J = ---         0               0.000           0.000
K = Lys         21              8.077           1.224
L = Leu         22              8.462           1.143
M = Met         8               3.077           1.810
N = Asn         16              6.154           1.431
O = ---         0               0.000           0.000
P = Pro         15              5.769           1.109
Q = Gln         9               3.462           0.888
R = Arg         11              4.231           0.863
S = Ser         12              4.615           0.659
T = Thr         21              8.077           1.324
U = ---         0               0.000           0.000
V = Val         13              5.000           0.758
W = Trp         3               1.154           0.888
X = Xaa         0               0.000           0.000
Y = Tyr         9               3.462           1.018
Z = Glx         0               0.000           0.000

Property        Residues                        Number          Mole%
Tiny            (A+C+G+S+T)                     79              30.385
Small           (A+B+C+D+G+N+P+S+T+V)           137             52.692
Aliphatic       (I+L+V)                         47              18.077
Aromatic        (F+H+W+Y)                       28              10.769
Non-polar       (A+C+F+G+I+L+M+P+V+W+Y)         141             54.231
Polar           (D+E+H+K+N+Q+R+S+T+Z)           119             45.769
Charged         (B+D+E+H+K+R+Z)                 61              23.462
Basic           (H+K+R)                         35              13.462
Acidic          (B+D+E+Z)                       26              10.000
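
* The translation and restriction-site steps above can be cross-checked
* without EMBOSS. The sketch below is not how "transeq" or "restrict"
* work internally; it is a minimal plain-Python illustration that
* assumes the standard genetic code and the EcoRII recognition site
* CCWGG (W = A or T). The names DNA, translate and find_sites are
* made up for this example.

```python
import re
from itertools import product

# The 783 bp sequence from the EMBL record above (position numbers removed).
DNA = "".join("""
atgaaagtat tacgcaaagg cgataggggt gatgaggtat gtcaactcca gacactctta
aatttatgtg gctatgatgt tggaaagcca gatggtattt ttggaaataa cacctttaat
caggtagtta aatttcaaaa agataattgt ctagatagtg atggtattgt aggtaagaat
acttgggctg aattattcag taaatattct ccacctattc cttataaaac tatccctatg
ccaactgcaa ataaatcacg tgcagctgca actccagtta tgaatgcagt agaaaatgct
actggcgttc gtagccagtt gctactaaca tttgcttcta ttgaatcagc attcgattac
gaaataaaag ctaagacttc atcagctact ggttggttcc aattccttac tggaacatgg
aaaacaatga ttgaaaatta tggcatgaag tatggcgtac ttactgatcc aactggggca
ttacgtaaag atccacgtat aagtgcttta atgggtgccg aactaattaa agagaatatg
aatattcttc gtcctgtcct taaacgtgaa ccaactgata ctgatcttta tttagctcac
ttctttgggc ctggtgcagc ccgtcgtttc ctgaccactg gccagaatga attagctgct
acccatttcc caaaagaagc tcaggcaaac ccatctattt tttataacaa agatgggtca
cctaaaacca ttcaagaagt ttataactta atggatggta aagttgcagc acatagaaaa
taa
""".split()).upper()

# Standard genetic code, codons enumerated in TCAG order (assumption:
# the standard table is the appropriate one for this gene).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(seq, trim=True):
    """Translate frame 1; optionally strip trailing stop characters."""
    codons = (seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3))
    protein = "".join(CODON_TABLE.get(codon, "X") for codon in codons)
    return protein.rstrip("*") if trim else protein


def find_sites(seq, pattern):
    """Return 1-based start positions of every (possibly overlapping) match."""
    return [m.start() + 1 for m in re.finditer(f"(?={pattern})", seq)]


protein = translate(DNA)
ecorii_hits = find_sites(DNA, "CC[AT]GG")

print("DNA length:", len(DNA))
print("Base counts:", {base: DNA.count(base) for base in "ACGT"})
print("Protein length:", len(protein))
print("Protein:", protein)
print("CCWGG sites (1-based):", ecorii_hits)
```

* The printed protein can be compared by eye with the SwissProt record,
* and the base counts with the SQ line of the EMBL record. The site
* scan only covers the forward strand and a single enzyme, so it is far
* narrower than a Rebase-backed "restrict" run.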