The more sequences you align, the better. Don't include very similar (>80%) sequences. Sub-groups should be pre-aligned separately, and one member of each sub-group should be included in the final multiple alignment.
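The 80% rule above can be applied with a simple greedy filter. The sketch below is only an illustration: it uses `difflib.SequenceMatcher.ratio()` from the Python standard library as a rough stand-in for percent similarity (a real workflow would compute identity from a pairwise alignment), and the function names and toy sequences are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Rough similarity in [0, 1]; an assumption standing in for percent identity."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def filter_redundant(seqs, threshold=0.80):
    """Keep a sequence only if it is not more than `threshold` similar to any kept one."""
    kept = {}
    for name, seq in seqs.items():
        if all(similarity(seq, other) <= threshold for other in kept.values()):
            kept[name] = seq
    return kept


# Toy sequences (hypothetical): "b" differs from "a" by a single base.
toy = {"a": "ACGTACGTAC", "b": "ACGTACGTAT", "c": "TTTTGGGGCC"}
print(list(filter_redundant(toy)))
```

The filter is order dependent: the first member of a similar group is the one that survives, which matches the idea of keeping one representative per sub-group.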
Output of Pileup
!!NA_MULTIPLE_ALIGNMENT 1.0
PileUp of: @tnf.list

Symbol comparison table: GenRunData:pileupdna.cmp  CompCheck: 68

GapWeight: 5
GapLengthWeight: 1

tnf.msf  MSF: 1706  Type: N  August 12, 1997 08:10  Check: 5044

Name: OATNFA1    Len: 1706  Check: 5831  Weight: 1.00
Name: OATNFAR    Len: 1706  Check: 7533  Weight: 1.00
Name: BSPTNFA    Len: 1706  Check: 1732  Weight: 1.00
Name: CEU14683   Len: 1706  Check: 6670  Weight: 1.00
Name: HSTNFR     Len: 1706  Check:  191  Weight: 1.00
Name: SYNTNFTRP  Len: 1706  Check: 3706  Weight: 1.00
Name: CATTNFAA   Len: 1706  Check: 7430  Weight: 1.00
Name: CFTNFA     Len: 1706  Check: 2566  Weight: 1.00
Name: RABTNFM    Len: 1706  Check: 5089  Weight: 1.00
Name: RNTNFAA    Len: 1706  Check: 4296  Weight: 1.00
//
           1
OATNFA1    ~~~~~~~~~~ ~~~~~~~~~~ ~GGCCAAGAG
OATNFAR    ~~~~~GGGAC ACCAGGGGAC CAGCCAAGAG
BSPTNFA    ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~
CEU14683   ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~
HSTNFR     ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~GCAGA
SYNTNFTRP  AGCAGACGCT CCCTCAGCAA GGACAGCAGA
CATTNFAA   ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~
CFTNFA     ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~
RABTNFM    ~~~~AAGCTC CCTCAGTGAG GACACGGGCA
RNTNFAA    ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~
           401
OATNFA1    TTCAG..... .ACACTCAGG TCATCTTCTC AAGC
OATNFAR    TTCAG..... .ACACTCAGG TCATCTTCTC AAGC
BSPTNFA    TTCAA..... .ACACTCAGG TCCTCTTCTC AAGC
CEU14683   TTCAG..... .ACCCTCAGG TCATCTTCTC AAGC
HSTNFR     CCCAG..... .GCAGTCAGA TCATCTTCTC GAAC
SYNTNFTRP  CCCAG..... .GCAGTCAGA TCATCTTCTC GAAC
CATTNFAA   CCCAG..... .ACACTCAGA TCATCTTCTC GAAC
CFTNFA     TCCAG..... .ACAGTCAAA TCATCTTCTC GAAC
RABTNFM    CCCAGATGGT CACCCTCAGA TCAGCTTCTC GGGC
RNTNFAA    CCCAGACCCT CACACTCAGA TCATCTTCTC AAAA
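An MSF file like the PileUp output above has a header of `Name:` lines, a `//` separator, and then blocks of aligned sequence. The following is a minimal sketch of reading that layout, assuming each header line carries `Name:`, `Len:`, `Check:` and `Weight:` in that order; `parse_msf` and the two-sequence excerpt are illustrative, not part of GCG.

```python
import re

# One header entry: Name, Len, Check, Weight (field order assumed as in the output above).
NAME_RE = re.compile(
    r"Name:\s+(\S+)\s+Len:\s+(\d+)\s+Check:\s+(\d+)\s+Weight:\s+([\d.]+)"
)


def parse_msf(text):
    """Return {name: {"len", "check", "weight", "seq"}} from MSF-style text."""
    header, _, body = text.partition("//")
    entries = {}
    for match in NAME_RE.finditer(header):
        name, length, check, weight = match.groups()
        entries[name] = {
            "len": int(length),
            "check": int(check),
            "weight": float(weight),
            "seq": "",
        }
    # Sequence lines start with a known name; coordinate lines are skipped.
    for line in body.splitlines():
        parts = line.split()
        if parts and parts[0] in entries:
            entries[parts[0]]["seq"] += "".join(parts[1:])
    return entries


# Two-sequence excerpt built from the values shown above (only the block at 401).
SAMPLE = """\
 Name: OATNFA1  Len: 1706  Check: 5831  Weight: 1.00
 Name: RABTNFM  Len: 1706  Check: 5089  Weight: 1.00
//
           401
OATNFA1    TTCAG..... .ACACTCAGG TCATCTTCTC AAGC
RABTNFM    CCCAGATGGT CACCCTCAGA TCAGCTTCTC GGGC
"""

msf = parse_msf(SAMPLE)
for name, entry in msf.items():
    print(name, entry["check"], entry["seq"])
```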
PileUp considerations
PileUp performs a global multiple alignment, so it works well for a group of similar sequences. It will not find the best local region of similarity (such as a shared motif) among distantly related sequences. PileUp always aligns every sequence listed in the input file, even if some are unrelated, and the alignment can be degraded if some of the sequences are only distantly related.
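The difference between global and local alignment can be seen on a single pair of sequences. The sketch below is not PileUp's implementation; it is a plain dynamic-programming scorer with assumed toy parameters (match +1, mismatch -1, linear gap -1) that compares a global score with a local score for two hypothetical sequences sharing only the motif GATTACA.

```python
def global_score(a, b, match=1, mismatch=-1, gap=-1):
    """Best end-to-end (global) alignment score with a linear gap penalty."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * gap]
        for j in range(1, len(b) + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            cur.append(max(prev[j - 1] + s, prev[j] + gap, cur[j - 1] + gap))
        prev = cur
    return prev[-1]


def local_score(a, b, match=1, mismatch=-1, gap=-1):
    """Best-scoring local region; scores are never allowed to drop below zero."""
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0]
        for j in range(1, len(b) + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            value = max(0, prev[j - 1] + s, prev[j] + gap, cur[j - 1] + gap)
            cur.append(value)
            best = max(best, value)
        prev = cur
    return best


# Hypothetical sequences: unrelated flanks around a shared GATTACA motif.
seq_a = "CCCCCGATTACACCCCC"
seq_b = "TTTGATTACATTT"
print("global:", global_score(seq_a, seq_b))
print("local: ", local_score(seq_a, seq_b))
```

The global score is dragged down by the unrelated flanks that must be aligned end to end, while the local score reflects only the shared motif.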
ShadyBox
ShadyBox is a multiple alignment editor that lets you box and shade residues or segments of multiply aligned sequences. It works on an MSF or Pretty output file and produces a PostScript output file; the original input file is not changed. You can save your work midway, exit the program, and resume at a later stage.
ShadyBox Output
Running ClustalW
[~]% clustalw
 **************************************************************
 ******** CLUSTAL W (1.7) Multiple Sequence Alignments ********
 **************************************************************

     1. Sequence Input From Disc
     2. Multiple Alignments
     3. Profile / Structure Alignments
     4. Phylogenetic trees

     S. Execute a system command
     H. HELP
     X. EXIT (leave program)

Your choice:
The input file for ClustalW is a single file containing all of the sequences in one of the following formats: NBRF/PIR, EMBL/SwissProt, Pearson (Fasta), GDE, Clustal, GCG/MSF, RSF.
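As an illustration of the "all sequences in one file" requirement, the sketch below writes several sequences in Pearson (Fasta) format, one of the accepted formats. The helper name `to_fasta`, the 60-column line width, and the short sequence fragments are assumptions made for the example.

```python
def to_fasta(records, width=60):
    """Format {name: sequence} as Fasta text: a '>' header line, then wrapped sequence."""
    lines = []
    for name, seq in records.items():
        lines.append(f">{name}")
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(lines) + "\n"


# Hypothetical short fragments; real entries would hold the full sequences.
records = {
    "HSTNFR": "GGGAAGAGTTCCCCAGGGACC",
    "RABTNFM": "AGGAGGAAGAGTCCCCAAACA",
}
print(to_fasta(records), end="")
```

Writing the returned text to a file (for example with `open(path, "w")`) gives a single input file that holds every sequence.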
Using ClustalW
****** MULTIPLE ALIGNMENT MENU ******

    1. Do complete multiple alignment now (Slow/Accurate)
    2. Produce guide tree file only
    3. Do alignment using old guide tree file
    4. Toggle Slow/Fast pairwise alignments = SLOW
    5. Pairwise alignment parameters
    6. Multiple alignment parameters
    7. Reset gaps between alignments? = OFF
    8. Toggle screen display          = ON
    9. Output format options

    S. Execute a system command
    H. HELP
    or press [RETURN] to go back to main menu

Your choice:
Output of ClustalW
CLUSTAL W (1.7) multiple sequence alignment

HSTNFR      GGGAAGAG---TTCCCCAGGGACCTCTCTCTAATCAGCCCTCTGGCCCAG------G
SYNTNFTRP   GGGAAGAG---TTCCCCAGGGACCTCTCTCTAATCAGCCCTCTGGCCCAG------G
CFTNFA      -------------------------------------------TGTCCAG------A
CATTNFAA    GGGAAGAG---CTCCCACATGGCCTGCAACTAATCAACCCTCTGCCCCAG------A
RABTNFM     AGGAGGAAGAGTCCCCAAACAACCTCCATCTAGTCAACCCTGTGGCCCAGATGGTCA
RNTNFAA     AGGAGGAGAAGTTCCCAAATGGGCTCCCTCTCATCAGTTCCATGGCCCAGACCCTCA
OATNFA1     GGGAAGAGCAGTCCCCAGCTGGCCCCTCCTTCAACAGGCCTCTGGTTCAG------A
OATNFAR     GGGAAGAGCAGTCCCCAGCTGGCCCCTCCTTCAACAGGCCTCTGGTTCAG------A
BSPTNFA     GGGAAGAGCAGTCCCCAGGTGGCCCCTCCATCAACAGCCCTCTGGTTCAA------A
CEU14683    GGGAAGAGCAATCCCCAACTGGCCTCTCCATCAACAGCCCTCTGGTTCAG------A
                                                           **
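The `*` characters under a ClustalW alignment mark columns in which every sequence has the same residue. A minimal sketch of that marking, assuming equal-length aligned strings with `-` for gaps; `conservation_line` and the three toy fragments are hypothetical and much shorter than the alignment shown.

```python
def conservation_line(aligned):
    """Return a string with '*' under columns where all sequences share one residue."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must all have the same length")
    marks = []
    for column in zip(*aligned):
        first = column[0]
        # A column of gaps is not counted as conserved.
        conserved = first != "-" and all(c == first for c in column)
        marks.append("*" if conserved else " ")
    return "".join(marks)


# Toy aligned fragments (hypothetical), printed with their conservation line.
toy_alignment = ["GGCCCAG-G", "TGTCCAG-A", "GGTTCAA-A"]
for seq in toy_alignment:
    print(seq)
print(conservation_line(toy_alignment))
```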
ClustalW options
Your choice: 5

********* PAIRWISE ALIGNMENT PARAMETERS *********

Slow/Accurate alignments:

    1. Gap Open Penalty       :15.00
    2. Gap Extension Penalty  :6.66
    3. Protein weight matrix  :BLOSUM30
    4. DNA weight matrix      :IUB

Fast/Approximate alignments:

    5. Gap penalty            :5
    6. K-tuple (word) size    :2
    7. No. of top diagonals   :4
    8. Window size            :4

    9. Toggle Slow/Fast pairwise alignments = SLOW

    H. HELP

Enter number (or [RETURN] to exit):
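The same pairwise parameters can usually be passed on the command line instead of through the menus. The sketch below only builds the argument list and does not run the program. The flag names (`-INFILE`, `-ALIGN`, `-TYPE`, `-OUTFILE`, `-PWGAPOPEN`, `-PWGAPEXT`, `-PWDNAMATRIX`, `-QUICKTREE`, `-KTUPLE`, `-TOPDIAGS`, `-WINDOW`, `-PAIRGAP`) are given as I recall them from the ClustalW command-line help and should be checked against your installed version; the file names are hypothetical.

```python
def clustalw_command(infile, outfile, fast=False,
                     pw_gap_open=15.00, pw_gap_ext=6.66, pw_dna_matrix="IUB",
                     ktuple=2, topdiags=4, window=4, pairgap=5):
    """Build an argument list for a DNA alignment using the menu defaults shown above."""
    cmd = ["clustalw", f"-INFILE={infile}", "-ALIGN", "-TYPE=DNA",
           f"-OUTFILE={outfile}"]
    if fast:
        # Fast/Approximate pairwise stage (menu items 5 to 8).
        cmd += ["-QUICKTREE", f"-KTUPLE={ktuple}", f"-TOPDIAGS={topdiags}",
                f"-WINDOW={window}", f"-PAIRGAP={pairgap}"]
    else:
        # Slow/Accurate pairwise stage (menu items 1, 2 and 4).
        cmd += [f"-PWGAPOPEN={pw_gap_open}", f"-PWGAPEXT={pw_gap_ext}",
                f"-PWDNAMATRIX={pw_dna_matrix}"]
    return cmd


# Hypothetical file names; pass the list to subprocess.run(cmd, check=True) to execute.
cmd = clustalw_command("tnf.fasta", "tnf.aln")
print(" ".join(cmd))
```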
Your choice: 6
********* MULTIPLE ALIGNMENT PARAMETERS *********
ClustalX
The BLOCKS WWW server can be used to build blocks from a group of sequences, or to compare a protein sequence against a database of blocks.
The Blocks Searcher tool should be used for multiple alignment of distantly related protein sequences.