Genomics
Lomugdang, Fiord Jogardy B.¹
¹Student, School of Chemical Engineering, Chemistry and Biotechnology, Mapua Institute of Technology
ABSTRACT
Determining the complete nucleotide sequence of an organism is the basis of the field of genomics. This experiment used Web resources
to align DNA sequences, find open reading frames (ORFs) in a DNA sequence, and perform a BLAST search of GenBank. All analyses were
carried out on the NCBI bioinformatics Web site. The NCBI ORF Finder tool was used to find the open reading frames of the given DNA
sequence, the pUC19 vector (accession M77789.2), and the ORFs were located and visualized. A BLAST search was then performed on a
second DNA sequence, the human p53 gene (accession AB082923.1). The results show that analysis of the complete nucleotide sequence
is important in the field of genomics.
INTRODUCTION
II. Build a full-length DNA sequence

1. Build a full-length DNA sequence from fragments by lining up the fragments end to end using regions of
   overlap.
2. Using the information on where two fragments overlap, eliminate the overlapping sequence from one of the
   two fragments, and then join the remaining sequence together.
3. Do this for two overlapping sequences in your notebook. Then copy this sequence into one of the BLAST 2
   sequences boxes. Copy from your notebook another sequence from a fragment that overlaps either one of the
   ends of this sequence into the other box, and perform the alignment.
4. Repeat the process of finding and eliminating overlaps until you have joined together all the fragments
   that came from the same original clone.
5. Once you have assembled the DNA sequence for the same gene from a healthy individual and from someone with
   one of the diseases you are studying, you can use the BLAST 2 sequences tool to compare the two sequences.

III. Find open reading frames (ORFs).

1. Open your notebook and find the DNA sequence you wish to analyze. This could be the full-length sequence
   you just built by joining overlapping fragments.
2. To find the open reading frames in this sequence, go to the NCBI Web site on the Tools page:
   http://www.ncbi.nlm.nih.gov:80/Tools/
3. Click on ORF Finder on the Tools page.
4. Copy the DNA sequence from your notebook and paste it into the box that says Sequence in FASTA format.
5. Click on ORF Find. You will get a graphical depiction of the ORFs found in the sequence you entered, as
   well as a list of the ORFs with their lengths.
7. To see the amino acid sequence found in the ORF, click on one of the green boxes that represents the ORF
   within the total sequence (represented by the black outlined box). The amino acid sequence (in one letter
   code) will appear beneath the nucleotide sequence.

IV. Interpret the ORF Finder results.

1. Select one of the open reading frames you think most likely represents the protein sequence encoded by the
   DNA. In some cases, this is a simple choice. For example, if the DNA sequence is from a cDNA, there should
   be no interruptions in the ORF. Usually, but not always, the correct ORF from a cDNA is the longest
   uninterrupted ORF that begins with a methionine.
2. Once you have found what you think is the correct ORF, highlight it on the Web page, and copy and paste it
   into your notebook.
3. Once you have decided which ORF is likely the correct one, click ACCEPT, below the graphical
   representation of the ORFs.
4. On the next Web page, click VIEW just above the graphical representation of the ORFs. This will give you
   a text version of the ORF, which you can copy to your notebook or print.
5. Compare the ORF you have chosen with the amino acid sequence of p53 or beta-globin found in the cDNA
   Cloning module.

V. Perform a BLAST search on a DNA sequence.

1. For most human genes, there exist related genes. These can be either members of a human gene family, or
   related genes in other species. To search the GenBank database, go to the NCBI Web site on the BLAST page:
   http://www.ncbi.nlm.nih.gov:80/BLAST/.
2. Click on Basic BLAST search.
3. Open your notebook and copy the sequence that you wish to BLAST. This can be either a nucleotide (DNA)
   sequence, or an amino acid (protein) sequence.
4. ... database. These settings can be changed with the pull-down menus in the middle of the page.
5. Click SEARCH. You will get a page presenting a request ID number and an estimate of the time it will take
   to give you the results of the search.
6. Click the Format results button to get the search results.
7. To perform the BLAST search with an amino acid sequence, copy the amino acid sequence of p53 or globin
   into the box of the BLAST page and choose blastp as the program.

VI. Read the BLAST search results.

1. The results of a BLAST search are presented in three formats. At the top of the page is a graphical
   representation of the search results. The color of the lines tells you how good a match was found. Place
   the cursor on any line to reveal the name of the sequence that was found to match.
2. Below this list, each of the alignments is given. Click on any line in the graphical representation or on
   the score link in the list of sequences to see the relevant alignment.

VII. Interpret the BLAST search results.

1. In the list of the BLAST search results, find three human genes that are closely related (but not
   identical) to the sequence you entered.
2. Highlight each of the sequence alignments, and copy and paste them into your notebook.
3. If you copy them into a word processing program, set the font to Courier 9 point to maintain the
   formatting of the alignment.
4. Next, find three sequences from other organisms that are closely related to the sequence you entered.
5. Highlight each of the sequence alignments, and copy and paste them into your notebook.
6. Do a BLAST search of the p53 protein sequence against the genome of Drosophila. Did you find a similar
   protein in the Drosophila genome?

1. To access related genomics information, start by clicking on GeneMap99
   http://www.ncbi.nlm.nih.gov:80/genemap/.
2. Next, check out the page titled Genes and Disease http://www.ncbi.nlm.nih.gov:80/disease/ . On the left
   side of this page, you can access information on genes associated with specific diseases, such as cancer.
3. Follow the cancer link http://www.ncbi.nlm.nih.gov:80/disease/Cancer.html . At the top of the page are
   numbers referring to human chromosomes.
4. Click on one of these to see a diagram of the chromosome and the location of known genes that are
   implicated in cancer. For example, the p53 gene is located on chromosome 17, as is the BRCA1 gene,
   mutations in which can lead to breast cancer.
5. On the left side of the cancer page are links to specific cancers and genes implicated in tumorigenesis.
   Click the p53 tumor suppressor link http://www.ncbi.nlm.nih.gov:80/disease/p53.html.
6. On the left side of this page, you can follow links to articles about p53. You can also search PubMed
   http://www.ncbi.nlm.nih.gov:80/entrez/query.fcgi?db=PubMed/ for articles on any biomedical subject.
7. Click on LocusLink http://www.ncbi.nlm.nih.gov:80/LocusLink/ where you can find information on the p53
   genetic locus, its map information, a list of GenBank sequences, and a list of additional Web resources
   about p53.
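
The fragment-joining procedure in part II (find the overlap between two fragments, drop it from one of them, and join the remainder) can be sketched in Python. This is a minimal illustration only, not the algorithm used by the BLAST 2 sequences tool: it assumes the fragments are on the same strand, are listed in 5' to 3' order, and overlap exactly, with no sequencing errors. The function names, the min_overlap parameter, and the example fragments are hypothetical.

```python
def merge_overlapping(left: str, right: str, min_overlap: int = 10) -> str:
    """Join two fragments where the end of `left` overlaps the start of `right`.

    Assumption: the overlap is an exact match (no mismatches or gaps).
    """
    left, right = left.upper(), right.upper()
    longest_possible = min(len(left), len(right))
    # Try the longest candidate overlap first, then shrink.
    for size in range(longest_possible, min_overlap - 1, -1):
        if left.endswith(right[:size]):
            # Drop the overlapping bases from `right` and join the rest.
            return left + right[size:]
    raise ValueError("no overlap of at least min_overlap bases found")


def assemble(fragments, min_overlap: int = 10) -> str:
    """Chain fragments that are already listed in 5' -> 3' order."""
    contig = fragments[0]
    for fragment in fragments[1:]:
        contig = merge_overlapping(contig, fragment, min_overlap)
    return contig


if __name__ == "__main__":
    # Made-up fragments for illustration; they are not from pUC19 or p53.
    fragments = ["ATGGCCATTGTAATG", "TGTAATGGGCCGCTG", "GCCGCTGAAAGGGTGA"]
    print(assemble(fragments, min_overlap=5))
```

Real sequencing reads usually contain errors and may come from either strand, which is why the procedure relies on an alignment tool rather than exact string matching.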
The NCBI ORF Finder tool was used to find the open reading frames of the given DNA sequence, the pUC19
vector (accession M77789.2). By following the steps given in the procedure, the ORFs of the given DNA
were located and visualized.
The correct ORF was also identified in the results. This ORF most likely represents the protein sequence
encoded by the DNA. For a cDNA, the correct ORF is usually the longest uninterrupted ORF that begins
with a methionine.

Figure 12. The longest, uninterrupted ORF of pUC19.
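
The rule of thumb used above (pick the longest uninterrupted ORF that begins with a methionine) can be illustrated with a short Python sketch. It is not the NCBI ORF Finder implementation. It assumes the standard genetic code, requires an ATG start codon and an in-frame stop codon, and scans all six reading frames. The function names and the example sequence are hypothetical.

```python
# Standard genetic code, built in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO_ACIDS))
STOP_CODONS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq: str) -> str:
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq: str) -> str:
    """Translate in one-letter code; unknown codons become 'X'."""
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(0, len(seq) - 2, 3))


def find_orfs(seq: str, min_codons: int = 1):
    """Return (strand, frame, start, end, protein) for each ATG...stop ORF.

    Coordinates are 0-based on the scanned strand; `end` includes the stop.
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first methionine opens the ORF
                elif start is not None and codon in STOP_CODONS:
                    if (i - start) // 3 >= min_codons:
                        orfs.append((strand, frame + 1, start, i + 3,
                                     translate(s[start:i])))
                    start = None  # look for the next ORF in this frame
    return orfs


def longest_orf(seq: str):
    return max(find_orfs(seq), key=lambda orf: len(orf[4]))


if __name__ == "__main__":
    # Made-up sequence for illustration; it is not taken from pUC19.
    print(longest_orf("GGATGGCTTGGAAATAACC"))
```

As the procedure notes, the longest ORF is usually, but not always, the correct one, so a result like this still has to be checked against the known protein sequence.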
CONCLUSION