Deciphering the information provided in gene annotations

See the attached file.

Describe the E-value distribution: are the values small (close to zero) or large (close to 1)?

Are the subject sequences from the top 2 BLAST hits from a genus and/or a phylum that you would expect to be closely related to T. oceani?

Are there any inconsistencies in the T-Coffee results, such as large polypeptide regions that do not match? If so, where?

Using WebLogo, identify 3 highly conserved amino acids, and give the location, identity, and chemistry of each.

Do you have stretches of conservation? If so, where?

Solution Summary

This solution contains a detailed description of how to understand some of the information provided in gene annotations. The problem includes annotations associated with a gene from T. oceani in the form of an .htm file. Specific queries about E-value distributions, relatedness, and conservation, based on BLAST, T-Coffee, and WebLogo analyses, are answered in reference to the annotated gene, along with a general description of each of these modules, in a .doc format. The problem and solution contain links to informational websites about gene annotations and to online sources for finding annotations for your own gene of interest. After working through this problem/solution pair, you should be able to make use of the links to the online tools provided in the problem to find stretches of conservation in your gene/protein of interest, extract information about conserved amino acid residues, and determine how closely related other genes are to your query.

Solution Preview

See the attached 'biotinformatics question.doc' file for the WebLogo images.

Q. Describe the E-value distribution. Are the values small (close to zero) or large (close to 1)?

http://www.ncbi.nlm.nih.gov/blast

A. The E-value, or expect value, is a statistical measure of how significant the similarity between two sequences is. It estimates how many hits with a score at least as good as the one observed you would expect to find by random chance in a database of the size searched. Because it is an expected count rather than a probability, it is not capped at 1 and can be much larger for poor matches. The two E-values you provided, 2e-131 and 7e-45, are very small and can be said to be close to zero. This indicates that both subject sequences are very similar to the query: for a given search, the smaller the E-value, the higher the alignment score and the closer the resemblance. It is interesting to note that in the first example the query appears to be identical to the BLAST hit, so it should give you as low an E-value as you can get for the parameters you used.
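To make the link between score and E-value concrete, here is a minimal sketch based on the standard Karlin-Altschul relation E = m * n * 2^(-S'), where S' is the bit score, m the query length, and n the total database length. The function names and the search-space sizes below are assumptions chosen for illustration only; they are not taken from the attached file, and BLAST itself uses length-corrected "effective" values, so the numbers it reports will differ somewhat.

```python
import math


def expect_value(bit_score: float, query_len: int, db_len: int) -> float:
    """Approximate E-value from a bit score: E = m * n * 2**(-S').

    m is the query length and n the total database length. This ignores
    BLAST's effective-length correction, so treat it as an estimate.
    """
    return query_len * db_len * 2.0 ** (-bit_score)


def bit_score_needed(e_value: float, query_len: int, db_len: int) -> float:
    """Invert the same formula: the bit score that yields a given E-value."""
    return math.log2(query_len * db_len / e_value)


if __name__ == "__main__":
    # Hypothetical search space (not from the original problem).
    m, n = 300, 1_000_000_000
    # The two E-values quoted in the answer, plus E = 1 for comparison.
    for e in (2e-131, 7e-45, 1.0):
        bits = bit_score_needed(e, m, n)
        print(f"E = {e:g} corresponds to roughly {bits:.0f} bits")
```

Each additional bit of score halves the E-value, which is why hits as strong as the two above end up with values that are effectively zero.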

For an explanation of the scoring statistics given in a BLAST search:
http://www.ncbi.nlm.nih.gov/BLAST/tutorial/Altschul-1.html

The following .pdf gives a nice, plain explanation of the E-value, including the ranges you can get and how to analyze them.
http://www.clcbio.com/sciencearticles/BE-blast.pdf

Q. Is the subject sequence from the top 2 BLAST hits from a genus and/or a phylum ...
