Bioinformatics questions

I am reviewing for a course in computational biochemistry (bioinformatics) and have prepared a set of problems to help me get ready for it. I would like assistance with this problem set to make sure that I have fully reviewed the necessary material. Five sources are provided throughout.


1) Bioinformatics is the study of biology through computer science and information technology-based approaches. It is used to mine databases for information and patterns that the human eye cannot recognize, as well as to parse (sort and shorten) the massive volume of information that we have collected in over 100 years of biological study.

Second generation:
454- Pyrosequencing is used to rapidly determine DNA sequence. In pyrosequencing, the four nucleotides are added one type at a time to a growing chain of DNA (for example, adenines, then thymines, etc.). At each stage the amount of pyrophosphate that is released is measured (it is converted enzymatically into a light signal). This tells you which nucleotides are being incorporated and how many of each (if you see one unit of adenine incorporated, then no thymine, then two units of cytosine, you can deduce the sequence as ACC). It is a rapid and effective high-throughput methodology, but it is prohibitively expensive (right now, at least).
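The ACC deduction above can be sketched in a few lines of Python. This is only an illustrative sketch, not how any real instrument's software works: the function name `decode_flowgram`, the flow order, and the signal values are all made up for the example, and it assumes the signal scales linearly with the number of bases incorporated in each flow.

```python
def decode_flowgram(flow_order, signals):
    """Turn per-flow signal intensities into a base sequence.

    flow_order: the nucleotide added in each flow, e.g. "ATC" (hypothetical).
    signals: one normalized intensity per flow, where ~1.0 is assumed to
             correspond to a single incorporated base.
    """
    sequence = []
    for base, signal in zip(flow_order, signals):
        # Round the signal to the nearest whole number of incorporations;
        # a value near 0 means this nucleotide was not incorporated.
        count = max(0, round(signal))
        sequence.append(base * count)
    return "".join(sequence)


# One unit of A, no T, two units of C (invented, slightly noisy values).
print(decode_flowgram("ATC", [1.05, 0.08, 1.93]))  # ACC
```

Note that long runs of the same base (homopolymers) are harder to call this way, because the difference between, say, 7 and 8 units of signal is small relative to the noise.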

Illumina- Fluorescent tags allow for the real-time analysis of base-by-base incorporation of nucleotides. Different colors allow you to distinguish which base was incorporated. It is a useful technique capable of analyzing small libraries and RNA, but it is less accurate than the SOLiD approach.
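The idea of reading one base per cycle from its color can be sketched as follows. This is a deliberately simplified illustration, assuming one fluorescence channel per base in the order A, C, G, T and invented intensity values; the name `call_bases` is hypothetical, and real base callers also correct for effects such as cross-talk between channels.

```python
CHANNELS = "ACGT"  # assumed order of the four color channels


def call_bases(intensities):
    """Call one base per cycle by picking the brightest of four channels.

    intensities: a list of (A, C, G, T) intensity tuples, one per cycle.
    """
    calls = []
    for cycle in intensities:
        # Index of the channel with the highest intensity in this cycle.
        brightest = max(range(len(CHANNELS)), key=lambda i: cycle[i])
        calls.append(CHANNELS[brightest])
    return "".join(calls)


# Three sequencing cycles with invented intensities.
cycles = [
    (0.9, 0.1, 0.0, 0.1),  # brightest channel: A
    (0.1, 0.8, 0.1, 0.0),  # brightest channel: C
    (0.0, 0.1, 0.1, 0.9),  # brightest channel: T
]
print(call_bases(cycles))  # ACT
```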

SOLiD- A library of fragmented DNA is created on magnetic beads, and a universal primer (complementary to the P1 sequence, which is added to each fragment prior to coupling to the beads) hybridizes to every fragment in the library. Fluorescently labeled probes then bind next to the primer and are ligated to it. Bound probes are detected and then cleaved, allowing for the rapid generation of massive amounts of data. The main advantage is the accuracy of the method, which is better than that of Illumina sequencing. It is not possible to perform the analysis on short DNA libraries, however, which is a disadvantage relative to Illumina. SOLiD also requires a significant amount of hands-on input, much more than Illumina.

Third Generation: Here is a useful link for information about Pacific bio and 2nd gen sequencing in general:
Pacific Bio- Utilizes tiny pores to trap single-molecule interactions. Basically, a polymerase is trapped inside a microchannel along with a template strand. Nucleotides carrying four different fluorescent tags are added, and they diffuse into the channel. Using optics technology, a very tiny area is illuminated with excitation light, causing ...

