DNA Sequencing and Fragment Analysis

Posts tagged ‘Troubleshooting Sequencing’

Sanger Sequencing: Historical Development of Automated DNA Sequencing

During the 1970s, Frederick Sanger developed a new technique for determining the base sequence of DNA. The design of his method is still very popular today. Sanger added dideoxynucleotides (ddNTPs) to the normal deoxynucleotides (dNTPs) in a primer extension reaction catalyzed by DNA polymerase. Because a ddNTP lacks the 3 prime hydroxyl group needed to attach the next base, extension stops wherever one is incorporated, so the reaction yields products terminated at random positions rather than full-length copies. The products were radioactively labeled for detection. The end result of the Sanger chemistry is a series of new DNA products with sizes increasing by a single base.

The reaction products were separated on a polyacrylamide gel, with smaller products migrating faster than larger ones. When detected, the result looked like a ladder.

The basics of the chemistry remain the same today. What has changed in Sanger sequencing are the methods used to separate and detect the products.

Original Base Labels were Radioactive

Before fluorescent dye labels were available for sequencing, the reaction products were labeled with a radioactive tag. Because the label could not distinguish one base from another, each sample was run in four separate tubes, one for each of the four bases that make up DNA. The products were loaded into four separate lanes and allowed to migrate. Once the run was complete, the gel was exposed to x-ray film to view the result, as shown in figure 1.
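The logic of reading a four-lane gel can be sketched in a few lines of Python. This is a toy model, not a simulation of the real chemistry: the template sequence and the function names (synthesized_strand, termination_lengths, read_ladder) are assumptions introduced only for illustration. Each lane holds the lengths of the products that ended in one particular ddNTP, and reading the bands from shortest to longest recovers the sequence of the newly made strand.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def synthesized_strand(template):
    """Return the strand a polymerase would build on the template, written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(template))

def termination_lengths(new_strand):
    """Group product lengths by the base that terminated them.

    Each list corresponds to one of the four reaction tubes (one gel lane).
    """
    lanes = {"A": [], "C": [], "G": [], "T": []}
    for length, base in enumerate(new_strand, start=1):
        lanes[base].append(length)
    return lanes

def read_ladder(lanes):
    """Read the gel from the shortest product to the longest."""
    bands = [(length, base) for base, lengths in lanes.items() for length in lengths]
    return "".join(base for _, base in sorted(bands))

template = "TACGGATC"  # hypothetical template, written 5'->3'
new_strand = synthesized_strand(template)
lanes = termination_lengths(new_strand)
print(lanes)               # product lengths found in each lane
print(read_ladder(lanes))  # sequence read off the ladder
```

The sequence read from the ladder is that of the synthesized strand, which is the reverse complement of the template.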

Fluorescent Labeled ddNTPs Replaced Radioactive Labels

Automation replaced the manual sequencing method with the development of fluorescently labeled ddNTPs. Slab gels were still poured. However, because each of the four bases carried a different colored label, all four could be combined in a single reaction and loaded together. The samples separated in the gel during electrophoresis. Once the products reached a region near the bottom of the plates, a laser excited the fluorescent labels and the color was recorded by camera. The resulting images were collected by computer and analyzed, as shown in figure 2.

Development of the automated chemistry allowed sequencing to be performed much faster. First, only one lane on the gel was required for each sample instead of four. Second, sequence was recorded while electrophoresis was in progress. The smaller, faster-migrating bands could be run completely through the gel while larger bands were still being recorded, so more bases could be determined from each run.

One potential problem was the plates. Because sequence results were recorded through the glass, the plates needed to be free of debris and scratches. Plates also needed to be clean for the radioactive method, but the requirement was more stringent for automated sequencing.

A significant amount of sequencing was performed on automated slab gel instruments. But researchers still needed to pour gels, and electrophoresis ran for 12 hours or longer.

Capillary Sanger Sequencing

Capillary development occurred during the 1990s, beginning with a single-capillary machine, the ABI 310. The new technology eliminated any need for pouring gels. Instead, semi-liquid polymer was injected into the capillary before each run, as shown in figure 3. Individual runs reduced the electrophoresis time to less than 3 hours.

The automated process changed very little from slab gel machines to capillary. But capillary sequencing was faster and more sensitive, and it required much less DNA in the sequencing reaction. Better chemistries were developed in conjunction with automated sequencing. Capillary sequencers today typically run 4 to 96 samples at a time, and each run generally requires 2 to 2 ½ hours of electrophoresis. Automated injection of samples allows hands-off operation through multiple sets of samples.

Frederick Sanger developed an important method for sequencing DNA. It allowed early completion of the Human Genome Project, a genome of 3 billion bases. Although next-generation sequencing has expanded sequencing capabilities, Sanger sequencing is still used for small sections of DNA, often in medical research.


Double Sequence: A Common Problem in Sanger Sequencing

Sanger sequencing methodology requires template DNA to be relatively free from contaminating salts. Template quality is often checked electrophoretically by loading the template on an agarose gel. For a quality plasmid preparation, the gel typically shows faster-moving supercoiled DNA and slower-moving nicked DNA. DNA samples produced by PCR amplification can be tested the same way. A good PCR product appears as a single clean band of the desired length.

Another test is to measure absorbance spectrophotometrically at the ultraviolet wavelengths 260 nm and 280 nm. A 260 nm / 280 nm ratio of about 1.8 indicates that the DNA is good.
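The ratio check is simple arithmetic and can be sketched as follows. The function names and the tolerance of 0.1 around the 1.8 target are assumptions made for this example, not values taken from any instrument or kit documentation, and the absorbance readings are made up.

```python
def purity_ratio(a260, a280):
    """Return the A260/A280 ratio for a DNA sample."""
    if a280 <= 0:
        raise ValueError("A280 must be positive")
    return a260 / a280

def looks_pure(a260, a280, target=1.8, tolerance=0.1):
    """True when the ratio is within target +/- tolerance.

    The tolerance is an illustrative assumption; choose one that suits your lab.
    """
    return abs(purity_ratio(a260, a280) - target) <= tolerance

# Hypothetical readings for two samples
print(purity_ratio(0.90, 0.50), looks_pure(0.90, 0.50))
print(purity_ratio(0.75, 0.50), looks_pure(0.75, 0.50))
```

A passing ratio only says the sample is not grossly contaminated with material absorbing at 280 nm. It says nothing about whether the template is a single species.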

Unfortunately, a template that passes both tests can still come off the capillary sequencer showing a double set of peaks. Because one set of peaks lies directly on top of the other, the software cannot determine the correct bases to call, as in figure 1. A double set of peaks is not as uncommon as many might believe.

What are the primary causes of double sequence?

Double Sequence Caused by Template Contamination

One of the early steps in plasmid DNA preparation is to select a desired colony and inoculate liquid medium. In an overnight culture, the inoculated medium goes through rapid bacterial cell division as the cells enter the log growth phase. Because the initial cells came from a single colony, it is assumed that every cell is an identical clone of a single cell. However, this is not always the case. A colony can also be the product of multiple cells that failed to separate when spread on a plate. If those cells are not identical, the sequence result could show the double sequence in figure 1.

The region where double sequence begins is usually a restriction site where a separate piece of DNA is inserted into a vector. One purpose of bacterial cloning is to produce copies of the inserted DNA. The clean sequence shown in figure 1 is vector sequence before the restriction site. Despite differences between the two templates, the vector sequence is generally the same. The template contamination becomes apparent at the point where the new DNA has been inserted into the vector.

Often, the reverse (complementary) strand does not produce a double sequence, which can be confusing. If the DNA insert is relatively long, the whole complementary strand would have to be sequenced before double sequence is observed. The reason is that double sequence is caused by the addition or deletion of an unknown number of bases at one restriction site only. Therefore both templates, the desired one and the contaminant, have identical sequence in the complementary direction.

PCR fragments can also produce double sequence. It results from errors in PCR amplification that generate two products. The two products may be so similar in size that they appear as a single band on an agarose gel, because agarose resolution is limited and does not separate DNA fragments of very similar size.

Double Priming Results in Dual Amplification

Sanger sequencing is similar to PCR amplification. A primer anneals at the beginning of the region to be sequenced, and Taq polymerase adds bases (dNTPs) to extend it into a complementary strand. A primer that matches two regions, on either the insert or the plasmid, can start two separate extensions simultaneously. The result of double priming is shown in figure 2.

Unlike template contamination, double priming generally does not give clean sequence at the beginning. The primer does not need to match both annealing sites identically, as long as its bases are similar and there is a match at the 3 prime end. What happens if one annealing site is downstream of the other on the same strand of DNA? The downstream primer blocks bases extending from the first primer, and the double sequence eventually becomes a single set of peaks.
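A rough screen for double priming can be sketched in Python. The idea follows the description above: a site is a candidate if it matches the primer exactly at the 3 prime end and is merely similar elsewhere. Everything specific here is an assumption made for illustration: the function name, the six-base 3 prime window, the mismatch limit of four, and both sequences. Real annealing depends on temperature and sequence context, so this is only a first-pass check. It scans one strand, written in the same sense as the primer, and a fuller check would also scan the reverse complement.

```python
def candidate_priming_sites(primer, sequence, three_prime_len=6, max_mismatches=4):
    """Return (start, mismatches) for every window the primer might anneal to.

    A window qualifies when its last `three_prime_len` bases equal the primer's
    3' end exactly and the whole window has at most `max_mismatches` mismatches.
    Both thresholds are illustrative assumptions.
    """
    primer = primer.upper()
    sequence = sequence.upper()
    n = len(primer)
    sites = []
    for start in range(len(sequence) - n + 1):
        window = sequence[start:start + n]
        if window[-three_prime_len:] != primer[-three_prime_len:]:
            continue  # no exact match at the 3' end, extension is unlikely
        mismatches = sum(1 for p, w in zip(primer, window) if p != w)
        if mismatches <= max_mismatches:
            sites.append((start, mismatches))
    return sites

# Hypothetical 20-base primer and a similar second site with three mismatches
primer = "ACGTTGCAAGGCTTACGGAT"
second_site = "ACCTTGGAAGCCTTACGGAT"
sequence = "TTTT" + primer + "CCCCCCCC" + second_site + "GGGG"

sites = candidate_priming_sites(primer, sequence)
print(sites)
if len(sites) > 1:
    print("More than one candidate site: double priming is possible")
```

More than one entry in the result suggests choosing a different primer before submitting the sample.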

Single Nucleotide Polymorphisms Could Also Cause Double Sequence

Sanger sequencing of PCR products is used to identify heterozygous bases such as single nucleotide polymorphisms (SNPs). It is one method for detecting mutations that could cause certain genetic diseases. A SNP appears as a base change in one of two otherwise identical alleles amplified together and is generally easy to spot in automated Sanger sequencing: a single base position is represented by two peaks.

However, a heterozygous difference could also be a base insertion or deletion (indel), as in figure 3. From the figure it is possible to follow two strands of sequence that are identical except for one base. Both alleles match until the GGCC region. However, one allele is missing the T base, and its sequence has shifted upstream by one base. The large clean peaks observed in the double peak region are simply positions where the two products happen to carry the same base.
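The shift caused by a one-base deletion is easy to reproduce with a short Python sketch. The sequence below is hypothetical and is not the one in figure 3. It only borrows the idea of a T missing after a GGCC region. The function name overlay_alleles and the use of "N" to mark a mixed position are also assumptions made for this example.

```python
def overlay_alleles(allele_a, allele_b):
    """Compare two alleles position by position, as the sequencer sees them mixed.

    A letter marks a clean peak (both products carry the same base);
    'N' marks a position where two different peaks overlap.
    """
    return "".join(a if a == b else "N" for a, b in zip(allele_a, allele_b))

reference = "ACGGCCTAAGGCTTAC"            # hypothetical allele
deletion = reference[:6] + reference[7:]  # same allele missing the T after GGCC

print(overlay_alleles(reference, deletion))  # prints ACGGCCNANGNNTNN
```

The read is clean through GGCC and mixed afterwards, yet a few positions after the deletion still show a single letter because the shifted allele happens to carry the same base there.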

Preventing Double Sequence in Sequencing Results

There are some cases where double sequence cannot be resolved, as with PCR fragments that contain indels. However, certain steps can be taken to reduce the conditions described in figure 1 and figure 2.

Double sequence at the point of insert in plasmid preparations can be prevented using a process called single colony isolation. Once the transformed cells have been spread on a plate and grown, the selected colony is isolated and spread on a second plate. Although the process requires additional time, it is worth the effort in cases where double sequence occurs often.

The process for preventing double sequence in PCR fragments is more complicated because the PCR conditions need better optimization. The fix could be as simple as reducing the amount of primer or the number of cycles, or it could require small adjustments to PCR conditions such as temperature. Guidelines for PCR optimization will be a future topic.

Double sequence that results from double priming can be resolved by adding bases to extend the custom primer that was selected for sequencing. A better choice is to select a different primer, even when the double priming is caused by one of the universal primers present on a particular vector.

Double sequence result patterns are generally similar for each condition discussed here. Readers are invited to share similar results of their own. Questions are also welcome.


How to Determine if a Sequencing Template Meets Quality Requirements

Isolation of plasmid DNA in preparation for automated Sanger sequencing has been simplified by the development of commercial kits. The quality of the final sample is generally tested by spectrophotometer, measuring absorbance at 260 nm and 280 nm. A 260 nm / 280 nm absorbance ratio of about 1.8 indicates the DNA sample is good. However, these two absorbance values do not provide a complete profile of whether a DNA preparation meets the quality standards for Sanger sequencing, and samples may still fail. There are a number of reasons why sequencing reactions are not successful. Two well-known techniques help rule out template quality as the reason for failure.

Scanning Spectrophotometry Provides Important Information about Template Quality

NanoDrop technology has the advantage of highly sensitive spectrophotometric scanning of DNA samples using just 1 µl of volume. Most spectrophotometers available today are capable of performing a wavelength absorbance scan in the ultraviolet spectrum. DNA is scanned between 220 nm and 320 nm. The scan profile for quality DNA is shown in figure 1.

A typical DNA profile is a Gaussian (bell-shaped) curve with the maximum peak height absorbing at 260 nm. A secondary peak will also begin to form around 220 nm. Deviations in the shape of either curve could indicate the presence of salts or other contaminants.

Figure 2 compares a profile of plasmid DNA containing salts with that of clean DNA. When DNA is free of contaminating salts, the curve should drop almost to the baseline in the range 230 nm to 240 nm. Salts present in the sample absorb at these wavelengths. DNA samples with excessive amounts of salt may show little or no dip in the curve between 220 nm and 260 nm.
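The dip between 230 nm and 240 nm can be checked numerically once a scan has been exported. The sketch below assumes the scan is available as a mapping from wavelength (nm) to absorbance. The function names, the cutoff of 0.5, and both sets of readings are hypothetical values chosen for illustration, not measured data or a published standard.

```python
def trough_fraction(scan, low=230, high=240, peak=260):
    """Lowest absorbance in [low, high] nm divided by absorbance at `peak` nm.

    `scan` maps wavelength in nm to absorbance.
    """
    trough = min(a for wavelength, a in scan.items() if low <= wavelength <= high)
    return trough / scan[peak]

def salt_suspected(scan, max_fraction=0.5):
    """True when the dip is shallow. The 0.5 cutoff is an assumed example value."""
    return trough_fraction(scan) > max_fraction

# Hypothetical scans: wavelength (nm) -> absorbance
clean = {220: 0.60, 230: 0.22, 240: 0.25, 250: 0.42, 260: 0.50, 270: 0.40, 280: 0.27}
salty = {220: 0.90, 230: 0.55, 240: 0.48, 250: 0.49, 260: 0.50, 270: 0.40, 280: 0.27}

print(salt_suspected(clean))  # deep dip below the 260 nm peak
print(salt_suspected(salty))  # little or no dip
```

A shallow dip flags the sample for another cleanup step before sequencing. Looking at the full curve is still the better check, since contaminants can distort its shape in other ways.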

Figure 3 represents a profile of DNA that contains either protein (DNA from tissue) or phenol (plasmid DNA). Phenol residue in DNA can result from phenol/chloroform extraction, a method that was widely used before the development of commercial kits. Despite the availability of kit-based methods, many researchers still opt for phenol/chloroform. It is important to remove phenol residues with ethanol precipitation in the final purification step, because phenol residue can negatively affect the PCR amplification in Sanger sequencing.

Scanning spectrophotometry successfully identifies salt contaminants in plasmid DNA and in DNA samples amplified by the Polymerase Chain Reaction (PCR). PCR fragments will have a lower concentration, but purified PCR products should still show a scan similar to that of plasmid preparations.

Agarose Gel Electrophoresis Identifies Nicked DNA

Once plasmid DNA has been determined to be free from salt contaminants, it should work well for Sanger sequencing. However, this is not always the end result. Scanning spectrophotometry does not reveal whether DNA is supercoiled or nicked, as described previously (Nicked Plasmid DNA Prevents Automated Sanger Sequencing). Nicked DNA has been damaged mechanically; aggressive vortexing during DNA purification is one cause. Nicked DNA does not amplify in Sanger sequencing applications because the double-stranded helix does not maintain a tight formation. Taq polymerase is unable to fulfill a lock-and-key attachment to the DNA and catalyze extension. For automated sequencing to work, plasmid DNA must maintain a supercoiled structure.

Agarose gel electrophoresis successfully separates nicked DNA (slow migrating) from supercoiled DNA (fast migrating), as shown in figure 4. Nicked DNA moves more slowly through the gel because its relaxed, open-circular form is less compact than supercoiled DNA of the same length. It should be noted that plasmid preparations typically contain a mixture of nicked and supercoiled DNA, so the presence of some nicked DNA should not cause concern. It is the lack of supercoiled DNA that causes sequencing failures.

Scanning spectrophotometry and agarose gel electrophoresis together provide a good assessment of whether template DNA purification has been successful. However, they do not guarantee that Sanger sequencing will succeed every time. Primer selection, GC content, and the presence of poly-base regions and hairpins can also affect sequencing. Problems of this nature relate to base content rather than to preparation of the sample.


Poor Primer Selection

What would cause a sequencing primer to anneal at the correct site but extend in the wrong direction? The end result was the complement of the previous sequence from which the primer was chosen.


The researcher was baffled by the result. It was a primer-walking project to complete the sequence of a 2 kb insert. The first sequence was obtained using the universal M13(-21) primer and produced a clean read of more than 600 bases. The next primer was selected 450 bases downstream and was expected to continue in the same direction. Instead, the result was a complementary sequence covering the same area as previously sequenced (see figure).


The primer selected consisted of 20 bases. Seventeen of the bases were directly complementary to each other, forming a primer that was essentially reversible. The primer simply annealed to the same region on the reverse strand of DNA, predominantly matching the 17-base complement. The resulting sequence was very clean, showing no hint of a problem. The cause was simply poor primer selection. Nowadays primer orders are generally submitted online and include automated evaluation of secondary structures in the primer, so this problem shouldn't happen anymore. This was an interesting phenomenon to see.
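A "reversible" primer of this kind is one that reads almost the same as its own reverse complement, and that property is easy to test before ordering. The Python sketch below is a simplified check, not the evaluation any particular ordering site performs. The function names and the example primer are hypothetical, and the comparison is position by position with no sliding, so a shifted match such as the 17-base one described here would need a fuller alignment.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))

def self_complement_matches(primer):
    """Count positions where the primer equals its own reverse complement."""
    primer = primer.upper()
    return sum(1 for p, r in zip(primer, reverse_complement(primer)) if p == r)

# Hypothetical 20-base primer that is nearly its own reverse complement
primer = "CCGTGCATGCGCATGCACGT"
matches = self_complement_matches(primer)
print(f"{matches} of {len(primer)} bases match the reverse complement")
if matches >= 0.8 * len(primer):  # 0.8 is an assumed warning level
    print("Primer may anneal to either strand; consider another site")
```

A primer that scores high on this check can bind the opposite strand at the same location and read backwards over sequence that is already known, which is the behavior described in this post.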
