DNA Sequencing and Fragment Analysis

Gene Assembly: Extending Sequence Results by Primer Walking

Researchers have completed sequencing the entire human genome. The genome consists of more than 3 billion bases and was completed ahead of schedule. Technological advancements from slab gel to capillary sequencing, combined with improved data management, allowed scientists to process tremendous amounts of information. What once required years to complete now takes weeks, because next-generation sequencing has greatly increased sequencing capacity.

Despite the ability to sequence whole genomes, researchers also compare individual genes that may consist of fragments of about 5,000 bases (5 kb). One example is the comparison of a gene isolated from a wild type with the same gene from a mutant isolate. The project could be designed to determine the function of a gene or how a mutation affects that function in an affected individual. Sanger sequencing remains a principal method for comparing fragments much smaller than an entire chromosome or genome.

Isolating the Gene from Genomic DNA

Genomic DNA is not a good source of template DNA for Sanger sequencing applications. The Sanger dye-terminator method is a linear amplification that requires sufficient copies of the original template. The Polymerase Chain Reaction (PCR) is used to amplify copies of a particular region of genomic DNA and isolate these smaller fragments from the rest of the genome. Primers flanking the region of interest determine which region is amplified. PCR amplification extends from the primers in the forward and reverse directions. The region of DNA between the primers is multiplied exponentially into smaller and more manageable fragments of DNA.

Researchers can sequence directly from the PCR fragment or insert the fragment into a bacterial plasmid for cloning. Plasmid DNA provides certain advantages over direct sequencing from a PCR fragment. Plasmids typically include universal priming sites flanking the inserted DNA, which allow complete sequencing of the insert. Direct sequencing from a PCR fragment requires that the original PCR primers be used. As a result, approximately 50 bases of the fragment would not be sequenced, as they would be with plasmid DNA.

Sequence Results and Primer Walking

Both PCR fragments and plasmid DNA are double stranded, so researchers can sequence using both strands of DNA. Sequence from the forward primer extends in the direction of the reverse primer. Reverse sequence likewise extends toward the forward primer. Eventually the forward and reverse sequences meet in the middle to complete sequencing of the 5 kb insert.

Sanger sequencing with capillary technology generates 800 to 1,000 bases of sequence data per reaction. A gene consisting of 5 kb would not be covered by one set of sequence data. The newly generated sequence results provide the known sequence for designing additional primers (primer walking) for another set of sequences as shown in figure 1.
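The number of reads needed to walk across an insert can be estimated from the read length. The following is a minimal Python sketch, not a protocol. It assumes every read gives a fixed usable length and that each new primer is placed a fixed distance before the end of the previous read. The function name, the 800-base read length and the 100-base overlap are illustrative assumptions.

```python
def walking_primer_positions(insert_length, read_length=800, overlap=100):
    """Return the start position of each successive read along one strand.

    Assumptions (illustrative only): every read yields `read_length` usable
    bases, and each new primer is designed `overlap` bases before the end of
    the previous read so that adjacent reads share sequence for assembly.
    """
    if overlap >= read_length:
        raise ValueError("overlap must be smaller than read_length")
    positions = []
    start = 0
    while True:
        positions.append(start)
        if start + read_length >= insert_length:
            break  # this read reaches the end of the insert
        start += read_length - overlap
    return positions


starts = walking_primer_positions(5000)
print(len(starts), starts)
```

Under these assumptions a 5 kb insert needs seven reads from one strand. Sequencing from both ends at the same time roughly halves the number of walking rounds.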

Figure 1 (image: GeneAssembly1)

Gene Assembly: Assembling Sequence Results

The primer walking process continues until the generated sequence data covers the entire DNA insert or fragment. Once sequencing is complete, results are assembled into a contiguous sequence (contig) using one of several available software programs as shown in figure 2. Assembly programs offer the advantage of using electropherogram data showing peak quality. This helps reduce potential errors in final results. The final assembled product is merged into a single contiguous sequence for further comparison.

Figure 2 (image: GeneAssembly2)

One comparison researchers often perform is between the sequence of a wild type gene and one with a potential mutation. Mutations could consist of single-base changes, insertions or deletions (indels). Because the coding sequence is read in three-base codons, indels could be particularly problematic by causing a shift (frameshift) in the bases coding for amino acids. A frameshift could completely eliminate the function of the protein product.
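The effect of an indel on the reading frame can be illustrated with a few lines of Python. This is a minimal sketch. The sequence is hypothetical and chosen only for illustration, and the helper names are assumptions.

```python
def codons(seq):
    """Split a coding sequence into complete three-base codons."""
    usable = len(seq) - len(seq) % 3  # ignore trailing bases that do not fill a codon
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def is_frameshift(indel_length):
    """An indel shifts the reading frame unless its length is a multiple of 3."""
    return indel_length % 3 != 0


wild_type = "ATGGCCAAGTTTGGA"            # hypothetical sequence
mutant = wild_type[:4] + wild_type[5:]   # delete a single base

print(codons(wild_type))  # ['ATG', 'GCC', 'AAG', 'TTT', 'GGA']
print(codons(mutant))     # ['ATG', 'GCA', 'AGT', 'TTG']
```

Every codon after the deletion changes, which is why a one-base or two-base indel is usually more damaging than a three-base indel that removes or adds a single amino acid.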

Of course this is only one example where gene assembly is used for research. Gene assembly could be used for many applications including a better understanding of the functions of certain proteins. Mutation could alter the protein to a degree where the protein is incapable of performing a necessary function.

Please go here if you would like to download a

reprint for this article in pdf format

Sanger Sequencing: Historical Development of Automated DNA Sequencing

During the 1970s, Frederick Sanger developed a new technique allowing the base sequence of DNA to be determined. The design of his method is still very popular today. Sanger employed dideoxynucleotides (ddNTPs) together with deoxynucleotides (dNTPs) in a DNA polymerase extension reaction. The method predates the Polymerase Chain Reaction (PCR), although modern sequencing reactions use PCR-style thermal cycling. Instead of copying a section of DNA to full length, the polymerase stops wherever a ddNTP is incorporated, so extension terminates at random positions. The products were radioactively labeled for detection. The end result of the Sanger chemistry is a series of new DNA products with sizes increasing by a single base.

The reaction products were separated on a polyacrylamide gel, with smaller products migrating faster than larger products. The end result appeared like a ladder when detected.

The basics of the chemistry remain the same today. What has changed in Sanger sequencing is the method of separating the products and detection.

Original Base Labels were Radioactive

Before fluorescent labels were introduced, Sanger reactions were detected with a radioactive label, for example on the 5 prime end of the primer. Because the label could not distinguish the different bases, each sample was run as four separate reactions, one tube for each of the four ddNTPs. Each tube represented one of the four bases making up DNA. The products were loaded separately into four lanes and allowed to migrate. Once the run was complete, the gel was exposed to x-ray film to view the result as shown in figure 1.

Fluorescent Labeled ddNTPs Replaced Radioactive Labels

Automation replaced the manual sequencing method with development of fluorescent labeled ddNTPs. Slab gels were still poured. However, all four bases were combined into a single reaction and loaded together. Samples would electrophorese and separate in the gel. Once the amplified products reached a region on the bottom of the plates, a laser would excite the fluorescent labels and color would be recorded by camera. Resulting images were collected by computer and analyzed as shown in figure 2.

Development of the automated chemistry allowed sequencing to be performed much faster. First, only one lane on the gel was required for each sample instead of four. Second, sequence was recorded as electrophoresis was performed. The smaller, more quickly migrating bands could be run completely through the gel while larger bands were still being recorded. Therefore more bases could be determined.

One potential problem was the plates. Because sequence results were recorded through the plates, the plates needed to be free of debris and scratches. Although plates also needed to be clean for the radioactive method, the requirement was more stringent for automated sequencing.

A significant amount of sequencing was performed using automated slab gel sequencing. But researchers still needed to pour plates and electrophoresis was performed for 12 hours or longer.

Capillary Sanger Sequencing

Capillary development occurred during the 1990s beginning with a single capillary machine, the ABI 310. The new technology eliminated any need for pouring gels. Instead, semi-liquid polymer was injected into the capillary before each run as shown in figure 3. Individual runs reduced the length of time required for electrophoresis to less than 3 hours.

The automated process changed very little from slab gel machines to capillary. But capillary sequencing was faster and more sensitive. It required much less DNA in the sequencing reaction. Better chemistries were developed in conjunction with automated sequencing. Capillary sequencers today typically run 4 to 96 samples at a time. The runs generally require 2 to 2 ½ hours of electrophoresis. Automated injection of samples allows hands-off operation through multiple sets of samples.

Frederick Sanger developed an important method for sequencing DNA. It allowed early completion of the human genome project, a genome with 3 billion bases. Although next-generation sequencing has expanded sequencing capabilities, Sanger sequencing is used for small sections of DNA often used in medical research.

Primer Selection Guidelines: Good Primers Important for PCR and Automated Sequencing

DNA synthesis is the production of short, single-stranded DNA molecules (called primers or oligonucleotides) often used in the polymerase chain reaction (PCR) and DNA amplification for Sanger sequencing applications. The region of DNA amplified is determined by an exact match of the primer to its complementary bases on a given DNA strand. Primer sequences are determined from known sequence since there must be a match to the region of DNA to be amplified.

PCR amplification requires 2 primers that determine the region of sequence amplified in the forward and reverse direction. The forward primer is designed along one strand in the direction toward the reverse primer. Likewise, the reverse primer is designed from the complementary strand. PCR is exponential amplification in which the newly generated PCR fragment from one cycle also acts as a template for the next cycle (Figure 1A).

Amplification of DNA for Sanger sequencing differs from PCR in that a single primer is used. Amplification is the reproduction of one strand using its complement from the original template as a guide. Newly generated fragments are single stranded and do not provide a complementary strand that could act as a template for additional amplification (Figure 1B).

Differences in amplification between PCR and Sanger sequencing were discussed previously (Sanger Sequencing Amplification Compared to Basic PCR).

Simply designing the sequence of the primer from known sequence does not ensure the primer will anneal to the desired region and initiate amplification. The primer should be designed following a set of standards that improve the chances of success. The guidelines for designing primers used in PCR and sequencing are fairly similar. Primers are designed in the 5’ to 3’ direction to match the direction of amplification.

For those utilizing PCR and Sanger sequencing in everyday applications, primer design could seem like yesterday’s news. However, we still believe reviewing good primer design guidelines is helpful to any researcher involved in genetic research.

Guidelines for Primer Design

G1. Primer length should be in the range of 18 to 22 bases. Primers shorter than 18 bases will have a low melting temperature (Tm) and might not anneal to the template. There is some flexibility for designing primers longer than 18 bases. Longer primers are frequently designed from template regions that are AT-rich and need additional bases to increase the Tm value.

G2. The primer should have GC content of 50% to 55%. This is the equivalent of 9 or 10 GC bases included in an 18 base primer. Sometimes there are regions on a template that are AT-rich, which prevents meeting this guideline. In those cases it is recommended to design a primer longer than 18 bases.

G3. Primers should have a GC-lock (also called a GC clamp) on the 3’ end. A GC-lock is present when 2 of the final 3 bases are G or C. The 3’ base should always be a G or a C.

G4. The melting temperature of any good primer should be in the range of 50 °C to 55 °C. However, guidelines particularly related to Tm value have some flexibility. Melting temperatures are directly related to the PCR cycle annealing temperature. Primers with Tm values that are too low may not anneal well during PCR. High values could be too stringent, causing difficulty locating the correct annealing site on the template.

G5. The primer should not include poly base regions. This is when 4 or more bases in a row are the same. This guideline helps prevent potential slippage in which the primer shifts from the annealed position.

G6. Runs of four or more bases that are complementary to another part of the primer, in either direction, should be avoided. This prevents the primer from annealing to itself and forming what is referred to as primer-dimer. Primer-dimers have the capability of amplifying the primer itself, causing short secondary sequence.
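Guidelines G1 through G6 can be checked automatically. The following Python sketch is one possible way to do so and is not taken from any primer design package. The article does not say how Tm is calculated, so the sketch assumes the simple rule of thumb Tm = 2 x (A+T) + 4 x (G+C); design software generally uses more accurate models. The function names and the example primer are hypothetical.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def rule_of_thumb_tm(primer):
    """Assumed Tm estimate in degrees C: 2 per A/T base plus 4 per G/C base."""
    gc = primer.count("G") + primer.count("C")
    return 2 * (len(primer) - gc) + 4 * gc


def has_self_complement(primer, k=4):
    """True if any k-base window can pair with another part of the primer."""
    rc = reverse_complement(primer)
    return any(primer[i:i + k] in rc for i in range(len(primer) - k + 1))


def check_primer(primer):
    """Return a pass/fail result for each of guidelines G1 to G6."""
    primer = primer.upper()
    gc = primer.count("G") + primer.count("C")
    gc_percent = 100 * gc / len(primer)
    gc_in_last_three = sum(base in "GC" for base in primer[-3:])
    return {
        "G1 length 18-22": 18 <= len(primer) <= 22,
        # 10 GC bases in an 18-mer is 55.6%, which the guideline accepts
        "G2 GC content": 50 <= gc_percent < 56,
        "G3 GC-lock": gc_in_last_three >= 2 and primer[-1] in "GC",
        "G4 Tm 50-55": 50 <= rule_of_thumb_tm(primer) <= 55,
        "G5 no poly base run": re.search(r"(.)\1{3}", primer) is None,
        "G6 no self-complement": not has_self_complement(primer),
    }


for rule, passed in check_primer("AGCAAGGACACAGAAAGC").items():
    print(rule, "pass" if passed else "FAIL")
```

With a window of four bases, G6 is a strict test and many otherwise reasonable primers will fail it, so a failure is a reason to inspect the primer rather than to reject it outright.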

PCR Specific Guidelines

G7. Forward and reverse primers used in PCR amplification should have similar melting temperatures (±2 °C). This allows a 4 °C difference in total melting temperatures. Researchers using PCR amplification will use primer Tm values in an effort to optimize PCR cycles. Similar Tm values for forward and reverse primers aid optimization efforts. All primers in a multiplex PCR application, which uses multiple primer pairs, should have similar Tm values. A wide range in primer melting temperature complicates PCR optimization.

G8. Forward and reverse primers should not have regions 4 bases or longer that are complementary to each other. Just like a primer used in Sanger sequencing, forward and reverse primers used in PCR can anneal to each other and form primer-dimers.
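Guidelines G7 and G8 apply to the primer pair rather than to a single primer. A minimal Python sketch of a pair check follows. As before, the Tm estimate of 2 x (A+T) + 4 x (G+C) is an assumption, and the function names and example sequences are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def rule_of_thumb_tm(primer):
    """Assumed Tm estimate in degrees C: 2 per A/T base plus 4 per G/C base."""
    gc = primer.count("G") + primer.count("C")
    return 2 * (len(primer) - gc) + 4 * gc


def pair_is_compatible(forward, reverse, max_tm_difference=4, k=4):
    """Check G7 (similar Tm) and G8 (no k-base cross-complementarity)."""
    tm_ok = abs(rule_of_thumb_tm(forward) - rule_of_thumb_tm(reverse)) <= max_tm_difference
    rc_reverse = reverse_complement(reverse)
    cross_complement = any(
        forward[i:i + k] in rc_reverse for i in range(len(forward) - k + 1)
    )
    return tm_ok and not cross_complement


forward = "AGCAAGGACACAGAAAGC"   # hypothetical primers
reverse = "GACAGCAACAGGAACAGC"
print(pair_is_compatible(forward, reverse))                      # True
print(pair_is_compatible(forward, reverse_complement(forward)))  # False
```

The second call pairs a primer with its own reverse complement, the extreme case of two primers that would anneal to each other instead of to the template.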

G9. The Tm values for tailed primers should include the tail in calculating melting temperature. Yes, melting temperatures will be greater than 55 °C. However, the additional bases in the tail will add to the amplified PCR fragment and become part of the priming site. Tailed primers are often used to add restriction sites to an amplified product.

Primer design is an important aspect relating to many forms of PCR including basic PCR, fragment analysis, quantitative analysis and Sanger sequencing. Below is a link that offers free primer design software…

http://frodo.wi.mit.edu/

We invite any additional guidelines or comments that may be useful to both experienced researchers and those new to primer design.

How to Determine if a Sequencing Template Meets Quality Requirements

Isolation of plasmid DNA in preparation for automated Sanger sequencing has been simplified by the development of commercial kits. The quality of the final sample is generally tested with a spectrophotometer by measuring absorbance at 260 nm and 280 nm. A 260 nm / 280 nm absorbance ratio of about 1.8 indicates the DNA sample is good. However, 260 nm and 280 nm absorbance values do not provide a complete profile of whether DNA preparations meet the quality standards needed for Sanger sequencing applications. Samples may still fail. There are a number of reasons why sequencing reactions are not successful. Two well-known techniques help eliminate template quality as a reason for failure.

Scanning Spectrophotometry Provides Important Information about Template Quality

Nanodrop technology has the advantage of highly sensitive spectrophotometric scanning of DNA samples using just 1 µl of volume. Most spectrophotometers available today are capable of performing a wavelength absorbance scan in the ultraviolet spectrum. DNA is scanned between wavelengths ranging from 220 nm to 320 nm. The scan profile for quality DNA is shown in figure 1.

A typical DNA profile is a Gaussian (bell-shaped) curve with maximum absorbance at 260 nm. A secondary peak will also begin to form around 220 nm. Deviations in the shape of either curve could indicate the presence of salts or other contaminants.

Figure 2 provides a profile of plasmid DNA containing salts compared to clean DNA. When DNA is free of contaminating salts, the curve should drop almost to the baseline in the range 230 nm to 240 nm. Salts present in the sample will absorb wavelengths in this range. DNA samples with excessive amounts of salt may show little or no dip in the curve between 220 nm and 260 nm.

Figure 3 represents a profile of DNA that contains either protein (DNA from tissue) or phenol (plasmid DNA). Phenol residue in DNA can result from phenol/chloroform extraction (a method of extraction that was widely used prior to the development of commercial kits). Despite the availability of kit-based methods, many researchers opt to use phenol/chloroform instead. It is important to remove phenol residues with ethanol precipitation in the final purification step. Phenol residue can negatively affect the PCR amplification in Sanger sequencing.

Scanning spectrophotometry successfully identifies salt contaminants in plasmid DNA and DNA samples amplified in the Polymerase Chain Reaction (PCR). PCR fragments will have a lower concentration. But purified PCR products should still show a scan similar to plasmid preparations.

Agarose Gel Electrophoresis Identifies Nicked DNA

Once plasmid DNA has been determined to be free from salt contaminants it should work well for Sanger sequencing. However, this is not always the end result. Scanning spectrophotometry does not reveal whether DNA is supercoiled or nicked as described previously (Nicked Plasmid DNA Prevents Automated Sanger Sequencing). Nicked DNA has been damaged mechanically. Aggressive vortexing during DNA purification is one cause of DNA damage. Nicked DNA does not amplify in Sanger sequencing applications because the double stranded helix does not maintain a tight formation. Taq polymerase is unable to fulfill a lock-key attachment to the DNA and catalyze extension. For automated sequencing to work, plasmid DNA must maintain a supercoiled structure.

Agarose gel electrophoresis successfully separates nicked DNA (slow migrating) from supercoiled DNA (fast migrating) as shown in figure 4. Nicked DNA moves more slowly through the gel because its relaxed, open circular form is less compact than supercoiled DNA. It should be noted that plasmid preparations typically have a mixture of nicked and supercoiled DNA. Presence of some nicked DNA should not cause concern. It is the lack of supercoiled DNA that causes sequencing failures.

Scanning spectrophotometry and agarose gel electrophoresis combined provide a good assessment of whether template DNA purification has been successful. However, they do not guarantee Sanger sequencing will be successful every time. Primer selection, GC content and the presence of poly base regions and hairpins could also affect sequencing. Problems of this nature are related more to base content than to preparation of the sample.

Preparing Plasmid DNA for Automated Sanger Sequencing

Cesium chloride and phenol/chloroform were two early methods used for isolating a sample of DNA from a bacterial vector. Cesium chloride produced a sample that was very clean. However, the method was labor intensive. Phenol/chloroform was less labor intensive, but phenol contamination was difficult to remove even using the final alcohol precipitation step. Fortunately, commercial kits were developed around the same time Sanger sequencing became automated. Presently there are numerous commercial kits from which a researcher could choose.

However, using a commercial kit does not guarantee that a resulting DNA sample will be clean and free from contaminants. The final step of the sample preparation often includes Tris or another buffer (salt) for elution. Other times the DNA may incur physical damage and become nicked. Preparing a sample for automated sequencing includes following the recommended guidelines provided for the commercial kit. However, low yields can require additional steps to concentrate the sample. Simple drying methods concentrate everything in the sample including buffer salts.

Steps Involved in Commercial DNA Preparation Kits

The general protocol for commercial kits used to isolate purified plasmid DNA is fairly similar regardless of the kit. After bacteria carrying the plasmid have been grown overnight in liquid media, the cells are pelleted by centrifugation into a solid mass. The media is then discarded before beginning the kit-provided procedure (Figure 1).

The pellet of cells is resuspended in buffer and transferred into a 1.5 or 2 ml tube (Step 1). Lysis reagent is added (Step 2). The lysis reagent disrupts the bacterial cell membranes freeing internal components. The lysis buffer is neutralized with neutralizing solution and cell components form a precipitate with the DNA still in solution (Step 3). The resulting solution is centrifuged to pellet the cell waste leaving the DNA in solution.

DNA is captured on a filter or resin by transferring the DNA solution to a collection vessel (Step 4). Excess liquid is pulled through the filter by centrifugation or vacuum filtration. The remaining DNA on the filter is then washed, removing any cell waste left (Step 5). The DNA is then collected in a clean tube using water or elution buffer (Step 6). Elution buffer is provided in the commercial kits. It consists of Tris buffer in water. Fortunately, manufacturers no longer include EDTA in elution buffer. EDTA binds magnesium, a necessary component in any PCR amplification. Water is the preferred medium for DNA to be used for Sanger sequencing as no salts are introduced to the sample.

The process may take as little as 15 minutes after bacterial cells are grown overnight. The final DNA sample is relatively pure and clean from other cellular debris that could interfere with Sanger sequencing.

Quality Testing of the DNA Sample

DNA sample quality can be determined using two simple methods. The methods will be discussed in detail in the next article, but are summarized below.

Scanning spectrophotometry using ultraviolet wavelengths between 220 nm and 310 nm provides a general profile of the overall quality of the DNA (Figure 2). DNA absorbs light in the range of 240 nm to 300 nm with the maximum peak at 260 nm. Absorbance at 260 nm is also used to calculate the concentration of the DNA. Another factor often used when measuring DNA quality is calculating the 260 nm / 280 nm absorbance ratio. A good value for this ratio is 1.8 to 1.9. Ratio values below 1.6 indicate the DNA sample contains contaminants.
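The concentration and ratio calculations can be written out in a few lines of Python. This is a minimal sketch with hypothetical function names. It assumes the standard conversion for double-stranded DNA of 50 ng/µl per absorbance unit at 260 nm with a 1 cm path length, which is not stated in the text above. The thresholds are the ones given in the paragraph; ratios the paragraph does not classify are reported as such.

```python
def dna_concentration_ng_per_ul(a260, dilution_factor=1.0):
    """Estimate dsDNA concentration, assuming 1 A260 unit = 50 ng/ul (1 cm path)."""
    return a260 * 50.0 * dilution_factor


def purity_call(a260, a280):
    """Classify the 260/280 ratio using the ranges quoted in the text."""
    ratio = a260 / a280
    if ratio < 1.6:
        call = "contaminated"
    elif 1.8 <= ratio <= 1.9:
        call = "good"
    else:
        call = "not classified by the text"
    return round(ratio, 2), call


print(dna_concentration_ng_per_ul(0.5))  # 25.0
print(purity_call(0.5, 0.27))            # (1.85, 'good')
```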

A good scanning profile does not always prove the DNA is good. Nicked DNA cannot be identified by scanning and may inhibit quality Sanger sequencing. (https://agctsequencing.wordpress.com/2012/02/16/nicked-plasmid-dna-prevents-automated-sanger-sequencing/)

Agarose gel electrophoresis will separate DNA into bands indicating whether the DNA has remained supercoiled. Supercoiled DNA is required for sequencing. The gel will also show whether the DNA has been nicked, which loosens the supercoil. Nicked DNA migrates more slowly through an agarose gel and will separate from the supercoiled DNA.

Both quality tests provide necessary data to show whether the final DNA sample is clean and of high quality.

Additional Concentration Required

Once the DNA sample has been isolated and appears clean using test procedures, it may be necessary to adjust the concentration to meet submission guidelines. Samples with a concentration higher than required are diluted in water. Although some buffer is still present, the buffer is diluted along with the DNA. Generally dilute buffers (without EDTA) do not interfere with amplification for Sanger sequencing. What if the sample needs to be more concentrated?

Drying a sample in elution buffer is not an effective means for concentrating a sample. Ethanol precipitation is a preferred method because salts are removed along with excess water. Scanning spectrophotometry is an effective means of quantifying the DNA.

What is most important for researchers to remember when isolating plasmid DNA with a commercial kit is to thoroughly read the directions. The kits use enzymes to disrupt the bacterial cell membrane and remove components such as RNA. Enzymes are fragile. Excess shaking or vortexing could damage the enzymes. Commercial kits often include precautions to use care when working with these important enzymes.

DNA Fragments Resolve Better on Correct Percent Agarose Gel

An interesting article was posted March 25, 2011 on BitesizeBio.com titled 5 Ways to Destroy Your Agarose Gel. Every researcher may have made some of these common mistakes at one time. The five ways provided are…

  1. Use water instead of buffer for the gel or running buffer.
  2. Forget to add ethidium bromide.
  3. Use the wrong percentage (or type) of agarose.
  4. Switch the leads from the power source.
  5. Drop the gel on the way to the imager.

The focus of this article is to explain the importance of using the correct percentage gel. In many genetic analysis applications a 1% agarose gel is commonly used to test plasmid preparations and PCR fragments. However, the resolution of the 1% gel may not sufficiently resolve smaller DNA products.

Percent Agarose Determines Pore Size

Agarose gel electrophoresis is a form of chromatography. The gel provides the stationary phase and electrical current provides the mobile phase. Negatively charged molecules such as DNA will migrate towards the positively charged anode in response to an electrical current across the gel. The gel provides the resistance against DNA migration. Smaller fragments move more rapidly than larger fragments.

Resistance depends on the pore size of the agarose gel. Smaller pores provide more resistance. Increasing the percent of the gel decreases the size of the pores. When the pores are too large, small DNA fragments migrate together and do not become separated (figure 1). This figure illustrates why large DNA fragments should not be run on the same agarose gel as small fragments of DNA.

Correct Percent of Agarose Depends on the Size Products Tested

The correct percent agarose gel is dependent on the size of the fragment that will be tested. Plasmid DNA preparations that are 5 kb to 7 kb resolve well on a 1% gel. Large PCR fragments that are similar in size to plasmid DNA could also resolve on a 1% gel. However, small PCR fragments that require smaller pore size for better resolution require a higher percent gel. General guidelines for mixing the correct percent gel are provided in table 1.

For small PCR fragments less than 500 bases in size, it is best to use a two percent gel. This will increase the run time. However, it will also improve resolution of fragments that are similar in size and may not resolve on lower percentage gels.

Credits:

http://bitesizebio.com/articles/5-ways-to-destroy-your-agarose-gel/

Sanger Sequencing Amplification Compared to Basic PCR

Sanger sequencing, the process used for automated sequencing, requires a DNA template to be amplified by the Polymerase Chain Reaction (PCR). Despite similarities between the processes, a sequencing amplification is different than basic PCR.

Sanger sequencing utilizes linear amplification

PCR produces millions of copies of a DNA region from a single copy of template DNA. Each copy produced during PCR in one cycle becomes a new template for the next cycle. PCR uses forward and reverse primers. The forward primer anneals to a complementary site on one strand of DNA and extends toward the reverse primer. In turn, the reverse primer similarly extends toward the forward primer. What results is a copy of the desired region of DNA to be amplified. The new copy contains priming sites so it can be used as a template for future amplifications (figure 1). One copy of the original template produces two copies; two copies produce four in the next cycle; and so on. A twenty-five cycle PCR will produce 2^25 (about 33 million) copies from a single template.

Sanger sequencing uses one primer instead of two. The amplification process copies one strand but not the reverse strand. The copy is the same direction as the primer and cannot be used as a template for later cycles. All amplification is directly from the original template DNA in the reaction. Therefore, amplification is linear, not exponential. It is the reason that Sanger sequencing amplification must include sufficient copies of the original template DNA to be visualized by automated sequencing equipment (figure 2).
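The difference between exponential and linear amplification is easy to see numerically. The short Python sketch below assumes ideal conditions in which every template is copied once in every cycle. The function names are illustrative.

```python
def pcr_copies(start_copies, cycles):
    """Exponential: every copy becomes a template, so copies double each cycle."""
    return start_copies * 2 ** cycles


def sequencing_products(start_copies, cycles):
    """Linear: only the original templates are copied, once per cycle."""
    return start_copies * cycles


print(pcr_copies(1, 25))           # 33554432
print(sequencing_products(1, 25))  # 25
```

A single template yields only 25 products after 25 cycles of linear amplification, which is why the sequencing reaction has to start with many copies of the template.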

Dideoxynucleotide bases are included in Sanger sequencing

The components of basic PCR include buffer, the enzyme Taq polymerase, deoxynucleotides (dNTPs), template, and forward and reverse primers. Sanger sequencing includes an additional component called dideoxynucleotides (ddNTPs). The ddNTPs are terminating bases that include a fluorescent tag for automated sequencing equipment. For this reason, Sanger sequencing is also called dye-terminator sequencing. During amplification, a ddNTP is incorporated at random into some of the growing strands and terminates their extension. The remaining strands incorporate dNTPs and continue extending. The end product is a size ladder of products that increase by a single base (figure 2). Each terminating base is tagged with fluorescent dye. This dye provides a unique color representing the A, G, C, and T bases in DNA.

The DNA ladder is separated by electrophoresis

Once PCR is complete, the products produced in the Sanger reaction are loaded on an automated slab gel or capillary analyzer. Products will separate by size with smaller products moving faster through the medium. As the products near the end of the medium, the fluorescent tags are excited by light and recorded to a computer with a digital camera (figure 3). The computer records the color for each band and assigns the correct base to complete dye-terminator sequencing.
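How the ladder becomes a sequence can be shown with a toy model in Python. This sketch is a simplification and not a simulation of the chemistry. It assumes exactly one terminated fragment for every position of the newly synthesized strand, and the function names and example strand are hypothetical.

```python
import random


def termination_ladder(new_strand):
    """One fragment per position: (fragment length, dye of the terminating base)."""
    return [(position + 1, base) for position, base in enumerate(new_strand)]


def read_sequence(fragments):
    """Order fragments by size (smallest migrates fastest) and read the dyes."""
    return "".join(dye for _, dye in sorted(fragments))


ladder = termination_ladder("GATTACA")
random.shuffle(ladder)        # fragments are mixed together in the reaction tube
print(read_sequence(ladder))  # GATTACA
```

Electrophoresis performs the sorting step, and the detector reads the dye on each fragment as it passes.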

 
