|Publication number||WO2008127901 A1|
|Publication date||23 Oct 2008|
|Filing date||7 Apr 2008|
|Priority date||13 Apr 2007|
|Publication number||PCT/2008/59532, PCT/US/2008/059532, PCT/US/2008/59532, PCT/US/8/059532, PCT/US/8/59532, PCT/US2008/059532, PCT/US2008/59532, PCT/US2008059532, PCT/US200859532, PCT/US8/059532, PCT/US8/59532, PCT/US8059532, PCT/US859532, WO 2008/127901 A1, WO 2008127901 A1, WO 2008127901A1, WO-A1-2008127901, WO2008/127901A1, WO2008127901 A1, WO2008127901A1|
|Inventors||George M. Church, Kun Zhang, Jay Shendure|
|Applicant||President And Fellows Of Harvard College|
REGION-SPECIFIC HYPERBRANCHED AMPLIFICATION
STATEMENT OF GOVERNMENT INTERESTS
 This application was funded in part by grant number DE-FG02-02ER63445 from the United States Department of Energy and grant number HG003170 from the National Institutes of Health. The Government has certain rights in the invention.
 This application claims priority from U.S. provisional patent application number 60/911,526, filed April 13, 2007, which is hereby incorporated herein by reference in its entirety for all purposes.
FIELD OF THE INVENTION
 The present invention relates to methods of amplifying large genomic nucleic acid sequences.
BACKGROUND OF THE INVENTION
 Existing DNA amplification methods either have very high specificity towards relatively short DNA sequences (e.g., PCR, which is generally limited to amplifying regions less than 50 kb) or have very little specificity (e.g., degenerate oligonucleotide-primed PCR (DOP-PCR), primer extension PCR/improved primer extension PCR (PEP/IPEP), ligation-mediated PCR, multiple displacement amplification (MDA)). Currently, the only way to specifically amplify large genomic regions in the range of hundreds of kilobases to several megabases is to use in vivo cloning methods such as bacterial artificial chromosome/yeast artificial chromosome (BAC/YAC) cloning or transformation-associated recombination (TAR) cloning. The major disadvantage of such in vivo methods is that they generally involve screening of hundreds (TAR cloning) to hundreds of thousands (BAC/YAC cloning) of clones to identify those containing the target inserts.
 In the genetic mapping of human diseases, there is a growing need to efficiently amplify/capture megabase-sized genome regions. In performing positional cloning of genes involved in Mendelian diseases, researchers often use linkage analysis to narrow the search for candidate genes to a chromosomal region several megabases in size. To identify the causative mutations, PCR-based re-sequencing of coding regions is commonly used but has very limited capabilities because causative mutations are not necessarily located within coding sequences. Complete re-sequencing of the entire candidate region is extraordinarily costly and labor intensive because of the difficulty of selecting a specific genomic region.
SUMMARY OF THE INVENTION
 The present invention is based in part on the discovery of novel selective hyperbranched amplification methods that enable specific amplification of large genomic regions (e.g., genomic regions in the range of hundreds of kilobases to several megabases or more in length).
 In certain embodiments, methods for selective amplification of a genomic nucleic acid sequence of at least 100 kilobases in length including providing a plurality of locus-specific oligonucleotides, annealing the plurality of locus-specific oligonucleotides to the genomic sequence, and amplifying the genomic sequence using a strand displacing DNA polymerase are provided. In certain aspects, the amplifying is performed at a temperature of between 50 °C and 65 °C or at a temperature of greater than about 50 °C, greater than about 55 °C or greater than about 60 °C. In other aspects, the genomic nucleic acid sequence is at least 500 kilobases in length, at least one megabase in length or between one megabase and five megabases in length. In other aspects, the strand displacing DNA polymerase is Bst DNA polymerase, phi-29 DNA polymerase, bacteriophage T5 DNA polymerase, Vent DNA polymerase, Vent (exo-) DNA polymerase, Deep Vent DNA polymerase, Deep Vent (exo-) DNA polymerase, 9°Nm DNA polymerase, Therminator DNA polymerase, ThermoPhi DNA polymerase, TopoTaq DNA polymerase, Tth DNA polymerase, Klenow fragment DNA polymerase I and/or Klenow fragment 3' to 5' exo- DNA polymerase I. In certain exemplary aspects, the strand displacing DNA polymerase is Bst DNA polymerase and/or Vent (exo-) DNA polymerase. In yet other aspects, the genomic nucleic acid sequence is genomic DNA.
 In other embodiments, methods for selective amplification of a genomic nucleic acid sequence of at least 100 kilobases in length including providing a plurality of amplification primers comprising an amplification oligonucleotide sequence and a locus-specific oligonucleotide sequence, amplifying the plurality of amplification primers, releasing locus-specific oligonucleotide sequences from the plurality of amplification primers, annealing the locus-specific oligonucleotide sequences to the genomic sequence, and amplifying the genomic sequence using a strand displacing DNA polymerase are provided. In certain aspects, the locus-specific oligonucleotide sequences are released by enzymatic digestion, e.g., by using a uracil-specific excision reagent enzyme. In other aspects, the released locus-specific oligonucleotide sequences comprise a tag, e.g., biotin. In other aspects, the locus-specific oligonucleotide sequences are purified by binding the tag to an immobilized binding partner, e.g., avidin or streptavidin. In yet other aspects, the genomic nucleic acid sequence is genomic DNA. In still other aspects, the amplifying is performed using a DNA polymerase having a proofreading 3'-exonuclease activity, e.g., Phusion DNA polymerase, Pfu DNA polymerase, Pfu Turbo DNA polymerase, Pfu Ultra DNA polymerase, KOD DNA polymerase, phi-29 DNA polymerase, T4 DNA polymerase, DNA polymerase I, DNA polymerase I (Klenow fragment), T7 DNA polymerase, Vent DNA polymerase, Deep Vent DNA polymerase and/or 9°Nm DNA polymerase. In still other aspects, each amplification primer comprises two amplification oligonucleotide sequences that can be the same or different sequences.
 In other embodiments, methods for synthesizing a library of oligonucleotides for selective amplification of a genomic nucleic acid sequence of at least 100 kilobases in length including providing a plurality of amplification primers comprising an amplification oligonucleotide sequence and a locus-specific oligonucleotide sequence complementary to a portion of the genomic nucleic acid sequence, amplifying the plurality of amplification primers, converting the plurality of amplification primers to single-stranded form, and releasing locus-specific oligonucleotide sequences from the plurality of amplification primers are provided.
BRIEF DESCRIPTION OF THE DRAWINGS
 The patent or application file contains at least one drawing executed in color. Copies of this patent or patent application publication with color drawing(s) will be provided by the Office upon request and payment of the necessary fee. The foregoing and other features and advantages of the present invention will be more fully understood from the following detailed description of illustrative embodiments taken in conjunction with the accompanying drawings in which:
 Figure 1 schematically depicts selective hyperbranched amplification.
 Figures 2A-2B schematically depict methods for making locus-specific primers from programmable DNA chips. (A) Detailed design of chip-synthesis oligonucleotides. Each locus-specific primer is flanked by two amplification adaptors, both containing Type IIs restriction enzyme recognition sites. To normalize the annealing temperature, the actual priming sequences are different in length. A stretch of deoxythymidine fill-ins is added to the 3' ends to normalize the length of chip-synthesis oligonucleotides. (B) Probe production procedures that include amplifying chip-synthesized oligos, making single-stranded DNA and releasing inserts.
 Exemplary embodiments of the present invention are directed to novel methods for the selective amplification of large genomic regions. In certain embodiments, a large number of short oligonucleotide sequences (e.g., DNA sequences, e.g., primers) covering the candidate genomic region(s) at roughly even spacing are designed and synthesized (e.g., using a nucleotide array). These sequences can then be amplified and converted to locus-specific primers as described further herein. A library of amplification primers is then hybridized to a template nucleic acid sequence (e.g., genomic DNA) and amplified, e.g., by thermophilic DNA polymerases that have strand-displacement activity (such as Bst, Vent exo- and the like) as described further herein. Amplification using thermophilic DNA polymerases having strand-displacement activity is both isothermal and exponential. A major distinction over other isothermal, exponential methods such as multiple displacement amplification (MDA) is that the methods described herein rely on a library of region-specific primers to achieve selective amplification, rather than non-specific primers such as the random hexamers that are used in methods such as MDA. The combination of region-specific primers, a high reaction temperature (e.g., approximately 50 °C to 65 °C), and the use of enzymes that are active at such a temperature further increases the specificity.
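 By way of illustration only, the placement of priming sites at roughly even spacing across a candidate region may be sketched as follows. The function names, the coordinate convention (0-based, end-exclusive) and the alternation of strands between neighboring sites are assumptions made for this sketch and are not taken from the disclosure.

```python
def tile_priming_sites(region_start, region_end, spacing):
    """Return start coordinates of priming sites placed every `spacing` bases.

    Coordinates are 0-based and `region_end` is exclusive (an assumption
    of this sketch).
    """
    if spacing <= 0:
        raise ValueError("spacing must be positive")
    if region_end <= region_start:
        raise ValueError("region_end must be greater than region_start")
    return list(range(region_start, region_end, spacing))


def assign_strands(sites):
    """Pair each site with a strand, alternating '+' and '-'.

    Alternating strands is a hypothetical design choice used here so that
    both strands of the region carry priming sites.
    """
    return [(pos, "+" if i % 2 == 0 else "-") for i, pos in enumerate(sites)]


# Example: a 1 Mb region tiled every 2 kb.
sites = tile_priming_sites(0, 1_000_000, 2_000)
stranded = assign_strands(sites)
```

 In such a sketch, a one-megabase region tiled every two kilobases yields 500 candidate priming sites; each site would still need to be checked for uniqueness and melting temperature before synthesis.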
 As used herein, the terms "genomic region" and "genomic nucleic acid sequence" are intended to include, but are not limited to, a region of the hereditary information of an organism encoded by DNA or RNA, including both genes and non-coding sequences. A genomic region can vary in size from a few kilobases to several megabases or more. In certain embodiments, a genomic region is at least 100, 200, 300, 400, 500, 1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10,000, 15,000, 20,000, 25,000, 30,000, 35,000, 40,000, 45,000, 50,000, 55,000, 60,000, 65,000, 70,000, 75,000, 80,000, 85,000, 90,000, 100,000, 200,000, 300,000, 400,000, 500,000, 600,000, 700,000, 800,000, 900,000, 1,000,000 or more kilobases in length. In other embodiments, a genomic region is at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more megabases in length.
 As used herein, the terms "nucleic acid molecule," "nucleic acid sequence," "nucleic acid fragment" and "polynucleotide" are used interchangeably and are intended to include, but are not limited to, a polymeric form of nucleotides that may have various lengths, either deoxyribonucleotides or ribonucleotides, or analogs thereof. Non-limiting examples of polynucleotides include a gene, a gene fragment, an exon, an intron, intergenic DNA (including, without limitation, heterochromatic DNA), messenger RNA (mRNA), transfer RNA, ribosomal RNA, siRNA, ribozymes, cDNA, recombinant polynucleotides, branched polynucleotides, plasmids, vectors, isolated DNA of a sequence, isolated RNA of a sequence, nucleic acid probes, and primers. Polynucleotides useful in the methods of the invention may comprise natural nucleic acid sequences and variants thereof, artificial nucleic acid sequences, or a combination of such sequences.
 A polynucleotide is typically composed of a specific sequence of four nucleotide bases: adenine (A); cytosine (C); guanine (G); and thymine (T) (uracil (U) for thymine (T) when the polynucleotide is RNA). Thus, the term "polynucleotide sequence" is the alphabetical representation of a polynucleotide molecule; alternatively, the term may be applied to the polynucleotide molecule itself. This alphabetical representation can be input into databases in a computer having a central processing unit and used for bioinformatics applications such as functional genomics and homology searching. Polynucleotides may optionally include one or more non-standard nucleotide(s), nucleotide analog(s) and/or modified nucleotides.
 Examples of modified nucleotides include, but are not limited to 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil, hypoxanthine, xanthine, 4-acetylcytosine, 5-(carboxyhydroxylmethyl)uracil, 5-carboxymethylaminomethyl-2-thiouridine, 5-carboxymethylaminomethyluracil, dihydrouracil, beta-D-galactosylqueosine, inosine, N6-isopentenyladenine, 1-methylguanine, 1-methylinosine, 2,2-dimethylguanine, 2-methyladenine, 2-methylguanine, 3-methylcytosine, 5-methylcytosine, N6-adenine, 7-methylguanine, 5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil, beta-D-mannosylqueosine, 5'-methoxycarboxymethyluracil, 5-methoxyuracil, 2-methylthio-N6-isopentenyladenine, uracil-5-oxyacetic acid (v), wybutoxosine, pseudouracil, queosine, 2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil, 4-thiouracil, 5-methyluracil, uracil-5-oxyacetic acid methylester, uracil-5-oxyacetic acid (v), 5-methyl-2-thiouracil, 3-(3-amino-3-N-2-carboxypropyl) uracil, (acp3)w, 2,6-diaminopurine and the like. Nucleic acid molecules may also be modified at the base moiety, sugar moiety or phosphate backbone.
 In certain exemplary embodiments, a set of universal primers (e.g., amplification oligonucleotide sequences) may be used to amplify a plurality of locus-specific primers. In exemplary embodiments, amplification oligonucleotide sequences may be designed to be temporary, e.g., to permit removal of the amplification oligonucleotide sequences from the amplification primer at a desired stage after amplification. Amplification oligonucleotide sequences may be designed so as to be removable by chemical, thermal, light based, or enzymatic cleavage. Cleavage may occur upon addition of an external factor (e.g., an enzyme, chemical, heat, light, etc.) or may occur automatically after a certain time period (e.g., after n rounds of amplification).
 Amplification oligonucleotides may be prepared by any method known in the art for the preparation of oligonucleotides having a desired sequence. For example, oligonucleotides may be isolated from natural sources or purchased from commercial sources. In an exemplary embodiment, amplification oligonucleotides may be synthesized on a solid support in an array format, e.g., a microarray of single stranded DNA segments synthesized in situ on a common substrate wherein each oligonucleotide is synthesized on a separate feature or location on the substrate.
 Arrays may be constructed, custom ordered, or purchased from a commercial vendor. Various methods for constructing arrays are well known in the art. For example, methods and techniques applicable to the synthesis of oligonucleotides on a solid support, e.g., in an array format, have been described, for example, in WO 00/58516, U.S. Pat. Nos. 5,143,854, 5,242,974, 5,252,743, 5,324,633, 5,384,261, 5,405,783, 5,424,186, 5,451,683, 5,482,867, 5,491,074, 5,527,681, 5,550,215, 5,571,639, 5,578,832, 5,593,839, 5,599,695, 5,624,711, 5,631,734, 5,795,716, 5,831,070, 5,837,832, 5,856,101, 5,858,659, 5,936,324, 5,968,740, 5,974,164, 5,981,185, 5,981,956, 6,025,601, 6,033,860, 6,040,193, 6,090,555, 6,136,269, 6,269,846 and 6,428,752, Fodor et al. (1991) Science 251:767, Zhou et al. (2004) Nucleic Acids Res. 32:5409 and Tian et al. (2004) Nature 432:1050.
 Amplification primers may be designed with the aid of a computer program, such as, for example, DNAWorks (Hoover and Lubkowski (2002) Nucleic Acids Res. 30:e43) or Gene2Oligo (Rouillard et al. (2004) Nucleic Acids Res. 32:W176). Typically, amplification primers are from about 5 to about 500, about 10 to about 100, about 10 to about 50, or about 10 to about 30 nucleotides in length. In exemplary embodiments, a set of amplification primers and/or locus-specific oligonucleotide sequences or a plurality of sets of primers and/or locus-specific oligonucleotide sequences may be designed so as to have substantially similar melting temperatures to facilitate manipulation of a complex reaction mixture. The melting temperature may be influenced, for example, by primer length and nucleotide composition.
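 As a non-limiting sketch of how primer length and nucleotide composition together set the melting temperature, the widely used Wallace rule (Tm ≈ 2 °C per A/T plus 4 °C per G/C, a rough estimate intended for short oligonucleotides) may be used to pick, for each site, the shortest priming sequence that reaches a common target Tm. The Wallace rule is swapped in here for illustration; the disclosure does not state which Tm model is used, and the function names and default length bounds are assumptions.

```python
def wallace_tm(seq):
    """Rough Tm estimate in degrees C: 2 per A/T base plus 4 per G/C base."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def shortest_primer_reaching_tm(template, target_tm, min_len=10, max_len=30):
    """Return the shortest prefix of `template` whose estimated Tm reaches
    `target_tm`, or None if no prefix within the length bounds does."""
    upper = min(max_len, len(template))
    for length in range(min_len, upper + 1):
        candidate = template[:length]
        if wallace_tm(candidate) >= target_tm:
            return candidate
    return None


# A GC-rich site needs fewer bases than an AT-rich site for the same target.
gc_rich = shortest_primer_reaching_tm("GC" * 15, 60)
at_rich = shortest_primer_reaching_tm("AT" * 15, 60)
```

 Under this estimate the GC-rich priming sequence is 15 nucleotides long and the AT-rich one is 30 nucleotides long, which is one way priming sequences of different lengths can share a similar annealing temperature.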
 Typically, selective hybridization occurs when two nucleic acid sequences are substantially complementary, i.e., at least about 65%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 99.1%, 99.2%, 99.3%, 99.4%, 99.5%, 99.6%, 99.7%, 99.8%, 99.9% or 100% complementary over a stretch of at least 14 to 25 nucleotides. See Kanehisa (1984) Nucleic Acids Res. 12:203.
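 The notion of percent complementarity over a stretch of nucleotides may be illustrated with the following minimal sketch, which assumes ungapped, antiparallel Watson-Crick pairing between two equal-length DNA sequences written 5' to 3'. The function names are hypothetical, and real hybridization behavior also depends on the conditions discussed herein.

```python
_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq.upper()))


def percent_complementarity(primer, target_site):
    """Percentage of primer positions that pair with `target_site`.

    Both sequences are given 5'->3' and must have the same length; the
    primer is compared against the reverse complement of the target site.
    """
    if len(primer) != len(target_site) or not primer:
        raise ValueError("sequences must be non-empty and of equal length")
    expected = reverse_complement(target_site)
    matches = sum(p == e for p, e in zip(primer.upper(), expected))
    return 100.0 * matches / len(primer)


target = "ATGCATGCATGCATGCATGC"                      # 20-nt target site
perfect = reverse_complement(target)                 # fully complementary
one_mismatch = "A" + perfect[1:]                     # first base altered
```

 Here `percent_complementarity(perfect, target)` evaluates to 100.0 and `percent_complementarity(one_mismatch, target)` to 95.0, i.e., 19 of 20 positions paired.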
 Overall, five factors influence the efficiency and selectivity of hybridization of the primer to a second nucleic acid molecule. These factors, which are (i) primer length, (ii) the nucleotide sequence and/or composition, (iii) hybridization temperature, (iv) buffer chemistry and (v) the potential for steric hindrance in the region to which the primer is required to hybridize, are important considerations when non-random priming sequences are designed.
 There is a positive correlation between primer length and both the efficiency and accuracy with which a primer will anneal to a target sequence; longer sequences have a higher Tm than do shorter ones, and are less likely to be repeated within a given target sequence, thereby cutting down on promiscuous hybridization. Primer sequences with a high G-C content or that comprise palindromic sequences tend to self-hybridize, as do their intended target sites, since unimolecular, rather than bimolecular, hybridization kinetics are generally favored in solution; at the same time, it is important to design a primer containing sufficient numbers of G-C nucleotide pairings to bind the target sequence tightly, since each such pair is bound by three hydrogen bonds, rather than the two that are found when A and T bases pair.
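 Two of the sequence properties discussed above, G-C content and palindromic (self-complementary) sequence, can be screened computationally. The following is a minimal sketch under the assumption that a primer is flagged when it equals its own reverse complement; the function names are hypothetical, and partial self-complementarity (e.g., hairpin stems) is not detected by this simple check.

```python
_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq.upper()))


def gc_fraction(seq):
    """Fraction of bases in `seq` that are G or C."""
    if not seq:
        raise ValueError("sequence must not be empty")
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def is_palindromic(seq):
    """True if the sequence reads the same as its reverse complement,
    so that two copies of the primer could anneal to each other."""
    seq = seq.upper()
    return seq == reverse_complement(seq)


examples = ["GAATTC", "GATTACA", "GGCC"]
report = [(s, gc_fraction(s), is_palindromic(s)) for s in examples]
```

 In this sketch `GAATTC` and `GGCC` are flagged as palindromic while `GATTACA` is not, and `GGCC` has a G-C fraction of 1.0.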
 Hybridization temperature varies inversely with primer annealing efficiency, as does the concentration of organic solvents, e.g., formamide, that might be included in a hybridization mixture, while increases in salt concentration facilitate binding. Under stringent hybridization conditions, longer probes hybridize more efficiently than do shorter ones, which are sufficient under more permissive conditions. Stringent hybridization conditions typically include salt concentrations of less than about 1 M, less than about 500 mM, or less than about 200 mM. Hybridization temperatures range from as low as 0 °C to greater than 22 °C, greater than about 30 °C, and (most often) in excess of about 37 °C. In certain exemplary embodiments, hybridization temperatures will be greater than 50 °C, e.g., between about 50 °C and 65 °C or between about 55 °C and 60 °C. Longer fragments may require higher hybridization temperatures for specific hybridization. As several factors affect the stringency of hybridization, the combination of parameters is more important than the absolute measure of any one alone. Hybridization conditions are known to those skilled in the art and can be found in Current Protocols in Molecular Biology, John Wiley & Sons, N.Y. (1989), 6.3.1-6.3.6.
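 The qualitative dependence of duplex stability on salt, formamide, length and G-C content may be illustrated with one commonly quoted empirical approximation for the melting temperature. The numerical constants below differ between references and are assumptions of this sketch, not values taken from the disclosure; the function name is hypothetical, and the result should be read as a trend rather than a prediction.

```python
import math


def approx_duplex_tm(length, gc_percent, na_molar, formamide_percent=0.0):
    """Empirical Tm estimate in degrees C.

    Tm = 81.5 + 16.6*log10([Na+]) + 0.41*(%GC) - 600/length - 0.63*(%formamide)

    The constants are approximate literature values (assumed here) and
    vary from source to source.
    """
    if length <= 0:
        raise ValueError("length must be positive")
    if na_molar <= 0:
        raise ValueError("salt concentration must be positive")
    return (
        81.5
        + 16.6 * math.log10(na_molar)
        + 0.41 * gc_percent
        - 600.0 / length
        - 0.63 * formamide_percent
    )


low_salt = approx_duplex_tm(25, 50.0, 0.05)
high_salt = approx_duplex_tm(25, 50.0, 0.20)
with_formamide = approx_duplex_tm(25, 50.0, 0.20, formamide_percent=20.0)
longer_probe = approx_duplex_tm(50, 50.0, 0.20)
```

 Consistent with the text above, the estimate rises with salt concentration and probe length and falls when formamide is added.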
 In certain embodiments, it may be desirable to utilize locus-specific oligonucleotide sequences having one or more modifications such as a cap (e.g., to prevent exonuclease cleavage), a linking moiety (such as those that facilitate immobilization of an oligonucleotide onto a substrate), or a label (e.g., to facilitate detection, isolation and/or immobilization of a nucleic acid construct, e.g., via a binding partner). Suitable modifications include, for example, various enzymes, prosthetic groups, luminescent markers, bioluminescent markers, fluorescent markers (e.g., fluorescein), radiolabels (e.g., 32P, 35S, etc.), biotin, polypeptide epitopes, etc. Based on the disclosure herein, one of skill in the art will be able to select an appropriate primer modification for a given application.
 As used herein, the term "binding partner" is intended to include, but is not limited to, two or more moieties that bind to one another. Non-limiting examples of binding partners include biotin-avidin and biotin-streptavidin; ligands such as digoxigenin, fluorescein, nitrophenyl moieties and a number of peptide epitopes for which selective monoclonal antibodies exist; metal ion binding ligands such as hexahistidine, which readily binds ions; and the like. In an exemplary embodiment, the locus-specific oligonucleotide sequences described herein include biotin-modified nucleotides. The biotin-labeled, locus-specific oligonucleotide sequences can then be retrieved by binding immobilized avidin or streptavidin.
 In certain embodiments, a binding partner may be immobilized on a substrate. Substrates include, but are not limited to, columns, beads, cells (e.g., S. aureus), agarose, slides, chips, particles, strands, gels, sheets, tubing, spheres, containers, capillaries, pads, slices, films, plates and the like.
 As used herein, the terms "bind," "binding," "interact" and "interacting" refer to both covalent interactions and noncovalent interactions. A covalent interaction is a chemical linkage between two atoms or radicals formed by the sharing of a pair of electrons (i.e., a single bond), two pairs of electrons (i.e., a double bond) or three pairs of electrons (i.e., a triple bond). Covalent interactions are also known in the art as electron pair interactions or electron pair bonds. Noncovalent interactions, which are typically much weaker than covalent interactions, include, but are not limited to, van der Waals interactions, hydrogen bonds, weak chemical bonds (i.e., via short-range noncovalent forces), hydrophobic interactions, ionic bonds and the like. A review of noncovalent interactions can be found in Alberts et al., in Molecular Biology of the Cell, 3d edition, Garland Publishing, 1994.
 In certain embodiments, methods of amplifying amplification primers are provided. Such methods include, but are not limited to, polymerase chain reaction (PCR), bridge PCR, thermophilic helicase-dependent amplification (tHDA), linear polymerase reactions, strand displacement amplification (e.g., multiple displacement amplification), RCA (e.g., hyperbranched RCA, padlock probe RCA, linear RCA and the like), nucleic acid sequence-based amplification (NASBA) and the like, which are disclosed in the following references: Nilsson et al. supra; Schweitzer et al. (2002) Nat. Biotech. 20:359; Demidov (2002) Expert Rev. Mol. Diagn. 2(6):89 (RCA); Mullis et al., U.S. Patent Nos. 4,683,195; 4,965,188; 4,683,202; 4,800,159 (PCR); Gelfand et al., U.S. Patent No. 5,210,015 (real-time PCR with "Taqman" probes); Wittwer et al., U.S. Patent No. 6,174,670; Kacian et al., U.S. Patent No. 5,399,491 (NASBA); Lizardi, U.S. Patent No. 5,854,033; Aono et al., Japanese Patent Pub. JP 4-262799 (rolling circle amplification); Church, U.S. Patent Nos. 6,432,360, 6,511,803 and 6,485,944 (replica amplification, e.g., "polony amplification"); and the like.
 Certain exemplary embodiments pertain to methods of amplifying amplification primers by circularizing the amplification primers and performing rolling circle amplification (RCA). Several suitable RCA methods are known in the art. For example, linear RCA amplifies circular DNA by polymerase extension of a complementary primer. This process generates concatemerized copies of the circular DNA template such that multiple copies of a DNA sequence arranged end to end in tandem are generated. Exponential RCA is similar to the linear process except that it uses a second primer of identical sequence to the DNA circle (Lizardi et al. (1998) Nat. Genet. 19:225). This two-primer system achieves isothermal, exponential amplification. Exponential RCA has been applied to the amplification of non-circular DNA through the use of a linear probe that binds at both of its ends to contiguous regions of a target DNA followed by circularization using DNA ligase (i.e., padlock RCA) (Nilsson et al. (1994) Science 265(5181):2085). Hyperbranched RCA uses a second primer complementary to the rolling circle replication (RCR) product. This allows RCR products to be replicated by a strand-displacement mechanism, which can yield a billion-fold amplification in an isothermal reaction (Dahl et al. (2004) Proc. Natl. Acad. Sci. U.S.A. 101(13):4548).
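 The tandem, end-to-end arrangement of copies produced by linear RCA may be pictured with the following toy model, offered purely as a sequence-level sketch. It assumes that the circle is written 5' to 3' from an arbitrary linearization point and that the primer's 3' end sits at that point, so that each lap of the polymerase adds one reverse-complement copy of the circle; the function names are hypothetical and no enzymology is modeled.

```python
_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq.upper()))


def linear_rca_product(circle, laps):
    """Return the single-stranded concatemer made after `laps` full trips
    of the polymerase around the circular template."""
    if laps < 0:
        raise ValueError("laps must be non-negative")
    unit = reverse_complement(circle)  # one copy, complementary to the circle
    return unit * laps                 # copies joined end to end in tandem


product = linear_rca_product("ATGC", 3)
```

 For the four-base toy circle `ATGC`, three laps give `GCATGCATGCAT`, i.e., three tandem copies of the complementary unit `GCAT`.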
 In certain embodiments, amplification primers are contacted with one or more polymerases having a strand-displacement activity (e.g., during or after amplification). In certain exemplary embodiments, a polymerase having a strand-displacement activity is a thermophilic DNA polymerase. Suitable polymerases include, but are not limited to, polymerases having a strand displacement activity such as Bst DNA polymerase, phi-29 DNA polymerase, bacteriophage T5 DNA polymerase, Vent DNA polymerase, Vent (exo-) DNA polymerase, Deep Vent DNA polymerase, Deep Vent (exo-) DNA polymerase, 9°Nm DNA polymerase, THERMINATOR™ DNA polymerase (New England Biolabs, Beverly, MA), THERMOPHI™ DNA polymerase (Prokaria, Reykjavik, Iceland), TOPOTAQ™ DNA polymerase (Fidelity Systems, USA), Tth DNA polymerase (Promega, WI), MMuLV reverse transcriptase, Klenow fragment DNA polymerase I, Klenow fragment 3' to 5' exo- and the like.
 In certain embodiments, amplification primers are contacted with one or more high fidelity polymerases containing a proofreading 3'-exonuclease activity (e.g., during or after amplification) alone or in conjunction with one or more of the polymerases described above. Suitable polymerases include, but are not limited to, polymerases having a 3'-exonuclease activity such as PHUSION™ DNA polymerase (New England Biolabs, Beverly, MA), Pfu DNA polymerase, PFU TURBO® (Stratagene, La Jolla, CA), PFU ULTRA™ DNA polymerase (Stratagene, La Jolla, CA), KOD DNA polymerase (Novagen, San Diego, CA), phi-29 DNA polymerase, T4 DNA polymerase, DNA polymerase I, DNA polymerase I (Klenow fragment), T7 DNA polymerase, Vent DNA polymerase, Deep Vent DNA polymerase, 9°Nm DNA polymerase and the like.
 In one embodiment, amplification oligonucleotide sequences may be removed from an amplification primer by chemical cleavage. For example, amplification oligonucleotide sequences having acid labile or base labile sites may be used for amplification. The amplified pool may then be exposed to acid or base to remove the amplification oligonucleotide sequences from the amplification primers such that locus-specific oligonucleotide sequences are released from the amplification primers. Alternatively, the amplification oligonucleotide sequences may be removed by exposure to heat and/or light. For example, amplification oligonucleotide sequences having heat labile or photolabile sites may be used for amplification. The amplified pool may then be exposed to heat and/or light to remove the amplification oligonucleotide sequences from the amplification primer. In another embodiment, RNA may be used for amplification oligonucleotide sequences thereby forming short stretches of RNA/DNA hybrids at the ends of the nucleic acid molecule. The amplification oligonucleotide sequences may then be removed by exposure to an RNase (e.g., RNase H).
 Exemplary chemically cleavable internucleotide linkages for use in the methods described herein include, for example, β-cyano ether, 5'-deoxy-5'-aminocarbamate, 3'-deoxy-3'-aminocarbamate, urea, 2'-cyano-3',5'-phosphodiester, 3'-(S)-phosphorothioate, 5'-(S)-phosphorothioate, 3'-(N)-phosphoramidate, 5'-(N)-phosphoramidate, α-amino amide, vicinal diol, ribonucleoside insertion, 2'-amino-3',5'-phosphodiester, allylic sulfoxide, ester, silyl ether, dithioacetal, 5'-thio-furmal, α-hydroxy-methyl-phosphonic bisamide, acetal, 3'-thio-furmal, methylphosphonate and phosphotriester. Internucleoside silyl groups such as trialkylsilyl ether and dialkoxysilane are cleaved by treatment with fluoride ion. Base-cleavable sites include β-cyano ether, 5'-deoxy-5'-aminocarbamate, 3'-deoxy-3'-aminocarbamate, urea, 2'-cyano-3',5'-phosphodiester, 2'-amino-3',5'-phosphodiester, ester and ribose. Thio-containing internucleotide bonds such as 3'-(S)-phosphorothioate and 5'-(S)-phosphorothioate are cleaved by treatment with silver nitrate or mercuric chloride. Acid cleavable sites include 3'-(N)-phosphoramidate, 5'-(N)-phosphoramidate, dithioacetal, acetal and phosphonic bisamide. An α-aminoamide internucleoside bond is cleavable by treatment with isothiocyanate, and titanium may be used to cleave a 2'-amino-3',5'-phosphodiester-O-ortho-benzyl internucleoside bond. Vicinal diol linkages are cleavable by treatment with periodate. Thermally cleavable groups include allylic sulfoxide and cyclohexene while photo-labile linkages include nitrobenzylether and thymidine dimer. Methods of synthesizing and cleaving nucleic acids containing chemically cleavable, thermally cleavable, and photo-labile groups are described, for example, in U.S. Patent No. 5,700,642.
 In other embodiments, amplification oligonucleotide sequences may be removed using enzymatic cleavage. For example, amplification oligonucleotide sequences may be designed to include a restriction endonuclease cleavage site. After amplification, the pool of nucleic acids may be contacted with one or more endonucleases to produce double stranded breaks thereby removing the amplification oligonucleotide sequences. In certain embodiments, the right amplification oligonucleotide sequences and the left amplification oligonucleotide sequences may be removed by the same or different restriction endonucleases. Any type of restriction endonuclease may be used to remove the primers/primer binding sites from nucleic acid sequences. A wide variety of restriction endonucleases having specific binding and/or cleavage sites are commercially available, for example, from New England Biolabs (Beverly, MA). In various embodiments, restriction endonucleases that produce 3' overhangs, 5' overhangs or blunt ends may be used. When using a restriction endonuclease that produces an overhang, an exonuclease (e.g., RecJf, Exonuclease I, Exonuclease T, S1 nuclease, P1 nuclease, mung bean nuclease, CEL I nuclease, etc.) may be used to produce blunt ends. In an exemplary embodiment, amplification oligonucleotide sequences are removed by digestion with USER (Uracil-Specific Excision Reagent) (New England Biolabs, Beverly, MA).
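 At the level of sequence bookkeeping only, the release of a locus-specific insert from between a left and a right amplification oligonucleotide sequence may be sketched as follows. The adaptor sequences, the function name and the exact cut positions (immediately at the adaptor boundaries) are hypothetical and chosen for illustration; actual cut positions depend on the enzyme or cleavable linkage used.

```python
def release_insert(amplified, left_adaptor, right_adaptor):
    """Return the sequence between the two flanking adaptors, or None if
    `amplified` does not carry both adaptors at its ends."""
    amplified = amplified.upper()
    left = left_adaptor.upper()
    right = right_adaptor.upper()
    if len(amplified) < len(left) + len(right):
        return None
    if not (amplified.startswith(left) and amplified.endswith(right)):
        return None
    return amplified[len(left):len(amplified) - len(right)]


# Hypothetical adaptors and insert, for illustration only.
LEFT = "GGATCCAA"
RIGHT = "TTGGATCC"
insert = "ACGTACGTACGTAC"
released = release_insert(LEFT + insert + RIGHT, LEFT, RIGHT)
```

 In this sketch the released sequence equals the original insert, and a molecule lacking either adaptor yields `None`, so that incompletely formed products can be set aside.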
 Exemplary embodiments are directed to the use of strand displacement amplification (SDA) to amplify genomic nucleic acid sequences using the locus-specific oligonucleotide sequences described herein. SDA is an isothermal, in vitro method of amplifying DNA. A variety of SDA methods are described in the art (Walker et al. (1992) Proc. Natl. Acad. Sci. U.S.A. 20:1691; Dean et al. (2002) Proc. Natl. Acad. Sci. U.S.A. 99:5261; Hafner et al. (2001) Biotechniques 30:852; Lizardi et al. (1998) Nat. Genet. 19:225; U.S. Patent Application Serial No. 11/066,559, filed February 28, 2005). The use of SDA to amplify genomic nucleic acid sequences, e.g., genomic DNA, can lead to the generation of hyperbranched DNA sequences, which can be undesirable.
 In certain embodiments, the presence of hyperbranched DNA may be reduced using an enzyme, e.g., a polymerase, that reduces the density of branch junctions (i.e., of hyperbranched sequences) in the amplified nucleic acid pool. In certain aspects, one or more polymerases having a strand-replacement activity are contacted to an amplified nucleic acid pool (during or after amplification) in order to reduce the presence of branching junctions in the amplified nucleic acid pool. Suitable polymerases include, but are not limited to, polymerases having a strand displacement activity such as Bst DNA polymerase, phi-29 DNA polymerase, bacteriophage T5 DNA polymerase, Vent DNA polymerase, Vent (exo-) DNA polymerase, Deep Vent DNA polymerase, Deep Vent (exo-) DNA polymerase, 9°Nm DNA polymerase, THERMINATOR™ DNA polymerase (New England Biolabs, Beverly, MA), THERMOPHI™ DNA polymerase (Prokaria, Reykjavik, Iceland), TOPOTAQ™ DNA polymerase (Fidelity Systems, USA), Tth DNA polymerase (Promega, WI), MMuLV reverse transcriptase, Klenow fragment DNA polymerase I, Klenow fragment 3' to 5' exo- and the like.
 Incubation of a nucleic acid pool in the presence of a polymerase having a strand-replacement activity may not remove all branch junctions, however. Accordingly, certain embodiments are directed to the use of one or more nucleases (during or after amplification) to further reduce branch junctions and/or digest single stranded overhangs in a pool of amplified DNA molecules. In other aspects, the nuclease enzymatically removes 3' overhangs. Suitable nucleases include, but are not limited to, single-stranded DNA endonucleases such as S1 nuclease and other nucleases described further herein. Methods for reducing hyperbranched sequences are described in U.S.S.N. 60/801,340.
 In certain embodiments, methods of determining the nucleic acid sequence of one or more amplified genomic regions are provided. Determination of the nucleic acid sequence of an amplified genomic region can be performed using a variety of sequencing methods known in the art including, but not limited to, sequencing by hybridization (SBH), quantitative incremental fluorescent nucleotide addition sequencing (QIFNAS), stepwise ligation and cleavage, fluorescence resonance energy transfer (FRET), molecular beacons, TaqMan reporter probe digestion, pyrosequencing, fluorescent in situ sequencing (FISSEQ), allele-specific oligo ligation assays (e.g., oligo ligation assay (OLA), single template molecule OLA using a ligated linear probe and a rolling circle amplification (RCA) readout, ligated padlock probes, and/or single template molecule OLA using a ligated circular padlock probe and a rolling circle amplification (RCA) readout) and the like. A variety of light-based sequencing technologies are known in the art (Landegren et al. (1998) Genome Res. 8:769-76; Kwok (2000) Pharmacogenomics 1:95-100; and Shi (2001) Clin. Chem. 47:164-172).
 Examples of detectable markers include various radioactive moieties, enzymes, prosthetic groups, fluorescent markers, luminescent markers, bioluminescent markers and the like. Examples of fluorescent markers include, but are not limited to, yellow fluorescent protein (YFP), green fluorescent protein (GFP), cyan fluorescent protein (CFP), umbelliferone, fluorescein, fluorescein isothiocyanate, rhodamine, dichlorotriazinylamine fluorescein, dansyl chloride, phycoerythrin and the like. Examples of bioluminescent markers include, but are not limited to, luciferase (e.g., bacterial, firefly, click beetle and the like), luciferin, aequorin and the like. Examples of enzyme systems having visually detectable signals include, but are not limited to, galactosidases, glucuronidases, phosphatases, peroxidases, cholinesterases and the like. Identifiable markers also include radioactive compounds such as 125I, 35S, 14C, or 3H. Identifiable markers are commercially available from a variety of sources.
 It is to be understood that the embodiments of the present invention which have been described are merely illustrative of some of the applications of the principles of the present invention. Numerous modifications may be made by those skilled in the art based upon the teachings presented herein without departing from the true spirit and scope of the invention. The contents of all references, patents and published patent applications cited throughout this application are hereby incorporated by reference in their entirety for all purposes.
 The following examples are set forth as being representative of the present invention. These examples are not to be construed as limiting the scope of the invention as these and other equivalent embodiments will be apparent in view of the present disclosure, figures, tables, and accompanying claims.
EXAMPLE I Selective Hyperbranched Amplification Method
 The selective hyperbranched amplification method relies on specific annealing of a complex library (e.g., approximately 10,000 species per megabase) of single-stranded oligonucleotides (e.g., primers) to genomic DNA, and selective amplification of the target region by hyperbranched amplification through a strand displacement mechanism (Figure 1). To ensure high specificity towards the target genomic region, each primer will have a unique binding site in the target genome. In addition, all primers will have similar melting temperatures (Tm), e.g., close to 60 °C, and the amplification will be performed at approximately 60 °C to 65 °C using thermophilic DNA polymerases having a strand-displacement activity (such as Bst DNA polymerase, Vent exo- DNA polymerase and the like). To reduce amplification error and improve amplification yield, one or more high-fidelity DNA polymerases containing a proof-reading 3'-exonuclease activity (e.g., Phusion DNA polymerase, Pfu/Pfu Turbo/Pfu Ultra, KOD and the like) will be included in the reactions in addition to Bst or Vent exo-. For degraded or fragmented genomic DNA templates, the amplification yield will be limited by the length of the template DNA molecules. Pre-amplification ligation will be performed on such templates to link template DNA fragments into concatenated sequences and/or circular sequences. Chimeric artifacts generated by ligation will be easily detected and resolved computationally by performing deep (i.e., redundant, e.g., greater than 4x) shotgun sequencing.
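 By way of non-limiting illustration, the two primer criteria stated above (a unique binding site in the target genome and a Tm close to 60 °C) can be screened computationally. The following Python sketch rests on several assumptions that are not part of the present disclosure: Tm is estimated with a commonly used GC-content approximation rather than a nearest-neighbor model, only exact matches on either strand are counted as binding sites, and the function names and the 3 °C tolerance are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def basic_tm(primer: str) -> float:
    """Approximate Tm in degrees C from GC content (rough estimate only)."""
    gc = primer.count("G") + primer.count("C")
    return 64.9 + 41.0 * (gc - 16.4) / len(primer)


def count_occurrences(query: str, target: str) -> int:
    """Count exact, possibly overlapping, occurrences of query in target."""
    count = 0
    start = target.find(query)
    while start != -1:
        count += 1
        start = target.find(query, start + 1)
    return count


def binding_sites(primer: str, genome: str) -> int:
    """Count exact matches of the primer on both strands of the genome."""
    total = count_occurrences(primer, genome)
    rc = reverse_complement(primer)
    if rc != primer:  # avoid double counting palindromic primers
        total += count_occurrences(rc, genome)
    return total


def select_primers(candidates, genome, target_tm=60.0, tolerance=3.0):
    """Keep candidates with a unique binding site and Tm near target_tm."""
    return [
        p for p in candidates
        if abs(basic_tm(p) - target_tm) <= tolerance
        and binding_sites(p, genome) == 1
    ]
```

 A practical screen would additionally tolerate mismatches when counting binding sites, since a primer may anneal to near-identical sequences elsewhere in the genome.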
EXAMPLE II Generating Locus-Specific Primers
 A key component of the selective hyperbranched amplification method described herein is the ability to generate a library of single-stranded oligonucleotides. Synthesizing 1x10^4 to 1x10^5 oligonucleotides using traditional column-based solid-phase DNA synthesis methods is cost prohibitive (i.e., approximately $25,000 per 10,000 25-mers). As described herein, a programmable DNA chip will be used to synthesize a large number of oligonucleotides at the atto- to femto-mole scale, and nucleotide probes will be generated (see U.S.S.N. 60/846,256) to produce an oligonucleotide library in large quantities (Figure 2).
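 By way of non-limiting illustration, the cost figure stated above (approximately $25,000 per 10,000 25-mers) corresponds to approximately $2.50 per oligonucleotide, or approximately $0.10 per base. The following Python sketch extrapolates that figure across the stated library sizes; the assumption that cost scales linearly with the total number of bases synthesized, and the function and constant names, are illustrative and not taken from the present disclosure.

```python
# Reference figure from the text: about $25,000 per 10,000 25-mers.
REFERENCE_COST_USD = 25_000
REFERENCE_OLIGOS = 10_000
REFERENCE_LENGTH = 25

# Implied cost per base under the reference figure.
cost_per_base_usd = REFERENCE_COST_USD / (REFERENCE_OLIGOS * REFERENCE_LENGTH)


def column_synthesis_cost_usd(n_oligos: int, length: int = 25) -> float:
    """Estimate column-synthesis cost, assuming linear scaling per base."""
    total_bases = n_oligos * length
    reference_bases = REFERENCE_OLIGOS * REFERENCE_LENGTH
    return REFERENCE_COST_USD * total_bases / reference_bases


for n in (10_000, 100_000):
    print(f"{n} 25-mers: about ${column_synthesis_cost_usd(n):,.0f}")
```

 Under this linear assumption, a library of 1x10^5 25-mers would cost approximately $250,000 by column-based synthesis.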
 Amplification primers containing two amplification oligonucleotide sequences (LAA and RAA) and one or more locus-specific oligonucleotide sequences will be synthesized on programmable DNA chips. Chip-synthesized amplification primers will be amplified as a pool by either PCR (Figure 2B, Method A, left panel) or circularization/hyperbranched RCA (hRCA) (Figure 2B, Method B, right panel). The resulting double-stranded amplicons will be converted into single-stranded forms by single-stranded exonuclease (e.g., T7 exonuclease, lambda exonuclease or the like). The locus-specific oligonucleotide sequences will be released by oligonucleotide-guided restriction enzyme digestion and/or USER™ enzyme (New England Biolabs, Beverly, MA) digestion. To achieve clean separation of locus-specific oligonucleotide sequences from the genomic template during selective hyperbranched amplification, biotin-modified nucleotides will be incorporated into the locus-specific primers during the synthesis phase so that the amplicons can be captured with streptavidin-coated magnetic beads.
|International Classification||C12Q1/68, C12P19/34|
|10 Dec 2008||121||EP: the EPO has been informed by WIPO that EP was designated in this application||Ref document number: 08745207; Country of ref document: EP; Kind code of ref document: A1|
|14 Oct 2009||NENP||Non-entry into the national phase in:||Ref country code: DE|
|5 May 2010||122||EP: PCT app. not ent. Europ. phase||Ref document number: 08745207; Country of ref document: EP; Kind code of ref document: A1|