ZFN-Site Searches Genomes For Zinc Finger Nuclease Target .


ZFN-Site searches genomes for zinc finger nuclease target sites and off-target sites
Cradick et al. BMC Bioinformatics 2011, 2 (13 May 2011)

SOFTWARE, Open Access

ZFN-Site searches genomes for zinc finger nuclease target sites and off-target sites

Thomas J Cradick1*†, Giovanna Ambrosini2,3†, Christian Iseli2,4, Philipp Bucher2,3 and Anton P McCaffrey1

Abstract

Background: Zinc Finger Nucleases (ZFNs) are man-made restriction enzymes useful for manipulating genomes by cleaving target DNA sequences. ZFNs allow therapeutic gene correction or creation of genetically modified model organisms. ZFN specificity is not absolute; therefore, it is essential to select ZFN target sites without similar genomic off-target sites. It is important to assay for off-target cleavage events at sites similar to the target sequence.

Results: ZFN-Site is a web interface that searches multiple genomes for ZFN off-target sites. Queries can be based on the target sequence or can be expanded using degenerate specificity to account for known ZFN binding preferences. ZFN off-target sites are outputted with links to genome browsers, facilitating off-target cleavage site screening. We verified ZFN-Site using previously published ZFN half-sites and located their target sites and their previously described off-target sites. While we have tailored this tool to ZFNs, ZFN-Site can also be used to find potential off-target sites for other nucleases, such as TALE nucleases.

Conclusions: ZFN-Site facilitates genome searches for possible ZFN cleavage sites based on user-defined stringency limits. ZFN-Site is an improvement over other methods because the FetchGWI search engine uses an indexed search of genome sequences for all ZFN target sites and possible off-target sites matching the half-sites and stringency limits. Therefore, ZFN-Site does not miss potential off-target sites.

Background

The ability to create double-stranded DNA breaks at specific genomic sequences is important for gene correction therapeutics, targeted gene integration and gene modification for research models as well as gene disruption [1].
Zinc Finger Nucleases (ZFNs) are promising candidates for such specific nucleases. ZFNs consist of the sequence-independent FokI nuclease domain fused to zinc finger proteins (ZFPs). ZFPs can be altered to change their sequence specificity. Cleavage of targeted DNA requires binding of two ZFNs (designated left and right) to adjacent half-sites on opposite strands with correct orientation and spacing, thus forming a FokI dimer [2]. The requirement for dimerization increases ZFN specificity significantly. Three or four finger ZFPs target 9 or 12 bases per ZFN, or 18 or 24 bases for the ZFN pair. ZFN pairs have been used for gene targeting at specific genomic loci in insect, plant, animal and human cells [3-10] (and reviewed in [11,12]). Methods are available to measure general ZFN toxicity or the amount of unrepaired DNA ends resulting from ZFN treatment [13-16]; however, determining all possible off-target cleavage sites may be challenging, as some possible cleavage sites can be missed by BLAST and similar methods. ZFN-Site determines the most probable off-target sites for further analysis or testing. Several ZFN design web tools exist that offer BLAST-based searches for potential ZFN off-target sites [17-22]. BLAST searches, which implement a local alignment search, are not optimal for finding ZFN off-target sites and may miss some sites because they utilize seed-based methods with a non-overlapping word index to search only for perfect matches, rather than longer imperfect matches. BLAST also uses an E-value threshold that does not directly correspond to a “# of mismatches” threshold. ZFN-Site is more thorough because it scans one index entry for each nucleotide in the genome, ensuring that no matches are missed. ZFN-Site was created to provide a simple, easy-to-use interface that does not require the end user to possess specialized bioinformatics or search algorithm expertise. ZFN-Site provides an interface that searches multiple genomes for sites with ambiguities, mismatches, multiple spacings, hetero-dimeric binding sites and homo-dimeric binding sites composed of two left or two right ZFN half-sites. Changing these parameters can expand the number of possible off-target sites returned to match the purpose. A larger list enables thorough screening for potential ZFN off-target sites using new methods, such as high-throughput sequencing or mutation screens.

* Correspondence: tj@alum.mit.edu
† Contributed equally
1 University of Iowa School of Medicine, Department of Internal Medicine, Iowa City, Iowa, 52245, USA
Full list of author information is available at the end of the article

© 2011 Cradick et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Implementation

ZFN-Site was developed to quickly locate all possible ZFN target and off-target sites that might be cleaved. Based on the tailoring of search parameters, ZFN-Site generates sets of search strings. To ensure that all sites matching these criteria are found in the requested genomes, ZFN-Site employs the FetchGWI search engine [23]. The input can be either the nucleotide sequence of the intended target site of each ZFN (basic search) or information about each ZFN’s binding specificity (relaxed specificity). The number of possible sites is expanded by choice of ZFN spacing, the possibility of ZFN homo-dimerization (see below) and the number of allowed mismatches. The output from ZFN-Site aids in the choice of ZFN pairs that minimize potential off-target sites and allows experimental testing of each ZFN pair’s off-target sites in cells or in mutated animals. Experimentally testing the list of found sites under a series of different conditions may determine the conditions favoring more specific targeting and fewer off-target cleavage events.

Basic Target Search

The simplest search method uses the intended target site to scan whole genomes.
This type of search is valuable when choosing prospective target sites or when there is no available ZFP mismatch specificity data. ZFN-Site allows searches for off-target sites containing up to two mismatches per half-site. ZFN-Site outputs all target and off-target sites matching the selection criteria.

The genome or genomes to be searched are chosen by clicking on the species list on the left side of the ZFN-Site web page. Scrolling down reveals the full list. Use command-click (Mac) or control-click (PC) to choose multiple genomes to be searched simultaneously. A click on ALL searches the entire list of genomes shown in Table 1.

Table 1 List of Genomes Scanned by ZFN-Site

Genome Release (Code)                      Species
Homo sapiens (HS)                          Human
Mus musculus (MM)                          Mouse
Danio rerio Zv6 (DR)                       Zebrafish
Danio rerio Zv5 (DR5)                      Zebrafish
Drosophila melanogaster (DM)               Fruit Fly
Apis mellifera (AME)                       Bee
Bos taurus (BT)                            Cow
Caenorhabditis elegans NCBI WS170 (CE)     Nematode
Canis familiaris (CFA)                     Dog
Pan troglodytes (PTR)                      Chimpanzee
Rattus norvegicus (RN)                     Rat
Saccharomyces cerevisiae (SCE)             Yeast
Tribolium castaneum (TCA)                  Beetle
All genomes (ALL)                          All of the above

Half-sites are entered without spaces, 5’ to 3’, as they occur on the opposite strand of a ZFN target. The following sequence is an example of the top DNA strand of a three finger ZFN pair target site: 5’-CGGAGCCGCTTTaacccACTCTGTGGAAG-3’ [3]. The right ZFN half-site (underlined in the original article; the 12 bases following the lower-case spacer) should be entered into the program 5’-3’ as ACTCTGTGGAAG. The left ZFN half-site is the reverse complement of the bold sequence (the 12 bases preceding the spacer) and should be entered 5’-3’ as AAAGCGGCTCCG (Figure 1).

The sequence of the DNA spacer between ZFN half-sites (lower case, above) does not greatly influence ZFN specificity, but the length of the spacer between half-sites influences how well a site is cleaved [24]. The allowed number of spacer nucleotides depends on the ZFP-to-FokI linker and is usually five or six nucleotides, although ZFNs with altered linkers have different nucleotide length preferences [25,26].
Genome searches can be run on ZFN-Site with one allowed spacing between half-sites or two spacings if entered separated by a comma (e.g., 5,6). Searches can be repeated using alternate spacings if searching with more than two spacings is required.

In addition to a left ZFN and a right ZFN binding as hetero-dimers, two left or two right ZFNs can bind correctly spaced sites to form homo-dimers and cleave off-target sites [16]. If the “Allow Left and Right Protein Homo-dimerization” box is checked, ZFN-Site also searches for homo-dimeric sites. Use of modified FokI domains may prevent cleavage at most homo-dimeric sites [13,27]. However, identification of homo-dimeric sites and experimental testing for cleavage at each site on these output lists may be necessary to quantitate low levels of cleavage and generate further predictive rules for off-target cleavage events. The specificity of nuclease variants can be experimentally tested using cleavage analysis on the sites comprising the lists of possible off-target sites generated by ZFN-Site [13,25,27-29].

ZFN-Site expands the query targets into a list of queries (or tags) based on the half-sites and inputs. Using increased ambiguities broadens the search.
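
The way half-sites, spacings and the homo-dimer option combine into candidate full sites can be illustrated with a short Python sketch. This is not ZFN-Site's own code: the function names `reverse_complement` and `site_patterns` are hypothetical, and the layout (reverse complement of the first half-site, then the spacer written as Ns, then the second half-site, all on the top strand) is an assumption inferred from the worked example in the text above.

```python
# Complement table covering A/C/G/T and the standard IUPAC ambiguity codes.
_COMPLEMENT = str.maketrans("ACGTRYKMSWBDHVN", "TGCAYRMKSWVHDBN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (IUPAC codes allowed)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def site_patterns(left, right, spacings=(5, 6), allow_homodimers=False):
    """Build top-strand patterns for each dimer type and spacer length.

    `left` and `right` are half-sites as entered by the user (5' to 3').
    The spacer is written as a run of N, since its sequence is not specified.
    Hypothetical helper for illustration only.
    """
    pairs = [("left/right", left, right)]
    if allow_homodimers:
        pairs.append(("left/left", left, left))
        pairs.append(("right/right", right, right))
    patterns = {}
    for label, first, second in pairs:
        for n in spacings:
            key = f"{label}, spacer {n}"
            patterns[key] = reverse_complement(first) + "N" * n + second
    return patterns


if __name__ == "__main__":
    # Half-sites from the example in the text.
    for key, pattern in site_patterns(
        "AAAGCGGCTCCG", "ACTCTGTGGAAG", allow_homodimers=True
    ).items():
        print(key, pattern)
```

With the half-sites from the text and a five-base spacer, the hetero-dimer pattern reproduces the example top strand with the spacer replaced by Ns. Only the top strand is generated here; a full search would also need to consider the reverse complement of each pattern.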

Figure 1 ZFN-Site genome scan using Basic Target Search. ZFN-Site search for Sequence 1 using the half-sites described in the text, which are the ZFN target sites found in IL2R-g [1]. The inputs are set to search the human genome allowing five and six base pair spacing, two mismatches and homo and hetero-dimerization of the half-sites.

Degenerate nucleotides (specified by standard IUPAC codes) are allowed in the half-site queries because they are then expanded into all possible matching tags. These queries are submitted to an exact search algorithm (described in [23]). The number of such queries increases with the number of mismatches and ambiguities allowed (such as Ns and nucleotide IUPAC codes), thus increasing the RAM and search time required. Very complex searches may be achieved by breaking the search into parts to speed processing and prevent stalling.

The number of mismatches per half-site (0, 1 or 2) is entered in the last box. Use 0 to scan only for sites exactly matching the half-sites. This mode is useful for verifying the location of target sites in one or more genomes. The number of off-target sites returned can be greatly increased by allowing 1 or 2 mismatches per half-site. The use of ambiguous nucleotides in the half-sites does not count as a mismatch, and both can be used if needed. Mismatches are allowed in degenerate positions as well. If the user specifies a search with one or two mismatches, ZFN-Site will generate all possible sequence tags that match the target up to the specified number of mismatches.

Once the information above is entered, clicking run will display the query sequences on the next web page, while the genome searches are performed using the FetchGWI program (see paragraph on FetchGWI below). ZFN-Site outputs a list of half-site matches sorted by genome position. This list is scanned by a second program that extracts all combinations on each DNA strand that have the required spacing.
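
The expansion of a half-site into exact-match tags, as described above, can be sketched in Python. This is a minimal illustration under stated assumptions, not the ZFN-Site implementation: `expand_tags` is a hypothetical name, and a "mismatch" is modelled simply as a position where any of the four bases is accepted, so that mismatches may also fall on degenerate positions.

```python
from itertools import combinations, product

# Standard IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "K": "GT", "M": "AC", "S": "CG", "W": "AT",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_tags(half_site, max_mismatches=0):
    """Return the set of exact tags matching a (possibly degenerate) half-site.

    Each tag agrees with the half-site at every position except for at most
    `max_mismatches` positions, where any base is allowed.
    Hypothetical helper for illustration only.
    """
    allowed = [IUPAC[base] for base in half_site.upper()]
    tags = set()
    for k in range(max_mismatches + 1):
        # Choose which k positions are free to mismatch.
        for free in combinations(range(len(allowed)), k):
            choices = [
                "ACGT" if i in free else allowed[i]
                for i in range(len(allowed))
            ]
            tags.update("".join(t) for t in product(*choices))
    return tags


if __name__ == "__main__":
    for k in (0, 1, 2):
        print(k, len(expand_tags("ACTCTGTGGAAG", k)))
```

For an unambiguous 12-base half-site this gives 1, 37 and 631 tags for 0, 1 and 2 mismatches, and every additional ambiguity code multiplies the count further, which is consistent with the limits on mismatches and degenerate positions described in the text.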
For fast performance on the Web, we have limited the number of possible mismatches per ZFN half-site to two. The total number of degenerate nucleotides is also limited to two, such that the computational complexity is manageable.

Based on these inputs, ZFN-Site generates a list of genomic sequences that are exact or near-exact matches to the input query set, along with chromosomal coordinates (including NCBI chromosomal accession number and the start and end positions within the chromosome), DNA strand and HTML links to their exact location on ENSEMBL, UCSC and NCBI browsers [23] (Figure 2). Results are output under “WORD MATCHES” in a two-line format for each genomic sequence returned. The top line of each pair of lines depicts the genomic sequence. The lower line displays the differences from the query sequence. Spacer nucleotides are indicated in blue, and in cases where there are ambiguous nucleotides, genomic nucleotides matching an unambiguous portion of the query sequence are in blue. The number of nucleotides in the spacer is indicated by the number of green Ns in the lower line. Red nucleotides depict mismatches. The number of mismatches is displayed, not including positions with degenerate nucleotides (unless mismatches occur at degenerate positions). The next four columns list the matched sequence’s “Species”, “Chromosomal Coordinates [start.end]”, “Strand” and “Links to Genome Browsers”. Clicking on the HTML links to the right of a matched genomic sequence will open the ENSEMBL, UCSC or NCBI genome browser.
This will direct the user to that exact location, allowing one to identify whether that targeted sequence is in an annotated gene, intron, exon or regulatory sequence.

ZFN-Site can be used to determine if ZFNs may be used to specifically target sites in multiple different genomes. ZFN-Site can scan multiple genomes simultaneously using the same settings or can be run sequentially.

Figure 2 ZFN-Site Results. ZFN-Site output listing the IL2R-g target sequence, in row 1, and other genomic sequences matching the search criteria in Figure 1. Non-matching bases are shown in red below the correct base. Between each pair of target sequences is a spacer with its genomic sequence shown in blue. The number of nucleotides in the spacer is indicated by the number of green Ns. Each sequence row also lists the number of mismatches, chromosomal location, DNA strand and HTML links to their exact location on ENSEMBL, UCSC and NCBI browsers. The link to results in text format provides sequences in the list ordered by increasing number of mismatches.

Relaxed Specificity Search

Previous in vitro and cellular ZFP specificity studies may help determine other sequences that may be cleaved by a ZFN pair. This information can come from studies of individual fingers [30-32]. Without Systematic Evolution of Ligands by Exponential Enrichment (SELEX) or similar data (described below), the specificity of a ZFN can be approximated by combining the specificity of the individual fingers, even though this fails to account for the effects of adjacent fingers. There are many manuscripts detailing individual ZFP specificity; non-exhaustive examples include [30-35]. Approximating the specificity of the whole ZFN by compiling the relaxed specificity of the constituent ZFPs may provide more predictive results than using the basic target search, as the individual finger data may help determine the non-specified bases. If there are individual nucleotide positions where the ZFPs can bind several nucleotides, standard IUPAC ambiguity codes should be entered in the half-site.

More specific information comes from binding studies of full ZFPs or ZFNs using SELEX. Searches based on experimentally determined specificity are more informative than searches with increased mismatches. If there is SELEX or similar data describing each ZFN’s binding specificity, it is also entered in 5’ to 3’ orientation using standard IUPAC ambiguity codes (as in Figure 3). This allows relaxed specificity searches. For example, a nucleotide in a half-site that can be bound if it is either G or T can be entered as a K. Any non-specified position can be represented by an N (N = A, C, G or T). If scanning with two mismatches, the pair of half-sites should contain fewer than three ambiguities to prevent computational stalling (see above).

FetchGWI

ZFN-Site uses FetchGWI to perform rapid and accurate searches of the large sequence databases comprising full genomes. FetchGWI is a C program that relies on precomputed genome indices and is best used in cases where queries must be mapped very rapidly and efficiently. To get maximal search speed, FetchGWI only searches within the index files that represent the genome sequences. There is one index entry for each nucleotide in the genome. This exhaustive index also ensures that no match can possibly be missed. Other programs, such as BLAST, occasionally scan non-overlapping words and thus can miss possible off-target sites (see below) [20].

Testing Located Off-Target Sites

Predicted genomic off-target sites should be tested for cleavage. The HTML links are used to download the sequences flanking the site, for use in designing amplification primers for either mutation or sequence analysis. The listed potential off-target sites can be assayed by PCR and mutation detection [7] or deep sequencing [5] to determine ZFN specificity.

If ZFN-Site locates more sites that match the selected criteria than can be tested, the criteria may be narrowed by using fewer mismatches or fewer ambiguous nucleotides for relaxed searches. The list of found sites can also be narrowed using the text output.
If the text output link is clicked, the found sites are listed on another screen in order of increasing number of total mismatches. If a search is conducted using two mismatches per half-site, the output can be greatly narrowed by selecting the genomic sequences at the top of the list with three or fewer total mismatches.

This list of possible target sequences can be further analysed using other computer programs. For example, the output can be ranked using an Excel spreadsheet containing a positional weight matrix based on experimentally determined specificity data as described below.

Results

ZFN-Site was validated by comparing our results to a previously published study by Perez et al. [7]. Perez et al. looked for off-target cleavage by a pair of ZFNs specific for the gene coding for human C-C chemokine receptor type 5 (CCR5). This study used an unpublished algorithm to identify potential off-target sites by scanning the human genome using in vitro SELEX selection specificity data [7]. Their sequencing of the identified off-target sites revealed that a site in the related CCR2 gene was also cleaved at a low frequency. The left and right ZFN half-sites, including ambiguities suggested from their SELEX data, were compiled and entered into ZFN-Site (Figure 3). ZFN-Site found the CCR5 target site and each of the off-target sites on their list, including the experimentally verified CCR2 off-target cleavage site (Figure 4). Additional file 1, Figure S1 contains ZFN-Site output with fewer than three total mismatches.

Figure 3 Benchmarking ZFN-Site against a published CCR5 ZFN off-target analysis. Previously, Perez et al. used SELEX to determine the relaxed specificity of a ZFN pair targeting the CCR5 gene and used this data to scan the genome. We scanned the human genome with ZFN-Site, configured as shown, using the CCR5 ZFN half-sites from Perez et al. with ambiguities matching their SELEX data. The bases allowing substitutions are shown in lower case letters. ZFN-Site found each site they listed, paired with their results in Figure 4.
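
The narrowing step described under Testing Located Off-Target Sites (ordering sites by total mismatches and keeping those with three or fewer) is simple to reproduce on exported results. The Python sketch below is illustrative only: the `Site` record, its field names and the `shortlist` function are hypothetical and do not reflect ZFN-Site's actual text output format, and the example records are placeholders rather than real search results.

```python
from typing import NamedTuple


class Site(NamedTuple):
    """One candidate site (hypothetical record layout for illustration)."""
    name: str
    left_mismatches: int
    right_mismatches: int

    @property
    def total_mismatches(self):
        # Mismatches are counted per half-site; the total is their sum.
        return self.left_mismatches + self.right_mismatches


def shortlist(sites, max_total=3):
    """Keep sites with at most `max_total` mismatches, fewest first."""
    kept = [s for s in sites if s.total_mismatches <= max_total]
    return sorted(kept, key=lambda s: s.total_mismatches)


if __name__ == "__main__":
    # Placeholder records, not real ZFN-Site output.
    candidates = [
        Site("site_a", 2, 2),
        Site("site_b", 0, 0),
        Site("site_c", 1, 2),
        Site("site_d", 1, 0),
    ]
    for site in shortlist(candidates):
        print(site.name, site.total_mismatches)
```

A ranking based on a positional weight matrix, as mentioned in the text, would replace the sort key with a score computed from experimentally determined specificity data; no such data is assumed here.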

Figure 4 ZFN-Site returns sites found in previous CCR5 ZFN off-target analysis. The sequences returned by ZFN-Site were matched to the sequences found by Perez et al. For clarity of presentation, the ZFN-Site output was arranged to match the order of Perez et al. ZFN-Site found all the sites found by the unpublished algorithm of Perez et al., thus validating ZFN-Site. We replaced the column containing the genome browser
