MICROSATELLITE LOCI BASED ON THE CHICKEN GENOME
CHAPTER 5 CHARACTERISATION OF GRUS MICROSATELLITE LOCI
5.2 METHODS
5.2.1 Pre-characterisation procedures
Checking for additional inserts in crane sequences
The restriction enzyme MboI (Qbiogene), used to digest the blue crane DNA during library development, cleaves the DNA such that each cleaved fragment can be identified by a 'GATC' at its 5' and 3' ends. This 'GATC' sequence can therefore be used to identify chimeric DNA and the site at which the sequence was ligated into the plasmid vector. To identify these sequences, blue crane sequences (Appendix III) were first arranged in a FASTA-formatted document (the symbol '>' is placed before the sequence name to signal to the search engine that the line contains non-sequence data). The 5' and 3' 'GATC' sequences were located in each sequence, and the sequences upstream of the 5' 'GATC' and downstream of the 3' 'GATC' were highlighted and recognised as being either chimeric or plasmid. Plasmid DNA was excluded from further analysis, ensuring that a) primers flank a true microsatellite locus and b) chromosomal locations of microsatellites were identified based on Grus microsatellite sequences.
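The 'GATC' search described above can be sketched in a few lines of Python. This is a minimal illustration only, not the procedure used in this study (which was carried out by inspection of the FASTA document); the function names and the example clone are hypothetical.

```python
def parse_fasta(text):
    """Return {name: sequence} from FASTA-formatted text."""
    records, name = {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Header line: keep the first word after '>' as the name.
            name = line[1:].split()[0]
            records[name] = ""
        elif name is not None:
            records[name] += line.upper()
    return records


def split_at_gatc(seq, site="GATC"):
    """Split seq into (upstream, insert, downstream) at the first and last site.

    Returns None when the sequence does not contain two separate sites.
    """
    first = seq.find(site)
    last = seq.rfind(site)
    if first == -1 or last < first + len(site):
        return None
    end = last + len(site)
    return seq[:first], seq[first:end], seq[end:]


# Hypothetical clone, invented for illustration only.
example = ">clone_demo hypothetical\nTTAAGATCACACACACACGGTGATCCCGG\n"
for name, seq in parse_fasta(example).items():
    print(name, split_at_gatc(seq))
```

The upstream and downstream parts returned here correspond to the regions that would then be inspected and classed as chimeric or plasmid.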
Checking for duplicate sequences within and between microsatellite libraries
To identify all duplicate sequences, a stand-alone BLAST (following the protocol of Leviston et al. 2004, given below) was performed for all sequences obtained from the blue crane as well as the unpublished whooping crane sequences provided by Travis Glenn (Appendix III). This protocol is run from MS-DOS and BLASTs each sequence against all other sequences in the FASTA-formatted query file. The output file contains the aligned sequences and any similar matches identified by the BLAST search, and assigns an E-value to each sequence identified as having high sequence similarity to the query sequence. Sequences of high similarity (those given an E-value < 1e-10) were imported into the sequence alignment program MEGA v3.1 (Kumar et al. 2004). The flanking regions and microsatellite repeat regions were compared to determine whether sequences were 100% identical or two alleles of the same heterozygous locus.
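The E-value filter can be illustrated as follows. This sketch assumes that the BLAST results are available in the standard 12-column tab-delimited format (query, subject, % identity, alignment length, mismatches, gap openings, query start, query end, subject start, subject end, E-value, bit score); whether the protocol followed here produced that format is not stated, and the locus names and hit values below are hypothetical.

```python
def candidate_duplicates(tabular_text, max_evalue=1e-10):
    """Return sorted unique sequence pairs whose hit E-value is below max_evalue."""
    pairs = set()
    for line in tabular_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        query, subject, evalue = fields[0], fields[1], float(fields[10])
        # Ignore each sequence's hit against itself.
        if query != subject and evalue < max_evalue:
            pairs.add(tuple(sorted((query, subject))))
    return sorted(pairs)


# Hypothetical hits, invented for illustration only.
demo = "\n".join([
    "Gpa01\tGpa01\t100.00\t200\t0\t0\t1\t200\t1\t200\t1e-110\t400",
    "Gpa01\tGpa07\t99.00\t200\t2\t0\t1\t200\t1\t200\t3e-95\t350",
    "Gpa07\tGpa01\t99.00\t200\t2\t0\t1\t200\t1\t200\t3e-95\t350",
    "Gpa01\tGpa12\t85.00\t40\t6\t0\t1\t40\t5\t44\t0.002\t30",
])
print(candidate_duplicates(demo))
```

Pairs returned by such a filter are only candidates; the alignment of flanking and repeat regions is still needed to distinguish identical clones from alleles of one locus.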
Prefixes used for Grus loci
When submitting sequences to EMBL, sequences generated for the blue crane from the previously developed Grus loci (Gamu and Gj loci) were submitted under the original locus names. The novel blue crane microsatellite sequences developed for this study were submitted using the prefix 'Gpa' (Grus paradisea).
5.2.2 Samples used
Blood samples from 103 (102 + 1 duplicate sample) blue cranes were provided by WBRC and SACWG (section 2.2.1, and listed in Appendix I) and DNA extraction followed the ammonium acetate protocol (section 2.2.2). A selection of microsatellite loci originally developed in whooping crane (Glenn 1997), red-crowned crane (Hasegawa et al. 2000) and blue crane (developed for this study) were tested for amplification in the blue crane. A selection of unrelated individuals for allele statistic analyses was obtained by excluding individuals that were ringed at the same GPS co-ordinates (the same nest site).
5.2.3 General locus characteristics
PCR conditions and genotyping
PCR reactions were carried out using the optimisation procedure described previously (section 2.2.6). Genotyping was carried out as described elsewhere (section 2.3.6).
Owing to time constraints in finding a suitable set of microsatellite loci for parentage analysis, 19 microsatellite loci from the blue crane genomic microsatellite library were tested. This number was selected because a large number of whooping crane and red-crowned crane loci had already been characterised in the blue crane, and found to be highly informative, before species-specific blue crane microsatellite loci became available.
Locus statistics
Locus statistics refer to the number of alleles, Hardy-Weinberg equilibrium (HWE) estimates, null allele frequencies and exclusionary powers calculated for each locus.
Exclusionary powers were calculated for each polymorphic locus as well as for all the polymorphic loci combined.
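For reference, the standard way of combining per-locus exclusion probabilities, assuming the loci are independent, is P = 1 - (1 - P1)(1 - P2)...(1 - Pn). The short sketch below illustrates that formula; it is not a reproduction of the Cervus calculations, and the per-locus values are hypothetical.

```python
from math import prod


def combined_exclusion(per_locus):
    """Combined exclusion probability across independent loci."""
    return 1.0 - prod(1.0 - p for p in per_locus)


# Hypothetical per-locus exclusion probabilities.
print(combined_exclusion([0.5, 0.5]))
```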
Genotypes from all loci that amplified in the unrelated blue crane individuals tested were analysed with the software program Cervus 2.0 (Marshall et al. 1998) to calculate locus statistics. HWE was assessed by comparing observed heterozygosities with expected heterozygosities, and loci at which these differed significantly were identified as showing significant deviations from HWE. However, Cervus 2.0 was unable to calculate HWE estimates for many loci; the reason given by Cervus 2.0 was that too few individuals were present to allow the test to proceed. Therefore, HWE was also calculated using Genepop 3.4 (Raymond et al. 1995).
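The two quantities being compared can be computed as sketched below. This is an illustration under stated assumptions: expected heterozygosity is taken as 1 minus the sum of squared allele frequencies, with Nei's small-sample correction 2n/(2n - 1) as an option. The significance tests performed by Cervus and Genepop are not reproduced, and the genotypes are hypothetical.

```python
from collections import Counter


def heterozygosities(genotypes, unbiased=False):
    """Return (observed, expected) heterozygosity for one locus.

    genotypes is a list of (allele_a, allele_b) tuples, one per individual.
    """
    n = len(genotypes)
    observed = sum(1 for a, b in genotypes if a != b) / n
    counts = Counter(allele for pair in genotypes for allele in pair)
    sum_sq = sum((c / (2 * n)) ** 2 for c in counts.values())
    expected = 1.0 - sum_sq
    if unbiased:
        # Small-sample correction (assumed here; optional).
        expected *= 2 * n / (2 * n - 1)
    return observed, expected


# Hypothetical genotypes (allele sizes in base pairs).
demo_genotypes = [(152, 152), (152, 156), (156, 160), (152, 156)]
print(heterozygosities(demo_genotypes))
```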
The frequency of null alleles was calculated by comparing expected and observed homozygosities (Marshall et al. 1998). Loci with an excess of homozygotes are given a positive null allele frequency, whereas loci with a deficit of homozygotes are given a negative null allele frequency. Only those loci with a positive null allele frequency greater than 0.1 were considered to have null alleles.
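The sign behaviour described above can be illustrated with a simple closed-form estimator. The sketch below uses r = (He - Ho) / (1 + He), a substitute chosen for brevity (Brookfield's first estimator); it is not the algorithm implemented in Cervus 2.0, and the heterozygosity values are hypothetical.

```python
def null_allele_frequency(h_obs, h_exp):
    """Simple null allele frequency estimate: (He - Ho) / (1 + He).

    Positive when homozygotes are in excess (Ho < He), negative otherwise.
    """
    return (h_exp - h_obs) / (1.0 + h_exp)


def has_null_alleles(h_obs, h_exp, threshold=0.1):
    """Apply the 0.1 threshold used in this study to the estimate."""
    return null_allele_frequency(h_obs, h_exp) > threshold


# Hypothetical values for two loci.
print(null_allele_frequency(0.6, 0.8), has_null_alleles(0.6, 0.8))
print(null_allele_frequency(0.8, 0.7), has_null_alleles(0.8, 0.7))
```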
Mode of inheritance
One blue crane family comprising two parental individuals (mother, 4615; father, 4636) and two offspring (male offspring, 4626; female offspring, 4633) was analysed to confirm that each marker was inherited in a Mendelian fashion. Unfortunately, only a single blue crane family was available for this study. This must be taken into consideration, as results obtained from a single family are less robust than those from a study of several families. The genotypes of each family member across all loci tested were entered into a spreadsheet. A locus was confirmed to be inherited in a Mendelian fashion when both alleles from each offspring could be assigned a parental origin. Importantly, the two alleles were checked to have different parental origins, i.e. one allele identified as being paternally inherited and the other maternally, in order to fulfil the conditions of Mendelian inheritance.
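The rule applied in the spreadsheet can be expressed as a short check. This is a minimal sketch; the individual numbers are those of the family above, but the allele sizes are hypothetical and do not represent observed genotypes.

```python
def is_mendelian(offspring, mother, father):
    """True if one offspring allele can come from each parent."""
    a, b = offspring
    return (a in mother and b in father) or (a in father and b in mother)


# Hypothetical allele sizes at a single locus, for illustration only.
family = {
    "4615": (150, 152),  # mother
    "4636": (154, 158),  # father
    "4626": (150, 154),  # male offspring
    "4633": (152, 158),  # female offspring
}
for chick in ("4626", "4633"):
    print(chick, is_mendelian(family[chick], family["4615"], family["4636"]))
```

A homozygous offspring passes only if both parents carry that allele, which matches the requirement that the two alleles have different parental origins.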
Linkage disequilibrium
Linkage disequilibrium between all possible pairs of 28 polymorphic Grus loci in 56 unrelated blue crane individuals was calculated using Genepop 3.4 (Raymond et al. 1995).
Two-digit Genepop input files were created using the file conversion tool available in Cervus 2.0 (Marshall et al. 1998). The online version of Genepop was used with the following criteria: 'Test for each pair of loci in each population' with 'Dememorization number', 'Number of batches', and 'Number of iterations per batch' set to 1000, 500, and 10000, respectively. Significant (P < 0.05) linkage disequilibrium P-values were corrected for multiple comparisons using the sequential Bonferroni method (Rice 1989).
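With 28 loci there are 28 x 27 / 2 = 378 pairwise tests, so the correction matters. The sequential Bonferroni procedure ranks the P-values from smallest to largest and compares the i-th smallest with alpha / (k - i + 1), stopping at the first non-significant test. The sketch below illustrates this under those assumptions; the P-values are hypothetical.

```python
from math import comb


def sequential_bonferroni(p_values, alpha=0.05):
    """Return a list of booleans marking tests that remain significant."""
    k = len(p_values)
    order = sorted(range(k), key=lambda i: p_values[i])
    significant = [False] * k
    for rank, idx in enumerate(order):
        # Threshold relaxes as rank increases: alpha/k, alpha/(k-1), ...
        if p_values[idx] <= alpha / (k - rank):
            significant[idx] = True
        else:
            break  # all larger P-values are non-significant
    return significant


print(comb(28, 2))  # number of locus pairs
print(sequential_bonferroni([0.001, 0.04, 0.03, 0.012]))
```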
Chromosomal location of genetic markers
The full description of analysis procedures and outcomes of mapping Grus microsatellite loci are provided in Chapter 4. Results obtained from the predicted chromosome map are compared here with the results from the linkage disequilibrium to determine the level of agreement between these two independent methods of identifying linked loci.
5.2.4 Further cross-species amplification of Grus loci
Twenty-eight polymorphic Grus loci (identified as polymorphic in section 5.3.2) were tested for utility in two other African crane species: grey-crowned and wattled crane. In addition, cross-species amplification was examined in non-Gruidae species including houbara bustard (Chlamydotis undulata), grey-headed albatross (Diomedea chrysostoma), Seychelles warbler (Acrocephalus sechellensis), Cape parrot (Poicephalus robustus) and red jungle fowl (Gallus gallus). Two non-avian species were also tested: the salt-water crocodile (Crocodylus porosus) and human (Homo sapiens).
PCR reactions for non-Grus species took place using the optimised conditions for Grus loci (section 5.3.2). Touchdown PCR was performed for wattled and grey-crowned crane (both being Grus species) samples. High quality PCR products were required for genotyping to determine levels of polymorphism of Grus loci in these two species. Touchdown PCR (Don et al. 1991) reduces spurious amplification by using a range of annealing temperatures (Ta), starting with the highest. Product amplified at the higher annealing temperatures is then amplified exponentially as the Ta cycles through the lower temperatures. Because conditions at the higher temperatures are more stringent, less non-specific product is generated, resulting in clean PCR products that are suitable for genotyping.
The temperature range used in the Touchdown profile for each primer pair was 3 °C above and below the previously optimised Ta. For example, a marker previously optimised at a Ta of 52 °C had a Touchdown profile with a highest and lowest Ta of 55 °C and 49 °C, respectively. The PCR profile used was 94 °C for 3 min; 5 cycles of 94 °C for 30 s, highest Ta decreasing by 1 °C per cycle, 72 °C for 30 s; followed by 25 cycles of 94 °C for 30 s, lowest Ta for 30 s, 72 °C for 30 s, completing the profile with an extension at 72 °C for 4 min. Genotyping took place as described in section 2.3.6; however, samples were analysed using an ABI 3100 DNA analyser rather than an ABI 3730.
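The annealing temperatures implied by this profile can be listed cycle by cycle. The sketch below reads the description literally (five touchdown cycles starting at the highest Ta and dropping 1 °C per cycle, then 25 cycles at the lowest Ta); the function name and its parameters are hypothetical.

```python
def annealing_schedule(ta, span=3, touchdown_cycles=5, plateau_cycles=25):
    """Return the annealing temperature (deg C) used in each cycle."""
    highest, lowest = ta + span, ta - span
    # Touchdown phase: start at the highest Ta, drop 1 deg C per cycle.
    touchdown = [highest - i for i in range(touchdown_cycles)]
    # Plateau phase: remaining cycles at the lowest Ta.
    return touchdown + [lowest] * plateau_cycles


schedule = annealing_schedule(52)
print(schedule[:5], schedule[5], len(schedule))
```

For a marker optimised at 52 °C this gives touchdown cycles at 55, 54, 53, 52 and 51 °C, followed by 25 cycles at 49 °C.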
Human DNA was initially tested for PCR suitability using unpublished human-derived primers HuMIF 1 and HuMIR3 provided by M. Vaez (2006). Each 20 µl PCR reaction contained 2 µl reaction buffer, 0.2 µl 50 mM MgCl2, 0.8 µl 5 mM dNTP, 0.4 µl of each primer at 10 µM, 0.4 µl Taq DNA polymerase (BIOTAQ, Bioline Ltd., London, UK) in the manufacturer's buffer (final constituents: 16 mM (NH4)2SO4, 67 mM Tris-HCl (pH 8.8 at 25 °C), 0.01% Tween-20), 15.2 µl ddH2O and 0.6 µl of 10 ng/µl DNA. PCR conditions were 94 °C for 2 min; 40 cycles of 94 °C for 30 s, 58 °C for 27 s, 72 °C for 27 s; followed by a final extension at 72 °C for 5 min.
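As a consistency check on the reaction mix, the component volumes can be summed and the final concentrations contributed by each added stock derived. The sketch below reads the listed volumes as microlitres and the stocks as 50 mM MgCl2, 5 mM dNTP, 10 µM primers and 10 ng/µl DNA; the variable names are hypothetical and any MgCl2 already present in the buffer is not considered.

```python
# Component volumes in microlitres, as listed for one reaction.
volumes_ul = {
    "reaction buffer": 2.0,
    "MgCl2 (50 mM)": 0.2,
    "dNTP (5 mM)": 0.8,
    "forward primer (10 uM)": 0.4,
    "reverse primer (10 uM)": 0.4,
    "Taq DNA polymerase": 0.4,
    "ddH2O": 15.2,
    "DNA (10 ng/ul)": 0.6,
}
total_ul = round(sum(volumes_ul.values()), 6)


def final_concentration(stock, volume_ul, total=total_ul):
    """Dilution: stock concentration x volume added / total volume."""
    return stock * volume_ul / total


print(total_ul)                                          # total volume, ul
print(final_concentration(50, 0.2))                      # MgCl2 added, mM
print(final_concentration(5, 0.8))                       # dNTP, mM
print(final_concentration(10, 0.4))                      # each primer, uM
print(round(10 * volumes_ul["DNA (10 ng/ul)"], 6))       # template DNA, ng
```

On this reading the components sum to the stated 20 µl reaction volume.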