The Use of Retrotransposon-Based Molecular Markers to Analyze Genetic Diversity

Molecular markers play an essential role in all aspects of genetics, from modern plant breeding and human forensics to map-based cloning of genes, and their uses range from the identification of genes responsible for desired traits to the management of backcrossing programs. Retrotransposons are well suited as molecular markers. As dispersed and ubiquitous transposable elements, their “copy and paste” life cycle of replicative transposition leads to new genome insertions without excision of the original element. Both the overall structure of retrotransposons and the domains responsible for the various phases of their replication are highly conserved in all eukaryotes. Following the demonstration that retrotransposons are ubiquitous, active, and abundant in plant genomes, various marker systems were developed to exploit polymorphisms in retrotransposon insertion patterns. This review surveys the spectrum of retrotransposon-based marker systems developed for plant species, evaluates the contributions of retrotransposon markers to the analysis of genetic diversity in plants, and describes an approach for the rapid isolation of retrotransposon termini.


Introduction
DNA markers play an essential role in the study of genetic variability and diversity, in the construction of linkage maps, and in the diagnosis of individuals or lines carrying certain linked genes. The emergence of DNA marker systems has, for the last 40 years (Hubby & Lewontin 1966), closely tracked developments in biochemistry and molecular biology. The shortcomings of biochemically based markers, such as isozymes, drove the development of markers based on DNA polymorphisms (Kan & Dozy 1978). These marker types generate "fingerprints", distinctive patterns of DNA fragments resolved by electrophoresis and detected by staining or labelling. A molecular marker in essence detects nucleotide sequence variation at a particular location in the genome. To be useful, the sequence at that location must differ between the parents of the chosen cross, so that plant accessions can be distinguished and the pattern of inheritance followed. The advent of the polymerase chain reaction (PCR) was a breakthrough for molecular marker technologies and made many fingerprinting methods possible.
These fall into two broad categories, namely methods that detect single loci and multiplex methods that detect multiple loci simultaneously.
Interspersed repetitive sequences comprise a large fraction of the genome of many eukaryotic organisms and are composed predominantly of transposable elements (TEs). In most species that have been studied, interspersed repeats are distributed unevenly across the nuclear genome, and some repeats tend to cluster around the centromeres or telomeres. Following the induction of recombinational processes during meiotic prophase, variation in the copy number of repeat elements and internal rearrangements on both homologous chromosomes can ensue. The resulting heterogeneity in the arrangement of distinguishable repeats has been exploited by molecular marker techniques that target these repeat elements.
Numerous methods have been developed that exploit repeated sequences as molecular markers. In an early example, Restriction Fragment Length Polymorphism (RFLP), probes derived from repetitive sequences were hybridized to Southern blots of restriction-digested genomic DNA to produce a highly variable pattern (Lee et al. 1990). The RFLP technique was used extensively in the past, but has been replaced by PCR-based methods because Southern blotting is slow.
Alternatively, repeats can be used as single primers in the polymerase chain reaction. The first multiplex methods to be developed were named Randomly Amplified Polymorphic DNA (RAPD; Williams et al. 1990; Welsh & McClelland 1990) and DNA Amplification Fingerprinting (DAF; Caetano-Anollés et al. 1991). Both involve amplification of random repetitious sites in the genome using short primers, typically 8-12 nt in length. The approaches involve quick and easy reaction set-up, and no genome sequence information is needed to design the primers. However, problems with reproducibility, which arise from the huge number of potential priming sites in the genome and from the low annealing temperatures that such short primers require, have led to these methods largely disappearing from the molecular marker toolkit today.
Nucleotide sequences matching repetitive sequences that show polymorphism in RFLP analyses have also been used as PCR primers for the inter-repeat amplification polymorphism marker method (Meyer et al. 1993, Sivolap et al. 1994, Salimath et al. 1995). Such repetitive sequences include microsatellites, such as (CA/GT)n or (CAC/GTG)n, which are distributed throughout the genome. A derived approach, called Alu-PCR or SINE-PCR, was developed to generate PCR markers based on amplification of microsatellites near the 3' end of the Alu (SINE) transposable elements (Charlieu et al. 1992). The successful application of microsatellite-specific oligonucleotides as PCR primers was first described by Tautz (1989), Meyer et al. (1993), Sivolap et al. (1994) and Zietkiewicz et al. (1994), who amplified DNA from different sources with primers such as (GATA)n and (GACA)n.

Retrotransposons
Retrotransposons represent one of the most fluid genomic components, varying greatly in copy number over relatively short evolutionary timescales, and are a major component of the structural evolution of plant genomes (Finnegan 1989, Flavell et al. 1992, Voytas et al. 1992, Suoniemi et al. 1998).
Retrotransposons are one of the two major groups of transposable elements in eukaryotic genomes and are defined according to their mode of propagation. Retrotransposons belong to the class I TEs and transpose via an RNA intermediate, in contrast to other transposons (class II), which do not have an RNA intermediate (Finnegan 1989) (Fig. 1). Retrotransposons are separated into two major subclasses that differ in their structure and transposition cycle. These are the LTR retrotransposons and the non-LTR retrotransposons (long interspersed nuclear elements (LINE) and short interspersed nuclear elements (SINE)), distinguished by the respective presence or absence of long terminal repeats (LTRs) at their ends. All groups are complemented by their respective non-autonomous forms, which lack one or more of the genes essential for transposition: MITEs (Miniature Inverted-Repeat Transposable Elements) for class II, SINEs for non-LTR retrotransposons, and TRIMs (Terminal-Repeat Retrotransposons in Miniature) and LARDs (Large Retrotransposon Derivatives) for LTR retrotransposons (Kalendar et al. 2004).
LTR retrotransposons are transcribed from one LTR of an integrated element to produce a nearly full-length RNA copy containing a single copy of the LTR split between its two ends (the LTR provides both the start site and polyadenylation signal for the element; Fig. 1). This RNA is then reverse-transcribed into an extrachromosomal cDNA, reconstituting the full-length element that is ultimately integrated back into the genome. Immediately internal to the LTRs are the priming sites for reverse transcription. The large central part of the retrotransposon encodes the structural components of a virus-like particle into which the RNA is inserted, together with reverse transcriptase and integrase enzymes.
Figure 1. Organization of an LTR retrotransposon. The retrotransposon is bounded by long terminal repeats (LTRs), which contain the transcriptional promoter and terminator. The LTRs contain short inverted repeats at either end, shown as filled triangles. Reverse transcription is primed at the PBS and PPT domains for the (−) and (+) strands of the cDNA, respectively. The internal region of the retrotransposon codes for the proteins necessary for the retrotransposon life cycle: the capsid protein (GAG); aspartic proteinase (AP), which cleaves the polyprotein; integrase (IN), which inserts the cDNA copy into the genome; and reverse transcriptase (RT) and RNaseH (RH), which together copy the transcript into cDNA. The internal region contains evolutionarily conserved domains (noted below the element as black boxes) that are necessary for function and can be used to isolate retrotransposons from previously unstudied plant species. The LTRs are generally well conserved within families and can serve for the design of primers to generate DNA footprints.
Both the overall structural features as well as the basic stages of the life cycle are shared by the retrotransposons and the retroviruses (Frankel & Young 1998, Kim et al. 2004, Wicker et al. 2007).
However, rather than escaping the genome to infect new individuals as do retroviruses, retrotransposons insert the new copies only into their host genomes. If the integration takes place within a cell lineage from which pollen or egg cells are ultimately derived, then a new polymorphism is contributed to the gene pool.
These new copies are useful for distinguishing breeding lines, varieties, or populations of plants from each other.
In plants, the LTR retrotransposons are typically more plentiful and active than their non-LTR relatives (e.g. Arabidopsis Genome Initiative 2000, Rice Chromosome 10 Sequencing Consortium 2003, Hill et al. 2005, Macas et al. 2007, Paterson et al. 2009, International Brachypodium Initiative 2010). In many crop plants, between 40 and 70% of the total DNA consists of LTR retrotransposons (Pearce et al. 1996, SanMiguel et al. 1996, Shirasu et al. 2000). Although the most prevalent retrotransposons are dispersed throughout the genome, at least in the cereals and citrus they are often locally nested one into another in extensive domains that have been referred to as "retrotransposon seas" surrounding gene islands (SanMiguel et al. 1996, Ramakrishna et al. 2002, Bernet & Asins 2004, Gu et al. 2004, Kong et al. 2004). Their abundance, general dispersion, and activity make them ideal sources for the development of molecular markers.

Retrotransposons as Molecular Markers
The emergence of retrotransposon-based methods followed the basic research that demonstrated their ubiquity and activity in plants (Grandbastien et al. 1989, Flavell et al. 1992, Voytas et al. 1992, Suoniemi et al. 1998, Innes et al. 2000). The most recent marker methods based on retrotransposons rely on PCR. Transposable elements have been exploited as molecular markers in various ways. For example, mammalian SINEs such as the Alu repeats are dispersed throughout the genome. Primers complementary to any of these repeats may produce many nonspecific bands from single-primer amplification and can be used as markers for detecting Alu-repeat polymorphisms (Nelson et al. 1989, Sinnett et al. 1990).
It has been shown that TE families evolve with different profiles, so TE marker systems based on different TEs show different levels of resolution and can be chosen to fit the required analysis (Leigh et al. 2003, Schulman & Kalendar 2005, Teo et al. 2005, Kwon et al. 2005, Antonius-Klemola et al. 2006, Grzebelus et al. 2007, Petit et al. 2009, Vukich et al. 2009, Kalendar et al. 2010, Konovalov et al. 2010). Retrotransposon insertions behave as Mendelian loci (Manninen et al. 2000, 2006; Huo et al. 2009). Hence, retrotransposon-based markers would be expected to be co-dominant and to involve a different level of genetic variability, i.e. transposition events, than arbitrary marker systems such as RAPD or AFLP, which detect polymorphism ranging from simple nucleotide changes to genomic rearrangements. Nearby TEs may be found in different orientations in the genome (head-to-head, tail-to-tail, or head-to-tail), increasing the range of tools available to detect polymorphism depending on the method and primer combinations.
Most of the retrotransposon techniques are anonymous, producing fingerprints from multiple sites of retrotransposon insertion in the genome. They all exploit the combination of a known retrotransposon sequence and a variety of adjacent sequences. Primers are generally designed to the LTRs near the joint with the flanking DNA, in domains that are conserved within families but that differ between families (Fig. 2). Although regions internal to the LTR that also contain conserved segments can be used for this purpose, generally the LTRs are chosen to minimize the size of the target to be amplified. Because the LTRs are direct repeats, a primer facing outward from the left or 5′ LTR will necessarily face inward from the right, or 3′ LTR.
Depending on the nature of the second primer, the inward-facing primer will either not amplify a product, produce a monomorphic band, or detect polymorphism resulting from a nested insertion pattern. The internal amplicon can also be removed by judicious use of an infrequently cutting enzyme (Vershinin et al. 2003). For retrotransposons with relatively short LTRs, the transposon-specific primer can be derived from an internal sequence present only once per element, simplifying this process. For S-SAP with low-copy-number elements it is also possible to use simplified digestion and amplification protocols.
The various retrotransposon marker systems differ in the nature of the second primer used in the amplification reactions (Fig. 2). The second primer can match any feature in the genome that is dispersed and conserved.

S-SAP/TD
The Amplified Fragment Length Polymorphism (AFLP) method, introduced in the mid-1990s, is an anonymous marker method. It detects restriction sites by amplifying a subset of all the sites for a given enzyme pair in the genome by PCR between ligated adapters (Vos et al. 1995). S-SAP (Sequence-Specific Amplified Polymorphism), a modified AFLP method based on the BARE-1 retroelement, was described by Waugh et al. (1997). This method is based on the digestion of genomic DNA with two different enzymes to generate a template for the specific-primer PCR: amplification between the retrotransposon and adapters ligated at restriction sites (usually MseI and PstI, or any other restriction enzymes), using selective bases in the adapter primer. S-SAP can thus be considered a modification of AFLP, but it usually displays a higher level of polymorphism than AFLP. Primers are usually designed in the LTR region, but could also correspond to an internal part of the element, such as the polypurine tract (PPT), which is found internal to the 3'-LTR in retrotransposons. Nonselective primers can be used when the enzymes used for digestion have a longer recognition sequence, or when the copy number of the TE is lower. For high-copy-number families, the number of selective bases may be increased. The use of two enzymes in S-SAP corresponds to a reduction in genomic complexity, as does the use of selective bases on the primers associated with the adapters. Low-copy-number TEs are not well suited to methods that involve such a reduction in genomic complexity; for these, the use of single-enzyme digests with selective bases (or infrequently cutting enzymes) allows the survey of all insertion sites for a given TE and can be considered a variant of anchored PCR.
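The effect of selective bases on fingerprint complexity can be illustrated with a back-of-the-envelope calculation. The sketch below is not part of the S-SAP protocol; it assumes, purely for illustration, that the four nucleotides are equally likely at each selective position, so that each added selective base retains about one quarter of the amplifiable sites. The function name and the figure of 6400 sites are hypothetical.

```python
def expected_ssap_bands(amplifiable_sites: int, selective_bases: int) -> float:
    """Rough expected number of amplified fragments.

    Assumption (not from the source): each selective base matches a
    given site with probability 1/4, independently of the others.
    """
    if amplifiable_sites < 0 or selective_bases < 0:
        raise ValueError("arguments must be non-negative")
    return amplifiable_sites / (4 ** selective_bases)


# Hypothetical family with 6400 amplifiable LTR-to-adapter fragments.
for n in range(4):
    print(n, "selective bases:", expected_ssap_bands(6400, n), "fragments")
```

Under this simplified model, a high-copy-number family needs more selective bases to bring the pattern down to a number of bands that a gel can resolve, whereas a low-copy-number family may need none.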
The same technique was named Transposon Display (TD) when applied to DNA transposons rather than retrotransposons (Van den Broeck et al. 1998). Rim2/Hipa-TD produced highly polymorphic profiles with ample reproducibility within a species as well as between species in the genus Oryza (Shcherban et al. 2000, Kwon et al. 2005).
Usually, S-SAP shows more polymorphism, more co-dominance and a wider chromosomal distribution than AFLP. However, like AFLP, S-SAP requires restriction digestion of genomic DNA to provide sites for adapter ligation. The sensitivity of commonly used restriction enzymes to DNA methylation could therefore produce false genotyping results.

IRAP/REMAP
In plants, the inter-repeat amplification polymorphism techniques such as inter-retrotransposon amplified polymorphism (IRAP), retrotransposon-microsatellite amplified polymorphism (REMAP) or inter-MITE amplification (Bureau & Wessler 1992, Kalendar et al. 1999, Provan et al. 1999, Kalendar & Schulman 2006) have exploited highly abundant dispersed repeats such as the LTRs of retrotransposons and SINE-like sequences. The association of these sequences with each other makes it possible to amplify a series of bands (DNA fingerprints) using primers homologous to these high-copy-number repeats. The markers generated are very informative genetic markers. IRAP detects retrotransposon insertional polymorphisms by amplifying the portion of DNA between two retroelements (Kalendar et al. 1999). One or two primers pointing outwards from an LTR are used, and these amplify the tract of DNA between two nearby retrotransposons. IRAP can be carried out with a single primer matching either the 5' or 3' end of the LTR but oriented away from the LTR itself, or with two primers. The two primers may be from the same retrotransposon family or from different families. The PCR products, and therefore the fingerprint patterns, result from amplification of hundreds to thousands of target sites in the genome (Fig. 3). Retrotransposons generally tend to cluster together in "repeat seas" surrounding "gene islands", and may even nest within each other. Hence, the pattern obtained will be related to the TE copy number, insertion pattern and size of the TE family.
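The logic of single-primer IRAP can be sketched in silico. The example below is a minimal illustration under stated assumptions, not a published tool: the primer `GATCCGTA`, the toy "genomes" and the 3000 bp size limit are all hypothetical, and only exact primer matches are considered. A product is predicted wherever the primer matches the top strand and, within the size limit downstream, its reverse complement matches as well, which corresponds to two elements in head-to-head orientation.

```python
def revcomp(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq))


def find_all(seq: str, motif: str) -> list[int]:
    """Start positions of every (possibly overlapping) occurrence of motif."""
    hits = []
    pos = seq.find(motif)
    while pos != -1:
        hits.append(pos)
        pos = seq.find(motif, pos + 1)
    return hits


def irap_products(genome: str, primer: str, max_len: int = 3000) -> list[tuple[int, int]]:
    """Predict (start, length) of single-primer products.

    The primer must match the top strand at `start`, and its reverse
    complement must lie downstream without overlapping it, so that the
    two primer copies face each other across the intervening DNA.
    """
    fwd_sites = find_all(genome, primer)
    rev_sites = find_all(genome, revcomp(primer))
    products = []
    for f in fwd_sites:
        for r in rev_sites:
            length = r + len(primer) - f
            if r >= f + len(primer) and length <= max_len:
                products.append((f, length))
    return sorted(products)


PRIMER = "GATCCGTA"  # hypothetical outward-facing LTR primer
# Line A carries two facing elements 50 bp apart; line B lacks the second one.
line_a = "A" * 10 + PRIMER + "A" * 50 + revcomp(PRIMER) + "A" * 10
line_b = "A" * 10 + PRIMER + "A" * 50 + "A" * 18

print("line A:", irap_products(line_a, PRIMER))
print("line B:", irap_products(line_b, PRIMER))
```

The band predicted for line A but not for line B is the in silico counterpart of a polymorphic IRAP band created by a second insertion near the first.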
The REMAP (Retrotransposon-Microsatellite Amplified Polymorphism) method is similar to IRAP, but one of the two primers matches a microsatellite motif (Kalendar et al. 1999). Found throughout genomes, microsatellites appear to be associated with retrotransposons and have high mutation rates due to polymerase slippage. Therefore they may show much variation at individual loci within a species. In REMAP, anchor nucleotides are used at the 3' end of the SSR primer to avoid slippage of the primer within the SSR. The anchor also prevents the detection of variation in repeat number within the microsatellite.
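The construction of an anchored microsatellite primer can be sketched as follows. This is an illustrative helper, not a primer design tool: the function names, the motif `CA` and the nine repeats are assumptions chosen for the example. The only rule encoded is that the first anchor base must differ from the first base of the motif, since otherwise the anchor would simply continue the repeat and the primer could still slip within the microsatellite.

```python
BASES = "ACGT"


def anchored_ssr_primer(motif: str, repeats: int, anchor: str) -> str:
    """Return `repeats` copies of `motif` followed by a 3' anchor.

    Raises ValueError if the anchor would continue the repeat and so
    fail to pin the primer at the end of the microsatellite.
    """
    if not motif or not anchor or any(b not in BASES for b in motif + anchor):
        raise ValueError("motif and anchor must be non-empty DNA strings")
    if anchor[0] == motif[0]:
        raise ValueError("anchor would extend the repeat instead of pinning it")
    return motif * repeats + anchor


def single_base_anchors(motif: str) -> list[str]:
    """All single-base anchors that do not continue the repeat."""
    return [b for b in BASES if b != motif[0]]


print(anchored_ssr_primer("CA", 9, "G"))
print(single_base_anchors("CA"))
```

Each permitted anchor selects a different subset of microsatellite ends, so different anchors on the same motif would be expected to give different REMAP patterns.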

RBIP/TAM
RBIP (Retrotransposon-Based Insertion Polymorphism) was described as a simple PCR-based method for detecting retrotransposon insertions, using PCR between primers flanking the insertion site and primers from the insertion itself. The basic RBIP method has been developed for high-throughput applications by replacing gel electrophoresis with hybridization to a filter (Flavell et al. 1998). PCR reactions detecting the occupied and unoccupied sites are carried out together, the products are spotted onto membranes, and the membranes are probed with a locus-specific probe. By using sensitive, oligonucleotide-based hybridisation to spotted PCR products, TAM has allowed the dot blot approach to be scaled down to microarrays, with the attendant advantages in throughput, efficiency and data collection (Flavell et al. 2003). Using three primers, RBIP can detect both the presence and absence of the TE insertion and generates single-locus co-dominant markers. In the case of a retrotransposon, a primer designed in the LTR is used together with a primer designed in the flanking region to amplify the occupied insertion site, whereas primers specific for both the 5' and 3' flanking regions are used to score the corresponding empty site. TE insertions are usually thousands of bases long, so the empty-site primers do not generate an amplicon from the occupied site. Hence, RBIP detects both the presence and absence of the insertion, but requires that the 5' and 3' flanking sequences of the TE insertion are known. RBIP analysis has been used to reveal evolutionary history in pea (Flavell et al. 1998, Jing et al. 2005, Vershinin et al. 2003) and rice (Vitte et al. 2004).
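The co-dominant scoring that follows from the three-primer design can be written out explicitly. The sketch below is a minimal illustration for a diploid sample; the function name and the genotype labels are hypothetical, and real scoring would also need controls for failed reactions.

```python
def score_rbip(occupied_product: bool, empty_product: bool) -> str:
    """Interpret the two RBIP reactions for one locus in a diploid sample.

    occupied_product: LTR primer + flanking primer gave a product.
    empty_product:    the two flanking primers gave a product.
    """
    if occupied_product and empty_product:
        return "heterozygous"
    if occupied_product:
        return "homozygous insertion"
    if empty_product:
        return "homozygous empty site"
    return "no call"  # neither reaction worked; the sample cannot be scored


for occupied, empty in [(True, False), (False, True), (True, True), (False, False)]:
    print(occupied, empty, "->", score_rbip(occupied, empty))
```

Because both alleles give a positive signal, a missing product is distinguishable from the absence of the insertion, which is what makes the marker co-dominant.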
TAM (Tagged Microarray Marker) is a microarray-based method developed from RBIP for scoring thousands of DNAs for a co-dominant molecular marker on a glass microarray slide. RBIP also works well with single nucleotide polymorphism (SNP) markers (Flavell et al. 2003; Jing et al. 2007). In this approach, biotin-terminated allele-specific PCR products are spotted unpurified onto streptavidin-coated glass slides and visualised by hybridisation of fluorescent detector oligonucleotides to tags attached to the allele-specific PCR primers. Two tagged primer oligonucleotides are used per locus, and each tag is detected by hybridisation to form a concatameric DNA probe labelled with multiple copies of a fluorochrome.

Inter PBS Amplification (iPBS), a Universal Method for Isolating and Displaying Retrotransposon Polymorphisms
A major disadvantage of all retrotransposon-based molecular marker techniques is the need for sequence information to design element-specific primers. Although rapid retrotransposon isolation methods based on PCR with conserved TE primers have been designed, it may still be necessary to clone and sequence hundreds of clones to obtain a few good primer sequences. The LTRs contain no conserved motifs that would allow their direct amplification by PCR.
There are several restriction- and adapter-based methods for LTR cloning that rely on conservation of the reverse transcriptase domain, especially for the Ty1-copia type (Pearce et al. 1999). Major classes of retroelements include the Pseudoviridae (Ty1-copia), the Metaviridae (Ty3-gypsy) and the Retroposineae LINE (non-LTR) groups. All reverse-transcribing elements can be obtained by PCR with degenerate primers. For example, for Ty1-copia elements, degenerate primer pairs were designed for the RT domain: one pair encodes TAFLHG (forward) and YVDDML (reverse), and another encodes QMDVKT (forward) and YVDDML (reverse) (Hirochika & Hirochika 1993, Flavell et al. 1992, Ellis et al. 1998). For Ty3-gypsy elements, degenerate primers were designed for the RT domain encoding RMCVDYR, LSGYHQI or YPLPRID (forward) and YAKLSKC or LSGYHQI (reverse). The reverse transcriptase-based isolation method is limited to the families of retrotransposons that contain this sequence. Thus, for example, TRIMs, LARDs and as yet unknown classes of LTR retrotransposons cannot be found using this approach (Kalendar et al. 2008, Witte et al. 2001).
LTR retrotransposons and all retroviruses contain a conserved tRNA primer binding site, complementary to tRNA(iMet), tRNA(Lys), tRNA(Pro), tRNA(Trp), tRNA(Asn), tRNA(Ser), tRNA(Arg), tRNA(Phe), tRNA(Leu) or tRNA(Gln). Elongation from the 3'-terminal nucleotides of the respective tRNA results in the conversion of the viral or retrotransposon RNA genome to double-stranded DNA prior to its integration into the host DNA. While the process of reverse transcription is conserved among all retroviruses, the specific tRNA captured varies between retroviruses and retroelements. The primer binding sequences (PBS) are universally present in LTR retrotransposon sequences. Hence, an isolation method for retrotransposon LTRs that is based on the PBS sequence has the potential to clone all possible LTR retrotransposons. Kalendar et al. (2010) describe the development of a highly universal and efficient method that utilizes the conserved parts of PBS sequences, both for direct visualization of polymorphism between individuals and of transcription profile polymorphism, and for fast cloning of LTR segments from genomic DNA or for database searches. In this way, any eukaryotic organism possessing LTR retrotransposons can be investigated. Primers designed to match the conserved regions of the primer binding sequences in LTR retrotransposons proved to be very efficient in PCR amplification of eukaryotic genomic DNA. Single PBS primers can amplify only nested, inverted retrotransposons or related elements dispersed through the genomic DNA. PCR amplification occurs between two nested PBS sites, and the product contains two LTR sequences. The PBS sequences are nested near each other in all eukaryotes (Fig. 4).
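Because the PBS is complementary to the 3' end of the priming tRNA, a candidate PBS sequence can be derived from a tRNA tail by taking its reverse complement. The sketch below illustrates only this relationship; the 12-nt tail used in the example is a hypothetical sequence ending in the CCA that terminates mature tRNAs, not one taken from this review, and the function name is likewise an assumption.

```python
def pbs_from_trna_3prime(trna_tail_rna: str) -> str:
    """Return the DNA sequence (5'->3') complementary to a tRNA 3' tail.

    The input is RNA written 5'->3'; the output is the reverse
    complement expressed in DNA letters.
    """
    comp = {"A": "T", "C": "G", "G": "C", "U": "A"}
    return "".join(comp[base] for base in reversed(trna_tail_rna.upper()))


# Hypothetical 3' tail of a tRNA, ending in the terminal CCA.
tail = "GCUCUGAUACCA"
pbs = pbs_from_trna_3prime(tail)
print(pbs)
print("starts with TGG:", pbs.startswith("TGG"))
```

Any tail ending in CCA yields a PBS beginning with TGG, which matches the 5'-TGG start of the PBS shown schematically in Fig. 4.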
Most retrotransposons are nested, mixed, inverted or truncated in chromosomal sequences, and can be easily amplified using conserved PBS primers in any plant species tested. Fragments of LTRs together with the internal part of a retrotransposon are located near other retrotransposons. Therefore PBS sequences are very often located near each other. This situation allows the use of PBS sequences for cloning LTRs.
Where the retrotransposon density within the genome is high, PBS sequences can be exploited to detect their chance association with other retrotransposons. When retrotransposon activity or recombination has led to new genome integrations, these can be used to distinguish reproductively isolated plant lines. In this case, amplified bands derived from the new insertion or recombination will be polymorphic, appearing only in plant lines in which the insertion or recombination has taken place.
After the LTR sequences of a selected retrotransposon family have been retrieved, they are aligned to find their most conserved region. Related plant species have conserved regions in the LTRs of the same retroelement, so alignments of several LTR sequences from one species, or from one species mixed with sequences from related species, will identify the conserved regions. These conserved parts of the LTRs are then used to design inverted primers for long-distance PCR, for cloning of the whole element, and for the IRAP, REMAP or S-SAP techniques.
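The search for the most conserved region in a set of aligned LTRs can be sketched as a sliding-window scan. This is a simplified illustration, not the procedure of any particular alignment program: it assumes the sequences have already been aligned to equal length with `-` for gaps, scores each column by the fraction of sequences sharing the most common base, and reports the first window with the highest mean score. The three toy sequences and the window width of 6 are hypothetical.

```python
from collections import Counter


def column_identity(column: str) -> float:
    """Fraction of sequences sharing the most common base in a column.

    Gaps are never counted as conserved, but they stay in the denominator.
    """
    counts = Counter(base for base in column if base != "-")
    if not counts:
        return 0.0
    return counts.most_common(1)[0][1] / len(column)


def most_conserved_window(aligned: list[str], width: int) -> tuple[int, float]:
    """Return (start, mean identity) of the best window of the given width."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to the same length")
    if not 0 < width <= length:
        raise ValueError("width must be between 1 and the alignment length")
    scores = [column_identity("".join(seq[i] for seq in aligned)) for i in range(length)]
    best_start, best_score = 0, -1.0
    for start in range(length - width + 1):
        mean = sum(scores[start:start + width]) / width
        if mean > best_score:  # strict comparison keeps the first best window
            best_start, best_score = start, mean
    return best_start, best_score


# Toy alignment of three hypothetical LTR fragments.
ltrs = [
    "ACGTTGCATGCA",
    "ATGTTGCATGAA",
    "AGGTTGCATGTA",
]
start, score = most_conserved_window(ltrs, width=6)
print(start, score, ltrs[0][start:start + 6])
```

In practice the window width would be set to the intended primer length, and the candidate region would still need the usual checks for melting temperature and self-complementarity.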
The iPBS amplification technique shows about the same level of polymorphism as the IRAP and REMAP techniques, and it is an efficient method for the detection of cDNA polymorphism and of clonal differences resulting from retrotransposon activity or from retrotransposon recombination after crossing-over (Kalendar et al. 2010).

Retrotransposon-Based Molecular Markers to Analyze Genetic Diversity
The analysis of genetic diversity and relatedness between or within different populations, species and individuals is a central task in genetics. Combining different LTR primers, or combining LTR primers with microsatellite primers (REMAP), allows the generation of an almost unlimited number of unique markers. Banding patterns were completely different when the same primers were used alone or in combination, indicating that the majority of IRAP/REMAP bands were derived from sequences bordered by another LTR or a microsatellite on one side and by an LTR on the other. Usually, the REMAP pattern was considerably more variable than the corresponding ISSR pattern, and often (but not always, depending on the LTR sequence) the IRAP pattern with primer combinations showed more variability than single-primer PCR (Kalendar et al. 1999, 2004; Leigh et al. 2003).
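Banding patterns of this kind are commonly converted into presence/absence profiles and compared pairwise. The sketch below uses the Jaccard distance, which is one widely used choice for dominant band data and is an assumption here rather than a method prescribed by this review; the two five-band profiles are hypothetical.

```python
def jaccard_distance(profile_a: list[int], profile_b: list[int]) -> float:
    """Jaccard distance between two band profiles (1 = present, 0 = absent).

    Shared absences are ignored, since a band missing from both samples
    is not evidence that they are similar.
    """
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must score the same band positions")
    shared = sum(1 for a, b in zip(profile_a, profile_b) if a and b)
    either = sum(1 for a, b in zip(profile_a, profile_b) if a or b)
    if either == 0:
        return 0.0  # no bands in either sample; treat as identical
    return 1 - shared / either


# Hypothetical IRAP profiles of two accessions over five band positions.
accession_1 = [1, 1, 0, 1, 0]
accession_2 = [1, 0, 0, 1, 1]
print(jaccard_distance(accession_1, accession_2))
```

A matrix of such pairwise distances is the usual input for clustering methods that produce phenograms like the one referred to in Fig. 3.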
Related species have phylogenetically related TE sequences (retroelements or transposons), meaning that PCR primers from one species can be used in another. In this case, primers designed to conserved TE sequences are advantageous. TEs are dispersed over whole chromosomes and are most often mixed with other elements and repeats, so combinations of PCR primers from different repeats help to improve the PCR fingerprint.
To study closely related varieties or breeding lines, one should develop a native retrotransposon system. This requires the cloning and sequencing of elements from the new species, using either the iPBS amplification method or a technique based on conservation of the reverse transcriptase domain. This process begins with amplification and cloning of segments between retrotransposon domains that are highly or universally conserved, followed by development of new primers specific for the retrotransposon families found and testing of these for their efficacy as markers (Pearce et al. 1999, Jing et al. 2005).
The genome size of the studied organism is positively correlated with the efficiency of repeat-based amplification techniques: the larger the genome, the easier it is to develop good primers that reveal multiple bands for polymorphism detection (barley, wheat), whereas PCR marker development is most difficult for organisms with small genomes, such as Brachypodium distachyon or Vitis vinifera.
S-SAP is generally carried out on sequencing gels due to the large number of products generated, whereas IRAP and REMAP are run on agarose systems. However, IRAP and REMAP can be adapted to sequencers. These methods generate tens to hundreds of products in each amplification reaction, depending on the prevalence of the retrotransposon family, the selection of the second primer (the restriction site and number of selective bases in S-SAP), and the organization of the plant genome.
A marker from any of the multilocus, anonymous systems (S-SAP, IRAP, and REMAP) can be converted into a corresponding RBIP marker and vice versa. Markers from the former methods are very easy to harvest, and they can be quickly examined for their informativeness before taking on the investment of developing a corresponding RBIP marker. Electrophoretically resolved bands from S-SAP, IRAP, and REMAP are derived from one side of a retrotransposon insertion. Sequencing of the isolated, informative bands will enable the design of a PCR primer corresponding to the flanking genomic DNA at one side of the insertion, provided that the sequence is not repetitive and therefore unusable. However, the genomic sequence flanking the other side of the element needs to be found in order to score the empty site. This can be obtained by screening germplasm accessions that are polymorphic for the original band, then carrying out an S-SAP reaction on these in which the LTR primer is replaced with a primer designed to the known flank and facing toward the insertion site.

Conclusions
Many features of retrotransposons make them appealing as the basis of molecular marker systems. They are ubiquitous, abundant and dispersed components of eukaryotic genomes. Their activity simultaneously leads to genome diversification and provides a means of its detection. Retrotransposons are long and produce a large genetic change at the point of insertion, thereby providing conserved sequences that can be used to detect their own integration. This event is not linked to removal of the transposable element from another locus, as it is for DNA transposons. Even the loss of the core domain of a retrotransposon by LTR-LTR recombination (Shirasu et al. 2000) is invisible to the marker methods using outward-facing LTR primers. The ancestral state of a retrotransposon insertion is obvious: it is the empty site. This is very helpful in pedigree and phylogenetic analyses. Later recombination events at a full site are highly unlikely to regenerate the original empty site. In contrast, microsatellites, SNPs, and methods relying on gain or loss of restriction sites (in essence SNPs) suffer from a lack of temporal directionality in the changes they detect, resulting in the problem of homoplasy. For example, SINE elements have served to trace human roots to Africa (Batzer et al. 1994, Watkins et al. 2003), to determine the relationship of whales to even-toed ungulates (Shimamura et al. 1997), and to clarify the relationships between wild rice species (Cheng et al. 2003).
DNA markers based on LTR retrotransposons, in one or other of the manifestations described above, are generally referred to as "transposon display". The applications range from investigations of retrotransposon activation and mobility to studies of biodiversity, genome evolution, chromatin modification, epigenetic reprogramming, mapping of genes and the estimation of genetic distance, to assessment of essential derivation of varieties, detection of somaclonal variation and cDNA fingerprinting. The retrotransposon insertions that provide useful polymorphisms are, of course, only those that are passed into the egg cells and pollen. One can thus think of retrotransposons as sexually transmitted diseases, albeit ones that move by a cellular, rather than extracellular, pathway into the new host.
Because LTR retrotransposons are ubiquitous, these methods are generic. Furthermore, similar approaches have been applied to the non-LTR retrotransposons in plants, in particular to the SINE elements (Cheng et al. 2002, 2003; Prieto et al. 2005). The insertion pattern of the human Alu, a SINE and the most prevalent transposable element in the human genome, has not only served as a tool in many studies of human population structure (Watkins et al. 2003), but has also been linked to various heritable diseases (Deininger & Batzer 1999, Jurka 2004). In principle, retrotransposon- or endogenous retrovirus-based molecular markers could prove highly useful in animals, including mammals and birds.
Commercial platforms for SNP detection (e.g. Illumina) have been developed and have garnered much popularity for major crops, domestic animals, and humans. Development of SNPs depends on having abundant sequence data. The costs of acquiring these data, as well as of applying commercial assays, represent a barrier for research on underfunded tropical crops and wild species. Furthermore, evolutionary studies with SNPs are affected by the problems of homoplasy in SNP state, the lack of neutrality of genic markers, and the uneven chromosomal distribution of the highly expressed genes that are used to generate SNPs. While genetic analysis by shotgun sequencing remains a tantalizing possibility, the cost is still prohibitive. For these reasons, cheap, generic, easily applied retrotransposon marker systems will remain a viable choice for genetic markers for the foreseeable future.

Figure 2. Retrotransposon-based molecular marker methods. Multiplex products of various lengths from different loci are indicated by the bars above or beneath the diagrams of each reaction. Primers are indicated as arrows. (a) The S-SAP method. Primers used for amplification match the adapter (empty box) and retrotransposon (LTR box). (b) The IRAP method. Amplification takes place between retrotransposons (left and right LTR boxes) near each other in the genome (open bar), using retrotransposon primers. The elements are shown oriented head-to-head, using a single primer. (c) The REMAP method. Amplification takes place between a microsatellite domain (vertical bars) and a retrotransposon, using a primer anchored to the proximal side of the microsatellite and a retrotransposon primer. (d) RBIP. The alternative reaction between the primers for the left and right flanks is inhibited in the full site by the length of the retrotransposon. The flanking primers are able to amplify the empty site, right, depicted as a bar beneath the diagram.

Figure 3. Utility of IRAP for a diversity analysis of plant species. The phenogram of 30 genotypes from populations of H. spontaneum is based on IRAP analysis; the fingerprints are shown as negative images of ethidium bromide-stained agarose gels following electrophoresis. Results for BARE-1 LTR primer 1369 (5'-TGCCTCTAGGGCATATTTCCAACAC) are shown. A 100 bp DNA ladder is present on the left.

Figure 4. The inter-PBS amplification (iPBS) scheme and LTR retrotransposon structure. Two nested LTR retrotransposons in inverted orientations are amplified from a single primer or from two different primers matching primer binding sites. The PCR product contains both LTRs, with the PBS sequences serving as PCR primers at the termini. The general structure of the PBS and LTR sequences, and the spacer of several nucleotides between the 5' LTR (5'-..CA) and the PBS (5'-TGG..3'), are shown schematically.