Mobile elements have contributed substantially to shaping mammalian genomes. In humans, LINE1 (L1) retrotransposons account for around 17% of the genome. The vast majority (99.8%) of L1s are not mobile. Nevertheless, approximately 60 to 100 L1 elements are still capable of retrotransposing in humans and 3000 in mice. L1 contains two ORFs, one of which encodes a protein with endonuclease and reverse transcriptase activity. The L1 proteins have a cis preference for the L1 mRNA, but are also able to mobilize other RNAs (trans activity), such as SINEs. L1s are also responsible for the insertion of many processed pseudogenes. mRNA sequences, devoid of introns and carrying a polyA tail, can be found throughout the genome and are usually easily recognized as retrotransposed processed pseudogenes. The polyA tail is usually regarded as important for retrotransposition through the L1 mechanism; for this reason, a polyA is normally found at the 3' end of the insertion. Another L1 hallmark, the target site duplication (TSD), is explained by the endonuclease cleaving the two strands at staggered positions, typically 15 nt apart. The TSD is created when the resulting overhangs are filled in on each side of the insertion. Finally, the L1 endonuclease has some preference for a cleavage site with two pyrimidines followed by four purines, especially TT/AAAA.
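The three hallmarks described above (a 3' polyA tail, a TSD flanking the insertion, and a TT/AAAA-like cleavage site) can each be tested with a few lines of code. The following is a minimal sketch, not the RTAnalyzer implementation: the function names, the default TSD length bounds, and the restriction to exact TSD matches are assumptions made for illustration. Real TSDs often carry mismatches and would need a more tolerant comparison.

```python
PYRIMIDINES = set("CT")
PURINES = set("AG")


def polya_tail_length(insert_seq: str) -> int:
    """Count the uninterrupted run of A's at the 3' end of the inserted sequence."""
    seq = insert_seq.upper()
    return len(seq) - len(seq.rstrip("A"))


def matches_cleavage_consensus(site: str) -> bool:
    """Check a 6-nt site against two pyrimidines followed by four purines (e.g. TT/AAAA)."""
    site = site.upper()
    return (
        len(site) == 6
        and set(site[:2]) <= PYRIMIDINES
        and set(site[2:]) <= PURINES
    )


def find_exact_tsd(upstream: str, downstream: str,
                   min_len: int = 6, max_len: int = 20) -> str:
    """Return the longest exact TSD, or "" if none is found.

    Assumption: `upstream` ends immediately before the insertion and
    `downstream` starts immediately after it, so one TSD copy is a suffix of
    `upstream` and the other is a prefix of `downstream`. `min_len` must be
    at least 1; both length bounds are illustrative defaults.
    """
    up, down = upstream.upper(), downstream.upper()
    longest = min(max_len, len(up), len(down))
    for k in range(longest, min_len - 1, -1):
        if up[-k:] == down[:k]:
            return down[:k]
    return ""
```

For example, `find_exact_tsd("CCGT" + tsd, tsd + "TTGC")` with a 15-nt `tsd` recovers that 15-nt duplication, which matches the typical spacing between the two cleavage positions.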
The study of small non-coding RNA genes is often complicated by the presence of pseudogenes. Many such RNAs, like U snRNAs, have multiple gene copies in the genome as well as pseudogenes. Pseudogene analysis can also be hampered by the size of the inserted elements: in many cases, a 5' portion of variable length is missing from the inserted RNA sequence because reverse transcription stopped prematurely. Combined with accumulated mutations, this truncation often causes a pseudogene to be missed by a standard BLAST search. Relaxing the BLAST criteria can produce numerous hits, many of which may be false positives. The tool we designed can be especially efficient in this regard because it takes other characteristics into consideration.
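One way to picture how additional characteristics can rescue weak similarity hits is to accept a low-identity hit only when it also carries retrotransposition hallmarks. The sketch below is purely illustrative and is not RTAnalyzer's scoring scheme: the `Candidate` fields, the weights, and all thresholds are hypothetical values chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    identity: float        # fraction of identical bases in the alignment (0 to 1)
    polya_len: int         # length of the polyA run at the 3' end of the hit
    tsd_len: int           # length of the detected target site duplication
    cleavage_match: bool   # True if the site resembles two pyrimidines + four purines


def signature_score(c: Candidate) -> float:
    """Sum three hallmark scores, each capped at 1 point (hypothetical weights)."""
    score = min(c.polya_len, 10) / 10
    score += min(c.tsd_len, 15) / 15
    score += 1.0 if c.cleavage_match else 0.0
    return score


def keep_hit(c: Candidate,
             strict_identity: float = 0.90,
             relaxed_identity: float = 0.70,
             min_signature: float = 2.0) -> bool:
    """Keep strong hits outright; keep weaker hits only with enough hallmarks."""
    if c.identity >= strict_identity:
        return True
    return c.identity >= relaxed_identity and signature_score(c) >= min_signature
```

With these illustrative settings, a 75%-identity hit with a 12-nt polyA, a 15-nt TSD, and a consensus-like cleavage site is kept, while a 75%-identity hit with none of these features is discarded as a likely false positive.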
Moreover, many major mammalian genome sequencing projects are currently under way, and consequently there are many new elements retrotransposed by L1 to look for and analyze. For this purpose, RTAnalyzer should prove to be an efficient tool. This software will be useful for in-depth analysis of non-autonomous retrotransposons in sequenced mammals, helping us understand retrotransposition throughout our evolutionary tree.