LINE1 (also L1 and LINE-1) is a family of related class I transposable elements in the DNA of some organisms, classified with the long interspersed elements (LINEs). L1 transposons comprise approximately 17% of the human genome.[1] Active L1s can interrupt the genome through insertions, deletions, rearrangements, and copy number variations.[2] L1 activity has contributed to the instability and evolution of genomes and is tightly regulated in the germline by DNA methylation, histone modifications, and piRNA.[3] L1s can further impact genome variation through mispairing and unequal crossing over during meiosis due to their repetitive DNA sequences.[2]
L1 gene products are also required by many non-autonomous Alu and SVA SINE retrotransposons. Mutations induced by L1 and its non-autonomous counterparts have been found to cause a variety of heritable and somatic diseases.[4][5]
A typical L1 element is approximately 6,000 base pairs (bp) long and consists of two non-overlapping open reading frames (ORFs), which are flanked by untranslated regions (UTRs) and target site duplications. In humans, ORF2 is thought to be translated by an unconventional termination/reinitiation mechanism,[8] while mouse L1s contain an internal ribosome entry site (IRES) upstream of each ORF.[9]
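The layout described above (two non-overlapping ORFs inside a longer element) can be illustrated with a naive ORF scan. This is only a sketch under stated assumptions: the function names `find_orfs` and `non_overlapping`, the `min_codons` threshold, and the toy sequence are hypothetical and not taken from any L1 annotation tool. It scans only the forward strand for ATG-to-stop stretches and ignores the unconventional translation mechanisms mentioned above.

```python
# Illustrative sketch only: a naive forward-strand ORF scanner.
# All names and the toy sequence are hypothetical, not real L1 data.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=100):
    """Return sorted (start, end, frame) tuples for ATG..stop ORFs.

    start is 0-based, end is exclusive and includes the stop codon.
    min_codons counts every codon from ATG through the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # open a candidate ORF at the first ATG
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # close the candidate and keep scanning
    return sorted(orfs)


def non_overlapping(orfs):
    """True if no two ORFs (sorted by start, half-open) share a base."""
    return all(a[1] <= b[0] for a, b in zip(orfs, orfs[1:]))


# Toy sequence: two short ORFs separated by a 4-base spacer.
toy = "CC" + "ATG" + "GCT" * 5 + "TAA" + "GGGG" + "ATG" + "AAA" * 4 + "TGA" + "TT"
toy_orfs = find_orfs(toy, min_codons=5)
print(toy_orfs, non_overlapping(toy_orfs))
```

On a real ~6,000 bp element one would raise `min_codons` and expect the two longest hits to correspond to ORF1 and ORF2, but curated annotations remain the reliable source.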
5' UTR
The 5' UTRs of mouse L1s contain a variable number of GC-rich, tandemly repeated monomers of around 200 bp, followed by a short non-monomeric region. Human 5' UTRs are ~900 bp in length and do not contain repeated motifs. All families of human L1s harbor a binding motif for the transcription factor YY1 at their extreme 5' end.[10] Younger families also have two binding sites for SOX-family transcription factors, and both YY1 and SOX sites were shown to be required for human L1 transcription initiation and activation.[11][12] Both mouse and human 5' UTRs also contain a weak antisense promoter of unknown function.[13][14]
The first ORF of L1 encodes a 500-amino acid, 40-kDa protein that lacks homology with any protein of known function. In vertebrates, it contains a conserved C-terminal domain and a highly variable coiled-coil N-terminus that mediates the formation of ORF1 trimeric complexes. ORF1 trimers have RNA-binding and nucleic acid chaperone activities that are necessary for retrotransposition.[15]
The second ORF of L1 encodes a protein with endonuclease and reverse transcriptase activity. The encoded protein has a molecular weight of 150 kDa. The structure of the ORF2 protein was solved in 2023. Its protein core contains three domains whose functions were previously unknown: the "tower/EN-linker" and the "wrist/RNA-binding domain", which bind the poly(A) tail of Alu RNA, and a C-terminal domain, which binds the Alu RNA stem loop.
The nicking and reverse transcriptase activities of L1 ORF2p are boosted by single-stranded DNA structures that are likely present at active replication forks. Unlike viral reverse transcriptases (RTs), L1 ORF2p can be primed by RNA, including RNA hairpin primers produced by the Alu element.
As with other transposable elements, the host organism tightly restricts LINE1 to prevent it from becoming overly active. In the primitive eukaryote Entamoeba histolytica, ORF2 is massively expressed in antisense, resulting in no detectable amounts of its protein product.[16]
Roles in disease
Cancer
L1 activity has been observed in numerous types of cancers, with particularly extensive insertions found in colorectal and lung cancers.[17] It is currently unclear whether these insertions are causes or secondary effects of cancer progression. However, in at least two cases, somatic L1 insertions were found to be causative of cancer by disrupting the coding sequences of the genes APC and PTEN in colon and endometrial cancer, respectively.[2]
Quantification of L1 copy number by qPCR and of L1 methylation levels by bisulfite sequencing is used as a diagnostic biomarker in some types of cancers. L1 hypomethylation in colon tumor samples is correlated with cancer stage progression.[18][19] Furthermore, less invasive blood assays for L1 copy number or methylation levels are indicative of breast or bladder cancer progression and may serve as methods for early detection.[20][21]
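The two measurements mentioned above reduce to simple arithmetic, sketched here under stated assumptions. The function names and all numeric values are hypothetical. The copy-number estimate uses the common 2^(-ΔΔCt) relative quantification, which assumes roughly 100% amplification efficiency and a stable reference locus. The methylation estimate relies on the fact that bisulfite treatment leaves methylated cytosines read as C and converts unmethylated ones so that they are read as T. The cited studies may use different protocols.

```python
# Illustrative sketch only; all names and numbers are hypothetical.

def relative_copy_number(ct_l1_sample, ct_ref_sample,
                         ct_l1_control, ct_ref_control):
    """Relative L1 copy number (sample vs control) by 2^(-ddCt).

    Assumes the product doubles every cycle (about 100% efficiency)
    and that the reference locus has the same copy number in both.
    """
    d_ct_sample = ct_l1_sample - ct_ref_sample
    d_ct_control = ct_l1_control - ct_ref_control
    return 2.0 ** (-(d_ct_sample - d_ct_control))


def methylation_fraction(c_reads, t_reads):
    """Methylated fraction at one CpG from bisulfite read counts.

    c_reads: reads still showing C (methylated, protected).
    t_reads: reads showing T (unmethylated, converted).
    """
    total = c_reads + t_reads
    if total == 0:
        raise ValueError("no coverage at this site")
    return c_reads / total


def mean_methylation(sites):
    """Unweighted mean of per-site fractions over (c_reads, t_reads) pairs."""
    fractions = [methylation_fraction(c, t) for c, t in sites]
    return sum(fractions) / len(fractions)


# Made-up example: L1 amplifies one cycle earlier (relative to the
# reference) in the sample than in the control.
fold = relative_copy_number(18.0, 25.0, 19.0, 25.0)
level = mean_methylation([(30, 10), (20, 20)])
print(fold, level)
```

A lower `mean_methylation` value in tumor tissue than in matched normal tissue would correspond to the hypomethylation described above.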
Neuropsychiatric disorders
Higher L1 copy numbers have been observed in the human brain compared to other organs.[22][23] Studies of animal models and human cell lines have shown that L1s become active in neural progenitor cells (NPCs), and that experimental deregulation or overexpression of L1 increases somatic mosaicism. This phenomenon is negatively regulated by Sox2, which is downregulated in NPCs, and by MeCP2 and methylation of the L1 5' UTR.[24] Human cell lines modeling the neurological disorder Rett syndrome, which carry MeCP2 mutations, exhibit increased L1 transposition, suggesting a link between L1 activity and neurological disorders.[24][25] Current studies are investigating the potential roles of L1 activity in various neuropsychiatric disorders, including schizophrenia, autism spectrum disorders, epilepsy, bipolar disorder, Tourette syndrome, and drug addiction.[26] L1s are also highly expressed in the octopus brain, suggesting a convergent mechanism in complex cognition.[27]
Retinal disease
Increased RNA levels of Alu, which requires L1 proteins, are associated with a form of age-related macular degeneration, a neurological disorder of the eyes.[28]
The naturally occurring mouse retinal degeneration model rd7 is caused by an L1 insertion in the Nr2e3 gene.[29]
COVID-19
In 2021, a study proposed that L1 elements may be responsible for potential endogenisation of the SARS-CoV-2 genome in Huh7 mutant cancer cells,[30] which could explain why some patients test PCR-positive for SARS-CoV-2 even after clearance of the virus. These results, however, have been criticized as not reproducible,[31] misleading and infrequent,[32] or artefactual.[33]
See also
L1Base, a database of functional annotations and predictions of active LINE1 elements[34]
Chen J, Rattner A, Nathans J (July 2006). "Effects of L1 retrotransposon insertion on transcript processing, localization and accumulation: lessons from the retinal degeneration 7 mouse and implications for the genomic ecology of L1 elements". Human Molecular Genetics. 15 (13): 2146–56. doi:10.1093/hmg/ddl138. PMID 16723373.
Zheng F, Kawabe Y, Murakami M, Takahashi M, Nishihata K, Yoshida S, et al. (July 2021). "LINE-1 vectors mediate recombinant antibody gene transfer by retrotransposition in Chinese hamster ovary cells". Biotechnology Journal. 16 (7): e2000620. doi:10.1002/biot.202000620. PMID 33938150. S2CID 233484152.
Jachowicz JW, Bing X, Pontabry J, Bošković A, Rando OJ, Torres-Padilla ME (October 2017). "LINE-1 activation after fertilization regulates global chromatin accessibility in the early mouse embryo". Nature Genetics. 49 (10): 1502–1510. doi:10.1038/ng.3945. PMID 28846101. S2CID 5213902.