More "Junk DNA" (lncRNA) discovered
The paper "Deep annotation of long noncoding RNAs by assembling RNA-seq and small RNA-seq data" by Tang et al. (2023) describes a new computational pipeline called RSCS (RNA-seq and small RNA-seq combined strategy) for annotating long noncoding RNAs (lncRNAs), the transcripts produced from what is often called "junk DNA". The RSCS pipeline combines RNA-seq and small RNA-seq data to identify full-length lncRNA transcripts, including those that are expressed at low levels or associated with transposable elements.
The RSCS pipeline was first validated in early mouse embryos, where it identified thousands of novel lncRNA transcripts. The reliability of these novel transcripts was then checked by analyzing their transcript structure, base composition, and sequence complexity. The results showed that the RSCS pipeline can generate a more complete and precise transcriptome than traditional RNA-seq approaches.
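To make "base composition" and "sequence complexity" concrete, here is a minimal Python sketch of two common measures: GC content and the Shannon entropy of k-mers. These are assumptions chosen for illustration; the post does not say which metrics the authors used, and the function names `gc_content` and `kmer_entropy` are hypothetical.

```python
import math
from collections import Counter


def gc_content(seq: str) -> float:
    """Fraction of G and C among unambiguous bases (A, C, G, T, U)."""
    counted = [base for base in seq.upper() if base in "ACGTU"]
    if not counted:
        return 0.0
    return sum(base in "GC" for base in counted) / len(counted)


def kmer_entropy(seq: str, k: int = 2) -> float:
    """Shannon entropy (in bits) of the k-mer distribution of a sequence.

    Illustrative complexity measure only: low values suggest repetitive,
    low-complexity sequence; higher values suggest more varied sequence.
    """
    seq = seq.upper()
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    if not kmers:
        return 0.0
    total = len(kmers)
    counts = Counter(kmers)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())
```

For example, `gc_content("ATGC")` gives 0.5, and a homopolymer such as `"AAAAAAAA"` has a k-mer entropy of 0 because only one k-mer ever occurs.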
The RSCS pipeline was also used to identify a large number of endogenous retrovirus-associated lncRNAs (ERV-lncRNAs), which are lncRNAs derived from endogenous retroviruses. These lncRNAs are thought to play a role in regulating gene expression and in immune responses. Among them, the authors identified a novel ERV-lncRNA that is functionally involved in the control of Yap1 expression and is essential for early embryogenesis.
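One simple way to think about "ERV-associated" is as a genomic overlap test between transcript coordinates and annotated ERV elements. The sketch below assumes that definition purely for illustration; the post does not describe the criteria the authors actually applied, and `Interval`, `overlap_bp`, `erv_associated`, and the `min_overlap` threshold are all hypothetical names. Strand is ignored to keep the example short.

```python
from typing import List, NamedTuple


class Interval(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # 0-based, exclusive
    name: str


def overlap_bp(a: Interval, b: Interval) -> int:
    """Number of bases shared by two intervals (0 if on different chromosomes)."""
    if a.chrom != b.chrom:
        return 0
    return max(0, min(a.end, b.end) - max(a.start, b.start))


def erv_associated(transcripts: List[Interval], ervs: List[Interval],
                   min_overlap: int = 1) -> List[str]:
    """Names of transcripts overlapping any ERV by at least min_overlap bases.

    Assumed definition for illustration, not the paper's stated criterion.
    """
    return [
        t.name
        for t in transcripts
        if any(overlap_bp(t, e) >= min_overlap for e in ervs)
    ]


# Made-up coordinates, for demonstration only.
transcripts = [
    Interval("chr1", 100, 500, "lnc1"),
    Interval("chr1", 1000, 1500, "lnc2"),
    Interval("chr2", 100, 500, "lnc3"),
]
ervs = [Interval("chr1", 450, 600, "ERV_a")]
print(erv_associated(transcripts, ervs))  # ['lnc1']
```

The nested loop is quadratic in the number of features; for genome-scale annotations an interval index or a dedicated overlap tool would be the more realistic choice.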
The RSCS pipeline is a powerful tool for annotating lncRNAs. It is able to identify full-length lncRNA transcripts, including those that are expressed at low levels or that are associated with transposable elements. The RSCS pipeline is also able to identify ERV-lncRNAs, which are thought to play important roles in various biological processes.
Here are some additional details about the RSCS pipeline:
The RSCS pipeline first uses RNA-seq data to identify candidate transcripts.
Next, it uses small RNA-seq data to identify small RNAs that are complementary to the transcripts identified from the RNA-seq data.
It then uses the small RNAs to assemble the full-length transcripts.
Finally, it performs a variety of quality checks to ensure that the assembled transcripts are accurate.
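As a toy illustration of the steps above, the following Python sketch shows one way small RNA evidence could bridge gaps between transcript fragments. It is not the RSCS algorithm, whose details this post does not give. It assumes every interval lies on one chromosome and one strand, uses half-open `(start, end)` coordinates, and the function name `merge_with_small_rna` is hypothetical.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # half-open (start, end) on a single chromosome and strand


def merge_with_small_rna(fragments: List[Span], small_reads: List[Span]) -> List[Span]:
    """Merge RNA-seq fragments that are connected by small RNA reads.

    Toy model (an assumption, not the published method): pool both interval
    sets, merge any that overlap or touch, and keep only merged spans that
    contain at least one RNA-seq fragment.
    """
    # Tag each interval with its source: True = RNA-seq fragment.
    tagged = [(s, e, True) for s, e in fragments]
    tagged += [(s, e, False) for s, e in small_reads]
    tagged.sort()

    merged = []  # each entry: [start, end, contains_fragment]
    for start, end, is_fragment in tagged:
        if merged and start <= merged[-1][1]:
            # Overlaps or touches the current span: extend it.
            merged[-1][1] = max(merged[-1][1], end)
            merged[-1][2] = merged[-1][2] or is_fragment
        else:
            merged.append([start, end, is_fragment])

    # Discard spans supported only by small RNA reads.
    return [(s, e) for s, e, has_fragment in merged if has_fragment]


# Made-up coordinates: two fragments bridged by small RNA reads, one isolated
# fragment, and one stray small RNA read with no fragment support.
fragments = [(100, 300), (350, 600), (900, 1000)]
small_reads = [(290, 320), (315, 355), (2000, 2020)]
print(merge_with_small_rna(fragments, small_reads))  # [(100, 600), (900, 1000)]
```

In this example the two small RNA reads spanning positions 290 to 355 join the first two fragments into one longer transcript, while the stray read at 2000 is dropped. A real pipeline would also need to handle strand, splicing, and the quality checks mentioned above.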
The RSCS pipeline is a valuable tool for researchers who are interested in studying lncRNAs.
Article snippets:
Long noncoding RNAs (lncRNAs) are increasingly being recognized as modulators in various biological processes.
However, due to their low expression, their systematic characterization is difficult to determine.
we performed transcript annotation by a newly developed computational pipeline, termed RNA-seq and small RNA-seq combined strategy (RSCS),
Thousands of high-confidence potential novel transcripts were identified by the RSCS,
taking advantage of our strategy, we identified a large number of endogenous retrovirus-associated lncRNAs (ERV-lncRNAs), and a novel ERV-lncRNA that was functionally involved in control of Yap1 expression and essential for early embryogenesis was identified
the RSCS can generate a more complete and precise transcriptome, and our findings greatly expanded the transcriptome annotation for the mammalian community.