J Biomol Tech. 2015 Apr;26(1):4-18.
doi: 10.7171/jbt.15-2601-001.

Evaluation of commercially available RNA amplification kits for RNA sequencing using very low input amounts of total RNA

Savita Shanker et al. J Biomol Tech. 2015 Apr.

Abstract

This article includes supplemental data. Please visit http://www.fasebj.org to obtain this information.

Multiple recent publications on RNA sequencing (RNA-seq) have demonstrated the power of next-generation sequencing technologies in whole-transcriptome analysis. Vendor-specific protocols used for RNA library construction often require at least 100 ng total RNA. However, under certain conditions, much less RNA is available for library construction. In these cases, effective transcriptome profiling requires amplification of subnanogram amounts of RNA. Several commercial RNA amplification kits are available for amplification prior to library construction for next-generation sequencing, but these kits have not been comprehensively field-evaluated for accuracy and performance of RNA-seq for picogram amounts of RNA. To address this, 4 types of amplification kits were tested with 3 different concentrations, from 5 ng to 50 pg, of a commercially available RNA. Kits were tested at multiple sites to assess reproducibility and ease of use. The human total reference RNA used was spiked with a control pool of RNA molecules in order to further evaluate quantitative recovery of input material. Additional control data sets were generated from libraries constructed following polyA selection or ribosomal depletion using established kits and protocols. cDNA was collected from the different sites, and libraries were synthesized at a single site using established protocols. Sequencing runs were carried out on the Illumina platform. Numerous metrics were compared among the kits and dilutions used. Overall, no single kit appeared to meet all the challenges of small input material. However, it is encouraging that excellent data can be recovered with even the 50 pg input total RNA.

Keywords: cDNA synthesis kits; polyA; ribodepletion.

Figures

Figure 1
Read composition comparison. The percentage of all recovered reads falling into particular categories is shown: red indicates all aligning reads, yellow represents all aligning reads mapping to a unique genomic location, and blue shows the uniquely mapping reads with putative PCR duplicates (i.e., reads starting at the same coordinate) removed. The numbers represent the different sites providing the starting cDNA from the indicated kits; RNA concentrations from high to low are represented by the blue triangles. RZ, Ribo-Zero; TS, TruSeq.
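The caption above defines putative PCR duplicates as reads starting at the same coordinate. A minimal sketch of that filter follows. The `Read` record and the function name are hypothetical, and treating the two strands separately is an assumption that the caption does not state; this is not the pipeline used in the study.

```python
from typing import Iterable, List, NamedTuple


class Read(NamedTuple):
    """Hypothetical minimal alignment record (not a real library type)."""
    name: str
    chrom: str
    start: int
    strand: str


def remove_putative_duplicates(reads: Iterable[Read]) -> List[Read]:
    """Keep the first read seen at each start coordinate.

    Assumption: strand is part of the key, so reads on opposite
    strands at the same position are not collapsed.
    """
    seen = set()
    kept = []
    for read in reads:
        key = (read.chrom, read.start, read.strand)
        if key not in seen:
            seen.add(key)
            kept.append(read)
    return kept


reads = [
    Read("r1", "chr1", 100, "+"),
    Read("r2", "chr1", 100, "+"),  # same start as r1: putative duplicate
    Read("r3", "chr1", 100, "-"),
    Read("r4", "chr2", 100, "+"),
]
kept = remove_putative_duplicates(reads)
```

Here `kept` retains `r1`, `r3`, and `r4`, so the deduplicated fraction for this toy set is 3 of 4 uniquely mapping reads.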
Figure 2
(A) Comparison of RNA category percentages. The percent composition of several broad categories of RNAs is displayed; 100% represents the total number of uniquely aligning reads for each sample set. Blue indicates exonic reads, red shows intronic, yellow represents ribosomal, purple indicates intergenic, aqua shows novel splice junctions, and green indicates known splice junctions. The kits used and participating sites are indicated, with RNA input amounts from highest to lowest represented by blue triangles. (B) Comparison of RNA category percentages. Designations shown are the same as those in (A), except the putative PCR duplicates have been removed from these data sets prior to percent category calculations.
Figure 3
(A) Detection sensitivity. The number of genes represented by at least 2 alignment hits is shown for the different construction kits and protocols; coding genes are shown in blue, ncRNAs in green, and pseudogenes in red. RNA concentrations going from high to low are represented by blue triangles, and the site numbers are indicated. The queried set is the Ensembl version 66 compilation of 53,599 genes. (B) Sensitivity of gene detection by gene type. The different biotypes are indicated on the left-hand side. For each library, detected gene percentages represent the mean from 5 different 10 million read down samplings. Heatmap values are detected genes as a percentage of detectable genes; detection requires ≥2 reads, and “detectable” genes are those with ≥2 reads in the 1 μg Ribo-Zero control. Kit types and dilutions are indicated as in the other figures.
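The detection metric in panel (B) can be sketched as follows: draw several random down-samplings of a library, call a gene detected when it has ≥2 reads, and average the detected percentage of a reference "detectable" set. The function names, the flat list of per-read gene assignments, and the fixed seed are illustrative assumptions; the study's actual tooling is not described in the caption.

```python
import random
from collections import Counter
from typing import List, Set


def detected_genes(gene_hits: List[str], min_reads: int = 2) -> Set[str]:
    """Genes with at least min_reads assigned reads."""
    counts = Counter(gene_hits)
    return {gene for gene, n in counts.items() if n >= min_reads}


def mean_percent_detected(gene_hits: List[str], detectable: Set[str],
                          depth: int, n_draws: int = 5, seed: int = 0) -> float:
    """Mean percentage of detectable genes found across n_draws down-samplings.

    gene_hits holds one gene ID per aligned read (hypothetical input layout).
    """
    if not detectable:
        raise ValueError("detectable set must not be empty")
    rng = random.Random(seed)
    percents = []
    for _ in range(n_draws):
        # Sample without replacement; cap at the library size.
        sample = rng.sample(gene_hits, min(depth, len(gene_hits)))
        found = detected_genes(sample) & detectable
        percents.append(100.0 * len(found) / len(detectable))
    return sum(percents) / len(percents)


# Toy library: gene B has a single read, so it never reaches the threshold.
hits = ["A"] * 5 + ["B"] * 1 + ["C"] * 3
pct = mean_percent_detected(hits, {"A", "B", "C"}, depth=len(hits))
```

In the caption, the depth is 10 million reads, the number of draws is 5, and the detectable set comes from the 1 μg Ribo-Zero control; the toy call above only exercises the logic.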
Figure 4
RPKM measurements of detected reads. Gene abundance was quantified and divided into quantiles based on their relative levels. RNA concentrations going from high to low are represented by blue triangles, and the site numbers are indicated.
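For readers unfamiliar with the unit, RPKM is reads per kilobase of transcript per million mapped reads. A minimal sketch of the standard definition is shown below; the function and argument names are illustrative, and the caption does not say which software computed the values in the study.

```python
def rpkm(gene_reads: int, gene_length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of transcript per million mapped reads."""
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and total mapped reads must be positive")
    # 1e9 = 1e3 (bases per kilobase) * 1e6 (reads per million).
    return gene_reads * 1e9 / (gene_length_bp * total_mapped_reads)


# 100 reads on a 2 kb gene in a library of 10 million mapped reads.
value = rpkm(100, 2000, 10_000_000)
```

Normalizing by both gene length and library size is what allows abundances to be binned into quantiles across libraries of different depth.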
Figure 5
(A) Gene expression comparisons. Pearson correlations among kits and across dilutions were graphed based on gene expression values. These values were derived from the Log2(Median Library RPKM) per gene. (B) Pairwise correlations of gene expression measurements. Values are displayed as box plots when sufficient data points are available; otherwise, individual values are displayed.
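The comparison in panel (A) can be sketched as: take the median RPKM per gene across a kit's libraries, log2-transform it, and compute the Pearson correlation between two kits over shared genes. The function names and the dict-of-lists input are hypothetical. Dropping genes whose median is zero is an assumption made here to keep the logarithm defined; the caption does not state how zeros were handled.

```python
import math
from statistics import median
from typing import Dict, List


def log2_median_rpkm(rpkm_by_gene: Dict[str, List[float]]) -> Dict[str, float]:
    """Log2 of the per-gene median RPKM across libraries.

    Assumption: genes with a zero median are dropped rather than
    given a pseudocount.
    """
    out = {}
    for gene, values in rpkm_by_gene.items():
        m = median(values)
        if m > 0:
            out[gene] = math.log2(m)
    return out


def pearson(xs: List[float], ys: List[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences of at least 2 values")
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(vx * vy)


def kit_correlation(kit_a: Dict[str, List[float]],
                    kit_b: Dict[str, List[float]]) -> float:
    """Correlate two kits over genes with a positive median in both."""
    a, b = log2_median_rpkm(kit_a), log2_median_rpkm(kit_b)
    shared = sorted(a.keys() & b.keys())
    return pearson([a[g] for g in shared], [b[g] for g in shared])


# Toy data: kit B reports every gene at twice kit A's median level.
kit_a = {"g1": [1, 2, 4], "g2": [8, 8, 8], "g3": [16, 32, 64]}
kit_b = {"g1": [4], "g2": [16], "g3": [64]}
r = kit_correlation(kit_a, kit_b)
```

Because a constant fold change becomes a constant offset after the log2 transform, the toy kits correlate perfectly even though their absolute RPKM values differ.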
Figure 6
Gene body coverage. The mean coverage from all reporting genes and all samples from the different kits is displayed (i.e., all input concentrations and site data were pooled for a given kit). The figure shows the Z score across each bin, going from 5′ to 3′ left to right. Exonic reads within detected genes were used for the calculations.
Figure 7
(A) ERCC standards expression. Pearson correlations among kits and across dilutions were graphed based on the recovered RPKM values of ERCC standards. Blue triangles indicate dilution series. (B) Log linearity of ERCC RPKM values. Box plots indicate expression levels relative to the known concentrations of the original standards present. A value of 1.0 indicates equivalence between the published standard values and the recovered values.
