Introduction
Here we show two options for using limorhyde2
to analyze
RNA-seq data: limma-voom
and DESeq2.
The two approaches give very similar results.
This vignette assumes you are starting with the output of tximport.
Load the data
You will need two objects:
txi
, a list from tximport
metadata
, a data.frame
having one row per
sample
The rows in metadata
must correspond to the columns of
the elements of txi
.
library('limorhyde2')
# txi = ?
# metadata = ?
Filter out lowly expressed genes
There are many reasonable strategies to do this, here is one.
keep = rowSums(edgeR::cpm(txi$counts) >= 0.5) / ncol(txi$counts) >= 0.75
txiKeep = txi
for (name in c('counts', 'length')) {
txiKeep[[name]] = txi[[name]][keep, ]}
Fill in counts for sample-gene pairs having zero counts
This avoids unrealistically low log2 CPM values and thus artificially
inflated effect size estimates.
for (i in seq_len(nrow(txiKeep$counts))) {
idx = txiKeep$counts[i, ] > 0
txiKeep$counts[i, !idx] = min(txiKeep$counts[i, idx])}
Option 1: limma-voom
y = edgeR::DGEList(txiKeep$counts)
y = edgeR::calcNormFactors(y)
fit = getModelFit(y, metadata, ..., method = 'voom') # replace '...' as appropriate for your data
Option 2: DESeq2
The second and third arguments to
DESeqDataSetFromTxImport()
are required, but will not be
used by limorhyde2
.
y = DESeq2::DESeqDataSetFromTximport(txiKeep, metadata, ~1)
fit = getModelFit(y, metadata, ..., method = 'deseq2') # replace '...' as appropriate for your data
Continue using limorhyde2
Regardless of which option you choose, the next steps are the same:
getPosteriorFit()
, getRhythmStats()
, etc.