Title: | Quantify Rhythmicity and Differential Rhythmicity in Genomic Data |
---|---|
Description: | Fit linear models based on periodic splines, moderate model coefficients using multivariate adaptive shrinkage, then compute properties of the moderated curves. |
Authors: | Jake Hughey [aut, cre], Dora Obodo [aut], Elliot Outland [aut] |
Maintainer: | Jake Hughey <[email protected]> |
License: | GPL-2 |
Version: | 0.1.0 |
Built: | 2025-03-06 05:21:08 UTC |
Source: | https://github.com/hugheylab/limorhyde2 |
This function computes differences in rhythmicity between fitted curves for a given pair of conditions.
getDiffRhythmStats(fit, rhyStats, conds = fit$conds, dopar = TRUE)
getDiffRhythmStats(fit, rhyStats, conds = fit$conds, dopar = TRUE)
fit |
A |
rhyStats |
A |
conds |
A character vector indicating the conditions to compare
pairwise, by default all conditions in |
dopar |
Logical indicating whether to run calculations in parallel if
a parallel backend is already set up, e.g., using
|
A data.table
containing the following differential rhythm
statistics:
mean_mesor
mean_peak_trough_amp
mean_rms_amp
(only calculated if rms
to getRhythmStats()
was TRUE
)
diff_mesor
diff_peak_trough_amp
diff_rms_amp
(only calculated if rms
to getRhythmStats()
was TRUE
)
diff_peak_phase
: circular difference between -fit$period/2
and
fit$period/2
diff_trough_phase
: circular difference between -fit$period/2
and
fit$period/2
diff_rhy_dist
: Euclidean distance between polar coordinates
(peak_trough_amp
, peak_phase
)
rms_diff_rhy
: root mean square difference in mean-centered fitted curves
(only calculated if rms
to getRhythmStats()
was TRUE
)
The stats will be based on the value for cond2
minus the value for cond1
.
The rows of the data.table
depend on the 'fitType' attribute of rhyStats
:
'fitType' is 'posterior_mean' or 'raw': one row per feature per pair of conditions.
'fitType' is 'posterior_samples': one row per feature per posterior sample per pair of conditions.
getRhythmStats()
, getStatsIntervals()
library('data.table') # rhythmicity in one condition y = GSE54650$y metadata = GSE54650$metadata fit = getModelFit(y, metadata) fit = getPosteriorFit(fit) rhyStats = getRhythmStats(fit, features = c('13170', '13869')) # rhythmicity and differential rhythmicity in multiple conditions y = GSE34018$y metadata = GSE34018$metadata fit = getModelFit(y, metadata, nKnots = 3L, condColname = 'cond') fit = getPosteriorFit(fit) rhyStats = getRhythmStats(fit, features = c('13170', '12686')) diffRhyStats = getDiffRhythmStats(fit, rhyStats)
library('data.table') # rhythmicity in one condition y = GSE54650$y metadata = GSE54650$metadata fit = getModelFit(y, metadata) fit = getPosteriorFit(fit) rhyStats = getRhythmStats(fit, features = c('13170', '13869')) # rhythmicity and differential rhythmicity in multiple conditions y = GSE34018$y metadata = GSE34018$metadata fit = getModelFit(y, metadata, nKnots = 3L, condColname = 'cond') fit = getPosteriorFit(fit) rhyStats = getRhythmStats(fit, features = c('13170', '12686')) diffRhyStats = getDiffRhythmStats(fit, rhyStats)
This function computes expected measurements (corresponding to the fitted curves) for the specified times and features in all combinations of conditions and covariates (if they exist).
getExpectedMeas( fit, times, fitType = c("posterior_mean", "posterior_samples", "raw"), features = NULL, dopar = TRUE )
getExpectedMeas( fit, times, fitType = c("posterior_mean", "posterior_samples", "raw"), features = NULL, dopar = TRUE )
fit |
A 'limorhyde2' object. |
times |
Numeric vector of times, in units of
|
fitType |
String indicating which fitted models to use to compute the
expected measurements. A typical analysis using |
features |
Vector of names, row numbers, or logical values for
subsetting the features. |
dopar |
Logical indicating whether to run calculations in parallel if
a parallel backend is already set up, e.g., using
|
A data.table
.
getModelFit()
, getPosteriorFit()
, getPosteriorSamples()
,
getExpectedMeasIntervals()
library('data.table') y = GSE34018$y metadata = GSE34018$metadata fit = getModelFit(y, metadata) fit = getPosteriorFit(fit) measObs = mergeMeasMeta(y, metadata, features = c('13170', '12686')) measFitMean = getExpectedMeas( fit, times = seq(0, 24, 0.5), features = c('13170', '12686'))
library('data.table') y = GSE34018$y metadata = GSE34018$metadata fit = getModelFit(y, metadata) fit = getPosteriorFit(fit) measObs = mergeMeasMeta(y, metadata, features = c('13170', '12686')) measFitMean = getExpectedMeas( fit, times = seq(0, 24, 0.5), features = c('13170', '12686'))
This functions uses posterior samples to quantify uncertainty in the expected measurements from fitted models.
getExpectedMeasIntervals(expectedMeas, mass = 0.9, method = c("eti", "hdi"))
getExpectedMeasIntervals(expectedMeas, mass = 0.9, method = c("eti", "hdi"))
expectedMeas |
A |
mass |
Number between 0 and 1 indicating the probability mass for which to calculate the intervals. |
method |
String indicating the type of interval: 'eti' for equal-tailed
using |
A data.table
containing lower and upper bounds of the expected
measurement for each combination of feature, time, and possibly condition
and covariate.
getExpectedMeas()
, getStatsIntervals()
library('data.table') y = GSE34018$y metadata = GSE34018$metadata fit = getModelFit(y, metadata) fit = getPosteriorFit(fit) fit = getPosteriorSamples(fit, nPosteriorSamples = 10L) measFitSamps = getExpectedMeas( fit, times = seq(0, 24, 0.5), fitType = 'posterior_samples', features = c('13170', '12686')) measFitInts = getExpectedMeasIntervals(measFitSamps)
library('data.table') y = GSE34018$y metadata = GSE34018$metadata fit = getModelFit(y, metadata) fit = getPosteriorFit(fit) fit = getPosteriorSamples(fit, nPosteriorSamples = 10L) measFitSamps = getExpectedMeas( fit, times = seq(0, 24, 0.5), fitType = 'posterior_samples', features = c('13170', '12686')) measFitInts = getExpectedMeasIntervals(measFitSamps)
This is the first step in an analysis using limorhyde2
, the second is to
moderate the fits using getPosteriorFit()
.
getModelFit( y, metadata, period = 24, nKnots = 3L, degree = if (nKnots > 2) 3L else 2L, sinusoid = FALSE, timeColname = "time", condColname = NULL, covarColnames = NULL, sampleColname = "sample", nShifts = 3L, method = c("trend", "voom", "deseq2"), lmFitArgs = list(), eBayesArgs = if (method == "trend") list(trend = TRUE) else list(), DESeqArgs = list(), keepLmFits = FALSE )
getModelFit( y, metadata, period = 24, nKnots = 3L, degree = if (nKnots > 2) 3L else 2L, sinusoid = FALSE, timeColname = "time", condColname = NULL, covarColnames = NULL, sampleColname = "sample", nShifts = 3L, method = c("trend", "voom", "deseq2"), lmFitArgs = list(), eBayesArgs = if (method == "trend") list(trend = TRUE) else list(), DESeqArgs = list(), keepLmFits = FALSE )
y |
Matrix-like object of measurements, with rows corresponding to features and columns to samples. |
metadata |
data.frame containing experimental design information for
each sample. Rows of |
period |
Number specifying the period for the time variable, in the same
units as the values in the |
nKnots |
Number of internal knots for the periodic spline for the time variable. |
degree |
Integer indicating degree of the piecewise polynomial for the spline. |
sinusoid |
Logical indicating whether to fit a cosinor-based model instead of a spline-based model. |
timeColname |
String indicating the column in |
condColname |
String indicating the column in |
covarColnames |
Character vector indicating the columns in |
sampleColname |
String indicating the column in |
nShifts |
Number of shifted models to fit. Only used for periodic splines, not for cosinor. Do not change from the default unless you know what you're doing. |
method |
String indicating method to estimate model coefficients. For microarray data, use 'trend'. For RNA-seq count data, use 'voom' or 'deseq2'. |
lmFitArgs |
List of arguments passed to |
eBayesArgs |
List of arguments passed to |
DESeqArgs |
List of arguments passed to |
keepLmFits |
Logical indicating whether to keep the complete fit objects
from |
A limorhyde2
object with elements:
metadata
: As supplied above, converted to a data.table
.
timeColname
: As supplied above.
condColname
: As supplied above.
covarColnames
: As supplied above.
coefficients
: Matrix with rows corresponding to features and columns to
model terms, including all shifted models.
shifts
: Numeric vector indicating amount by which timepoints were shifted
for each shifted model.
period
: As supplied above.
conds
: If condColname
is not NULL
, a vector of unique values of
the condition variable.
nKnots
: Number of knots.
degree
: As supplied above.
sinusoid
: As supplied above.
nConds
: Number of conditions.
nCovs
: Number of covariates.
lmFits
: If keepLmFits
is TRUE
, a list of objects from limma
or
DESeq2
, with length equal to length of the shifts
element.
library('data.table') # rhythmicity in one condition y = GSE54650$y metadata = GSE54650$metadata fit = getModelFit(y, metadata) fit = getPosteriorFit(fit) rhyStats = getRhythmStats(fit, features = c('13170', '13869')) # rhythmicity and differential rhythmicity in multiple conditions y = GSE34018$y metadata = GSE34018$metadata fit = getModelFit(y, metadata, nKnots = 3L, condColname = 'cond') fit = getPosteriorFit(fit) rhyStats = getRhythmStats(fit, features = c('13170', '12686')) diffRhyStats = getDiffRhythmStats(fit, rhyStats)
library('data.table') # rhythmicity in one condition y = GSE54650$y metadata = GSE54650$metadata fit = getModelFit(y, metadata) fit = getPosteriorFit(fit) rhyStats = getRhythmStats(fit, features = c('13170', '13869')) # rhythmicity and differential rhythmicity in multiple conditions y = GSE34018$y metadata = GSE34018$metadata fit = getModelFit(y, metadata, nKnots = 3L, condColname = 'cond') fit = getPosteriorFit(fit) rhyStats = getRhythmStats(fit, features = c('13170', '12686')) diffRhyStats = getDiffRhythmStats(fit, rhyStats)
This is the second step in an analysis using limorhyde2
, the first is to
fit linear models using getModelFit()
. This function obtains posterior
estimates of coefficients using multivariate adaptive shrinkage (mash), which
learns patterns in the data and accounts for noise in the original fits. The
defaults for arguments should work well in most cases, so only change them if
you know what you're doing.
getPosteriorFit( fit, covMethod = c("data-driven", "canonical", "both"), getSigResArgs = list(), npc = fit$nKnots, covEdArgs = list(), overwrite = FALSE, ... )
getPosteriorFit( fit, covMethod = c("data-driven", "canonical", "both"), getSigResArgs = list(), npc = fit$nKnots, covEdArgs = list(), overwrite = FALSE, ... )
fit |
A |
covMethod |
String indicating the type(s) of covariance matrices to use for the mash fit. |
getSigResArgs |
List of arguments passed to
|
npc |
Number of principal components passed to |
covEdArgs |
List of arguments passed to |
overwrite |
Logical for whether to recompute the mash fit if it already exists. |
... |
Additional arguments passed to |
A limorhyde2
object containing everything in fit
with added or
updated elements:
mashData
: list of mash
data objects
mashFits
: list of mash
fit objects
mashCoefficients
: Matrix of posterior mean coefficients, with rows
corresponding to features and columns to model terms.
mashIdx
: Vector indicating which model terms were included in the mash
fit.
getModelFit()
, getRhythmStats()
, getExpectedMeas()
library('data.table') # rhythmicity in one condition y = GSE54650$y metadata = GSE54650$metadata fit = getModelFit(y, metadata) fit = getPosteriorFit(fit) rhyStats = getRhythmStats(fit, features = c('13170', '13869')) # rhythmicity and differential rhythmicity in multiple conditions y = GSE34018$y metadata = GSE34018$metadata fit = getModelFit(y, metadata, nKnots = 3L, condColname = 'cond') fit = getPosteriorFit(fit) rhyStats = getRhythmStats(fit, features = c('13170', '12686')) diffRhyStats = getDiffRhythmStats(fit, rhyStats)
library('data.table') # rhythmicity in one condition y = GSE54650$y metadata = GSE54650$metadata fit = getModelFit(y, metadata) fit = getPosteriorFit(fit) rhyStats = getRhythmStats(fit, features = c('13170', '13869')) # rhythmicity and differential rhythmicity in multiple conditions y = GSE34018$y metadata = GSE34018$metadata fit = getModelFit(y, metadata, nKnots = 3L, condColname = 'cond') fit = getPosteriorFit(fit) rhyStats = getRhythmStats(fit, features = c('13170', '12686')) diffRhyStats = getDiffRhythmStats(fit, rhyStats)
This is an optional step in an analysis using limorhyde2
, and is useful for
quantifying uncertainty in posterior estimates of fitted curves and rhythmic
statistics. The function calls mashr::mash_compute_posterior_matrices()
.
getPosteriorSamples(fit, nPosteriorSamples = 200L, overwrite = FALSE)
getPosteriorSamples(fit, nPosteriorSamples = 200L, overwrite = FALSE)
fit |
A ‘limorhyde2’ object containing posterior fits. |
nPosteriorSamples |
Number of samples to draw from each posterior distribution. |
overwrite |
Logical indicating whether to recompute posterior samples if they already exist. |
A limorhyde2
object containing everything in fit
with added or
updated element:
mashPosteriorSamples
: a three-dimensional array of coefficients, with dim
1 corresponding to features, dim 2 to model terms, and dim 3 to posterior
samples.
getPosteriorFit()
, getRhythmStats()
, getExpectedMeas()
,
getStatsIntervals()
library('data.table') y = GSE54650$y metadata = GSE54650$metadata fit = getModelFit(y, metadata) fit = getPosteriorFit(fit) fit = getPosteriorSamples(fit, nPosteriorSamples = 10L) rhyStatsSamps = getRhythmStats( fit, features = c('13170', '13869'), fitType = 'posterior_samples') rhyStatsInts = getStatsIntervals(rhyStatsSamps)
library('data.table') y = GSE54650$y metadata = GSE54650$metadata fit = getModelFit(y, metadata) fit = getPosteriorFit(fit) fit = getPosteriorSamples(fit, nPosteriorSamples = 10L) rhyStatsSamps = getRhythmStats( fit, features = c('13170', '13869'), fitType = 'posterior_samples') rhyStatsInts = getStatsIntervals(rhyStatsSamps)
This function uses stats::optim()
to compute various properties of
fitted curves with respect to time, potentially in each condition and for
each posterior sample, and adjusting for any covariates.
getRhythmStats( fit, fitType = c("posterior_mean", "posterior_samples", "raw"), features = NULL, dopar = TRUE, rms = FALSE )
getRhythmStats( fit, fitType = c("posterior_mean", "posterior_samples", "raw"), features = NULL, dopar = TRUE, rms = FALSE )
fit |
A |
fitType |
String indicating which fitted models to use to compute the
rhythmic statistics. A typical analysis using |
features |
Vector of names, row numbers, or logical values for
subsetting the features. |
dopar |
Logical indicating whether to run calculations in parallel if
a parallel backend is already set up, e.g., using
|
rms |
Logical indicating whether to calculate |
A data.table
containing the following rhythm statistics:
peak_phase
: time between 0 and fit$period
at which the peak or maximum
value occurs
peak_value
trough_phase
: time between 0 and fit$period
at which the trough or
minimum value occurs
trough_value
peak_trough_amp
: peak_value - trough_value
rms_amp
: root mean square difference between fitted curve and mean value
between time 0 and fit$period
(only calculated if rms
is TRUE
)
mesor
: mean value between time 0 and fit$period
The rows of the data.table
depend on the fit
object and fitType
:
fit
contains data from one condition and fitType
is posterior_mean' or
'raw': one row per feature.
fit
contains data from one condition and fitType
is
'posterior_samples': one row per feature per posterior sample.
fit
contains data from multiple conditions and fitType
is
'posterior_mean' or 'raw': one row per feature per condition.
fit
contains data from multiple conditions and fitType
is
'posterior_samples': one row per feature per condition per posterior
sample.
getModelFit()
, getPosteriorFit()
, getPosteriorSamples()
,
getDiffRhythmStats()
, getStatsIntervals()
library('data.table') # rhythmicity in one condition y = GSE54650$y metadata = GSE54650$metadata fit = getModelFit(y, metadata) fit = getPosteriorFit(fit) rhyStats = getRhythmStats(fit, features = c('13170', '13869')) # rhythmicity and differential rhythmicity in multiple conditions y = GSE34018$y metadata = GSE34018$metadata fit = getModelFit(y, metadata, nKnots = 3L, condColname = 'cond') fit = getPosteriorFit(fit) rhyStats = getRhythmStats(fit, features = c('13170', '12686')) diffRhyStats = getDiffRhythmStats(fit, rhyStats)
library('data.table') # rhythmicity in one condition y = GSE54650$y metadata = GSE54650$metadata fit = getModelFit(y, metadata) fit = getPosteriorFit(fit) rhyStats = getRhythmStats(fit, features = c('13170', '13869')) # rhythmicity and differential rhythmicity in multiple conditions y = GSE34018$y metadata = GSE34018$metadata fit = getModelFit(y, metadata, nKnots = 3L, condColname = 'cond') fit = getPosteriorFit(fit) rhyStats = getRhythmStats(fit, features = c('13170', '12686')) diffRhyStats = getDiffRhythmStats(fit, rhyStats)
This function uses posterior samples to quantify uncertainty in the properties of fitted curves.
getStatsIntervals(posteriorStats, mass = 0.9, method = c("eti", "hdi"))
getStatsIntervals(posteriorStats, mass = 0.9, method = c("eti", "hdi"))
posteriorStats |
A |
mass |
Number between 0 and 1 indicating the probability mass for which to calculate the intervals. |
method |
String indicating the type of interval: 'eti' for equal-tailed
using |
A data.table
containing lower and upper bounds of various
statistics for each feature or each feature-condition pair. For
peak_trough_amp
and rms_amp
, a negative lower bound indicates a rhythm
of the opposite phase.
getRhythmStats()
, getDiffRhythmStats()
,
getExpectedMeasIntervals()
library('data.table') y = GSE54650$y metadata = GSE54650$metadata fit = getModelFit(y, metadata) fit = getPosteriorFit(fit) fit = getPosteriorSamples(fit, nPosteriorSamples = 10L) rhyStatsSamps = getRhythmStats( fit, features = c('13170', '13869'), fitType = 'posterior_samples') rhyStatsInts = getStatsIntervals(rhyStatsSamps)
library('data.table') y = GSE54650$y metadata = GSE54650$metadata fit = getModelFit(y, metadata) fit = getPosteriorFit(fit) fit = getPosteriorSamples(fit, nPosteriorSamples = 10L) rhyStatsSamps = getRhythmStats( fit, features = c('13170', '13869'), fitType = 'posterior_samples') rhyStatsInts = getStatsIntervals(rhyStatsSamps)
Data are based on total RNA, measured by microarray, obtained from livers of
wild-type and liver-specific Reverb alpha/beta double knockout mice at
various times in a 12h:12h light:dark cycle. To save space and time, the data
include only a subset of genes, and so are mainly useful for examples of how
to use limorhyde2
.
GSE34018
GSE34018
A list with two elements:
y
: Matrix of normalized, log-transformed expression values. Rows
correspond to genes (rownames are Entrez Gene IDs) and columns to samples.
metadata
: data.table
with one row per sample. time
is in hours.
https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE34018
Data are based on total RNA, measured by microarray, obtained from livers of
wild-type mice at various times after transfer to constant darkness. To save
space and time, the data include only a subset of genes, and so are mainly
useful for examples of how to use limorhyde2
.
GSE54650
GSE54650
A list with two elements:
y
: Matrix of normalized, log-transformed expression values. Rows
correspond to genes (rownames are Entrez Gene IDs) and columns to samples.
metadata
: data.table
with one row per sample. time
is in hours.
https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE54650
This function is useful for plotting time-courses for individual features.
mergeMeasMeta(y, metadata, features = NULL, sampleColname = "sample")
mergeMeasMeta(y, metadata, features = NULL, sampleColname = "sample")
y |
Matrix-like object of measurements, with rows corresponding to features and columns to samples. |
metadata |
data.frame containing experimental design information for
each sample. Rows of |
features |
Vector of names, row numbers, or logical values for
subsetting the features. |
sampleColname |
String indicating the column in |
A data.table
with one row for each sample-feature pair.
library('data.table') y = GSE34018$y metadata = GSE34018$metadata fit = getModelFit(y, metadata) fit = getPosteriorFit(fit) measObs = mergeMeasMeta(y, metadata, features = c('13170', '12686')) measFitMean = getExpectedMeas( fit, times = seq(0, 24, 0.5), features = c('13170', '12686'))
library('data.table') y = GSE34018$y metadata = GSE34018$metadata fit = getModelFit(y, metadata) fit = getPosteriorFit(fit) measObs = mergeMeasMeta(y, metadata, features = c('13170', '12686')) measFitMean = getExpectedMeas( fit, times = seq(0, 24, 0.5), features = c('13170', '12686'))