results {DESeq2} | R Documentation |
results extracts results from a DESeq analysis, giving base means
across samples, log2 fold changes, standard errors, test statistics,
p-values and adjusted p-values; resultsNames returns the names of the
estimated effects (coefficients) of the model; removeResults returns a
DESeqDataSet object with results columns removed.
results(object, contrast, name, lfcThreshold = 0,
  altHypothesis = c("greaterAbs", "lessAbs", "greater", "less"),
  listValues = c(1, -1), cooksCutoff, independentFiltering = TRUE,
  alpha = 0.1, filter, theta, pAdjustMethod = "BH")

resultsNames(object)

removeResults(object)
object |
a DESeqDataSet, on which one of the
following functions has already been called:
|
contrast |
this argument specifies what comparison
to extract from the
If
specified, the |
name |
the name of the individual effect
(coefficient) for building a results table. Use this
argument rather than |
lfcThreshold |
a non-negative value, which specifies
the test which should be applied to the log2 fold
changes. The standard is a test that the log2 fold
changes are not equal to zero. However, log2 fold changes
greater or less than |
altHypothesis |
character which specifies the
alternative hypothesis, i.e. those values of log2 fold
change which the user is interested in finding. The
complement of this set of values is the null hypothesis
which will be tested. If the log2 fold change specified
by
|
listValues |
only used if a list is provided to
|
cooksCutoff |
threshold on Cook's distance, such that if one or more samples for a row have a Cook's distance above the threshold, the p-value for the row is set to NA. The default cutoff is the .99 quantile of the F(p, m-p) distribution, where p is the number of coefficients being fitted and m is the number of samples. Set to Inf or FALSE to disable the resetting of p-values to NA. Note: this test excludes the Cook's distance of samples whose removal would result in a rank-deficient design matrix and of samples belonging to experimental groups with only 2 samples. |
independentFiltering |
logical, whether independent filtering should be applied automatically |
alpha |
the significance cutoff used for optimizing the independent filtering |
filter |
the vector of filter statistics over which the independent filtering will be optimized. By default the mean of normalized counts is used. |
theta |
the quantiles at which to assess the number of rejections from independent filtering |
pAdjustMethod |
the method to use for adjusting
p-values, see |
Multiple results can be returned for analyses beyond a simple
two-group comparison, so results takes arguments contrast and name to
help the user pick out the comparison of interest for printing the
results table. If results is run without specifying contrast or name,
it will return the comparison of the last level of the last variable
in the design formula over the first level of this variable. For
example, for a simple two-group comparison, this would return the
log2 fold changes of the second group over the first group (the base
level). Please see examples below and in the vignette.
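Because the default comparison depends on the order of the factor levels, it can help to inspect the levels before fitting the model. The following is a minimal base R sketch, not DESeq2 code; the `condition` factor and its level names are hypothetical and chosen only for illustration.

```r
# Hypothetical sample annotation: three treated and three control samples.
condition <- factor(c("treated", "treated", "treated",
                      "control", "control", "control"))

# Levels are sorted alphabetically by default, so "control" is the
# first (base) level and "treated" is the last level.
lvls <- levels(condition)
default_comparison <- paste(lvls[length(lvls)], "vs", lvls[1])

# relevel() makes a different level the base level; the default
# comparison would then be reported relative to "treated".
condition2 <- relevel(condition, ref = "treated")
lvls2 <- levels(condition2)
releveled_comparison <- paste(lvls2[length(lvls2)], "vs", lvls2[1])
```

In this sketch the default comparison reads as last level over first level, so changing the base level changes the direction in which the log2 fold change would be reported.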
The argument contrast can be used to generate results tables for any
comparison of interest, for example, the log2 fold change between two
levels of a factor, and its usage is described below. It can also
accommodate more complicated numeric comparisons. The test statistic
used for a contrast is:
c' beta / sqrt( c' Sigma c )
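As a worked illustration of the statistic above, the following base R sketch evaluates it for a single gene. The coefficient values, the covariance matrix Sigma and the coefficient names are made-up assumptions, and the two-sided normal p-value at the end is an assumption of this sketch rather than something stated on this page.

```r
# Toy values (assumptions for illustration only): three fitted
# coefficients and their covariance matrix for a single gene.
beta <- c(intercept = 5.0, conditionA = -0.4, conditionB = 0.8)
Sigma <- diag(c(0.04, 0.09, 0.16))

# Numeric contrast: conditionB minus conditionA.
contrast <- c(0, -1, 1)

# c' beta: the contrast estimate (a log2 fold change).
estimate <- sum(contrast * beta)

# sqrt(c' Sigma c): the standard error of the contrast.
se <- sqrt(drop(t(contrast) %*% Sigma %*% contrast))

# Test statistic as given above.
stat <- estimate / se

# Two-sided p-value under a standard normal reference distribution
# (an assumption of this sketch).
pvalue <- 2 * pnorm(abs(stat), lower.tail = FALSE)
```

With these toy numbers the estimate is 1.2, the standard error is 0.5, and the statistic is 2.4.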
The argument name can be used to generate results tables for
individual effects, which must be individual elements of
resultsNames(object). These individual effects could represent
continuous covariates, effects for individual levels, or individual
interaction effects.
Information on the comparison which was used to build the results
table, and the statistical test which was used for p-values (Wald
test or likelihood ratio test), is stored within the object returned
by results. This information is in the metadata columns of the
results table, which are accessible by calling mcols on the
DESeqResults object returned by results.
By default, independent filtering is performed to select a set of
genes for multiple test correction which will optimize the number of
adjusted p-values less than a given critical value alpha (by default
0.1). The adjusted p-values for the genes which do not pass the
filter threshold are set to NA. By default, the mean of normalized
counts is used to perform this filtering, though other statistics can
be provided. Several arguments from the filtered_p function of
genefilter are provided here to control or turn off the independent
filtering behavior.
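The idea behind independent filtering can be sketched in base R as follows. This is a simplified illustration on simulated data, not the code used by DESeq2 or genefilter; the simulated p-values, the filter statistic and the grid of quantiles in `theta` are all assumptions.

```r
set.seed(1)

# Simulated inputs (assumptions for illustration): 1000 genes, where
# genes with a low filter statistic carry uniform (null) p-values.
n <- 1000
filter_stat <- rexp(n, rate = 1 / 100)   # stand-in for mean normalized counts
is_signal <- filter_stat > 100 & runif(n) < 0.5
pvalue <- ifelse(is_signal, rbeta(n, 0.1, 5), runif(n))

alpha <- 0.1
theta <- seq(0, 0.95, by = 0.05)         # quantiles of the filter to try

# For each quantile, drop genes below the cutoff, adjust the remaining
# p-values with BH, and count how many fall below alpha.
num_rej <- sapply(theta, function(th) {
  cutoff <- quantile(filter_stat, th)
  keep <- filter_stat >= cutoff
  sum(p.adjust(pvalue[keep], method = "BH") < alpha)
})

# Choose the quantile that maximizes the number of rejections.
best <- which.max(num_rej)
keep <- filter_stat >= quantile(filter_stat, theta[best])

# Genes that fail the filter get an adjusted p-value of NA.
padj <- rep(NA_real_, n)
padj[keep] <- p.adjust(pvalue[keep], method = "BH")
```

The filter statistic is chosen without looking at the p-values themselves, which is the setting described in the Bourgon, Gentleman and Huber reference.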
In addition, results by default assigns a p-value of NA to genes
containing count outliers, as identified using Cook's distance. See
the cooksCutoff argument for control of this behavior. Cook's
distances for each sample are accessible as a matrix "cooks" stored
in the assays() list. This measure is useful for identifying rows
where the observed counts might not fit to a negative binomial
distribution.
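The default cutoff described for cooksCutoff, the .99 quantile of the F(p, m-p) distribution, can be computed in base R. The sketch below uses assumed values of m and p and a toy Cook's distance matrix; it ignores the exclusions noted in the cooksCutoff description and is not the package's implementation.

```r
# Assumed dimensions for illustration: m samples and p fitted coefficients.
m <- 12
p <- 2

# Default cutoff: the .99 quantile of the F(p, m - p) distribution.
cooks_cutoff <- qf(0.99, df1 = p, df2 = m - p)

# Toy Cook's distance matrix (rows = genes, columns = samples).
cooks <- matrix(0.1, nrow = 3, ncol = m,
                dimnames = list(c("gene1", "gene2", "gene3"), NULL))
cooks["gene2", 5] <- 50   # one extreme sample in gene2

# A row is flagged when any sample exceeds the cutoff.
max_cooks <- apply(cooks, 1, max)
flagged <- max_cooks > cooks_cutoff

# Flagged rows have their p-value set to NA.
pvalue <- c(gene1 = 0.01, gene2 = 0.002, gene3 = 0.5)
pvalue[flagged] <- NA
```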
For analyses using the likelihood ratio test (using nbinomLRT), the
p-values are determined solely by the difference in deviance between
the full and reduced model formula. A log2 fold change is included,
which can be controlled using the name argument, or by default this
will be the estimated coefficient for the last element of
resultsNames(object).
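How a p-value follows from a difference in deviance can be sketched in base R. The deviances and coefficient counts below are made-up values, and the sketch assumes the usual chi-squared reference distribution with degrees of freedom equal to the difference in the number of coefficients between the two models.

```r
# Toy deviances for one gene (assumed values for illustration).
deviance_full <- 120.3      # full model
deviance_reduced <- 131.8   # reduced model

# Assumed number of coefficients in each model.
n_coef_full <- 4
n_coef_reduced <- 2

# LRT statistic: the difference in deviance between the two models.
lrt_stat <- deviance_reduced - deviance_full
df <- n_coef_full - n_coef_reduced

# Upper-tail chi-squared p-value.
pvalue <- pchisq(lrt_stat, df = df, lower.tail = FALSE)
```

Note that the log2 fold change reported alongside such a p-value does not enter this calculation, which is why the choice of name changes the reported coefficient but not the p-value.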
For results: a DESeqResults object, which is a simple subclass of
DataFrame. This object contains the results columns: baseMean,
log2FoldChange, lfcSE, stat, pvalue and padj, and also includes
metadata columns of variable information.
For resultsNames: the names of the columns available as results,
usually a combination of the variable name and a level.
For removeResults: the original DESeqDataSet with results metadata
columns removed.
Richard Bourgon, Robert Gentleman, Wolfgang Huber: Independent filtering increases detection power for high-throughput experiments. PNAS (2010), http://dx.doi.org/10.1073/pnas.0914005107
# minimal example with simple two-group comparison
example("DESeq")
results(dds)
resultsNames(dds)
dds <- removeResults(dds)

# two conditions, two groups, with interaction term
dds <- makeExampleDESeqDataSet(n=100,m=12)
dds$group <- factor(rep(rep(c("X","Y"),each=3),2))
design(dds) <- ~ group + condition + group:condition
dds <- DESeq(dds)
resultsNames(dds)
results(dds, contrast=c("condition","B","A"))
results(dds, contrast=c("group","Y","X"))
# extract the interaction term simply with 'name'
results(dds, name="groupY.conditionB")

# two conditions, three groups, with interaction term
dds <- makeExampleDESeqDataSet(n=100,m=18)
dds$group <- factor(rep(rep(c("X","Y","Z"),each=3),2))
design(dds) <- ~ group + condition + group:condition
dds <- DESeq(dds)
resultsNames(dds)

# results tables for various comparisons:

# the condition effect over all groups
results(dds, contrast=c("condition","B","A"))
# which is equivalent to
results(dds, contrast=list("conditionB","conditionA"))
# which is equivalent to
results(dds, contrast=c(0, 0,0,0, -1,1, 0,0,0, 0,0,0))

# the group Z effect compared to the average of group X and Y
# here we use 'listValues' to multiply group X and
# group Y by -1/2 in the numeric contrast
results(dds, contrast=list("groupZ",c("groupX","groupY")),
        listValues=c(1,-1/2))

# the individual effect for group Z, compared to the intercept
results(dds, name="groupZ")

# the interaction effect of condition for group Z.
# if this term is non-zero, then group Z has a
# different condition effect than the overall condition effect
results(dds, contrast=list("groupZ.conditionB","groupZ.conditionA"))

# the condition effect for group Z:
# this is the sum of the main effect for condition
# and the interaction effect for group Z
results(dds, contrast=list(
  c("conditionB","groupZ.conditionB"),
  c("conditionA","groupZ.conditionA")))