results {DESeq2}    R Documentation

Extract results from a DESeq analysis

Description

results extracts results from a DESeq analysis, giving base means across samples, log2 fold changes, standard errors, test statistics, p-values and adjusted p-values; resultsNames returns the names of the estimated effects (coefficients) of the model; removeResults returns a DESeqDataSet object with results columns removed.

Usage

results(object, contrast, name, lfcThreshold = 0,
  altHypothesis = c("greaterAbs", "lessAbs", "greater", "less"),
  listValues = c(1, -1), cooksCutoff, independentFiltering = TRUE,
  alpha = 0.1, filter, theta, pAdjustMethod = "BH")

resultsNames(object)

removeResults(object)

Arguments

object

a DESeqDataSet, on which one of the following functions has already been called: DESeq, nbinomWaldTest, or nbinomLRT

contrast

this argument specifies which comparison to extract from the object to build a results table. It must be one of:

  • a character vector with exactly three elements: the name of a factor in the design formula, the name of the numerator level for the log2 fold change, and the name of the denominator level for the log2 fold change (most simple case)

  • a list of two character vectors: the names of the effects for the numerator, and the names of the effects for the denominator. These names should be elements of resultsNames(object). One list element can be the empty vector character(). (more general case, can be used to combine interaction terms and main effects)

  • a numeric contrast vector with one element for each element in resultsNames(object) (most general case)

If contrast is specified, the name argument is ignored.

name

the name of the individual effect (coefficient) for building a results table. Use this argument rather than contrast for continuous variables, individual effects or for individual interaction terms. The value provided to name must be an element of resultsNames(object).

lfcThreshold

a non-negative value, which specifies the test which should be applied to the log2 fold changes. The default is a test of whether the log2 fold changes are different from zero. However, log2 fold changes greater or less than lfcThreshold can also be tested. Specify the alternative hypothesis using the altHypothesis argument. If lfcThreshold is specified, the results are Wald tests, and LRT p-values will be overwritten.

altHypothesis

character which specifies the alternative hypothesis, i.e. those values of log2 fold change which the user is interested in finding. The complement of this set of values is the null hypothesis which will be tested. If the log2 fold change specified by name or by contrast is written as beta, then the possible values for altHypothesis represent the following alternative hypotheses:

  • greaterAbs - |beta| > lfcThreshold, and p-values are two-tailed

  • lessAbs - |beta| < lfcThreshold, and p-values are the maximum of the upper and lower tests. NOTE: this requires that betaPrior=FALSE has been specified in the previous DESeq call.

  • greater - beta > lfcThreshold

  • less - beta < -lfcThreshold
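
As a rough illustration of the four alternatives, the following base R sketch computes p-values from a log2 fold change estimate and its standard error, assuming a standard normal reference distribution. The helper name threshold_pvalue and all numbers are hypothetical; this is one way to realize the description above, not a copy of the package internals.

```r
# Hypothetical helper: p-value for a thresholded Wald-style test.
# beta: estimated log2 fold change; se: its standard error;
# threshold: plays the role of lfcThreshold (non-negative).
threshold_pvalue <- function(beta, se, threshold = 0,
                             alt = c("greaterAbs", "lessAbs", "greater", "less")) {
  alt <- match.arg(alt)
  upper <- function(z) pnorm(z, lower.tail = FALSE)  # upper-tail normal probability
  switch(alt,
    # two-tailed: evidence that |beta| exceeds the threshold
    greaterAbs = pmin(1, 2 * upper((abs(beta) - threshold) / se)),
    # maximum of the two one-sided tests that bound beta within (-threshold, threshold)
    lessAbs = pmax(upper((threshold - beta) / se), upper((beta + threshold) / se)),
    # one-sided: beta above the threshold
    greater = upper((beta - threshold) / se),
    # one-sided: beta below minus the threshold
    less = upper((-threshold - beta) / se)
  )
}

# Made-up estimates, for illustration only
p_abs  <- threshold_pvalue(beta = 2,   se = 0.5, threshold = 1, alt = "greaterAbs")
p_less <- threshold_pvalue(beta = 0.1, se = 0.2, threshold = 1, alt = "lessAbs")
```

With threshold = 0 and alt = "greaterAbs" the sketch reduces to the ordinary two-sided test of beta against zero.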

listValues

only used if a list is provided to contrast: a numeric vector of length two, giving the values to assign to the first and second elements of the list, which should be positive and negative, respectively, to specify the numerator and denominator. By default this is c(1,-1).
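
To make the relationship between a list contrast, listValues and a numeric contrast concrete, here is a minimal base R sketch. The helper list_to_numeric and the coefficient names are hypothetical; coef_names stands in for whatever resultsNames(object) would return.

```r
# Hypothetical helper: expand a list-style contrast into a numeric contrast vector.
# coef_names stands in for resultsNames(object).
list_to_numeric <- function(coef_names, contrast, list_values = c(1, -1)) {
  stopifnot(all(unlist(contrast) %in% coef_names))
  v <- setNames(numeric(length(coef_names)), coef_names)
  v[contrast[[1]]] <- list_values[1]  # numerator effects
  v[contrast[[2]]] <- list_values[2]  # denominator effects
  v
}

coef_names <- c("Intercept", "groupX", "groupY", "groupZ")  # made-up names
numeric_contrast <- list_to_numeric(coef_names,
                                    list("groupZ", c("groupX", "groupY")),
                                    list_values = c(1, -1/2))
```

Here groupZ receives weight 1 while groupX and groupY each receive -1/2, which compares group Z with the average of groups X and Y.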

cooksCutoff

threshold on Cook's distance, such that if one or more samples for a row have a distance higher than the threshold, the p-value for the row is set to NA. The default cutoff is the .99 quantile of the F(p, m-p) distribution, where p is the number of coefficients being fitted and m is the number of samples. Set to Inf or FALSE to disable the resetting of p-values to NA. Note: this test excludes the Cook's distance of samples whose removal would result in a rank-deficient design matrix and samples belonging to experimental groups with only 2 samples.
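
The default cutoff described above can be computed directly with base R. The sample count, coefficient count and Cook's distances below are made-up example values.

```r
# Default-style Cook's distance cutoff: the .99 quantile of F(p, m - p).
m <- 12  # number of samples (example value)
p <- 4   # number of fitted coefficients (example value)
cooks_cutoff <- qf(0.99, df1 = p, df2 = m - p)

# Made-up Cook's distances for one row (gene), one value per sample.
cooks_row <- c(0.1, 0.3, 0.2, 25, 0.4, 0.1, 0.2, 0.3, 0.1, 0.2, 0.5, 0.3)

# TRUE means at least one sample exceeds the cutoff,
# so the p-value for this row would be set to NA.
flag_row <- any(cooks_row > cooks_cutoff)
```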

independentFiltering

logical, whether independent filtering should be applied automatically

alpha

the significance cutoff used for optimizing the independent filtering

filter

the vector of filter statistics over which the independent filtering will be optimized. By default the mean of normalized counts is used.

theta

the quantiles at which to assess the number of rejections from independent filtering

pAdjustMethod

the method to use for adjusting p-values, see ?p.adjust
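
A small base R example of the default "BH" adjustment, using made-up p-values. Note that p.adjust ignores NA entries when counting the number of tests, which matters because rows flagged as outliers carry a p-value of NA.

```r
# Made-up p-values; geneB mimics a row whose p-value was set to NA.
pvals <- c(geneA = 0.01, geneB = NA, geneC = 0.04)

# Benjamini-Hochberg adjustment over the two non-missing p-values.
padj <- p.adjust(pvals, method = "BH")
# geneA: 0.02, geneB: NA, geneC: 0.04
```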

Details

Multiple results can be returned for analyses beyond a simple two group comparison, so results takes arguments contrast and name to help the user pick out the comparison of interest for printing the results table. If results is run without specifying contrast or name, it will return the comparison of the last level of the last variable in the design formula over the first level of this variable. For example, for a simple two-group comparison, this would return the log2 fold changes of the second group over the first group (the base level). Please see examples below and in the vignette.
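
Because the default comparison depends on the order of factor levels, it can help to inspect and, if needed, change the base level before running the analysis. The base R sketch below uses a made-up condition factor that is not attached to any DESeqDataSet.

```r
# Made-up sample annotation.
condition <- factor(c("treated", "control", "treated", "control"))

# Levels are sorted alphabetically by default, so "control" is the
# first (base) level and "treated" is the last level.
levels(condition)

# Make "treated" the base level instead.
condition_releveled <- relevel(condition, ref = "treated")
levels(condition_releveled)
```

With the original factor, the default comparison would be treated over control; after releveling, it would be control over treated.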

The argument contrast can be used to generate results tables for any comparison of interest, for example, the log2 fold change between two levels of a factor, and its usage is described below. It can also accommodate more complicated numeric comparisons. The test statistic used for a contrast is:

c' beta / sqrt( c' Sigma c )
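
The statistic can be evaluated with a few lines of base R. In this sketch c is the numeric contrast vector, beta the vector of estimated coefficients and Sigma their covariance matrix; all values and coefficient names are made up, and the two-sided p-value assumes a standard normal reference distribution.

```r
# Made-up coefficient estimates and covariance matrix for one gene.
beta  <- c(Intercept = 5, conditionA = 0.5, conditionB = 2)
Sigma <- diag(c(0.04, 0.09, 0.16))

# Contrast: conditionB minus conditionA.
contrast <- c(0, -1, 1)

# c' beta / sqrt(c' Sigma c)
numerator   <- sum(contrast * beta)
denominator <- sqrt(as.numeric(t(contrast) %*% Sigma %*% contrast))
stat <- numerator / denominator

# Two-sided p-value under the normal assumption stated above.
pvalue <- 2 * pnorm(abs(stat), lower.tail = FALSE)
```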

The argument name can be used to generate results tables for individual effects, which must be individual elements of resultsNames(object). These individual effects could represent continuous covariates, effects for individual levels, or individual interaction effects.

Information on the comparison which was used to build the results table, and the statistical test which was used for p-values (Wald test or likelihood ratio test) is stored within the object returned by results. This information is in the metadata columns of the results table, which is accessible by calling mcols on the DESeqResults object returned by results.

By default, independent filtering is performed to select a set of genes for multiple test correction which will optimize the number of adjusted p-values less than a given critical value alpha (by default 0.1). The adjusted p-values for the genes which do not pass the filter threshold are set to NA. By default, the mean of normalized counts is used to perform this filtering, though other statistics can be provided. Several arguments from the filtered_p function of genefilter are provided here to control or turn off the independent filtering behavior.
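
The idea behind independent filtering can be sketched in base R as follows. This is a simplified illustration on simulated data, not the filtered_p implementation from genefilter: it tries a grid of filter quantiles (the role of theta), counts the rejections at level alpha after BH adjustment of the surviving genes, and keeps the quantile with the most rejections. All variable names and data are made up.

```r
set.seed(1)
n_genes <- 1000

# Simulated filter statistic (stand-in for the mean of normalized counts)
# and p-values; genes with a large filter statistic tend to have small p-values.
filter_stat <- rexp(n_genes)
pvals <- ifelse(filter_stat > 2, rbeta(n_genes, 1, 50), runif(n_genes))

alpha <- 0.1
theta <- seq(0, 0.95, by = 0.05)  # candidate quantiles of the filter statistic

# Number of BH-adjusted p-values below alpha at each candidate threshold.
rejections <- sapply(theta, function(th) {
  keep <- filter_stat >= quantile(filter_stat, th)
  sum(p.adjust(pvals[keep], method = "BH") < alpha)
})

# Keep the threshold with the most rejections; filtered genes get padj = NA.
best_theta <- theta[which.max(rejections)]
keep <- filter_stat >= quantile(filter_stat, best_theta)
padj <- rep(NA_real_, n_genes)
padj[keep] <- p.adjust(pvals[keep], method = "BH")
```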

In addition, results by default assigns a p-value of NA to genes containing count outliers, as identified using Cook's distance. See the cooksCutoff argument for control of this behavior. Cook's distances for each sample are accessible as a matrix "cooks" stored in the assays() list. This measure is useful for identifying rows where the observed counts might not fit a negative binomial distribution.

For analyses using the likelihood ratio test (using nbinomLRT), the p-values are determined solely by the difference in deviance between the full and reduced model formula. A log2 fold change is included, which can be controlled using the name argument, or by default this will be the estimated coefficient for the last element of resultsNames(object).
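
As a minimal sketch of how a p-value follows from a difference in deviance, the base R snippet below assumes the usual chi-squared reference distribution, with degrees of freedom equal to the number of parameters dropped in the reduced model. The deviance values are made up.

```r
dev_full    <- 210.4  # made-up deviance of the full model
dev_reduced <- 218.9  # made-up deviance of the reduced model
df <- 1               # parameters in full model minus parameters in reduced model

# Likelihood ratio statistic and its upper-tail chi-squared p-value.
lrt_stat   <- dev_reduced - dev_full
lrt_pvalue <- pchisq(lrt_stat, df = df, lower.tail = FALSE)
```

The p-value depends only on lrt_stat and df, which is why the reported log2 fold change plays no role in the test itself.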

Value

For results: a DESeqResults object, which is a simple subclass of DataFrame. This object contains the results columns: baseMean, log2FoldChange, lfcSE, stat, pvalue and padj, and also includes metadata columns of variable information.

For resultsNames: the names of the columns available as results, usually a combination of the variable name and a level

For removeResults: the original DESeqDataSet with results metadata columns removed

References

Richard Bourgon, Robert Gentleman, Wolfgang Huber: Independent filtering increases detection power for high-throughput experiments. PNAS (2010), http://dx.doi.org/10.1073/pnas.0914005107

See Also

DESeq

Examples

# minimal example with simple two-group comparison
example("DESeq")
results(dds)
resultsNames(dds)
dds <- removeResults(dds)

# two conditions, two groups, with interaction term
dds <- makeExampleDESeqDataSet(n=100,m=12)
dds$group <- factor(rep(rep(c("X","Y"),each=3),2))
design(dds) <- ~ group + condition + group:condition
dds <- DESeq(dds)
resultsNames(dds)
results(dds, contrast=c("condition","B","A"))
results(dds, contrast=c("group","Y","X"))
# extract the interaction term simply with 'name'
results(dds, name="groupY.conditionB")

# two conditions, three groups, with interaction term
dds <- makeExampleDESeqDataSet(n=100,m=18)
dds$group <- factor(rep(rep(c("X","Y","Z"),each=3),2))
design(dds) <- ~ group + condition + group:condition
dds <- DESeq(dds)
resultsNames(dds)

# results tables for various comparisons:

# the condition effect over all groups
results(dds, contrast=c("condition","B","A"))
# which is equivalent to
results(dds, contrast=list("conditionB","conditionA"))
# which is equivalent to
results(dds, contrast=c(0, 0,0,0, -1,1, 0,0,0, 0,0,0))

# the group Z effect compared to the average of group X and Y
# here we use 'listValues' to multiply group X and
# group Y by -1/2 in the numeric contrast
results(dds, contrast=list("groupZ",c("groupX","groupY")), listValues=c(1,-1/2))

# the individual effect for group Z, compared to the intercept
results(dds, name="groupZ")

# the interaction effect of condition for group Z.
# if this term is non-zero, then group Z has a
# different condition effect than the overall condition effect
results(dds, contrast=list("groupZ.conditionB","groupZ.conditionA"))

# the condition effect for group Z:
# this is the sum of the main effect for condition
# and the interaction effect for group Z
results(dds, contrast=list(
               c("conditionB","groupZ.conditionB"),
               c("conditionA","groupZ.conditionA")))

[Package DESeq2 version 1.4.5 Index]