LocationTests {coin} | R Documentation |
Testing the equality of the distributions of a numeric response in two or more independent groups against shift alternatives.
## S3 method for class 'formula'
oneway_test(formula, data, subset = NULL, weights = NULL, ...)
## S3 method for class 'IndependenceProblem'
oneway_test(object, ...)

## S3 method for class 'formula'
wilcox_test(formula, data, subset = NULL, weights = NULL, ...)
## S3 method for class 'IndependenceProblem'
wilcox_test(object, conf.int = FALSE, conf.level = 0.95, ...)

## S3 method for class 'formula'
normal_test(formula, data, subset = NULL, weights = NULL, ...)
## S3 method for class 'IndependenceProblem'
normal_test(object, ties.method = c("mid-ranks", "average-scores"),
            conf.int = FALSE, conf.level = 0.95, ...)

## S3 method for class 'formula'
median_test(formula, data, subset = NULL, weights = NULL, ...)
## S3 method for class 'IndependenceProblem'
median_test(object, conf.int = FALSE, conf.level = 0.95, ...)

## S3 method for class 'formula'
kruskal_test(formula, data, subset = NULL, weights = NULL, ...)
## S3 method for class 'IndependenceProblem'
kruskal_test(object, distribution = c("asymptotic", "approximate"), ...)
formula |
a formula of the form |
data |
an optional data frame containing the variables in the model formula. |
subset |
an optional vector specifying a subset of observations to be used. |
weights |
an optional formula of the form |
object |
an object of class |
distribution |
a character, the null distribution of the test statistic
can be computed |
ties.method |
a character, two methods are available to adjust scores for ties,
either the score generating function is applied to |
conf.int |
a logical indicating whether a confidence interval for the difference in location should be computed. |
conf.level |
confidence level of the interval. |
... |
further arguments to be passed to or from methods. |
The null hypothesis of the equality of the distribution of y in the groups given by x is tested. In particular, the methods documented here are designed to detect shift alternatives. For a general description of the test procedures documented here we refer to Hollander & Wolfe (1999).
The test procedures apply a rank transformation to the response values y, except for oneway_test, which computes a test statistic using the untransformed response values.
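The difference between a rank-based and an untransformed statistic can be illustrated in base R. This is a minimal sketch on hypothetical toy data (the vectors y and x below are not taken from the references), and it is not coin's internal implementation; it only shows a linear statistic of the form "sum of scores in one group", computed once on ranks and once on the raw response.

```r
# Hypothetical toy data: a numeric response in two groups, no ties.
y <- c(1.2, 3.4, 2.2, 5.1, 4.0, 0.7)
x <- factor(c("a", "a", "a", "b", "b", "b"))

# Rank transformation of the response, as used by the rank-based procedures.
r <- rank(y)

# Statistic on ranks: sum of the ranks in group "a" (Wilcoxon rank sum).
rank_sum_a <- sum(r[x == "a"])

# Statistic on the untransformed response: sum of the raw values in group "a".
raw_sum_a <- sum(y[x == "a"])

print(c(rank_sum = rank_sum_a, raw_sum = raw_sum_a))
```

The rank-based statistic depends only on the ordering of the observations, whereas the raw sum changes with the magnitude of each value.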
The asymptotic null distribution is computed by default for all procedures. Exact p-values may be computed for the two-sample problems and can be approximated via Monte Carlo resampling for all procedures. Exact p-values are computed either by the shift algorithm (Streitberg & Röhmel, 1986, 1987) or by the split-up algorithm (van de Wiel, 2001).
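The idea behind a Monte Carlo approximation of a permutation p-value can be sketched in base R. The helper perm_pvalue and its argument nresample are hypothetical names introduced for illustration, the data are made up, and the "+1" correction in the estimate is one common convention that is assumed here; coin's own resampling code may differ in these details.

```r
set.seed(1)

# Hypothetical helper: two-sided Monte Carlo permutation p-value for the
# absolute difference in group means (two groups only).
perm_pvalue <- function(y, g, nresample = 999) {
  first <- levels(g)[1]
  stat <- function(g) abs(mean(y[g == first]) - mean(y[g != first]))
  observed <- stat(g)
  # Re-evaluate the statistic under random relabelling of the groups.
  permuted <- replicate(nresample, stat(sample(g)))
  # Assumed convention: count the observed statistic as one permutation.
  (1 + sum(permuted >= observed)) / (nresample + 1)
}

# Hypothetical data: two clearly separated groups of size 10.
y <- c(1:10, 101:110)
g <- factor(rep(c("a", "b"), each = 10))

p <- perm_pvalue(y, g)
print(p)
```

Increasing nresample reduces the Monte Carlo error of the estimate at the cost of computing time.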
The linear rank tests for two samples (wilcox_test, normal_test and median_test) can be used to test the two-sided hypothesis H_0: Y_1 - Y_2 = 0, where Y_i is the median of the responses in the ith group. Confidence intervals for the difference in location are available for the rank-based procedures and are computed according to Bauer (1972). If alternative = "less", the null hypothesis H_0: Y_1 - Y_2 ≥ 0 is tested; alternative = "greater" corresponds to the null hypothesis H_0: Y_1 - Y_2 ≤ 0.
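A rank-based estimate of the difference in location can be illustrated with base R's stats::wilcox.test, whose documentation also cites Bauer (1972). This sketch uses the water transfer data from the examples on this page; it shows the classical construction from pairwise differences and is not a claim about how coin computes its intervals.

```r
# Water transfer data as listed in the examples on this page.
at_term     <- c(0.80, 0.83, 1.89, 1.04, 1.45, 1.38, 1.91, 1.64, 0.73, 1.46)
weeks_12_26 <- c(1.15, 0.88, 0.90, 0.74, 1.21)

# All pairwise differences (At term - 12-26 Weeks).
diffs <- as.vector(outer(at_term, weeks_12_26, "-"))

# Hodges-Lehmann estimate of the shift: the median of the pairwise differences.
hl <- median(diffs)

# Base R's exact Wilcoxon rank sum test with a 95% confidence interval.
wt_base <- wilcox.test(at_term, weeks_12_26, exact = TRUE,
                       conf.int = TRUE, conf.level = 0.95)
ci <- as.numeric(wt_base$conf.int)

print(hl)
print(ci)
```

The interval endpoints are order statistics of the pairwise differences, so the interval always contains the point estimate.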
If x is an ordered factor, kruskal_test computes the linear-by-linear association test for ordered alternatives.
For the adjustment of scores for tied values see Hajek, Sidak and Sen (1999), page 131ff.
An object inheriting from class IndependenceTest-class with methods show, statistic, expectation, covariance and pvalue. The null distribution can be inspected by pperm, dperm, qperm and support methods. Confidence intervals can be extracted by confint.
Myles Hollander & Douglas A. Wolfe (1999). Nonparametric Statistical Methods, 2nd Edition. New York: John Wiley & Sons.
Bernd Streitberg & Joachim Röhmel (1986). Exact distributions for permutations and rank tests: An introduction to some recently published algorithms. Statistical Software Newsletter 12(1), 10–17.
Bernd Streitberg & Joachim Röhmel (1987). Exakte Verteilungen für Rang- und Randomisierungstests im allgemeinen c-Stichprobenfall. EDV in Medizin und Biologie 18(1), 12–19.
Mark A. van de Wiel (2001). The split-up algorithm: a fast symbolic method for computing p-values of rank statistics. Computational Statistics 16, 519–538.
David F. Bauer (1972). Constructing confidence sets using rank statistics. Journal of the American Statistical Association 67, 687–690.
Jaroslav Hajek, Zbynek Sidak & Pranab K. Sen (1999). Theory of Rank Tests. San Diego, London: Academic Press.
library("coin")

### Tritiated Water Diffusion Across Human Chorioamnion
### Hollander & Wolfe (1999), Table 4.1, page 110
water_transfer <- data.frame(
    pd = c(0.80, 0.83, 1.89, 1.04, 1.45, 1.38, 1.91, 1.64, 0.73, 1.46,
           1.15, 0.88, 0.90, 0.74, 1.21),
    age = factor(c(rep("At term", 10), rep("12-26 Weeks", 5))))

### Wilcoxon-Mann-Whitney test, cf. Hollander & Wolfe (1999), page 111
### exact p-value and confidence interval for the difference in location
### (At term - 12-26 Weeks)
wt <- wilcox_test(pd ~ age, data = water_transfer,
                  distribution = "exact", conf.int = TRUE)
print(wt)

### extract observed Wilcoxon statistic, i.e., the sum of the
### ranks for age = "12-26 Weeks"
statistic(wt, "linear")

### its expectation
expectation(wt)

### and variance
covariance(wt)

### and the exact two-sided p-value
pvalue(wt)

### and, finally, the confidence interval
confint(wt)

### Confidence interval for difference (12-26 Weeks - At term)
wilcox_test(pd ~ age, data = water_transfer,
    xtrafo = function(data)
        trafo(data, factor_trafo = function(x)
            as.numeric(x == levels(x)[2])),
    distribution = "exact", conf.int = TRUE)

### Permutation test, asymptotic p-value
oneway_test(pd ~ age, data = water_transfer)

### approximate p-value (with 99% confidence interval)
pvalue(oneway_test(pd ~ age, data = water_transfer,
                   distribution = approximate(B = 9999)))

### exact p-value
pt <- oneway_test(pd ~ age, data = water_transfer, distribution = "exact")
pvalue(pt)

### plot density and distribution of the standardized
### test statistic
layout(matrix(1:2, nrow = 2))
s <- support(pt)
d <- sapply(s, function(x) dperm(pt, x))
p <- sapply(s, function(x) pperm(pt, x))
plot(s, d, type = "S", xlab = "Test statistic", ylab = "Density")
plot(s, p, type = "S", xlab = "Test statistic", ylab = "Cum. Probability")

### Length of YOY Gizzard Shad from Kokosing Lake, Ohio,
### sampled in Summer 1984, Hollander & Wolfe (1999), Table 6.3, page 200
YOY <- data.frame(
    length = c(46, 28, 46, 37, 32, 41, 42, 45, 38, 44,
               42, 60, 32, 42, 45, 58, 27, 51, 42, 52,
               38, 33, 26, 25, 28, 28, 26, 27, 27, 27,
               31, 30, 27, 29, 30, 25, 25, 24, 27, 30),
    site = factor(c(rep("I", 10), rep("II", 10),
                    rep("III", 10), rep("IV", 10))))

### Kruskal-Wallis test, approximate exact p-value
kw <- kruskal_test(length ~ site, data = YOY,
                   distribution = approximate(B = 9999))
kw
pvalue(kw)

### Nemenyi-Damico-Wolfe-Dunn test (joint ranking)
### Hollander & Wolfe (1999), page 244
### (where Steel-Dwass results are given)
if (require("multcomp")) {

    NDWD <- oneway_test(length ~ site, data = YOY,
        ytrafo = function(data) trafo(data, numeric_trafo = rank),
        xtrafo = function(data) trafo(data, factor_trafo = function(x)
            model.matrix(~x - 1) %*% t(contrMat(table(x), "Tukey"))),
        teststat = "max", distribution = approximate(B = 90000))

    ### global p-value
    print(pvalue(NDWD))

    ### sites (I = II) != (III = IV) at alpha = 0.01 (page 244)
    print(pvalue(NDWD, method = "single-step"))
}