MaxstatTest {coin}    R Documentation

Maximally Selected Statistics

Description

Testing the independence of a set of ordered or numeric covariates and a response of arbitrary measurement scale against cutpoint alternatives.

Usage

## S3 method for class 'formula'
maxstat_test(formula, data, subset = NULL, weights = NULL, ...)
## S3 method for class 'IndependenceProblem'
maxstat_test(object, 
    distribution = c("asymptotic", "approximate"), 
    teststat = c("max", "quad"),
    minprob = 0.1, maxprob = 1 - minprob, ...)

Arguments

formula

a formula of the form y ~ x1 + ... + xp | block where y and covariates x1 to xp can be variables measured at arbitrary scales; block is an optional factor for stratification.

data

an optional data frame containing the variables in the model formula.

subset

an optional vector specifying a subset of observations to be used.

weights

an optional formula of the form ~ w defining integer valued weights for the observations.

object

an object inheriting from class IndependenceProblem.

distribution

a character string specifying how the null distribution of the test statistic is approximated: by its asymptotic distribution ("asymptotic") or via Monte Carlo resampling ("approximate"). Alternatively, the functions approximate or asymptotic can be used to specify how the exact conditional distribution of the test statistic should be calculated or approximated.

teststat

a character string specifying the type of test statistic to be applied: a maximum-type statistic ("max") or a quadratic form ("quad").

minprob

a fraction between 0 and 0.5; consider only cutpoints greater than the minprob * 100 % quantile of x.

maxprob

a fraction between 0.5 and 1; consider only cutpoints smaller than the maxprob * 100 % quantile of x.

...

further arguments to be passed to or from methods.
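
The effect of minprob and maxprob on the set of candidate cutpoints can be illustrated with a small base R sketch. The helper candidate_cutpoints is hypothetical and not part of coin; it assumes empirical quantiles of type 1 and strict inequalities, so the exact rule used inside the package may differ in detail.

```r
# Sketch only: restrict candidate cutpoints to the quantile range implied by
# minprob and maxprob. `candidate_cutpoints` is a hypothetical helper, not a
# coin function; coin's internal rule may differ in detail.
candidate_cutpoints <- function(x, minprob = 0.1, maxprob = 1 - minprob) {
  stopifnot(minprob > 0, minprob <= 0.5, maxprob >= 0.5, maxprob < 1)
  ux <- sort(unique(x))
  ux <- ux[-length(ux)]  # a cut at the maximum would leave one group empty
  # empirical quantiles; type = 1 returns observed values of x
  lo <- quantile(x, probs = minprob, type = 1, names = FALSE)
  hi <- quantile(x, probs = maxprob, type = 1, names = FALSE)
  ux[ux > lo & ux < hi]
}

x <- 1:20
cp <- candidate_cutpoints(x)
print(cp)
```

Smaller values of minprob admit cutpoints closer to the extremes of x, at the price of very unbalanced groups.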

Details

The null hypothesis of independence between all covariates and the response y is tested against simple cutpoint alternatives.
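
As a rough illustration of what a maximally selected statistic is, the following base R sketch standardizes a two-sample sum statistic at each candidate cutpoint, using its permutation mean and variance, and keeps the largest absolute value. The helper max_selected_stat and the simulated data are assumptions made for this example; the sketch covers a single numeric covariate and a numeric response only and is not the package's implementation.

```r
# Sketch only: maximum absolute standardized two-sample statistic over a set
# of cutpoints. `max_selected_stat` is a hypothetical helper, not coin's code.
max_selected_stat <- function(x, y, cutpoints) {
  n <- length(y)
  ss <- sum((y - mean(y))^2)
  z <- vapply(cutpoints, function(cp) {
    left <- x <= cp
    n1 <- sum(left)
    stat <- sum(y[left])                          # linear statistic for this split
    mu <- n1 * mean(y)                            # permutation expectation
    sigma2 <- n1 * (n - n1) / (n * (n - 1)) * ss  # permutation variance
    (stat - mu) / sqrt(sigma2)
  }, numeric(1))
  best <- which.max(abs(z))
  list(statistic = abs(z[best]), cutpoint = cutpoints[best])
}

set.seed(1)
x <- 1:40
y <- c(rnorm(20, mean = 0), rnorm(20, mean = 3))  # simulated shift after x = 20
res <- max_selected_stat(x, y, cutpoints = 5:35)
print(res)
```

Because the maximum is taken over many correlated statistics, its null distribution is not that of a single two-sample statistic, which is why a dedicated reference distribution is needed.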

For an unordered covariate x, all possible partitions into two groups are evaluated. The cutpoint is then a set of levels defining one of the two groups.
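
To make the number of candidate splits for an unordered covariate concrete, the sketch below enumerates all partitions of a set of levels into two non-empty groups. The helper binary_partitions is hypothetical and only illustrates the combinatorics; it is not how coin represents the partitions internally.

```r
# Sketch only: enumerate every split of a factor's levels into two non-empty
# groups. `binary_partitions` is a hypothetical helper, not a coin function.
binary_partitions <- function(lev) {
  k <- length(lev)
  stopifnot(k >= 2)
  # Fix the first level in the left group so each partition is listed once,
  # then let each remaining level go left (TRUE) or right (FALSE).
  grid <- expand.grid(rep(list(c(TRUE, FALSE)), k - 1))
  grid <- grid[rowSums(grid) < k - 1, , drop = FALSE]  # right group non-empty
  lapply(seq_len(nrow(grid)), function(i) {
    lev[c(TRUE, unlist(grid[i, ], use.names = FALSE))]
  })
}

parts <- binary_partitions(c("a", "b", "c", "d"))
length(parts)  # 2^(4 - 1) - 1 = 7 candidate splits
```

The count grows as 2^(k - 1) - 1 in the number of levels k, so covariates with many levels lead to a large number of evaluated partitions.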

Value

An object inheriting from class IndependenceTest with methods show, statistic, expectation, covariance and pvalue. The null distribution can be inspected by the pperm, dperm, qperm and support methods.
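
A minimal sketch of querying the returned object with two of the accessor methods named above. It assumes the coin package and its treepipit data are installed, and is guarded so that it does nothing otherwise.

```r
# Sketch: query a fitted test object with the statistic and pvalue methods.
# Guarded so the script is a no-op when the 'coin' package is not installed.
if (requireNamespace("coin", quietly = TRUE)) {
  library("coin")
  data("treepipit", package = "coin")
  mt <- maxstat_test(counts ~ coverstorey, data = treepipit)
  stat <- statistic(mt)  # observed test statistic
  pval <- pvalue(mt)     # p-value from the default (asymptotic) distribution
  print(stat)
  print(pval)
}
```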

References

Rupert Miller & David Siegmund (1982). Maximally Selected Chi Square Statistics. Biometrics 38, 1011–1016.

Berthold Lausen & Martin Schumacher (1992). Maximally Selected Rank Statistics. Biometrics 48, 73–85.

Torsten Hothorn & Berthold Lausen (2003). On the Exact Distribution of Maximally Selected Rank Statistics. Computational Statistics & Data Analysis 43, 121–137.

Berthold Lausen, Torsten Hothorn, Frank Bretz & Martin Schumacher (2004). Optimally Selected Prognostic Factors. Biometrical Journal 46, 364–374.

Jörg Müller & Torsten Hothorn (2004). Maximally Selected Two-Sample Statistics as a new Tool for the Identification and Assessment of Habitat Factors with an Application to Breeding Bird Communities in Oak Forests. European Journal of Forest Research 123, 218–228.

Torsten Hothorn & Achim Zeileis (2008). Generalized Maximally Selected Statistics. Biometrics 64(4), 1263–1269.

Examples


  ### analysis of the tree pipit data in Mueller and Hothorn (2004)
  maxstat_test(counts ~ coverstorey, data = treepipit)

  ### and for all possible covariates (simultaneously)
  mt <- maxstat_test(counts ~ ., data = treepipit)
  show(mt)$estimate

  ### reproduce applications in Sections 7.2 and 7.3 
  ### of Hothorn & Lausen (2003) with limiting distribution

  maxstat_test(Surv(time, event) ~  EF, data = hohnloser, 
      ytrafo = function(data) trafo(data, surv_trafo = function(x) 
         logrank_trafo(x, ties = "HL")))

  data("sphase", package = "TH.data")
  maxstat_test(Surv(RFS, event) ~  SPF, data = sphase,
      ytrafo = function(data) trafo(data, surv_trafo = function(x)
         logrank_trafo(x, ties = "HL")))


[Package coin version 1.0-24 Index]