ContingencyTests {coin}    R Documentation

Independence in Three-Way Contingency Tables

Description

Testing the independence of two possibly ordered factors, optionally stratified by a third factor.

Usage


## S3 method for class 'formula'
cmh_test(formula, data, subset = NULL, weights = NULL, ...)
## S3 method for class 'table'
cmh_test(object, distribution = c("asymptotic", "approximate"), ...)
## S3 method for class 'IndependenceProblem'
cmh_test(object, distribution = c("asymptotic", "approximate"), ...)

## S3 method for class 'formula'
chisq_test(formula, data, subset = NULL, weights = NULL, ...)
## S3 method for class 'table'
chisq_test(object, distribution = c("asymptotic", "approximate"), ...)
## S3 method for class 'IndependenceProblem'
chisq_test(object, distribution = c("asymptotic", "approximate"), ...)

## S3 method for class 'formula'
lbl_test(formula, data, subset = NULL, weights = NULL, ...)
## S3 method for class 'table'
lbl_test(object, distribution = c("asymptotic", "approximate"), ...)
## S3 method for class 'IndependenceProblem'
lbl_test(object, distribution = c("asymptotic", "approximate"), ...)

Arguments

formula

a formula of the form y ~ x | block where y and x are factors (possibly ordered) and block is an optional factor for stratification.

data

an optional data frame containing the variables in the model formula.

subset

an optional vector specifying a subset of observations to be used.

weights

an optional formula of the form ~ w defining integer-valued weights for the observations.

object

an object inheriting from class "IndependenceProblem" or an object of class table.

distribution

a character string specifying how the null distribution of the test statistic is approximated: by its asymptotic distribution ("asymptotic") or via Monte Carlo resampling ("approximate"). Alternatively, the functions approximate or asymptotic can be used to specify how the exact conditional distribution of the test statistic should be calculated or approximated.

...

further arguments to be passed to or from methods.
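
As a minimal sketch of how these arguments fit together, the following call uses the formula interface with a stratification factor and integer-valued weights. The data frame dat and its columns are hypothetical and only illustrate the expected layout (one row per cell, with w holding the cell count); they do not come from the package.

```r
library(coin)

# Hypothetical aggregated data: one row per cell, 'w' is the cell count.
dat <- data.frame(
  y     = factor(rep(c("no", "yes"), times = 4)),
  x     = factor(rep(rep(c("a", "b"), each = 2), times = 2)),
  block = factor(rep(c("s1", "s2"), each = 4)),
  w     = c(10L, 5L, 4L, 12L, 8L, 6L, 3L, 9L)
)

# Asymptotic Cochran-Mantel-Haenszel test of y and x, stratified by block
ct <- cmh_test(y ~ x | block, data = dat, weights = ~ w)
ct

# The same test with the null distribution approximated by Monte Carlo
# resampling (seed set only to make the sketch reproducible)
set.seed(1)
ct_mc <- cmh_test(y ~ x | block, data = dat, weights = ~ w,
                  distribution = "approximate")
ct_mc
```

Passing the character string "approximate" uses the default number of Monte Carlo replications; the approximate function can be supplied instead when that number needs to be controlled.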

Details

The null hypothesis of independence of y and x is tested; block defines an optional factor for stratification. chisq_test implements Pearson's chi-squared test, cmh_test the Cochran-Mantel-Haenszel test, and lbl_test the linear-by-linear association test for ordered data.

If either x or y is an ordered factor, all of the procedures perform the corresponding linear-by-linear association test. lbl_test always coerces factors to class ordered. The default scores are 1:nlevels(x) and 1:nlevels(y), respectively. They can be changed via the scores argument (see independence_test); for example, scores = list(y = 1:3, x = c(1, 4, 6)) first coerces both variables to class ordered and then attaches the list elements as scores to the corresponding factors. The length of a score vector must equal the number of levels of the corresponding factor.
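
To illustrate what a linear-by-linear association statistic measures, the sketch below computes, in base R only, the form M^2 = (n - 1) r^2 given by Agresti (2002), where r is the Pearson correlation between the row and column scores. It assumes an unstratified table and an asymptotic chi-squared reference distribution with one degree of freedom; whether this agrees exactly with the quadratic-form statistic reported by the package is an assumption, not something stated on this page. The counts are the lung tumor data used in the Examples section, and the names lt, r and M2 are illustrative.

```r
# Lung tumor data: dose scores 0, 1, 2 and a binary tumor indicator
lt <- data.frame(dose  = rep(c(0, 1, 2), c(40, 50, 48)),
                 tumor = c(rep(c(0, 1), c(38, 2)),
                           rep(c(0, 1), c(43, 7)),
                           rep(c(0, 1), c(33, 15))))

n  <- nrow(lt)
r  <- cor(lt$dose, lt$tumor)   # correlation between the assigned scores
M2 <- (n - 1) * r^2            # Agresti's M^2 statistic

# Asymptotic p-value from the chi-squared distribution with 1 df
p  <- pchisq(M2, df = 1, lower.tail = FALSE)
c(M2 = M2, p.value = p)
```

Changing the scores (for example, replacing the dose values 0, 1, 2 by unequally spaced values) changes r and therefore the statistic, which is why the choice of scores matters.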

The authoritative source for details on the documented test procedures is Agresti (2002).

Value

An object inheriting from class IndependenceTest, with methods show, statistic, expectation, covariance and pvalue. The null distribution can be inspected with the pperm, dperm, qperm and support methods.
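
A short sketch of how the returned object might be queried with the methods listed above. The 2 x 2 table tab is hypothetical and serves only to produce an object to inspect.

```r
library(coin)

# Hypothetical 2 x 2 table of counts
tab <- as.table(matrix(c(12, 5, 7, 9), nrow = 2,
                       dimnames = list(exposure = c("low", "high"),
                                       outcome  = c("no", "yes"))))

ct <- chisq_test(tab)

statistic(ct)     # value of the test statistic
pvalue(ct)        # p-value under the chosen null distribution
expectation(ct)   # conditional expectation of the linear statistic
covariance(ct)    # conditional covariance of the linear statistic
```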

References

Alan Agresti (2002), Categorical Data Analysis. Hoboken, New Jersey: John Wiley & Sons.

Examples




  ### for females only
  chisq_test(as.table(jobsatisfaction[,,"Female"]), 
      distribution = approximate(B = 9999))

  ### both Income and Job.Satisfaction unordered
  cmh_test(jobsatisfaction)

  ### both Income and Job.Satisfaction ordered, default scores
  lbl_test(jobsatisfaction)

  ### both Income and Job.Satisfaction ordered, alternative scores
  lbl_test(jobsatisfaction, scores = list(Job.Satisfaction = c(1, 3, 4, 5),
                                          Income = c(3, 10, 20, 35)))

  ### the same, null distribution approximated
  cmh_test(jobsatisfaction, scores = list(Job.Satisfaction = c(1, 3, 4, 5),
                                        Income = c(3, 10, 20, 35)),
           distribution = approximate(B = 10000))

  ### Smoking and HDL cholesterol status
  ### (from Jeong, Jhun and Kim, 2005, CSDA 48, 623-631, Table 2)
  smokingHDL <- as.table(
      matrix(c(15,  8, 11,  5, 
                3,  4,  6,  1, 
                6,  7, 15, 11, 
                1,  2,  3,  5), ncol = 4,
             dimnames = list(smoking = c("none", "< 5", "< 10", ">=10"), 
                             HDL = c("normal", "low", "borderline", "abnormal"))
  ))
  ### use interval mid-points as scores for smoking
  lbl_test(smokingHDL, scores = list(smoking = c(0, 2.5, 7.5, 15)))

  ### Cochran-Armitage trend test for proportions
  ### Lung tumors in female mice exposed to 1,2-dichloroethane
  ### Encyclopedia of Biostatistics (Armitage & Colton, 1998), 
  ### Chapter Trend Test for Counts and Proportions, page 4578, Table 2
  lungtumor <- data.frame(dose = rep(c(0, 1, 2), c(40, 50, 48)),
                          tumor = c(rep(c(0, 1), c(38, 2)),
                                    rep(c(0, 1), c(43, 7)),
                                    rep(c(0, 1), c(33, 15))))
  table(lungtumor$dose, lungtumor$tumor)

  ### Cochran-Armitage test (permutation equivalent to correlation 
  ### between dose and tumor), cf. Table 2 for results
  independence_test(tumor ~ dose, data = lungtumor, teststat = "quad")

  ### linear-by-linear association test with scores 0, 1, 2
  ### is identical to the Cochran-Armitage test
  lungtumor$dose <- ordered(lungtumor$dose)
  independence_test(tumor ~ dose, data = lungtumor, teststat = "quad",
                    scores = list(dose = c(0, 1, 2)))


[Package coin version 1.0-24 Index]