rpart.plot {rpart.plot}    R Documentation

Plot an rpart model.

Description

Plot an rpart model. This function combines and extends plot.rpart and text.rpart in the rpart package. It automatically scales and adjusts the displayed tree for best fit.

This is a front end to prp, with the most useful arguments of that function. See ../doc/prp.pdf for an overview.

Usage

rpart.plot(x=stop("no 'x' arg"),
    type=0, extra=0, under=FALSE, clip.right.labs=TRUE,
    fallen.leaves=FALSE, branch=if(fallen.leaves) 1 else .2,
    uniform=TRUE,
    digits=2, varlen=-8, faclen=3,
    cex=NULL, tweak=1,
    compress=TRUE, ycompress=uniform,
    snip=FALSE,
    ...)

Arguments

To start off, look at the arguments x, type and extra. Just those arguments will suffice for many users. For an overview of the other arguments see ../doc/prp.pdf.

x

An rpart object. The only required argument.

type

Type of plot. Five possibilities:

0 The default. Draw a split label at each split and a node label at each leaf.

1 Label all nodes, not just leaves. Similar to text.rpart's all=TRUE.

2 Like 1 but draw the split labels below the node labels. Similar to the plots in the CART book.

3 Draw separate split labels for the left and right directions.

4 Like 3 but label all nodes, not just leaves. Similar to text.rpart's fancy=TRUE. See also clip.right.labs.
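
As a rough illustration of the five type values, the sketch below draws the same tree once per type. It reuses the ptitanic tree from the Examples section; the 2x3 page layout and the plot titles are choices made for this sketch only.

```r
library(rpart)
library(rpart.plot)
data(ptitanic)        # Titanic data shipped with rpart.plot
tree <- rpart(survived ~ ., data=ptitanic, cp=.02)   # small tree, as in Examples

old.par <- par(mfrow=c(2,3))          # room for five plots on one page
for(type in 0:4)
    rpart.plot(tree, type=type, main=paste("type =", type))
par(old.par)                          # restore the previous layout
```

Comparing the panels side by side is usually the quickest way to pick a type.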

extra

Display extra information at the nodes. Possible values:

0 No extra information (the default).

1 Display the number of observations that fall in the node (per class for class objects; prefixed by the number of events for poisson and exp models). Similar to text.rpart's use.n=TRUE.

2 Class models: display the classification rate at the node, expressed as the number of correct classifications and the number of observations in the node.
Poisson and exp models: display the number of events.

3 Class models: misclassification rate at the node, expressed as the number of incorrect classifications and the number of observations in the node.

4 Class models: probability per class of observations in the node (conditioned on the node, sum across a node is 1).

5 Class models: like 4 but do not display the fitted class.

6 Class models: the probability of the second class only. Useful for binary responses.

7 Class models: like 6 but do not display the fitted class.

8 Class models: the probability of the fitted class.

9 Class models: the probabilities times the fraction of observations in the node (the probability relative to all observations, sum across all leaves is 1).

+100 Add 100 to any of the above to also display the percentage of observations in the node. For example extra=101 displays the number and percentage of observations in the node. The percentage is weighted, using the weights passed to rpart.

Note 1: Unlike text.rpart, by default prp uses its own routine for generating node labels (not the function attached to the object). See node.fun.
Note 2: The extra argument has special meaning for mvpart objects. See the Appendix to this package's vignette.
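
The following sketch shows a few of the extra values listed above on a class model. It assumes the ptitanic tree from the Examples section; the particular values chosen (1, 4, 6 and 106) and the titles are only for illustration.

```r
library(rpart)
library(rpart.plot)
data(ptitanic)
tree <- rpart(survived ~ ., data=ptitanic, cp=.02)   # class model, binary response

old.par <- par(mfrow=c(2,2))
rpart.plot(tree, extra=1,   main="extra = 1: observations per class")
rpart.plot(tree, extra=4,   main="extra = 4: probability per class")
rpart.plot(tree, extra=6,   main="extra = 6: probability of second class")
rpart.plot(tree, extra=106, main="extra = 106: as 6, plus percentage")
par(old.par)
```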

under

Applies only if extra > 0. Default FALSE, meaning put the extra text in the box. Use TRUE to put the text under the box.

clip.right.labs

Default TRUE, meaning “clip” the right-hand split labels, i.e. do not print variable=. Applies only if type=3 or 4.

fallen.leaves

Default FALSE. If TRUE, display the leaves at the bottom of the graph.

branch

Controls the shape of the branch lines. Specify a value between 0 (V shaped branches) and 1 (square shouldered branches). Default is if(fallen.leaves) 1 else .2.
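
A minimal sketch of how branch and fallen.leaves change the layout, again assuming the ptitanic tree from the Examples section. The three settings shown are arbitrary picks from the documented range.

```r
library(rpart)
library(rpart.plot)
data(ptitanic)
tree <- rpart(survived ~ ., data=ptitanic, cp=.02)

old.par <- par(mfrow=c(1,3))
rpart.plot(tree, branch=0, main="branch = 0")   # V shaped branches
rpart.plot(tree, branch=1, main="branch = 1")   # square shouldered branches
rpart.plot(tree, fallen.leaves=TRUE,            # leaves aligned at the bottom
           main="fallen.leaves = TRUE")
par(old.par)
```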

uniform

If TRUE (the default), the vertical spacing of the nodes is uniform. If FALSE, the nodes are spaced proportionally to the fit (more precisely, to the difference between a node's deviance and the sum of its two children's deviances). Small vertical spaces are automatically expanded to make room for the labels. Note: uniform=FALSE with cex=NULL (the default) can sometimes cause very small text.
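
To see the effect of uniform, one might plot the same tree both ways, as in this sketch (ptitanic tree as in the Examples section; titles are illustrative):

```r
library(rpart)
library(rpart.plot)
data(ptitanic)
tree <- rpart(survived ~ ., data=ptitanic, cp=.02)

old.par <- par(mfrow=c(1,2))
rpart.plot(tree, uniform=TRUE,  main="uniform = TRUE")
rpart.plot(tree, uniform=FALSE, main="uniform = FALSE")  # spacing follows the fit
par(old.par)
```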

digits

The number of significant digits in displayed numbers. Default 2. If 0, use getOption("digits"). Details: Numbers from 0.001 to 9999 are printed without an exponent (and the number of digits is actually only a suggestion, see format for details). Numbers outside that range are printed with an “engineering” exponent (a multiple of 3).
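
The digits argument is easiest to see on a model with a continuous response. The sketch below is not from the original page: it fits an anova tree to base R's mtcars data, chosen purely for convenience, and plots it with two digits settings.

```r
library(rpart)
library(rpart.plot)
fit <- rpart(mpg ~ ., data=mtcars)   # anova model on base R's mtcars data

old.par <- par(mfrow=c(1,2))
rpart.plot(fit, digits=2, main="digits = 2")
rpart.plot(fit, digits=4, main="digits = 4")   # more significant digits in labels
par(old.par)
```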

varlen

Length of variable names in text at the splits (and, for class responses, the class in the node label). Default -8, meaning truncate to eight characters. Possible values:
=0 use full names.
>0 call abbreviate with the given varlen.
<0 truncate variable names to the shortest length where they are still unique, but never truncate to shorter than abs(varlen).

faclen

Length of factor level names in splits. Default 3, meaning abbreviate to three characters. Possible values are as varlen above, except that 1 is treated specially, meaning represent the factor levels with alphabetic characters (a for the first level, b for the second, etc.).
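
The sketch below contrasts a few varlen and faclen settings on the ptitanic tree from the Examples section. The values are passed explicitly, rather than relying on defaults, so that each title matches what is drawn.

```r
library(rpart)
library(rpart.plot)
data(ptitanic)
tree <- rpart(survived ~ ., data=ptitanic, cp=.02)

old.par <- par(mfrow=c(1,3))
rpart.plot(tree, varlen=-8, faclen=3, main="varlen = -8, faclen = 3")
rpart.plot(tree, varlen=0,  faclen=0, main="varlen = 0, faclen = 0")  # full names
rpart.plot(tree, varlen=-8, faclen=1, main="faclen = 1")  # levels shown as a, b, ...
par(old.par)
```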

cex

Default NULL, meaning calculate the text size automatically.

tweak

Adjust the (possibly automatically calculated) cex. Default 1, meaning no adjustment. Use, say, tweak=1.2 to make the text 20% larger. Note that font sizes are discrete, so the cex you ask for may not be the cex you get, and a small tweak may either leave the type size unchanged or change it more than you want.
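
A quick way to judge a tweak value is to plot with and without it, as in this sketch (ptitanic tree as in the Examples section). Whether the two panels actually differ depends on the device, for the reasons given above.

```r
library(rpart)
library(rpart.plot)
data(ptitanic)
tree <- rpart(survived ~ ., data=ptitanic, cp=.02)

old.par <- par(mfrow=c(1,2))
rpart.plot(tree, tweak=1,   main="tweak = 1")     # automatic cex, unadjusted
rpart.plot(tree, tweak=1.2, main="tweak = 1.2")   # request 20% larger text
par(old.par)
```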

compress

If TRUE (the default), make more space by shifting nodes horizontally where space is available. This often allows larger text. (This is the same as plot.rpart's argument of the same name, except that here the default is TRUE.)

ycompress

If TRUE (the default unless uniform=FALSE), make more space by shifting labels vertically where space is available. This takes effect only if the initial automatically calculated cex is less than 0.7. Use ycompress=FALSE if you feel the resulting display is too messy. In the current implementation, the shifting algorithm works a little better (allowing larger text) with type=1, 2, or 3.

snip

Default FALSE. Set TRUE to interactively trim the tree with the mouse. See ../doc/prp.pdf (or just try it).

...

Extra arguments passed to prp and the plotting routines. Any of prp's arguments can be used.
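
As a sketch of passing prp arguments through the dots, the call below uses nn, shadow.col and branch.lty. These names are assumed here to be prp arguments and do not appear on this page; check prp's help page for the arguments actually available in your version.

```r
library(rpart)
library(rpart.plot)
data(ptitanic)
tree <- rpart(survived ~ ., data=ptitanic, cp=.02)

rpart.plot(tree,
           nn=TRUE,             # assumed prp argument: display node numbers
           shadow.col="gray",   # assumed prp argument: shadow under the boxes
           branch.lty=3,        # assumed prp argument: dotted branch lines
           main="prp arguments passed via ...")
```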

Value

The returned value is identical to that of prp.

Author(s)

Stephen Milborrow, borrowing heavily from the rpart package by Terry M. Therneau and Beth Atkinson, and the R port of that package by Brian Ripley.

See Also

../doc/prp.pdf
prp
plot.rpart
text.rpart
rpart

Examples

library(rpart.plot)      # needed when running this outside the help system
data(ptitanic)
tree <- rpart(survived ~ ., data=ptitanic, cp=.02)
                         # cp=.02 because we want a small tree for the demo

old.par <- par(mfrow=c(2,2))
                         # put 4 figures on one page

rpart.plot(tree, main="default rpart.plot\n(type = 0, extra = 0)")

prp(tree, main="type = 4, extra = 6", type=4, extra=6, faclen=0)
                         # faclen=0 to print full factor names

rpart.plot(tree, main="extra = 106,  under = TRUE", extra=106, under=TRUE, faclen=0)

# the old way for comparison
plot(tree, uniform=TRUE, compress=TRUE, branch=.2)
text(tree, use.n=TRUE, cex=.6, xpd=NA) # cex is a guess, depends on your window size
title("plot.rpart for comparison", cex=.6)

par(old.par)

[Package rpart.plot version 1.4-4 Index]