selm {sn} | R Documentation |
Function selm fits a linear model with skew-elliptical error term. The term skew-elliptical distribution is an abbreviated equivalent of skew-elliptically contoured (SEC) distribution. The function works for univariate and multivariate response variables.
selm(formula, family = "SN", data, weights, subset, na.action, start = NULL, fixed.param = list(), method = "MLE", penalty=NULL, offset, model = TRUE, x = FALSE, y = FALSE, ...)
formula |
an object of class |
family |
a character string which selects the parametric family
of SEC type assumed for the error term. It must one of
|
data |
an optional data frame containing the variables in
the model. If not found in |
weights |
a numeric vector of weights associated to individual
observations. Weights are supposed to represent frequencies, hence must be
non-negative integers (not all 0) and |
subset |
an optional vector specifying a subset of observations to be used in the fitting process. |
na.action |
a function which indicates what should happen
when the data contain |
start |
a vector (in the univariate case) or a list (in the
multivariate case) of initial values for the search of the parameter
estimates. If |
fixed.param |
a list of assignments of parameter values which must
be kept fixed in the numerical maximization process.
Currently, there is only one such option, of the form
|
method |
a character string which selects the estimation method to be
used for fitting. Currently two options exist: |
penalty |
a character string which denotes the penalty function to be
subtracted to the log-likelihood function, when |
offset |
this can be used to specify an a priori known
component to be included in the linear predictor during fitting. This
should be |
model, x, y |
logicals. If |
... |
optional control parameters, as follows.
|
By default, selm
fits the selected model by maximum
likelihood estimation (MLE), making use of some numerical
optimization method. Maximization is performed in one
parameterization, usually DP, and then the estimates are mapped to
other parameter sets, CP and pseudo-CP;
see dp2cp
for more information on parameterizations.
These parameter transformations are carried out trasparently to the user.
The observed information matrix is used to obtain the estimated variance
matrix of the MLE's and from this the standard errors.
Background information on MLE in the context of SEC
distributions is provided by Azzalini and Capitanio (2014);
see specifically Chapter 3, Sections 4.3, 5.2, 6.2.5–6. For additional
information, see the original research work referenced therein.
Although the density functionof SEC distributions are expressed using
DP parameter sets, the methods associated to the objects created
by this function communicate, by default, their outcomes in the CP
parameter set, or its variant form pseudo-CP when CP
does not exist; the ‘Note’ at
summary.selm
explains why. A more detailed discussion is
available in Sections 3.1.4–6 and 5.2.3 of Azzalini and Capitanio (2014)
and in Section 4 of Arellano-Valle and Azzalini (2008).
There is a known open issue which affects computation of the information
matrix of the multivariate skew-normal distribution when the slant
parameter α approaches the null vector; see p.149 of
Azzalini and Capitanio (2014). Consequently, if a model with
multivariate response is fitted with family="SN"
and the estimate
alpha
of α is at the origin or neary so, the
information matrix and the standard errors are not computed and a
warning message is issued. In this unusual circumstance, a simple
work-around is to re-fit the model with family="ST"
, which will
work except in remote cases when (i) the estimated degrees of freedom
nu
diverge and (ii) still alpha
remains at the origin.
In some cases, especially for small sample size, the MLE occurs on
the frontier of the parameter space, leading to DP estimates with
alpha=Inf
or to a similar situation in the multivariate case or in an
alternative parameterization. Such outcome is regared by many as
unsatisfactory; surely it prevents using the observed information matrix to
compute standard errors. This problem motivates the use of maximum penalized
likelihood estimation (MPLE), where the regular log-likelihood
function log(L) is penalized by subtracting an amount
Q, say, increasingly large as |α| increases.
Hence the function which is maximized at the optimization stage is now
log(L) - Q. If method="MPLE"
and
penalty=NULL
, the default function Qpenalty
is used,
which implements the penalization:
Q(α)= c₁ log(1 + c₂ [α*]²)
where c₁ and c₂ are positive constants, but
depending on the degrees of freedom nu
in the "ST"
case,
[α*]² = α' cor(Ω) α
and cor(Ω) denotes the correlation matrix
associated to the scale matrix Omega
described in connection with
makeSECdistr
. In the univariate case
cor(Ω)=1,
so that [α*]²=α². Further information
on MPLE and this choice of the penalty function is given in
Section 3.1.8 and p.111 of Azzalini and Capitanio (2014); for a more
detailed account, see Azzalini and Arellano-Valle (2013) and references
therein.
It is possible to change the penalty function, to be declared via the
argument penalty
. For instance, if the calling statement includes
penalty="anotherQ"
, the user must have defined
anotherQ <- function(alpha_etc, nu = NULL, der = 0)
with the following arguments.
alpha_etc
: in the univariate case, a single value alpha
;
in the multivariate case, a two-component list whose first component is
the vector alpha
, the second one is matrix equal to
cov2cor(Omega)
.
nu
: degrees of freedom, only relevant if family="ST"
.
der
: a numeric value which indicates the required order of
derivation; if der=0
(default value), only the penalty Q
need to be retuned by the function;
if der=1
, attr(Q, "der1")
must represent the
first order derivative of Q
with respect to alpha
; if
der=2
, also attr(Q, "der2")
must be assigned, containing
the second derivative (only required in the univariate case).
This function must return a single numeric value, possibly with required
attributes when is called with der>1
.
Since sn imports functions grad
and
hessian
from package numDeriv, one can rely
on them for numerical evaluation of the derivatives, if they are not
available in an explicit form.
This penalization scheme allows to introduce a prior distribution
π for α by setting Q=-log(π),
leading to a maximum a posteriori estimate in the stated sense.
See Qpenalty
for more information and an illustration.
an S4 object of class selm
or mselm
, depending on whether
the response variable of the fitted model is univariate or multivariate.
These objects are described in the selm class
.
The estimates are obtained by numerical optimization methods and, as
usual in similar cases, there is no guarantee that the maximum of the
objective function is achieved. Both consideration of model simplicity
and numerical experience indicate that models with SN error
terms generally produce more reliable results compared to those with
the ST family. Take into account that models involving a
traditional Student's t distribution with unknown degres of freedom
can already be problematic; the presence of the (multivariate) slant parameter
α in the ST family cannot make things any simpler.
Consequently, care must be exercised, especially so if one works with
the (multivariate) ST family.
Consider re-fitting a model with different starting values and,
in the ST case, building the profile log-likelihood for a range
of ν values. Details on the numerical optimization which has produced
object obj
can be estracted with slot(obj, "opt.method")
.
Be aware that occasionally optim
and nlminb
declare successful
completion of a regular minimization problem at a point where the Hessian
matrix is not positive-definite. A case of this sort is presented in the
final portion of the examples below.
Adelchi Azzalini
Arellano-Valle, R. B., and Azzalini, A. (2008). The centred parametrization for the multivariate skew-normal distribution. J. Multiv. Anal. 99, 1362–1382. Corrigendum: vol.100 (2009), p.816.
Azzalini, A. with the collaboration of Capitanio, A. (2014). The Skew-Normal and Related Families. Cambridge University Press, IMS Monographs series.
Azzalini, A. and Arellano Valle, R. V. (2013, available on line 30 June 2012). Maximum penalized likelihood estimation for skew-normal and skew-t distributions. J. Stat. Planning & Inference 143, 419–433.
selm
for classes "selm"
and "mselm"
,
summary.selm
for summaries, plot.selm
for plots
the generic functions coef
, logLik
,
residuals
, fitted
, vcov
.
the underlying function selm.fit
and those further down
the selection of a penalty function of the log-likelihood,
such as Qpenalty
.
data(ais) m1 <- selm(log(Fe) ~ BMI + LBM, family="SN", data=ais) print(m1) summary(m1) s<- summary(m1, "DP", cov=TRUE, cor=TRUE) plot(m1) plot(m1, param.type="DP") logLik(m1) coef(m1) coef(m1, "DP") var <- vcov(m1) # m1a <- selm(log(Fe) ~ BMI + LBM, family="SN", method="MPLE", data=ais) m1b <- selm(log(Fe) ~ BMI + LBM, family="ST", fixed.par=list(nu=8), data=ais) # data(barolo) attach(barolo) A75 <- (reseller=="A" & volume==75) logPrice <- log(price[A75],10) m <- selm(logPrice ~ 1, family="ST") summary(m) plot(m, which=2, col=4, main="Barolo log10(price)") # cfr Figure 4.7 of Azzalini & Capitanio (2014), p.107 detach(barolo) #----- # examples with multivariate response # m3 <- selm(cbind(BMI, LBM) ~ WCC + RCC, family="SN", data=ais) plot(m3, col=2, which=2) summary(m3, "dp") coef(m3) coef(m3, vector=FALSE) # data(wines) m28 <- selm(cbind(chloride, glycerol, magnesium) ~ 1, family="ST", subset=(wine=="Grignolino"), data=wines) dp28 <- coef(m28, "DP", vector=FALSE) plot(m28, param.type="dp") # cfr Figures 6.1 and 6.2 of Azzalini & Capitanio (2014), pp.181-2 plot(m28, param.type="pseudo-CP") # m31 <- selm(cbind(BMI, LBM)~ Ht + Wt, family="ST", data=ais) # Warning message... slot(m31, "opt.method")$convergence m32 <- selm(cbind(BMI, LBM) ~ Ht + Wt, family="ST", data=ais, opt.method="BFGS") # Warning message... slot(m32, "opt.method")$convergence