The purpose of this page is to collect various material related to the
Skew-Normal (SN) probability distribution and related distributions.
The SN distribution is an extension of the normal (Gaussian) probability
distribution, allowing for the presence of skewness.
In a similar way, a
skew-t (ST) distribution has been developed, which allows one
to regulate both the skewness and the kurtosis of the fitted model.
This distribution is obtained by introducing a skewness parameter
into the usual t density.
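As a rough illustration of how a skewness parameter enters the density,
the sketch below evaluates the skew-normal density in its commonly used
form, (2/omega) * phi(z) * Phi(alpha*z) with z = (x - xi)/omega, where
phi and Phi are the standard normal density and distribution function.
It is plain Python using only the standard library; it is not the code
of the library sn, and the names sn_pdf, xi, omega and alpha are chosen
here for illustration only.

```python
import math


def sn_pdf(x, xi=0.0, omega=1.0, alpha=0.0):
    """Skew-normal density at x (illustrative sketch, not the sn package).

    xi    : location parameter (assumed name)
    omega : scale parameter, must be positive (assumed name)
    alpha : shape parameter regulating skewness (assumed name)
    """
    if omega <= 0:
        raise ValueError("omega must be positive")
    z = (x - xi) / omega
    # standard normal density at z
    phi = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    # standard normal distribution function at alpha * z
    big_phi = 0.5 * (1.0 + math.erf(alpha * z / math.sqrt(2.0)))
    return 2.0 * phi * big_phi / omega
```

With alpha = 0 the function reduces to the normal density; a positive
alpha skews the density to the right, a negative one to the left.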
Introduction
If you have never read about the skew-normal probability distribution,
you may want to look at a
very brief account.
To view the shape of the density function, here are some
graphical demonstration programs:
A complete (so to speak!) bibliography
is available (last update on 2008-07-22).
The list includes only published material or papers accepted for
publication or other material having `firm form', such as a
Ph. D. thesis.
No further update of this list is planned at the moment.
IV Skew Workshop,
Pontificia Universidad Católica de Chile, 16th to 19th May 2011
A pioneer
In 1908, Fernando de Helguero
presented
a paper
which examines a selection mechanism of a normal population as a
model of departure from normality. This construction essentially
perturbs the normal density via a uniform distribution function,
leading to a form of skew-normal density. Although mathematically
somewhat different from the above-described form of skew-normal
density, the underlying stochastic mechanism is intimately
related. (2004-12-13)
The `library sn' is a suite of functions
for handling skew-normal and skew-t
distributions, both in the univariate and the multivariate cases.
The available facilities include various standard operations
(density function, random number generation, etc.), data fitting via MLE,
plotting log-likelihood surfaces and others. For data fitting,
simple random samples and regression models are dealt with.
Current development is done in R.
Some ports to other languages are available, but they are not really
maintained: if you want the most recent version, use the one for R.
An important point is that the existing ports to other environments
were made before version 0.3-0, and therefore they do not include
any facilities for the skew-t distribution.
If you are already a user of package sn, or you are going to
be one, please read this announcement.
The most recent version of the library is the one for
R, update 0.4-18
(last update on 2013-05-01)
for Unix; the MS-windows version is produced a bit later.
It requires R 2.2.0 or later; a few functions require
package mnormt.
Note that the R
versions can also be obtained directly from
CRAN,
and they are installed simply
with the install.packages("sn") command, provided that
your installation is suitably configured (and that you are actually
connected on-line!).
A PDF version of the on-line documentation
is available. It is the one specific to R (update 0.4-5), but
it is largely similar to the one for S-plus, except that the current
S-plus version lacks some of the functions available in R.
Two S-plus versions are available, for Unix and MS-windows, with
some restrictions, however, as explained next.
The current version level is 0.2-1 (1999-04-01); hence facilities
for the skew-t distribution are not included.
Note that the existing versions of the library were built
for Splus 3.2 on Unix and Splus 3.3 on MS-windows. Therefore, problems
may arise with current versions of Splus; since I do not have
Splus available any longer, I cannot help you with the installation.
The library has been ported to Matlab by
Nicola Sartori.
So far, this refers to update 0.21; hence facilities
for the skew-t distribution are not included. A portion of the
facilities for the skew-t distribution is however available
via a set of Matlab functions
which have been written and made available by
Enrique Batiz (Enrique.Batiz [at] postgrad.mbs.ac.uk).
An MS-DOS executable program is available which implements a
small portion of the `sn library', namely maximum likelihood
estimation for random samples and for regression models with errors
having scalar SN distribution.
The above-described program is based on some Fortran90 code which
is a port of a few basic routines from R. About
10 years after writing it, prompted by a user request, I thought
that the Fortran code could, after all, be of use to other people,
so it is now available. (Written in 1998, on line since 2008-09-25)
A set of Python routines for
univariate skew-normal variates has been made available by
Janwillem van Dijk (2013-01-29).
For random number generation, specifically:
Excel users can make use of
VBA routines kindly made
available by Stephen H. Gersuk (2008-09-22);
a Perl
module has been provided by Jiri Vaclavik (added 2011-10-21).
On-line procedures
Data fitting
You can fit a skew-normal distribution to your data using
this form. This procedure also serves as
a demonstration of the library sn
functionality, although only in a simple case. If you have a
more complex problem (large data set, data with covariates,
multivariate data, etc.), then you must download the full library
and run it yourself. (Created on 2003-02-17,
updated 2003-04-22, 2008-12-02, 2011-08-03).
Random number generation
You can generate random numbers with SN or ST distribution
in 1 or 2 dimensions using this form
(2003-11-12). See also the FAQ below.
How to generate random variates with SN or ST distribution?
See here. (2003-11-12)
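For readers who only want the idea, one well-known construction for
scalar SN variates can be sketched as follows. Take two independent
standard normal variates U0 and U1 and set
Z = delta*|U0| + sqrt(1 - delta^2)*U1, with delta = alpha/sqrt(1 + alpha^2);
then Z has a skew-normal distribution with shape parameter alpha.
The code below is plain Python with the standard library only; it is
not taken from the library sn, it may or may not coincide with the
method described at the link above, and the names rsn_sketch, xi,
omega and alpha are assumptions made for this illustration.

```python
import math
import random


def rsn_sketch(n, xi=0.0, omega=1.0, alpha=0.0, rng=None):
    """Draw n skew-normal variates (illustrative sketch, assumed names).

    Uses Z = delta*|U0| + sqrt(1 - delta^2)*U1 with U0, U1 independent
    standard normal variates, then returns xi + omega*Z.
    """
    if rng is None:
        rng = random.Random()
    delta = alpha / math.sqrt(1.0 + alpha * alpha)
    root = math.sqrt(1.0 - delta * delta)
    sample = []
    for _ in range(n):
        u0 = abs(rng.gauss(0.0, 1.0))   # half-normal component
        u1 = rng.gauss(0.0, 1.0)        # symmetric component
        sample.append(xi + omega * (delta * u0 + root * u1))
    return sample
```

A quick check of a generated sample is to compare its mean with the
theoretical value xi + omega*delta*sqrt(2/pi).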
Another frequent question
Where did the skew-normal distribution appear first?
See here. (2009-11-17)
A less frequent question
In the multivariate case, the feasible region for the set of
correlations and the indices of skewness of the individual components
is not simple to perceive. To help visualize this region
in the bivariate case, you can run the R program
feasible-CP2.R; besides R, it requires
its package 'rgl'. To run it, save this file locally,
then start R and type source('feasible-CP2.R').
(2009-05-27)
The program displays two plots in sequence.
The first plot adopts delta as the shape parameter;
the connection between delta and
gamma1 is described in various articles, including
this one. The second plot uses
gamma1.
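For the scalar SN distribution, the relation usually given in the
literature is gamma1 = ((4 - pi)/2) * mu^3 / (1 - mu^2)^(3/2), where
mu = delta*sqrt(2/pi) and delta = alpha/sqrt(1 + alpha^2). The sketch
below computes it in plain Python; it concerns the univariate relation
only, it is not part of feasible-CP2.R, and the function names are
assumptions made for this illustration.

```python
import math


def delta_from_alpha(alpha):
    """Map the shape parameter alpha to delta, which lies in (-1, 1)."""
    return alpha / math.sqrt(1.0 + alpha * alpha)


def gamma1_from_delta(delta):
    """Index of skewness gamma1 of a scalar SN variate with given delta."""
    if abs(delta) > 1.0:
        raise ValueError("delta must lie in [-1, 1]")
    mu = delta * math.sqrt(2.0 / math.pi)   # mean of the standardized variate
    return 0.5 * (4.0 - math.pi) * mu ** 3 / (1.0 - mu * mu) ** 1.5
```

Evaluating gamma1_from_delta at delta = 1 gives the limiting value of
the index of skewness, approximately 0.995, which bounds the range of
skewness attainable by each component.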
Translations of the term "skew-normal distribution"
available at ISI
A research problem
The above paper Statistical applications of the multivariate
skew-normal distribution includes the discussion of an
apparently innocuous dataset
whose MLE nevertheless lies on the frontier of the parameter space.
Can you suggest an explanation of the phenomenon, and/or
propose an alternative, `reasonable' estimate?
The method should work with this dataset as well as with more regular ones.
Hence, the obvious answer (the method of moments) is not acceptable,
since it would work here but not with other datasets
having the sample index of skewness outside the feasible region.
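To make the last remark concrete, the sketch below shows where a
method-of-moments estimate of delta breaks down in the scalar case. It
inverts the usual relation
gamma1 = ((4 - pi)/2) * mu^3 / (1 - mu^2)^(3/2), mu = delta*sqrt(2/pi),
and returns None when the sample index of skewness is too large in
absolute value (beyond about 0.995) for a solution with |delta| < 1 to
exist. This is plain Python written for illustration, with assumed
function names; it is not the procedure of any of the papers mentioned
here.

```python
import math


def sample_skewness(data):
    """Sample index of skewness m3 / m2^(3/2), using central moments."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    return m3 / m2 ** 1.5


def mom_delta(g1):
    """Method-of-moments value of delta for skewness g1, or None.

    None signals that g1 lies outside the feasible region, so that no
    delta in (-1, 1) reproduces it.
    """
    r = (2.0 * abs(g1) / (4.0 - math.pi)) ** (1.0 / 3.0)
    mu = r / math.sqrt(1.0 + r * r)            # solves mu/sqrt(1-mu^2) = r
    delta = mu / math.sqrt(2.0 / math.pi)
    if delta >= 1.0:
        return None
    return math.copysign(delta, g1)
```

For instance, a sample made of nine zeros and a single 10 has sample
skewness well above the bound, and mom_delta then returns None.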
Various solutions to the problem have been put forward, both in
the classical and in the Bayesian approach.
You can get the `frontier' data,
and try out your own method.