Type: Journal article
Title: EMMIXuskew: an R package for fitting mixtures of multivariate skew t distributions via the EM algorithm
Author: Lee, S.; McLachlan, G.
Citation: Journal of Statistical Software, 2013; 55(12):1-22
Publisher: Journal Statistical Software
Issue Date: 2013
ISSN: 1548-7660
Statement of Responsibility: Sharon X. Lee, Geoffrey J. McLachlan
Abstract: This paper describes an algorithm for fitting finite mixtures of unrestricted Multivariate Skew t (FM-uMST) distributions. The package EMMIXuskew implements a closed-form expectation-maximization (EM) algorithm for computing the maximum likelihood (ML) estimates of the parameters for the (unrestricted) FM-MST model in R. EMMIXuskew also supports visualization of fitted contours in two and three dimensions, and random sample generation from a specified FM-uMST distribution. Finite mixtures of skew t distributions have proven to be useful in modelling heterogeneous data with asymmetric and heavy-tailed behaviour, for example, datasets from flow cytometry. In recent years, various versions of mixtures with multivariate skew t (MST) distributions have been proposed. However, these models adopted some restricted characterizations of the component MST distributions so that the E-step of the EM algorithm can be evaluated in closed form. This paper focuses on mixtures with unrestricted MST components, and describes an iterative algorithm for the computation of the ML estimates of its model parameters. Its implementation in R is presented with the package EMMIXuskew. The usefulness of the proposed algorithm is demonstrated in three applications to real datasets. The first example illustrates the use of the main function fmmst in the package by fitting an MST distribution to a bivariate unimodal flow cytometric sample. The second example fits a mixture of MST distributions to the Australian Institute of Sport (AIS) data, and demonstrates that EMMIXuskew can provide better clustering results than mixtures with restricted MST components. In the third example, EMMIXuskew is applied to classify cells in a trivariate flow cytometric dataset. Comparisons with some other available methods suggest that EMMIXuskew achieves a lower misclassification rate with respect to the labels given by benchmark gating analysis.
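Illustrative note: the abstract mentions random sample generation from a specified FM-uMST distribution. The following is a minimal Python sketch of one way such draws could be produced, assuming the convolution-type stochastic representation commonly used for the unrestricted skew t distribution, Y = mu + (Delta|U0| + U1)/sqrt(W), with U0 ~ N(0, I), U1 ~ N(0, Sigma), Delta = diag(delta) and W ~ Gamma(nu/2, rate nu/2). It is not the EMMIXuskew implementation and does not use the package's R interface; the function names (`cholesky`, `sample_umst`, `sample_mixture`) and the parameter values in the usage note are hypothetical.

```python
import math
import random


def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def sample_umst(mu, sigma, delta, nu, rng):
    """One draw from an unrestricted skew t component (assumed representation)."""
    p = len(mu)
    chol = cholesky(sigma)
    # U1 ~ N(0, Sigma): correlate independent standard normals via the Cholesky factor.
    z = [rng.gauss(0.0, 1.0) for _ in range(p)]
    u1 = [sum(chol[i][k] * z[k] for k in range(i + 1)) for i in range(p)]
    # |U0|: element-wise half-normal vector, one latent skewing variable per dimension.
    u0 = [abs(rng.gauss(0.0, 1.0)) for _ in range(p)]
    # W ~ Gamma(shape nu/2, rate nu/2); gammavariate expects (shape, scale).
    w = rng.gammavariate(nu / 2.0, 2.0 / nu)
    return [mu[i] + (delta[i] * u0[i] + u1[i]) / math.sqrt(w) for i in range(p)]


def sample_mixture(n, weights, components, seed=0):
    """Draw n points from a finite mixture; components are (mu, sigma, delta, nu) tuples."""
    rng = random.Random(seed)
    draws, labels = [], []
    for _ in range(n):
        k = rng.choices(range(len(weights)), weights=weights)[0]
        labels.append(k)
        draws.append(sample_umst(*components[k], rng))
    return draws, labels
```

For example, `sample_mixture(500, [0.6, 0.4], [([0.0, 0.0], [[1.0, 0.3], [0.3, 1.0]], [2.0, -1.0], 5.0), ([6.0, 6.0], [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0], 10.0)])` would return 500 bivariate points together with their component labels; setting a component's `delta` to zero reduces it to an ordinary multivariate t component.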
Keywords: Mixture models; skew distributions; multivariate t distribution; EM algorithm; flow cytometry; R.
Rights: JSS is committed to electronic open-access publishing since its foundation in 1996 and has chosen to apply the Creative Commons Attribution License (CCAL) to all articles. Under the CCAL, authors retain ownership of the copyright for their article, but authors allow anyone to download, reuse, reprint, modify, distribute, and/or copy articles in JSS, so long as the original authors and source are credited. This broad license was developed to facilitate open access to, and free use of, original works of all types. Applying this standard license to your work will ensure your right to make your work freely and openly available. This work is licensed as follows. Paper: Creative Commons Attribution 3.0 Unported License. Code: GNU General Public License (at least one of version 2 or version 3) or a GPL-compatible license.
DOI: 10.18637/jss.v055.i12
Grant ID: ARC
Appears in Collections: Aurora harvest 8; Mathematical Sciences publications

Files in This Item:
File: hdl_117889.pdf
Description: Published Version
Size: 971.24 kB
Format: Adobe PDF
