Modeling multivariable hydrological series: principal component analysis or independent component analysis?

Files

hdl_72514.pdf (532.33 KB)
  (Published version)

Date

2007

Authors

Westra, S.
Brown, C.
Lall, U.
Sharma, A.

Type

Journal article

Citation

Water Resources Research, 2007; 43(W06429):1-11

Statement of Responsibility

Seth Westra, Casey Brown, Upmanu Lall and Ashish Sharma

Abstract

The generation of synthetic multivariate rainfall and/or streamflow time series that accurately simulate both the spatial and temporal dependence of the original multivariate series remains a challenging problem in hydrology and frequently requires either the estimation of a large number of model parameters or significant simplifying assumptions on the model structure. As an alternative, we propose a relatively parsimonious two-step approach to generating synthetic multivariate time series at monthly or longer timescales, by first transforming the data to a set of statistically independent univariate time series and then applying a univariate time series model to the transformed data. The transformation is achieved through a technique known as independent component analysis (ICA), which uses an approximation of mutual information to maximize the independence between the transformed series. We compare this with principal component analysis (PCA), which merely removes the covariance (or spatial correlation) of the multivariate time series, without necessarily ensuring complete independence. Both methods are tested using a monthly multivariate data set of reservoir inflows from Colombia. We show that the discrepancy between the synthetically generated data and the original data, measured as the mean integrated squared bias, is reduced by 25% when using ICA compared with PCA for the full joint distribution and by 28% when considering marginal densities in isolation. These results suggest that there may be significant benefits to maximizing statistical independence, rather than merely removing correlation, when developing models for the synthetic generation of multivariate time series.
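
The distinction the abstract draws between removing correlation and achieving independence can be illustrated with a small synthetic sketch. It is not the authors' method or data. It assumes two independent uniform sources (hypothetical names `s1`, `s2`) mixed by a 45-degree rotation, and it uses the correlation between squared values as a crude stand-in for a dependence measure, not the mutual-information approximation the paper refers to. The mixed series are uncorrelated, so a covariance-based transform such as PCA has nothing left to remove, yet they remain dependent. The original sources, which an ICA-style rotation would aim to recover, show no such dependence.

```python
import math
import random


def correlation(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def squares(xs):
    """Element-wise squares, used here as a simple nonlinear feature."""
    return [x * x for x in xs]


# Assumption: two independent, non-Gaussian (uniform) sources.
rng = random.Random(42)
n = 20000
s1 = [rng.uniform(-1.0, 1.0) for _ in range(n)]
s2 = [rng.uniform(-1.0, 1.0) for _ in range(n)]

# Mix the sources with a 45-degree rotation. Because both sources have
# the same variance, the mixed series are uncorrelated by construction.
c = 1.0 / math.sqrt(2.0)
x1 = [c * (a + b) for a, b in zip(s1, s2)]
x2 = [c * (a - b) for a, b in zip(s1, s2)]

# Linear correlation of the mixtures: close to zero.
linear_corr = correlation(x1, x2)

# Correlation of the squared mixtures: clearly nonzero, so the
# mixtures are dependent even though they are uncorrelated.
square_corr_mixed = correlation(squares(x1), squares(x2))

# The same check on the original sources: close to zero.
square_corr_sources = correlation(squares(s1), squares(s2))

print(f"linear correlation of mixtures:      {linear_corr:+.3f}")
print(f"squared-value correlation, mixtures: {square_corr_mixed:+.3f}")
print(f"squared-value correlation, sources:  {square_corr_sources:+.3f}")
```

In this sketch the unmixing rotation is known because the mixture was constructed by hand; an ICA algorithm would have to estimate it from the data. A univariate time series model fitted separately to each of `x1` and `x2` would ignore the remaining dependence, which is the kind of discrepancy the abstract attributes to PCA.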

Rights

Copyright 2007 by the American Geophysical Union
