Exploring the impact of data splitting methods on artificial neural network models
dc.contributor.author | Wu, W. | |
dc.contributor.author | Maier, H. | |
dc.contributor.author | Dandy, G. | |
dc.contributor.author | May, R. | |
dc.contributor.conference | International Conference on Hydroinformatics (10th : 2012 : Hamburg, Germany) | |
dc.date.issued | 2012 | |
dc.description.abstract | Data splitting is an important step in the artificial neural network (ANN) development process whereby data is divided into training, test and validation subsets to ensure good generalization ability of the model. In previous research, guidelines on choosing a suitable data splitting method based on the dimensionality and distribution of the dataset were derived from results obtained using synthetic datasets. This study extends previous research by investigating the impact of three data splitting methods tested in previous research on the predictive performance of ANN models using real-world datasets. Three real-world water resources datasets with varying statistical properties are used. It has been found that the relationship between different data splitting methods and data with different statistical properties obtained in previous research using synthetic data also generally holds for real-world water resources data. However, some data splitting methods produce an optimistically low validation error due to the bias created by allocating the extreme observations to the training set. | |
dc.description.statementofresponsibility | Wenyan Wu, Holger R. Maier, Graeme C. Dandy and Robert May | |
dc.description.uri | http://www.wqra.com.au/research-strategy/issue-details/49 | |
dc.identifier.citation | Proceedings of the 10th International Conference on Hydroinformatics: Understanding Changing Climate and Environment and Finding Solutions, held in Hamburg, Germany, 14-18 July, 2012: pp.1-8 | |
dc.identifier.orcid | Wu, W. [0000-0003-3907-1570] | |
dc.identifier.orcid | Maier, H. [0000-0002-0277-6887] | |
dc.identifier.orcid | Dandy, G. [0000-0001-5846-7365] | |
dc.identifier.uri | http://hdl.handle.net/2440/77610 | |
dc.language.iso | en | |
dc.publisher | HIC2012 | |
dc.publisher.place | CD | |
dc.rights | Copyright status unknown | |
dc.title | Exploring the impact of data splitting methods on artificial neural network models | |
dc.type | Conference paper | |
pubs.publication-status | Published |