When the Past != The Future: Assessing the Impact of Dataset Drift on the Fairness of Learning Analytics Models

Deho, O.B.; Liu, L.; Li, J.; Liu, J.; Zhan, C.; Joksimovic, S.

doi:10.1109/TLT.2024.3351352

dc.contributor.author	Deho, O.B.
dc.contributor.author	Liu, L.
dc.contributor.author	Li, J.
dc.contributor.author	Liu, J.
dc.contributor.author	Zhan, C.
dc.contributor.author	Joksimovic, S.
dc.date.issued	2024
dc.description.abstract	Learning analytics (LA), like much of machine learning, assumes the training and test datasets come from the same distribution. Therefore, LA models built on past observations are (implicitly) expected to work well for future observations. However, this assumption does not always hold in practice because the dataset may drift. Recently, algorithmic fairness has gained significant attention. Nevertheless, algorithmic fairness research has paid little attention to dataset drift. The majority of existing fairness algorithms are “statically” designed. Put another way, LA models tuned to be “fair” on past data are expected to still be “fair” when dealing with current/future data. However, it is counter-intuitive to deploy a statically fair algorithm in a nonstationary world. There is, therefore, a need to assess the impact of dataset drift on the unfairness of LA models. For this reason, we investigate the relationship between dataset drift and the unfairness of LA models. Specifically, we first measure the degree of drift in the features (i.e., covariates) and target label of our dataset. After that, we train predictive models on the dataset and evaluate the relationship between the dataset drift and the unfairness of the predictive models. Our findings suggest a directly proportional relationship between dataset drift and unfairness. Further, we find that covariate drift has a greater impact on model unfairness than target drift, and that there are no guarantees that a once-fair model will consistently remain fair. Our findings imply that the “robustness” of fair LA models to dataset drift must be established before deployment.
dc.description.statementofresponsibility	Oscar Blessed Deho, Lin Liu, Jiuyong Li, Jixue Liu, Chen Zhan, and Srecko Joksimovic
dc.identifier.citation	IEEE Transactions on Learning Technologies, 2024; 17:1007-1020
dc.identifier.doi	10.1109/TLT.2024.3351352
dc.identifier.issn	1939-1382
dc.identifier.orcid	Liu, J. [0000-0002-0794-0404]
dc.identifier.orcid	Zhan, C. [0000-0002-4794-8339]
dc.identifier.uri	https://hdl.handle.net/2440/143934
dc.language.iso	en
dc.publisher	Institute of Electrical and Electronics Engineers
dc.relation.grant	http://purl.org/au-research/grants/arc/DP200101210
dc.rights	© 2024 IEEE. Personal use is permitted
dc.source.uri	https://doi.org/10.1109/TLT.2024.3351352
dc.subject	Dataset drift; ethical learning analytics; fairness; learning analytics (LA); predictive modeling; virtual learning environment
dc.title	When the Past != The Future: Assessing the Impact of Dataset Drift on the Fairness of Learning Analytics Models
dc.type	Journal article
pubs.publication-status	Published
