Should Learning Analytics Models Include Sensitive Attributes? Explaining the Why

Deho, O.B.; Joksimovic, S.; Li, J.; Zhan, C.; Liu, J.; Liu, L.

doi:10.1109/TLT.2022.3226474

dc.contributor.author	Deho, O.B.
dc.contributor.author	Joksimovic, S.
dc.contributor.author	Li, J.
dc.contributor.author	Zhan, C.
dc.contributor.author	Liu, J.
dc.contributor.author	Liu, L.
dc.date.issued	2023
dc.description.abstract	Many educational institutions use predictive models to draw actionable insights from student data and drive student success. A common task is predicting which students are at risk of dropping out so that the necessary interventions can be made. However, concerns have recently been raised that these predictive models discriminate on the basis of students' protected attributes. A frequently asked question is: should the protected attributes be excluded from learning analytics (LA) models in order to ensure fairness? In this article, we aimed to answer the following questions: if we exclude the protected attributes from LA models, does the exclusion ensure fairness as it supposedly should? Does the exclusion affect the performance of the LA model? If so, why? We found answers to these questions and went further to explain why. We built machine learning models and performed empirical evaluations using three years of dropout data for a particular program in a large Australian university. We found that excluding or including the protected attributes had only a marginal effect on predictive performance and fairness. Perhaps not surprisingly, our findings suggest that the effect of including or excluding protected attributes is a function of how they relate to the prediction outcome. More specifically, if a protected attribute is correlated with the target label and proves to be an important feature, then its inclusion or exclusion will affect performance and fairness; otherwise, it will have little effect. Our findings provide information that relevant stakeholders can use to make well-informed decisions.
dc.description.statementofresponsibility	Oscar Blessed Deho, Srecko Joksimovic, Jiuyong Li, Chen Zhan, Jixue Liu, and Lin Liu
dc.identifier.citation	IEEE Transactions on Learning Technologies, 2023; 16(4):560-572
dc.identifier.doi	10.1109/TLT.2022.3226474
dc.identifier.issn	1939-1382
dc.identifier.orcid	Zhan, C. [0000-0002-4794-8339]
dc.identifier.uri	https://hdl.handle.net/2440/143955
dc.language.iso	en
dc.publisher	Institute of Electrical and Electronics Engineers
dc.relation.grant	http://purl.org/au-research/grants/arc/DP200101210
dc.rights	© 2022 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
dc.source.uri	https://doi.org/10.1109/tlt.2022.3226474
dc.subject	Dropout prediction; explainable AI; fairness in education; learning analytics (LA)
dc.title	Should Learning Analytics Models Include Sensitive Attributes? Explaining the Why
dc.type	Journal article
pubs.publication-status	Published
