Information flow reveals prediction limits in online social activity

dc.contributor.authorBagrow, J.
dc.contributor.authorLiu, X.
dc.contributor.authorMitchell, L.
dc.date.issued2019
dc.description.abstractModern society depends on the flow of information over online social networks, and users of popular platforms generate substantial behavioural data about themselves and their social ties [1–5]. However, it remains unclear what fundamental limits exist when using these data to predict the activities and interests of individuals, and to what accuracy such predictions can be made using an individual’s social ties. Here, we show that 95% of the potential predictive accuracy for an individual is achievable using their social ties only, without requiring that individual’s data. We used information theoretic tools to estimate the predictive information in the writings of Twitter users, providing an upper bound on the available predictive information that holds for any predictive or machine learning methods. As few as 8–9 of an individual’s contacts are sufficient to obtain predictability comparable to that of the individual alone. Distinct temporal and social effects are visible by measuring information flow along social ties, allowing us to better study the dynamics of online activity. Our results have distinct privacy implications: information is so strongly embedded in a social network that, in principle, one can profile an individual from their available social ties even when the individual forgoes the platform completely.
dc.description.statementofresponsibilityJames P. Bagrow, Xipei Liu and Lewis Mitchell
dc.identifier.citationNature Human Behaviour, 2019; 3(2):122-128
dc.identifier.doi10.1038/s41562-018-0510-5
dc.identifier.issn2397-3374
dc.identifier.orcidMitchell, L. [0000-0001-8191-1997]
dc.identifier.urihttp://hdl.handle.net/2440/119229
dc.language.isoen
dc.publisherSpringer Nature
dc.rights© The Author(s), under exclusive licence to Springer Nature Limited 2019
dc.source.urihttps://doi.org/10.1038/s41562-018-0510-5
dc.subjectHumans
dc.subjectLanguage
dc.subjectSocial Behavior
dc.subjectInformation Theory
dc.subjectSocial Media
dc.subjectMachine Learning
dc.subjectOnline Social Networking
dc.titleInformation flow reveals prediction limits in online social activity
dc.typeJournal article
pubs.publication-statusPublished
