Inductive reasoning in humans and large language models

Han, S.J.; Ransom, K.J.; Perfors, A.; Kemp, C.

doi:10.1016/j.cogsys.2023.101155

dc.contributor.author	Han, S.J.
dc.contributor.author	Ransom, K.J.
dc.contributor.author	Perfors, A.
dc.contributor.author	Kemp, C.
dc.date.issued	2024
dc.description.abstract	The impressive recent performance of large language models has led many to wonder to what extent they can serve as models of general intelligence or are similar to human cognition. We address this issue by applying GPT-3.5 and GPT-4 to a classic problem in human inductive reasoning known as property induction. Over two experiments, we elicit human judgments on a range of property induction tasks spanning multiple domains. Although GPT-3.5 struggles to capture many aspects of human behavior, GPT-4 is much more successful: for the most part, its performance qualitatively matches that of humans, and the only notable exception is its failure to capture the phenomenon of premise non-monotonicity. Our work demonstrates that property induction allows for interesting comparisons between human and machine intelligence and provides two large datasets that can serve as benchmarks for future work in this vein.
dc.description.statementofresponsibility	Simon Jerome Han, Keith J. Ransom, Andrew Perfors, Charles Kemp
dc.identifier.citation	Cognitive Systems Research, 2024; 83:101155-1-101155-28
dc.identifier.doi	10.1016/j.cogsys.2023.101155
dc.identifier.issn	2214-4366
dc.identifier.issn	1389-0417
dc.identifier.orcid	Ransom, K.J. [0000-0001-5423-6455]
dc.identifier.uri	https://hdl.handle.net/2440/146303
dc.language.iso	en
dc.publisher	Elsevier
dc.relation.grant	http://purl.org/au-research/grants/arc/FT190100200
dc.rights	© 2023 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
dc.source.uri	https://doi.org/10.1016/j.cogsys.2023.101155
dc.subject	reasoning; property induction; category-based induction; non-monotonicity; neural networks; GPT-3.5; GPT-4; AI; Large language models; representation
dc.title	Inductive reasoning in humans and large language models
dc.type	Journal article
pubs.publication-status	Published

Collections

Research Outputs
