Inductive reasoning in humans and large language models

Date

2024

Authors

Han, S.J.
Ransom, K.J.
Perfors, A.
Kemp, C.

Type

Journal article

Citation

Cognitive Systems Research, 2024; 83:101155-1-101155-28

Statement of Responsibility

Simon Jerome Han, Keith J. Ransom, Andrew Perfors, Charles Kemp

Abstract

The impressive recent performance of large language models has led many to wonder to what extent they can serve as models of general intelligence or are similar to human cognition. We address this issue by applying GPT-3.5 and GPT-4 to a classic problem in human inductive reasoning known as property induction. Over two experiments, we elicit human judgments on a range of property induction tasks spanning multiple domains. Although GPT-3.5 struggles to capture many aspects of human behavior, GPT-4 is much more successful: for the most part, its performance qualitatively matches that of humans, and the only notable exception is its failure to capture the phenomenon of premise non-monotonicity. Our work demonstrates that property induction allows for interesting comparisons between human and machine intelligence and provides two large datasets that can serve as benchmarks for future work in this vein.

Rights

© 2023 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).