Do large language models perform the way people expect? Measuring the human generalization function

“Do large language models perform the way people expect? Measuring the human generalization function,” with Keyon Vafa and Ashesh Rambachan, International Conference on Machine Learning (ICML), 2024.