-----

Neural network language models, such as ChatGPT, are powerful in many ways, but their usefulness as models of cognition is limited by the fact that they are difficult to understand and control. In this talk, McCoy will discuss a perspective that mitigates these issues: analyzing neural networks in terms of the problem they are trained to solve (i.e., viewing them at Marr's computational level). The first part of the talk will show how this perspective enables us to predict some important limitations of large language models, an analysis supported by cases in which language models struggle on seemingly simple tasks. In the second part, McCoy will show how the same perspective can enable us to control neural networks by connecting them to structured Bayesian models. He will apply this approach in a case study based on language learning, where the goal is to distill the symbolic priors of a Bayesian model into a neural network. Like a Bayesian model, the resulting system can learn linguistic patterns from a small number of examples; like a neural network, it can also learn aspects of English syntax from a naturalistic corpus. Overall, these results show that it is both possible and beneficial to combine Bayesian models and neural networks, two popular approaches within computational cognitive science that have often been viewed as antagonistic.

-----