Natural Language as a Code: Modeling Human Language Using Information Theory

February 14, 2019
4:00 p.m. - 5:00 p.m.
Social Science Plaza A, Room 2112
Richard Futrell, Assistant Professor, Language Science

Why is natural language the way it is? Futrell proposes that human languages can be modeled as solutions to the problem of efficient communication subject to certain information processing constraints, in particular constraints on short-term memory. He will present an analysis of dependency treebank corpora of over 50 languages, in which the syntax of sentences is represented using simple graph structures, and show that word orders across languages are optimized to limit short-term memory demands in parsing: words that are linked by a dependency edge tend to be close in linear order. This effect is called dependency locality. Next, he will develop a general Bayesian, information-theoretic model of human language processing in which short-term memory is modeled as a noisy channel, recovering dependency locality as a special case. Finally, he will combine these insights in a model of human languages as information-theoretic codes for latent tree structures, and show that optimizing these codes for expressivity and compressibility results in grammars that resemble human languages.
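
As a rough illustration of the dependency locality idea described above, the sketch below sums the linear distance between each word and its syntactic head. The function name, the head-index encoding, and the two example parses are assumptions made for illustration only; they are not taken from the talk or from any particular treebank.

```python
from typing import List


def total_dependency_length(heads: List[int]) -> int:
    """Sum the linear distances between each word and its head.

    heads[i] is the 1-based position of the head of word i + 1;
    a value of 0 marks the root, which has no head and is skipped.
    """
    return sum(abs((i + 1) - h) for i, h in enumerate(heads) if h != 0)


# Hypothetical dependency analyses of the same five words in two orders.
# "John threw out the trash": John->threw, threw=root, out->threw,
# the->trash, trash->threw
particle_first = [2, 0, 2, 5, 2]

# "John threw the trash out": John->threw, threw=root, the->trash,
# trash->threw, out->threw
particle_last = [2, 0, 4, 2, 2]

print(total_dependency_length(particle_first))
print(total_dependency_length(particle_last))
```

Under these assumed parses, the order that keeps the particle next to its verb has the smaller total, which is the kind of preference that dependency locality describes.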

Contact: Joanna Kerner, kernerj@uci.edu
Sponsor: Institute for Mathematical Behavioral Sciences
