Colloquium: Aaron White
November 11, 2022 @ 3:00 pm - 4:30 pm
Speaker: Aaron White (University of Rochester)
Title: Semantic Category Induction
Our ability to use language to convey arbitrarily complex information about the world’s possible past, present, and future configurations is undergirded by systematic relationships between linguistic expressions and conceptual categories. Understanding these relationships is not only a core part of understanding what it means to know and be able to use a language; it potentially provides a window onto the nature of higher cognition in humans more generally.
In this talk, I report on research investigating these systematic relationships in the domain of predicates that combine with subordinate clauses (broadly construed)–e.g. think (that Bo left), see (Bo leave), want (Bo to leave), hope (to leave), love (that Bo left), manage (to leave), and start (leaving). Such predicates constitute a useful case study both because their distributional signatures are highly complex and because these distributional signatures show intricate correlations with these predicates’ inferential properties–suggesting that very fine-grained aspects of the concepts associated with these predicates may be formally expressed.
I approach this investigation by developing computational models for discovering representations that can simultaneously explain these predicates’ distributional characteristics and inferential affordances–i.e. by inducing semantic categories that optimally predict predicates’ distributional signatures. In the first part of the talk, I focus on models aimed at uncovering which kinds of lexically triggered inference patterns are (un)attested across clause-embedding verbs in English. To carry out this investigation, I use three inference judgment datasets collected under the auspices of the MegaAttitude Project: MegaVeridicality, MegaNegRaising, and MegaIntensionality, which capture a variety of theoretically important inference types across a wide swath of the English clause-embedding lexicon.
In the second part of the talk, I describe ongoing efforts to assess the explanatory power of these categories across typologically diverse languages. I focus in particular on a case study relating the induced semantic categories described in the first part of the talk to Mandarin predicates’ distributional signatures. I first describe a new dataset aimed at capturing these distributional characteristics for a wide swath of clause-embedding predicates in Mandarin. I then use this dataset in conjunction with the English datasets described in the first part of the talk to induce a mapping from English distributional characteristics to Mandarin distributional characteristics. I use this mapping to derive a set of predictions about how semantic categories are expressed in both English and Mandarin. I conclude with a discussion of the implications of these predictions.