Language Variation

My PhD dissertation focused on language variation in Ayapaneco, an under-documented, understudied and critically endangered language spoken in southern Mexico. The study followed a multifactorial approach to determine the mechanisms at play in synchronic language variation. The variation documented included phonetic, morpho-syntactic and lexico-semantic tokens collected through extensive fieldwork. The data corresponded to different speech act events observed in natural settings and elicited through visual, verbal and audiovisual stimuli. The ultimate goal of data collection was to capture a representative sample of language in use by the few remaining speakers of Ayapaneco in order to create a corpus for analysis in ELAN, Toolbox and PRAAT.

Although variation is inherent to all the world’s languages, studying it in under-documented, understudied and critically endangered languages presents specific theoretical and methodological challenges that I attempt to address and question, such as the notion of linguistic competence and the lack of suitable stimuli for collecting specialized language items.

Positional Verbs

Position is a notion common to all the world’s languages. However, the study of highly specialized positional verbs is constrained by their relatively low occurrence in everyday use. To address both the lack of suitable stimuli, especially for animated figures, and the low occurrence of tokens, I designed and tested my own stimulus set as part of my PhD research. This set allowed me to a) collect tokens of previously undocumented positional verbs, and b) discover morpho-syntactic and semantic variation in their use.


Who is a speaker of a given language? What makes them a speaker? Are there different types of speakers? Can we differentiate among them? Can the speakers themselves tell the difference?

Through an interdisciplinary approach, I attempt to answer these questions. The possible answers raise complex theoretical and methodological issues that challenge fundamental concepts in linguistics, such as linguistic competence.

Documentation & Revitalization

Currently 10% of the world’s languages are critically endangered, meaning that they will disappear in the next 5-10 years. These languages are low-resourced and very often under-documented. Through a collaborative approach, I build robust, multimodal and multipurpose open-access corpora that allow speech communities to undertake revitalization actions. I’m particularly interested in community-based revitalization efforts and their impact on the community, above and beyond a purely linguistic scope.

Language Policies

What’s an official/national language? What implications does the official status of a language have? Who decides and how? How can the linguistic heterogeneity of a given territory be accommodated through language policies? Do we even need language policies?

In my master’s dissertation I explore some of these questions, first through a macro-sociolinguistic case study and then through a micro-sociolinguistic case study examining the application of these policies on the ground. I’m particularly interested in exploring de jure versus de facto practices in language planning.

NLP in Low-Resourced Languages

Currently 97% of the world’s languages are technologically low-resourced. This means that most of our existing language diversity falls outside the scope of language technologies. This situation perpetuates inequality among languages and people, resulting in unequal access to information. I explore the intersection of language and technology, specifically speech processing for low-resourced languages.

Quantum Language

Can language be quantum native? If so, is it possible to develop a quantum theory of language, treating and processing it as a quantum object? What are the implications of language being quantum native? Recently I’ve begun exploring these questions from a broad linguistic, computer science and quantum physics perspective.