Department of Comparative Language Science Distributional Linguistics Lab

AUTOTYP

AUTOTYP is a large-scale research program with goals in both quantitative and qualitative typology. It is organised into a series of interconnected thematic modules, each hosting one or more datasets. The current release includes over 200 primary (hand-entered) typological variables (not counting auxiliary variables, comments, bookkeeping and recodings) that describe 1,225 languages in approximately 270,000 data points. Together with the derived (aggregated) data, we provide over 2,000,000 data points. The latest official downloadable dataset is available on Zenodo (https://zenodo.org/records/7976754). The development version and the issue tracker are located on GitHub (https://github.com/autotyp/autotyp-data).

The two key design principles of AUTOTYP are autotypology and late aggregation. Autotypology (Bickel & Nichols 2002, Witzlack-Makarevich et al. 2022) means that most variables are not predefined but are instead continuously developed and refined to maintain the required level of empirical adequacy. Late aggregation means that the data is collected at the lowest possible level of description (comparable in many cases to reference grammar descriptions) and needs to be aggregated and reshaped for most analytical purposes. This allows the data to be reused for multiple purposes and supports different styles of analysis.
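
As a rough illustration of late aggregation, the sketch below stores toy observations at a fine-grained level (one row per language and construction) and derives two different language-level variables from the same rows only at analysis time. The record fields, language names and values are hypothetical and do not reflect the actual AUTOTYP schema or data.

```python
from collections import Counter, defaultdict

# Hypothetical fine-grained records: one row per (language, construction).
# Field names and values are illustrative only, not the AUTOTYP schema.
RECORDS = [
    {"language": "LangA", "construction": "possessive NP", "locus": "head"},
    {"language": "LangA", "construction": "clause", "locus": "head"},
    {"language": "LangA", "construction": "adpositional phrase", "locus": "dependent"},
    {"language": "LangB", "construction": "possessive NP", "locus": "dependent"},
    {"language": "LangB", "construction": "clause", "locus": "dependent"},
]


def locus_counts(records):
    """Count locus values per language without discarding any detail."""
    counts = defaultdict(Counter)
    for row in records:
        counts[row["language"]][row["locus"]] += 1
    return counts


def dominant_locus(records):
    """Aggregation 1: the most frequent locus value per language."""
    return {
        lang: counter.most_common(1)[0][0]
        for lang, counter in locus_counts(records).items()
    }


def head_marking_share(records):
    """Aggregation 2: proportion of head-marked constructions per language."""
    return {
        lang: counter["head"] / sum(counter.values())
        for lang, counter in locus_counts(records).items()
    }


if __name__ == "__main__":
    print(dominant_locus(RECORDS))
    print(head_marking_share(RECORDS))
```

Because the rows are kept unaggregated, a categorical summary and a continuous one can both be computed from the same source without re-coding anything, which is the kind of reuse the design principle is meant to allow.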

References

Bickel, Balthasar & Johanna Nichols. 2002. Autotypologizing databases and their use in fieldwork. In Austin et al. (eds.), Proceedings of the International LREC Workshop on Resources and Tools in Field Linguistics, Las Palmas, 26–27 May 2002. Nijmegen: MPI for Psycholinguistics.

Bickel, Balthasar & Johanna Nichols. 2006. Oceania, the Pacific Rim, and the theory of linguistic areas. Proc. Berkeley Linguistics Society 32. 3–15.

Nichols, Johanna, Alena Witzlack-Makarevich & Balthasar Bickel. 2013. The AUTOTYP genealogy and geography database: 2013 release. Electronic database.

Schiering, René, Kristine Hildebrandt & Balthasar Bickel. 2012. Stress-timed = word-based? Testing a hypothesis in Prosodic Typology. Language Typology and Universals 65. 157–168.

Witzlack-Makarevich, Alena, Johanna Nichols, Kristine Hildebrandt, Taras Zakharko & Balthasar Bickel. 2022. Managing AUTOTYP Data: design principles and implementation. In Berez-Kroeker et al. (eds.), Open Handbook of Linguistic Data Management. Cambridge, MA: MIT Press.