Call for Papers: 39th Romanistiktag, Section "The Quantitative Turn: NLP and AI Methods in Romance Linguistics"
City: Konstanz
Deadline: 2024-12-31
URL: https://www.romanistiktag.de/xxxix-romanistiktag/sektionen/sektion-13/
In recent decades, linguistic research has shifted markedly towards empirical methods and, relatedly, towards mathematical formalisation and modelling, a development that has been described as a “Quantitative Turn” (Kortmann 2021). The shift is driven partly by criticism of earlier theoretical research that relied mostly on the subjective introspection of a few experts or speakers, and partly by easier access to quantitative or quantifiable data, such as crowdsourced data from judgement tasks run on platforms such as Amazon’s Mechanical Turk (Winter 2022) or social media data, for example from X (formerly Twitter), which help capture more spontaneous linguistic behaviour (Kellert et al. 2023).
However, crowdsourced data and naturally occurring speech are often unstructured and need pre-processing before they can be modelled quantitatively. Manual pre-processing of large amounts of natural language data is time-consuming and costly. Natural Language Processing (NLP) and Artificial Intelligence (AI) language models can help overcome these shortcomings, as they have been shown to perform well in handling large amounts of data. One example is Large Language Models (LLMs), which can process and generate coherent natural language on the basis of word embeddings and the transformer architecture (Vaswani et al. 2017). This combination makes it possible to numerically encode semantic relations between words in an embedding space without prior pre-processing (e.g., word2vec, Mikolov et al. 2013; BERT, Devlin et al. 2019), and to reach sophisticated understanding and generation of human language. These and other language models enable the use of NLP and AI methods in various linguistic fields and tasks, such as dialect variation and language change (Kellert & Zaman 2022), the development of syntactic parsers for unstructured data and semantic role labelling (Zhang et al. 2022), coreference resolution (Dobrovolskii 2021), text summarization and machine reading comprehension, among others.
However, the application of the newest NLP and AI methods focuses largely on widely used languages such as English and on more standardised language varieties, and neglects Romance languages and smaller language varieties (Kellert & Zaman 2023). As a result, datasets, tools, and methods are often not yet adapted for use in Romance Linguistics. The opportunity to use unstructured data and automated research pipelines to complement traditional linguistic methodologies is therefore missed.
In this workshop, we address the need to further explore data sources and data processing methods by means of NLP and AI in order to answer questions in Romance Linguistics.
Suggested topics and questions for contributors include:
• What are the challenges of unstructured data types for linguistic analysis and how can we address them?
• (How) Can Romance languages and Romance varieties benefit from the newest developments in NLP and AI?
• How can we ensure the accuracy and reliability of linguistic insights derived from large-scale social media data?
• How can interdisciplinary collaboration enhance the application of NLP and AI methods in Romance Linguistics?
• What are the implications of the Quantitative Turn in Linguistics for language policy and planning in Romance-speaking communities? How can the results influence decisions about language use, education, preservation, and other aspects of language policy?
• What are the emerging trends and challenges in NLP and/or AI specific to Romance languages, and how do they impact the development of NLP and AI models, techniques, and applications?
Submission Guidelines:
• Working languages: English, German, and Romance languages
• Abstract Length: 4000 characters, including spaces and bibliography
• Submission Deadline: December 31st, 2024
• Event Dates: September 22nd-25th, 2025
• Location: Konstanz, Germany
Please submit your abstract as a PDF attachment to iris.ferrazzo@uni-bonn.de. Include your name, affiliation, and contact information in the body of the email. We welcome submissions from researchers with diverse disciplinary backgrounds. Decisions on the selection of abstracts will be announced on January 31st, 2025.
Event Coordinators:
• Iris Ferrazzo (Universität Bonn / Bonner Center for Digital Humanities)
• Olga Kellert (Universität A Coruña / Universität Göttingen)
For any inquiries or further information, please contact iris.ferrazzo@uni-bonn.de.
Posted by: Iris Ferrazzo
Editor: Robert Hesselbach