Mapping Meaning in Latin with Large Language Models

Our PhD student Andrea Farina will present his latest research at CLiC-it 2025, the Eleventh Italian Conference on Computational Linguistics.

The paper “Mapping Meaning in Latin with Large Language Models”, co-authored with me and Barbara McGillivray, focuses on preverbed motion verbs (like exeo “exit” or ineo “enter”) and spatial relations, important linguistic features that encode movement, direction, and place in Latin texts. Using a custom-annotated corpus spanning the 3rd century BCE to the 2nd century CE, we evaluated GPT-4, Llama, and Mistral across three tasks:

  1. Identifying preverbed motion verbs
  2. Classifying spatial relations (Source, Path, Goal)
  3. Disambiguating spatial expressions (e.g., Roma vs. domus)
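To make the second task concrete, here is a minimal sketch of what a zero-shot evaluation pipeline for spatial relation classification might look like. Everything except the label set (Source, Path, Goal) and the example verb exeo is illustrative: the prompt wording, function names, and example sentence are assumptions, not the prompts or code used in the paper, and the actual model call is left out.

```python
from typing import Optional

# Label set for Task 2, as described in the paper announcement.
LABELS = ("Source", "Path", "Goal")


def build_prompt(sentence: str, verb: str) -> str:
    """Compose a hypothetical zero-shot classification prompt for an LLM.

    The wording here is illustrative only, not the authors' actual prompt.
    """
    return (
        "You are an expert in Latin linguistics. For the preverbed motion "
        f"verb '{verb}' in the sentence below, classify the spatial relation "
        f"it encodes as exactly one of: {', '.join(LABELS)}.\n"
        f"Sentence: {sentence}\n"
        "Answer with a single label."
    )


def parse_label(reply: str) -> Optional[str]:
    """Map a free-text model reply onto one of the allowed labels.

    Returns None when no known label appears, so such replies can be
    counted as errors during evaluation.
    """
    reply_lower = reply.lower()
    for label in LABELS:
        if label.lower() in reply_lower:
            return label
    return None


# Illustrative usage with an invented example sentence:
prompt = build_prompt("Caesar ex urbe exiit.", "exeo")
print(parse_label("The relation here is Source."))  # → Source
```

A robust-parsing step like `parse_label` matters in practice because LLMs often wrap the answer in extra prose; replies that match no label can then be scored as failures rather than crashing the evaluation.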

The results highlight both promise and challenges. GPT-4 consistently outperforms the open models, showing strong zero-shot performance, likely owing to Latin in its pretraining data. Yet even GPT-4 struggles with syntactic reasoning, especially linking proper nouns to the right verbs, while the open models frequently fail to generalise.

Beyond benchmarking, the study represents the first evaluation of spatial relation recognition in historical languages, offering valuable groundwork for future fine-tuning and domain-specific adaptation in computational humanities.

➡️ Reference: Farina, A., Ballatore, A., and McGillivray, B. (2025) Mapping Meaning in Latin with Large Language Models: A Multi-Task Evaluation of Preverbed Motion Verbs and Spatial Relation Detection in LLMs. CLiC-it 2025: Eleventh Italian Conference on Computational Linguistics, 24–26 September 2025, Cagliari, Italy.

👉 Below you can find both the poster and the article about this research:
