Esther Ploeger

About Me

I am a PhD student in the DKW group at the Department of Computer Science, Aalborg University, supervised by Johannes Bjerva (Aalborg University) and Robert Östling (Stockholm University). I work on combining methods and findings from linguistic typology with natural language processing (NLP). In particular, I am interested in (multilingual) machine translation. These are some questions that I have been working on recently:

I am passionate about diversity in computer science research, and I give workshops on machine translation to high school students. So far, more than 170 students have attended these workshops across multiple events.

News and Activities

  • (03/2025) Gave an invited talk (online) at the Cambridge NLIP Seminar Series
  • (03/2025) Presented our paper at NoDaLiDa/BalticHLT
  • (02/2025) Gave an invited talk at ITU, Copenhagen
  • (01/2025) Participated in the UniDive 3rd General Meeting
  • (01/2025) Gave a research talk at the Utrecht University NLP group
  • (12/2024) Gave a lightning talk at the AAU-NLP symposium
  • (11/2024) Wessel and I now lead Subtask 4.3 in WG4 of the EU COST Action UniDive
  • (11/2024) Gave MT workshops at Faglig Inspirationsdag and U-faktor day
  • (10/2024) Gave MT workshops at Science Day and Studiepraktik
  • (09/2024) Our paper was featured in a blog post
  • (08/2024) Gave a research talk at the LAGoM NLP group at KU Leuven
  • (06/2024) Presented a paper at the EAMT conference
  • (06/2024) Attended the Emerging Topics in Typology (ETT) conference
  • (05/2024) Presented our paper at the EACL conference
  • (05/2024) Gave a trial talk (online) at the University of Groningen
  • (04/2024) Gave a research seminar talk at the University of Helsinki
  • (03/2024) Our paper was featured on the MT highlights blog
  • (03/2024) Started my research visit at the University of Helsinki
  • (03/2024) Received external funding from Otto Mønsteds Fond for my stay abroad
  • (10/2023) Gave MT workshops at Science Day and Studiepraktik
  • (10/2023) Attended the Multi* workshop at ITU, Copenhagen
  • (07/2023) Attended the LxMLS summer school in Lisbon
  • (05/2023) Attended the EACL conference
  • (03/2023) Attended the HumanCLAIM workshop in Amsterdam

Selected Publications

What is "Typological Diversity" in NLP?

Esther Ploeger*, Wessel Poelman*, Miryam de Lhoneux, Johannes Bjerva

EMNLP 2024 [link to paper]

TL;DR In this work, we systematically investigate NLP research that includes claims regarding 'typological diversity'. We find there are no set definitions or criteria for such claims. We introduce metrics to approximate the diversity of language selection along several axes and find that the results vary considerably across papers.


Multilingual Gradient Word-Order Typology from Universal Dependencies

Emi Baylor*, Esther Ploeger*, Johannes Bjerva

EACL 2024 [link to paper]

TL;DR Discrete typological categorisations may differ significantly from the continuous nature of phenomena, as found in natural language corpora. In this paper, we introduce a new seed dataset made up of continuous-valued data, rather than categorical data, that may better reflect the variability of language.


A Principled Framework for Evaluating on Typologically Diverse Languages

Esther Ploeger, Wessel Poelman, Andreas Holck Høeg-Petersen, Anders Schlichtkrull, Miryam de Lhoneux, Johannes Bjerva

Pre-print [link to paper]

TL;DR We present a language sampling framework for selecting highly typologically diverse languages given a sampling frame, informed by language typology. We compare sampling methods with a range of metrics and find that our systematic methods consistently retrieve more typologically diverse language selections than previous methods in NLP.