Talks

This page lists invited talks that I have given at seminars. If you are interested in inviting me to speak at your seminar, please contact me at ayamaguchi1@sheffield.ac.uk.

Enhancing Linguistic Competence of Language Models through Pre-training with Language Learning Tasks

  • Abstract: Language models (LMs) are pre-trained on raw text datasets to generate text sequences token-by-token. While this approach facilitates the learning of world knowledge and reasoning, it does not explicitly optimise for linguistic competence. To bridge this gap, we propose L2T, a pre-training framework integrating Language Learning Tasks alongside standard next-token prediction. Inspired by human language acquisition, L2T transforms raw text into structured input-output pairs to provide explicit linguistic stimulation. Pre-training LMs on a mixture of raw text and L2T data not only improves overall performance on linguistic competence benchmarks but also accelerates the acquisition of linguistic competence, while maintaining competitive performance on general reasoning tasks. In this talk, we give an overview of our ACL 2026 paper, 'Enhancing Linguistic Competence of Language Models through Pre-training with Language Learning Tasks,' sharing both our key findings and how this research journey began and evolved.
  • Date: May 29, 2026
  • Venue: NLIP Seminar Series, University of Cambridge
  • Links: Slides, Paper

Cross-lingual Vocabulary Adaptation: Overview and Challenges

  • Abstract: Cross-lingual vocabulary adaptation (CVA) offers an efficient approach to cross-lingual transfer by modifying a source model's vocabulary and subsequently performing continual pre-training on target language data. This adaptation allows the model to effectively process target languages that may be underrepresented or absent in the source model's training data. In this talk, I will give an overview of CVA, including current state-of-the-art techniques and related studies, and discuss its associated challenges.
  • Date: April 14, 2025
  • Venue: Glasgow IR Seminar, University of Glasgow
  • Links: Slides

Updated on May 23, 2026