Gastronomic Named Entity Recognition in Spanish–English Texts
The gastronomic domain presents distinctive challenges for Natural Language Processing. Terms such as arepa de choclo, ají amarillo, or tacos al pastor carry cultural identity and semantic specificity that demand precise entity recognition before downstream tasks like machine translation or recipe recommendation can be performed reliably.
GastroCorp NER 2026 is a shared task over a bilingual (Spanish–English) gastronomic corpus. Participants must assign an IOB2 entity label to each token of a text drawn from restaurant menus or culinary recipes.
Universidad Tecnológica de Bolívar
Cartagena, Colombia.
Input: whitespace-tokenized text in HuggingFace NER JSONL format.
Output: one IOB2 label per token, drawn from the following tag set:
| Tag | Entity Type |
|---|---|
| DISH | Prepared food items |
| BEVERAGE | Drinks and final products |
| INGREDIENT | Individual components |
| BRAND | Commercial brands |
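
Under IOB2, each of the four entity types contributes a `B-` (begin) and an `I-` (inside) label, and `O` marks tokens outside any entity, which gives nine labels in total. The sketch below builds that inventory together with the label/id mappings most token-classification toolkits expect. The names `build_iob2_labels`, `LABEL2ID`, and `ID2LABEL`, as well as the id ordering, are illustrative assumptions and are not taken from the task repository.

```python
# Entity types from the tag set table above.
ENTITY_TYPES = ["DISH", "BEVERAGE", "INGREDIENT", "BRAND"]


def build_iob2_labels(entity_types):
    """Return the IOB2 label inventory: "O" plus B-/I- for every type."""
    labels = ["O"]
    for entity_type in entity_types:
        labels.append(f"B-{entity_type}")
        labels.append(f"I-{entity_type}")
    return labels


LABELS = build_iob2_labels(ENTITY_TYPES)

# Assumed ordering: "O" gets id 0; the official data may number labels differently.
LABEL2ID = {label: idx for idx, label in enumerate(LABELS)}
ID2LABEL = {idx: label for label, idx in LABEL2ID.items()}
```

If the released data encodes tags as integers, the mapping shipped with the corpus should take precedence over this one.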

Example annotation:

| Token | Tag |
|---|---|
| Pizza | B-DISH |
| Margarita | I-DISH |
| con | O |
| mozzarella | B-INGREDIENT |
| y | O |
| Heineken | B-BRAND |
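
To make the tagging scheme concrete, the following sketch groups an IOB2 tag sequence back into entity spans, using the example above as input. The function name `iob2_to_spans` is hypothetical, and the handling of a stray `I-` tag (one that does not continue an open span of the same type) is a design choice of this sketch, not a rule stated by the task.

```python
def iob2_to_spans(tokens, tags):
    """Group IOB2-tagged tokens into (entity_type, text) spans."""
    spans = []
    current_type, current_tokens = None, []

    def close_span():
        # Emit the open span, if any.
        if current_type is not None:
            spans.append((current_type, " ".join(current_tokens)))

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A B- tag always starts a new span, closing any open one.
            close_span()
            current_type, current_tokens = tag[2:], [token]
        elif tag.startswith("I-") and current_type == tag[2:]:
            # An I- tag of the same type extends the open span.
            current_tokens.append(token)
        else:
            # "O", or a stray I- tag: close the open span and skip the token.
            close_span()
            current_type, current_tokens = None, []
    close_span()
    return spans


tokens = ["Pizza", "Margarita", "con", "mozzarella", "y", "Heineken"]
tags = ["B-DISH", "I-DISH", "O", "B-INGREDIENT", "O", "B-BRAND"]
spans = iob2_to_spans(tokens, tags)
# spans == [("DISH", "Pizza Margarita"), ("INGREDIENT", "mozzarella"), ("BRAND", "Heineken")]
```

Note that "Pizza Margarita" is recovered as a single DISH entity because the second token carries `I-DISH` rather than a new `B-` tag.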

Key dates:

| Event | Date |
|---|---|
| Task published | March 23, 2026 |
| Challenge opens | April 5, 2026 |
| Development phase closes | April 25, 2026 |
| Evaluation phase opens | April 26, 2026 |
| Results announced | May 5, 2026 |
| Paper submission deadline | Second week of May 2026 |
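```bash
# Clone the repository and install the dependencies
git clone https://github.com/nalef-initiative/GastroCorpNER
cd gastrocorp-ner-2026
pip install -r requirements.txt

# Run the majority baseline on the menu development set
python baselines/baseline_majority.py \
    --train data/menu_train.jsonl \
    --dev data/menu_dev.jsonl \
    --out predictions/menu_dev_baseline.csv

# Score the baseline predictions against the gold labels
python evaluate.py \
    --gold data/menu_dev.jsonl \
    --pred predictions/menu_dev_baseline.csv
```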
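
The contents of `baselines/baseline_majority.py` are not shown here. As a rough illustration of what a majority-tag baseline typically does, the sketch below memorizes the most frequent tag seen for each training token and falls back to `O` for unseen tokens. It assumes each JSONL record has a `tokens` list and a parallel `ner_tags` list of string labels; the field names, the lowercasing, and the function names are assumptions of this sketch and may not match the official data or script.

```python
import json
from collections import Counter, defaultdict


def read_jsonl(lines):
    """Parse an iterable of JSONL lines into a list of records."""
    return [json.loads(line) for line in lines if line.strip()]


def train_majority(records):
    """Map each lowercased token to the tag it most often carries in training."""
    counts = defaultdict(Counter)
    for record in records:
        # Assumed schema: parallel "tokens" and "ner_tags" lists.
        for token, tag in zip(record["tokens"], record["ner_tags"]):
            counts[token.lower()][tag] += 1
    return {token: tag_counts.most_common(1)[0][0] for token, tag_counts in counts.items()}


def predict(model, tokens, default="O"):
    """Tag each token with its majority tag; unseen tokens get the default."""
    return [model.get(token.lower(), default) for token in tokens]


# Toy training data standing in for data/menu_train.jsonl.
train_lines = [
    '{"tokens": ["Pizza", "Margarita", "con", "mozzarella", "y", "Heineken"], '
    '"ner_tags": ["B-DISH", "I-DISH", "O", "B-INGREDIENT", "O", "B-BRAND"]}',
]
model = train_majority(read_jsonl(train_lines))
pred = predict(model, ["Pizza", "con", "Heineken", "fresca"])
# pred == ["B-DISH", "O", "B-BRAND", "O"]
```

Because such a baseline tags tokens independently, it can emit an `I-` tag without a preceding `B-` tag, which is one reason it serves only as a lower bound for sequence models.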
Participants are invited to submit system description papers (5–8 pages) in English using the CEURART format.
Copyright © 2026 for this paper by its authors. Use permitted under CC BY 4.0. NALEF 2026, Cartagena, Colombia.