Task 2: GastroCorp NER

Gastronomic Named Entity Recognition in Spanish–English Texts

Overview

The gastronomic domain presents distinctive challenges for Natural Language Processing. Terms such as arepa de choclo, ají amarillo, or tacos al pastor carry cultural identity and semantic specificity that demand precise entity recognition before downstream tasks like machine translation or recipe recommendation can be performed reliably.

GastroCorp NER 2026 is a shared task over a bilingual (Spanish–English) gastronomic corpus. Participants must assign an IOB2 entity label to each token of a restaurant menu or culinary recipe.

Organization

Universidad Tecnológica de Bolívar
Cartagena, Colombia.

Task Description

Input

Whitespace-tokenized text in Hugging Face NER JSONL format.
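
As a rough illustration of this input, the sketch below parses one JSONL record and prints its token/tag pairs. The field names "tokens" and "ner_tags", and the use of string tags rather than integer ids, are assumptions borrowed from common Hugging Face token-classification datasets; the task's actual schema may differ.

```python
import json

# Hypothetical record: the field names "tokens" and "ner_tags" are assumed,
# not confirmed by the task description.
SAMPLE_LINE = (
    '{"tokens": ["Pizza", "Margarita", "con", "mozzarella"], '
    '"ner_tags": ["B-DISH", "I-DISH", "O", "B-INGREDIENT"]}'
)


def parse_jsonl(lines):
    """Parse an iterable of JSONL lines into a list of dicts, skipping blank lines."""
    records = []
    for line in lines:
        line = line.strip()
        if line:
            records.append(json.loads(line))
    return records


records = parse_jsonl([SAMPLE_LINE, ""])
for token, tag in zip(records[0]["tokens"], records[0]["ner_tags"]):
    print(f"{token}\t{tag}")
```

To read a real file, the same function can be given an open file handle, since iterating over a text file yields one line at a time.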

Output

IOB2 labels from the following tag set:

Tag         Entity Type
DISH        Prepared food items
BEVERAGE    Drinks and final products
INGREDIENT  Individual components
BRAND       Commercial brands
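
Under IOB2, each of the four entity types expands into a B- (begin) and an I- (inside) label, plus a single O label for tokens outside any entity, for nine labels in total. The sketch below builds that label list and the id mappings most token-classification models expect. The label ordering and the id assignment are assumptions; the official data may use a different order.

```python
ENTITY_TYPES = ["DISH", "BEVERAGE", "INGREDIENT", "BRAND"]


def build_labels(entity_types):
    """Expand entity types into the full IOB2 label list, with "O" first."""
    labels = ["O"]
    for entity_type in entity_types:
        labels.append(f"B-{entity_type}")
        labels.append(f"I-{entity_type}")
    return labels


LABELS = build_labels(ENTITY_TYPES)
# Assumed id assignment: position in LABELS. The task may define its own.
LABEL2ID = {label: i for i, label in enumerate(LABELS)}
ID2LABEL = {i: label for label, i in LABEL2ID.items()}

print(LABELS)
```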

Illustrative Example

Token       Tag
Pizza       B-DISH
Margarita   I-DISH
con         O
mozzarella  B-INGREDIENT
y           O
Heineken    B-BRAND
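
The example above can be read as three entities: a two-token DISH, a one-token INGREDIENT, and a one-token BRAND. A minimal sketch of how IOB2 tags decode into entity spans follows. The function name iob2_to_spans is hypothetical, and silently dropping an I- tag that does not continue an open span is one possible choice, not a rule stated by the task.

```python
def iob2_to_spans(tags):
    """Decode IOB2 tags into (entity_type, start, end) spans, end exclusive."""
    spans = []
    current, start = None, None
    for i, tag in enumerate(tags):
        if tag.startswith("B-"):
            # A B- tag always opens a new span, closing any open one first.
            if current is not None:
                spans.append((current, start, i))
            current, start = tag[2:], i
        elif tag.startswith("I-") and current == tag[2:]:
            # Continues the currently open span of the same type.
            continue
        else:
            # "O", or an I- tag that does not continue the open span
            # (invalid in IOB2; dropped here as an assumption).
            if current is not None:
                spans.append((current, start, i))
            current, start = None, None
    if current is not None:
        spans.append((current, start, len(tags)))
    return spans


tokens = ["Pizza", "Margarita", "con", "mozzarella", "y", "Heineken"]
tags = ["B-DISH", "I-DISH", "O", "B-INGREDIENT", "O", "B-BRAND"]

spans = iob2_to_spans(tags)
for entity_type, start, end in spans:
    print(entity_type, " ".join(tokens[start:end]))
```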

Timeline

Event                      Date
Task published             March 23, 2026
Challenge opens            April 5, 2026
Development phase closes   April 25, 2026
Evaluation phase opens     April 26, 2026
Results announced          May 5, 2026
Paper submission deadline  Second week of May 2026

Quick Start

  1. Clone and Install:
    git clone https://github.com/nalef-initiative/GastroCorpNER
    cd GastroCorpNER
    pip install -r requirements.txt
  2. Run Majority Baseline:
    python baselines/baseline_majority.py \
        --train data/menu_train.jsonl \
        --dev   data/menu_dev.jsonl \
        --out   predictions/menu_dev_baseline.csv
  3. Evaluate Locally:
    python evaluate.py \
        --gold data/menu_dev.jsonl \
        --pred predictions/menu_dev_baseline.csv
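
The contents of baselines/baseline_majority.py are not shown here. As a point of reference, a majority baseline for sequence labeling commonly tags each token with the label it received most often in the training data and falls back to the globally most frequent label for unseen tokens. The sketch below illustrates that general idea only; the function names, the lowercasing of tokens, and the in-memory data are all assumptions, not the repository's code.

```python
from collections import Counter, defaultdict


def train_majority(sentences):
    """Learn the most frequent tag per token from (tokens, tags) pairs.

    Returns (lookup, fallback): a token -> tag dict and the globally
    most frequent tag, used for tokens never seen in training.
    """
    per_token = defaultdict(Counter)
    overall = Counter()
    for tokens, tags in sentences:
        for token, tag in zip(tokens, tags):
            per_token[token.lower()][tag] += 1  # lowercasing is an assumption
            overall[tag] += 1
    lookup = {tok: counts.most_common(1)[0][0] for tok, counts in per_token.items()}
    fallback = overall.most_common(1)[0][0]
    return lookup, fallback


def predict(tokens, lookup, fallback):
    """Tag each token with its majority tag, or the fallback if unseen."""
    return [lookup.get(token.lower(), fallback) for token in tokens]


# Toy training data, invented for illustration only.
train = [
    (["Pizza", "Margarita", "con", "mozzarella"],
     ["B-DISH", "I-DISH", "O", "B-INGREDIENT"]),
    (["Arepa", "con", "queso", "y", "Heineken"],
     ["B-DISH", "O", "B-INGREDIENT", "O", "B-BRAND"]),
]

lookup, fallback = train_majority(train)
predicted = predict(["Pizza", "con", "queso", "y", "limonada"], lookup, fallback)
print(predicted)
```

Because a per-token lookup ignores context, it can emit invalid IOB2 sequences (for example an I- tag with no preceding B- tag), which is one reason such a baseline is usually treated as a lower bound.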

Paper Submission

Participants are invited to submit system description papers (5–8 pages) in English using the CEURART format.

Copyright © 2026 for this paper by its authors.
Use permitted under CC BY 4.0.
NALEF 2026, Cartagena, Colombia.