VLSP 2025 Medical domain MT Challenge

Organized by thviet


VLSP 2025

VLSP 2025 CHALLENGE ON MEDICAL MACHINE TRANSLATION WITH LIMITED PARAMETERS AND RESOURCES USING PRE-TRAINED MODELS

Data Format

Input format: The parallel training, development, and public test sets are provided as UTF-8 plaintext files, 1-to-1 sentence aligned, one “sentence” per line. Note that a “sentence” here is not necessarily a linguistic sentence; it may be a phrase. The monolingual corpora are likewise provided as UTF-8 plaintext, one “sentence” per line, exactly as downloaded.

Output format: UTF-8, precomposed Unicode plaintext, one sentence per line. Participants may choose appropriate casing methods in the preprocessing steps: word segmentation, truecasing, lowercasing, or leaving the text as is. The tools available in the Moses git repository may be useful here.
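Since the output must be precomposed Unicode, a minimal preprocessing sketch (assuming lowercasing is the chosen casing method; the helper name is illustrative) could normalize each line to NFC:

```python
import unicodedata

def preprocess_line(line, lowercase=True):
    # Normalize to NFC so the output is precomposed Unicode, as required.
    line = unicodedata.normalize("NFC", line.strip())
    return line.lower() if lowercase else line
```

Applying this to every line of the output file keeps decomposed diacritics (common in Vietnamese text) from slipping into the submission.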

Private test format

Private test set

The private test contains two files: en.csv and vi.csv. These two files are not translations of each other. The ground truth containing the gold translations of both files is hidden.

Baseline example

An example of what the result file should look like is provided. It was created by prompting Qwen3-0.6B to translate, and is located in the same folder as the private test, named results.csv.
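As a sketch of how such a result file could be assembled (the helper name, and the assumption that both test files contain the same number of sentences, are illustrative and not part of the official tooling):

```python
import csv

def write_results(english_out, vietnamese_out, path="results.csv"):
    """Write two translation lists into the expected results.csv layout.

    english_out:    English translations of the sentences in vi.csv
    vietnamese_out: Vietnamese translations of the sentences in en.csv
    Row order must match the original input files.
    """
    with open(path, "w", encoding="utf-8", newline="") as f:
        writer = csv.writer(f)
        # Column names must be exactly "English" and "Vietnamese".
        writer.writerow(["English", "Vietnamese"])
        for en, vi in zip(english_out, vietnamese_out):
            writer.writerow([en, vi])
```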

Evaluation 

  • Final system ranking will be determined by human evaluation in combination with automatic scoring.

  • Participants are allowed to submit only constrained systems.

    • A constrained system is defined as a system trained exclusively on the data provided by the organizers.

    • Only constrained submissions will be officially evaluated and ranked.

  • AIhub will display SacreBLEU scores for submitted systems.

    • These scores contribute to the overall evaluation but do not determine the final ranking on their own.

    • The final official ranking may differ from the AIhub leaderboard because it also incorporates human evaluation.

  • Team ranking will be based on a combination of:

    1. Automatic evaluation (SacreBLEU)

    2. Human evaluation

1. Metadata (Required Information)

When submitting your system, each team must provide the following metadata through the AIhub submission form (or organizer form):

  • Team name

  • Method name

  • Method description (short summary + prompt used for inference)

  • Project URL → link to a Google Drive folder containing your final submission package (see Section 2).


2. Google Drive Project Folder (Final Submission)

The Project URL must point to a Google Drive folder that contains your full system for final evaluation.

This folder must include:

2.1 Docker Image

  • Self-contained: includes all models and dependencies.

  • Offline: no calls to online services or APIs.

2.2 Inference Script (Bash)

  • Sends an input text file to your Docker service.

  • Produces the corresponding translations in an output text file.

  • Accepts three arguments:

    1. host:port of the Docker service

    2. Input file path

    3. Output file path
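The required script itself is bash, but the contract it wraps can be sketched as a small client. The "/translate" endpoint and the JSON payload below are assumptions about a hypothetical service, not an official API; adapt them to whatever your Docker image actually exposes:

```python
import json
import sys
import urllib.request

def _http_fetch(hostport, line):
    # POST one source sentence to the service and return its translation.
    # The endpoint path and JSON schema are assumptions, not a fixed spec.
    req = urllib.request.Request(
        f"http://{hostport}/translate",
        data=json.dumps({"text": line}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["translation"]

def translate_file(hostport, in_path, out_path, fetch=_http_fetch):
    # One source sentence per input line, one translation per output line.
    with open(in_path, encoding="utf-8") as f:
        lines = [line.rstrip("\n") for line in f]
    translations = [fetch(hostport, line) for line in lines]
    with open(out_path, "w", encoding="utf-8") as f:
        f.write("\n".join(translations) + "\n")

if __name__ == "__main__":
    # Usage mirrors the required arguments: host:port, input file, output file.
    translate_file(sys.argv[1], sys.argv[2], sys.argv[3])
```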

2.3 Report

  • Description of your method (training data, approach, prompts).

  • Short overview of your system.

2.4 Packaging

  • Compress everything into a .zip before uploading to Google Drive.

  • Provide an MD5 checksum for integrity verification.

  • Ensure the Google Drive link has download permission enabled.
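The checksum can be produced with standard tools (e.g. md5sum on the .zip) or with a short script; this streaming sketch avoids loading the whole archive into memory:

```python
import hashlib

def md5_of(path, chunk_size=1 << 20):
    # Read the file in 1 MiB chunks so large zips need not fit in memory.
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()
```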

2.5 Optional Statistics

  • Total runtime.

  • Sentences per second, words per second.

✅ Multiple submissions are allowed, but only the last submission will be used for evaluation.


3. AIhub .zip Submission (Automatic BLEU Scoring)

In addition to the Project URL, each team must upload a .zip file to AIhub.
This .zip is used only for automatic SacreBLEU evaluation and leaderboard display.

Packaging

  • Submit one .zip file only.

  • Inside, include exactly one file named results.csv.

  • Do not include extra files.

results.csv format

  • Two columns only:

    • English → Translation of the original Vietnamese sentences.

    • Vietnamese → Translation of the original English sentences.

  • Column names must be exactly English and Vietnamese (case-sensitive, no extra spaces).

  • Row order must match the original input files exactly.

⚠️ Any deviation (wrong column names, extra files, row mismatch) will cause the evaluation script to fail with KeyError.
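A quick sanity check before uploading can catch exactly these failure modes. The function below is an illustrative sketch, not part of the official evaluation script; validate_results and expected_rows are hypothetical names:

```python
import csv

def validate_results(path, expected_rows):
    # Check the exact header and row count the evaluation script relies on.
    with open(path, encoding="utf-8", newline="") as f:
        reader = csv.reader(f)
        header = next(reader)
        if header != ["English", "Vietnamese"]:
            raise ValueError(f"bad header: {header!r}")
        n = sum(1 for _ in reader)
    if n != expected_rows:
        raise ValueError(f"expected {expected_rows} rows, found {n}")
    return True
```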


4. Evaluation

  • AIhub SacreBLEU scores (from results.csv) will be displayed on the leaderboard.

  • Final ranking will be determined by:

    • Human evaluation

    • Automatic scoring (SacreBLEU)

  • Only constrained systems (trained solely on the data provided by organizers) are eligible for final ranking.

Public Test

Start: July 17, 2025, midnight UTC

Private Test

Start: Aug. 17, 2025, midnight UTC

Competition Ends

Aug. 25, 2025, midnight UTC
