VLSP 2025 – viTempQA Date-arith Challenge - Vietnamese Temporal QA Date-Arith

Organized by toanlnhus

Phases
  • Public Test: July 10, 2025, midnight UTC
  • Private Test: Aug. 20, 2025, midnight UTC
  • Competition Ends: Aug. 31, 2025, midnight UTC

VLSP 2025 – viTempQA Date-Arith Challenge: Vietnamese Temporal Question Answering

Shared Task Registration Form

Important dates 
  • June 23, 2025: Registration open
  • July 6, 2025: Training data release
  • July 15, 2025: Public test release
  • August 20, 2025: System submission deadline
  • August 30, 2025: Private test results release
  • September 10, 2025: Technical report submission
  • September 27, 2025: Notification of acceptance
  • October 3, 2025: Camera-ready deadline
  • October 29-30, 2025: Conference dates
 
Task Description

Objective: Build a system that answers temporal questions in Vietnamese across sub-tasks including Date Arithmetic (date-arith) and Duration Question Answering (durationQA). The system must extract and reason about temporal information to give accurate answers about dates, durations, and temporal relationships.

  • Sub-Task 1: Date Arithmetic (date-arith)
    Description: The date-arith sub-task focuses on handling questions related to date calculations, such as adding or subtracting time intervals from a given date. This involves understanding and manipulating time expressions to compute answers based on the provided context.
    Focus: Parse and manipulate temporal expressions to compute new dates.
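The core calculation in this sub-task can be illustrated with a minimal sketch that shifts a (year, month) pair by a signed number of years and months. The function name `shift_month` and its parameters are hypothetical and are not part of the task's tooling; the sketch assumes month-level granularity only and ignores days.

```python
def shift_month(year: int, month: int,
                delta_years: int = 0, delta_months: int = 0) -> tuple[int, int]:
    """Shift a (year, month) pair by a signed number of years and months.

    Hypothetical helper for illustration; month is 1-based (1 = January).
    """
    # Convert to a single month count so the offset is one integer addition.
    total = year * 12 + (month - 1) + delta_years * 12 + delta_months
    # divmod floors toward negative infinity, so backward shifts across
    # a year boundary roll over correctly.
    new_year, month_index = divmod(total, 12)
    return new_year, month_index + 1


# 1 year and 2 months before June 1297
print(shift_month(1297, 6, delta_years=-1, delta_months=-2))  # (1296, 4)
```

Working on a flat month count avoids special-casing year boundaries, which is where hand-written borrow and carry logic tends to go wrong.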
 
Evaluation

System performance will be evaluated using a range of standard metrics, including Accuracy, Exact Match, Precision, Recall, and F1-score:

Evaluation Metrics

  • Accuracy: Used for Sub-Task 1 (Date Arithmetic). It is the percentage of system answers that exactly match the ground-truth answers.
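As a rough sketch of how the accuracy metric could be computed, the snippet below compares each predicted answer list with its ground-truth list. The function name `accuracy` is hypothetical, and the normalization (stripping surrounding whitespace, then strict equality) is an assumption; the official scorer may normalize answers differently.

```python
def accuracy(predictions: list[list[str]], references: list[list[str]]) -> float:
    """Fraction of predictions that exactly match the ground truth.

    Assumption: each answer is a list of strings, and two answers match
    only if the lists are equal after stripping surrounding whitespace.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    correct = sum(
        [p.strip() for p in pred] == [r.strip() for r in ref]
        for pred, ref in zip(predictions, references)
    )
    return correct / len(references)


gold = [["Tháng 4, 1296"]]
print(accuracy([["Tháng 4, 1296"]], gold))  # 1.0
print(accuracy([["Tháng 5, 1296"]], gold))  # 0.0
```

The function returns a fraction in [0, 1]; multiply by 100 to report a percentage.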

Evaluation is performed separately for each sub-task. The final evaluation report includes individual scores as well as aggregate performance across all tasks.

Example for Sub-Task 1: Date Arithmetic

Input:

{
    "question": "Thời gian 1 năm và 2 tháng trước tháng 6, 1297 là khi nào?",
    "context": "",
    "answer": ["Tháng 4, 1296"]
}

System Prediction:

["Tháng 4, 1296"]

  • Accuracy: The prediction matches the ground-truth exactly.
    Accuracy = 1.0
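The example question asks, in English, "What is the time 1 year and 2 months before June 1297?", and the expected answer is "April 1296". A minimal rule-based sketch for this single phrasing is shown below. The regular expression, the function name `answer_question`, and the handling of "sau" (after) alongside "trước" (before) are assumptions made for illustration; real test questions will likely use other phrasings that this pattern does not cover.

```python
import re

# Hypothetical pattern covering only questions shaped like the example:
# "<N> năm và <M> tháng trước|sau tháng <month>, <year>"
PATTERN = re.compile(r"(\d+) năm và (\d+) tháng (trước|sau) tháng (\d+), (\d+)")


def answer_question(question: str) -> list[str]:
    """Return the answer list for a supported question, or [] if unsupported."""
    match = PATTERN.search(question)
    if match is None:
        return []
    years, months, direction, month, year = match.groups()
    # "trước" means before (subtract); "sau" means after (add).
    sign = -1 if direction == "trước" else 1
    total = int(year) * 12 + (int(month) - 1) + sign * (int(years) * 12 + int(months))
    new_year, month_index = divmod(total, 12)
    # Mirror the answer format used in the example record.
    return [f"Tháng {month_index + 1}, {new_year}"]


question = "Thời gian 1 năm và 2 tháng trước tháng 6, 1297 là khi nào?"
print(answer_question(question))  # ['Tháng 4, 1296']
```

Returning an empty list for unsupported questions keeps the failure visible, so a fallback system (for example, a language model) can handle those cases.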

 

