Task Description
Machine Reading Comprehension (MRC) has recently emerged as an area of computational linguistics (CL) in which automatic systems are developed to find the correct answers to questions posed in natural language, given documents containing those answers. The Vietnamese Machine Reading Comprehension task is extraction-based reading comprehension over texts from Vietnamese Wikipedia. Based on SQuAD [1, 2], we developed the Vietnamese Question Answering Dataset (UIT-ViQuAD), a reading comprehension dataset consisting of questions posed by crowd-workers on a set of Vietnamese Wikipedia articles. The answer to each question is a span of text from the corresponding reading passage, or the question may be unanswerable.
UIT-ViQuAD2.0 combines the 23K questions in UIT-ViQuAD 1.0 [3] with over 12K unanswerable questions written adversarially by crowd-workers to look similar to answerable ones. To do well on UIT-ViQuAD 2.0, MRC systems must not only answer questions when possible but also determine when no answer is supported by the context and abstain from answering. In this task, participating teams use UIT-ViQuAD2.0 to evaluate machine reading comprehension models.
UIT-ViQuAD 1.0, the previous version of the UIT-ViQuAD dataset [3], contains 23K+ question-answer pairs on 170+ articles.
IMPORTANT: Before submitting to the system, you must rename your submission file to results.json and compress it as a zip file named results.zip
NOTE: In the Post Evaluation phase, please click the "Make your submission public" button (the red button shown after your submission has finished) to send your submission result to the public leaderboard
Publication: Please cite this paper if you use this dataset for research purposes:
Kiet Van Nguyen, Son Quoc Tran, Luan Thanh Nguyen, Tin Van Huynh, Son T. Luu, and Ngan Luu-Thuy Nguyen. 2021. VLSP 2021 - ViMRC Challenge: Vietnamese machine reading comprehension. In Proceedings of the 8th International Workshop on Vietnamese Language and Speech Processing (VLSP 2021)
Link to the publication paper: https://arxiv.org/abs/2203.11400
Evaluation Metrics
Following the evaluation metrics on SQuAD2.0 [2], we use EM and F1-score as evaluation metrics for Vietnamese machine reading comprehension:
Precision=(the number of matched tokens)/(the total number of tokens in the predicted answer)
Recall=(the number of matched tokens)/(the total number of tokens in the gold standard answer)
F1-score=(2*Precision*Recall)/(Precision+Recall)
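As a rough illustration of the formulas above, the following Python sketch computes EM and token-level F1 for a single prediction. It assumes lowercasing and whitespace tokenization only, and it treats an empty string as the "no answer" prediction; the function names are hypothetical, and the official evaluation script may normalize text differently.

```python
from collections import Counter


def tokens(text: str) -> list[str]:
    # Assumption: lowercase + whitespace split; the official script may differ.
    return text.lower().split()


def exact_match(prediction: str, gold: str) -> float:
    # 1.0 when the token sequences are identical, else 0.0.
    return float(tokens(prediction) == tokens(gold))


def f1_score(prediction: str, gold: str) -> float:
    pred, ref = tokens(prediction), tokens(gold)
    # Unanswerable case: full credit only if both sides are empty.
    if not pred or not ref:
        return float(pred == ref)
    # Matched tokens = size of the multiset intersection.
    matched = sum((Counter(pred) & Counter(ref)).values())
    if matched == 0:
        return 0.0
    precision = matched / len(pred)
    recall = matched / len(ref)
    return 2 * precision * recall / (precision + recall)
```

For example, predicting "Paris là thủ đô" against the gold answer "Paris là thủ đô Cộng hoà Pháp" matches 4 of 4 predicted tokens and 4 of 7 gold tokens, so Precision = 1, Recall = 4/7, and F1 = 8/11.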
The final ranking is evaluated on the test set according to the F1-score. The results are rounded to 3 decimal places. If two teams have the same F1-score, the EM score is used as a secondary metric to determine which team ranks higher.
The task's evaluation script: https://drive.google.com/file/d/1vn6Aed4nacSD932YezQgvWNIOx_1PCb4/view?usp=sharing
Dataset Information
We provide UIT-ViQuAD2.0 consisting of over 35K questions to participating teams. The dataset is stored in .json format. Here are a few question examples extracted from the dataset.
Context: Khác với nhiều ngôn ngữ Ấn-Âu khác, tiếng Anh đã gần như loại bỏ hệ thống biến tố dựa trên cách để thay bằng cấu trúc phân tích. Đại từ nhân xưng duy trì hệ thống cách hoàn chỉnh hơn những lớp từ khác. Tiếng Anh có bảy lớp từ chính: động từ, danh từ, tính từ, trạng từ, hạn định từ (tức mạo từ), giới từ, và liên từ. Có thể tách đại từ khỏi danh từ, và thêm vào thán từ.
question : Tiếng Anh có bao nhiêu loại từ?
is_impossible : False. // There exists an answer to the question.
answer : bảy.
-----------------
question : Ngôn ngữ Ấn-Âu có bao nhiêu loại từ?
is_impossible : True. // No correct answer can be extracted from the Context.
plausible_answer : bảy. // A plausible but incorrect answer extracted from the Context, of the same type as the answer the question asks for.
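The two examples above can be read as follows in a short Python sketch. The flat dictionary layout and the helper name expected_answer are assumptions made for illustration (the actual .json files may nest these fields differently), as is the convention, borrowed from SQuAD 2.0, that abstaining is expressed as an empty string.

```python
def expected_answer(example: dict) -> str:
    # Assumption: an unanswerable question expects an empty answer.
    # plausible_answer is a distractor, not a gold answer.
    if example["is_impossible"]:
        return ""
    return example["answer"]


# Hypothetical flat records mirroring the two examples above.
examples = [
    {
        "question": "Tiếng Anh có bao nhiêu loại từ?",
        "is_impossible": False,
        "answer": "bảy",
    },
    {
        "question": "Ngôn ngữ Ấn-Âu có bao nhiêu loại từ?",
        "is_impossible": True,
        "plausible_answer": "bảy",
    },
]

gold = [expected_answer(e) for e in examples]
```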
Note: All data are transferred to participating teams via email.
Terms:
Submission guidelines:
The submission file must be in JSON format and must be named results.json.
The JSON content is structured as:
{
"<id_of_question>": "answer text",
…..
}
Here is an example of JSON format for the submission file:
{
"uit_034_35": "Paris là kinh đô ánh sáng",
"uit_035_57": "",
"uit_037_12": "Paris là thủ đô Cộng hoà Pháp",
…...
}
Before submitting to the system, the submission file must be compressed as a zip file named results.zip
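The packaging steps above could be scripted roughly as follows. This is a minimal sketch using only the Python standard library; the function name write_submission and its parameters are hypothetical, and placing results.json at the top level of the archive is an assumption about what the submission system expects.

```python
import json
import zipfile
from pathlib import Path


def write_submission(predictions: dict[str, str], out_dir: str = ".") -> Path:
    out = Path(out_dir)
    json_path = out / "results.json"
    zip_path = out / "results.zip"
    # ensure_ascii=False keeps Vietnamese characters readable in the file.
    json_path.write_text(
        json.dumps(predictions, ensure_ascii=False, indent=2),
        encoding="utf-8",
    )
    # Assumption: results.json sits at the top level of the archive.
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.write(json_path, arcname="results.json")
    return zip_path
```

Calling write_submission({"uit_034_35": "Paris là kinh đô ánh sáng", "uit_035_57": ""}) writes results.json and results.zip to the current directory, with an empty string for the question left unanswered.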
[1] Pranav Rajpurkar, Jian Zhang, Konstantin Lopyrev, and Percy Liang. "SQuAD: 100,000+ Questions for Machine Comprehension of Text." Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing. 2016.
[2] Pranav Rajpurkar, Robin Jia, and Percy Liang. "Know What You Don’t Know: Unanswerable Questions for SQuAD." Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers). 2018.
[3] Kiet Van Nguyen, Duc-Vu Nguyen, Anh Gia-Tuan Nguyen, Ngan Luu-Thuy Nguyen. "A Vietnamese Dataset for Evaluating Machine Reading Comprehension." Proceedings of the 28th International Conference on Computational Linguistics. 2020.
Start: Sept. 30, 2021, 5 p.m.
Start: Oct. 4, 2021, 5 p.m.
Description: Please name your team
Start: Oct. 24, 2021, 5 p.m.
Start: Oct. 15, 2021, 5 p.m.
Description: Please choose the button "Make your submission public" (the red button after your submission has finished) to send your submission result to the public leaderboard
Oct. 27, 2021, 4:59 p.m.