The paper titled "Machine Reading Comprehension for Vietnamese Customer Reviews: Task, Corpus And Baseline Models" has been accepted for presentation at the 37th Pacific Asia Conference on Language, Information and Computation (PACLIC 37).
Authors:
Đỗ Phạm Phúc Tính – 20522020 – KHDL2020 – Co-author
Cao Đình Duy Ngọc – 20521661 – KHDL2020 – Co-author
Nguyễn Thành Nhân – 20521701 – KHDL2020 – Co-author
Supervisors:
MSc Nguyễn Văn Kiệt
MSc Huỳnh Văn Tín
Paper Summary:
Customers spend significant time researching product information before making a purchase. Machine Reading Comprehension (MRC) on customer reviews can partially address this issue. However, Vietnamese lacks benchmark corpora in the review domain, which makes it hard to build and evaluate MRC systems for this task. To fill this gap, the team proposed ViRe4MRC, the first benchmark corpus for evaluating machine reading comprehension on Vietnamese customer reviews. The corpus comprises 6,603 human-generated question-answer pairs drawn from 2,174 customer reviews in the smartphone and restaurant domains. The team also evaluated monolingual language models (ViBERT, PhoBERT, and vELECTRA) and multilingual language models (mBERT and XLM-RoBERTa, or XLM-R). The best model, XLM-R-Large, achieved 44.25% Exact Match (EM) and 78.13% F1. The corpus is available for research purposes.
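For readers unfamiliar with the two reported metrics, the sketch below shows how Exact Match and token-level F1 are commonly computed for span-extraction MRC. It is an illustration only, not the paper's evaluation script: the function names are hypothetical, and the normalization (lowercasing plus whitespace tokenization) is an assumption that may differ from what the authors used.

```python
from collections import Counter


def normalize(text: str) -> list[str]:
    # Assumption: lowercase and split on whitespace only.
    # The paper's evaluation may normalize answers differently.
    return text.lower().split()


def exact_match(prediction: str, reference: str) -> float:
    # 1.0 only when the normalized token sequences are identical.
    return float(normalize(prediction) == normalize(reference))


def f1_score(prediction: str, reference: str) -> float:
    pred_tokens = normalize(prediction)
    ref_tokens = normalize(reference)
    # If either side is empty, score 1.0 only when both are empty.
    if not pred_tokens or not ref_tokens:
        return float(pred_tokens == ref_tokens)
    # Count tokens shared by prediction and reference (with multiplicity).
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # A partially correct answer earns F1 credit but no EM credit.
    print(exact_match("pin rất tốt", "pin tốt"))
    print(f1_score("pin rất tốt", "pin tốt"))
```

Under these assumptions, a prediction that overlaps the reference answer without matching it exactly scores zero on EM but still earns partial F1, which is one reason an F1 score can sit well above the EM score for the same model.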
The team extends its gratitude to Mr. Nguyễn Văn Kiệt and Mr. Huỳnh Văn Tín for their guidance and support throughout the research and publication process. The team also thanks Mr. Vũ Quí San, a former student of the KHTN2018 class, for his advice and assistance.
The 37th Pacific Asia Conference on Language, Information and Computation (PACLIC 37) is a prestigious international conference in the field of theoretical analysis and natural language processing. Since 1982, the PACLIC conference series has provided a forum for researchers across language research fields in the Asia-Pacific region to share discoveries and insights in formal and experimental language research. In 2023, the main PACLIC 37 conference will take place from December 2 to 4 at The Hong Kong Polytechnic University. Previous PACLIC proceedings have been indexed in Scopus (since PACLIC 19 in 2005) and listed in the ACL Anthology.
Detailed Information: https://www.facebook.com/UIT.Fanpage/posts/pfbid07vowaL94RN41ECjJTKSkWPKgC3Lp27Xx9ivaVb56BAYtNnuurHCZCkk5eiuN6pRMl
Hạ Băng - Collaborator, University of Information Technology Communications Team
English version: Phan Huy Hoang