Title of the Paper: "Domain Adaptation in Nested Named Entity Recognition From Scientific Articles in Agriculture"
Student Authors:
- Phan Doãn Thái Bình – 20520043 – KHTN2020 – Main Author
- Lê Phước Vĩnh Linh – 20521531 – KHTN2020 – Main Author
Supervisors:
- Dr. Ngô Quốc Hưng
- Dr. Lương Ngọc Hoàng
Abstract:
In digital agriculture, timely and informed decision-making relies on agricultural data, including text sources such as scientific articles. Named Entity Recognition (NER), and Agricultural Entity Recognition (AGER) in particular, plays a crucial role in semantic understanding by precisely identifying and categorizing farming components. However, existing approaches to agricultural entity recognition are held back by limited resources, and the complexity of the agricultural domain makes it necessary to recognize nested named entities, where one entity is contained inside another.
This study introduces the SAGRI dataset, which incorporates a novel tagset for AGER that was systematically established through annotation. The tagset enables the extraction of domain-independent concepts from scientific article abstracts. The study also presents a state-of-the-art deep learning baseline that uses a Triaffine attention mechanism for robust entity extraction. In addition, it introduces a few-shot learning strategy for cross-domain categorization that is particularly effective when training data is scarce, achieving higher F1 scores than the baseline.
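To make the ideas of nested entities and F1 scoring more concrete, here is a minimal Python sketch. It is not the paper's Triaffine model and does not use the SAGRI tagset: the example sentence, the labels `Crop` and `Disease`, and the helper names `Span`, `is_nested`, and `span_f1` are all assumptions chosen for illustration. It assumes entities are stored as token spans and scored by exact match of boundaries and label, which is a common convention but may differ from the evaluation used in the paper.

```python
from typing import NamedTuple


class Span(NamedTuple):
    """A labeled entity span over tokens (hypothetical representation)."""
    start: int   # index of the first token (inclusive)
    end: int     # index one past the last token (exclusive)
    label: str   # entity type; labels here are illustrative only


def is_nested(inner: Span, outer: Span) -> bool:
    """Return True if `inner` lies within `outer` and covers a different extent."""
    same_extent = (inner.start, inner.end) == (outer.start, outer.end)
    return (not same_extent
            and outer.start <= inner.start
            and inner.end <= outer.end)


def span_f1(gold: list[Span], pred: list[Span]) -> float:
    """Exact-match F1: a prediction counts only if start, end, and label all match."""
    gold_set, pred_set = set(gold), set(pred)
    true_positives = len(gold_set & pred_set)
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Illustrative sentence: "Rice" (a crop) is nested inside "Rice blast disease".
tokens = ["Rice", "blast", "disease", "reduces", "yield"]
gold = [Span(0, 3, "Disease"), Span(0, 1, "Crop")]

# A flat tagger that can only output the outer entity misses the inner one.
pred = [Span(0, 3, "Disease")]

print(is_nested(gold[1], gold[0]))
print(round(span_f1(gold, pred), 3))
```

In this toy case the prediction is fully precise but recovers only one of the two gold entities, so recall, and with it F1, drops. This is the kind of loss that motivates models able to output overlapping spans.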
The students express their gratitude to Dr. Ngô Quốc Hưng and Dr. Lương Ngọc Hoàng for their dedicated guidance and for pointing out the limitations of the research throughout the process of carrying out and publishing this international scientific paper.
SOICT 2023 is an international symposium covering a range of research areas in computer science. It aims to give researchers and students an academic platform to share their latest findings and identify future challenges. The symposium will be held in Ho Chi Minh City on December 7-8, 2023, and its proceedings will be published in the ACM Digital Library. SOICT 2023 is organized by several institutions, including the School of Information and Communication Technology, Hanoi University of Science and Technology, and Hochiminh University of Science – Vietnam National University.
Further information: https://www.facebook.com/UIT.Fanpage/posts/pfbid09pMH91yccUKC5LR8w72kpTe...
Written by: Hai Bang
Translated by: Ngoc Diem