Article: "A Scalable Hate Speech Detection System for Vietnamese Social Media Using Real-Time Big Data Processing and Distributed Deep Learning"
Article Link: https://drive.google.com/drive/folders/1U4pfIkZvDOD3rLXZkW...
Authors:
- Dinh Van Co – 19521293 – KHDL2019: Main author.
- Nguyen Thi Mai Phuong – 19522064 – KHDL2019: Main author.
- Vo Tran Dai – 19521308 – KHDL2019: Main author.
Supervising Instructor:
- Dr. Do Trong Hop
Abstract:
This study presents a system that detects hateful and offensive social network comments in real time using big data and distributed deep learning technology. In the offline phase, state-of-the-art deep learning models are trained in a distributed manner using the BigDL library. The trained models are then integrated into the real-time big data processing component powered by Apache Spark, a big data framework capable of processing a huge number of comments in real time. In the online phase, a continuous stream of comments from Facebook is crawled and channeled through Kafka to this real-time processing component, which outputs hate speech detection results. These results are then analyzed, and the statistical data is displayed in a web app powered by Flask. This work therefore focuses not only on accuracy but also on the system’s practicality. Thanks to state-of-the-art deep learning models, the system can achieve high accuracy in hate speech detection. With the deployed big data technology, the system can collect and process huge amounts of Facebook comments and produce statistical results in real time.
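The online phase described in the abstract (a comment stream, batch-wise classification, aggregated statistics) can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: an in-memory queue stands in for Kafka, a keyword rule stands in for the trained deep learning model, and a counter stands in for the statistics shown in the web app. The label names, the function names, and the batch size are all assumptions made for illustration.

```python
from collections import Counter
from queue import Queue
from typing import Callable, Iterable, Iterator, List


def toy_classifier(comment: str) -> str:
    """Placeholder for the trained model (hypothetical keyword rule, not the authors' model)."""
    text = comment.lower()
    if "hate" in text:
        return "hate"
    if "stupid" in text:
        return "offensive"
    return "clean"


def micro_batches(stream: Queue, batch_size: int) -> Iterator[List[str]]:
    """Drain the queue in fixed-size batches, loosely mimicking micro-batch stream processing."""
    batch: List[str] = []
    while not stream.empty():
        batch.append(stream.get())
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # emit the final, possibly smaller, batch
        yield batch


def run_pipeline(
    comments: Iterable[str],
    classify: Callable[[str], str],
    batch_size: int = 2,
) -> Counter:
    """Push comments into the queue, classify them batch by batch, and count the labels."""
    stream: Queue = Queue()  # in-memory stand-in for the Kafka topic
    for comment in comments:
        stream.put(comment)

    stats: Counter = Counter()  # stand-in for the statistics shown in the web app
    for batch in micro_batches(stream, batch_size):
        stats.update(classify(comment) for comment in batch)
    return stats


if __name__ == "__main__":
    sample = [
        "Have a nice day",
        "You are stupid",
        "I hate this group",
        "Great post",
        "so stupid",
    ]
    print(dict(run_pipeline(sample, toy_classifier)))
```

In a real deployment each stand-in would be replaced by the corresponding component named in the abstract; the sketch only shows how the stages hand data to one another.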
"We would like to express our sincere gratitude to Dr. Do Trong Hop, who has consistently provided strong support, advice, and guidance throughout our study and the completion of this research."
The International Conference on Advanced Technologies for Communications is an annual conference series, held since 2008 and co-organized by the Radio & Electronics Association of Vietnam (REV) and the IEEE Communications Society (IEEE ComSoc). The series has two goals: to foster an international forum for scientific and technological exchange among Vietnamese and international scientists and engineers in electronics, communications, and related areas, and to gather their high-quality research contributions.
For more detailed information, please visit the university's fan page: Trường Đại học Công nghệ Thông tin.
Dong Xanh - Media Collaborator, University of Information Technology
Nhat Hien - Translation Collaborator, University of Information Technology