Paper: "Gendec: A Machine Learning-based Framework for Gender Detection from Japanese Names"
Paper link: Gendec: A Machine Learning-based Framework for Gender Detection from Japanese Names
Student involved:
- Pham Tien Duong – 20521222 – Data Science 2020 – Main author
Supervisor: MSc. Nguyen Thanh Luan
Summary of the paper:
Every person has a name, a fundamental part of their identity and cultural heritage. A name can convey a wealth of information, including details about an individual’s background, ethnicity, and, especially, gender. By detecting gender from names, researchers can gain valuable insights into linguistic patterns and cultural norms, which can in turn support practical applications. In this paper, we release a novel large-scale dataset for Japanese name gender detection comprising 64,139 full names in romaji, hiragana, and kanji forms, along with their biological genders. Moreover, we propose Gendec, a machine learning-based framework for gender detection from Japanese names that leverages diverse approaches, including traditional machine learning techniques and state-of-the-art transfer learning approaches, to accurately predict the gender associated with Japanese names. Through a thorough investigation, the proposed framework is expected to be effective and serve potential applications in various domains.
ISDA is a reputable international conference, held annually, that focuses on research in Intelligent Systems and their real-world applications. The main theme of ISDA’23 is the real-world application of large language models and generation.
For more details, visit: https://www.facebook.com/UIT.Fanpage/posts/pfbid025YnynHGuk1YQB52YBoFrGxkdg...
Hai Bang - Media Collaborator, University of Information Technology
Nhat Hien - Translation Collaborator, University of Information Technology