Thesis Title: Development of pattern mining algorithms on quantitative databases
[Ph.D. Thesis in Computer Science]
Author: Bui Danh Huong
Abstract:
The dissertation surveys weighted frequent pattern mining on quantitative databases and studies its theoretical basis. Building on this, the thesis proposes new, efficient methods that address current challenges in weighted pattern mining, such as excessively large result sets, redundant rules, user-directed mining, and real-time mining. The thesis focuses on four specific problems: weighted frequent pattern mining, weighted closed frequent pattern mining, top-rank-k weighted frequent pattern mining, and weighted frequent pattern mining on data streams. For each of these problems, the proposed algorithms outperform existing algorithms in runtime, memory usage, and scalability.
The research results of the dissertation have been published in four SCIE journals: Expert Systems with Applications (Q1, 2018), Applied Intelligence (Q2, 2020), Knowledge-Based Systems (Q1, 2020), and IEEE Access (Q1, 2021). They were also presented at three domestic and international conferences (SMC-2016, FAIR-2016, and @-2017).
NOVEL CONTRIBUTIONS OF THE THESIS
The scientific contributions of the dissertation include:
- Proposal of the WN-tree structure and the WN-list data structure, leading to the NFWI algorithm for efficient weighted frequent pattern mining. The WN-list structure has several advantages: its intersection operation has linear complexity, it is self-pruning, and the weighted support of a pattern can be computed quickly from the pattern's WN-list.
- Proposal of [...] based on the ancestor relationship of WN-lists, leading to the NFWCI algorithm for efficient weighted closed frequent pattern mining based on the WN-list structure.
- Proposal of the TFWIN+ algorithm for efficient top-rank-k weighted frequent pattern mining, based on the WN-list structure combined with early threshold raising and branch pruning strategies.
- Proposal of the SWN-tree structure, an improvement on the WN-tree structure, for effective storage and maintenance of data-window information as the window slides over a data stream. This leads to the FWPODS algorithm for efficient weighted frequent pattern mining on data streams under the sliding window model.
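To make the problems named above concrete, the following is a minimal brute-force sketch in Python. It is not the NFWI or TFWIN+ algorithm and does not use WN-trees or WN-lists. It only illustrates what "weighted support" and "top-rank-k" can mean on a toy database. The definitions are assumptions that this announcement does not state: the weight of a transaction is taken as the mean weight of its items, the weighted support of an itemset is the share of total transaction weight carried by the transactions containing it, and top-rank-k keeps the itemsets whose weighted support is among the k highest distinct values. All names and data are hypothetical.

```python
from itertools import combinations

# Hypothetical toy data (not taken from the thesis).
ITEM_WEIGHTS = {"A": 0.6, "B": 0.1, "C": 0.3, "D": 0.9}
TRANSACTIONS = [
    {"A", "B", "D"},
    {"A", "C"},
    {"B", "C", "D"},
    {"A", "D"},
]


def transaction_weight(transaction, weights):
    """Assumed definition: mean weight of the items in the transaction."""
    return sum(weights[item] for item in transaction) / len(transaction)


def weighted_support(itemset, transactions, weights):
    """Assumed definition: weight of transactions containing the itemset,
    divided by the weight of all transactions."""
    total = sum(transaction_weight(t, weights) for t in transactions)
    covered = sum(
        transaction_weight(t, weights) for t in transactions if itemset <= t
    )
    return covered / total


def frequent_weighted_itemsets(transactions, weights, min_ws):
    """Brute-force enumeration of every itemset with weighted support >= min_ws.
    Exponential in the number of items, so suitable for toy data only."""
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            ws = weighted_support(frozenset(combo), transactions, weights)
            if ws >= min_ws:
                result[combo] = ws
    return result


def top_rank_k(transactions, weights, k):
    """Assumed definition: itemsets whose weighted support is one of the
    k highest distinct values that occur in the database."""
    candidates = {
        itemset: ws
        for itemset, ws in frequent_weighted_itemsets(
            transactions, weights, min_ws=0.0
        ).items()
        if ws > 0  # drop itemsets that occur in no transaction
    }
    kept_values = sorted(set(candidates.values()), reverse=True)[:k]
    return {s: ws for s, ws in candidates.items() if ws in kept_values}


if __name__ == "__main__":
    print(frequent_weighted_itemsets(TRANSACTIONS, ITEM_WEIGHTS, min_ws=0.5))
    print(top_rank_k(TRANSACTIONS, ITEM_WEIGHTS, k=2))
```

The sketch rescans every transaction for every candidate itemset. Avoiding that cost is the kind of problem compact structures such as the WN-list are meant to address.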
APPLICATIONS/ POTENTIAL APPLICATIONS IN PRACTICE OR OPEN ISSUES NEEDING FURTHER RESEARCH
Future research will address further pattern mining problems on weighted databases, such as mining maximal weighted itemsets, mining weighted closed frequent patterns on growing databases, mining weighted frequent patterns on uncertain databases, and deploying weighted frequent pattern mining on multi-core and distributed systems. We will also investigate applications that build on weighted pattern mining, such as graph mining, social network mining, text mining, and IoT data mining.
Interested readers are invited to visit the Library to read the hard copy or access the full text remotely at the following address:
https://ir.vnulib.edu.vn/handle/VNUHCM/15989
For any inquiries regarding access accounts, please contact the Library by email: thuvien@uit.edu.vn
For more details, please refer to: https://www.facebook.com/LibUIT.Fanpage/posts/pfbid02CFrU9pEpP3iVrnYyjXX...
Written by: Ha Bang
Translated by: Ngoc Diem