Title: Enhancing the Efficiency of Closed Pattern Mining Approach
[Ph.D. Dissertation in Computer Science]
Author: Nguyen Thanh Trung
Summary:
Contributions of the Dissertation:
In Chapter 2: A novel intermediate structure called the Pattern Generating Set (P-set) is proposed. It consists of pattern templates from which closed itemsets and frequent itemsets are derived. When transactions are added, the construction algorithm updates the P-set with the ConPatSet and IncPatSet functions; when transactions are reduced or removed from the transaction set, the DesPatSet function updates the P-set. For a database with n transactions and m items, the complexity of the ConPatSet and IncPatSet algorithms for building P is O(mnk^2), where k = |P|. Transaction data is stored as bit sequences, which offers three advantages: it minimizes both internal and external memory usage, calculations on bit data are much faster than on other data types, and read-write operations between external and internal memory are significantly faster than with other mechanisms. Because they work from the pattern generating set, the enhanced mining algorithms in Chapter 2 belong to the group of direct mining techniques.
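To make the bit-sequence idea concrete, here is a minimal sketch in Python. It is not the dissertation's ConPatSet or IncPatSet: the summary does not define the P-set, so the sketch swaps in a generic, well-known scheme in which each transaction is packed into an integer bitmask and a table of closed itemsets is updated by intersecting the new transaction with every existing pattern. The names `encode`, `add_transaction`, and `item_index` are hypothetical.

```python
from typing import Dict, Iterable

def encode(transaction: Iterable[str], item_index: Dict[str, int]) -> int:
    """Pack a transaction into an integer bitmask (bit i set = item i present)."""
    mask = 0
    for item in transaction:
        mask |= 1 << item_index[item]
    return mask

def add_transaction(patterns: Dict[int, int], t: int) -> None:
    """Update a closed-itemset table (bitmask -> support) in place with transaction t.

    Generic intersection-based update, used here only as an illustration;
    it is an assumption, not the dissertation's algorithm.
    """
    if t == 0:
        return  # ignore empty transactions
    updates: Dict[int, int] = {}
    for p, supp in patterns.items():
        common = p & t  # itemset intersection is a single bitwise AND
        if common:
            # Support of the intersection = largest support among the
            # patterns that produce it, plus one for the new transaction.
            updates[common] = max(updates.get(common, 0), supp + 1)
    updates.setdefault(t, 1)  # t itself is new if no pattern contains it
    patterns.update(updates)

# Usage: three transactions over items a, b, c.
item_index = {"a": 0, "b": 1, "c": 2}
patterns: Dict[int, int] = {}
for tx in (["a", "b", "c"], ["a", "b"], ["b", "c"]):
    add_transaction(patterns, encode(tx, item_index))
```

After the three transactions, the table holds the closed itemsets {a,b,c}, {a,b}, {b,c}, and {b} with supports 1, 2, 2, and 3. Each set intersection costs one bitwise AND on machine words, which is the kind of saving the bit representation is meant to provide.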
In Chapter 3: Batch-based enhanced mining techniques are developed. Specifically, the database is divided into transaction batches or item batches, and the P-set of the entire database is generated from the component P-sets of those batches. Enhanced mining algorithms that operate on transaction batches or item batches are also proposed.
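The transaction-batch case can be illustrated with a small Python sketch. It is an assumption-laden stand-in, not the dissertation's method: instead of P-sets, each batch is summarized by a table of closed itemsets (bitmask -> support), and two such tables are merged using the standard fact that every closed itemset of the combined data is closed in one batch or is the intersection of one closed itemset from each batch. The names `support_in` and `merge_batches` are hypothetical, and the item-batch case is not covered.

```python
from typing import Dict

def support_in(table: Dict[int, int], x: int) -> int:
    """Support of itemset x within one batch: the largest support of any
    closed itemset in the table that contains x (0 if none does)."""
    return max((s for c, s in table.items() if c & x == x), default=0)

def merge_batches(t1: Dict[int, int], t2: Dict[int, int]) -> Dict[int, int]:
    """Combine the closed-itemset tables of two disjoint transaction batches."""
    candidates = set(t1) | set(t2)
    # Cross-batch intersections can create closed itemsets seen in neither table.
    candidates |= {a & b for a in t1 for b in t2 if a & b}
    # Total support is the sum of the per-batch supports.
    return {x: support_in(t1, x) + support_in(t2, x) for x in candidates}

# Usage: items a, b, c map to bits 0, 1, 2.
batch1 = {0b111: 1, 0b011: 2}  # from transactions {a,b,c} and {a,b}
batch2 = {0b110: 1}            # from transaction {b,c}
merged = merge_batches(batch1, batch2)
```

Here the merged table contains {b} (bitmask `0b010`) with support 3, even though {b} is not a closed itemset in either batch on its own. Because each batch is processed independently before the merge, a scheme of this shape lends itself to parallel execution.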
In Chapter 4: Building on the batch processing algorithms, parallelization solutions are developed and implemented on Hadoop-Spark to handle large datasets. Additionally, the pattern generating set is applied to the problems of reducing characteristic attribute sets and mining data streams.
Summary Source: https://sdh.uit.edu.vn/sites/default/files/201805/thongtinla_tiengviet_trungnt.pdf
Interested readers can visit the Library to read the hard copy or access the full text remotely at the following address: https://ir.vnulib.edu.vn/handle/VNUHCM/15977
For any inquiries regarding access accounts, please contact the Library by email: thuvien@uit.edu.vn
Detailed Information:
Hạ Băng - Media Collaborator, University of Information Technology
English version: Phan Huy Hoang