Efficient algorithm for mining probabilistic frequent itemsets from uncertain data





CLC number: TP301.6    Document code: A


Abstract: When existing algorithms for mining probabilistic frequent itemsets build their tree structure by pattern growth, they generate a large number of tree nodes, occupy a large amount of memory, and run inefficiently. To address these problems, a Progressive Uncertain Frequent Pattern Growth algorithm, named PUFPGrowth, is proposed. Reading the uncertain database tuple by tuple, the algorithm constructs a tree as compact as the Frequent Pattern Tree (FPTree) and updates the dynamic array of expected values kept in the header table for identical itemsets. Once all transactions have been inserted into the Progressive Uncertain Frequent Pattern tree (PUFPTree), all probabilistic frequent itemsets can be mined by traversing the dynamic array. Experimental results and theoretical analysis show that PUFPGrowth finds the probabilistic frequent itemsets effectively. Compared with the Uncertain Frequent pattern Growth (UFGrowth) algorithm and the Compressed Uncertain Frequent Pattern Mine (CUFPMine) algorithm, PUFPGrowth improves the efficiency of mining probabilistic frequent itemsets on uncertain datasets and reduces memory usage to a certain degree.


Key words: data mining; uncertain data; possible world model; probabilistic frequent itemset; frequent pattern
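As a rough illustration of the expected values that the abstract says are updated tuple by tuple, the sketch below accumulates the expected support of small itemsets over an uncertain database. It is not PUFPGrowth and builds no tree. It assumes that each transaction maps an item to its existential probability, that items are independent, and that the expected support of an itemset is the sum over transactions of the product of its item probabilities. The names `expected_supports`, `toy_db`, and `max_size` are hypothetical and do not come from the paper.

```python
from collections import defaultdict
from itertools import combinations
from math import prod


def expected_supports(database, max_size=2):
    """Accumulate expected supports one transaction (tuple) at a time.

    database: iterable of dicts mapping item -> existential probability.
    max_size: largest itemset size to enumerate (assumption, keeps the demo small).
    """
    totals = defaultdict(float)
    for transaction in database:
        items = sorted(transaction)
        for size in range(1, max_size + 1):
            for itemset in combinations(items, size):
                # Under the independence assumption, the probability that the
                # whole itemset occurs in this transaction is the product of
                # the probabilities of its items.
                totals[itemset] += prod(transaction[item] for item in itemset)
    return dict(totals)


# Hypothetical toy uncertain database, made up for this example only.
toy_db = [
    {"a": 0.9, "b": 0.5},
    {"a": 0.6, "c": 0.8},
    {"a": 0.5, "b": 1.0, "c": 0.4},
]

supports = expected_supports(toy_db)
for itemset, value in sorted(supports.items()):
    print(itemset, round(value, 2))
```

Enumerating every itemset in this way grows exponentially with transaction length, which is the kind of cost that tree-based methods such as the one described in the abstract aim to avoid.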

0 Introduction



