### Improved Algorithm for Mining of High Utility Patterns in One Phase Based on Map Reduce Framework on Hadoop

#### Abstract

Mining high utility itemsets from a database refers to the discovery of itemsets with high utility, such as profit. Although a number of relevant algorithms have been proposed in recent years, they suffer from generating a very large number of candidate itemsets for high utility itemsets. Such a large number of candidate itemsets degrades mining performance in terms of execution time and space requirements. Earlier work relies on two-phase candidate generation, an approach that suffers from scalability issues because of the huge number of candidates. This paper presents an efficient approach that generates high utility patterns in one phase without generating candidates. Our experiments use a linear data structure, and our pattern growth approach searches a reverse set enumeration tree and prunes the search space by utility upper bounding. High utility patterns are also identified by a closure property and a singleton property. In this work we present a new approach that extends these algorithms to overcome their limitations using the MapReduce framework on Hadoop. Experimental results show that the proposed algorithms not only reduce the number of candidates effectively but also outperform other algorithms substantially in terms of runtime, especially when databases contain many long transactions.
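The notions of itemset utility and utility upper bounding mentioned in the abstract can be made concrete with a small sketch. This is not the algorithm proposed in the paper: it is a brute-force illustration on hypothetical toy data, and it assumes the transaction-weighted utility (TWU) known from two-phase methods as the upper bound. All names (`transactions`, `unit_profit`, `twu`, `high_utility_itemsets`) are invented for this example.

```python
from itertools import combinations

# Hypothetical toy database: each transaction maps item -> purchased quantity.
transactions = [
    {"a": 1, "b": 2},
    {"a": 2, "c": 1},
    {"b": 1, "c": 3},
]
# Hypothetical unit profit of each item.
unit_profit = {"a": 5, "b": 2, "c": 1}


def transaction_utility(tx):
    """Total utility of a single transaction."""
    return sum(qty * unit_profit[item] for item, qty in tx.items())


def itemset_utility(itemset):
    """Exact utility: summed over the transactions containing every item."""
    return sum(
        sum(tx[i] * unit_profit[i] for i in itemset)
        for tx in transactions
        if all(i in tx for i in itemset)
    )


def twu(itemset):
    """Transaction-weighted utility, an upper bound on the utility
    of the itemset and of all its supersets (assumed bound)."""
    return sum(
        transaction_utility(tx)
        for tx in transactions
        if all(i in tx for i in itemset)
    )


def high_utility_itemsets(min_util):
    """Enumerate all itemsets and keep those with utility >= min_util."""
    items = sorted(unit_profit)
    result = {}
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            if twu(itemset) < min_util:
                # The bound already rules this itemset out,
                # so the exact utility is not computed.
                continue
            utility = itemset_utility(itemset)
            if utility >= min_util:
                result[itemset] = utility
    return result


print(high_utility_itemsets(10))
```

With a threshold of 10, only the itemsets `("a",)` and `("a", "c")` qualify on this toy data. A real miner would use the bound to avoid enumerating supersets at all, which this exhaustive loop does not attempt.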
