School of Computer Science Engineering and Technology, Bennett University, Greater Noida, India; Department of Computer Science, University of Delhi, Delhi 110–007, India
2024 Volume 39
REVIEW   Open Access    

Top-k high utility itemset mining: current status and future directions

  • Abstract: High utility itemset mining (HUIM) is an important sub-field of frequent itemset mining (FIM) that has recently received much attention in data mining. High utility itemsets (HUIs) have proven useful in marketing, retail marketing, cross-marketing, and e-commerce. Traditional HUIM approaches have a drawback: they require a user-defined minimum utility ($min\_util$) threshold, and it is not easy for users to set an appropriate $min\_util$ threshold that yields actionable HUIs. To address this drawback, top-k HUIM was introduced. Top-k HUIM is better suited to supermarket managers and retailers who need to prepare strategies for generating higher profit. In this paper, we provide an in-depth survey of the current status of top-k HUIM approaches. The paper presents the task of top-k HUIM and its relevant definitions, reviews the top-k HUIM approaches with their advantages and disadvantages, and discusses the important strategies of top-k HUIM, their variations, and research opportunities. It also provides a detailed summary, analysis, and future directions of the top-k HUIM field.
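    To make the task in the abstract concrete, the following is a minimal sketch of top-k HUIM by exhaustive enumeration. It is not any of the surveyed algorithms. The toy transactions, the unit profits, and the function names `utility` and `top_k_huis` are all hypothetical, and the sketch assumes the usual definition of utility (purchased quantity times unit profit, summed over the transactions that contain the whole itemset) with positive profits only. The user supplies `k` instead of a $min\_util$ threshold; the smallest utility held in the heap acts as an internal threshold that rises as better itemsets are found.

    ```python
    import heapq
    from itertools import combinations

    # Hypothetical toy database: each transaction maps item -> purchased quantity.
    TRANSACTIONS = [
        {"a": 1, "b": 2},
        {"a": 2, "c": 1},
        {"b": 1, "c": 3},
    ]
    # Hypothetical unit profit of each item (assumed positive).
    PROFIT = {"a": 5, "b": 2, "c": 1}


    def utility(itemset, transactions, profit):
        """Sum quantity * profit over the transactions containing every item of `itemset`."""
        total = 0
        for t in transactions:
            if all(i in t for i in itemset):
                total += sum(t[i] * profit[i] for i in itemset)
        return total


    def top_k_huis(transactions, profit, k):
        """Return the k itemsets with the highest utility, best first (brute force)."""
        items = sorted({i for t in transactions for i in t})
        heap = []  # min-heap of (utility, itemset); heap[0][0] is the internal threshold
        for size in range(1, len(items) + 1):
            for itemset in combinations(items, size):
                u = utility(itemset, transactions, profit)
                if u <= 0:
                    continue  # itemset never occurs (profits are assumed positive)
                if len(heap) < k:
                    heapq.heappush(heap, (u, itemset))
                elif u > heap[0][0]:
                    # A better itemset replaces the weakest one, raising the threshold.
                    heapq.heapreplace(heap, (u, itemset))
        return sorted(heap, key=lambda pair: (-pair[0], pair[1]))


    if __name__ == "__main__":
        for u, itemset in top_k_huis(TRANSACTIONS, PROFIT, k=3):
            print(itemset, u)
    ```

    Enumerating every itemset is exponential in the number of items, so this only illustrates the problem definition; practical algorithms need pruning and threshold-raising strategies to avoid exploring most of that search space.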
  • Cite this article

    Rajiv Kumar, Kuldeep Singh. 2024. Top-k high utility itemset mining: current status and future directions. The Knowledge Engineering Review 39(1), doi: 10.1017/S0269888924000055



    • If an itemset is infrequent, then all of its supersets are also infrequent (the downward-closure property).

    • An itemset is a high transaction-weighted utility itemset (HTWUI) if its transaction-weighted utility (TWU) is no less than the user-specified minimum utility threshold ($ min\_util $).

    • This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
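The HTWUI definition in the notes above can be illustrated with a minimal Python sketch. It assumes the standard definitions from the HUIM literature: the utility of a transaction is the sum of quantity times unit profit over its items, and the TWU of an itemset is the sum of the utilities of all transactions that contain it. The toy transactions, the profit table, and the function names are hypothetical and chosen only for illustration; they do not come from the article.

```python
from typing import Dict, Iterable, List

# Hypothetical toy database: each transaction maps an item to its purchased quantity.
TRANSACTIONS: List[Dict[str, int]] = [
    {"a": 1, "b": 2, "c": 1},
    {"a": 2, "c": 3},
    {"b": 1, "d": 4},
]

# Hypothetical unit profits (external utilities) per item.
PROFIT: Dict[str, int] = {"a": 5, "b": 2, "c": 1, "d": 3}


def transaction_utility(tx: Dict[str, int]) -> int:
    """Total utility of one transaction: sum of quantity * unit profit."""
    return sum(qty * PROFIT[item] for item, qty in tx.items())


def twu(itemset: Iterable[str], transactions: List[Dict[str, int]]) -> int:
    """Transaction-weighted utility: sum of the utilities of all
    transactions that contain every item of the itemset."""
    items = set(itemset)
    return sum(
        transaction_utility(tx) for tx in transactions if items.issubset(tx)
    )


def is_htwui(
    itemset: Iterable[str], transactions: List[Dict[str, int]], min_util: int
) -> bool:
    """An itemset is an HTWUI if its TWU is no less than min_util."""
    return twu(itemset, transactions) >= min_util


if __name__ == "__main__":
    for candidate in (["a"], ["a", "b"], ["a", "d"]):
        print(candidate, twu(candidate, TRANSACTIONS),
              is_htwui(candidate, TRANSACTIONS, min_util=20))
```

In this toy data, adding an item to an itemset can only shrink the set of transactions that contain it, so the TWU of a superset never exceeds the TWU of its subsets. This mirrors the downward-closure property stated for frequent itemsets.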