Youcef Djenouri
Frequent itemset mining in big data with effective single scan algorithms
Djenouri, Youcef; Djenouri, Djamel; Chun-Wei Lin, Jerry; Belhadi, Asma
Authors
Dr Djamel Djenouri Djamel.Djenouri@uwe.ac.uk
Associate Professor in Computer Science
Jerry Chun-Wei Lin
Asma Belhadi
Abstract
This paper considers frequent itemset mining in transactional databases. It introduces a new accurate single scan approach for frequent itemset mining (SSFIM), a heuristic alternative (EA-SSFIM), and a parallel implementation on Hadoop clusters (MR-SSFIM). EA-SSFIM and MR-SSFIM target sparse and big databases, respectively. The proposed approach, in all its variants, requires only one scan to extract the candidate itemsets, and it has the advantage of generating a fixed number of candidate itemsets independently of the minimum support value. This accelerates the scan process compared with existing approaches when dealing with sparse and big databases. Numerical results show that SSFIM outperforms state-of-the-art FIM approaches on medium and large databases. Moreover, EA-SSFIM performs similarly to SSFIM while considerably reducing the runtime for large databases. The results also reveal the superiority of MR-SSFIM over existing HPC-based FIM solutions on sparse and big databases.
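The single scan idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the algorithm's details, so the sketch assumes that candidates are obtained by enumerating every non-empty subset of each transaction and counting them in a hash table, with the minimum support applied only after the scan. The function name `single_scan_frequent_itemsets` and the relative `min_support` parameter are hypothetical.

```python
from collections import Counter
from itertools import combinations


def single_scan_frequent_itemsets(transactions, min_support):
    """Return {itemset: count} for itemsets whose relative support is
    at least min_support, reading the transactions exactly once.

    Assumption (not taken from the abstract): candidates are all
    non-empty subsets of each transaction.
    """
    counts = Counter()
    n_transactions = 0

    # Single pass over the database: the candidates generated here do not
    # depend on min_support, only on the transactions themselves.
    for transaction in transactions:
        n_transactions += 1
        items = sorted(set(transaction))  # canonical order, no duplicates
        for size in range(1, len(items) + 1):
            for itemset in combinations(items, size):
                counts[itemset] += 1

    # The support threshold is applied only after the scan has finished.
    threshold = min_support * n_transactions
    return {s: c for s, c in counts.items() if c >= threshold}


if __name__ == "__main__":
    db = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b"]]
    for itemset, count in sorted(single_scan_frequent_itemsets(db, 0.5).items()):
        print(itemset, count)
```

A transaction with k distinct items yields 2^k - 1 candidates under this assumption, so the sketch is only practical when transactions are short.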
Field | Value |
---|---|
Journal Article Type | Article |
Acceptance Date | Nov 5, 2018 |
Online Publication Date | Nov 9, 2018 |
Publication Date | Nov 9, 2018 |
Deposit Date | Jan 21, 2020 |
Publicly Available Date | Jan 23, 2020 |
Journal | IEEE Access |
Electronic ISSN | 2169-3536 |
Publisher | Institute of Electrical and Electronics Engineers (IEEE) |
Peer Reviewed | Peer Reviewed |
Volume | 6 |
Pages | 68013-68026 |
DOI | https://doi.org/10.1109/ACCESS.2018.2880275 |
Keywords | Itemsets; data mining; big data; clustering algorithms; runtime; computer science; Apriori; frequent itemset mining; heuristic; parallel computing; support computing |
Public URL | https://uwe-repository.worktribe.com/output/5193077 |
Publisher URL | https://doi.org/10.1109/ACCESS.2018.2880275 |
Files
Frequent Itemset Mining in Big Data With Effective Single Scan Algorithms
(858 Kb)
PDF
Licence
http://www.rioxx.net/licenses/all-rights-reserved
Publisher Licence URL
http://www.rioxx.net/licenses/all-rights-reserved
Copyright Statement
(c) 2018 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted components of this work in other works.