A MapReduce Framework for Mining Maximal Contiguous Frequent Patterns in Large DNA Sequence Datasets

DNA sequence datasets have become extremely large, making it a great challenge for single-processor, main-memory-based computing systems to mine interesting patterns. Such limited hardware resources make most Apriori-like algorithms inefficient. Recent implementations of the MapReduce framework, however, overcome these limitations. Furthermore, mining maximal contiguous frequent patterns to express the function and structure of DNA sequences is a useful technique, capable of capturing the common data characteristics among related sequences. In this paper, we propose an efficient approach for mining maximal contiguous frequent patterns in large DNA sequence data using the MapReduce framework, which can handle massive DNA sequence datasets with a large number of nodes on a Hadoop platform. Our extensive experimental results show that the proposed approach can mine the complete set of maximal contiguous frequent patterns very efficiently.
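To make the idea of a maximal contiguous frequent pattern concrete, the following is a minimal single-process sketch in Python that imitates map and reduce steps in memory. It is not the algorithm from the paper, and it does not use Hadoop. The function names (`map_phase`, `reduce_phase`, `mine_maximal_contiguous`) are hypothetical, and the sketch assumes that the support of a pattern is the number of sequences containing it as a contiguous substring, which may differ from the paper's definition.

```python
from collections import defaultdict


def map_phase(seq_id, sequence, k):
    """Emit (pattern, seq_id) for every contiguous substring of length k."""
    for i in range(len(sequence) - k + 1):
        yield sequence[i:i + k], seq_id


def reduce_phase(pairs, min_support):
    """Group emitted pairs by pattern and keep patterns whose support
    (number of distinct sequences containing them; an assumption of this
    sketch) reaches min_support."""
    groups = defaultdict(set)
    for pattern, seq_id in pairs:
        groups[pattern].add(seq_id)
    return {p: len(ids) for p, ids in groups.items() if len(ids) >= min_support}


def mine_maximal_contiguous(sequences, min_support):
    """Return {pattern: support} for maximal contiguous frequent patterns."""
    frequent = {}
    max_len = max((len(s) for s in sequences), default=0)
    k = 1
    while k <= max_len:
        pairs = (
            kv
            for seq_id, seq in enumerate(sequences)
            for kv in map_phase(seq_id, seq, k)
        )
        level = reduce_phase(pairs, min_support)
        if not level:
            # Every substring of a frequent pattern is itself frequent, so
            # no longer pattern can be frequent once a level is empty.
            break
        frequent.update(level)
        k += 1

    # A pattern is not maximal if it is the prefix or suffix of a frequent
    # pattern that is one symbol longer.
    covered = set()
    for pattern in frequent:
        if len(pattern) > 1:
            covered.add(pattern[:-1])
            covered.add(pattern[1:])
    return {p: s for p, s in frequent.items() if p not in covered}


if __name__ == "__main__":
    reads = ["ACGTAC", "TACGTA", "GGACGT"]
    print(mine_maximal_contiguous(reads, min_support=3))
```

Checking only one-symbol extensions is enough for the maximality test because support never increases as a contiguous pattern grows. In a real Hadoop job, the map and reduce functions would run across nodes and the level-by-level loop would become a chain of jobs; that distribution is omitted here.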
Publisher
MEDKNOW PUBLICATIONS & MEDIA PVT LTD
Issue Date
2012-03
Language
English
Article Type
Article
Citation

IETE TECHNICAL REVIEW, v.29, no.2, pp.162 - 168

ISSN
0256-4602
DOI
10.4103/0256-4602.95388
URI
http://hdl.handle.net/10203/103290
Appears in Collection
CS-Journal Papers