Conferences in Research and Practice in Information Technology
  

Online Version - Last Updated - 20 Jan 2012

 

 

Mining Big Data Streams: The Fallacy of Blind Correlation and the Importance of Models

Abbass, H.

    Big data streams mark a new era in artificial intelligence and the data mining literature. Video and voice streams have grown rapidly in recent years. A single lab-based human-computer interaction experiment with one human subject, collecting cognitive, physiological, and other data, can easily generate a few terabytes of data in a single hour, growing to a petabyte in less than a month. In a 2008 article in Wired magazine, Chris Anderson wrote that “the data deluge makes the scientific method obsolete”. He predicted that in the age of the petabyte and beyond, a meaningful correlation analysis is enough! Anderson's comment was provocative, but some started believing it. So was he right or wrong? Why? What can we do to face the outburst of big data? Do we have the data mining tools to manage these data? Where is the future of data mining heading? In this talk, I will discuss these questions and demonstrate some answers using examples from my own work and analysis.
Cite as: Abbass, H. (2011). Mining Big Data Streams: The Fallacy of Blind Correlation and the Importance of Models. In Proc. Australasian Data Mining Conference (AusDM 11) Ballarat, Australia. CRPIT, 121. Vamplew, P., Stranieri, A., Ong, K.-L., Christen, P. and Kennedy, P. J. Eds., ACS. 5