Data Mining with Big Data

Xindong Wu, Fellow, IEEE, Xingquan Zhu, Senior Member, IEEE, Gong-Qing Wu, and Wei Ding, Senior Member, IEEE

Abstract—Big Data concern large-volume, complex, growing data sets with multiple, autonomous sources. With the fast development of networking, data storage, and the data collection capacity, Big Data are now rapidly expanding in all science and engineering domains, including physical, biological and biomedical sciences. This paper presents a HACE theorem that characterizes the features of the Big Data revolution, and proposes a Big Data processing model, from the data mining perspective. This data-driven model involves demand-driven aggregation of information sources, mining and analysis, user interest modeling, and security and privacy considerations. We analyze the challenging issues in the data-driven model and also in the Big Data revolution.

Index Terms—Big Data, data mining, heterogeneity, autonomous sources, complex and evolving associations

1 INTRODUCTION

Dr. Yan Mo won the 2012 Nobel Prize in Literature. This is probably the most controversial Nobel Prize in this category. Searching on Google with “Yan Mo Nobel Prize” resulted in 1,050,000 web pointers on the Internet (as of 3 January 2013). “For all praises as well as criticisms,” said Mo recently, “I am grateful.” What types of praises and criticisms has Mo actually received over his 31-year writing career? As comments keep coming on the Internet and in various news media, can we summarize all types of opinions in different media in a real-time fashion, including updated, cross-referenced discussions by critics? This type of summarization program is an excellent example of Big Data processing, as the information comes from multiple, heterogeneou...