Abstract

With the continuing development of technology and the steady expansion of network coverage, the era of big data has quietly arrived, and every industry now generates massive, complex data. Researchers have spared no effort in studying how to collect information effectively and how to uncover the patterns within data, and they continue to propose new techniques. For domestic private funds, the value of big data and the new business opportunities it contains have not yet been fully realized, but it is clear that big data plays an increasingly important role in their decision-making. The fund industry is a big data market: frequent daily trading in the stock market produces large volumes of data, and big data analysis techniques keep being updated as the industry matures.

Taking the main problems in the financial market as its starting point and listed companies as its object, this paper explores how to mine and analyze fundamental data and stock trading data, and explains in detail the basic principle of the K-means algorithm and the role of neural networks in stock selection. Building on existing stock prediction theory, it proposes a multi-clustering method for analyzing stock data and combines it with classification algorithms trained on stock data to form an intelligent stock selection algorithm based on cluster analysis. The algorithm is then verified, the overall framework is designed, the clustering is implemented in MATLAB, and the clustering results under the optimal parameters are obtained. Finally, building on this stock selection algorithm, a simple, stable, high-performance intelligent stock selection system is designed using Hadoop.

The experimental results show that the resulting stock selection system can perform multi-dimensional analysis and prediction on stock data. By combining data mining with multi-cluster analysis, it analyzes large volumes of stock-related data and forecasts future trends, which gives it some practical value as a decision-support tool for investors.

Keywords: private funds; data mining; securities analysis; Internet big data; transaction data

Contents

Chinese Abstract ........ I
Abstract ........