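Abstract

In recent years, countries around the world have paid increasing attention to the development of intangible cultural resources. With the rapid growth of the Internet in particular, ever greater effort has gone into protecting these resources by digital means. Because intangible cultural resources are vast in number, diverse in type, and widely distributed, data mining plays an important role in mining and protecting them: it can classify massive collections of such resources, and its clustering algorithms can uncover the value they contain. Text mining, a branch of data mining, further helps assess the value of the textual information in these resources. This thesis obtains text documents on intangible cultural resources from the China Intangible Cultural Heritage Network (中国非物质文化遗产网) and preprocesses them. The preprocessed texts are divided into ten categories: music, dance, folk literature, drama, quyi, folk customs, medicine, craftsmanship, fine arts, and sports. The clustering results for the ten categories are then compared to judge how much textual information each category carries and how valuable it is for text mining. Experiments show that the sports and acrobatics texts cluster best, which indicates that they carry little information and are of low text mining value; conversely, the craftsmanship texts cluster worst, which indicates that they carry a large amount of textual information and are of high text mining value. Finally, the thesis summarizes the work and discusses directions for future development.

Keywords: intangible cultural resources; data mining; K-means algorithm

Abstract

In recent years, countries have paid more and more attention to the development of intangible cultural resources. Especially with the rapid development of the Internet, increasing efforts have been made to protect intangible cultural resources by digital means. Data mining technology plays an important role in mining and protecting these numerous, diverse, and widely distributed resources. It can not only classify massive collections of intangible cultural resources, but also mine their value with the clustering algorithms of data mining. In addition, text mining, a part of data mining, helps us understand the value of the textual information in intangible cultural resources. This thesis obtains text documents on intangible cultural resources from the China Intangible Cultural Heritage Network, preprocesses them, and divides the preprocessed texts into music, dance, folk literature, drama, quyi, folk customs,...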