Beijing Institute of Technology, Zhuhai (北京理工大学珠海学院), 2020 Undergraduate Thesis

Data Science: Research on Data Acquisition and Processing Technology

Abstract

In recent years, as the Internet has become nearly universal, abundant information resources have become society's most important asset. The massive growth in the volume of data has brought us into the era of big data. Big data affects our daily lives and shapes the development of society. This thesis studies and analyzes techniques for collecting and processing big data. It first introduces the background and significance of the research topic, then reviews the current state of research in China and abroad, along with the development trends of big data technology. It then examines collection and processing techniques from several angles: for data collection, it covers Web crawlers and system log collection in detail; for data processing, it focuses mainly on data cleaning.

Keywords: big data; Web crawler; system log; data cleaning

Contents

Abstract
1. ...