Abstract: This article records the installation of a multi-node Hadoop 2.2.0 cluster, its basic configuration, startup, and a test run of a word-count example.

Environment: Ubuntu 12.04 64-bit server installed in VMware Player 4.0.3 on Windows. The base software is installed in one virtual machine first, which is then copied twice and adjusted. The three machines are assigned roles as follows:

Hadoop1 (Master): NameNode / ResourceManager
Hadoop2 (Slave): DataNode / NodeManager
Hadoop3 (Slave): DataNode / NodeManager

Assume the three virtual machines have the following IP addresses; they are used later.

Hadoop1: 192.168.128.130
Hadoop2: 192.168.128.131
Hadoop3: 192.168.128.132

1. Environment preparation:
Download the free VMware Player and install it; download the free Ubuntu 12.04 server edition and install it inside VMware.

2. Base installation:
Run the following commands to upgrade some packages and install ssh:
(1) sudo apt-get update
(2) sudo apt-get upgrade
(3) sudo apt-get install openssh-server

Install the Oracle JDK automatically via the webupd8team PPA with the following commands:
(1) sudo apt-get install python-software-properties
(2) sudo add-apt-repository ppa:webupd8team/java
(3) sudo apt-get update
(4) sudo apt-get install oracle-java6-installer

Create the hadoop user:
(1) sudo addgroup hadoop
(2) sudo adduser --ingroup hadoop hduser

Edit /etc/sudoers and add the line hduser ALL=(ALL) ALL below the line root ALL=(ALL) ALL. Without this line, hduser cannot run sudo.

★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★
Note: all of the following operations are performed logged in as hduser.
★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★

3. Common installation:
Download Hadoop 2.2.0:
(1) $ cd /home/hduser
(2) $ wget http://apache.dataguru.cn/hadoop/common/hadoop-2.2.0/hadoop-2.2.0.tar.gz
(3) $ tar zxf hadoop-2.2.0.tar.gz
(4) $ mv hadoop-2.2.0 hadoop

Configure Hadoop:
(1) Configure /home/hduser/hadoop/etc/hadoop/hadoop-env.sh: replace
export JAVA_HOME=${JAVA_HOME}
with
export JAVA_HOME=/usr/lib/jvm/java-6-oracle
(A short sketch for checking and applying this change from the shell follows.)
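If you want to verify the JDK location and make the hadoop-env.sh change non-interactively, a minimal sketch is shown below. It assumes the Oracle JDK was installed to /usr/lib/jvm/java-6-oracle as above and that hadoop-env.sh still contains the stock "export JAVA_HOME=${JAVA_HOME}" line; the sed invocation is just one way to do the edit, not part of the original steps.

# Confirm the JDK is where hadoop-env.sh will point (path taken from the step above):
ls -d /usr/lib/jvm/java-6-oracle
java -version

# Apply the JAVA_HOME change in place; the pattern assumes the stock
# "export JAVA_HOME=${JAVA_HOME}" line is still present:
sed -i 's|^export JAVA_HOME=.*|export JAVA_HOME=/usr/lib/jvm/java-6-oracle|' \
    /home/hduser/hadoop/etc/hadoop/hadoop-env.sh

# Verify the edit took effect:
grep '^export JAVA_HOME' /home/hduser/hadoop/etc/hadoop/hadoop-env.sh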
(2) Configure /home/hduser/hadoop/etc/hadoop/core-site.xml, adding the following inside the <configuration> element:

<property>
  <name>hadoop.tmp.dir</name>
  <value>/home/hduser/hadoop/tmp/hadoop-${user.name}</value>
  <description>A base for other temporary directories.</description>
</property>
<property>
  <name>fs.default.name</name>
  <value>hdfs://192.168.128.130:8010</value>
  <description>The name of the default file system. A URI whose scheme and authority determine the FileSystem implementation. The uri's scheme determines the config property (fs.SCHEME.impl) naming the FileSystem implementation class. The uri's authority is used to determine the host, port, etc. for a filesystem.</description>
</property>

Note: make sure the following two points are correct, otherwise errors will occur later.
a. Run mkdir /home/hduser/hadoop/tmp to create this temporary directory.
b. The IP address in the fs.default.name value is the NameNode's address, i.e. Hadoop1.

Configure /home/hduser/hadoop/etc/hadoop/mapred-site.xml:
(1) cp /home/hduser/hadoop/etc/hadoop/mapred-site.xml.template /home/hduser/hadoop/etc/hadoop/mapred-site.xml
(2) Add the following inside the <configuration> element:

<property>
  <name>mapred.job.tracker</name>
  <value>192.168.128.130:54311</value>
  <description>The host and port that the MapReduce job tracker runs at. If "local", then jobs are run in-process as a single map and reduce task.</description>
</property>

Configure /home/hduser/hadoop/etc/hadoop/hdfs-site.xml, adding the following inside the <configuration> element:

<property>
  <name>dfs.replication</name>
  <value>2</value>
  <description>Default block replication. The actual number of replications can be specified when the file is created. The default is used if replication is not specified in create time.</description>
</property>

4. Overall installation:
Copy the virtual machine installed and configured above twice, producing Hadoop2 and Hadoop3. On each of the three virtual machines, change the content of /etc/hostname to the corresponding host name, i.e. hadoop1's hostname is hadoop1, and so on for the others. Reboot after the change and confirm it has taken effect with the hostname command (see the sketch after this section). Check and, if necessary, modify each of the three virtual machines' /etc...
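A minimal sketch of the hostname step on the first VM, assuming Ubuntu 12.04 and a user with sudo rights (e.g. hduser); the echo/reboot commands are one reasonable way to do it and are not taken verbatim from the original, so substitute hadoop2 or hadoop3 on the other machines:

# Set the host name persistently:
sudo sh -c 'echo hadoop1 > /etc/hostname'   # use hadoop2 or hadoop3 on the other VMs
sudo reboot

# After the reboot, confirm the new name is in effect:
hostname                                    # should print hadoop1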