Install Hadoop 2.6.0 on Ubuntu 14.04

2017-06-21
1 min read
Hadoop, Linux, DevOps

1. Download Hadoop

Download Hadoop from the official Hadoop website, or fetch it directly with wget. (Hadoop download link)
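
As a minimal sketch of the wget route, the helper below builds a tarball URL for a given release. The Apache archive URL layout and the function name hadoop_tarball_url are assumptions added for illustration; they are not taken from the original download link, so substitute whichever mirror you actually use.

```shell
# Hypothetical helper: build the tarball URL for a Hadoop release.
# Assumes the Apache archive directory layout (an assumption, not from the post).
hadoop_tarball_url() {
  local version="$1"
  printf 'https://archive.apache.org/dist/hadoop/common/hadoop-%s/hadoop-%s.tar.gz\n' \
    "$version" "$version"
}
```

It could then be used as: $ wget "$(hadoop_tarball_url 2.6.0)"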

2. Extract, Move, and Create Symlink

$ tar -zxf hadoop-2.6.0.tar.gz
$ sudo mv hadoop-2.6.0/ /usr/local/
$ cd /usr/local
$ sudo ln -s hadoop-2.6.0/ hadoop
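
The point of the symlink is that every later path can say /usr/local/hadoop while the real directory keeps its version number. A small sketch of that idea follows; the function name point_hadoop_symlink is hypothetical, and it assumes GNU ln (for the -n flag) as shipped with Ubuntu.

```shell
# Hypothetical helper: point <base>/hadoop at <base>/hadoop-<version>.
# -s creates a symlink, -f replaces an existing link, -n avoids descending
# into the old link target (GNU ln).
point_hadoop_symlink() {
  local base="$1" version="$2"
  if [ ! -d "$base/hadoop-$version" ]; then
    echo "missing directory: $base/hadoop-$version" >&2
    return 1
  fi
  ( cd "$base" && ln -sfn "hadoop-$version" hadoop )
}
```

Under /usr/local this would need root privileges, and the result can be checked with: $ readlink /usr/local/hadoop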

3. Create a Dedicated Hadoop User

$ sudo addgroup hadoop
$ sudo adduser --ingroup hadoop hduser
$ sudo adduser hduser sudo
$ sudo chown -R hduser:hadoop /usr/local/hadoop/
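
To confirm that the new account really landed in the expected groups, a check along these lines may help. The function name user_in_group is hypothetical; it only wraps the standard id command.

```shell
# Hypothetical helper: succeed if <user> is a member of <group>.
user_in_group() {
  local user="$1" group="$2"
  # id -nG prints the user's group names separated by spaces
  id -nG "$user" 2>/dev/null | tr ' ' '\n' | grep -qxF "$group"
}
```

For example: $ user_in_group hduser hadoop && user_in_group hduser sudo && echo ok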

4. Configure SSH Passwordless Login

$ su hduser
$ sudo apt-get install ssh
$ ssh-keygen -t rsa -P ""
$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
$ chmod 600 ~/.ssh/authorized_keys && chmod 700 ~/.ssh
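
sshd is strict about these permissions, so if passwordless login still prompts for a password, the modes are worth checking first. A rough sketch, assuming GNU stat (the -c option) and a hypothetical function name check_ssh_perms:

```shell
# Hypothetical helper: verify ~/.ssh is 700 and authorized_keys is 600.
check_ssh_perms() {
  local ssh_dir="${1:-$HOME/.ssh}"
  local dir_mode file_mode
  dir_mode=$(stat -c '%a' "$ssh_dir") || return 1
  file_mode=$(stat -c '%a' "$ssh_dir/authorized_keys") || return 1
  [ "$dir_mode" = "700" ] && [ "$file_mode" = "600" ]
}
```

Run as hduser: $ check_ssh_perms && echo "permissions ok"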

5. Adjust /etc/sysctl.conf File

Add the following at the bottom of the file:

net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1

After editing:

$ sudo service networking restart
$ sudo shutdown -r now
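
Before rebooting, it can be useful to confirm that all three settings made it into the file. The sketch below is one way to do that; the function name ipv6_disabled_in_conf is hypothetical, and it only inspects the file, not the running kernel.

```shell
# Hypothetical helper: succeed only if all three disable_ipv6 keys
# are set to 1 in the given sysctl configuration file.
ipv6_disabled_in_conf() {
  local conf="${1:-/etc/sysctl.conf}" key
  for key in all default lo; do
    grep -Eq "^[[:space:]]*net\.ipv6\.conf\.${key}\.disable_ipv6[[:space:]]*=[[:space:]]*1[[:space:]]*$" \
      "$conf" || return 1
  done
}
```

After the reboot, the live value can be read with: $ cat /proc/sys/net/ipv6/conf/all/disable_ipv6 (1 means IPv6 is disabled).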

6. Test SSH Connection

$ su hduser
$ ssh localhost
Are you sure you want to continue connecting (yes/no)? yes

7. Update and Install Java

$ sudo apt-get update
$ sudo apt-get install default-jdk

If Java is already installed, find the JAVA_HOME path:

$ which java | sed -e 's/\(.*\)\/bin\/java/\1/g'

Switch to hduser ($ su hduser) and edit ~/.bashrc, adding the following at the bottom:

export HADOOP_HOME=/usr/local/hadoop
export JAVA_HOME=/usr

Reload the configuration:

$ source ~/.bashrc
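
The sed expression in this step simply strips the trailing /bin/java from the path, which is why /usr/bin/java yields JAVA_HOME=/usr. The same derivation can be sketched with shell parameter expansion; the function name java_home_from_binary is hypothetical.

```shell
# Hypothetical helper: derive a JAVA_HOME candidate from the path of
# the java binary by removing the trailing /bin/java.
java_home_from_binary() {
  local java_bin="$1"
  printf '%s\n' "${java_bin%/bin/java}"
}
```

On Ubuntu, /usr/bin/java is typically a symlink, so resolving it first should point at the actual JDK directory instead of /usr: $ java_home_from_binary "$(readlink -f "$(which java)")"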

8. Adjust Hadoop Configuration Files

Create the HDFS directory

$ su hduser
$ mkdir /usr/local/hadoop/data

Edit hadoop-env.sh

Change JAVA_HOME=${JAVA_HOME} to JAVA_HOME=/usr, and update HADOOP_OPTS:

hadoop-env.sh
export HADOOP_OPTS="$HADOOP_OPTS -Djava.net.preferIPv4Stack=true -Djava.library.path=${HADOOP_PREFIX}/lib"
export HADOOP_COMMON_LIB_NATIVE_DIR=${HADOOP_PREFIX}/lib/native
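
If you would rather script the JAVA_HOME change than edit the file by hand, a sed one-liner can do it. This is only a sketch: the function name set_java_home is hypothetical, it assumes GNU sed (for -i), and it assumes the target line starts with "export JAVA_HOME=", as it does in the stock hadoop-env.sh.

```shell
# Hypothetical helper: rewrite the "export JAVA_HOME=..." line in place.
# The value must not contain "|" or "&", which are special to this sed command.
set_java_home() {
  local env_file="$1" java_home="$2"
  sed -i "s|^export JAVA_HOME=.*|export JAVA_HOME=${java_home}|" "$env_file"
}
```

For this installation: $ set_java_home /usr/local/hadoop/etc/hadoop/hadoop-env.sh /usr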

Edit yarn-env.sh

yarn-env.sh
export HADOOP_CONF_LIB_NATIVE_DIR=${HADOOP_PREFIX}/lib/native
export HADOOP_OPTS="-Djava.library.path=${HADOOP_PREFIX}/lib"

Edit core-site.xml

core-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/local/hadoop/data</value>
  </property>
</configuration>
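
A typo in one of these XML files tends to surface only when the daemons start, so it can help to read a property back after editing. The sketch below is a deliberately simple line-based reader, not a real XML parser: it assumes each name and value element sits on a single line, as in the files in this post. The function name hadoop_conf_value is hypothetical.

```shell
# Hypothetical helper: print the <value> that follows a given <name>
# in a Hadoop *-site.xml file (line-based, not a full XML parser).
hadoop_conf_value() {
  local file="$1" name="$2"
  awk -v want="$name" '
    /<name>/ {
      line = $0
      sub(/.*<name>/, "", line); sub(/<\/name>.*/, "", line)
      found = (line == want)          # remember whether this is our property
    }
    found && /<value>/ {
      line = $0
      sub(/.*<value>/, "", line); sub(/<\/value>.*/, "", line)
      print line
      exit
    }
  ' "$file"
}
```

For example: $ hadoop_conf_value /usr/local/hadoop/etc/hadoop/core-site.xml hadoop.tmp.dir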

Edit mapred-site.xml

mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>

Edit hdfs-site.xml

hdfs-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
</configuration>

Edit yarn-site.xml

yarn-site.xml
<?xml version="1.0"?>
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>localhost:8025</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>localhost:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>localhost:8050</value>
  </property>
</configuration>

9. Format Namenode and Start Hadoop

$ /usr/local/hadoop/bin/hdfs namenode -format
$ /usr/local/hadoop/sbin/start-dfs.sh
$ /usr/local/hadoop/sbin/start-yarn.sh
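
Once both scripts have returned, a quick round trip through HDFS shows whether the NameNode and DataNode are actually accepting data. The following is a sketch only: the function name hdfs_smoke_test and the HDFS path /tmp/smoke-test-<pid> are hypothetical, and it assumes the hdfs binary lives under $HADOOP_HOME/bin.

```shell
# Hypothetical smoke test: write a small file into HDFS, read it back,
# then clean up. Returns non-zero if any of the HDFS steps fails.
hdfs_smoke_test() {
  local hdfs_bin="${HADOOP_HOME:-/usr/local/hadoop}/bin/hdfs"
  local target="/tmp/smoke-test-$$"   # hypothetical scratch path in HDFS
  local sample status=0
  sample=$(mktemp) || return 1
  echo "hello hadoop" > "$sample"
  {
    "$hdfs_bin" dfs -mkdir -p "$target" &&
    "$hdfs_bin" dfs -put "$sample" "$target/sample.txt" &&
    "$hdfs_bin" dfs -cat "$target/sample.txt"
  } || status=$?
  # Best-effort cleanup of the HDFS scratch directory and the local file
  "$hdfs_bin" dfs -rm -r -f "$target" >/dev/null 2>&1
  rm -f "$sample"
  return "$status"
}
```

Run as hduser: $ hdfs_smoke_test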

10. Verify Installation

Run $ jps to confirm that each process is running.
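
For a single-node setup started with start-dfs.sh and start-yarn.sh, the daemons to look for are NameNode, DataNode, SecondaryNameNode, ResourceManager, and NodeManager. The check can be sketched as below; the function name missing_hadoop_daemons is hypothetical, and it assumes the usual jps output format of one "<pid> <name>" pair per line.

```shell
# Hypothetical helper: given jps output, print the expected Hadoop
# daemons that are NOT listed (prints an empty line if all are present).
missing_hadoop_daemons() {
  local jps_output="$1" daemon missing=""
  for daemon in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
    # Match the second column exactly so NameNode does not match SecondaryNameNode
    printf '%s\n' "$jps_output" |
      awk -v d="$daemon" '$2 == d { ok = 1 } END { exit !ok }' ||
      missing="$missing $daemon"
  done
  printf '%s\n' "${missing# }"
}
```

Usage: $ missing_hadoop_daemons "$(jps)"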

Open a browser and go to http://localhost:50070 to view the web interface.

Originally published on Medium