
Setting Up a Hadoop 2.5.0 Pseudo-Distributed Environment


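This chapter walks through the steps for setting up a Hadoop 2.5.0 pseudo-distributed environment on Linux. Before Hadoop itself can be installed, a few prerequisites have to be in place: creating a user, installing the JDK, and turning off the firewall.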

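I. Create the hadoop user

Create the hadoop user as root. To keep things simple in a lab environment, give the hadoop user sudo privileges. The commands are as follows:

useradd hadoop # add the hadoop user
passwd hadoop # set its password
visudo # edit the sudoers file and add the line below
hadoop ALL=(root) NOPASSWD:ALL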

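II. Set up the Hadoop pseudo-distributed environment

1. Disable the Linux firewall and SELinux

Disable SELinux as follows (the change takes effect after a reboot):

sudo vi /etc/sysconfig/selinux # open the SELinux configuration file
SELINUX=disabled # set the SELINUX property to disabled

Turn off the firewall as follows:

sudo service iptables status # check the firewall status
sudo service iptables stop # stop the firewall
sudo chkconfig iptables off # keep the firewall from starting at boot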
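The SELinux edit in this step can also be scripted instead of done in an editor. The following is a minimal sketch, assuming GNU sed; the function name disable_selinux_in is hypothetical and not part of any tool.

```shell
# Sketch: set SELINUX=disabled in a config file without opening an editor.
# disable_selinux_in is a hypothetical helper name.
disable_selinux_in() {
    conf="$1"
    # Rewrite the SELINUX=... line only; SELINUXTYPE=... does not match
    # because the pattern requires "=" directly after "SELINUX".
    sed -i 's/^SELINUX=.*/SELINUX=disabled/' "$conf"
}
```

On RHEL/CentOS systems /etc/sysconfig/selinux is normally a symbolic link to /etc/selinux/config, and sed -i replaces a link with a regular file, so it is safer to point the helper at the real file, for example by running the sed command with sudo against /etc/selinux/config.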

2. Install the JDK

First, check whether a bundled JDK is already installed on the system. If one is, uninstall it first:

rpm -qa | grep java # list installed Java packages
sudo rpm -e --nodeps java-1.6.0-openjdk-1.6.0.0-1.50.1.11.5.el6_3.x86_64 tzdata-java-2012j-1.el6.noarch java-1.7.0-openjdk-1.7.0.9-2.3.4.1.el6_3.x86_64 # remove the bundled JDK

Next, install the JDK in two steps:

step1: Extract the archive:

tar -zxf jdk-7u67-linux-x64.tar.gz -C /usr/local/

step2: Configure the environment variables and check that the installation works:

sudo vi /etc/profile # open the profile file
##JAVA_HOME
export JAVA_HOME=/usr/local/jdk1.7.0_67
export PATH=$PATH:$JAVA_HOME/bin

# apply the changes
source /etc/profile # run as the root user

# verify the configuration
java -version
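If this setup is repeated, it helps to add the JAVA_HOME lines only once. The following is a minimal sketch of an idempotent append; add_java_home is a hypothetical helper name, and the profile path and JDK path are passed in as arguments.

```shell
# Sketch: append the JAVA_HOME block to a profile file only if it is missing.
# add_java_home is a hypothetical helper name.
add_java_home() {
    profile="$1"
    java_home="$2"
    # Do nothing when an export for JAVA_HOME is already present.
    if grep -q '^export JAVA_HOME=' "$profile"; then
        return 0
    fi
    {
        echo '##JAVA_HOME'
        echo "export JAVA_HOME=$java_home"
        # Single quotes keep $PATH and $JAVA_HOME unexpanded in the file.
        echo 'export PATH=$PATH:$JAVA_HOME/bin'
    } >> "$profile"
}
```

Because /etc/profile is writable only by root, the helper would have to run in a root shell to be used on that file, for example as add_java_home /etc/profile /usr/local/jdk1.7.0_67.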

3. Install Hadoop

step1: Extract the Hadoop archive

tar -zxvf /opt/software/hadoop-2.5.0.tar.gz -C /opt/software/

Tip: delete the doc directory under /opt/software/hadoop-2.5.0/share.

step2: Set JAVA_HOME in the three configuration files hadoop-env.sh, mapred-env.sh, and yarn-env.sh under etc/hadoop

export JAVA_HOME=/usr/local/jdk1.7.0_67
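The same edit has to be made in three files, so it can be done in a loop. The following is a minimal sketch, assuming GNU sed; set_java_home_in_env_files is a hypothetical helper name. It rewrites an existing JAVA_HOME export in place, whether or not the line is commented out, and appends one only when the file has no such line.

```shell
# Sketch: set JAVA_HOME in hadoop-env.sh, mapred-env.sh and yarn-env.sh.
# set_java_home_in_env_files is a hypothetical helper name.
set_java_home_in_env_files() {
    conf_dir="$1"
    java_home="$2"
    for f in hadoop-env.sh mapred-env.sh yarn-env.sh; do
        file="$conf_dir/$f"
        if grep -q '^#* *export JAVA_HOME=' "$file"; then
            # Replace the existing (possibly commented-out) export in place.
            sed -i "s|^#* *export JAVA_HOME=.*|export JAVA_HOME=$java_home|" "$file"
        else
            # Fallback: no export line at all, so add one at the end.
            echo "export JAVA_HOME=$java_home" >> "$file"
        fi
    done
}
```

From the Hadoop installation directory this could be called as set_java_home_in_env_files etc/hadoop /usr/local/jdk1.7.0_67.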

step3: Edit core-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
    <property>
        <name>name</name>
        <value>my-study-cluster</value>
    </property>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://bigdata01:8020</value>
    </property>
    <!-- Directory for the temporary files that Hadoop generates -->
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/opt/software/hadoop-2.5.0/data/tmp</value>
    </property>
    <property>
        <name>fs.trash.interval</name>
        <value>1440</value>
    </property>
    <property>
        <name>hadoop.http.staticuser.user</name>
        <value>hadoop</value>
    </property>
    <property>
        <name>hadoop.proxyuser.hadoop.hosts</name>
        <value>bigdata01</value>
    </property>
    <property>
        <name>hadoop.proxyuser.hadoop.groups</name>
        <value>*</value>
    </property>
</configuration>
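The configuration refers to the host name bigdata01, so that name is assumed to resolve on this machine. The following is a minimal sketch for reading a property back out of a site file and extracting the host; get_site_property and defaultfs_host are hypothetical helper names, and the parsing assumes each <name> and its <value> sit on separate lines, as they do in this file.

```shell
# Sketch: read one property value from a Hadoop *-site.xml file.
# get_site_property FILE NAME prints the <value> that follows <name>NAME</name>.
get_site_property() {
    awk -v key="$2" '
        index($0, "<name>" key "</name>") { found = 1; next }
        found && /<value>/ {
            sub(/.*<value>/, "")      # drop everything up to the value
            sub(/<\/value>.*/, "")    # drop the closing tag and the rest
            print
            exit
        }
    ' "$1"
}

# defaultfs_host FILE prints only the host part of fs.defaultFS.
defaultfs_host() {
    get_site_property "$1" fs.defaultFS | sed -e 's|^hdfs://||' -e 's|:.*$||'
}
```

One way to use it is getent hosts "$(defaultfs_host etc/hadoop/core-site.xml)"; if that prints nothing, the name does not resolve yet and would need an entry such as one in /etc/hosts.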

step4: Edit hdfs-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <property>
        <name>dfs.permissions.enabled</name>
        <value>false</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>/opt/software/hadoop-2.5.0/data/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>/opt/software/hadoop-2.5.0/data/data</value>
    </property>
</configuration>

step5: Edit mapred-site.xml

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.address</name>
        <value>bigdata01:10020</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>bigdata01:19888</value>
    </property>
</configuration>

step6: Edit yarn-site.xml

<?xml version="1.0"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->
<configuration>

<!-- Site specific YARN configuration properties -->

    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>bigdata01</value>
    </property>
    <property>
        <name>yarn.log-aggregation-enable</name>
        <value>true</value>
    </property>
    <property>
        <name>yarn.log-aggregation.retain-seconds</name>
        <value>106800</value>
    </property>
    <property>
        <name>yarn.log.server.url</name>
        <value>http://bigdata01:19888/jobhistory/job/</value>
    </property>
</configuration>

step7: Edit the slaves file

bigdata01

step8: Format the NameNode

bin/hdfs namenode -format
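Formatting is meant to be done once. Running it again on a NameNode that already holds metadata generates a new cluster ID, and a DataNode that still carries the old one will refuse to start. The following is a minimal sketch of a guard; namenode_is_formatted and format_namenode_once are hypothetical helper names, and the check relies on the current/VERSION file that formatting creates in the name directory.

```shell
# Sketch: only format the NameNode when its directory is still empty.
# namenode_is_formatted DIR succeeds if DIR already holds NameNode metadata.
namenode_is_formatted() {
    [ -f "$1/current/VERSION" ]
}

# format_namenode_once DIR: run from the Hadoop installation directory.
format_namenode_once() {
    if namenode_is_formatted "$1"; then
        echo "NameNode directory $1 is already formatted; skipping."
    else
        bin/hdfs namenode -format
    fi
}
```

With the hdfs-site.xml above, the directory to pass would be /opt/software/hadoop-2.5.0/data/name.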

step9: Start the daemons

## Option 1: start each daemon individually
# start the NameNode
sbin/hadoop-daemon.sh start namenode
# start the DataNode
sbin/hadoop-daemon.sh start datanode
# start the ResourceManager
sbin/yarn-daemon.sh start resourcemanager
# start the NodeManager
sbin/yarn-daemon.sh start nodemanager
# start the SecondaryNameNode
sbin/hadoop-daemon.sh start secondarynamenode
# start the job history server
sbin/mr-jobhistory-daemon.sh start historyserver

## Option 2: use the start scripts
sbin/start-dfs.sh # starts the NameNode, DataNode, and SecondaryNameNode
sbin/start-yarn.sh # starts the ResourceManager and NodeManager
sbin/mr-jobhistory-daemon.sh start historyserver # starts the job history server
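After starting the daemons, the JDK's jps tool lists the running Java processes, which gives a quick way to see whether anything failed to come up. The following is a minimal sketch that reads jps output and prints the expected daemons that are absent; missing_daemons is a hypothetical helper name.

```shell
# Sketch: report which of the six daemons started above are not running.
# missing_daemons reads `jps` output on stdin and prints each absent daemon.
missing_daemons() {
    jps_output=$(cat)
    for d in NameNode DataNode SecondaryNameNode ResourceManager NodeManager JobHistoryServer; do
        # jps prints "<pid> <MainClass>"; compare the second field exactly
        # so that SecondaryNameNode is not mistaken for NameNode.
        if ! printf '%s\n' "$jps_output" |
            awk -v name="$d" '$2 == name { ok = 1 } END { exit !ok }'; then
            echo "$d"
        fi
    done
}
```

It would be used as jps | missing_daemons; empty output means all six daemons are listed. When a daemon is missing, its log file under the logs directory of the installation is the usual place to look.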

step10: Verify

1. Open the HDFS web UI in a browser on port 50070

http://bigdata01:50070

2. Open the YARN web UI in a browser on port 8088

http://bigdata01:8088

3. Run the WordCount program

bin/yarn jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.0.jar wordcount input output

Note: choose your own input and output directories. The output directory must not exist before the job runs.
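The WordCount job needs an input directory in HDFS with at least one file in it. The following is a minimal sketch of the whole round trip, to be run from the Hadoop installation directory; run_wordcount is a hypothetical wrapper name, and using core-site.xml as sample input is only an assumption for illustration.

```shell
# Sketch: upload a sample file, run WordCount, and print the result.
# run_wordcount [INPUT_DIR] [OUTPUT_DIR]; both are HDFS paths.
run_wordcount() {
    in_dir="${1:-input}"
    out_dir="${2:-output}"
    # Relative HDFS paths resolve under the current user's HDFS home directory.
    bin/hdfs dfs -mkdir -p "$in_dir" &&
    bin/hdfs dfs -put etc/hadoop/core-site.xml "$in_dir" &&
    bin/yarn jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.0.jar \
        wordcount "$in_dir" "$out_dir" &&
    # With a single reducer the counts are written to part-r-00000.
    bin/hdfs dfs -cat "$out_dir/part-r-00000"
}
```

To repeat the run with the same output directory, remove it first with bin/hdfs dfs -rm -r output.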

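Done!

These are the steps for setting up a Hadoop 2.5.0 pseudo-distributed environment. If you spot any problems, please point them out. Thanks!

For more Hadoop-related information, see the Hadoop topic page: https://www.linuxidc.com/topicnews.aspx?tid=13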

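Copyright notice: original article from this site, published by 星锅 on 2022-01-21, 6799 characters in total.
Reprint notice: unless otherwise stated, articles on this site are released under the CC-4.0 license. Please credit the source when reprinting.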