
Dynamically Adding, Removing, and Restoring Hadoop DataNodes


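1. Configure the system environment

This covers the hostname, SSH trust, environment variables, and so on.

JDK installation is not covered here. Make sure the JDK install path on the new DataNode matches JAVA_HOME in etc/hadoop/hadoop-env.sh. The Hadoop version used is 2.7.5.

Edit /etc/sysconfig/network and set HOSTNAME to the new name,

then run
hostname <new-hostname>
Log out and log back in so that the shell prompt picks up the new name.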

[root@localhost ~]# hostname
localhost.localdomain
[root@localhost ~]# hostname -i
::1 127.0.0.1
[root@localhost ~]#
[root@localhost ~]# cat /etc/sysconfig/network
# Created by anaconda
NETWORKING=yes
HOSTNAME=slave2
GATEWAY=192.168.48.2
# Oracle-rdbms-server-11gR2-preinstall : Add NOZEROCONF=yes
NOZEROCONF=yes
[root@localhost ~]# hostname slave2
[root@localhost ~]# hostname
slave2
[root@localhost ~]# su - hadoop
Last login: Sat Feb 24 14:25:48 CST 2018 on pts/1
[hadoop@slave2 ~]$ su - root
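
The hostname change above can be double-checked with a small script. This is only a sketch: the function name configured_hostname and the NETWORK_FILE variable are made up for illustration, and it assumes the file uses a single HOSTNAME= line as in the listing above.

```shell
# Print the HOSTNAME= value from a sysconfig-style network file.
configured_hostname() {
    sed -n 's/^HOSTNAME=//p' "$1" | tail -n 1
}

# Compare the configured name with the running one.
# NETWORK_FILE can be overridden to try the check on a copy of the file.
NETWORK_FILE="${NETWORK_FILE:-/etc/sysconfig/network}"
if [ -r "$NETWORK_FILE" ]; then
    want="$(configured_hostname "$NETWORK_FILE")"
    have="$(hostname)"
    if [ "$want" = "$have" ]; then
        echo "hostname OK: $have"
    else
        echo "hostname mismatch: configured '$want', running '$have'"
    fi
fi
```

A mismatch usually means the hostname command has not been run yet, or the file was edited after the name was set.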

Create the DataNode directory and change its owner

(For the actual paths, check dfs.name.dir, dfs.data.dir, and hadoop.tmp.dir in /usr/hadoop/hadoop-2.7.5/etc/hadoop/hdfs-site.xml and core-site.xml on the NameNode.)

su - root

# mkdir -p /usr/local/hadoop-2.7.5/tmp/dfs/data

# chmod -R 777 /usr/local/hadoop-2.7.5/tmp

# chown -R hadoop:hadoop /usr/local/hadoop-2.7.5

[root@slave2 ~]# mkdir -p /usr/local/hadoop-2.7.5/tmp/dfs/data
[root@slave2 ~]# chmod -R 777 /usr/local/hadoop-2.7.5/tmp
[root@slave2 ~]# chown -R hadoop:hadoop /usr/local/hadoop-2.7.5
[root@slave2 ~]# pwd
/root
[root@slave2 ~]# cd /usr/local/
[root@slave2 local]# ll
total 0
drwxr-xr-x. 2 root  root  46 Mar 21  2017 bin
drwxr-xr-x. 2 root  root    6 Jun 10  2014 etc
drwxr-xr-x. 2 root  root    6 Jun 10  2014 games
drwxr-xr-x  3 hadoop hadoop 16 Feb 24 18:18 hadoop-2.7.5
drwxr-xr-x. 2 root  root    6 Jun 10  2014 include
drwxr-xr-x. 2 root  root    6 Jun 10  2014 lib
drwxr-xr-x. 2 root  root    6 Jun 10  2014 lib64
drwxr-xr-x. 2 root  root    6 Jun 10  2014 libexec
drwxr-xr-x. 2 root  root    6 Jun 10  2014 sbin
drwxr-xr-x. 5 root  root  46 Dec 17  2015 share
drwxr-xr-x. 2 root  root    6 Jun 10  2014 src
[root@slave2 local]#
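
Since the directory paths have to match the NameNode's configuration, it can help to read them out of the config file instead of typing them by hand. The sketch below uses a hypothetical helper, xml_property, and assumes every <name> and <value> element sits on its own line, as in the hdfs-site.xml shown in this article. On a node where Hadoop is already installed, hdfs getconf -confKey <key> is the more robust route.

```shell
# xml_property FILE KEY
# Print the <value> paired with <name>KEY</name> in a Hadoop *-site.xml file.
# Assumes each <name> and <value> element sits on a single line.
xml_property() {
    awk -v key="$2" '
        /<name>/  { n = $0; sub(/.*<name>/, "", n); sub(/<\/name>.*/, "", n) }
        /<value>/ { v = $0; sub(/.*<value>/, "", v); sub(/<\/value>.*/, "", v)
                    if (n == key) { print v; exit } }
    ' "$1"
}

# Example (run as root on the new node, against a copy of the NameNode's file):
#   data_dir="$(xml_property hdfs-site.xml dfs.data.dir)"
#   mkdir -p "$data_dir" && chown -R hadoop:hadoop "$data_dir"
```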

SSH trust: passwordless login from master to slave2

master:

[root@hadoop-master ~]# cat /etc/hosts

127.0.0.1  localhost localhost.localdomain localhost4 localhost4.localdomain4

::1        localhost localhost.localdomain localhost6 localhost6.localdomain6

192.168.48.129    hadoop-master

192.168.48.132    slave1

192.168.48.131    slave2

[hadoop@hadoop-master ~]$ scp /usr/hadoop/.ssh/authorized_keys hadoop@slave2:/usr/hadoop/.ssh

The authenticity of host 'slave2 (192.168.48.131)' can't be established.

ECDSA key fingerprint is 1e:cd:d1:3d:b0:5b:62:45:a3:63:df:c7:7a:0f:b8:7c.

Are you sure you want to continue connecting (yes/no)? yes

Warning: Permanently added 'slave2,192.168.48.131' (ECDSA) to the list of known hosts.

hadoop@slave2's password:

authorized_keys       

[hadoop@hadoop-master ~]$ ssh hadoop@slave2

Last login: Sat Feb 24 18:27:33 2018

[hadoop@slave2 ~]$

[hadoop@slave2 ~]$ exit

logout

Connection to slave2 closed.

[hadoop@hadoop-master ~]$
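
The two prerequisites in this step, name resolution through /etc/hosts and key-based login, can be checked with a short script before moving on. This is a sketch with hypothetical function names (host_in_file, passwordless_ssh_ok). The ssh options BatchMode and ConnectTimeout are standard OpenSSH client options, and BatchMode makes ssh fail instead of prompting for a password.

```shell
# host_in_file FILE NAME
# Return 0 if NAME appears as a hostname (any column after the IP) in FILE.
host_in_file() {
    awk -v name="$2" '
        /^[[:space:]]*#/ { next }                      # skip comment lines
        { for (i = 2; i <= NF; i++) if ($i == name) found = 1 }
        END { exit found ? 0 : 1 }
    ' "$1"
}

# passwordless_ssh_ok USER@HOST
# Return 0 if key-based login works without any prompt.
passwordless_ssh_ok() {
    ssh -o BatchMode=yes -o ConnectTimeout=5 "$1" true
}

# Example, run on the master:
#   host_in_file /etc/hosts slave2 && passwordless_ssh_ok hadoop@slave2 \
#       && echo "slave2 is reachable without a password"
```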

2. Edit the slaves file on the NameNode and add the new node

[hadoop@hadoop-master hadoop]$ pwd

/usr/hadoop/hadoop-2.7.5/etc/hadoop

[hadoop@hadoop-master hadoop]$ vi slaves

slave1

slave2
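
Editing the slaves file by hand works, but a scripted version avoids duplicate entries when the step is repeated. The helper below, add_line_once, is a made-up name for illustration. It also handles a file whose last line has no trailing newline, which would otherwise glue the new entry onto the previous one.

```shell
# add_line_once FILE LINE
# Append LINE to FILE unless an identical line is already present.
add_line_once() {
    file="$1"; line="$2"
    touch "$file"
    # If the file does not end with a newline, add one first.
    if [ -n "$(tail -c 1 "$file")" ]; then
        printf '\n' >> "$file"
    fi
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Example, run in /usr/hadoop/hadoop-2.7.5/etc/hadoop on the NameNode:
#   add_line_once slaves slave2
```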

3. On the NameNode, copy hadoop-2.7.5 to the new node, then delete the files in the data and logs directories on the new node

Master

[hadoop@hadoop-master ~]$ scp -r hadoop-2.7.5 hadoop@slave2:/usr/hadoop

Slave2

[hadoop@slave2 hadoop-2.7.5]$ ll

total 124

drwxr-xr-x 2 hadoop hadoop  4096 Feb 24 14:29 bin

drwxr-xr-x 3 hadoop hadoop    19 Feb 24 14:30 etc

drwxr-xr-x 2 hadoop hadoop  101 Feb 24 14:30 include

drwxr-xr-x 3 hadoop hadoop    19 Feb 24 14:29 lib

drwxr-xr-x 2 hadoop hadoop  4096 Feb 24 14:29 libexec

-rw-r--r-- 1 hadoop hadoop 86424 Feb 24 18:44 LICENSE.txt

drwxrwxr-x 2 hadoop hadoop  4096 Feb 24 14:30 logs

-rw-r--r-- 1 hadoop hadoop 14978 Feb 24 18:44 NOTICE.txt

-rw-r--r-- 1 hadoop hadoop  1366 Feb 24 18:44 README.txt

drwxr-xr-x 2 hadoop hadoop  4096 Feb 24 14:29 sbin

drwxr-xr-x 4 hadoop hadoop    29 Feb 24 14:30 share

[hadoop@slave2 hadoop-2.7.5]$ pwd

/usr/hadoop/hadoop-2.7.5

[hadoop@slave2 hadoop-2.7.5]$ rm -R logs/*
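
Note that rm -R logs/* leaves hidden files behind, because the shell glob does not match names starting with a dot. A possible alternative is sketched below. The function name clear_dir_contents is hypothetical, and the -mindepth and -delete options assume GNU or BSD find, which is what CentOS ships.

```shell
# clear_dir_contents DIR...
# Remove everything inside each DIR (hidden files included) but keep DIR itself.
clear_dir_contents() {
    for d in "$@"; do
        [ -d "$d" ] || continue          # silently skip directories that do not exist
        find "$d" -mindepth 1 -delete
    done
}

# Example, run on the new node with the paths used in this article:
#   clear_dir_contents /usr/hadoop/hadoop-2.7.5/logs /usr/local/hadoop-2.7.5/tmp/dfs/data
```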

4. Start the DataNode and NodeManager processes on the new node

Before going further, confirm that the host being added is not listed in etc/hadoop/excludes on the NameNode or on the existing DataNodes.

[hadoop@slave2 hadoop-2.7.5]$ sbin/hadoop-daemon.sh start datanode
starting datanode, logging to /usr/hadoop/hadoop-2.7.5/logs/hadoop-hadoop-datanode-slave2.out
[hadoop@slave2 hadoop-2.7.5]$ sbin/yarn-daemon.sh start nodemanager
starting nodemanager, logging to /usr/hadoop/hadoop-2.7.5/logs/yarn-hadoop-nodemanager-slave2.out
[hadoop@slave2 hadoop-2.7.5]$
[hadoop@slave2 hadoop-2.7.5]$ jps
3897 DataNode
6772 NodeManager
8189 Jps
[hadoop@slave2 ~]$
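
Rather than reading the jps listing by eye, the check can be scripted. The sketch below uses a hypothetical function, jps_has_all, which reads jps output on standard input and succeeds only if every named process appears in the second column.

```shell
# jps_has_all NAME...
# Read `jps` output on stdin; return 0 only if every NAME is listed.
jps_has_all() {
    out="$(cat)"
    for name in "$@"; do
        printf '%s\n' "$out" \
            | awk -v n="$name" '$2 == n { ok = 1 } END { exit !ok }' \
            || return 1
    done
}

# Example, run on the new node:
#   jps | jps_has_all DataNode NodeManager && echo "both daemons are running"
```

If the check fails right after startup, the .out and .log files named in the startup messages are the first place to look.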

5. Refresh the nodes on the NameNode

[hadoop@hadoop-master ~]$ hdfs dfsadmin -refreshNodes
Refresh nodes successful
[hadoop@hadoop-master ~]$ sbin/start-balancer.sh

6. Check the current cluster status on the NameNode

Confirm that the new node has joined successfully.

[hadoop@hadoop-master hadoop]$ hdfs dfsadmin -report
Configured Capacity: 58663657472 (54.63 GB)
Present Capacity: 15487176704 (14.42 GB)
DFS Remaining: 15486873600 (14.42 GB)
DFS Used: 303104 (296 KB)
DFS Used%: 0.00%
Under replicated blocks: 5
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0

-------------------------------------------------
Live datanodes (2):

Name: 192.168.48.131:50010 (slave2)
Hostname: 183.221.250.11
Decommission Status : Normal
Configured Capacity: 38588669952 (35.94 GB)
DFS Used: 8192 (8 KB)
Non DFS Used: 36887191552 (34.35 GB)
DFS Remaining: 1701470208 (1.58 GB)
DFS Used%: 0.00%
DFS Remaining%: 4.41%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Thu Mar 01 19:36:33 PST 2018

Name: 192.168.48.132:50010 (slave1)
Hostname: slave1
Decommission Status : Normal
Configured Capacity: 20074987520 (18.70 GB)
DFS Used: 294912 (288 KB)
Non DFS Used: 6289289216 (5.86 GB)
DFS Remaining: 13785403392 (12.84 GB)
DFS Used%: 0.00%
DFS Remaining%: 68.67%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Thu Mar 01 19:36:35 PST 2018

[hadoop@hadoop-master hadoop]$
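
The report is long, so a filter that prints only the names of the live DataNodes makes the check quicker. This is a sketch, assuming the report layout shown above ("Live datanodes (N):" followed by "Name: ip:port (hostname)" entries). The function name live_datanodes is made up for illustration.

```shell
# live_datanodes
# Read `hdfs dfsadmin -report` output on stdin and print the hostname of
# every DataNode listed in the "Live datanodes" section, one per line.
live_datanodes() {
    awk '
        / datanodes \([0-9]+\):/ { live = ($1 == "Live") }   # track the section
        live && /^Name:/ { gsub(/[()]/, "", $3); print $3 }
    '
}

# Example, run on the NameNode:
#   hdfs dfsadmin -report | live_datanodes | grep -qx slave2 \
#       && echo "slave2 has joined the cluster"
```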

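7. Dynamically remove a DataNode

7.1 Configure hdfs-site.xml on the NameNode

Lower the dfs.replication value if necessary so that it does not exceed the number of DataNodes that will remain, and add the dfs.hosts.exclude property.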

[hadoop@hadoop-master hadoop]$ pwd
/usr/hadoop/hadoop-2.7.5/etc/hadoop
[hadoop@hadoop-master hadoop]$ cat hdfs-site.xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <property>
    <name>dfs.name.dir</name>
    <value>/usr/local/hadoop-2.7.5/tmp/dfs/name</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/usr/local/hadoop-2.7.5/tmp/dfs/data</value>
  </property>
  <property>
    <name>dfs.hosts.exclude</name>
    <value>/usr/hadoop/hadoop-2.7.5/etc/hadoop/excludes</value>
  </property>
</configuration>
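
The advice to lower dfs.replication comes down to simple arithmetic: the blocks on a node being removed have to be re-replicated onto the nodes that stay, so the replication factor should not exceed the number of DataNodes left. If it does, decommissioning may stall in the "Decommission in progress" state. A minimal sketch of that check, with a hypothetical function name:

```shell
# enough_nodes_after_removal LIVE REMOVING REPLICATION
# Return 0 if the DataNodes left after removal can still hold REPLICATION copies.
enough_nodes_after_removal() {
    live="$1"; removing="$2"; replication="$3"
    [ $((live - removing)) -ge "$replication" ]
}

# With the values in this article (2 live DataNodes, 1 being removed,
# dfs.replication = 3) the check fails, which is why the value should be lowered:
#   enough_nodes_after_removal 2 1 3 || echo "lower dfs.replication first"
```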

7.2 On the NameNode, create an excludes file in the configuration directory (etc/hadoop/)

and write the IP address or hostname of each DataNode to be removed into it.

[hadoop@hadoop-master hadoop]$ pwd
/usr/hadoop/hadoop-2.7.5/etc/hadoop
[hadoop@hadoop-master hadoop]$ vi excludes
####slave2
192.168.48.131
[hadoop@hadoop-master hadoop]$

7.3 Refresh the nodes on the NameNode

hdfs dfsadmin -refreshNodes
sbin/start-balancer.sh

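7.4 Check the current cluster status on the NameNode

Confirm that the node has been removed successfully: slave2 should no longer appear in the output.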

[hadoop@hadoop-master hadoop]$ hdfs dfsadmin -report

Alternatively, watch the web UI (ip:50070), where the DataNode gradually moves to Dead.

http://192.168.48.129:50070/

On the Datanodes tab, once the Admin State changes from "In Service" to "Decommissioned", the removal has succeeded.
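
The same state can be read from the command line, which is handy for waiting in a script. The sketch below assumes the "Name:" and "Decommission Status :" lines have the layout shown in the report earlier in this article. The function name decommission_status is hypothetical.

```shell
# decommission_status IP
# Read `hdfs dfsadmin -report` output on stdin and print the
# "Decommission Status" value of the DataNode whose address is IP.
decommission_status() {
    awk -v node="$1" '
        /^Name:/ { split($2, addr, ":"); current = (addr[1] == node) }
        current && /^Decommission Status/ {
            sub(/^Decommission Status *: */, "")
            print
            exit
        }
    '
}

# Example, run on the NameNode: wait until slave2 is fully decommissioned.
#   until [ "$(hdfs dfsadmin -report | decommission_status 192.168.48.131)" = "Decommissioned" ]; do
#       sleep 30
#   done
```

Stopping the DataNode process only after this state is reached avoids taking away replicas that have not been copied elsewhere yet.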

7.5 Stop the processes on the removed node

[hadoop@slave2 hadoop-2.7.5]$ jps
9530 Jps
3897 DataNode
6772 NodeManager
[hadoop@slave2 hadoop-2.7.5]$ sbin/hadoop-daemon.sh stop datanode
stopping datanode
[hadoop@slave2 hadoop-2.7.5]$ sbin/yarn-daemon.sh stop nodemanager
stopping nodemanager
[hadoop@slave2 hadoop-2.7.5]$ jps
9657 Jps
[hadoop@slave2 hadoop-2.7.5]$

8. Restore a removed node

Delete the node's entry from the excludes file created in 7.2, then repeat steps 4, 5, and 6.
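
Taking the entry back out of the excludes file can also be scripted. This is a sketch with a hypothetical helper, remove_line. It rewrites the file in place through a temporary copy so that ownership and permissions stay as they were.

```shell
# remove_line FILE LINE
# Delete every line of FILE that is exactly equal to LINE.
remove_line() {
    file="$1"; line="$2"
    tmp="$(mktemp)" || return 1
    # grep exits with 1 when nothing is left to print; only 2 or more is an error.
    grep -vxF -- "$line" "$file" > "$tmp" || [ "$?" -eq 1 ] || { rm -f "$tmp"; return 1; }
    cat "$tmp" > "$file"
    rm -f "$tmp"
}

# Example, run in /usr/hadoop/hadoop-2.7.5/etc/hadoop on the NameNode:
#   remove_line excludes 192.168.48.131
#   hdfs dfsadmin -refreshNodes
```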

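Copyright notice: original article on this site, published by 星锅 on 2022-01-21, 7386 characters in total.
Reprint notice: unless stated otherwise, articles on this site are released under the CC-4.0 license. Please credit the source when reprinting.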