
Hadoop: "Failed to fetch" Error in the Reduce Phase and How to Fix It


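Recently, while running Hadoop 1.1, I hit a case where map reached 100% but reduce stayed stuck at 0%; sometimes the datanode could not even be started. The logs showed Failed to fetch messages along with connection refused errors. The configuration looked correct, so I suspected the /etc/hosts file. The Hadoop wiki page on connection refused, which I consulted, reads roughly as follows: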

http://wiki.apache.org/hadoop/ConnectionRefused

Connection Refused

You get a ConnectionRefused Exception when there is a machine at the address specified, but there is no program listening on the specific TCP port the client is using -and there is no firewall in the way silently dropping TCP connection requests. If you do not know what a TCP connection request is, please consult the specification.

Unless there is a configuration error at either end, a common cause for this is the Hadoop service isn’t running.

  1. Check the hostname the client using is correct
  2. Check the IP address the client gets for the hostname is correct.
  3. Check that there isn’t an entry for your hostname mapped to 127.0.0.1 or 127.0.1.1 in /etc/hosts (Ubuntu is notorious for this)
  4. Check the port the client is using matches that the server is offering a service on.
  5. On the server, try a telnet localhost <port> to see if the port is open there.
  6. On the client, try a telnet <server> <port> to see if the port is accessible remotely.
  7. Try connecting to the server/port from a different machine, to see if it just the single client misbehaving.

None of these are Hadoop problems, they are host, network and firewall configuration issues. As it is your cluster, only you can find out and track down the problem.
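
The telnet checks in steps 5 to 7 can also be scripted. The following is a minimal Python sketch, not part of the wiki page; the function name check_port and the three result strings are made up for illustration. It separates a refused connection (a machine is there, but nothing is listening on the port) from an unreachable host (timeout, DNS failure, or no route).

```python
import socket

def check_port(host, port, timeout=3.0):
    """Try one TCP connection and classify the result.

    Returns "open", "refused", or "unreachable". The labels are
    illustrative only; they are not Hadoop terminology.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # A machine answered, but no program is listening on this port.
        return "refused"
    except OSError:
        # Covers timeouts, name resolution failures, and routing errors.
        return "unreachable"
```

For example, check_port("hadoop01", 50070) run from a client node plays the role of telnet hadoop01 50070; the hostname and port here are the ones used elsewhere in this article and should be replaced with your own.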

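One point is critical: each node's hostname must be bound to the IP address (or domain name, as listed in the slaves or masters file) that the Hadoop configuration uses for it. For example: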

192.168.1.101  hadoop01

In addition, if a node's hostname has never been changed, the hosts file will contain a binding between the hostname and 127.0.0.1:

127.0.0.1  localhost localhost.localdomain

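Run the hostname command to check the local hostname. It may be localhost.localdomain or localhost, in which case that binding needs to be commented out. With that change the problem was solved and jobs ran normally. However, files on HDFS still could not be viewed at hostname:50070 (the "Browse the filesystem" link would not open).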
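
The hosts file check described above can be sketched in Python. This is an illustrative helper, not a Hadoop tool; the function names loopback_names and hostname_bound_to_loopback are hypothetical. It parses hosts-file text and reports which names are mapped to a loopback address such as 127.0.0.1 or 127.0.1.1.

```python
import ipaddress

def loopback_names(hosts_text):
    """Return the set of names that hosts-file text maps to a loopback address."""
    names = set()
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        fields = line.split()
        try:
            addr = ipaddress.ip_address(fields[0])
        except ValueError:
            continue  # skip lines that do not start with a valid IP address
        if addr.is_loopback:  # 127.0.0.0/8 or ::1
            names.update(fields[1:])
    return names

def hostname_bound_to_loopback(hostname, hosts_text):
    """True if the given hostname appears on a loopback line."""
    return hostname in loopback_names(hosts_text)
```

On a node, one might call hostname_bound_to_loopback(socket.gethostname(), open("/etc/hosts").read()) after importing socket. A True result means the node's own hostname resolves to loopback, which is the situation the wiki page warns about in step 3.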

Permanent link to this article: http://www.linuxidc.com/Linux/2017-11/148344.htm

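Copyright notice: original article of this site, published by 星锅 on 2022-01-21, 2405 characters in total.
Reposting: unless otherwise stated, articles on this site are released under the CC-4.0 license; please credit the source when reposting.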