Hadoop 0.20.2 pseudo-distributed configuration on Mac OS X 10.9


1. Download Hadoop 0.20.2 and unpack the archive with tar -xvzf

2. Edit the file conf/hadoop-env.sh to define at least JAVA_HOME to be the root of your Java installation.

Add this line:

export JAVA_HOME=/Library/Java/Home
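The Java location can differ between Mac OS X installs, so the hard-coded path may not suit every machine. Below is a minimal sketch of one way to pick the value. It assumes the macOS helper /usr/libexec/java_home is present; the function name detect_java_home is made up for this illustration, and it falls back to the path used in this article.

```shell
# Hypothetical helper (not part of Hadoop): print a candidate JAVA_HOME.
detect_java_home() {
  # /usr/libexec/java_home is a Mac OS X tool that prints the active JDK root.
  if [ -x /usr/libexec/java_home ]; then
    found=$(/usr/libexec/java_home 2>/dev/null) || found=""
    if [ -n "$found" ]; then
      echo "$found"
      return 0
    fi
  fi
  # Fall back to the path used in this article.
  echo "/Library/Java/Home"
}
```

The printed path could then be added to conf/hadoop-env.sh, for example with echo "export JAVA_HOME=$(detect_java_home)" >> conf/hadoop-env.sh.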

3. Try the following command:
$ bin/hadoop

This will display the usage documentation for the hadoop script.

4. Edit the configuration files in the conf folder

conf/core-site.xml:

<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>

conf/hdfs-site.xml:

<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>

conf/mapred-site.xml:

<configuration>
<property>
<name>mapred.job.tracker</name>
<value>localhost:9001</value>
</property>
</configuration>
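The three files share one shape, a single property inside a configuration element, so they can also be generated. The sketch below writes them from the shell. The helper names write_site and write_pseudo_conf are hypothetical, and the script overwrites any existing files of the same name, so it is only meant for a fresh install.

```shell
# Hypothetical helper: write_site FILE NAME VALUE
# Emits a configuration file holding one property.
write_site() {
  cat > "$1" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>$2</name>
    <value>$3</value>
  </property>
</configuration>
EOF
}

# Hypothetical helper: write the three pseudo-distributed files into a conf directory.
write_pseudo_conf() {
  conf_dir="$1"
  mkdir -p "$conf_dir" || return 1
  write_site "$conf_dir/core-site.xml"   fs.default.name    hdfs://localhost:9000 &&
  write_site "$conf_dir/hdfs-site.xml"   dfs.replication    1 &&
  write_site "$conf_dir/mapred-site.xml" mapred.job.tracker localhost:9001
}
```

From the Hadoop directory this would be called as write_pseudo_conf conf.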

5. Configure ssh

Now check that you can ssh to the localhost without a passphrase:
$ ssh localhost

If you cannot ssh to localhost without a passphrase, execute the following commands:
$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
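Running the cat command more than once appends the same key again each time. A minimal sketch of an idempotent variant follows. The function name authorize_key is hypothetical; it also tightens the file permissions, on the assumption that sshd is running with its usual strict permission checks.

```shell
# Hypothetical helper: append a public key to authorized_keys only if it is missing.
authorize_key() {
  pub_key="$1"
  auth_file="$2"
  [ -f "$pub_key" ] || { echo "no such key: $pub_key" >&2; return 1; }
  mkdir -p "$(dirname "$auth_file")" || return 1
  touch "$auth_file" || return 1
  # -x: match whole lines, -F: fixed strings, -f: read the pattern from the key file
  if ! grep -qxFf "$pub_key" "$auth_file"; then
    cat "$pub_key" >> "$auth_file"
  fi
  # Keep the file private to the owner.
  chmod 600 "$auth_file"
}
```

It would be called as authorize_key ~/.ssh/id_dsa.pub ~/.ssh/authorized_keys, after which ssh localhost can be retried. Note that ssh localhost also needs the ssh server to be running; on Mac OS X that is the Remote Login option in the Sharing preferences.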

6. Run Hadoop

Format a new distributed-filesystem:
$ bin/hadoop namenode -format

Start the hadoop daemons:
$ bin/start-all.sh

The hadoop daemon log output is written to the ${HADOOP_LOG_DIR} directory (defaults to ${HADOOP_HOME}/logs).

Browse the web interface for the NameNode and the JobTracker; by default they are available at:

NameNode – http://localhost:50070/
JobTracker – http://localhost:50030/
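Besides the web pages, the jps tool from the JDK lists running Java processes and gives a quick check that all five daemons came up. The sketch below reads jps output on stdin and prints the name of every expected daemon that is absent. The function name missing_daemons is hypothetical, and the daemon list assumes the standard set started by start-all.sh in this release.

```shell
# Hypothetical helper: read `jps` output on stdin, print each expected daemon that is absent.
missing_daemons() {
  listing=$(cat)
  for daemon in NameNode DataNode SecondaryNameNode JobTracker TaskTracker; do
    # jps prints "<pid> <class name>"; compare the class name as a whole field
    # so that SecondaryNameNode is not mistaken for NameNode.
    if ! printf '%s\n' "$listing" |
        awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
      echo "$daemon"
    fi
  done
}
```

Typical use would be jps | missing_daemons; no output means every daemon was found, and for any name that is printed the matching log file under the logs directory is the place to look.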

7. Run Hadoop's hello world program

mkdir input

Put the text files whose words you want to count inside it.

Copy the folder into HDFS:

bin/hadoop dfs -put input input

Run the example word count program, reading from the HDFS folder named input and writing to the HDFS folder named output:

bin/hadoop jar hadoop-0.20.2-examples.jar wordcount input output

Copy the output folder from HDFS to a local folder named output:

bin/hadoop dfs -get output output

View the word counts:

cat output/*
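The raw listing is ordered by word, which makes the most frequent words hard to spot. A minimal sketch for ranking them follows, assuming each result line has the form word, tab, count. The function name top_words is hypothetical, and the part-* file pattern in the usage note is an assumption about how the job names its result files.

```shell
# Hypothetical helper: print the N most frequent words from word count result files.
# Usage: top_words N FILE...
top_words() {
  n="$1"
  shift
  # Sort numerically on the second field (the count), largest first;
  # ties are ordered by the word itself.
  cat "$@" | sort -k2,2nr -k1,1 | head -n "$n"
}
```

For example, top_words 10 output/part-* would print the ten most frequent words. Passing only the part files also avoids errors from cat if the output folder contains subdirectories.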

8. Helpful link
http://www.cs.brandeis.edu/~rshaull/cs147a-fall-2008/hadoop-troubleshooting/

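That page lists some simple problems that come up during installation. For example, when I ran the examples I hit:

java.io.IOException: Not a file: hdfs://localhost:9000/user/ross/input/conf

According to the page, this happens because the old input folder in HDFS was never deleted, so the new upload landed inside it as the subdirectory input/conf. The fix is to remove the folder and upload again: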

bin/hadoop dfs -rmr input
bin/hadoop dfs -put conf input
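The same remove-then-upload sequence can be wrapped so that a stale folder never causes the error. This is a sketch only. The function name fresh_put and the HADOOP_CMD variable are hypothetical, and it assumes that dfs -test -e reports existence through its exit status, as the fs shell usage text describes.

```shell
# Assumption: run from the Hadoop install directory; set HADOOP_CMD to override.
HADOOP_CMD="${HADOOP_CMD:-bin/hadoop}"

# Hypothetical helper: replace an HDFS path with a fresh copy of a local one.
# Usage: fresh_put LOCAL_SRC HDFS_DST
fresh_put() {
  src="$1"
  dst="$2"
  # `dfs -test -e` is expected to exit 0 when the HDFS path already exists.
  if "$HADOOP_CMD" dfs -test -e "$dst"; then
    "$HADOOP_CMD" dfs -rmr "$dst" || return 1
  fi
  "$HADOOP_CMD" dfs -put "$src" "$dst"
}
```

With this in place, fresh_put conf input performs the two commands above, and skips the removal when input does not exist yet. Because it deletes the destination without asking, it should only be pointed at scratch folders.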

9. References

The installation steps follow https://hadoop.apache.org/docs/r1.2.1/single_node_setup.html



2022-01-20
