
Notes on Setting Up a Spark + Python Development Environment on CentOS 6


1. Starting pyspark from under $SPARK_HOME/bin/ fails with: Traceback (most recent call last):

File "/home/joy/spark/spark/python/pyspark/shell.py", line 28, in <module>

Following search results, I first installed the missing packages with yum install -y zlib*, but the error persisted; running ./pyspark with sudo, however, worked. For now it only runs correctly under sudo, which is probably an environment issue still to be resolved: because the installation was done with sudo, the files are owned by root, and chown can hand ownership back (a sketch follows).
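A minimal sketch of that ownership fix, assuming Spark was unpacked under /home/joy/spark/spark and the regular user is joy:

# hand the Spark tree back to the regular user so pyspark no longer needs sudo
sudo chown -R joy:joy /home/joy/spark/spark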
But then pip would also have to be installed with sudo, so to solve this once and for all, recompile Python.
Solution:

1. Install the dependencies zlib and zlib-devel.

2. Recompile and reinstall Python.

./configure
Edit the Modules/Setup file and uncomment the following line:

zlib zlibmodule.c -I$(prefix)/include -L$(exec_prefix)/lib -lz

Rebuild and install: make && make install
The build finishes, but warnings report that some modules could not be built:
Python build finished, but the necessary bits to build these modules were not found:
_bsddb _curses _curses_panel
_sqlite3 _ssl _tkinter
bsddb185 bz2 dbm
dl gdbm imageop

Whatever the exact wording, the meaning is clear: during compilation the system cannot find what these modules need. To get rid of the messages, install the dependency packages beforehand; the mapping (not necessarily complete) is as follows:

Module         Dependency            Notes
_bsddb         bsddb                 Interface to the Berkeley DB library. bsddb is deprecated since Python 2.6; the bsddb3 module is preferred.
_curses        ncurses               Terminal handling for character-cell displays.
_curses_panel  ncurses               A panel stack extension for curses.
_sqlite3       sqlite                DB-API 2.0 interface for SQLite databases. On CentOS install sqlite-devel.
_ssl           openssl-devel.i686    TLS/SSL wrapper for socket objects.
_tkinter       N/A                   A thin object-oriented layer on top of Tcl/Tk. Can be ignored if no desktop programs are needed.
bsddb185       N/A                   Old bsddb module; can be ignored.
bz2            bzip2-devel.i686      Compression compatible with bzip2.
dbm            bsddb                 Simple "database" interface.
dl             N/A                   Call C functions in shared objects. Deprecated since Python 2.6.
gdbm           gdbm-devel.i686       GNU's reinterpretation of dbm.
imageop        N/A                   Manipulate raw image data. Deprecated.
readline       readline-devel        GNU readline interface.
sunaudiodev    N/A                   Access to Sun audio hardware. Sun-specific; can be ignored on CentOS.
zlib           zlib, zlib-devel      Compression compatible with gzip.

On CentOS, install these dependency packages: readline-devel, sqlite-devel, bzip2-devel.i686, openssl-devel.i686, gdbm-devel.i686, libdbi-devel.i686, ncurses-libs, zlib-devel.i686 (see the yum sketch below). Once they are installed, compile again; errors for the modules marked above as deprecated or ignorable can be ignored.
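A minimal sketch of installing that list in one go (run as root or with sudo; package names taken verbatim from the list above):

# development headers needed so the optional Python modules above can be built
yum install -y readline-devel sqlite-devel bzip2-devel.i686 openssl-devel.i686 \
    gdbm-devel.i686 libdbi-devel.i686 ncurses-libs zlib-devel.i686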

After the compilation succeeds, continue with step six above and install Python into the target directory. Once installation finishes, go to the install directory and check that Python works.
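A minimal sketch of that install and check, assuming a hypothetical install prefix of /usr/local/python27:

# configure with an explicit prefix, then rebuild and install into it
./configure --prefix=/usr/local/python27
make && make install

# confirm the freshly installed interpreter starts and reports its version
/usr/local/python27/bin/python -V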

2. SparkSQL Preparation

First, look at what using HiveContext requires; the article I was following lists three requirements:
1. Check that the $SPARK_HOME/lib directory contains datanucleus-api-jdo-3.2.1.jar, datanucleus-rdbms-3.2.1.jar and datanucleus-core-3.2.2.jar.
2. Check that the $SPARK_HOME/conf directory contains hive-site.xml copied over from $HIVE_HOME/conf.
3. When submitting a program, put the database driver jar on the driver classpath, e.g. bin/spark-submit --driver-class-path *.jar, or set SPARK_CLASSPATH in spark-env.sh.

Following that article, I copied the jars starting with datanucleus from $HIVE_HOME/lib to $SPARK_HOME/lib, copied hive-site.xml from $HIVE_HOME/conf to $SPARK_HOME/conf, and copied the mysql-connector jar from $HIVE_HOME/lib to $SPARK_HOME/jars (a sketch of these copies follows).
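A minimal sketch of those copies, assuming HIVE_HOME and SPARK_HOME are both set in the environment:

# DataNucleus jars that Spark SQL's Hive support needs on its classpath
cp $HIVE_HOME/lib/datanucleus-*.jar $SPARK_HOME/lib/

# Hive configuration so Spark talks to the same metastore
cp $HIVE_HOME/conf/hive-site.xml $SPARK_HOME/conf/

# MySQL JDBC driver used by the Hive metastore (jar name pattern is an assumption)
cp $HIVE_HOME/lib/mysql-connector-*.jar $SPARK_HOME/jars/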

3. Errors When Starting spark-shell

To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
17/01/17 11:42:58 WARN SparkContext: Support for Java 7 is deprecated as of Spark 2.0.0
17/01/17 11:43:00 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/01/17 11:43:00 WARN Utils: Your hostname, node1 resolves to a loopback address: 127.0.0.1; using 192.168.85.128 instead (on interface eth1)
17/01/17 11:43:00 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
17/01/17 11:43:11 WARN HiveConf: DEPRECATED: hive.metastore.ds.retry.* no longer has any effect.  Use hive.hmshandler.retry.* instead
17/01/17 11:43:11 WARN HiveConf: HiveConf of name hive.server2.thrift.http.min.worker.threads does not exist
17/01/17 11:43:11 WARN HiveConf: HiveConf of name hive.mapjoin.optimized.keys does not exist
17/01/17 11:43:11 WARN HiveConf: HiveConf of name hive.mapjoin.lazy.hashtable does not exist
17/01/17 11:43:11 WARN HiveConf: HiveConf of name hive.datampi.maxslots does not exist
17/01/17 11:43:11 WARN HiveConf: HiveConf of name hive.metastore.ds.retry.attempts does not exist
17/01/17 11:43:11 WARN HiveConf: HiveConf of name hive.server2.thrift.http.max.worker.threads does not exist
17/01/17 11:43:11 WARN HiveConf: HiveConf of name hive.datampi.sendqueue does not exist
17/01/17 11:43:11 WARN HiveConf: HiveConf of name hive.optimize.multigroupby.common.distincts does not exist
17/01/17 11:43:11 WARN HiveConf: HiveConf of name hive.metastore.ds.retry.interval does not exist
17/01/17 11:43:11 WARN HiveConf: HiveConf of name hive.datampi.parallelism does not exist
17/01/17 11:43:11 WARN HiveConf: HiveConf of name hive.stats.map.parallelism does not exist
17/01/17 11:43:11 WARN HiveConf: HiveConf of name hive.datampi.memusedpercent does not exist
17/01/17 11:43:12 WARN General: Plugin (Bundle) "org.datanucleus.store.rdbms" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. The URL "file:/home/joy/spark/spark-2.1.0-bin-hadoop2.6/jars/datanucleus-rdbms-3.2.9.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/home/joy/spark/spark/jars/datanucleus-rdbms-3.2.9.jar."
17/01/17 11:43:12 WARN General: Plugin (Bundle) "org.datanucleus.api.jdo" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. The URL "file:/home/joy/spark/spark-2.1.0-bin-hadoop2.6/jars/datanucleus-api-jdo-3.2.6.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/home/joy/spark/spark/jars/datanucleus-api-jdo-3.2.6.jar."
17/01/17 11:43:12 WARN General: Plugin (Bundle) "org.datanucleus" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. The URL "file:/home/joy/spark/spark/jars/datanucleus-core-3.2.10.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/home/joy/spark/spark-2.1.0-bin-hadoop2.6/jars/datanucleus-core-3.2.10.jar."
17/01/17 11:43:16 WARN HiveConf: DEPRECATED: hive.metastore.ds.retry.* no longer has any effect.  Use hive.hmshandler.retry.* instead
17/01/17 11:43:16 WARN HiveConf: HiveConf of name hive.server2.thrift.http.min.worker.threads does not exist
17/01/17 11:43:16 WARN HiveConf: HiveConf of name hive.mapjoin.optimized.keys does not exist
17/01/17 11:43:16 WARN HiveConf: HiveConf of name hive.mapjoin.lazy.hashtable does not exist
17/01/17 11:43:16 WARN HiveConf: HiveConf of name hive.datampi.maxslots does not exist
17/01/17 11:43:16 WARN HiveConf: HiveConf of name hive.metastore.ds.retry.attempts does not exist
17/01/17 11:43:16 WARN HiveConf: HiveConf of name hive.server2.thrift.http.max.worker.threads does not exist
17/01/17 11:43:16 WARN HiveConf: HiveConf of name hive.datampi.sendqueue does not exist
17/01/17 11:43:16 WARN HiveConf: HiveConf of name hive.optimize.multigroupby.common.distincts does not exist
17/01/17 11:43:16 WARN HiveConf: HiveConf of name hive.metastore.ds.retry.interval does not exist
17/01/17 11:43:16 WARN HiveConf: HiveConf of name hive.datampi.parallelism does not exist
17/01/17 11:43:16 WARN HiveConf: HiveConf of name hive.stats.map.parallelism does not exist
17/01/17 11:43:16 WARN HiveConf: HiveConf of name hive.datampi.memusedpercent does not exist
17/01/17 11:43:22 ERROR ObjectStore: Version information found in metastore differs 0.13.0 from expected schema version 1.2.0. Schema verififcation is disabled hive.metastore.schema.verification so setting version.
java.lang.IllegalArgumentException: Error while instantiating 'org.apache.spark.sql.hive.HiveSessionState':
  at org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$reflect(SparkSession.scala:981)
  at org.apache.spark.sql.SparkSession.sessionState$lzycompute(SparkSession.scala:110)
  at org.apache.spark.sql.SparkSession.sessionState(SparkSession.scala:109)
  at org.apache.spark.sql.SparkSession$Builder$$anonfun$getOrCreate$5.apply(SparkSession.scala:878)
  at org.apache.spark.sql.SparkSession$Builder$$anonfun$getOrCreate$5.apply(SparkSession.scala:878)
  at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:99)
  at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:99)
  at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:230)
  at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:40)
  at scala.collection.mutable.HashMap.foreach(HashMap.scala:99)
  at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:878)
  at org.apache.spark.repl.Main$.createSparkSession(Main.scala:95)
  ... 47 elided
Caused by: java.lang.reflect.InvocationTargetException: java.lang.IllegalArgumentException: Error while instantiating 'org.apache.spark.sql.hive.HiveExternalCatalog':
  at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
  at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
  at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
  at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
  at org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$reflect(SparkSession.scala:978)
  ... 58 more
Caused by: java.lang.IllegalArgumentException: Error while instantiating 'org.apache.spark.sql.hive.HiveExternalCatalog':
  at org.apache.spark.sql.internal.SharedState$.org$apache$spark$sql$internal$SharedState$$reflect(SharedState.scala:169)
  at org.apache.spark.sql.internal.SharedState.<init>(SharedState.scala:86)
  at org.apache.spark.sql.SparkSession$$anonfun$sharedState$1.apply(SparkSession.scala:101)
  at org.apache.spark.sql.SparkSession$$anonfun$sharedState$1.apply(SparkSession.scala:101)
  at scala.Option.getOrElse(Option.scala:121)
  at org.apache.spark.sql.SparkSession.sharedState$lzycompute(SparkSession.scala:101)
  at org.apache.spark.sql.SparkSession.sharedState(SparkSession.scala:100)
  at org.apache.spark.sql.internal.SessionState.<init>(SessionState.scala:157)
  at org.apache.spark.sql.hive.HiveSessionState.<init>(HiveSessionState.scala:32)
  ... 63 more
Caused by: java.lang.reflect.InvocationTargetException: java.lang.reflect.InvocationTargetException: java.lang.RuntimeException: java.io.FileNotFoundException: File /hive/tmp does not exist
  at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
  at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
  at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
  at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
  at org.apache.spark.sql.internal.SharedState$.org$apache$spark$sql$internal$SharedState$$reflect(SharedState.scala:166)
  ... 71 more
Caused by: java.lang.reflect.InvocationTargetException: java.lang.RuntimeException: java.io.FileNotFoundException: File /hive/tmp does not exist
  at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
  at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
  at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
  at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
  at org.apache.spark.sql.hive.client.IsolatedClientLoader.createClient(IsolatedClientLoader.scala:264)
  at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:366)
  at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:270)
  at org.apache.spark.sql.hive.HiveExternalCatalog.<init>(HiveExternalCatalog.scala:65)
  ... 76 more
Caused by: java.lang.RuntimeException: java.io.FileNotFoundException: File /hive/tmp does not exist
  at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:522)
  at org.apache.spark.sql.hive.client.HiveClientImpl.<init>(HiveClientImpl.scala:192)
  ... 84 more
Caused by: java.io.FileNotFoundException: File /hive/tmp does not exist
  at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:537)
  at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:750)
  at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:527)
  at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:409)
  at org.apache.hadoop.hive.ql.session.SessionState.createRootHDFSDir(SessionState.java:599)
  at org.apache.hadoop.hive.ql.session.SessionState.createSessionDirs(SessionState.java:554)
  at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:508)
  ... 85 more

Analyzing the error, the root cause is a FileNotFoundException saying /hive/tmp does not exist. Searching hive-site.xml, this path is the value of hive.exec.scratchdir; hive.exec.scratchdir is an HDFS path used to store the execution plans of the different map/reduce stages and their intermediate output (a quick check is sketched below).
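A minimal sketch for confirming the setting, assuming hive-site.xml is the copy placed under $SPARK_HOME/conf earlier:

# print the hive.exec.scratchdir property name and the <value> line that follows it
grep -A1 'hive.exec.scratchdir' $SPARK_HOME/conf/hive-site.xml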

Running hadoop fs -ls /hive in a terminal gives:

Found 2 items
drwxr-xr-x   - joy supergroup          0 2016-06-12 21:35 /hive/log
drwxr-xr-x   - joy supergroup          0 2017-01-16 14:17 /hive/tmp

The permissions look wrong; group write (g+w) should be added, so run hadoop fs -chmod g+w /hive/tmp and hadoop fs -chmod g+w /hive/log. The "does not exist" error still occurs, however.

Adding HADOOP_CONF_DIR to spark-env.sh under $SPARK_HOME/conf (the export line is sketched at the end of this section) changes the error to:

java.lang.IllegalArgumentException: Error while instantiating 'org.apache.spark.sql.hive.HiveSessionState':
  at org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$reflect(SparkSession.scala:981)
  at org.apache.spark.sql.SparkSession.sessionState$lzycompute(SparkSession.scala:110)
  at org.apache.spark.sql.SparkSession.sessionState(SparkSession.scala:109)
  at org.apache.spark.sql.SparkSession$Builder$$anonfun$getOrCreate$5.apply(SparkSession.scala:878)
  at org.apache.spark.sql.SparkSession$Builder$$anonfun$getOrCreate$5.apply(SparkSession.scala:878)
  at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:99)
  at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:99)
  at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:230)
  at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:40)
  at scala.collection.mutable.HashMap.foreach(HashMap.scala:99)
  at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:878)
  at org.apache.spark.repl.Main$.createSparkSession(Main.scala:95)
  ... 47 elided
Caused by: java.lang.reflect.InvocationTargetException: java.lang.IllegalArgumentException: Error while instantiating 'org.apache.spark.sql.hive.HiveExternalCatalog':
  at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
  at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
  at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
  at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
  at org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$reflect(SparkSession.scala:978)
  ... 58 more
Caused by: java.lang.IllegalArgumentException: Error while instantiating 'org.apache.spark.sql.hive.HiveExternalCatalog':
  at org.apache.spark.sql.internal.SharedState$.org$apache$spark$sql$internal$SharedState$$reflect(SharedState.scala:169)
  at org.apache.spark.sql.internal.SharedState.<init>(SharedState.scala:86)
  at org.apache.spark.sql.SparkSession$$anonfun$sharedState$1.apply(SparkSession.scala:101)
  at org.apache.spark.sql.SparkSession$$anonfun$sharedState$1.apply(SparkSession.scala:101)
  at scala.Option.getOrElse(Option.scala:121)
  at org.apache.spark.sql.SparkSession.sharedState$lzycompute(SparkSession.scala:101)
  at org.apache.spark.sql.SparkSession.sharedState(SparkSession.scala:100)
  at org.apache.spark.sql.internal.SessionState.<init>(SessionState.scala:157)
  at org.apache.spark.sql.hive.HiveSessionState.<init>(HiveSessionState.scala:32)
  ... 63 more
Caused by: java.lang.reflect.InvocationTargetException: java.lang.reflect.InvocationTargetException: java.lang.RuntimeException: java.lang.RuntimeException: The root scratch dir: /hive/tmp on HDFS should be writable. Current permissions are: rwxrwxr-x
  at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
  at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
  at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
  at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
  at org.apache.spark.sql.internal.SharedState$.org$apache$spark$sql$internal$SharedState$$reflect(SharedState.scala:166)
  ... 71 more
Caused by: java.lang.reflect.InvocationTargetException: java.lang.RuntimeException: java.lang.RuntimeException: The root scratch dir: /hive/tmp on HDFS should be writable. Current permissions are: rwxrwxr-x
  at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
  at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
  at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
  at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
  at org.apache.spark.sql.hive.client.IsolatedClientLoader.createClient(IsolatedClientLoader.scala:264)
  at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:366)
  at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:270)
  at org.apache.spark.sql.hive.HiveExternalCatalog.<init>(HiveExternalCatalog.scala:65)
  ... 76 more
Caused by: java.lang.RuntimeException: java.lang.RuntimeException: The root scratch dir: /hive/tmp on HDFS should be writable. Current permissions are: rwxrwxr-x
  at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:522)
  at org.apache.spark.sql.hive.client.HiveClientImpl.<init>(HiveClientImpl.scala:192)
  ... 84 more
Caused by: java.lang.RuntimeException: The root scratch dir: /hive/tmp on HDFS should be writable. Current permissions are: rwxrwxr-x
  at org.apache.hadoop.hive.ql.session.SessionState.createRootHDFSDir(SessionState.java:612)
  at org.apache.hadoop.hive.ql.session.SessionState.createSessionDirs(SessionState.java:554)
  at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:508)
  ... 85 more
<console>:14: error: not found: value spark
       import spark.implicits._
              ^
<console>:14: error: not found: value spark
       import spark.sql

The error now says the directory permissions are wrong. Running hadoop fs -ls /hive again:

drwxrwxr-x   - joy supergroup          0 2016-06-12 21:35 /hive/log
drwxrwxr-x   - joy supergroup          0 2017-01-16 14:17 /hive/tmp

After changing the directory permissions to 777, spark-shell finally starts successfully.
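A consolidated sketch of the two changes that resolved this, assuming Hadoop's configuration directory is $HADOOP_HOME/etc/hadoop:

# in $SPARK_HOME/conf/spark-env.sh: point Spark at the Hadoop configuration so /hive paths resolve on HDFS
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop

# open up the Hive scratch and log directories on HDFS (777, as described above)
hadoop fs -chmod 777 /hive/tmp
hadoop fs -chmod 777 /hive/log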

4. KeyError: u'y'

The error message looks like the following:

Traceback (most recent call last):
  File "/Users/lyj/Programs/kiseliugit/MyPysparkCodes/test/spark2.0.py", line 5, in <module>
    spark = SparkSession.builder.master("local").appName('test 2.0').config(conf=SparkConf()).getOrCreate()
  File "/Users/lyj/Programs/Apache/Spark2/python/pyspark/conf.py", line 104, in __init__
    SparkContext._ensure_initialized()
  File "/Users/lyj/Programs/Apache/Spark2/python/pyspark/context.py", line 243, in _ensure_initialized
    SparkContext._gateway = gateway or launch_gateway()
  File "/Users/lyj/Programs/Apache/Spark2/python/pyspark/java_gateway.py", line 116, in launch_gateway
    java_import(gateway.jvm, "org.apache.spark.SparkConf")
  File "/Library/Python/2.7/site-packages/py4j/java_gateway.py", line 90, in java_import
    return_value = get_return_value(answer, gateway_client, None, None)
  File "/Library/Python/2.7/site-packages/py4j/protocol.py", line 306, in get_return_value
    value = OUTPUT_CONVERTER[type](answer[2:], gateway_client)
KeyError: u'y'

The cause is an outdated py4j version; upgrading it with pip fixes the problem (sketched below).
Reference: http://stackoverflow.com/questions/38637988/how-could-i-write-the-right-entry-point-in-spark-2-0-program-actually-pyspark-2
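A minimal sketch of the upgrade, targeting the system-site py4j that the traceback above shows being imported:

# upgrade py4j in place (sudo may be needed depending on how pip was set up)
pip install --upgrade py4j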

