Spark 0.8 Cluster (CentOS 6.4) - Simple Statistics Test
Environment: CentOS 6.4, hadoop-2.0.0-cdh4.2.0, JDK 1.6, spark-0.8.0-incubating-bin-cdh4.tar.gz, Scala 2.9.3

1. Install and deploy the cluster environment

See the previous chapter, "Installing a Spark 0.8 Cluster (CentOS 6.4) - In-Memory Computing for Big Data".

2. Test description

Use an online test-data generator to dynamically generate JSON data like the following (files named DATA[1-9].json):

{"id":10,"first_name":"Ralph","last_name":"Kennedy","country":"Colombia","ip_address":"12.211.41.162","email":"rkennedy@oyonder.net"},
{"id":11,"first_name":"Gary","last_name":"Cole","country":"Nepal","ip_address":"242.67.150.18","email":"gcole@browsebug.info"},
…

You can first generate about 100 MB of data, then use Linux ...
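
To make the record format above concrete, here is a minimal sketch of a statistic that could be computed over it. It is plain Python rather than Spark, and the choice of statistic (records per country) is an assumption made only for illustration, since the text shown here does not say which statistic the test computes. The helper names `parse_record` and `count_by_country` are hypothetical, and the sketch assumes one JSON object per line, each followed by a trailing comma as in the sample.

```python
import json
from collections import Counter

# Two records copied from the sample data; each line ends with a trailing comma.
SAMPLE = '''{"id":10,"first_name":"Ralph","last_name":"Kennedy","country":"Colombia","ip_address":"12.211.41.162","email":"rkennedy@oyonder.net"},
{"id":11,"first_name":"Gary","last_name":"Cole","country":"Nepal","ip_address":"242.67.150.18","email":"gcole@browsebug.info"},
'''


def parse_record(line):
    """Parse one line into a dict; return None for blank or unparseable lines."""
    text = line.strip().rstrip(",")  # drop the trailing comma after each object
    if not text:
        return None
    try:
        record = json.loads(text)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return None
    return record if isinstance(record, dict) else None


def count_by_country(lines):
    """Count how many parseable records fall under each 'country' value."""
    counts = Counter()
    for line in lines:
        record = parse_record(line)
        if record is not None and "country" in record:
            counts[record["country"]] += 1
    return counts


if __name__ == "__main__":
    for country, n in sorted(count_by_country(SAMPLE.splitlines()).items()):
        print(country, n)
```

Stripping the trailing comma per line, instead of parsing the whole file as one JSON array, keeps each record independent. That is the shape a line-oriented job over a large file would need, though whether the original test works this way is not stated.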