Debugging Hadoop on Windows

2017-11-07

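1. Local mode

To debug Hadoop in local mode, download winutils.exe, hadoop.dll, hadoop.lib and the other Windows-specific Hadoop support files, and put them in the D:\proc\hadoop\bin directory.

Then set the environment variable: HADOOP_HOME=D:\proc\hadoop

Add to PATH: %HADOOP_HOME%\bin

Apart from the bin subdirectory, D:\proc\hadoop can be an empty directory.

On a 32-bit machine download the 32-bit build of these files; on a 64-bit machine download the 64-bit build.

Close Eclipse and restart it so that it picks up the new environment variables.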
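Before starting Eclipse, it can help to confirm that the variable and the files are actually in place. The following is a minimal sketch using only the Java standard library; the class name HadoopEnvCheck and its method are hypothetical, and the list of required files is an assumption based on the files named above.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

final class HadoopEnvCheck {

  // Assumption: these are the files that must exist under %HADOOP_HOME%\bin.
  static final String[] REQUIRED = {"winutils.exe", "hadoop.dll"};

  // Returns the names of required files that are missing from <hadoopHome>/bin.
  static List<String> missingFiles(Path hadoopHome) {
    List<String> missing = new ArrayList<>();
    for (String name : REQUIRED) {
      if (!Files.isRegularFile(hadoopHome.resolve("bin").resolve(name))) {
        missing.add(name);
      }
    }
    return missing;
  }

  public static void main(String[] args) {
    String home = System.getenv("HADOOP_HOME");
    if (home == null || home.isEmpty()) {
      System.err.println("HADOOP_HOME is not set");
      return;
    }
    List<String> missing = missingFiles(Paths.get(home));
    System.out.println(missing.isEmpty()
        ? "HADOOP_HOME looks usable: " + home
        : "Missing under bin in " + home + ": " + missing);
  }
}
```

If the check is run from inside Eclipse and still reports that HADOOP_HOME is not set, Eclipse was probably started before the variable was defined.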

Then create WordCount.java:

package cn.zenith.mr;

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;

public class WordCount {

  // Mapper: splits each input line into tokens and emits (token, 1).
  public static class TokenizerMapper
      extends Mapper<Object, Text, Text, IntWritable> {

    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    @Override
    public void map(Object key, Text value, Context context
                    ) throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, one);
      }
    }
  }

  // Reducer (also used as combiner): sums the counts for each word.
  public static class IntSumReducer
      extends Reducer<Text, IntWritable, Text, IntWritable> {
    private IntWritable result = new IntWritable();

    @Override
    public void reduce(Text key, Iterable<IntWritable> values,
                       Context context
                       ) throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
    if (otherArgs.length < 2) {
      System.err.println("Usage: wordcount <in> [<in>...] <out>");
      System.exit(2);
    }
    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    // Every argument except the last is an input path; the last is the output path.
    for (int i = 0; i < otherArgs.length - 1; ++i) {
      FileInputFormat.addInputPath(job, new Path(otherArgs[i]));
    }
    FileOutputFormat.setOutputPath(job,
        new Path(otherArgs[otherArgs.length - 1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

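When you run the job, pass the input path(s) and the output path as program arguments, for example in the Eclipse run configuration.

The arguments can be local paths,

or remote directories on the cluster.

Then use Debug or Run to execute the class and check the result.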
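To cross-check the job's output on a small input without starting Hadoop, the mapper and reducer logic can be mirrored in plain Java. The sketch below is only an illustration and is not part of the original post; the class name LocalWordCount and the sample lines are hypothetical.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

final class LocalWordCount {

  // Mirrors TokenizerMapper + IntSumReducer: whitespace-separated tokens,
  // with the occurrences of each token summed.
  static Map<String, Integer> count(Iterable<String> lines) {
    Map<String, Integer> counts = new TreeMap<>();  // keeps words in sorted order
    for (String line : lines) {
      StringTokenizer itr = new StringTokenizer(line);
      while (itr.hasMoreTokens()) {
        counts.merge(itr.nextToken(), 1, Integer::sum);
      }
    }
    return counts;
  }

  public static void main(String[] args) {
    // Hypothetical sample input, two lines.
    Map<String, Integer> counts = count(Arrays.asList("hello hadoop", "hello windows"));
    // One "word<TAB>count" pair per line.
    counts.forEach((word, n) -> System.out.println(word + "\t" + n));
  }
}
```

For the two sample lines this prints hadoop 1, hello 2 and windows 1, each as a word, a tab and a count, which is the same layout the job writes to its text output by default.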

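2. Cluster mode

In cluster mode, the job is submitted from the local machine to the cluster.

1) Copy the cluster's configuration files core-site.xml, hdfs-site.xml, mapred-site.xml and yarn-site.xml into the project's resources directory.

2) Add the following to mapred-site.xml: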

<property>
  <name>mapreduce.app-submission.cross-platform</name>
  <value>true</value>
</property>
<property>
  <name>mapred.jar</name>
  <value>D:\\works\\cr_teach\\target\\teach-1.0-SNAPSHOT-jar-with-dependencies.jar</value>
</property>

Set the mapred.jar path according to the name and location of your own jar.

3) Build the package with Maven: mvn clean install

4) Run the job.
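Before running, it may be worth confirming that the mapred-site.xml in the resources directory really contains the two properties from step 2, with mapreduce.app-submission.cross-platform set to true. The following sketch reads Hadoop-style property pairs with the JDK's DOM parser only; the class name MapredSiteCheck, the inline sample document and its placeholder jar value are hypothetical, and a real check would open the project's own file instead.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

final class MapredSiteCheck {

  // Reads every <property><name>..</name><value>..</value></property> pair.
  static Map<String, String> readProperties(InputStream xml) throws Exception {
    Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(xml);
    NodeList props = doc.getElementsByTagName("property");
    Map<String, String> result = new LinkedHashMap<>();
    for (int i = 0; i < props.getLength(); i++) {
      Element p = (Element) props.item(i);
      NodeList names = p.getElementsByTagName("name");
      NodeList values = p.getElementsByTagName("value");
      // Skip incomplete entries that lack either a name or a value.
      if (names.getLength() > 0 && values.getLength() > 0) {
        result.put(names.item(0).getTextContent().trim(),
                   values.item(0).getTextContent().trim());
      }
    }
    return result;
  }

  public static void main(String[] args) throws Exception {
    // Hypothetical sample; replace with a stream opened on your mapred-site.xml.
    String sample = "<configuration>"
        + "<property><name>mapreduce.app-submission.cross-platform</name>"
        + "<value>true</value></property>"
        + "<property><name>mapred.jar</name>"
        + "<value>path-to-your-jar</value></property>"
        + "</configuration>";
    Map<String, String> props = readProperties(
        new ByteArrayInputStream(sample.getBytes(StandardCharsets.UTF_8)));
    System.out.println("cross-platform = "
        + props.get("mapreduce.app-submission.cross-platform"));
    System.out.println("mapred.jar set = " + props.containsKey("mapred.jar"));
  }
}
```

A property whose name or value element is missing is silently skipped, so an absent key in the returned map is the signal that the entry needs fixing.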

If you see this error:

Permission denied: user=zenith, access=EXECUTE, inode="/tmp/hadoop-yarn":root:supergroup:drwx------

grant execute permission on the directory: hdfs dfs -chmod -R a+x /tmp
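The error message itself explains the denial: the directory is owned by root with mode drwx------, so only the owner has the execute bit needed to traverse it. The sketch below models that check in plain Java as a simplified illustration; it ignores the superuser, ACLs and the sticky bit, the class HdfsPermissionSketch and its methods are hypothetical, and it assumes user zenith is neither the owner nor a member of supergroup.

```java
final class HdfsPermissionSketch {

  enum Who { OWNER, GROUP, OTHER }

  // In a 10-character mode string such as "drwx------", the execute bits
  // sit at index 3 (owner), 6 (group) and 9 (other).
  static boolean canExecute(String mode, Who who) {
    return mode.charAt(3 + 3 * who.ordinal()) == 'x';
  }

  // Models the effect of "chmod a+x" on a single mode string.
  static String addExecuteForAll(String mode) {
    char[] bits = mode.toCharArray();
    bits[3] = 'x';
    bits[6] = 'x';
    bits[9] = 'x';
    return new String(bits);
  }

  public static void main(String[] args) {
    String before = "drwx------";  // mode reported in the error message
    String after = addExecuteForAll(before);
    System.out.println(before + " other can execute: " + canExecute(before, Who.OTHER));
    System.out.println(after + " other can execute: " + canExecute(after, Who.OTHER));
  }
}
```

With the mode from the error message the first check is false and, after adding execute for all, the mode becomes drwx--x--x and the check is true, which is why the chmod command resolves the error.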

This article is reposted from SummerChill's blog on cnblogs; original link: http://www.cnblogs.com/DreamDrive/p/6885585.html. To repost it, please contact the original author.

Source link: https://yq.aliyun.com/articles/376393
