Hadoop2.7实战v1.0之YARN HA
YARN HA实战v1.0
当前环境:hadoop+zookeeper(namenode,resourcemanager HA)
resourcemanager | serviceId | init status |
sht-sgmhadoopnn-01 | rm1 | active |
sht-sgmhadoopnn-02 | rm2 | standby |
参考:
http://blog.csdn.net/u011414200/article/details/50336735
http://blog.csdn.net/u011414200/article/details/50276257
一.查看resourcemanager是active还是standby
1.打开网页
http://172.16.101.55:8088/cluster/cluster
http://172.16.101.56:8088/cluster/cluster
2.查看resourcemanager日志
点击(此处)折叠或打开
- [root@sht-sgmhadoopnn-01 logs]# more yarn-root-resourcemanager-sht-sgmhadoopnn-01.telenav.cn.log
- …………………..
- 2016-03-03 18:10:01,289 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Transitioned to active state
点击(此处)折叠或打开
- [root@sht-sgmhadoopnn-02 logs]# more yarn-root-resourcemanager-sht-sgmhadoopnn-02.telenav.cn.log
- …………………..
- 2016-03-03 18:10:34,250 INFO org.apache.hadoop.yarn.server.resourcemanager.metrics.SystemMetricsPublisher: YARN system metrics publishing service is not enabled
- 2016-03-03 18:10:34,250 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Transitioning to standby state
3. yarn rmadmin –getServiceState
点击(此处)折叠或打开
- ###$HADOOP_HOME/etc/hadoop/yarn-site.xml
- <property>
- <name>yarn.resourcemanager.ha.rm-ids</name>
- <value>rm1,rm2</value>
- </property>
-
-
- [root@sht-sgmhadoopnn-02 logs]# yarn rmadmin -getServiceState rm1
- active
- [root@sht-sgmhadoopnn-02 logs]# yarn rmadmin -getServiceState rm2
- standby
点击(此处)折叠或打开
- [root@sht-sgmhadoopnn-01 bin]# yarn --help
- Usage: yarn [--config confdir] [COMMAND | CLASSNAME]
- CLASSNAME run the class named CLASSNAME
- or
- where COMMAND is one of:
- resourcemanager -format-state-store deletes the RMStateStore
- resourcemanager run the ResourceManager
- nodemanager run a nodemanager on each slave
- timelineserver run the timeline server
- rmadmin admin tools
- sharedcachemanager run the SharedCacheManager daemon
- scmadmin SharedCacheManager admin tools
- version print the version
- jar <jar> run a jar file
- application prints application(s)
- report/kill application
- applicationattempt prints applicationattempt(s)
- report
- container prints container(s) report
- node prints node report(s)
- queue prints queue information
- logs dump container logs
- classpath prints the class path needed to
- get the Hadoop jar and the
- required libraries
- cluster prints cluster information
- daemonlog get/set the log level for each
- daemon
-
- ###########################################################################
- [root@sht-sgmhadoopnn-01 bin]# yarn rmadmin --help
- -help: Unknown command
- Usage: yarn rmadmin
- -refreshQueues
- -refreshNodes
- -refreshSuperUserGroupsConfiguration
- -refreshUserToGroupsMappings
- -refreshAdminAcls
- -refreshServiceAcl
- -getGroups [username]
- -addToClusterNodeLabels [label1,label2,label3] (label splitted by ",")
- -removeFromClusterNodeLabels [label1,label2,label3] (label splitted by ",")
- -replaceLabelsOnNode [node1[:port]=label1,label2 node2[:port]=label1,label2]
- -directlyAccessNodeLabelStore
- -transitionToActive [--forceactive] <serviceId>
- -transitionToStandby <serviceId>
- -failover [--forcefence] [--forceactive] <serviceId> <serviceId>
- -getServiceState <serviceId>
- -checkHealth <serviceId>
- -help [cmd]
failover 初始化一个故障恢复。该命令会从一个失效的resourcemanager切换到另一个上面(不支持在自动切换的环境操作)。
getServiceState 获取当前resourcemanager的状态。
checkHealth 检查resourcemanager的状态。正常就返回0,否则返回非0值。
三.实验
1.测试YARN的手工切换功能(失败)
点击(此处)折叠或打开
- [root@sht-sgmhadoopnn-01 ~]# yarn rmadmin -failover --forceactive rm1 rm2
- forcefence and forceactive flags not supported with auto-failover enabled.
#yarn-site.xml 中设置yarn.resourcemanager.ha.automatic-failover.enabled为 true,故提示不能手动切换
2.测试YARN的自动切换功能(成功)
a.在active resoucemanager机器上通过jps命令查找到resoucemanager的进程号,然后通过kill -9的方式杀掉进程,观察另一个resoucemanager节点是否会从状态standby变成active状态
点击(此处)折叠或打开
- [root@sht-sgmhadoopnn-01 ~]# yarn rmadmin -getServiceState rm1
- active
- [root@sht-sgmhadoopnn-01 ~]# yarn rmadmin -getServiceState rm2
- standby
-
- [root@sht-sgmhadoopnn-01 bin]# jps
- 2583 Jps
- 10162 DFSZKFailoverController
- 28432 ResourceManager
- 21679 NameNode
-
- [root@sht-sgmhadoopnn-01 ~]# kill -9 28432
-
- [root@sht-sgmhadoopnn-02 bin]# jps
- 19147 ResourceManager
- 17837 NameNode
- 17970 DFSZKFailoverController
- 27330 Jps
-
-
- [root@sht-sgmhadoopnn-01 bin]# yarn rmadmin -getServiceState rm1
- 16/03/03 19:23:39 INFO ipc.Client: Retrying connect to server: sht-sgmhadoopnn-01/172.16.101.55:8033. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=1, sleepTime=1000 MILLISECONDS)
- Operation failed: Call From sht-sgmhadoopnn-01.telenav.cn/172.16.101.55 to sht-sgmhadoopnn-01:8033 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
-
-
-
- [root@sht-sgmhadoopnn-01 bin]# yarn rmadmin -getServiceState rm2
- active
- [root@sht-sgmhadoopnn-01 bin]#
#### sht-sgmhadoopnn-01 机器上resourcemanager进程已经起来,且状态为standby
c. 再次切换
点击(此处)折叠或打开
- [root@sht-sgmhadoopnn-01 sbin]# yarn rmadmin -transitionToStandby rm2
- Automatic failover is enabled for org.apache.hadoop.yarn.client.RMHAServiceTarget@11f69937
- Refusing to manually manage HA state, since it may cause
- a split-brain scenario or other incorrect state.
- If you are very sure you know what you are doing, please
- specify the --forcemanual flag.
-
- [root@sht-sgmhadoopnn-01 sbin]# yarn rmadmin -transitionToStandby --forcemanual rm2
- You have specified the --forcemanual flag. This flag is dangerous, as it can induce a split-brain scenario that WILL CORRUPT your HDFS namespace, possibly irrecoverably.
-
- It is recommended not to use this flag, but instead to shut down the cluster and disable automatic failover if you prefer to manually manage your HA state.
-
- You may abort safely by answering 'n' or hitting ^C now.
-
- Are you sure you want to continue? (Y or N) Y
- 16/03/03 19:29:36 WARN ha.HAAdmin: Proceeding with manual HA state management even though
- automatic failover is enabled for org.apache.hadoop.yarn.client.RMHAServiceTarget@4e33967b
-
-
- [root@sht-sgmhadoopnn-01 sbin]# yarn rmadmin -getServiceState rm1
- standby
- [root@sht-sgmhadoopnn-01 sbin]# yarn rmadmin -getServiceState rm2
- standby
-
- [root@sht-sgmhadoopnn-01 sbin]# yarn rmadmin -transitionToActive rm1
- Automatic failover is enabled for org.apache.hadoop.yarn.client.RMHAServiceTarget@54c4f317
- Refusing to manually manage HA state, since it may cause
- a split-brain scenario or other incorrect state.
- If you are very sure you know what you are doing, please
- specify the --forcemanual flag.
-
- [root@sht-sgmhadoopnn-01 sbin]# yarn rmadmin -transitionToActive --forcemanual rm1
- You have specified the --forcemanual flag. This flag is dangerous, as it can induce a split-brain scenario that WILL CORRUPT your HDFS namespace, possibly irrecoverably.
-
- It is recommended not to use this flag, but instead to shut down the cluster and disable automatic failover if you prefer to manually manage your HA state.
-
- You may abort safely by answering 'n' or hitting ^C now.
-
- Are you sure you want to continue? (Y or N) Y
- 16/03/03 19:32:46 WARN ha.HAAdmin: Proceeding with manual HA state management even though
- automatic failover is enabled for org.apache.hadoop.yarn.client.RMHAServiceTarget@54c4f317
-
- [root@sht-sgmhadoopnn-01 sbin]# yarn rmadmin -getServiceState rm1
- 16/03/03 19:32:55 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
- active
- [root@sht-sgmhadoopnn-01 sbin]# yarn rmadmin -getServiceState rm2
- 16/03/03 19:33:02 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
- standby
- [root@sht-sgmhadoopnn-01 sbin]#
#和HDFS HA切换实验不一样, -transitionToStandby,会自动将standby—>active,active-->standby;
而YARN HA就不一样,需要还有手动再执行一下-transitionToActive.

低调大师中文资讯倾力打造互联网数据资讯、行业资源、电子商务、移动互联网、网络营销平台。
持续更新报道IT业界、互联网、市场资讯、驱动更新,是最及时权威的产业资讯及硬件资讯报道平台。
转载内容版权归作者及来源网站所有,本站原创内容转载请注明来源。
- 上一篇
Hadoop2.7实战v1.0之HDFS HA
HDFS HA实战v1.0 当前环境: hadoop+zookeeper(namenode,resourcemanager HA) namenode serviceId Init status sht-sgmhadoopnn-01 nn1 active sht-sgmhadoopnn-02 nn2 standby 参考: http://blog.csdn.net/u011414200/article/details/50336735 一.查看namenode是active还是standby 1.打开网页 2.查看zkfc日志 点击(此处)折叠或打开 [root@sht-sgmhadoopnn-01 logs]# more hadoop-root-zkfc-sht-sgmhadoopnn-01.telenav.cn.log ………………….. 2016-02-28 00:24:00,692 INFO org.apache.hadoop.ha.ZKFailoverController: Trying to make NameNode at sht-sgmhadoopnn-01/172.16...
- 下一篇
Hadoop2.x运维实战之入门手册v1.0
Hadoop2.x运维实战之入门手册V1.0 0.Hadoop2.x生态圈介绍 1.常用组件介绍(体系结构+进程) 1.1HDFS 1.2MapReduce 1.3Yarn 1.4Hive 1.5Hbase 1.6Zookeeper 1.7Flume 1.8Kafka 1.9Sqoop 1.Hadoop2.6.0的伪分布环境搭建 2.Hadoop-2.7.2+Zookeeper-3.4.6完全分布式环境搭建(HDFS,YARN HA) 3.Hadoop 2.x HDFS和YARN的启动方式 4.Hadoop2.x常用端口及定义方法 5.Hadoop2.x常用命令 5.1学会怎样查看命令帮助 5.2hadoop fs 5.3hdfs dfs 5.4hdfs dfsadmin 5.5hdfs haadmin 5.6hdfs fsck 5.7yarn rmadmin 5.8其他命令 6.HDFS HA实战 7.YARN HA实战 8.动态添加DataNode(含NodeManager)节点(不修改dfs.replication) 9.添加Da...
相关文章
文章评论
共有0条评论来说两句吧...
文章二维码
点击排行
推荐阅读
最新文章
- Springboot2将连接池hikari替换为druid,体验最强大的数据库连接池
- 设置Eclipse缩进为4个空格,增强代码规范
- CentOS8编译安装MySQL8.0.19
- CentOS8,CentOS7,CentOS6编译安装Redis5.0.7
- Eclipse初始化配置,告别卡顿、闪退、编译时间过长
- Docker安装Oracle12C,快速搭建Oracle学习环境
- SpringBoot2编写第一个Controller,响应你的http请求并返回结果
- SpringBoot2整合MyBatis,连接MySql数据库做增删改查操作
- Mario游戏-低调大师作品
- CentOS7编译安装Gcc9.2.0,解决mysql等软件编译问题