
Monitoring Spark Applications with the YARN API

Date: 2019-07-25

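Part 1. Overview

The monitor watches two things: the state of each application as reported by the YARN ResourceManager (RUNNING, KILLED, FAILED, FINISHED), and, through the ApplicationMaster, whether an application's job has been executing for longer than the batch interval. With these two checks it can restart a Spark on YARN program when it fails and when it times out.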
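The restart decision described above can be sketched in Python 3 as follows. This is only an illustration, not the monitor's actual code: the function name `restart_reason`, its parameters, and the choice to treat FINISHED the same way as KILLED and FAILED (reasonable for a long-running streaming job) are assumptions.

```python
from typing import Optional

# States that this sketch treats as "the application is gone and should be
# resubmitted". Treating FINISHED this way is an assumption.
RESTART_STATES = {"KILLED", "FAILED", "FINISHED"}


def restart_reason(state: str, job_duration_s: float,
                   batch_interval_s: float) -> Optional[str]:
    """Return why the application should be restarted, or None if it looks healthy."""
    if state in RESTART_STATES:
        return "failed"
    if state == "RUNNING" and job_duration_s > batch_interval_s:
        return "timeout"
    return None
```

A caller would feed it the state from the ResourceManager and the job duration from the ApplicationMaster, then kill and resubmit the application whenever the result is not `None`.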

Part 2. Main YARN REST APIs

1. Query cluster-wide metrics

GET http://<rm http address:port>/ws/v1/cluster/metrics
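As a rough illustration of how a monitor might use the metrics response, the Python 3 sketch below pulls a few health indicators out of the JSON body. The function name `summarize_metrics` and the sample payload are hypothetical; only the field names (`clusterMetrics`, `appsRunning`, `unhealthyNodes`, `allocatedMB`, `totalMB`) come from the ResourceManager API.

```python
import json


def summarize_metrics(body):
    """Extract a few health indicators from a /ws/v1/cluster/metrics response."""
    metrics = json.loads(body)["clusterMetrics"]
    total_mb = metrics["totalMB"]
    return {
        "apps_running": metrics["appsRunning"],
        "unhealthy_nodes": metrics["unhealthyNodes"],
        # Guard against a cluster that reports no memory at all.
        "memory_used_ratio": metrics["allocatedMB"] / total_mb if total_mb else 0.0,
    }


# Illustrative payload: the values are made up, only the field names follow the API.
sample = ('{"clusterMetrics": {"appsRunning": 3, "unhealthyNodes": 0, '
          '"allocatedMB": 2048, "totalMB": 8192}}')
print(summarize_metrics(sample))
```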

2. Query scheduler details

GET http://<rm http address:port>/ws/v1/cluster/scheduler

3. Monitor an application's state

curl 'http://<rm http address:port>/ws/v1/cluster/apps/<application_id>/state'
GET http://<rm http address:port>/ws/v1/cluster/apps/<application_id>/state
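The state query can be issued from Python 3 with the standard library alone. The sketch below is a minimal example under stated assumptions: the helper names (`state_url`, `parse_state`, `fetch_state`) and the host `rm-host:8088` are hypothetical, and the cluster is assumed to allow unauthenticated HTTP access. The endpoint returns a small JSON object of the form `{"state": "RUNNING"}`.

```python
import json
from urllib.request import urlopen


def state_url(rm_address, app_id):
    """Build the application state URL for a ResourceManager at host:port."""
    return "http://%s/ws/v1/cluster/apps/%s/state" % (rm_address, app_id)


def parse_state(body):
    """Extract the state string from the JSON body returned by the endpoint."""
    return json.loads(body)["state"]


def fetch_state(rm_address, app_id, timeout=10):
    """Query the ResourceManager and return the state, e.g. 'RUNNING'."""
    with urlopen(state_url(rm_address, app_id), timeout=timeout) as resp:
        return parse_state(resp.read().decode("utf-8"))
```

For example, `fetch_state("rm-host:8088", "application_1549963435527_0001")` would return the current state string if that ResourceManager were reachable.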

4. View a specific application

GET http://<rm http address:port>/ws/v1/cluster/apps/<application_id>

5. View detailed information about a specific application

curl "http://<rm http address:port>/proxy/<application_id>/ws/v2/mapreduce/info"

6. Kill an application

yarn application -kill <application_id>
curl -v -X PUT -H "Content-Type: application/json" -d '{"state": "KILLED"}' 'http://<rm http address:port>/ws/v1/cluster/apps/<application_id>/state'
PUT http://<rm http address:port>/ws/v1/cluster/apps/<application_id>/state
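The same kill request can be built in Python 3 with `urllib`. This is a sketch, not the monitor's code: `build_kill_request` and the host name are hypothetical, and a secured cluster would need authentication that is not shown here. The request is only constructed, so nothing is sent until it is passed to `urlopen`.

```python
import json
from urllib.request import Request, urlopen


def build_kill_request(rm_address, app_id):
    """Build the PUT request that asks the ResourceManager to kill app_id."""
    url = "http://%s/ws/v1/cluster/apps/%s/state" % (rm_address, app_id)
    body = json.dumps({"state": "KILLED"}).encode("utf-8")
    return Request(url, data=body, method="PUT",
                   headers={"Content-Type": "application/json"})


def kill_application(rm_address, app_id, timeout=10):
    """Send the kill request and return the HTTP status code."""
    with urlopen(build_kill_request(rm_address, app_id), timeout=timeout) as resp:
        return resp.status
```

Keeping the request construction separate from the network call makes it possible to inspect or log the request before anything is killed.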

Part 3. YarnMonitor

Ⅰ. Setup

1. Install yarn-api-client

You must install yarn-api-client before you can use this YARN monitor:

  • python setup.py build
  • python setup.py install

2. Uninstall yarn-api-client

To uninstall the yarn-api-client module, run:

  • pip list
  • pip uninstall yarn-api-client

3. Update yarn-api-client

To update the Python module, uninstall the old version and then install the updated one:

  • update yarn-api-client
  • cp yarn-api-client/base.py base.py.bak

4. Install Python modules offline

To install other Python modules without network access, do the following:

  • pip freeze > yarn.txt
  • mkdir yarnpackage
  • pip install --no-index --find-links=yarnpackage/ -r yarn.txt

Ⅱ. Command

Set the permissions on the start script so that it can be executed:

  • chmod 774 start_prmsbd.sh

Ⅲ. Crontab

Configure the crontab tasks:

  • crontab -l
  • crontab -e

1. Start the YARN monitor

*/1 * * * * ../yarnmonitor/yarn-monitor/command/start_yarn_monitor.sh >> ../yarnmonitor/yarn-monitor/logs/yarn-corntab.log 2>&1
* * * * * sleep 60; ../yarnmonitor/yarn-monitor/command/start_yarn_monitor.sh

2. Clear the YARN monitor log

0 4 * * * ../yarn-monitor/command/clear_log_opm.sh >> ../yarn-monitor/logs/yarn-corntab.log 2>&1
0 4 * * * ../yarnmonitor/yarn-monitor/command/clear_log_opm.sh >> ../yarnmonitor/yarn-monitor/logs/yarn-corntab.log 2>&1

Ⅳ. For testing only

1. Start the YARN monitor manually

../python ./yarn-monitor/YarnMonitor.py

2. ApplicationMaster API

curl --compressed -H "Accept: application/json" -X GET "http://***:8088/ws/v1/cluster/apps"
curl --compressed -H "Accept: application/json" -X GET "http://***:8088/proxy/application_1549963435527_0001/ws/v1/mapreduce/info"
curl --compressed -H "Accept: application/json" -X GET "http://***:8088/proxy/application_1535085750394_0017/ws/v1/mapreduce/info"
curl --compressed -H "Accept: application/json" -X GET "http://***:8088/proxy/application_1535085750394_0017/ws/v2/mapreduce/info"
curl --compressed -H "Accept: application/json" -X GET "http://***:8088/proxy/application_1535085750394_0017/ws/v1/mapreduce/jobs/4536"
curl --compressed -H "Accept: application/json" -X GET "http://***:8088/proxy/application_1548125170651_0090/api/v1/applications"
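The last request above reaches the Spark REST API through the YARN proxy. As a hedged sketch of the timeout check, the Python 3 code below computes how long the longest running job has been executing. It assumes the job list comes from the Spark endpoint `/api/v1/applications/<app-id>/jobs` (not shown in the commands above), that each job carries `status` and `submissionTime` fields, and that timestamps look like `2019-07-25T03:12:45.123GMT`. The function name is hypothetical.

```python
from datetime import datetime

# Timestamp layout assumed for the Spark REST API, e.g. 2019-07-25T03:12:45.123GMT
SPARK_TIME_FORMAT = "%Y-%m-%dT%H:%M:%S.%fGMT"


def longest_running_job_seconds(jobs, now):
    """Return the elapsed seconds of the longest job still RUNNING (0.0 if none).

    `jobs` is the decoded JSON list; `now` is a naive datetime in UTC.
    """
    longest = 0.0
    for job in jobs:
        if job.get("status") != "RUNNING":
            continue
        started = datetime.strptime(job["submissionTime"], SPARK_TIME_FORMAT)
        longest = max(longest, (now - started).total_seconds())
    return longest
```

Comparing the returned value with the batch interval gives the timeout condition; passing `now` in explicitly keeps the function easy to test.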

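Part 4. Known issue

When jobs are queried through the ApplicationMaster, the returned data may not be convertible to JSON, which raises an exception. In that case, change how yarn-api-client handles the response of the affected API. The following can serve as a reference: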

# Fragment for the request handling in yarn-api-client; needs: from lxml import etree
if 'ws/v1/mapreduce/info' in path:
    if response.status == OK:
        # The body is HTML rather than JSON, so read the table rows instead.
        html_content = response.read()
        element_html = etree.HTML(html_content)
        tr_list = element_html.xpath('//tbody/tr')
        content_list = []
        for tr in tr_list:
            item = {}
            # Column 1 holds the job id, column 4 the job duration.
            item['id'] = tr.xpath('./td[1]/text()')[0].replace('\n', '').strip()
            item['duration'] = tr.xpath('./td[4]/text()')[0]
            # Log each item
            # logging.info(item)
            content_list.append(item)
        response.close()
        return self.response_class(content_list)
    else:
        msg = 'Response finished with status: %s' % response.status
        raise APIError(msg)
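The `duration` values collected this way are display strings, so they have to be converted before they can be compared with the batch interval. The Python 3 sketch below assumes the strings look like `25 ms`, `3 s`, `1.5 min`, or `2.0 h` (a number, a space, and a unit); both helper names are hypothetical, and any other format raises an error instead of being guessed at.

```python
# Seconds per unit for the duration strings assumed above.
_UNIT_SECONDS = {"ms": 0.001, "s": 1.0, "min": 60.0, "h": 3600.0}


def duration_to_seconds(text):
    """Convert a string such as '1.5 min' to seconds.

    Raises ValueError or KeyError if the text is not '<number> <unit>'.
    """
    value, unit = text.strip().split()
    return float(value) * _UNIT_SECONDS[unit]


def overdue_job_ids(content_list, batch_interval_s):
    """Return the ids of jobs whose duration exceeds the batch interval."""
    return [item["id"] for item in content_list
            if duration_to_seconds(item["duration"]) > batch_interval_s]
```

With a 10 second batch interval, a row with duration `12 s` would be reported and a row with `0.8 s` would not.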
Original article: https://yq.aliyun.com/articles/710902