Monitoring Spark Programs with the Yarn API
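I. Overview
This approach monitors two things: the state of programs running under the Yarn ResourceManager (RUNNING, KILLED, FAILED, FINISHED), and whether a Job of the Application in the ApplicationMaster has been executing for longer than the batch interval. Together these make it possible to restart a Spark on Yarn program when it fails or times out.

II. Main categories of Yarn APIs
1. Query cluster-wide metrics
GET http://<rm http address:port>/ws/v1/cluster/metrics
2. Query cluster scheduler details
GET http://<rm http address:port>/ws/v1/cluster/scheduler
3. Monitor an application
curl 'http://<rm http address:port>/ws/v1/cluster/apps/{appid}/state'
GET http://<rm http address:port>/ws/v1/cluster/apps/{appid}/state
4. View a specific application
GET http://<rm http address:port>/ws/v1/cluster/apps/{appid}
5. View ...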
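As a rough illustration of how the application state endpoint could drive the restart decision, here is a minimal Python sketch. The function names, the `RESTART_STATES` set, and the rule that only FAILED and KILLED count as failures are assumptions made for this example; the text only lists the states being watched. The sketch also assumes the elapsed time of the current Job and the batch interval are obtained elsewhere and passed in as seconds.

```python
import json
import urllib.request

# Assumption: these states are treated as "failed and needs a restart".
# The source only lists the states it watches, not which ones trigger a restart.
RESTART_STATES = {"FAILED", "KILLED"}


def state_url(rm_address: str, app_id: str) -> str:
    """Build the ResourceManager URL for one application's state."""
    return f"http://{rm_address}/ws/v1/cluster/apps/{app_id}/state"


def fetch_state(rm_address: str, app_id: str, timeout_s: float = 5.0) -> str:
    """GET the state endpoint and return the state string, e.g. "RUNNING"."""
    req = urllib.request.Request(
        state_url(rm_address, app_id),
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=timeout_s) as resp:
        # The endpoint answers with a JSON object of the form {"state": "..."}.
        return json.load(resp)["state"]


def needs_restart(state: str, job_elapsed_s: float, batch_interval_s: float) -> bool:
    """Decide whether to restart: on a failure state, or on a running job
    whose current Job has exceeded the batch interval (timeout)."""
    if state in RESTART_STATES:
        return True
    return state == "RUNNING" and job_elapsed_s > batch_interval_s
```

A monitoring loop would call `fetch_state` periodically, for example `fetch_state("rm-host:8088", "application_1_0001")` with a placeholder host and application id, and pass the result to `needs_restart` together with the measured Job duration. Keeping the decision in a pure function makes it easy to test without a live cluster.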