Three reasons why large numbers of MapReduce tasks get KILLED_UNCLEAN
Request received to kill task 'attempt_201411191723_2827635_r_000009_0' by user ------- Task has been KILLED_UNCLEAN by the user

The reasons are as follows:
1. An impatient user (armed with the "mapred job -kill-task" command)
2. JobTracker (to kill a speculative duplicate, or when a whole job fails)
3. Fair Scheduler (but diplomatically, it calls it “preemption”)
An article by a foreign author explains this in more detail:
This is one of the most bloodcurdling (and my favorite) stories that we have recently seen in our 190-square-meter Hadoopland. In a nutshell, some jobs were surprisingly running extremely long, because thousands of their tasks were constantly being killed for some unknown reason by someone (or something).
For example, a photo taken by our detectives shows a job running for 12hrs:20min that had spawned around 13,000 tasks until that moment. However, (only) 4,118 map tasks had finished successfully, while 8,708 were killed (!) and … surprisingly only 1 task failed (?) – obviously spreading panic in the Hadoopland.
When murdering, the killer was leaving the same message each time: "KILLED_UNCLEAN by the user"
(however, even our uncle competitor Google does not quite know what it exactly means ;)). Who is “the user”? Does the killer want to impersonate someone?
More Traces Of Crime
The detectives started looking for more traces of the crime. They noticed that the killed tasks belonged to ad-hoc Hive queries, which are quite resource-intensive. When looking at the timestamps in log files from the JobTracker, TaskTrackers and map tasks, they figured out that the JobTracker got a request to murder the tasks…
They also noticed that tasks were usually killed young, quickly after the start (within 6-16 minutes), while the surviving tasks kept running fine for long hours… The killer is unscrupulous!
Killer’s Identity
Who can actually send a kill request to the JobTracker to murder thousands of tasks? The detectives quickly selected three main candidates:
- An impatient user (armed with the "mapred job -kill-task" command)
- JobTracker (to kill a speculative duplicate, or when a whole job fails)
- Fair Scheduler (but diplomatically, it calls it “preemption”)
When looking at the log messages saying that a task is "KILLED_UNCLEAN by the user", one could think that some user is the prime candidate to be the serial killer. However, the citizens of our Hadoopland are friendly, patient and respectful to others, so it would be unfair to assume that somebody killed, in cold blood, 8,708 tasks from a single job.
JobTracker also seems to have a good alibi, because the job itself had not failed yet and speculative execution was disabled (surprisingly, Hive has its own setting, hive.mapred.reduce.tasks.speculative.execution, for disabling speculative execution for reduce tasks, which is not overridden by Hadoop's mapred.reduce.tasks.speculative.execution).
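As a hedged illustration of that detail (the property names come from the text above; where they are set and the values shown are assumptions), turning off reduce-side speculation for Hive jobs means flipping Hive's own switch, not only the Hadoop one:

```xml
<!-- hive-site.xml (or "SET hive.mapred.reduce.tasks.speculative.execution=false;"
     in the Hive CLI): Hive's own switch for speculative reduce tasks -->
<property>
  <name>hive.mapred.reduce.tasks.speculative.execution</name>
  <value>false</value>
</property>

<!-- mapred-site.xml: the Hadoop-level switch, which Hive does not pick up
     for its reduce tasks -->
<property>
  <name>mapred.reduce.tasks.speculative.execution</name>
  <value>false</value>
</property>
```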
FairScheduler Accused
For some company-specific reasons, the ad-hoc Hive queries run as the hive user in our Hadoopland. Moreover, FairScheduler is configured with the default value of mapred.fairscheduler.poolnameproperty (which is user.name), so that the pools are created dynamically based on the username of the user submitting the job to the cluster ("hive" in the case of our ad-hoc Hive queries).
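A minimal sketch of the corresponding JobTracker configuration in mapred-site.xml, assuming the stock MR1 Fair Scheduler setup (the allocation file path is illustrative):

```xml
<!-- mapred-site.xml: enable the Fair Scheduler and derive pool names from the
     submitting user's name (user.name is the default poolnameproperty) -->
<property>
  <name>mapred.jobtracker.taskScheduler</name>
  <value>org.apache.hadoop.mapred.FairScheduler</value>
</property>
<property>
  <name>mapred.fairscheduler.poolnameproperty</name>
  <value>user.name</value>
</property>
<property>
  <name>mapred.fairscheduler.allocation.file</name>
  <value>/etc/hadoop/conf/fair-scheduler.xml</value>
</property>
```

With this setup, every job submitted by the hive user lands in a dynamically created pool named "hive".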
While browsing a presentation about Hadoop from two years ago, one of the detectives remembered that FairScheduler usually preempts the newest tasks in an over-share pool to forcibly make some room for starved pools.
Eureka! ;)
At this moment everything became clear, and a quick look at the FairScheduler web page confirmed it. The “hive” pool had been running over its minimum and fair shares for a long time, while the other pools were constantly running under their minimum and fair shares. In such a case, Fair Scheduler kills Hive tasks from time to time to reassign slots to tasks from other pools.
Less Violence, More Peace
Having the evidence, we could put Fair Scheduler in prison and use Capacity Scheduler instead. Maybe in the future we will do that! Today, we believe that Fair Scheduler did not really commit the crimes intentionally – we feel that we educated it badly and gave it too much power. Today, Fair Scheduler gets a suspended sentence – we want to give it a chance to rehabilitate and become friendlier and less aggressive…
How to dignify the personality of Fair Scheduler?
Obviously, tuning settings like minSharePreemptionTimeout, fairSharePreemptionTimeout, minMaps and minReduces based on the current workload could be a good way to control the aggressiveness of Fair Scheduler's preemption. Easier said than done, because it requires a deep understanding of and knowledge about your workload (which may, or may not, change later).
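For concreteness, here is a hedged sketch of what such tuning could look like in the Fair Scheduler allocation file; the element names come from the text above, while the pool name and the numbers (timeouts in seconds, as in the MR1 Fair Scheduler) are assumptions:

```xml
<?xml version="1.0"?>
<!-- fair-scheduler.xml: illustrative preemption tuning, not our real values -->
<allocations>
  <pool name="hive">
    <!-- guaranteed minimum share for this pool -->
    <minMaps>20</minMaps>
    <minReduces>10</minReduces>
    <!-- wait 10 minutes under the min share before preempting other pools -->
    <minSharePreemptionTimeout>600</minSharePreemptionTimeout>
  </pool>
  <!-- cluster-wide: wait 15 minutes below half of the fair share before preempting -->
  <fairSharePreemptionTimeout>900</fairSharePreemptionTimeout>
</allocations>
```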
There is a setting called mapred.fairscheduler.preemption that enables or disables preemption. However, disabling preemption (or rather killing, to be precise) would, in our case, only partially solve the problem. Only partially, because this issue exposed another problem in the Hadoopland – ad-hoc Hive queries are overloading the cluster. In the end, we did not disable preemption, because we were worried a bit about SLAs not being enforced without “any” preemption.
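For reference, that switch lives in mapred-site.xml; a sketch (we kept preemption on):

```xml
<!-- mapred-site.xml: global on/off switch for Fair Scheduler preemption.
     Setting it to false would stop the killing, but not the overloading,
     so we left it enabled. -->
<property>
  <name>mapred.fairscheduler.preemption</name>
  <value>true</value>
</property>
```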
Having said this, the two problems to solve are:
- stop mass killing Hive tasks
- stop overloading the cluster by ad-hoc Hive queries
We simply limited the number of map and reduce tasks that Fair Scheduler can run in the Hive pool, by setting maxMaps and maxReduces for that pool (see the sketch below). In consequence, the Hive pool cannot contain too many tasks, so Fair Scheduler cannot kill too many of them ;) (because the Hive pool will not be operating (too much) above its min and fair share levels). Limiting the number of tasks also prevents Hive queries from overloading the cluster (additionally, one could also set the maximum number of concurrent jobs running in the Hive pool using maxRunningJobs).
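A minimal sketch of that cap in the allocation file, assuming the dynamically created hive pool; the limits themselves are placeholders to be sized against your cluster's slot capacity:

```xml
<?xml version="1.0"?>
<!-- fair-scheduler.xml: cap the hive pool so it can never grow far beyond
     its min and fair shares (the numbers below are placeholders) -->
<allocations>
  <pool name="hive">
    <maxMaps>200</maxMaps>
    <maxReduces>100</maxReduces>
    <!-- optionally also cap the number of concurrent jobs in the pool -->
    <maxRunningJobs>10</maxRunningJobs>
  </pool>
</allocations>
```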
A nice thing to say is that Fair Scheduler is eager to cooperate, because changing Fair Scheduler's allocation file does not require restarting the JobTracker. The file is automatically polled for changes every 10 seconds, and if it has changed, it is reloaded and the pool configurations are updated on the fly. Thanks to that, you can easily learn about, and reshape, the personality of Fair Scheduler. ;)
