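Fair Scheduler to Capacity Scheduler Conversion Tool
o Looks at multiple nodes at once
o Fine-grained locking
o Multiple allocation threads
o 5-10x higher throughput
• Introduce the Fair Scheduler -> Capacity Scheduler conversion tool
• Describe how it works internally
• Explain the command-line switches
• Provide examples of how to use the tool
• Explain the tool's limitations, since a 100% automatic conversion from Fair Scheduler to Capacity Scheduler is not yet possible
• Discuss future work on scheduling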
yarn fs2cs -y /path/to/yarn-site.xml [-f /path/to/fair-scheduler.xml] {-o /output/path/ | -p} [-t] [-s] [-d]
yarn fs2cs --yarnsiteconfig /path/to/yarn-site.xml [--fsconfig /path/to/fair-scheduler.xml] {--output-directory /output/path/ | --print} [--no-terminal-rule-check] [--skip-verification] [--dry-run]
yarn fs2cs --yarnsiteconfig /home/hadoop/yarn-site.xml --fsconfig /home/hadoop/fair-scheduler.xml --output-directory /tmp
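When the conversion is run from automation rather than by hand, it can help to assemble the command line programmatically. The following is a minimal sketch, not part of the tool itself: the function name `build_fs2cs_command` and its parameters are hypothetical, and only the long-form switches shown in the usage line above are used. It assumes, based on the braces in the usage line, that exactly one of `--output-directory` and `--print` must be given.

```python
from typing import List, Optional


def build_fs2cs_command(
    yarn_site: str,
    fs_config: Optional[str] = None,
    output_dir: Optional[str] = None,
    print_only: bool = False,
    no_terminal_rule_check: bool = False,
    skip_verification: bool = False,
    dry_run: bool = False,
) -> List[str]:
    """Assemble an argv list for `yarn fs2cs` (hypothetical helper)."""
    # Assumption: the usage line shows {--output-directory | --print}
    # as alternatives, so exactly one of them is required here.
    if (output_dir is None) == (not print_only):
        raise ValueError("choose exactly one of output_dir or print_only")

    cmd = ["yarn", "fs2cs", "--yarnsiteconfig", yarn_site]
    if fs_config is not None:
        cmd += ["--fsconfig", fs_config]
    if print_only:
        cmd.append("--print")
    else:
        cmd += ["--output-directory", output_dir]
    if no_terminal_rule_check:
        cmd.append("--no-terminal-rule-check")
    if skip_verification:
        cmd.append("--skip-verification")
    if dry_run:
        cmd.append("--dry-run")
    return cmd


# Rebuild the example invocation shown above.
example = build_fs2cs_command(
    "/home/hadoop/yarn-site.xml",
    fs_config="/home/hadoop/fair-scheduler.xml",
    output_dir="/tmp",
)
print(" ".join(example))
```

The returned list can be passed to `subprocess.run` on a host where the `yarn` command is available.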
<allocations>
    <queue name="root">
        <weight>1.0</weight>
        <schedulingPolicy>drf</schedulingPolicy>
        <queue name="default">
            <weight>1.0</weight>
            <schedulingPolicy>drf</schedulingPolicy>
        </queue>
        <queue name="users" type="parent">
            <maxChildResources>memory-mb=8192, vcores=1</maxChildResources>
            <weight>1.0</weight>
            <schedulingPolicy>drf</schedulingPolicy>
        </queue>
    </queue>
    <queuePlacementPolicy>
        <rule name="specified" create="true"/>
        <rule name="nestedUserQueue" create="true">
            <rule name="default" create="true" queue="users"/>
        </rule>
        <rule name="default"/>
    </queuePlacementPolicy>
</allocations>
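Before converting, it can be useful to see which queues and weights an allocation file actually defines. The sketch below is an illustration only and does not reflect how the converter reads the file: it uses Python's standard `xml.etree.ElementTree`, the helper name `queue_weights` is hypothetical, and it assumes a queue without a `<weight>` element has weight 1.0. The embedded XML is a shortened copy of the queue section of the example file.

```python
import xml.etree.ElementTree as ET
from typing import Dict

# Shortened copy of the example allocation file (queues only).
FAIR_SCHEDULER_XML = """
<allocations>
    <queue name="root">
        <weight>1.0</weight>
        <schedulingPolicy>drf</schedulingPolicy>
        <queue name="default">
            <weight>1.0</weight>
            <schedulingPolicy>drf</schedulingPolicy>
        </queue>
        <queue name="users" type="parent">
            <maxChildResources>memory-mb=8192, vcores=1</maxChildResources>
            <weight>1.0</weight>
            <schedulingPolicy>drf</schedulingPolicy>
        </queue>
    </queue>
</allocations>
"""


def queue_weights(xml_text: str) -> Dict[str, float]:
    """Map each full queue path (e.g. root.users) to its weight."""
    weights: Dict[str, float] = {}

    def walk(element: ET.Element, prefix: str) -> None:
        # findall("queue") only matches direct children, so nesting
        # is handled by the recursive call below.
        for queue in element.findall("queue"):
            name = queue.get("name")
            path = f"{prefix}.{name}" if prefix else name
            weight = queue.findtext("weight")
            # Assumption: a missing <weight> is treated as 1.0.
            weights[path] = float(weight) if weight is not None else 1.0
            walk(queue, path)

    walk(ET.fromstring(xml_text), "")
    return weights


for path, weight in queue_weights(FAIR_SCHEDULER_XML).items():
    print(path, weight)
```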
yarn.scheduler.fair.allow-undeclared-pools = true
yarn.scheduler.fair.user-as-default-queue = true
yarn.scheduler.fair.preemption = false
yarn.scheduler.fair.preemption.cluster-utilization-threshold = 0.8
yarn.scheduler.fair.sizebasedweight = false
yarn.scheduler.fair.assignmultiple = true
yarn.scheduler.fair.dynamicmaxassign = true
yarn.scheduler.fair.maxassign = -1
yarn.scheduler.fair.continuous-scheduling-enabled = false
yarn.scheduler.fair.locality-delay-node-ms = 2000
~$ yarn fs2cs -y /home/examples/yarn-site.xml -f /home/examples/fair-scheduler.xml -o /tmp
2020-05-05 14:22:41,384 INFO [main] converter.FSConfigToCSConfigConverter (FSConfigToCSConfigConverter.java:prepareOutputFiles(138)) - Output directory for yarn-site.xml and capacity-scheduler.xml is: /tmp
2020-05-05 14:22:41,388 INFO [main] converter.FSConfigToCSConfigConverter (FSConfigToCSConfigConverter.java:loadConversionRules(177)) - Conversion rules file is not defined, using default conversion config!
[...] output trimmed for brevity
2020-05-05 14:22:42,572 ERROR [main] converter.FSConfigToCSConfigConverterMain (MarkerIgnoringBase.java:error(159)) - Error while starting FS configuration conversion!
[...] output trimmed for brevity
Caused by: org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationConfigurationException: Rules after rule 2 in queue placement policy can never be reached
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.QueuePlacementPolicy.updateRuleSet(QueuePlacementPolicy.java:115)
[...]
~$ yarn fs2cs -y /home/examples/yarn-site.xml -f /home/examples/fair-scheduler.xml -o /tmp --no-terminal-rule-check
2020-05-05 14:41:39,189 INFO [main] capacity.CapacityScheduler (CapacityScheduler.java:initScheduler(384)) - Initialized CapacityScheduler with calculator=class org.apache.hadoop.yarn.util.resource.DefaultResourceCalculator, minimumAllocation=<<memory:1024, vCores:1>>, maximumAllocation=<<memory:8192, vCores:4>>, asynchronousScheduling=false, asyncScheduleInterval=5ms,multiNodePlacementEnabled=false
2020-05-05 14:41:39,190 INFO [main] converter.ConvertedConfigValidator (ConvertedConfigValidator.java:validateConvertedConfig(72)) - Capacity scheduler was successfully started
This time, the conversion succeeded!
2020-05-05 14:41:38,908 WARN [main] converter.FSConfigToCSConfigRuleHandler (ConversionOptions.java:handleWarning(48)) - Setting <userMaxAppsDefault> is not supported, ignoring conversion
2020-05-05 14:41:38,945 WARN [main] converter.FSConfigToCSConfigRuleHandler (ConversionOptions.java:handleWarning(48)) - Setting <maxChildResources> is not supported, ignoring conversion
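Warnings like these are easy to miss in a long log, so it may be worth collecting them after each run. The following sketch pulls the unsupported setting names out of captured output with a regular expression. It relies only on the message wording shown in the two warnings above; the helper name `unsupported_settings` is hypothetical, and other tool versions may word the message differently.

```python
import re
from typing import List

# Matches the warning text shown above and captures the setting name.
UNSUPPORTED = re.compile(
    r"Setting (<\w+>) is not supported, ignoring conversion"
)

SAMPLE_LOG = """\
2020-05-05 14:41:38,908 WARN [main] converter.FSConfigToCSConfigRuleHandler (ConversionOptions.java:handleWarning(48)) - Setting <userMaxAppsDefault> is not supported, ignoring conversion
2020-05-05 14:41:38,945 WARN [main] converter.FSConfigToCSConfigRuleHandler (ConversionOptions.java:handleWarning(48)) - Setting <maxChildResources> is not supported, ignoring conversion
"""


def unsupported_settings(log_text: str) -> List[str]:
    """Return the setting names reported as unsupported, in log order."""
    return UNSUPPORTED.findall(log_text)


for setting in unsupported_settings(SAMPLE_LOG):
    print("needs manual review:", setting)
```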
yarn.scheduler.capacity.resource-calculator = org.apache.hadoop.yarn.util.resource.DominantResourceCalculator
yarn.scheduler.capacity.per-node-heartbeat.multiple-assignments-enabled = true
yarn.resourcemanager.scheduler.class = org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler
yarn.scheduler.capacity.root.users.maximum-capacity = 100
yarn.scheduler.capacity.root.default.capacity = 50.000
yarn.scheduler.capacity.root.default.ordering-policy = fair
yarn.scheduler.capacity.root.users.capacity = 50.000
yarn.scheduler.capacity.root.default.maximum-capacity = 100
yarn.scheduler.capacity.root.queues = default,users
yarn.scheduler.capacity.root.maximum-capacity = 100
yarn.scheduler.capacity.maximum-am-resource-percent = 0.5
yarn.scheduler.fair.preemption - true
yarn.scheduler.fair.sizebasedweight - true
yarn.scheduler.fair.continuous-scheduling-enabled - true
yarn.scheduler.capacity.resource-calculator = org.apache.hadoop.yarn.util.resource.DominantResourceCalculator
yarn.scheduler.capacity.schedule-asynchronously.scheduling-interval-ms = 5
yarn.scheduler.capacity.schedule-asynchronously.enable = true
yarn.resourcemanager.monitor.capacity.preemption.monitoring_interval = 10000
yarn.resourcemanager.monitor.capacity.preemption.max_wait_before_kill = 15000
yarn.scheduler.capacity.per-node-heartbeat.multiple-assignments-enabled = true
yarn.resourcemanager.scheduler.class = org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler
yarn.resourcemanager.scheduler.monitor.enable = true
yarn.scheduler.capacity.root.default.ordering-policy.fair.enable-size-based-weight = true
yarn.scheduler.capacity.root.users.ordering-policy.fair.enable-size-based-weight = true
yarn.scheduler.capacity.root.users.capacity = 50.000
yarn.scheduler.capacity.root.queues = default,users
yarn.scheduler.capacity.root.users.maximum-capacity = 100
yarn.scheduler.capacity.root.ordering-policy.fair.enable-size-based-weight = true
[...] rest is omitted because it’s the same as before
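The relationship between the three Fair Scheduler switches and the new Capacity Scheduler properties can be summarized as a small lookup. The sketch below is inferred only from the property names in the sample output above and is an assumption, not the converter's actual logic: preemption is paired with the scheduler monitor, continuous scheduling with asynchronous scheduling, and size-based weight with a per-queue ordering-policy flag. The function name `expected_cs_properties` is hypothetical, and the accompanying interval and timeout properties are left out.

```python
from typing import Dict, Iterable


def expected_cs_properties(
    fs_props: Dict[str, bool], queues: Iterable[str]
) -> Dict[str, str]:
    """Guess which Capacity Scheduler flags correspond to FS switches."""
    cs: Dict[str, str] = {}
    # Assumed pairing: FS preemption -> CS scheduler monitor.
    if fs_props.get("yarn.scheduler.fair.preemption"):
        cs["yarn.resourcemanager.scheduler.monitor.enable"] = "true"
    # Assumed pairing: FS continuous scheduling -> CS async scheduling.
    if fs_props.get("yarn.scheduler.fair.continuous-scheduling-enabled"):
        cs["yarn.scheduler.capacity.schedule-asynchronously.enable"] = "true"
    # Assumed pairing: FS size-based weight -> per-queue ordering flag.
    if fs_props.get("yarn.scheduler.fair.sizebasedweight"):
        for queue in queues:
            key = (
                f"yarn.scheduler.capacity.{queue}"
                ".ordering-policy.fair.enable-size-based-weight"
            )
            cs[key] = "true"
    return cs


result = expected_cs_properties(
    {
        "yarn.scheduler.fair.preemption": True,
        "yarn.scheduler.fair.sizebasedweight": True,
        "yarn.scheduler.fair.continuous-scheduling-enabled": True,
    },
    ["root", "root.default", "root.users"],
)
for key, value in result.items():
    print(f"{key} = {value}")
```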
root.a = 3
root.b = 1
yarn.scheduler.capacity.root.a.capacity = 75.000
yarn.scheduler.capacity.root.a.maximum-capacity = 100.000
yarn.scheduler.capacity.root.b.capacity = 25.000
yarn.scheduler.capacity.root.b.maximum-capacity = 100.000
root = 1
root.users = 20
root.default = 10
root.users.alice = 3
root.users.bob = 1
yarn.scheduler.capacity.root.capacity = 100.000
yarn.scheduler.capacity.root.maximum-capacity = 100.000
yarn.scheduler.capacity.root.users.capacity = 66.667
yarn.scheduler.capacity.root.users.maximum-capacity = 100.000
yarn.scheduler.capacity.root.default.capacity = 33.333
yarn.scheduler.capacity.root.default.maximum-capacity = 100.000
yarn.scheduler.capacity.root.users.alice.capacity = 75.000
yarn.scheduler.capacity.root.users.alice.maximum-capacity = 100.000
yarn.scheduler.capacity.root.users.bob.capacity = 25.000
yarn.scheduler.capacity.root.users.bob.maximum-capacity = 100.000
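The numbers above are consistent with a simple rule: each queue's capacity is its weight divided by the sum of the weights of its siblings under the same parent, expressed as a percentage. The sketch below reproduces the sample values under that assumption. The function name `weights_to_capacities` is hypothetical, and the sketch makes no attempt to handle zero total weight or to force rounded sibling values to sum to exactly 100.

```python
from collections import defaultdict
from typing import Dict, List


def weights_to_capacities(weights: Dict[str, float]) -> Dict[str, str]:
    """Turn per-queue weights into percentage capacities per sibling group."""
    # Group queue paths by parent path ("" for the root queue itself).
    siblings: Dict[str, List[str]] = defaultdict(list)
    for path in weights:
        parent = path.rpartition(".")[0]
        siblings[parent].append(path)

    capacities: Dict[str, str] = {}
    for children in siblings.values():
        total = sum(weights[child] for child in children)
        for child in children:
            share = 100.0 * weights[child] / total
            capacities[child] = f"{share:.3f}"
    return capacities


capacities = weights_to_capacities(
    {
        "root": 1,
        "root.users": 20,
        "root.default": 10,
        "root.users.alice": 3,
        "root.users.bob": 1,
    }
)
for path, capacity in capacities.items():
    print(f"yarn.scheduler.capacity.{path}.capacity = {capacity}")
```

Because shares are computed per sibling group, the weights 20 and 10 for root.users and root.default have no effect on the 3:1 split between alice and bob.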
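• Maximum number of applications per user
• <userMaxAppsDefault> – default maximum number of applications per user
• <minResources> – minimum resources of a queue
• <maxResources> – maximum resources of a queue
• <maxChildResources> – maximum resources of dynamically created queues
• Queue-level DRF ordering policy: in Capacity Scheduler, DRF must be global. In Fair Scheduler, a regular "Fair" policy can be used under a DRF parent.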
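1) Handle percentage vectors as resources in Capacity Scheduler (YARN-9936). Users will be able to define not just a single capacity, but multiple values for different resources.
2) Handle maxRunningApps and the per-user userMaxAppsDefault (YARN-9930). A "maximum applications per user" setting exists, but it cannot be configured directly and is cumbersome because it is a combination of three settings. Care is also needed not to break existing behavior: when the maximum is exceeded, the existing logic in Capacity Scheduler rejects the application submission, whereas Fair Scheduler always accepts the application and schedules it later.
3) Handle minResources, maxResources and maxChildResources. These depend heavily on YARN-9936. In Fair Scheduler, users can express these settings in several ways (a single percentage, two separate percentages, or absolute resources). Supporting similar settings in Capacity Scheduler requires YARN-9936.
4) Make mapping rules behave like the implementation in Fair Scheduler. The "Placement rules" section explains how mapping rules are evaluated. A new, pluggable approach may be needed so that no regressions are introduced into the existing, already very complex codebase.
5) Improvements around DRF and other scheduling policies (YARN-9892). Currently there is a single global resource calculator, defined by the property yarn.scheduler.capacity.resource-calculator. Fair Scheduler is more fine-grained in this respect.
6) General fine-tuning of the whole conversion process. Capacity Scheduler has properties such as "user-limit-factor" and "minimum-user-limit-percent". We do not use these settings for now, but they have turned out to be useful in certain configurations.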
This article was shared from the WeChat public account 大数据杂货铺 (bigdataGrocery).