
Hystrix Semaphore timeout

Date: 2018-10-03

When to use semaphore

With thread isolation there is a thread context-switch cost, but in most applications it is small enough to ignore (see Thread pool):

For circuits that wrap very low-latency requests (such as those that primarily hit in-memory caches) the overhead can be too high and in those cases you can use another method such as tryable semaphores which, while they do not allow for timeouts, provide most of the resilience benefits without the overhead. The overhead in general, however, is small enough that Netflix in practice usually prefers the isolation benefits of a separate thread over such techniques.

Drawbacks

It does not allow for timing out and walking away.

Why? Let's look at the code.

    /**
     * Semaphore that only supports tryAcquire and never blocks and that supports a dynamic permit count.
     * <p>
     * Using AtomicInteger increment/decrement instead of java.util.concurrent.Semaphore since we don't need blocking and need a custom implementation to get the dynamic permit count and since
     * AtomicInteger achieves the same behavior and performance without the more complex implementation of the actual Semaphore class using AbstractQueueSynchronizer.
     */
    /* package */static class TryableSemaphoreActual implements TryableSemaphore {
        protected final HystrixProperty<Integer> numberOfPermits;
        private final AtomicInteger count = new AtomicInteger(0);

        public TryableSemaphoreActual(HystrixProperty<Integer> numberOfPermits) {
            this.numberOfPermits = numberOfPermits;
        }

        @Override
        public boolean tryAcquire() {
            int currentCount = count.incrementAndGet();
            if (currentCount > numberOfPermits.get()) {
                count.decrementAndGet();
                return false;
            } else {
                return true;
            }
        }

        @Override
        public void release() {
            count.decrementAndGet();
        }

        @Override
        public int getNumberOfPermitsUsed() {
            return count.get();
        }
    }
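To make the "no walking away" point concrete, here is a minimal stand-alone sketch of the same counting idea, without any Hystrix dependency. The class name CountingTryableSemaphore, the fixed int limit, and the runGuarded helper are assumptions made for illustration and are not part of Hystrix. It shows that a caller is either rejected immediately or runs the work on its own thread until the work returns.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical stand-alone counterpart of TryableSemaphoreActual with a fixed permit limit. */
final class CountingTryableSemaphore {
    private final int maxPermits;
    private final AtomicInteger count = new AtomicInteger(0);

    CountingTryableSemaphore(int maxPermits) {
        this.maxPermits = maxPermits;
    }

    /** Never blocks: either takes a permit or reports rejection immediately. */
    boolean tryAcquire() {
        if (count.incrementAndGet() > maxPermits) {
            count.decrementAndGet(); // undo the optimistic increment
            return false;
        }
        return true;
    }

    void release() {
        count.decrementAndGet();
    }

    int getNumberOfPermitsUsed() {
        return count.get();
    }

    /** Runs the work on the caller's thread if a permit is free, otherwise returns the fallback. */
    static String runGuarded(CountingTryableSemaphore semaphore,
                             Supplier<String> work,
                             String fallback) {
        if (!semaphore.tryAcquire()) {
            return fallback; // rejected without waiting
        }
        try {
            return work.get(); // the caller stays here until the work returns
        } finally {
            semaphore.release();
        }
    }

    public static void main(String[] args) {
        CountingTryableSemaphore semaphore = new CountingTryableSemaphore(1);
        // The outer call holds the only permit, so the nested call is rejected.
        String result = runGuarded(semaphore,
                () -> "outer:" + runGuarded(semaphore, () -> "inner", "rejected"),
                "rejected");
        System.out.println(result);                             // prints "outer:rejected"
        System.out.println(semaphore.getNumberOfPermitsUsed()); // prints "0"
    }
}
```

The guard only decides whether work may start. Nothing in it can cut a slow call short, which is the drawback described above.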

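The semaphore is implemented as a simple counter. Since 1.4.4, Hystrix also provides a timeout mechanism for semaphore isolation. The timeout does not interrupt the original thread, but when it fires, the event is reported to the circuit breaker, which lets the circuit trip closer to real time. The timeout itself is implemented with a Timer.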
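The behaviour described above (a timer records the timeout, but the calling thread is not interrupted) can be sketched with plain JDK classes. This is an illustration under assumptions, not Hystrix's implementation: the class SemaphoreTimeoutSketch, its Status enum, and the use of a ScheduledExecutorService in place of Hystrix's timer are all hypothetical.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical sketch: a timer marks the timeout while the work keeps running on the caller's thread. */
final class SemaphoreTimeoutSketch {
    enum Status { NOT_EXECUTED, COMPLETED, TIMED_OUT }

    /** Runs the work on the calling thread and returns whichever status won the race. */
    static Status runWithTimeout(Runnable work, long timeoutMillis) {
        AtomicReference<Status> status = new AtomicReference<>(Status.NOT_EXECUTED);
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        try {
            // The timer thread only flips the status; it never touches the working thread.
            ScheduledFuture<?> tick = timer.schedule(
                    () -> { status.compareAndSet(Status.NOT_EXECUTED, Status.TIMED_OUT); },
                    timeoutMillis, TimeUnit.MILLISECONDS);
            work.run(); // not interrupted: the caller is stuck here until the work returns
            status.compareAndSet(Status.NOT_EXECUTED, Status.COMPLETED);
            tick.cancel(false);
            return status.get();
        } finally {
            timer.shutdownNow();
        }
    }

    /** Sleeps for the given time, restoring the interrupt flag if interrupted. */
    static void sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        Status slow = runWithTimeout(() -> sleepQuietly(300), 50);
        long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
        // The timeout is recorded after 50 ms, yet the caller still waits for the full 300 ms.
        System.out.println(slow + " after about " + elapsedMillis + " ms");
        System.out.println(runWithTimeout(() -> { }, 1000)); // prints "COMPLETED"
    }
}
```

The compare-and-set mirrors the NOT_EXECUTED to TIMED_OUT transition in the Hystrix snippet: whichever side flips the status first decides the outcome, and the slow work still occupies its thread.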

    Observable<R> execution;
    if (properties.executionTimeoutEnabled().get()) {
        execution = executeCommandWithSpecifiedIsolation(_cmd)
                .lift(new HystrixObservableTimeoutOperator<R>(_cmd));
    } else {
        execution = executeCommandWithSpecifiedIsolation(_cmd);
    }

    TimerListener listener = new TimerListener() {

        @Override
        public void tick() {
            // if we can go from NOT_EXECUTED to TIMED_OUT then we do the timeout codepath
            // otherwise it means we lost a race and the run() execution completed or did not start
            if (originalCommand.isCommandTimedOut.compareAndSet(TimedOutStatus.NOT_EXECUTED, TimedOutStatus.TIMED_OUT)) {
                // report timeout failure
                originalCommand.eventNotifier.markEvent(HystrixEventType.TIMEOUT, originalCommand.commandKey);

                // shut down the original request
                s.unsubscribe();

                final HystrixContextRunnable timeoutRunnable = new HystrixContextRunnable(originalCommand.concurrencyStrategy, hystrixRequestContext, new Runnable() {

                    @Override
                    public void run() {
                        child.onError(new HystrixTimeoutException());
                    }
                });

                timeoutRunnable.run();
                //if it did not start, then we need to mark a command start for concurrency metrics, and then issue the timeout
            }
        }

        @Override
        public int getIntervalTimeInMilliseconds() {
            return originalCommand.properties.executionTimeoutInMilliseconds().get();
        }
    };

Note: if a dependency is isolated with a semaphore and then becomes latent, the parent threads will remain blocked until the underlying network calls timeout. Semaphore rejection will start once the limit is hit but the threads filling the semaphore can not walk away.

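Because semaphore isolation cannot enforce a timeout that frees the thread, it blocks the server's request threads: when the execution inside a command becomes slow, the thread handling the current request is blocked. If the semaphore limit is large, say 100, then all 100 of those threads are blocked. If several calls behave this way, the number of requests the client can handle drops.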

Here is an ab test. Each request takes 3 seconds to complete by default, the client has no timeout, and the semaphore limit is 100:

    ab -n 200 -c 200 http://localhost:8990/coke/block

    Percentage of the requests served within a certain time (ms)
      50%   3057
      66%   3238
      75%   3274
      80%   3321
      90%   3510
      95%   3516
      98%   3523
      99%   3524
     100%   3526 (longest request)

Expected: 100 requests pass and 100 are rejected.
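The expected split can be modelled in isolation. The following sketch is a simulation under assumptions, not a reproduction of the ab run: the class LoadSheddingSimulation is hypothetical, java.util.concurrent.Semaphore stands in for Hystrix's tryable semaphore, and a latch stands in for the 3-second call so that every holder keeps its permit until all requests have attempted to acquire one.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical simulation of concurrent requests hitting a try-only semaphore. */
final class LoadSheddingSimulation {

    /** Returns {accepted, rejected} for the given number of concurrent requests and permits. */
    static int[] simulate(int requests, int permits) {
        Semaphore semaphore = new Semaphore(permits);
        CountDownLatch allAttempted = new CountDownLatch(requests);
        AtomicInteger accepted = new AtomicInteger();
        AtomicInteger rejected = new AtomicInteger();
        // One thread per request, like -c 200 in the ab command.
        ExecutorService pool = Executors.newFixedThreadPool(requests);
        for (int i = 0; i < requests; i++) {
            pool.execute(() -> {
                boolean acquired = semaphore.tryAcquire(); // non-blocking attempt
                allAttempted.countDown();
                if (!acquired) {
                    rejected.incrementAndGet(); // shed immediately
                    return;
                }
                try {
                    allAttempted.await(); // stand-in for the slow call: hold the permit
                    accepted.incrementAndGet();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    semaphore.release();
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("simulation did not finish");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("simulation interrupted", e);
        }
        return new int[] {accepted.get(), rejected.get()};
    }

    public static void main(String[] args) {
        int[] result = simulate(200, 100);
        System.out.println("accepted=" + result[0] + " rejected=" + result[1]);
        // prints "accepted=100 rejected=100"
    }
}
```

In this model the 100 accepted requests still occupy their threads for the whole duration of the slow call; only the overflow is rejected quickly.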

Zuul and Hystrix timeout

As we know, Zuul by default uses HystrixCommand within its Ribbon HTTP client. The default Hystrix isolation pattern (ExecutionIsolationStrategy) for all routes is SEMAPHORE.

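Zuul uses semaphore isolation by default (Zuul mainly forwards requests and is already thread-isolated, so there is no need to apply thread isolation a second time). The timeout is set by the HttpClient timeout: only after the request times out and throws an exception is the corresponding circuit breaking and fallback triggered. If a thread pool is used for isolation, the timeout is calculated as follows: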

(ribbon.ConnectTimeout + ribbon.ReadTimeout) * (ribbon.MaxAutoRetries + 1) * (ribbon.MaxAutoRetriesNextServer + 1) 
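As a worked example of the formula above, the small helper below evaluates it for one set of inputs. The class and method names are hypothetical, and the input values are illustrative only; they are not claimed to be Ribbon or Zuul defaults.

```java
/** Hypothetical helper that evaluates the timeout formula for thread-pool isolation. */
final class ZuulTimeoutMath {

    /** (ConnectTimeout + ReadTimeout) * (MaxAutoRetries + 1) * (MaxAutoRetriesNextServer + 1), in milliseconds. */
    static long threadIsolationTimeoutMillis(long connectTimeoutMillis,
                                             long readTimeoutMillis,
                                             int maxAutoRetries,
                                             int maxAutoRetriesNextServer) {
        return (connectTimeoutMillis + readTimeoutMillis)
                * (maxAutoRetries + 1)
                * (maxAutoRetriesNextServer + 1);
    }

    public static void main(String[] args) {
        // Illustrative inputs: 1 s connect, 5 s read, no same-server retry, one next-server retry.
        long timeout = threadIsolationTimeoutMillis(1000, 5000, 0, 1);
        System.out.println(timeout); // prints "12000": (1000 + 5000) * 1 * 2
    }
}
```

Each retry setting multiplies the whole connect-plus-read budget, so small changes to the retry counts have a large effect on the resulting timeout.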

By default Zuul uses the serviceId as the commandKey, and the default semaphore limit is 100.

Sentinel

If you trust the client and only want load shedding, you could use this approach (semaphore).

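When our calls may become latent and the QPS is very high, a thread pool becomes expensive: at 2000 QPS with a 1-second response time you would need roughly 2000 threads, which adds a lot of thread-switching overhead. In that situation we can use alibaba/Sentinel for circuit breaking and degradation.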
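The thread count in that example is just throughput multiplied by latency. A minimal sketch of the arithmetic, with a hypothetical class and method name and assuming each in-flight request occupies one thread for its whole duration:

```java
/** Hypothetical helper: estimates how many threads are busy at a given throughput and latency. */
final class ThreadPoolSizing {

    /** Requests per second times latency in seconds, rounded up. */
    static long requiredThreads(long requestsPerSecond, long latencyMillis) {
        return (requestsPerSecond * latencyMillis + 999) / 1000;
    }

    public static void main(String[] args) {
        System.out.println(requiredThreads(2000, 1000)); // prints "2000"
        System.out.println(requiredThreads(2000, 50));   // prints "100"
    }
}
```

The second line shows why the problem appears only when the dependency becomes latent: at 50 ms the same traffic needs about 100 threads.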

For details, see the Sentinel documentation on circuit breaking.

Comparison of Sentinel and Hystrix

Original article: https://my.oschina.net/tigerlene/blog/2222699