
A record of a process that hung forever because a thread pool was used incorrectly

Date: 2020-09-12

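Seeing this title, a sharp-eyed reader may think: isn't this just a deadlock? But the situation described here is not a deadlock in the strict sense; at most it is a deadlock in a broad sense. Moreover, jstack shows no deadlock information for it.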

Threads waiting on each other because of thread pool misuse

Today's example

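    import java.util.concurrent.ExecutionException;
    import java.util.concurrent.Future;
    import java.util.concurrent.LinkedBlockingQueue;
    import java.util.concurrent.ThreadPoolExecutor;
    import java.util.concurrent.TimeUnit;

    public class Test {
        // Pool with exactly one worker thread and an unbounded queue.
        static ThreadPoolExecutor threadPoolExecutor = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<Runnable>());

        public static void main(String[] args) throws ExecutionException, InterruptedException {
            Future<String> outerFuture = threadPoolExecutor.submit(() -> {
                // The inner task is queued behind the outer task that submits it.
                Future<String> innerFuture = threadPoolExecutor.submit(() -> {
                    System.out.println("inner finish");
                    return "inner finish";
                });
                // Blocks forever: the only worker thread is busy running this task.
                String s = innerFuture.get();
                System.out.println("outer get inner finish:" + s);
                System.out.println("outer finish");
                return "outer finish";
            });
            String s = outerFuture.get();
            System.out.println("process get outer finish:" + s);
        }
    }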

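In other words, task 1 is submitted to the pool, task 1 submits task 2 to the same pool, and task 1 then waits for task 2's result. Some readers will spot the problem right away. This is of course a simplified version; in real code the thread pool usage can be far more obscure. Running this method immediately leaves the two tasks waiting on each other, and the process never exits.

What jstack shows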

    2020-09-12 09:52:41
    Full thread dump Java HotSpot(TM) 64-Bit Server VM (25.131-b11 mixed mode):

    "Attach Listener" #11 daemon prio=9 os_prio=0 tid=0x00007fbf38001000 nid=0x37c waiting on condition [0x0000000000000000]
       java.lang.Thread.State: RUNNABLE

    "pool-1-thread-1" #10 prio=5 os_prio=0 tid=0x00007fbf9819c800 nid=0x7932 waiting on condition [0x00007fbf77af9000]
       java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for <0x00000006c8e08478> (a java.util.concurrent.FutureTask)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
        at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:429)
        at java.util.concurrent.FutureTask.get(FutureTask.java:191)
        at Test.lambda$main$1(Test.java:24)
        at Test$$Lambda$1/1418481495.call(Unknown Source)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:748)

    "Service Thread" #9 daemon prio=9 os_prio=0 tid=0x00007fbf980d2000 nid=0x7930 runnable [0x0000000000000000]
       java.lang.Thread.State: RUNNABLE

    "C1 CompilerThread3" #8 daemon prio=9 os_prio=0 tid=0x00007fbf980c7000 nid=0x792f waiting on condition [0x0000000000000000]
       java.lang.Thread.State: RUNNABLE

    "C2 CompilerThread2" #7 daemon prio=9 os_prio=0 tid=0x00007fbf980c4800 nid=0x792e waiting on condition [0x0000000000000000]
       java.lang.Thread.State: RUNNABLE

    "C2 CompilerThread1" #6 daemon prio=9 os_prio=0 tid=0x00007fbf980c3000 nid=0x792d waiting on condition [0x0000000000000000]
       java.lang.Thread.State: RUNNABLE

    "C2 CompilerThread0" #5 daemon prio=9 os_prio=0 tid=0x00007fbf980c0000 nid=0x792c waiting on condition [0x0000000000000000]
       java.lang.Thread.State: RUNNABLE

    "Signal Dispatcher" #4 daemon prio=9 os_prio=0 tid=0x00007fbf980be800 nid=0x792b runnable [0x0000000000000000]
       java.lang.Thread.State: RUNNABLE

    "Finalizer" #3 daemon prio=8 os_prio=0 tid=0x00007fbf9808b800 nid=0x792a in Object.wait() [0x00007fbf84371000]
       java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x00000006c8e01a60> (a java.lang.ref.ReferenceQueue$Lock)
        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:143)
        - locked <0x00000006c8e01a60> (a java.lang.ref.ReferenceQueue$Lock)
        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:164)
        at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:209)

    "Reference Handler" #2 daemon prio=10 os_prio=0 tid=0x00007fbf98086800 nid=0x7929 in Object.wait() [0x00007fbf84472000]
       java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x00000006c8e0f950> (a java.lang.ref.Reference$Lock)
        at java.lang.Object.wait(Object.java:502)
        at java.lang.ref.Reference.tryHandlePending(Reference.java:191)
        - locked <0x00000006c8e0f950> (a java.lang.ref.Reference$Lock)
        at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:153)

    "main" #1 prio=5 os_prio=0 tid=0x00007fbf98008800 nid=0x791e waiting on condition [0x00007fbf9e635000]
       java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for <0x00000006c8e177b8> (a java.util.concurrent.FutureTask)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
        at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:429)
        at java.util.concurrent.FutureTask.get(FutureTask.java:191)
        at Test.main(Test.java:31)

    "VM Thread" os_prio=0 tid=0x00007fbf9807f000 nid=0x7928 runnable

    "GC task thread#0 (ParallelGC)" os_prio=0 tid=0x00007fbf9801d800 nid=0x791f runnable
    "GC task thread#1 (ParallelGC)" os_prio=0 tid=0x00007fbf9801f800 nid=0x7920 runnable
    "GC task thread#2 (ParallelGC)" os_prio=0 tid=0x00007fbf98021800 nid=0x7921 runnable
    "GC task thread#3 (ParallelGC)" os_prio=0 tid=0x00007fbf98023000 nid=0x7922 runnable
    "GC task thread#4 (ParallelGC)" os_prio=0 tid=0x00007fbf98025000 nid=0x7923 runnable
    "GC task thread#5 (ParallelGC)" os_prio=0 tid=0x00007fbf98027000 nid=0x7925 runnable
    "GC task thread#6 (ParallelGC)" os_prio=0 tid=0x00007fbf98028800 nid=0x7926 runnable
    "GC task thread#7 (ParallelGC)" os_prio=0 tid=0x00007fbf9802a800 nid=0x7927 runnable

    "VM Periodic Task Thread" os_prio=0 tid=0x00007fbf980d5000 nid=0x7931 waiting on condition

    JNI global references: 201

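jstack does not report a deadlock on its own here. In a real system, with many business and framework threads, the problem is even harder to spot.

Thread pool parameters

Below is the ThreadPoolExecutor constructor that takes the most parameters: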

    public ThreadPoolExecutor(int corePoolSize,
                              int maximumPoolSize,
                              long keepAliveTime,
                              TimeUnit unit,
                              BlockingQueue<Runnable> workQueue,
                              ThreadFactory threadFactory,
                              RejectedExecutionHandler handler) {
        ......
    }

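The parameters mean the following (look up the details yourself):

  • corePoolSize: number of core threads in the pool
  • maximumPoolSize: maximum number of threads in the pool
  • keepAliveTime: how long an idle thread is kept alive
  • unit: time unit of keepAliveTime
  • workQueue: queue that buffers tasks waiting to run
  • threadFactory: factory the pool uses to create threads
  • handler: policy the pool applies to rejected tasks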
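
As a rough illustration of the list above, here is a minimal sketch that passes all seven arguments explicitly. The class name `PoolParams`, the thread name `demo-worker` and the concrete values (2 threads maximum, 30 seconds, a queue of 10) are assumptions chosen for the demo, not values from the article.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class PoolParams {
    // Builds a pool with every constructor argument spelled out.
    static ThreadPoolExecutor newPool() {
        ThreadFactory factory = r -> {
            Thread t = new Thread(r, "demo-worker"); // hypothetical thread name
            t.setDaemon(true);
            return t;
        };
        return new ThreadPoolExecutor(
                1,                                     // corePoolSize
                2,                                     // maximumPoolSize
                30L,                                   // keepAliveTime
                TimeUnit.SECONDS,                      // unit
                new LinkedBlockingQueue<Runnable>(10), // workQueue, bounded to 10 tasks
                factory,                               // threadFactory
                new ThreadPoolExecutor.AbortPolicy()); // handler: reject by throwing
    }

    public static void main(String[] args) {
        ThreadPoolExecutor pool = newPool();
        System.out.println("core=" + pool.getCorePoolSize()
                + " max=" + pool.getMaximumPoolSize()
                + " keepAliveSeconds=" + pool.getKeepAliveTime(TimeUnit.SECONDS)
                + " queueCapacity=" + pool.getQueue().remainingCapacity());
        pool.shutdown();
    }
}
```

With a bounded queue like this one, the pool only grows beyond corePoolSize once the queue is full; with the unbounded LinkedBlockingQueue used in the article's example, maximumPoolSize never comes into play.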

Cause analysis 1

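In the example, both the core pool size and the maximum pool size are 1, so the pool can run only one task at a time. A queue holds the tasks that are waiting to run. The problem is this: the outer task is submitted and occupies the pool's only worker thread. Inside it, the outer task submits the inner task and waits for its result. The inner task never runs, because it sits in the queue until a worker thread becomes free. So the outer task holds the only worker while waiting for the inner task's result, and the inner task waits for that same worker and never produces a result.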
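
The pool state described above can be made visible with a small variation of the example. This is only a sketch: the class name `StarvationDemo` and the 500 ms timeout are assumptions, and the unbounded `get()` is swapped for `get(timeout, unit)` so that the outer task can report what it sees instead of hanging.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class StarvationDemo {
    // Runs the outer/inner scenario and returns what the outer task observed.
    static String run() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<Runnable>());
        try {
            Future<String> outer = pool.submit(() -> {
                Future<String> inner = pool.submit(() -> "inner finish");
                try {
                    // Bounded wait instead of the article's unbounded get().
                    return inner.get(500, TimeUnit.MILLISECONDS);
                } catch (TimeoutException e) {
                    // Still on the only worker thread: inner is stuck in the queue.
                    return "inner timed out: active=" + pool.getActiveCount()
                            + ", queued=" + pool.getQueue().size();
                }
            });
            return outer.get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

The report should show one active worker and one queued task, which is exactly the state in which the original example hangs. A timeout only turns the hang into an error; a common way to avoid the situation altogether is to run inner tasks on a separate executor from the one running the tasks that wait for them.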

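Cause analysis 2

The jstack output does contain clues to the problem, but they are easy to misread. The threads named Reference Handler and Finalizer show the same address after waiting on and after locked, which looks like a thread waiting on itself. That is not the clue: it is simply how a thread inside Object.wait() appears in a dump, and the same two threads look identical in the dump of the deadlock example further down. The real clue is pool-1-thread-1: the pool's only worker is parked in FutureTask.get(), called from the task's own lambda (Test.lambda$main$1), while main is parked in FutureTask.get() waiting for that worker. No thread is left to run the FutureTask the worker is waiting for, so the wait never ends.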

Deadlock

First, a quick refresher on deadlock.

Deadlock conditions

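  1. Mutual exclusion: while a resource is used (held) by one thread, no other thread can use it.
  2. No preemption: a requester cannot take a resource away from its holder by force; only the holder can release it.
  3. Hold and wait: a thread keeps the resources it already holds while requesting additional ones.
  4. Circular wait: there is a cycle of waiters, for example P1 waits for a resource held by P2, P2 waits for a resource held by P3, and P3 waits for a resource held by P1.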

Deadlock example

    public class DeadLock implements Runnable {
        private static Object obj1 = new Object();
        private static Object obj2 = new Object();
        private boolean flag;

        public DeadLock(boolean flag) {
            this.flag = flag;
        }

        @Override
        public void run() {
            System.out.println(Thread.currentThread().getName() + " is running");
            if (flag) {
                // Takes obj1 first, then obj2.
                synchronized (obj1) {
                    System.out.println(Thread.currentThread().getName() + " has locked obj1");
                    try {
                        Thread.sleep(1000);
                    } catch (InterruptedException e) {
                        e.printStackTrace();
                    }
                    synchronized (obj2) {
                        // Never reached
                        System.out.println("After 1 second, " + Thread.currentThread().getName() + " locked obj2");
                    }
                }
            } else {
                // Takes the same locks in the opposite order: obj2 first, then obj1.
                synchronized (obj2) {
                    System.out.println(Thread.currentThread().getName() + " has locked obj2");
                    try {
                        Thread.sleep(1000);
                    } catch (InterruptedException e) {
                        e.printStackTrace();
                    }
                    synchronized (obj1) {
                        // Never reached
                        System.out.println("After 1 second, " + Thread.currentThread().getName() + " locked obj1");
                    }
                }
            }
        }

        public static void main(String[] args) {
            // Thread names are kept as in the original so they match the jstack output.
            Thread t1 = new Thread(new DeadLock(true), "线程1");
            Thread t2 = new Thread(new DeadLock(false), "线程2");
            t1.start();
            t2.start();
        }
    }

What jstack shows

    Full thread dump Java HotSpot(TM) 64-Bit Server VM (25.171-b11 mixed mode):

    "DestroyJavaVM" #13 prio=5 os_prio=0 tid=0x0000000003866000 nid=0x2ffc waiting on condition [0x0000000000000000]
       java.lang.Thread.State: RUNNABLE

    "线程2" #12 prio=5 os_prio=0 tid=0x000000001e6b8000 nid=0x20e4 waiting for monitor entry [0x000000001f8bf000]
       java.lang.Thread.State: BLOCKED (on object monitor)
        at com.wp.security.springboot.DeadLock.run(DeadLock.java:42)
        - waiting to lock <0x000000076b47b980> (a java.lang.Object)
        - locked <0x000000076b47b990> (a java.lang.Object)
        at java.lang.Thread.run(Thread.java:748)

    "线程1" #11 prio=5 os_prio=0 tid=0x000000001eec8800 nid=0x11d8 waiting for monitor entry [0x000000001f7bf000]
       java.lang.Thread.State: BLOCKED (on object monitor)
        at com.wp.security.springboot.DeadLock.run(DeadLock.java:28)
        - waiting to lock <0x000000076b47b990> (a java.lang.Object)
        - locked <0x000000076b47b980> (a java.lang.Object)
        at java.lang.Thread.run(Thread.java:748)

    "Service Thread" #10 daemon prio=9 os_prio=0 tid=0x000000001e607000 nid=0x3888 runnable [0x0000000000000000]
       java.lang.Thread.State: RUNNABLE

    "C1 CompilerThread2" #9 daemon prio=9 os_prio=2 tid=0x000000001e57c800 nid=0x1a1c waiting on condition [0x0000000000000000]
       java.lang.Thread.State: RUNNABLE

    "C2 CompilerThread1" #8 daemon prio=9 os_prio=2 tid=0x000000001e56f000 nid=0x37b4 waiting on condition [0x0000000000000000]
       java.lang.Thread.State: RUNNABLE

    "C2 CompilerThread0" #7 daemon prio=9 os_prio=2 tid=0x000000001e56e800 nid=0x1eb0 waiting on condition [0x0000000000000000]
       java.lang.Thread.State: RUNNABLE

    "Monitor Ctrl-Break" #6 daemon prio=5 os_prio=0 tid=0x000000001e56a800 nid=0x2298 runnable [0x000000001e9be000]
       java.lang.Thread.State: RUNNABLE
        at java.net.SocketInputStream.socketRead0(Native Method)
        at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
        at java.net.SocketInputStream.read(SocketInputStream.java:171)
        at java.net.SocketInputStream.read(SocketInputStream.java:141)
        at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:284)
        at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:326)
        at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:178)
        - locked <0x000000076b4cf910> (a java.io.InputStreamReader)
        at java.io.InputStreamReader.read(InputStreamReader.java:184)
        at java.io.BufferedReader.fill(BufferedReader.java:161)
        at java.io.BufferedReader.readLine(BufferedReader.java:324)
        - locked <0x000000076b4cf910> (a java.io.InputStreamReader)
        at java.io.BufferedReader.readLine(BufferedReader.java:389)
        at com.intellij.rt.execution.application.AppMainV2$1.run(AppMainV2.java:61)

    "Attach Listener" #5 daemon prio=5 os_prio=2 tid=0x000000001cf8a000 nid=0x1e84 waiting on condition [0x0000000000000000]
       java.lang.Thread.State: RUNNABLE

    "Signal Dispatcher" #4 daemon prio=9 os_prio=2 tid=0x000000001cf74000 nid=0x2330 runnable [0x0000000000000000]
       java.lang.Thread.State: RUNNABLE

    "Finalizer" #3 daemon prio=8 os_prio=1 tid=0x000000001cf4e800 nid=0x4168 in Object.wait() [0x000000001e2bf000]
       java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x000000076b208ed0> (a java.lang.ref.ReferenceQueue$Lock)
        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:143)
        - locked <0x000000076b208ed0> (a java.lang.ref.ReferenceQueue$Lock)
        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:164)
        at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:212)

    "Reference Handler" #2 daemon prio=10 os_prio=2 tid=0x0000000003956000 nid=0x3478 in Object.wait() [0x000000001e1bf000]
       java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x000000076b206bf8> (a java.lang.ref.Reference$Lock)
        at java.lang.Object.wait(Object.java:502)
        at java.lang.ref.Reference.tryHandlePending(Reference.java:191)
        - locked <0x000000076b206bf8> (a java.lang.ref.Reference$Lock)
        at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:153)

    "VM Thread" os_prio=2 tid=0x000000001cf27000 nid=0x47a4 runnable

    "GC task thread#0 (ParallelGC)" os_prio=0 tid=0x000000000387b800 nid=0x1ec8 runnable
    "GC task thread#1 (ParallelGC)" os_prio=0 tid=0x000000000387d000 nid=0x47a0 runnable
    "GC task thread#2 (ParallelGC)" os_prio=0 tid=0x000000000387e800 nid=0x3364 runnable
    "GC task thread#3 (ParallelGC)" os_prio=0 tid=0x0000000003881800 nid=0x4848 runnable

    "VM Periodic Task Thread" os_prio=2 tid=0x000000001e5e5800 nid=0x1318 waiting on condition

    JNI global references: 12

    Found one Java-level deadlock:
    =============================
    "线程2":
      waiting to lock monitor 0x000000001cf4b598 (object 0x000000076b47b980, a java.lang.Object),
      which is held by "线程1"
    "线程1":
      waiting to lock monitor 0x000000001cf4ded8 (object 0x000000076b47b990, a java.lang.Object),
      which is held by "线程2"

    Java stack information for the threads listed above:
    ===================================================
    "线程2":
        at com.wp.security.springboot.DeadLock.run(DeadLock.java:42)
        - waiting to lock <0x000000076b47b980> (a java.lang.Object)
        - locked <0x000000076b47b990> (a java.lang.Object)
        at java.lang.Thread.run(Thread.java:748)
    "线程1":
        at com.wp.security.springboot.DeadLock.run(DeadLock.java:28)
        - waiting to lock <0x000000076b47b990> (a java.lang.Object)
        - locked <0x000000076b47b980> (a java.lang.Object)
        at java.lang.Thread.run(Thread.java:748)

    Found 1 deadlock.

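Here the addresses after waiting to lock and locked on 线程1 (thread 1) and 线程2 (thread 2) make the cycle obvious at a glance: each thread is waiting for the object the other has locked. The end of the jstack output also says so explicitly: Found one Java-level deadlock.

Why jstack cannot detect the thread pool hang on its own

In the thread pool example the hang is not caused by threads holding locks, so it does not count as a deadlock. The deadlock example is unambiguous: two threads each hold one lock and wait for the lock the other holds, so it is a real deadlock, and jstack detects it.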
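
The same difference can be observed from inside a JVM with the standard `ThreadMXBean.findDeadlockedThreads()` call, which returns null when it finds no lock-ownership cycle. The sketch below is an illustration under assumptions: the class name `DeadlockProbe`, its helper methods and the wait times are made up for the demo, and the stuck threads are daemon threads only so that the demo JVM can exit.

```java
import java.lang.management.ManagementFactory;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class DeadlockProbe {
    // Daemon threads, so the permanently stuck demo threads do not keep the JVM alive.
    private static final ThreadFactory DAEMON = r -> {
        Thread t = new Thread(r);
        t.setDaemon(true);
        return t;
    };

    // Number of threads the JVM reports as deadlocked (0 if it finds none).
    static int deadlockedCount() {
        long[] ids = ManagementFactory.getThreadMXBean().findDeadlockedThreads();
        return ids == null ? 0 : ids.length;
    }

    // Sleeps first so the demo threads have time to block, then counts.
    static int deadlockedCountAfter(long waitMillis) {
        try {
            Thread.sleep(waitMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return deadlockedCount();
    }

    // The article's thread pool scenario: the only worker waits for a queued task.
    static void startPoolStarvation() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<Runnable>(), DAEMON);
        pool.submit(() -> {
            Future<String> inner = pool.submit(() -> "inner");
            return inner.get(); // never returns
        });
    }

    // Two threads that take two monitors in opposite order.
    static void startMonitorDeadlock() {
        Object a = new Object();
        Object b = new Object();
        CountDownLatch bothLocked = new CountDownLatch(2);
        DAEMON.newThread(() -> lockInOrder(a, b, bothLocked)).start();
        DAEMON.newThread(() -> lockInOrder(b, a, bothLocked)).start();
    }

    private static void lockInOrder(Object first, Object second, CountDownLatch bothLocked) {
        synchronized (first) {
            bothLocked.countDown();
            try {
                bothLocked.await(); // wait until the other thread holds its first lock too
            } catch (InterruptedException e) {
                return;
            }
            synchronized (second) {
                // never reached
            }
        }
    }

    public static void main(String[] args) {
        startPoolStarvation();
        System.out.println("pool starvation: " + deadlockedCountAfter(300) + " deadlocked threads");
        startMonitorDeadlock();
        System.out.println("monitor deadlock: " + deadlockedCountAfter(1000) + " deadlocked threads");
    }
}
```

On a HotSpot JVM the first line should report 0 threads and the second 2: the starved worker is stuck just as permanently, but no thread owns the thing it is waiting for, so there is no cycle for the JVM to report.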

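How to recognise a deadlock-like mutual wait

When something like this happens and jstack gives no hint, it is genuinely hard to find the problem just by analysing the business threads. I compared the thread dumps of the two examples and noticed the keywords waiting on, waiting to lock, parking to wait for and locked, then looked them up.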

  • waiting on condition: a conditional wait that is not Object.wait, for example after calling sleep or park
  • parking to wait for: the thread has called park
  • waiting to lock: the thread is waiting to acquire a lock object

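My guess is that jstack detects the deadlock in the deadlock example by matching waiting to lock against locked, that is, by following who owns which lock; that is a deadlock in the strict sense. The thread pool example shows parking to wait for a FutureTask instead. A FutureTask is not a lock held by some thread, so there is no ownership cycle to follow, and nothing is reported. A waiting on / locked pair with the same address, as on the Finalizer and Reference Handler threads, is just a normal Object.wait() and appears in both dumps, so it is not a sign of trouble. To find this kind of deadlock-like wait you therefore have to look beyond matching lock addresses: look for pool worker threads that are themselves parked waiting for a FutureTask, and check whether any thread in that pool is still free to run the task they are waiting for.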
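
One way to automate that check is to scan stack traces for the two frames that appear together on pool-1-thread-1 in the first dump: `ThreadPoolExecutor.runWorker` and `FutureTask.get`. The sketch below is a heuristic, not a detector: the class name `StarvedWorkerScan`, the thread name `starved-worker` and the polling helper are assumptions for the demo, and a worker found this way is only a candidate, since the future it waits for may well be running on another thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class StarvedWorkerScan {
    // Names of threads that are pool workers (runWorker on the stack)
    // and are currently inside FutureTask.get().
    static List<String> workersBlockedOnFuture() {
        List<String> names = new ArrayList<>();
        for (Map.Entry<Thread, StackTraceElement[]> e : Thread.getAllStackTraces().entrySet()) {
            boolean inWorker = false;
            boolean inFutureGet = false;
            for (StackTraceElement frame : e.getValue()) {
                String cls = frame.getClassName();
                String method = frame.getMethodName();
                if (cls.equals("java.util.concurrent.ThreadPoolExecutor") && method.equals("runWorker")) {
                    inWorker = true;
                }
                if (cls.equals("java.util.concurrent.FutureTask") && method.equals("get")) {
                    inFutureGet = true;
                }
            }
            if (inWorker && inFutureGet) {
                names.add(e.getKey().getName());
            }
        }
        return names;
    }

    // Polls until at least one candidate shows up or the timeout expires.
    static List<String> scanUntilFound(long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        List<String> found = workersBlockedOnFuture();
        while (found.isEmpty() && System.currentTimeMillis() < deadline) {
            try {
                Thread.sleep(20);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
            found = workersBlockedOnFuture();
        }
        return found;
    }

    // Reproduces the article's scenario on a daemon thread with a recognisable name.
    static void startStarvedPool() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<Runnable>(),
                r -> {
                    Thread t = new Thread(r, "starved-worker"); // hypothetical name
                    t.setDaemon(true);
                    return t;
                });
        pool.submit(() -> {
            Future<String> inner = pool.submit(() -> "inner");
            return inner.get(); // never returns
        });
    }

    public static void main(String[] args) {
        startStarvedPool();
        System.out.println("workers blocked on a Future: " + scanUntilFound(2000));
    }
}
```

The same idea works on a text dump: search for threads whose stack contains both frames and then ask whether the pool they belong to has any other worker left.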

Original link: https://my.oschina.net/MyoldTime/blog/4559573