您现在的位置是:首页 > 文章详情

APPARENT DEADLOCK!!! - C3P0连接池DeadLock机制分析

日期:2018-05-21点击:472

1 问题

近期,刚上线不久的生产系统的数据库连接池 C3P0版本为0.9.5.2)突然报出 APPARENT DEADLOCK!!! 错误。

1.1 错误日志

错误日志如下。

com.mchange.v2.async.ThreadPoolAsynchronousRunner$DeadlockDetector@7cf60134 -- APPARENT DEADLOCK!!! Creating emergency threads for unassigned pending tasks! com.mchange.v2.async.ThreadPoolAsynchronousRunner$DeadlockDetector@7cf60134 -- APPARENT DEADLOCK!!! Complete Status: Managed Threads: 3 Active Threads: 3 Active Tasks: com.mchange.v2.resourcepool.BasicResourcePool$AsyncTestIdleResourceTask@3ce7f8aa on thread: C3P0PooledConnectionPoolManager[identityToken->1hgepsj9u1w3a4je54sly0|70165722]-HelperThread-#1 com.mchange.v2.resourcepool.BasicResourcePool$AsyncTestIdleResourceTask@17f3c7a1 on thread: C3P0PooledConnectionPoolManager[identityToken->1hgepsj9u1w3a4je54sly0|70165722]-HelperThread-#2 com.mchange.v2.resourcepool.BasicResourcePool$AsyncTestIdleResourceTask@65dc473c on thread: C3P0PooledConnectionPoolManager[identityToken->1hgepsj9u1w3a4je54sly0|70165722]-HelperThread-#0 Pending Tasks: com.mchange.v2.resourcepool.BasicResourcePool$1DestroyResourceTask@744937e8 com.mchange.v2.resourcepool.BasicResourcePool$1DestroyResourceTask@77e8b363 com.mchange.v2.resourcepool.BasicResourcePool$1DestroyResourceTask@7eafacf0 com.mchange.v2.resourcepool.BasicResourcePool$ScatteredAcquireTask@79cde1be com.mchange.v2.resourcepool.BasicResourcePool$ScatteredAcquireTask@44067f18 com.mchange.v2.resourcepool.BasicResourcePool$ScatteredAcquireTask@76c7bf50 Pool thread stack traces: Thread[C3P0PooledConnectionPoolManager[identityToken->1hgepsj9u1w3a4je54sly0|70165722]-HelperThread-#0,5,main] java.net.SocketInputStream.socketRead0(Native Method) java.net.SocketInputStream.socketRead(SocketInputStream.java:116) java.net.SocketInputStream.read(SocketInputStream.java:171) java.net.SocketInputStream.read(SocketInputStream.java:141) oracle.net.ns.Packet.receive(Unknown Source) oracle.net.ns.DataPacket.receive(Unknown Source) oracle.net.ns.NetInputStream.getNextPacket(Unknown Source) oracle.net.ns.NetInputStream.read(Unknown Source) oracle.net.ns.NetInputStream.read(Unknown Source) oracle.net.ns.NetInputStream.read(Unknown Source) oracle.jdbc.driver.T4CMAREngine.unmarshalUB1(T4CMAREngine.java:1099) oracle.jdbc.driver.T4CMAREngine.unmarshalSB1(T4CMAREngine.java:1070) oracle.jdbc.driver.T4C8Oall.receive(T4C8Oall.java:478) oracle.jdbc.driver.T4CPreparedStatement.doOall8(T4CPreparedStatement.java:213) oracle.jdbc.driver.T4CPreparedStatement.executeForDescribe(T4CPreparedStatement.java:796) oracle.jdbc.driver.OracleStatement.executeMaybeDescribe(OracleStatement.java:1031) oracle.jdbc.driver.T4CPreparedStatement.executeMaybeDescribe(T4CPreparedStatement.java:836) oracle.jdbc.driver.OracleStatement.doExecuteWithTimeout(OracleStatement.java:1124) oracle.jdbc.driver.OraclePreparedStatement.executeInternal(OraclePreparedStatement.java:3285) oracle.jdbc.driver.OraclePreparedStatement.executeQuery(OraclePreparedStatement.java:3329) oracle.jdbc.OracleDatabaseMetaData.getTables(OracleDatabaseMetaData.java:2471) com.mchange.v2.c3p0.impl.DefaultConnectionTester$1.activeCheckConnectionNoQuery(DefaultConnectionTester.java:84) com.mchange.v2.c3p0.impl.DefaultConnectionTester$3.activeCheckConnectionNoQuery(DefaultConnectionTester.java:182) com.mchange.v2.c3p0.impl.DefaultConnectionTester.activeCheckConnection(DefaultConnectionTester.java:275) com.mchange.v2.c3p0.AbstractConnectionTester.activeCheckConnection(AbstractConnectionTester.java:79) com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool$1PooledConnectionResourcePoolManager.testPooledConnection(C3P0PooledConnectionPool.java:504) com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool$1PooledConnectionResourcePoolManager.testPooledConnection(C3P0PooledConnectionPool.java:464) com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool$1PooledConnectionResourcePoolManager.refurbishIdleResource(C3P0PooledConnectionPool.java:436) com.mchange.v2.resourcepool.BasicResourcePool$AsyncTestIdleResourceTask.run(BasicResourcePool.java:2211) com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread.run(ThreadPoolAsynchronousRunner.java:696) Thread[C3P0PooledConnectionPoolManager[identityToken->1hgepsj9u1w3a4je54sly0|70165722]-HelperThread-#1,5,main] java.net.SocketInputStream.socketRead0(Native Method) java.net.SocketInputStream.socketRead(SocketInputStream.java:116) java.net.SocketInputStream.read(SocketInputStream.java:171) java.net.SocketInputStream.read(SocketInputStream.java:141) oracle.net.ns.Packet.receive(Unknown Source) oracle.net.ns.DataPacket.receive(Unknown Source) oracle.net.ns.NetInputStream.getNextPacket(Unknown Source) oracle.net.ns.NetInputStream.read(Unknown Source) oracle.net.ns.NetInputStream.read(Unknown Source) oracle.net.ns.NetInputStream.read(Unknown Source) oracle.jdbc.driver.T4CMAREngine.unmarshalUB1(T4CMAREngine.java:1099) oracle.jdbc.driver.T4CMAREngine.unmarshalSB1(T4CMAREngine.java:1070) oracle.jdbc.driver.T4C8Oall.receive(T4C8Oall.java:478) oracle.jdbc.driver.T4CPreparedStatement.doOall8(T4CPreparedStatement.java:213) oracle.jdbc.driver.T4CPreparedStatement.executeForDescribe(T4CPreparedStatement.java:796) oracle.jdbc.driver.OracleStatement.executeMaybeDescribe(OracleStatement.java:1031) oracle.jdbc.driver.T4CPreparedStatement.executeMaybeDescribe(T4CPreparedStatement.java:836) oracle.jdbc.driver.OracleStatement.doExecuteWithTimeout(OracleStatement.java:1124) oracle.jdbc.driver.OraclePreparedStatement.executeInternal(OraclePreparedStatement.java:3285) oracle.jdbc.driver.OraclePreparedStatement.executeQuery(OraclePreparedStatement.java:3329) oracle.jdbc.OracleDatabaseMetaData.getTables(OracleDatabaseMetaData.java:2471) com.mchange.v2.c3p0.impl.DefaultConnectionTester$1.activeCheckConnectionNoQuery(DefaultConnectionTester.java:84) com.mchange.v2.c3p0.impl.DefaultConnectionTester$3.activeCheckConnectionNoQuery(DefaultConnectionTester.java:182) com.mchange.v2.c3p0.impl.DefaultConnectionTester.activeCheckConnection(DefaultConnectionTester.java:275) com.mchange.v2.c3p0.AbstractConnectionTester.activeCheckConnection(AbstractConnectionTester.java:79) com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool$1PooledConnectionResourcePoolManager.testPooledConnection(C3P0PooledConnectionPool.java:504) com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool$1PooledConnectionResourcePoolManager.testPooledConnection(C3P0PooledConnectionPool.java:464) com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool$1PooledConnectionResourcePoolManager.refurbishIdleResource(C3P0PooledConnectionPool.java:436) com.mchange.v2.resourcepool.BasicResourcePool$AsyncTestIdleResourceTask.run(BasicResourcePool.java:2211) com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread.run(ThreadPoolAsynchronousRunner.java:696) Thread[C3P0PooledConnectionPoolManager[identityToken->1hgepsj9u1w3a4je54sly0|70165722]-HelperThread-#2,5,main] java.net.SocketInputStream.socketRead0(Native Method) java.net.SocketInputStream.socketRead(SocketInputStream.java:116) java.net.SocketInputStream.read(SocketInputStream.java:171) java.net.SocketInputStream.read(SocketInputStream.java:141) oracle.net.ns.Packet.receive(Unknown Source) oracle.net.ns.DataPacket.receive(Unknown Source) oracle.net.ns.NetInputStream.getNextPacket(Unknown Source) oracle.net.ns.NetInputStream.read(Unknown Source) oracle.net.ns.NetInputStream.read(Unknown Source) oracle.net.ns.NetInputStream.read(Unknown Source) oracle.jdbc.driver.T4CMAREngine.unmarshalUB1(T4CMAREngine.java:1099) oracle.jdbc.driver.T4CMAREngine.unmarshalSB1(T4CMAREngine.java:1070) oracle.jdbc.driver.T4C8Oall.receive(T4C8Oall.java:478) oracle.jdbc.driver.T4CPreparedStatement.doOall8(T4CPreparedStatement.java:213) oracle.jdbc.driver.T4CPreparedStatement.executeForDescribe(T4CPreparedStatement.java:796) oracle.jdbc.driver.OracleStatement.executeMaybeDescribe(OracleStatement.java:1031) oracle.jdbc.driver.T4CPreparedStatement.executeMaybeDescribe(T4CPreparedStatement.java:836) oracle.jdbc.driver.OracleStatement.doExecuteWithTimeout(OracleStatement.java:1124) oracle.jdbc.driver.OraclePreparedStatement.executeInternal(OraclePreparedStatement.java:3285) oracle.jdbc.driver.OraclePreparedStatement.executeQuery(OraclePreparedStatement.java:3329) oracle.jdbc.OracleDatabaseMetaData.getTables(OracleDatabaseMetaData.java:2471) com.mchange.v2.c3p0.impl.DefaultConnectionTester$1.activeCheckConnectionNoQuery(DefaultConnectionTester.java:84) com.mchange.v2.c3p0.impl.DefaultConnectionTester$3.activeCheckConnectionNoQuery(DefaultConnectionTester.java:182) com.mchange.v2.c3p0.impl.DefaultConnectionTester.activeCheckConnection(DefaultConnectionTester.java:275) com.mchange.v2.c3p0.AbstractConnectionTester.activeCheckConnection(AbstractConnectionTester.java:79) com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool$1PooledConnectionResourcePoolManager.testPooledConnection(C3P0PooledConnectionPool.java:504) com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool$1PooledConnectionResourcePoolManager.testPooledConnection(C3P0PooledConnectionPool.java:464) com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool$1PooledConnectionResourcePoolManager.refurbishIdleResource(C3P0PooledConnectionPool.java:436) com.mchange.v2.resourcepool.BasicResourcePool$AsyncTestIdleResourceTask.run(BasicResourcePool.java:2211) com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread.run(ThreadPoolAsynchronousRunner.java:696) Task com.mchange.v2.resourcepool.BasicResourcePool$AsyncTestIdleResourceTask@65dc473c (in deadlocked PoolThread) failed to complete in maximum time 60000ms. Trying interrupt(). Task com.mchange.v2.resourcepool.BasicResourcePool$AsyncTestIdleResourceTask@3ce7f8aa (in deadlocked PoolThread) failed to complete in maximum time 60000ms. Trying interrupt(). Task com.mchange.v2.resourcepool.BasicResourcePool$AsyncTestIdleResourceTask@17f3c7a1 (in deadlocked PoolThread) failed to complete in maximum time 60000ms. Trying interrupt(). 

上面的日志报出了 APPARENT DEADLOCK!!! 错误,然后输出了
ThreadPoolAsynchronousRunner 的状态,包括线程数,当前正在处理的任务,队列中等待处理的任务,以及 ThreadPoolAsynchronousRunner 任务线程的堆栈。

1.2 配置

生产上的配置如下。

<bean id="dataSource" class="com.mchange.v2.c3p0.ComboPooledDataSource" > <property name="driverClass" value="oracle.jdbc.driver.OracleDriver"></property> <property name="jdbcUrl" value="${datasource.jdbcUrl}"></property> <property name="user" value="${datasource.user}"></property> <property name="password" value="${datasource.pass}"></property> <property name="maxPoolSize" value="${datasource.maxPoolSize}"></property> <property name="maxIdleTime" value="180"></property> <property name="idleConnectionTestPeriod" value="60"></property> <property name="checkoutTimeout" value="30000"></property> </bean> 

基本相当于使用默认配置。

1.3 初步判断

根据日志,C3P0判断其线程池(ThreadPoolAsynchronousRunner)中的任务出现死锁(DEADLOCK),此时线程处于检查空闲数据链接状态是否可用。单从日志字面暂时无法判断问题所在。

查阅了网上的资料,对此问题原因分析较少,绝大部分不甚了了。比较多的文章是尝试通过调整参数来避免问题的出现的帖子。

秉着 “知其所以然” 的态度,再加上一点点的好奇心,我翻阅了一下C3P0的代码,试图从中发现端倪。

2 C3P0的死锁(DeadLock)机制分析

2.1 C3P0类结构

为了后面展示代码的需要,先附上一个我整理的C3P0的类结构图,有助于我们分析问题时确定问题的位置。如果下面的图太小,可以点击这里浏览

img_6c6d11cfcbba0dc650efba5aa18b48d7.png
c3p0类结构图

我们重点关注右下角的几个类。

2.2 ThreadPoolAsynchronousRunner线程池

C3P0PooledConnectionPoolManager 类在初始化时中创建了一个taskRunner,其类型为 ThreadPoolAsynchronousRunner , 是一个 C3P0 自定实现的线程池。

public final class C3P0PooledConnectionPoolManager{ ... ThreadPoolAsynchronousRunner taskRunner; ... private synchronized void _poolsInit() { ... this.taskRunner = createTaskRunner( num_task_threads, matt, timer, idStr + "-HelperThread" ); ... } ... private ThreadPoolAsynchronousRunner createTaskRunner( int num_threads, int matt /* maxAdministrativeTaskTime */, Timer timer, String threadLabel ) { ... out = new ThreadPoolAsynchronousRunner( num_threads, true, timer, threadLabel ); return out; ... } ... } 

ThreadPoolAsynchronousRunner 线程池被作为成员注入到多个对象之中,最终用于执行 BasicResourcePool 类中定义的 ScatteredAcquireTaskAcquireTaskRemoveTaskDestroyResourceTaskRefurbishCheckinResourceTaskAsyncTestIdleResourceTask 等任务。

我们来看一下 ThreadPoolAsynchronousRunner 的代码。

public final class ThreadPoolAsynchronousRunner implements AsynchronousRunner { ... TimerTask deadlockDetector = new DeadlockDetector(); ... public ThreadPoolAsynchronousRunner( int num_threads, boolean daemon, Timer sharedTimer, String threadLabel ) { this( num_threads, daemon, DFLT_MAX_INDIVIDUAL_TASK_TIME, DFLT_DEADLOCK_DETECTOR_INTERVAL, DFLT_INTERRUPT_DELAY_AFTER_APPARENT_DEADLOCK, sharedTimer, false, threadLabel ); } ... private ThreadPoolAsynchronousRunner( int num_threads, boolean daemon, int max_individual_task_time, int deadlock_detector_interval, int interrupt_delay_after_apparent_deadlock, Timer myTimer, boolean should_cancel_timer, String threadLabel ) { this.num_threads = num_threads; this.daemon = daemon; this.max_individual_task_time = max_individual_task_time; this.deadlock_detector_interval = deadlock_detector_interval; this.interrupt_delay_after_apparent_deadlock = interrupt_delay_after_apparent_deadlock; this.myTimer = myTimer; this.should_cancel_timer = should_cancel_timer; this.threadLabel = threadLabel; recreateThreadsAndTasks(); myTimer.schedule( deadlockDetector, deadlock_detector_interval, deadlock_detector_interval ); } ... } 

可以看到, ThreadPoolAsynchronousRunner 线程池默认创建3个执行线程,并且创建了一个 死锁检测(DeadLockDetector) 的定时任务,默认情况下每10秒执行一次。

2.3 DeadLockDetector内部机制

public final class ThreadPoolAsynchronousRunner implements AsynchronousRunner { ... class DeadlockDetector extends TimerTask { LinkedList last = null; LinkedList current = null; public void run() { ... current = (LinkedList) pendingTasks.clone(); ... if ( current.equals( last ) ) { //System.err.println(this + " -- APPARENT DEADLOCK!!! Creating emergency threads for unassigned pending tasks!"); ... recreateThreadsAndTasks(); run_stray_tasks = true; } ... if (run_stray_tasks) { AsynchronousRunner ar = new ThreadPerTaskAsynchronousRunner( DFLT_MAX_EMERGENCY_THREADS, max_individual_task_time ); for ( Iterator ii = current.iterator(); ii.hasNext(); ) ar.postRunnable( (Runnable) ii.next() ); ar.close( false ); //tell the emergency runner to close itself when its tasks are complete last = null; } else last = current; ... 

可以看到,DeadLockDetector
ThreadPoolAsynchronousRunner 的内部类,是一个默认每10秒执行一次的定时任务。其工作原理是定时检查任务队列(pendingTasks)中的任务,如果前后两次检查发现待执行的任务没有变化,就认为可能产生了死锁,并额外创建线程执行等待任务。

3 问题分析

3.1 线程等待

从代码中可以看出,APPARENT DEADLOCK!!!报错原因主要是ThreadPoolAsynchronousRunner线程池的中的等待任务队列在经过一定时长(默认10秒)前后没有变化。

结合问题日志。我们可以初步判断问题的原因,是由于某些任务(如日志中显示AsyncTestIdleResourceTask)执行过程中失去响应(超过60s没有收到响应),导致后续的任务长时间等待,进而报错。

3.2 具体分析

进而观察执行线程的堆栈,可以发现三个任务都在执行检测连接有效性时,向服务端发送请求数据后,Oracle数据库未返回响应,导致线程长时间等待。

由于此问题无法通过构造场景重现,只能猜测,产生这种原因可能是连接在服务端已经超时,由服务端主动关闭连接,但由于TCP协议断开连接需要完成4次握手,而服务端只完成了前两次握手,这样就导致了客户端可以发送消息但接收不到任何内容,长时间等待直到超时。

3.3 解决方案

根据以上分析,我认为问题是maxIdleTime(连接最大空闲时间)与idleConnectionTestPeriod(空闲检测周期)参数配置与数据库不匹配引起的。

根据DBA的建议,将maxIdleTime调整为30秒,将idleConnectionTestPeriod调整为10秒。

4 后续

调整完后系统稳定运行,至今未再报出同样的错误。

在查看资料的过程中,发现有与我类似的情况,请看这里。问题出在同一块代码,但现象与原因均与我不同。我并不认为这是C3P0的缺陷,而是具体环境上的配置没有相匹配。

我再说一点废话,C3P0中大量使用了synchronized关键字进行加锁,由于单例没有使用double-check,导致正常代码中大量加锁,在高并发下效率很低。因此,如果性能对系统较为重要,还是推荐大家使用HikariCP替换C3P0。

原文链接:https://yq.aliyun.com/articles/654839
关注公众号

低调大师中文资讯倾力打造互联网数据资讯、行业资源、电子商务、移动互联网、网络营销平台。

持续更新报道IT业界、互联网、市场资讯、驱动更新,是最及时权威的产业资讯及硬件资讯报道平台。

转载内容版权归作者及来源网站所有,本站原创内容转载请注明来源。

文章评论

共有0条评论来说两句吧...

文章二维码

扫描即可查看该文章

点击排行

推荐阅读

最新文章