confirm #647
In version 1.0.3, when many requests hit the Hector pool and more than one node has a problem (the node is alive but does not send responses), I see the following:
1. The application has no socket connections.
I tried this command ==> netstat -na | grep | wc -l ==> the result is 0.
2. Threads that use the Hector pool become blocked, and if the application or Cassandra does not recover, many threads end up in TIMED_WAITING state.
3. Hector pool status: blocked: 400, active: 0, idle: 0
This is the thread dump. Almost all threads look like this:
"[CASSANDRA_JOB_WORKER-2]thread-91" prio=10 tid=0x0000000041e2a000 nid=0x7e01 waiting on condition [0x00007f56f6c3b000]
java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:196)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2025)
at java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:340)
at me.prettyprint.cassandra.connection.ConcurrentHClientPool.waitForConnection(ConcurrentHClientPool.java:114)
at me.prettyprint.cassandra.connection.ConcurrentHClientPool.borrowClient(ConcurrentHClientPool.java:82)
at me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:238)
at me.prettyprint.cassandra.service.KeyspaceServiceImpl.operateWithFailover(KeyspaceServiceImpl.java:131)
at me.prettyprint.cassandra.service.KeyspaceServiceImpl.getSlice(KeyspaceServiceImpl.java:289)
at me.prettyprint.cassandra.model.thrift.ThriftSliceQuery$1.doInKeyspace(ThriftSliceQuery.java:53)
at me.prettyprint.cassandra.model.thrift.ThriftSliceQuery$1.doInKeyspace(ThriftSliceQuery.java:49)
at me.prettyprint.cassandra.model.KeyspaceOperationCallback.doInKeyspaceAndMeasure(KeyspaceOperationCallback.java:20)
at me.prettyprint.cassandra.model.ExecutingKeyspace.doExecute(ExecutingKeyspace.java:85)
at me.prettyprint.cassandra.model.thrift.ThriftSliceQuery.execute(ThriftSliceQuery.java:48)
at com.xxx.xxx.xxxxxx.cassandra.dao.xxxxxxxxx.xxxxxxxxxxxxxxxxxxDaoCassandra.select(xxxxxxxxxxxxxxxxxxDaoCassandra.java:181)
at com.xxx.xxx.xxxxxxx.xxxxxxxxxxxxxxxxxxxxx.xxxxxxxxxxxxxxxxxxxxxx(xxxxxxxxxxxxxxxxxx.java:449)
at com.xxx.xxx.xxxxx.xxxxxxxxxxxxx.xxxxxxxxxxxxxxxxx(xxxxxxxxxxxxxxxx.java:463)
at com.xxx.xxx.xxxxx.xxxxxxxxxxxxx.access$3400(xxxxxxxxxxxxxxxx.java:97)
at com.xxx.xxx.xxxxx.xxxxxxxxxxxxx$xxxxxxxxxxxxxxxxxxxx.run(QueueManager.java:1506)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
"[CASSANDRA_JOB_WORKER-2]thread-90" prio=10 tid=0x0000000041e2a000 nid=0x7e01 waiting on condition [0x00007f56f6c3b000]
java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:196)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2025)
at java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:340)
at me.prettyprint.cassandra.connection.ConcurrentHClientPool.waitForConnection(ConcurrentHClientPool.java:114)
at me.prettyprint.cassandra.connection.ConcurrentHClientPool.borrowClient(ConcurrentHClientPool.java:82)
at me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:238)
at me.prettyprint.cassandra.service.KeyspaceServiceImpl.operateWithFailover(KeyspaceServiceImpl.java:131)
at me.prettyprint.cassandra.service.KeyspaceServiceImpl.getSlice(KeyspaceServiceImpl.java:289)
at me.prettyprint.cassandra.model.thrift.ThriftSliceQuery$1.doInKeyspace(ThriftSliceQuery.java:53)
at me.prettyprint.cassandra.model.thrift.ThriftSliceQuery$1.doInKeyspace(ThriftSliceQuery.java:49)
at me.prettyprint.cassandra.model.KeyspaceOperationCallback.doInKeyspaceAndMeasure(KeyspaceOperationCallback.java:20)
at me.prettyprint.cassandra.model.ExecutingKeyspace.doExecute(ExecutingKeyspace.java:85)
at me.prettyprint.cassandra.model.thrift.ThriftSliceQuery.execute(ThriftSliceQuery.java:48)
at com.xxx.xxx.xxxxxx.cassandra.dao.xxxxxxxxx.xxxxxxxxxxxxxxxxxxDaoCassandra.select(xxxxxxxxxxxxxxxxxxDaoCassandra.java:181)
at com.xxx.xxx.xxxxxxx.xxxxxxxxxxxxxxxxxxxxx.xxxxxxxxxxxxxxxxxxxxxx(xxxxxxxxxxxxxxxxxx.java:449)
at com.xxx.xxx.xxxxx.xxxxxxxxxxxxx.xxxxxxxxxxxxxxxxx(xxxxxxxxxxxxxxxx.java:463)
at com.xxx.xxx.xxxxx.xxxxxxxxxxxxx.access$3400(xxxxxxxxxxxxxxxx.java:97)
at com.xxx.xxx.xxxxx.xxxxxxxxxxxxx$xxxxxxxxxxxxxxxxxxxx.run(QueueManager.java:1506)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
"[CASSANDRA_JOB_WORKER-2]thread-89" prio=10 tid=0x0000000041e2a000 nid=0x7e01 waiting on condition [0x00007f56f6c3b000]
java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:196)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2025)
at java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:340)
at me.prettyprint.cassandra.connection.ConcurrentHClientPool.waitForConnection(ConcurrentHClientPool.java:114)
at me.prettyprint.cassandra.connection.ConcurrentHClientPool.borrowClient(ConcurrentHClientPool.java:82)
at me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:238)
at me.prettyprint.cassandra.service.KeyspaceServiceImpl.operateWithFailover(KeyspaceServiceImpl.java:131)
at me.prettyprint.cassandra.service.KeyspaceServiceImpl.getSlice(KeyspaceServiceImpl.java:289)
at me.prettyprint.cassandra.model.thrift.ThriftSliceQuery$1.doInKeyspace(ThriftSliceQuery.java:53)
at me.prettyprint.cassandra.model.thrift.ThriftSliceQuery$1.doInKeyspace(ThriftSliceQuery.java:49)
at me.prettyprint.cassandra.model.KeyspaceOperationCallback.doInKeyspaceAndMeasure(KeyspaceOperationCallback.java:20)
at me.prettyprint.cassandra.model.ExecutingKeyspace.doExecute(ExecutingKeyspace.java:85)
at me.prettyprint.cassandra.model.thrift.ThriftSliceQuery.execute(ThriftSliceQuery.java:48)
at com.xxx.xxx.xxxxxx.cassandra.dao.xxxxxxxxx.xxxxxxxxxxxxxxxxxxDaoCassandra.select(xxxxxxxxxxxxxxxxxxDaoCassandra.java:181)
at com.xxx.xxx.xxxxxxx.xxxxxxxxxxxxxxxxxxxxx.xxxxxxxxxxxxxxxxxxxxxx(xxxxxxxxxxxxxxxxxx.java:449)
at com.xxx.xxx.xxxxx.xxxxxxxxxxxxx.xxxxxxxxxxxxxxxxx(xxxxxxxxxxxxxxxx.java:463)
at com.xxx.xxx.xxxxx.xxxxxxxxxxxxx.access$3400(xxxxxxxxxxxxxxxx.java:97)
at com.xxx.xxx.xxxxx.xxxxxxxxxxxxx$xxxxxxxxxxxxxxxxxxxx.run(QueueManager.java:1506)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
"[CASSANDRA_JOB_WORKER-2]thread-87" prio=10 tid=0x0000000041e2a000 nid=0x7e01 waiting on condition [0x00007f56f6c3b000]
java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:196)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2025)
at java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:340)
at me.prettyprint.cassandra.connection.ConcurrentHClientPool.waitForConnection(ConcurrentHClientPool.java:114)
at me.prettyprint.cassandra.connection.ConcurrentHClientPool.borrowClient(ConcurrentHClientPool.java:82)
at me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:238)
at me.prettyprint.cassandra.service.KeyspaceServiceImpl.operateWithFailover(KeyspaceServiceImpl.java:131)
at me.prettyprint.cassandra.service.KeyspaceServiceImpl.getSlice(KeyspaceServiceImpl.java:289)
at me.prettyprint.cassandra.model.thrift.ThriftSliceQuery$1.doInKeyspace(ThriftSliceQuery.java:53)
at me.prettyprint.cassandra.model.thrift.ThriftSliceQuery$1.doInKeyspace(ThriftSliceQuery.java:49)
at me.prettyprint.cassandra.model.KeyspaceOperationCallback.doInKeyspaceAndMeasure(KeyspaceOperationCallback.java:20)
at me.prettyprint.cassandra.model.ExecutingKeyspace.doExecute(ExecutingKeyspace.java:85)
at me.prettyprint.cassandra.model.thrift.ThriftSliceQuery.execute(ThriftSliceQuery.java:48)
at com.xxx.xxx.xxxxxx.cassandra.dao.xxxxxxxxx.xxxxxxxxxxxxxxxxxxDaoCassandra.select(xxxxxxxxxxxxxxxxxxDaoCassandra.java:181)
at com.xxx.xxx.xxxxxxx.xxxxxxxxxxxxxxxxxxxxx.xxxxxxxxxxxxxxxxxxxxxx(xxxxxxxxxxxxxxxxxx.java:449)
at com.xxx.xxx.xxxxx.xxxxxxxxxxxxx.xxxxxxxxxxxxxxxxx(xxxxxxxxxxxxxxxx.java:463)
at com.xxx.xxx.xxxxx.xxxxxxxxxxxxx.access$3400(xxxxxxxxxxxxxxxx.java:97)
at com.xxx.xxx.xxxxx.xxxxxxxxxxxxx$xxxxxxxxxxxxxxxxxxxx.run(QueueManager.java:1506)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
"[CASSANDRA_JOB_WORKER-2]thread-86" prio=10 tid=0x0000000041e2a000 nid=0x7e01 waiting on condition [0x00007f56f6c3b000]
java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:196)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2025)
at java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:340)
at me.prettyprint.cassandra.connection.ConcurrentHClientPool.waitForConnection(ConcurrentHClientPool.java:114)
at me.prettyprint.cassandra.connection.ConcurrentHClientPool.borrowClient(ConcurrentHClientPool.java:82)
at me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:238)
at me.prettyprint.cassandra.service.KeyspaceServiceImpl.operateWithFailover(KeyspaceServiceImpl.java:131)
at me.prettyprint.cassandra.service.KeyspaceServiceImpl.getSlice(KeyspaceServiceImpl.java:289)
at me.prettyprint.cassandra.model.thrift.ThriftSliceQuery$1.doInKeyspace(ThriftSliceQuery.java:53)
at me.prettyprint.cassandra.model.thrift.ThriftSliceQuery$1.doInKeyspace(ThriftSliceQuery.java:49)
at me.prettyprint.cassandra.model.KeyspaceOperationCallback.doInKeyspaceAndMeasure(KeyspaceOperationCallback.java:20)
at me.prettyprint.cassandra.model.ExecutingKeyspace.doExecute(ExecutingKeyspace.java:85)
at me.prettyprint.cassandra.model.thrift.ThriftSliceQuery.execute(ThriftSliceQuery.java:48)
at com.xxx.xxx.xxxxxx.cassandra.dao.xxxxxxxxx.xxxxxxxxxxxxxxxxxxDaoCassandra.select(xxxxxxxxxxxxxxxxxxDaoCassandra.java:181)
at com.xxx.xxx.xxxxxxx.xxxxxxxxxxxxxxxxxxxxx.xxxxxxxxxxxxxxxxxxxxxx(xxxxxxxxxxxxxxxxxx.java:449)
at com.xxx.xxx.xxxxx.xxxxxxxxxxxxx.xxxxxxxxxxxxxxxxx(xxxxxxxxxxxxxxxx.java:463)
at com.xxx.xxx.xxxxx.xxxxxxxxxxxxx.access$3400(xxxxxxxxxxxxxxxx.java:97)
at com.xxx.xxx.xxxxx.xxxxxxxxxxxxx$xxxxxxxxxxxxxxxxxxxx.run(QueueManager.java:1506)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
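To illustrate what the frames above show: every worker is parked inside a timed ArrayBlockingQueue.poll, which is how a thread looks while it waits for an element that never arrives. The following is a minimal standalone sketch of that state, not Hector code. The class name EmptyPoolWait, the queue name available, and the 5 second timeout are assumptions made up for the demonstration.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.TimeUnit;

public class EmptyPoolWait {

    /**
     * Starts one "borrower" thread that does a timed poll on an empty queue
     * and returns the thread state observed while it is waiting.
     */
    public static Thread.State stateWhileBorrowing() {
        // Hypothetical stand-in for a pool's queue of available clients; it stays empty.
        final ArrayBlockingQueue<Object> available = new ArrayBlockingQueue<>(1);

        Thread borrower = new Thread(() -> {
            try {
                // Timed wait: the thread parks until an element arrives or the timeout expires.
                available.poll(5, TimeUnit.SECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        borrower.start();

        try {
            // Give the borrower up to 2 seconds to reach the parked state.
            long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(2);
            Thread.State state = borrower.getState();
            while (state != Thread.State.TIMED_WAITING && System.nanoTime() < deadline) {
                Thread.sleep(10);
                state = borrower.getState();
            }
            // Stop the borrower early instead of waiting for the full timeout.
            borrower.interrupt();
            borrower.join();
            return state;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while observing the borrower", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(stateWhileBorrowing());
    }
}
```

With nothing ever placed in the queue, each caller simply waits out its timeout, which would match a pool that reports active: 0 and idle: 0 while the blocked count keeps growing.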
So I guess that my application lost its socket connections and, for whatever reason, did not re-establish them.
So I am thinking: how about adding a new routing policy? The criterion would be blocked / client count.
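The proposed criterion could be sketched roughly as follows. This is a standalone illustration, not Hector's load balancing API: the PoolStats type, its fields, and the select method are hypothetical names, and "client count" is assumed here to mean the configured pool size per host.

```java
import java.util.Arrays;
import java.util.List;

public class BlockedRatioPolicy {

    /** Hypothetical snapshot of one host's pool; not a Hector type. */
    public static final class PoolStats {
        public final String host;
        public final int blocked;  // threads currently waiting for a client
        public final int clients;  // assumed: configured pool size for the host

        public PoolStats(String host, int blocked, int clients) {
            this.host = host;
            this.blocked = blocked;
            this.clients = clients;
        }
    }

    /** Blocked threads per client; the divisor is clamped to 1 to avoid dividing by zero. */
    public static double blockedRatio(PoolStats pool) {
        return (double) pool.blocked / Math.max(1, pool.clients);
    }

    /** Returns the pool with the lowest blocked/client ratio, or null if the list is empty. */
    public static PoolStats select(List<PoolStats> pools) {
        PoolStats best = null;
        for (PoolStats pool : pools) {
            if (best == null || blockedRatio(pool) < blockedRatio(best)) {
                best = pool;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<PoolStats> pools = Arrays.asList(
                new PoolStats("node-a", 400, 50),  // stalled host: ratio 8.0
                new PoolStats("node-b", 3, 50));   // healthy host: ratio 0.06
        System.out.println(select(pools).host);    // prints node-b
    }
}
```

Under this rule a host whose pool has piled up blocked threads would be picked last, so new requests would go to hosts that are still answering.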