When capacityFactor>0 nodes are re-added to the cluster, the system can't get back to a stable state

I have copied the issue below from the JBoss forum. We are facing the same issue.

Here is my scenario:

I have only one DIST_SYNC cache. Most of the JVMs in the cluster are configured with capacityFactor = 0 (like the distributedlocalstorage=false property of Coherence), and a few nodes are configured with capacityFactor > 0 (for instance, 1000). We are talking about 100 nodes with capacityFactor = 0 and 4 nodes of the other kind; the whole cluster is inside one single "site/rack". Partition handling is off and numOwners is 1.
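For readers who want to reproduce the setup, the configuration described above might look roughly like the following with Infinispan's programmatic API. This is only a sketch: the class and method names `SharedCacheConfigSketch` and `build` are made up for illustration, `numSegments(60)` is taken from the log further down, and `partitionHandling().enabled(false)` assumes an 8.x-era API (later versions configure partition handling differently). It needs infinispan-core on the classpath.

```java
import org.infinispan.configuration.cache.CacheMode;
import org.infinispan.configuration.cache.Configuration;
import org.infinispan.configuration.cache.ConfigurationBuilder;

public class SharedCacheConfigSketch {

    // Hypothetical helper: builds the cache configuration for one node.
    // A capacityFactor of 0 lets the node join the cache without owning data.
    static Configuration build(float capacityFactor) {
        return new ConfigurationBuilder()
            .clustering()
                .cacheMode(CacheMode.DIST_SYNC)
                .hash()
                    .numOwners(1)          // single owner per key, as in the question
                    .numSegments(60)       // value seen in the JOIN log line
                    .capacityFactor(capacityFactor)
                .partitionHandling()
                    .enabled(false)        // partition handling off (8.x-style API)
            .build();
    }

    public static void main(String[] args) {
        Configuration clientNode = build(0f);      // one of the ~100 non-storage nodes
        Configuration storageNode = build(1000f);  // one of the 4 storage nodes
        System.out.println(clientNode.clustering().hash().capacityFactor());
        System.out.println(storageNode.clustering().hash().capacityFactor());
    }
}
```

With numOwners = 1 and only four nodes holding data, losing all four at once leaves no node that can own any segment.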

When all the nodes with capacityFactor > 0 go down, the cluster enters a degraded state from which it cannot recover without a full cluster restart, even after those nodes are started again.

If I enable partition handling, AvailabilityExceptions start to be thrown, which I think is the expected behavior (Infinispan User Guide).

I think this is the problem and it is a bug:

14/11/17 09:27:25 WARN topology.CacheTopologyControlCommand: ISPN000071: Caught exception when handling command CacheTopologyControlCommand{cache=shared, type=JOIN, sender=testserver1@xxxxxxx-22311, site-id=xxx, rack-id=xxx, machine-id=24 bytes, joinInfo=CacheJoinInfo{consistentHashFactory=org.infinispan.distribution.ch.impl.TopologyAwareConsistentHashFactory@78b791ef, hashFunction=MurmurHash3, numSegments=60, numOwners=1, timeout=120000, totalOrder=false, distributed=true}, topologyId=0, rebalanceId=0, currentCH=null, pendingCH=null, availabilityMode=null, throwable=null, viewId=3}

java.lang.IllegalArgumentException: A cache topology's pending consistent hash must contain all the current consistent hash's members

    at org.infinispan.topology.CacheTopology.<init>(CacheTopology.java:48)

    at org.infinispan.topology.CacheTopology.<init>(CacheTopology.java:43)

    at org.infinispan.topology.ClusterCacheStatus.startQueuedRebalance(ClusterCacheStatus.java:631)

    at org.infinispan.topology.ClusterCacheStatus.queueRebalance(ClusterCacheStatus.java:85)

    at org.infinispan.partionhandling.impl.PreferAvailabilityStrategy.onJoin(PreferAvailabilityStrategy.java:22)

    at org.infinispan.topology.ClusterCacheStatus.doJoin(ClusterCacheStatus.java:540)

    at org.infinispan.topology.ClusterTopologyManagerImpl.handleJoin(ClusterTopologyManagerImpl.java:123)

    at org.infinispan.topology.CacheTopologyControlCommand.doPerform(CacheTopologyControlCommand.java:158)

    at org.infinispan.topology.CacheTopologyControlCommand.perform(CacheTopologyControlCommand.java:140)

    at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher$4.run(CommandAwareRpcDispatcher.java:278)

    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)

    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)

    at java.lang.Thread.run(Thread.java:745)
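To make the exception message easier to read, here is a small stand-alone sketch of the invariant it names: every member of the current consistent hash must also appear in the pending one. This is not Infinispan's code; the class `PendingChInvariantSketch`, its methods, and the node names are hypothetical. The scenario in `main` is an assumption about how the check could fail here: the coordinator still lists an old storage node as a current member, while the pending hash is built only from nodes that are present after the restart.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class PendingChInvariantSketch {

    // True when the pending member set is absent or contains every current member.
    public static boolean isValidPending(Set<String> currentMembers, Set<String> pendingMembers) {
        return pendingMembers == null || pendingMembers.containsAll(currentMembers);
    }

    // Hypothetical stand-in for the constructor check named in the stack trace.
    public static void checkTopology(Set<String> currentMembers, Set<String> pendingMembers) {
        if (!isValidPending(currentMembers, pendingMembers)) {
            throw new IllegalArgumentException(
                "A cache topology's pending consistent hash must contain all "
                + "the current consistent hash's members");
        }
    }

    public static void main(String[] args) {
        // Assumed state: the current hash still references a storage node that left.
        Set<String> current = new HashSet<>(Arrays.asList("storage-old"));
        // The restarted storage node joins under a new address.
        Set<String> pending = new HashSet<>(Arrays.asList("storage-new"));

        try {
            checkTopology(current, pending);
            System.out.println("topology accepted");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

If something like this happens on the coordinator, the JOIN is rejected and no new topology is installed, which would be consistent with writers timing out while waiting for a topology.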

After that error every "put" results in:

14/11/17 09:27:27 ERROR interceptors.InvocationContextInterceptor: ISPN000136: Execution error

org.infinispan.util.concurrent.TimeoutException: Timed out waiting for topology 1

    at org.infinispan.statetransfer.StateTransferLockImpl.waitForTransactionData(StateTransferLockImpl.java:93)

    at org.infinispan.interceptors.base.BaseStateTransferInterceptor.waitForTransactionData(BaseStateTransferInterceptor.java:96)

    at org.infinispan.statetransfer.StateTransferInterceptor.handleNonTxWriteCommand(StateTransferInterceptor.java:188)

    at org.infinispan.statetransfer.StateTransferInterceptor.visitPutKeyValueCommand(StateTransferInterceptor.java:95)

    at org.infinispan.commands.write.PutKeyValueCommand.acceptVisitor(PutKeyValueCommand.java:71)

    at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:98)

    at org.infinispan.interceptors.CacheMgmtInterceptor.updateStoreStatistics(CacheMgmtInterceptor.java:148)

    at org.infinispan.interceptors.CacheMgmtInterceptor.visitPutKeyValueCommand(CacheMgmtInterceptor.java:134)

    at org.infinispan.commands.write.PutKeyValueCommand.acceptVisitor(PutKeyValueCommand.java:71)

    at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:98)

    at org.infinispan.interceptors.InvocationContextInterceptor.handleAll(InvocationContextInterceptor.java:102)

    at org.infinispan.interceptors.InvocationContextInterceptor.handleDefault(InvocationContextInterceptor.java:71)

    at org.infinispan.commands.AbstractVisitor.visitPutKeyValueCommand(AbstractVisitor.java:35)

    at org.infinispan.commands.write.PutKeyValueCommand.acceptVisitor(PutKeyValueCommand.java:71)

    at org.infinispan.interceptors.InterceptorChain.invoke(InterceptorChain.java:333)

    at org.infinispan.cache.impl.CacheImpl.executeCommandAndCommitIfNeeded(CacheImpl.java:1576)

    at org.infinispan.cache.impl.CacheImpl.putInternal(CacheImpl.java:1054)

    at org.infinispan.cache.impl.CacheImpl.put(CacheImpl.java:1046)

    at org.infinispan.cache.impl.CacheImpl.put(CacheImpl.java:1646)

    at org.infinispan.cache.impl.CacheImpl.put(CacheImpl.java:245)

Kindly help us resolve this problem.

Healing answered 12/2, 2018 at 13:17 Comment(2)
What Infinispan version? I don't see your question in the Infinispan forum. Could you add your JBoss forum post link too? - Ratsbane
Infinispan version is 8.1.3 & Forum Link is developer.jboss.org/message/910446#910446 - Healing
