We are currently using GridGain Community Edition 8.8.10. We have set up the Ignite cluster in Kubernetes using the Ignite operator. The cluster consists of 2 server nodes with native persistence enabled, and we use thick clients to connect to it. The clients are deployed in the same Kubernetes cluster. The JVM options and the data region configuration of the cluster are as follows:
-DIGNITE_WAL_MMAP=false -DIGNITE_QUIET=false -Xms6g -Xmx6g -XX:+AlwaysPreTouch -XX:+UseG1GC -XX:+ScavengeBeforeFullGC -XX:+DisableExplicitGC
<bean class="org.apache.ignite.configuration.DataRegionConfiguration">
<property name="name" value="Knowledge_Region"/>
<!-- Memory region of 20 MB initial size. -->
<property name="initialSize" value="#{20 * 1024 * 1024}"/>
<!-- Maximum size is 9 GB -->
<property name="maxSize" value="#{9L * 1024 * 1024 * 1024}"/>
<!-- Enabling eviction for this memory region. -->
<property name="pageEvictionMode" value="RANDOM_2_LRU"/>
<property name="persistenceEnabled" value="true"/>
<!-- Enabling SEGMENTED_LRU page replacement for this region. -->
<property name="pageReplacementMode" value="SEGMENTED_LRU"/>
</bean>
We query the cache with an Ignite SQL string function (REGEXP_LIKE). The fields of the cache value class are as follows:
@QuerySqlField(index = true, inlineSize = 100)
private String value;
@QuerySqlField(name = "label", index = true, inlineSize = 100)
private String label;
@QuerySqlField(name = "type", index = true, inlineSize = 100)
@AffinityKeyMapped
private String type;
private String typeLabel;
private List<String> synonyms;
The SQL query we use to fetch the data is as follows:
select _key, _val from TESTCACHEVALUE USE INDEX(TESTCACHEVALUE_label_IDX) WHERE REGEXP_LIKE(label, 'unit.*s.*','i') LIMIT 8
The query plan that gets generated, as reported in the long-running query warning:
[05:04:56,613][WARNING][long-qry-#36][LongRunningQueryManager] Query execution is too long [duration=1124ms, type=MAP, distributedJoin=false, enforceJoinOrder=false, lazy=false, schema=staging_infrastructuretesting_business_object, sql='SELECT
"__Z0"."_KEY" AS "__C0_0",
"__Z0"."_VAL" AS "__C0_1"
FROM "staging_infrastructuretesting_business_object"."TESTCACHEVALUE" AS "__Z0" USE INDEX ("TESTCACHEVALUE_LABEL_IDX")
WHERE REGEXP_LIKE("__Z0"."LABEL", 'uni.*', 'i') FETCH FIRST 8 ROWS ONLY', plan=SELECT
__Z0._KEY AS __C0_0,
__Z0._VAL AS __C0_1
FROM staging_infrastructuretesting_business_object.TESTCACHEVALUE __Z0 USE INDEX (TESTCACHEVALUE_LABEL_IDX)
/* staging_infrastructuretesting_business_object.TESTCACHEVALUE.__SCAN_ */
/* scanCount: 289643 */
/* lookupCount: 1 */
WHERE REGEXP_LIKE(__Z0.LABEL, 'uni.*', 'i')
FETCH FIRST 8 ROWS ONLY
As the plan shows, the query performs a full scan (__SCAN_ with scanCount: 289643) and does not use the index specified in the USE INDEX hint.
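To illustrate why I suspect the index cannot help with this predicate, here is a minimal plain-Java sketch. It is not Ignite code. It assumes that REGEXP_LIKE behaves like an unanchored, case-insensitive java.util.regex match that has to be evaluated row by row, whereas a sorted index (modelled here by a TreeMap) can only serve range lookups such as a case-sensitive prefix. The class name, method names, and sample labels are made up for the illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;
import java.util.regex.Pattern;

public class IndexVsRegexSketch {

    /** Hypothetical sample labels, kept in sorted order like the entries of a tree index. */
    static TreeMap<String, Integer> sampleIndex() {
        TreeMap<String, Integer> index = new TreeMap<>();
        String[] labels = {"Business units", "Unit Sales", "Volume", "unit size", "units"};
        for (int i = 0; i < labels.length; i++) {
            index.put(labels[i], i);
        }
        return index;
    }

    /** Regex predicate: entries are visited one by one until the limit is reached (a scan). */
    static List<String> regexScan(TreeMap<String, Integer> index, String regex, int limit) {
        Pattern pattern = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
        List<String> hits = new ArrayList<>();
        for (String label : index.keySet()) {
            if (hits.size() == limit) break;
            // Assumption: unanchored match, so the pattern may start anywhere in the label.
            if (pattern.matcher(label).find()) hits.add(label);
        }
        return hits;
    }

    /** Case-sensitive prefix: the sorted structure can jump straight to the matching range. */
    static List<String> prefixRange(TreeMap<String, Integer> index, String prefix, int limit) {
        List<String> hits = new ArrayList<>();
        // Range [prefix, prefix + '\uffff') covers the keys that start with the prefix.
        for (String label : index.subMap(prefix, prefix + Character.MAX_VALUE).keySet()) {
            if (hits.size() == limit) break;
            hits.add(label);
        }
        return hits;
    }

    public static void main(String[] args) {
        TreeMap<String, Integer> index = sampleIndex();
        System.out.println(regexScan(index, "unit.*s.*", 8));
        System.out.println(prefixRange(index, "unit", 8));
    }
}
```

If this model is right, the regex matches are scattered across the whole sorted order (different letter case, match in the middle of the label), so no single index range contains them. A predicate such as label LIKE 'unit%' would map to a range, but it would not give the case-insensitive, match-anywhere behaviour of the regex.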
The cache contains 5 million objects.
The memory statistics of the cluster are as follows:
^-- Node [id=d87d1212, uptime=00:30:00.229]
^-- Cluster [hosts=6, CPUs=20, servers=2, clients=4, topVer=12, minorTopVer=25]
^-- Network [addrs=[10.57.5.10, 127.0.0.1], discoPort=47500, commPort=47100]
^-- CPU [CPUs=1, curLoad=16%, avgLoad=38.3%, GC=0%]
^-- Heap [used=4265MB, free=30.58%, comm=6144MB]
^-- Off-heap memory [used=4872MB, free=58.58%, allocated=11564MB]
^-- Page memory [pages=620072]
^-- sysMemPlc region [type=internal, persistence=true, lazyAlloc=false,
... initCfg=40MB, maxCfg=100MB, usedRam=0MB, freeRam=99.96%, allocRam=100MB, allocTotal=0MB]
^-- metastoreMemPlc region [type=internal, persistence=true, lazyAlloc=false,
... initCfg=40MB, maxCfg=100MB, usedRam=0MB, freeRam=99.87%, allocRam=0MB, allocTotal=0MB]
^-- TxLog region [type=internal, persistence=true, lazyAlloc=false,
... initCfg=40MB, maxCfg=100MB, usedRam=0MB, freeRam=100%, allocRam=100MB, allocTotal=0MB]
^-- volatileDsMemPlc region [type=internal, persistence=false, lazyAlloc=true,
... initCfg=40MB, maxCfg=100MB, usedRam=0MB, freeRam=100%, allocRam=0MB]
^-- Default_Region region [type=default, persistence=true, lazyAlloc=true,
... initCfg=20MB, maxCfg=9216MB, usedRam=4781MB, freeRam=48.12%, allocRam=9216MB, allocTotal=4753MB]
^-- Ignite persistence [used=4844MB]
^-- Outbound messages queue [size=0]
^-- Public thread pool [active=0, idle=0, qSize=0]
^-- System thread pool [active=0, idle=8, qSize=0]
^-- Striped thread pool [active=0, idle=8, qSize=0]
From the memory snapshot, it seems that the cluster has enough memory: about 30% of the heap and about 48% of Default_Region are still free.
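As a quick sanity check of how I read those numbers, the free percentages in the log can be reproduced from the used and total values. This is only a small arithmetic sketch: the class and method names are made up, and it assumes that the heap percentage is relative to the committed heap (comm) and the region percentage is relative to maxCfg.

```java
public class MemoryHeadroomSketch {

    /** Free share in percent, given the used and total capacity in MB. */
    static double freePercent(long usedMb, long totalMb) {
        return 100.0 * (totalMb - usedMb) / totalMb;
    }

    public static void main(String[] args) {
        // Values copied from the metrics log above.
        System.out.printf("Heap free: %.2f%%%n", freePercent(4265, 6144));
        System.out.printf("Default_Region free: %.2f%%%n", freePercent(4781, 9216));
    }
}
```

Both results agree with the logged free=30.58% and freeRam=48.12% to within rounding.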
What I have tried so far:
- Added an index hint (USE INDEX) to the query
- Applied a LIMIT to the query
- Used a PARTITIONED cache with queryParallelism set to 3
- Set skipReducerOnUpdate to true
- Set onheapCacheEnabled to true
I am not sure why the query takes this long. Please let me know if I have missed anything.
One more observation: according to the query execution log, the query takes around 2 seconds, but the client receives the response only after about 5 seconds.
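For clarity on what the client-side figure covers, a wall-clock measurement around the call can be sketched as below. This is a generic sketch, not our actual client code: the class name, the Timed holder, and the Supplier standing in for the real query call are all hypothetical. A measurement taken this way includes everything the client waits for, such as network transfer and deserialization of the returned _val objects, in addition to the server-side execution time.

```java
import java.util.function.Supplier;

public class ClientTimingSketch {

    /** Result of a timed call: the returned value plus the elapsed wall-clock milliseconds. */
    static final class Timed<T> {
        final T value;
        final long elapsedMs;

        Timed(T value, long elapsedMs) {
            this.value = value;
            this.elapsedMs = elapsedMs;
        }
    }

    /** Runs the call once and measures how long the caller waited for the result. */
    static <T> Timed<T> time(Supplier<T> call) {
        long start = System.nanoTime();
        T value = call.get(); // placeholder for the real query call on the client
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        return new Timed<>(value, elapsedMs);
    }

    public static void main(String[] args) {
        // Dummy workload instead of a real query, just to show the usage.
        Timed<Integer> timed = time(() -> 8);
        System.out.println("rows=" + timed.value + ", elapsedMs=" + timed.elapsedMs);
    }
}
```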
Thanks in advance.