Cachedb is growing beyond the expected size

Information

 
Title: Cachedb is growing beyond the expected size
URL Name: Cachedb-is-growing-beyond-the-expected-size-000072153
Article Number: 000187211
Environment
Product: OpenEdge
Version: 11.6 and later
OS: All supported platforms
Other: OpenEdge Management, OrientDB
Question/Problem Description
Cachedb grows beyond the expected size and fills up the filesystem in environments with a large number of remote AdminServers managing many resources.
For example:

30 remote AdminServers with around 300 databases and AppServers. 

Java stack trace from com.orientechnologies.common.log.OLogManager reads:
 
[STDERR]                SEVERE: Error during WAL background flush
[STDERR]                java.io.IOException: No space left on device
[STDERR]                	at java.io.RandomAccessFile.writeBytes0(Native Method)
[STDERR]                	at java.io.RandomAccessFile.writeBytes(RandomAccessFile.java:520)
[STDERR]                	at java.io.RandomAccessFile.write(RandomAccessFile.java:537)
[STDERR]                	at com.orientechnologies.orient.core.storage.impl.local.paginated.wal.ODiskWriteAheadLog$LogSegment$FlushTask.flushPage(ODiskWriteAheadLog.java:232)
[STDERR]                	at com.orientechnologies.orient.core.storage.impl.local.paginated.wal.ODiskWriteAheadLog$LogSegment$FlushTask.commit(ODiskWriteAheadLog.java:196)
[STDERR]                	at com.orientechnologies.orient.core.storage.impl.local.paginated.wal.ODiskWriteAheadLog$LogSegment$FlushTask.run(ODiskWriteAheadLog.java:129)
[STDERR]                	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
[STDERR]                	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
[STDERR]                	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
[STDERR]                	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
[STDERR]                	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
[STDERR]                	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
[STDERR]                	at java.lang.Thread.run(Thread.java:744)
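
To check whether a captured log shows this problem, the messages reported in this article can be searched for directly. The following is a minimal Python sketch, not part of the product: the function name `count_signatures` is hypothetical, and which file holds the AdminServer's STDERR output in a given installation is left as an assumption for the reader to fill in.

```python
# Hypothetical helper: counts the error messages quoted in this article.
# The signature strings are copied from the article; everything else is illustrative.
ERROR_SIGNATURES = (
    "SEVERE: Error during WAL background flush",
    "SEVERE: Exception during data flush",
    "java.io.IOException: No space left on device",
)


def count_signatures(lines):
    """Return a dict mapping each known signature to the number of lines containing it."""
    counts = {signature: 0 for signature in ERROR_SIGNATURES}
    for line in lines:
        for signature in ERROR_SIGNATURES:
            if signature in line:
                counts[signature] += 1
    return counts
```

An open file object can be passed in as `lines`, for example `count_signatures(open(path, errors="replace"))`, where `path` is whichever log captures STDERR on the system in question.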

 
Steps to Reproduce
Clarifying Information
The following did not prevent or reduce the growth of the cachedb:
  • Adjusting the values of storage.wal.maxSegmentSize, storage.wal.maxSize and storage.record.lockTimeout in fathom.init.params.
  • Changing the setting in OpenEdge Management (OEM) under "Options -> Property resource distribution" to a few minutes.
  • Disabling the WAL with -Dstorage.useWAL=false in AdminServerPlugins.properties.
  • Decreasing the number of days the graphcache database keeps data for graphing.

 
Error Message
[STDERR] SEVERE: Error during WAL background flush
[STDERR] SEVERE: Exception during data flush.
Defect Number
PSC00346764 / ADAS-3364
Enhancement Number
Cause
Issues introduced with the upgrade to OrientDB 2.0.8, which shipped with OpenEdge 11.6, cause the Cachedb to grow uncontrollably and eventually fill up the filesystem.
This can also manifest as high CPU usage on the server associated with the .wal files ("write-ahead log" files, similar to the OpenEdge database .bi file), a consequence of deadlocking on the cache database.
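
Because the symptom is disk consumption under the cachedb directory, it can help to track that directory's size, the share taken by .wal files, and the free space left on the filesystem. The sketch below is only an illustration under stated assumptions: `cachedb_report` is a hypothetical name, and the directory to pass in is whatever cachedb location applies to the installation.

```python
import os
import shutil


def cachedb_report(cachedb_dir):
    """Return total size, .wal size, and free-space fraction for cachedb_dir."""
    total_bytes = 0
    wal_bytes = 0
    for root, _dirs, files in os.walk(cachedb_dir):
        for name in files:
            try:
                size = os.path.getsize(os.path.join(root, name))
            except OSError:
                continue  # file was removed or is unreadable while scanning
            total_bytes += size
            if name.lower().endswith(".wal"):
                wal_bytes += size
    # Free space is measured on the filesystem that holds cachedb_dir.
    usage = shutil.disk_usage(cachedb_dir)
    return {
        "total_bytes": total_bytes,
        "wal_bytes": wal_bytes,
        "free_fraction": usage.free / usage.total,
    }
```

Running this on a schedule and comparing successive results gives an early signal of the growth before the filesystem fills up. Any alert threshold is a local choice and is not specified by this article.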
Resolution
Upgrade to OpenEdge 11.6.3 or later, where OrientDB was upgraded to version 2.1.17 to fix this cachedb deadlocking issue.

As a precaution, run 'fathom -dump mylatestconf.xml' before upgrading.
Before upgrading, ensure that the AdminServer is shut down cleanly so that the cachedb (graph and configuration) can be properly upgraded when it is next started.
 
Workaround
Increase the file system size and delete the contents of the cachedb directory under cachedata as needed to reclaim disk space.

1. Periodically stop the AdminServer (use -keepservers to keep the ubrokers running, i.e. shut down only the AdminServer and its plugins).
2. Delete the content, not the folder:
<oemwrk>/cachedata/cachedb/*

While the 'configdb' is also an OrientDB database, it only holds configuration information. It is read at startup and rarely updated, which is why it is not part of the issue and can be left alone.
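
Step 2 of the workaround (delete the content, not the folder) can be sketched in Python as follows. This is an illustration under stated assumptions rather than a supported tool: `clear_directory_contents` is a hypothetical name, the check that the target directory is named `cachedb` is an added safety measure, and the AdminServer must already have been stopped as described in step 1.

```python
import os
import shutil


def clear_directory_contents(directory, dry_run=True):
    """Delete everything inside `directory` but keep the directory itself.

    Returns the sorted list of paths that were (or, in a dry run, would be) removed.
    """
    # Safety check (an assumption of this sketch): only operate on a folder named "cachedb".
    if os.path.basename(os.path.normpath(directory)) != "cachedb":
        raise ValueError("refusing to clear a directory not named 'cachedb'")

    # Collect entries first so the directory is not modified while being scanned.
    with os.scandir(directory) as it:
        entries = list(it)

    removed = []
    for entry in entries:
        removed.append(entry.path)
        if dry_run:
            continue
        if entry.is_dir(follow_symlinks=False):
            shutil.rmtree(entry.path)
        else:
            os.remove(entry.path)
    return sorted(removed)
```

The default `dry_run=True` only lists what would be removed. Pass `dry_run=False` to delete after reviewing that list.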

Occasionally, restarting OpenEdge Management alone also fixes the issue.
Notes
Keyword Phrase
Last Modified Date: 9/20/2023 1:34 PM