I've been trying to identify a problem that I am experiencing on one of our CentOS servers. The kswapd0 process runs close to 100% most of the time, causing consistently high %IOWAIT times. The server is running 11.5.1 with appserver connections to the DB. If I run free -m, this is the result, indicating I have plenty of memory available.
             total       used       free     shared    buffers     cached
Mem:          9888       9779        108       6509          0       6538
-/+ buffers/cache:       3240       6648
Swap:         9997       3348       6649
What I have identified is that if I run the proadsv -keepservers -stop command, the kswapd0 process goes away and %IOWAIT drops to acceptable levels (under 5%, as opposed to 10-20%). The second I bring the proadsv process back up, performance drops.
Any ideas how I can tweak the OS or the AdminServerPlugins.properties file to reduce swapping to disk?
What does cat /proc/meminfo show?
What does vmstat show?
Run the following...
A lot of your memory is being used for OS caching; kswapd is maintaining those buffers as part of its job. For database servers I usually set swappiness to 0, dirty_ratio to 60 and dirty_background_ratio to 5 as a starting point.
Before you make any changes you need to carefully consider what you are going to do with that memory (-B, -T, etc.) or you will probably run into new performance problems as pages are read from disk instead of the buffer cache.
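Before changing anything, it can help to see what the three tunables mentioned above are currently set to. The following is a minimal Python sketch, not a recommendation: it assumes the values live as plain integers under /proc/sys/vm (the usual location on Linux), and the function names read_vm_settings and compare are made up for illustration.

```python
from pathlib import Path

# Starting values suggested in the post above; a starting point, not a rule.
SUGGESTED = {"swappiness": 0, "dirty_ratio": 60, "dirty_background_ratio": 5}


def read_vm_settings(base="/proc/sys/vm", names=SUGGESTED):
    """Return {name: int or None} for each tunable found under base."""
    settings = {}
    for name in names:
        try:
            settings[name] = int((Path(base) / name).read_text().strip())
        except (OSError, ValueError):
            settings[name] = None  # missing or unreadable on this system
    return settings


def compare(current, suggested=SUGGESTED):
    """Return (name, current, suggested) for every value that differs."""
    return [(n, current.get(n), v) for n, v in suggested.items()
            if current.get(n) != v]


if __name__ == "__main__":
    for name, cur, want in compare(read_vm_settings()):
        print(f"vm.{name}: current={cur} suggested={want}")
```

The script only reads the values. Whether to change them is a separate decision that depends on how the freed memory will be used.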
Here is the output from meminfo:
MemTotal: 10125560 kB
MemFree: 140592 kB
Buffers: 2688 kB
Cached: 6714752 kB
SwapCached: 537032 kB
Active: 1959232 kB
Inactive: 918092 kB
Active(anon): 1941552 kB
Inactive(anon): 884180 kB
Active(file): 17680 kB
Inactive(file): 33912 kB
Unevictable: 6665748 kB
Mlocked: 0 kB
SwapTotal: 10237948 kB
SwapFree: 6498224 kB
Dirty: 484 kB
Writeback: 0 kB
AnonPages: 2589180 kB
Mapped: 6677372 kB
Shmem: 6665756 kB
Slab: 136420 kB
SReclaimable: 46372 kB
SUnreclaim: 90048 kB
KernelStack: 13200 kB
PageTables: 181136 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 15300728 kB
Committed_AS: 14072256 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 312140 kB
VmallocChunk: 34359410440 kB
HardwareCorrupted: 0 kB
AnonHugePages: 73728 kB
Hugepagesize: 2048 kB
DirectMap4k: 8560 kB
DirectMap2M: 10475520 kB
Vmstat shows a lot of swapping, particularly on "si"
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
 r  b    swpd   free  buff   cache   si   so     bi   bo   in   cs us sy id wa st
 2  8 3777404 110484   508 6693548  952    0 152356  176 9912 3891  7  3 65 25  0
 1  0 3777312 130964   708 6698072  152    0  58648   44 4453 1837  8  2 81  9  0
 0  0 3776792 129444  1140 6698200  616    0   7800    8 3844 1983 13  3 80  4  0
 1  1 3775372 122144  1056 6700292 1676    0   9224   64 3509 2134  8  4 84  5  0
 1  3 3773624 112072  1064 6708128 2642    0  14114   10 3953 2536  7  3 65 25  0
Thanks for your response, please see values below.
cat /proc/sys/vm/swappiness (I recently changed this from the default of 60 to 1; there has been a slight improvement)
I've also tried the following, with some improvement:
echo never > /sys/kernel/mm/redhat_transparent_hugepage/defrag
echo never > /sys/kernel/mm/transparent_hugepage/enabled
With over 6 GB of free memory I wouldn't expect the OS to swap so much. I haven't worked with "dirty_ratio" or "dirty_background_ratio"; I will certainly look into these parameters.
Well you /don't/ have 6 GB of free memory.
We don't know what you are running on this system or how the database and other stuff are configured, but you do not have enough memory.
- there is 10 GB of total memory, which is not a lot for an active server
- there is about 4 GB of stuff paged out because it does not fit in memory.
- there is a small amount of filesystem cache (Active(file))
- there is 6 GB of stuff that is locked in memory (unevictable)
- there is 6 GB'ish of "Cached" but that can be any of lots of things, including filesystem pages, code, and dynamically allocated memory.
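The figures in the list above can be derived from the meminfo output posted earlier. Here is a rough Python sketch under a few assumptions: the sample is a subset of that output, the helper names parse_meminfo and summarize are invented for illustration, and subtracting Shmem from Cached relies on the kernel counting shared memory inside Cached (which is the usual behaviour, but check your kernel's documentation).

```python
# Subset of the /proc/meminfo output posted earlier in this thread.
SAMPLE = """\
MemTotal: 10125560 kB
MemFree: 140592 kB
Cached: 6714752 kB
Active(file): 17680 kB
Inactive(file): 33912 kB
Unevictable: 6665748 kB
SwapTotal: 10237948 kB
SwapFree: 6498224 kB
Shmem: 6665756 kB
CommitLimit: 15300728 kB
Committed_AS: 14072256 kB
"""


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: value as int}."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if sep and rest.split():
            values[name.strip()] = int(rest.split()[0])
    return values


def summarize(m):
    """Derive a few figures (in kB) that show where the memory is going."""
    return {
        "swap_used": m["SwapTotal"] - m["SwapFree"],
        "file_cache": m["Active(file)"] + m["Inactive(file)"],
        "shared_memory": m["Shmem"],
        "cached_minus_shmem": m["Cached"] - m["Shmem"],
        "commit_headroom": m["CommitLimit"] - m["Committed_AS"],
    }


if __name__ == "__main__":
    for key, kb in summarize(parse_meminfo(SAMPLE)).items():
        print(f"{key}: {kb // 1024} MB")
```

With these numbers almost all of "Cached" turns out to be shared memory, which is why it cannot be treated as free.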
We have to figure out where your memory is going. Questions:
How many databases are you running?
How much shared memory are they allocating?
What else is allocating shared memory?
How many app servers?
How many 4GL runtime clients?
What else is running on this box?
What does vmstat or some other utility (e.g. nmon) tell about pagein and pageout I/O rates?
Hi Gus, I really appreciate your feedback on this one. I must be honest, I don't fully understand memory allocation on Linux; there are a lot of contradictory articles out there. If this server is indeed running out of memory, that would explain the high system load and swapping.
This is a customer's machine and is dedicated to running Progress and nothing else. There are 7 DBs running that make up the application. Below are the size and memory allocation for each:
DB    Size     Shared memory   Notes
DB1   729 MB   1.1 GB          Static DB, high reads
DB2   87 GB    512 MB          Document DB, high writes
DB3   26 MB    46 MB           Framework DB, low reads/writes
DB4   2 MB     95 MB           Integration DB, low reads/writes
DB5   26 GB    2.6 GB          Main DB, high reads/medium writes (was 3.2 GB; I decreased -B last night after reading your response)
DB6   10 MB    46 MB           Framework DB, low reads/writes
DB7   33 MB    234 MB          Temp DB
Total          4.6 GB shared memory
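As a quick check of the total, the per-database shared memory figures can be added up. This is only a sketch: the post does not say whether a GB means 1000 or 1024 MB, so the 1024 used here is an assumption, and total_mb is a made-up helper name.

```python
# Shared memory per database as listed in the post: (name, value, unit).
ALLOCATIONS = [
    ("DB1", 1.1, "GB"), ("DB2", 512, "MB"), ("DB3", 46, "MB"),
    ("DB4", 95, "MB"), ("DB5", 2.6, "GB"), ("DB6", 46, "MB"),
    ("DB7", 234, "MB"),
]

MB_PER_GB = 1024  # assumption; the post does not state which convention it uses


def total_mb(allocations, mb_per_gb=MB_PER_GB):
    """Sum the allocations, converting GB entries to MB."""
    return sum(value * (mb_per_gb if unit == "GB" else 1)
               for _, value, unit in allocations)


if __name__ == "__main__":
    now = total_mb(ALLOCATIONS)
    # Same list with DB5 at its previous 3.2 GB, before -B was reduced.
    before = total_mb([(n, 3.2 if n == "DB5" else v, u)
                       for n, v, u in ALLOCATIONS])
    print(f"current:  {now / MB_PER_GB:.1f} GB")
    print(f"previous: {before / MB_PER_GB:.1f} GB")
```

Either convention rounds to the 4.6 GB quoted above. The total covers database shared memory only, not the memory used by the clients or the appserver agents.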
Only 1 appserver is running, with 10 agents on average.
Other memory allocation:
240 ABL clients
It's currently 06:30, so I will try to gather some more stats during the course of today.
It's 9am now and I'm already seeing a great improvement in performance. You were spot on about the memory. Just by reducing the buffer pool (not ideal; if anything I would like to allocate more), the server is no longer swapping. At least a 70% improvement in overall performance.
As I was providing you with the shared memory allocations along with the temp-table -Bt parameter, it was quite obvious that I was overcommitting memory. I just need to convince the customer to invest in new hardware or at least upgrade the memory.
I would appreciate it if you could provide me with more insight into "meminfo"; I would really like to get a better understanding of memory allocation at the OS level.
12:15: Steady performance. Thanks again
Here's a snippet of vmstat output:
Note the "si" and "so" columns. These indicate pages read in and pages written (i.e. swapped) out. It is possible for some of the swap space to be in use but no paging going on, such as in this case. A typical system has lots of processes that are running only occasionally and those can be swapped out but nothing is wrong.
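To make the "si" and "so" columns easier to watch, the rows can be parsed and summarized. This is a minimal Python sketch: the sample rows are the ones from the vmstat output posted earlier in the thread, the 17-column layout is assumed to match the vmstat version in use, and parse_vmstat and paging_summary are illustrative names.

```python
# Column order of the vmstat rows shown earlier (assumed for this version).
COLUMNS = "r b swpd free buff cache si so bi bo in cs us sy id wa st".split()

# Data rows from the vmstat output posted earlier in this thread.
SAMPLE = """\
2 8 3777404 110484 508 6693548 952 0 152356 176 9912 3891 7 3 65 25 0
1 0 3777312 130964 708 6698072 152 0 58648 44 4453 1837 8 2 81 9 0
0 0 3776792 129444 1140 6698200 616 0 7800 8 3844 1983 13 3 80 4 0
1 1 3775372 122144 1056 6700292 1676 0 9224 64 3509 2134 8 4 84 5 0
1 3 3773624 112072 1064 6708128 2642 0 14114 10 3953 2536 7 3 65 25 0
"""


def parse_vmstat(text):
    """Turn numeric vmstat rows into dicts keyed by column name."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Header lines are skipped because they are not all digits.
        if len(fields) == len(COLUMNS) and all(f.isdigit() for f in fields):
            rows.append(dict(zip(COLUMNS, map(int, fields))))
    return rows


def paging_summary(rows):
    """Average swap-in/swap-out and peak I/O wait over the sampled rows."""
    if not rows:
        return None
    n = len(rows)
    return {
        "avg_si": sum(r["si"] for r in rows) / n,
        "avg_so": sum(r["so"] for r in rows) / n,
        "max_wa": max(r["wa"] for r in rows),
    }


if __name__ == "__main__":
    print(paging_summary(parse_vmstat(SAMPLE)))
```

A sustained non-zero "si" or "so" is the thing to look for. Swap space that is in use while both columns stay at zero is the harmless case described above.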
These days, there is only demand paging in most systems. The idea of swapping out entire processes all at once has long been abandoned.
To make things confusing, some systems use the demand pager to do file I/O. That can make it hard to distinguish paging from other I/O.