kswapd0 running at 99.99% - Forum - OpenEdge RDBMS - Progress Community

  • I've been trying to identify a problem that I am experiencing on one of our CentOS servers. The kswapd0 process runs close to 100% most of the time, causing consistently high %IOWAIT times. The server is running 11.5.1 with appserver connections to the DB. If I run free -m, this is the result, indicating I have plenty of memory available.

                 total       used       free     shared    buffers     cached
    Mem:          9888       9779        108       6509          0       6538
    -/+ buffers/cache:       3240       6648
    Swap:         9997       3348       6649

    What I have identified is that if I run the proadsv -keepservers -stop command, the kswapd0 process goes away and %IOWAIT time drops to acceptable levels (under 5%, as opposed to 10-20%). The second I bring the proadsv process back up, performance drops.

    Any ideas how I can tweak the OS or the AdminServerPlugins.properties file to reduce swapping to disk?

  • what does cat /proc/meminfo show?

    what does vmstat show ?

  • Run the following...

    cat /proc/sys/vm/dirty_ratio

    cat /proc/sys/vm/swappiness

    cat /proc/sys/vm/dirty_background_ratio

    A lot of your memory is being used for OS caching; kswapd is maintaining those buffers as part of its job. For database servers I usually set swappiness to 0, dirty_ratio to 60 and dirty_background_ratio to 5 as a starting point.

    Before you make any changes you need to carefully consider what you are going to do with that memory (-B, -T, etc.) or you will probably run into new performance problems as pages are read from disk instead of the buffer cache.
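
As a rough illustration of the check suggested in the post above, the following Python sketch reads the three tunables and reports where they differ from the starting points mentioned there (0, 60 and 5). The function names and the `SUGGESTED` table are hypothetical helpers, not part of any tool; the values are one poster's starting point rather than a general recommendation, and the sketch only reads settings, it does not change them.

```python
from pathlib import Path

# Starting points quoted in the post above (an assumption for this sketch,
# not a universal recommendation).
SUGGESTED = {"swappiness": 0, "dirty_ratio": 60, "dirty_background_ratio": 5}


def read_vm_tunables(base="/proc/sys/vm", names=SUGGESTED):
    """Return {name: int value} for each tunable file that exists under base."""
    values = {}
    for name in names:
        path = Path(base) / name
        if path.is_file():
            values[name] = int(path.read_text().strip())
    return values


def diff_from_suggested(current, suggested=SUGGESTED):
    """Return {name: (current, suggested)} for tunables whose values differ."""
    return {
        name: (current[name], want)
        for name, want in suggested.items()
        if name in current and current[name] != want
    }


if __name__ == "__main__":
    for name, (cur, want) in diff_from_suggested(read_vm_tunables()).items():
        print(f"{name}: current={cur} suggested={want}")
```

Running it on the server prints one line per tunable that differs, which gives a record of the "before" state prior to any change.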

  • Hi Gus,

    here is the output from meminfo

    MemTotal:       10125560 kB

    MemFree:          140592 kB

    Buffers:            2688 kB

    Cached:          6714752 kB

    SwapCached:       537032 kB

    Active:          1959232 kB

    Inactive:         918092 kB

    Active(anon):    1941552 kB

    Inactive(anon):   884180 kB

    Active(file):      17680 kB

    Inactive(file):    33912 kB

    Unevictable:     6665748 kB

    Mlocked:               0 kB

    SwapTotal:      10237948 kB

    SwapFree:        6498224 kB

    Dirty:               484 kB

    Writeback:             0 kB

    AnonPages:       2589180 kB

    Mapped:          6677372 kB

    Shmem:           6665756 kB

    Slab:             136420 kB

    SReclaimable:      46372 kB

    SUnreclaim:        90048 kB

    KernelStack:       13200 kB

    PageTables:       181136 kB

    NFS_Unstable:          0 kB

    Bounce:                0 kB

    WritebackTmp:          0 kB

    CommitLimit:    15300728 kB

    Committed_AS:   14072256 kB

    VmallocTotal:   34359738367 kB

    VmallocUsed:      312140 kB

    VmallocChunk:   34359410440 kB

    HardwareCorrupted:     0 kB

    AnonHugePages:     73728 kB

    HugePages_Total:       0

    HugePages_Free:        0

    HugePages_Rsvd:        0

    HugePages_Surp:        0

    Hugepagesize:       2048 kB

    DirectMap4k:        8560 kB

    DirectMap2M:    10475520 kB

    Vmstat shows a lot of swapping, particularly on "si"

    procs -----------memory---------- -swap-- ----io---- -system-- -----cpu------
     r  b    swpd   free buff   cache   si so     bi  bo   in   cs us sy id wa st
     2  8 3777404 110484  508 6693548  952  0 152356 176 9912 3891  7  3 65 25  0
     1  0 3777312 130964  708 6698072  152  0  58648  44 4453 1837  8  2 81  9  0
     0  0 3776792 129444 1140 6698200  616  0   7800   8 3844 1983 13  3 80  4  0
     1  1 3775372 122144 1056 6700292 1676  0   9224  64 3509 2134  8  4 84  5  0
     1  3 3773624 112072 1064 6708128 2642  0  14114  10 3953 2536  7  3 65 25  0
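
To put numbers on the sample above, here is a minimal Python sketch that parses the five data rows and averages a few columns. It assumes the standard vmstat column order (r b swpd free buff cache si so bi bo in cs us sy id wa st); `parse_vmstat` and `column_average` are hypothetical helper names.

```python
# Data rows copied from the vmstat sample above (header lines omitted).
VMSTAT_ROWS = """\
2 8 3777404 110484 508 6693548 952 0 152356 176 9912 3891 7 3 65 25 0
1 0 3777312 130964 708 6698072 152 0 58648 44 4453 1837 8 2 81 9 0
0 0 3776792 129444 1140 6698200 616 0 7800 8 3844 1983 13 3 80 4 0
1 1 3775372 122144 1056 6700292 1676 0 9224 64 3509 2134 8 4 84 5 0
1 3 3773624 112072 1064 6708128 2642 0 14114 10 3953 2536 7 3 65 25 0
"""

# Assumed column order: the usual layout printed by vmstat.
COLUMNS = "r b swpd free buff cache si so bi bo in cs us sy id wa st".split()


def parse_vmstat(text):
    """Turn purely numeric vmstat lines into dicts keyed by column name."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Skip header lines and anything that is not a full numeric row.
        if len(fields) == len(COLUMNS) and all(f.isdigit() for f in fields):
            rows.append(dict(zip(COLUMNS, map(int, fields))))
    return rows


def column_average(rows, name):
    """Average one column over all parsed rows."""
    return sum(row[name] for row in rows) / len(rows)


rows = parse_vmstat(VMSTAT_ROWS)
for col in ("si", "so", "wa"):
    print(f"avg {col}: {column_average(rows, col):.1f}")
```

For this sample, "si" is non-zero in every row while "so" stays at 0, so data is continually being read back in from swap during the interval shown.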

  • Hi Keith,

    Thanks for your response, please see values below.

    cat /proc/sys/vm/dirty_ratio

    20

    cat /proc/sys/vm/swappiness (I recently changed this from the default of 60 to 1; there has been a slight improvement)

    1

    cat /proc/sys/vm/dirty_background_ratio

    10

    I've also tried the following, with some improvement:

    echo never > /sys/kernel/mm/redhat_transparent_hugepage/defrag

    echo never > /sys/kernel/mm/transparent_hugepage/enabled

    With over 6 GB of free memory I wouldn't expect the OS to swap so much. I haven't worked with "dirty_ratio" or "dirty_background_ratio", so I will certainly look into these parameters.

  • Well you /don't/ have 6 GB of free memory.

    We don't know what you are running on this system or how the database and other stuff are configured, but you do not have enough memory.

    - there is 10 GB of total memory, which is not a lot for an active server

    - there is about 4 GB of stuff paged out because it does not fit in memory.

    - there is a small amount of filesystem cache (Active(file))

    - there is 6 GB of stuff that is locked in memory (unevictable)

    - there is 6 GB'ish of "Cached" but that can be any of lots of things, including filesystem pages, code, and dynamically allocated memory.

    We have to figure out where your memory is going. Questions:

    How many databases are you running?

    How much shared memory are they allocating?

    What else is allocating shared memory?

    How many app servers?

    How many 4GL runtime clients?

    What else is running on this box?

    What does vmstat or some other utility (e.g. nmon) tell you about page-in and page-out I/O rates?
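
The arithmetic behind the breakdown in the post above can be sketched in Python from the /proc/meminfo figures posted earlier in the thread. One fact the sketch relies on is that Linux counts shared memory (Shmem) inside "Cached", so subtracting it gives a rough upper bound on real file cache. `parse_meminfo` and `summarize` are hypothetical helper names, and only the fields needed for the calculation are copied in.

```python
# Selected fields copied from the /proc/meminfo output earlier in the thread.
MEMINFO = """\
MemTotal:       10125560 kB
MemFree:          140592 kB
Buffers:            2688 kB
Cached:          6714752 kB
Active(file):      17680 kB
Inactive(file):    33912 kB
Unevictable:     6665748 kB
SwapTotal:      10237948 kB
SwapFree:        6498224 kB
Shmem:           6665756 kB
"""


def parse_meminfo(text):
    """Map each field name to its numeric value (kB for the fields used here)."""
    info = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        if rest.strip():
            info[name.strip()] = int(rest.split()[0])
    return info


def summarize(info):
    """Derive the figures discussed in the thread from raw meminfo values."""
    return {
        # Amount currently paged out to swap.
        "swap_used_kb": info["SwapTotal"] - info["SwapFree"],
        # "Cached" includes shared memory, which is not droppable file cache.
        "cached_minus_shmem_kb": info["Cached"] - info["Shmem"],
        # File-backed pages on the LRU lists, i.e. actual filesystem cache.
        "file_lru_kb": info["Active(file)"] + info["Inactive(file)"],
    }


for key, kb in summarize(parse_meminfo(MEMINFO)).items():
    print(f"{key}: {kb} kB ({kb / 1024 / 1024:.2f} GiB)")
```

With these figures, about 3.57 GiB is paged out, and only about 48 MB of the 6.4 GiB "Cached" value is left once shared memory is subtracted, which is why the cached figure cannot be read as free memory here.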

  • Hi Gus,  I really appreciate your feedback on this one.  I must be honest I don't fully understand the memory allocation on Linux, there are a lot of contradictory articles out there. If this server is indeed running out of memory that would explain the high system load and swapping.  

    This is a customer's machine and is dedicated to running Progress and nothing else. There are 7 DBs running that make up the application. Below are the size and memory allocation for each:

    DB    Size     Shared memory   Description
    DB1   729 MB   1.1 GB          Static DB (high reads)
    DB2   87 GB    512 MB          Document DB (high writes)
    DB3   26 MB    46 MB           Framework DB (low reads/writes)
    DB4   2 MB     95 MB           Integration DB (low reads/writes)
    DB5   26 GB    2.6 GB          Main DB (high reads/medium writes)
    DB6   10 MB    46 MB           Framework DB (low reads/writes)
    DB7   33 MB    234 MB          Temp DB

    DB5 was on 3.2 GB; I decreased the -B last night after reading your response.

    Total: 4.6 GB shared memory

    Only 1 appserver running with on average 10 agents

    other memory allocation

    -Bt 20000

    -tmpbsize 4

    240 ABL clients

    It's currently 06:30, so I will try to gather some more stats during the course of today.
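
The figures in the post above can be totalled as a rough sanity check. This Python sketch sums the per-database shared memory and estimates a worst-case ceiling for client temp-table buffers. It assumes that each of the 240 clients has its own -Bt pool of 20000 buffers at the -tmpbsize of 4 KB and could fill it completely; real usage depends on temp-table activity and is likely lower. Per-process overhead is not included, and the helper names are hypothetical.

```python
# Shared memory per database from the post above, in MB (1 GB taken as 1024 MB).
SHARED_MB = {
    "DB1": 1.1 * 1024,
    "DB2": 512,
    "DB3": 46,
    "DB4": 95,
    "DB5": 2.6 * 1024,
    "DB6": 46,
    "DB7": 234,
}

BT_BUFFERS = 20000   # -Bt: temp-table buffers per client
TMPBSIZE_KB = 4      # -tmpbsize: temp-table block size in KB
CLIENTS = 240        # ABL clients reported in the post
PHYSICAL_MB = 9888   # total RAM reported by free -m earlier in the thread


def shared_total_mb(shared=SHARED_MB):
    """Total database shared memory in MB."""
    return sum(shared.values())


def temp_table_ceiling_mb(buffers, block_kb, clients):
    """Worst case: every client fills every temp-table buffer (an assumption)."""
    return buffers * block_kb * clients / 1024


shared = shared_total_mb()
ceiling = temp_table_ceiling_mb(BT_BUFFERS, TMPBSIZE_KB, CLIENTS)
print(f"shared memory:       {shared:8.1f} MB")
print(f"temp-table ceiling:  {ceiling:8.1f} MB")
print(f"physical memory:     {PHYSICAL_MB:8.1f} MB")
```

Under those assumptions the shared memory comes to roughly 4.6 GB, matching the total in the post, while the temp-table ceiling alone (about 18 GB) is well above the machine's physical memory. That supports the over-commitment conclusion even if actual temp-table usage is only a fraction of the ceiling.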

  • It's 9am now and I'm already seeing a great improvement in performance. You were spot on about the memory. Just by reducing the buffer pool (not ideal; if anything I would like to allocate more), the server is no longer swapping. At least a 70% improvement in overall performance.

    As I was providing you with the shared memory allocations along with the temp-table -Bt parameter, it was quite obvious that I was overcommitting memory. I just need to convince the customer to invest in new hardware or at least upgrade the memory.

    I would appreciate it if you could provide me with more insight into "meminfo"; I would really like to get a better understanding of memory allocation at the OS level.

  • 12:15:  Steady performance.  Thanks again

  • here's a snippet of vmstat output:

    Note the "si" and "so" columns. These indicate pages read in and pages written (i.e. swapped) out. It is possible for some of the swap space to be in use but no paging going on, such as in this case. A typical system has lots of processes that are running only occasionally and those can be swapped out but nothing is wrong.

    These days, there is only demand paging in most systems. The idea of swapping out entire processes all at once has long been abandoned.

    To make things confusing, some systems use the demand pager to do file I/O. That can make it hard to distinguish paging from other I/O.