Memory usage on web agent increases with decrease in performance

Posted by jbijker on 17-Apr-2020 08:58

We're running OE 11.7.2.0.1497 64-bit on Redhat 6.10 with classic WebSpeed web agents.

The problem we have:

  1. Web agents get progressively slower during the day.
  2. Web agents' memory usage progressively increases during the day. We've increased the memory on the server so it does not start to swap, but performance still degrades over time, which means the slowdown is not linked to swapping.
  3. When you trim an agent, performance is back where it was at the start of the day, which shows it’s something within the agent and not load on the box.
  4. This is happening on a production environment, but we can reproduce on a test environment by doing a bulk run on a specific program. So we did some testing on the test environment as explained below.
  5. We’ve made use of Progress.Database.TempTableInfo:GetVSTHandle to check all temp-tables on the system. We've checked for TTs building up records over time. There is a lot of TT activity, but nothing that builds up. We’ve also increased the memory allocated to temp-tables so that they don’t swap out to disk, but this didn’t fix the memory and/or performance issues.
  6. We’ve used the -y startup parameter with the SHOW-STATS command to write stats to client.mon. It reports on local buffer usage, stack size and r-code execution buffer, but all these figures stay more or less constant. No evidence of things that grow. It also shows reads and writes to TTs, but obviously these figures increase over time.
  7. Finally, we’ve used Progress’ DynObject messages to trace memory leaks. When an agent starts up we do find memory allocations with no corresponding deallocation, but that's because of the way we start up our application, and all these allocations happen at the start of the agent. There are no other memory leaks we could find.

Does anyone have some other ideas we can try to see what's causing the memory buildup and performance degradation?

All Replies

Posted by jonathan.wilson on 17-Apr-2020 10:49

We have the same issue with our WebSpeed agents and larger reports. The devs love TEMP-TABLEs, but our agents always end up in a mess memory-wise. On the surface the code looks good.

Our workaround is just to use:

wtbman -name MyWebBorker -refresh

"refresh" is easier than trim/add and can be done online (we use stateless); it just waits until the agent becomes AVAILABLE, then kills it and re-adds it to the pool. It gives us back lots of memory. Once a day is enough for our Ops system, but we could do it more often. We also do it with our AppServer (classic).

Posted by Paul Koufalis on 17-Apr-2020 13:11

Did you try to run proGetStack <agent pid> periodically through the day? Perhaps you are loading more and more persistent procedures?

Also, have you looked at pmap <pid> to see what kind of memory is growing? You'll see open files, "shmid" which is DB shared memory (assuming the agents are connected to a DB in shared memory mode) then a bunch of "anon". Track the output to see what changes throughout the day.  Also look at the difference between pmap and pmap -x output. I'm going by memory here, but if I do remember correctly, it is the "Dirty" column in pmap -x which is actual memory used for anon lines.

Example:

$ pmap -x 11763 | grep -i -e Kbytes -e anon -e stack

Address           Kbytes     RSS   Dirty Mode   Mapping
00000000011f7000    2944     148     148 rwx--    [ anon ]
00000000011f7000    2944       0       0 rwx--    [ anon ]
0000000003486000     888     620     620 rwx--    [ anon ]
0000000003486000     888       0       0 rwx--    [ anon ]

This output is truncated, but looking only at the 4 lines above, it looks like this PID is using 2944+888 KB of memory (watch out for duplicate addresses), when really it's only using 148+620 KB.

If I do the real sums of anon lines, this is what I get:

sum(Kbytes) = 4292, sum(RSS)=848, sum(Dirty)=844

For fun, if I remove the greps to include shared objects like files and DB shared memory segments:

sum(Kbytes) = 90828, sum(RSS)=13716, sum(Dirty)=2292
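
The sums above can be produced with a short script rather than by hand. What follows is a minimal Python sketch and not something from this thread: the function name sum_pmap_columns and the SAMPLE constant are made up for illustration, and it assumes the Linux pmap -x column layout shown above (Address, Kbytes, RSS, Dirty, Mode, Mapping). An address that is listed more than once is counted once, keeping the largest value seen in each column.

```python
# The four anon lines quoted above (the full pmap output was truncated).
SAMPLE = """\
Address           Kbytes     RSS   Dirty Mode   Mapping
00000000011f7000    2944     148     148 rwx--    [ anon ]
00000000011f7000    2944       0       0 rwx--    [ anon ]
0000000003486000     888     620     620 rwx--    [ anon ]
0000000003486000     888       0       0 rwx--    [ anon ]
"""


def sum_pmap_columns(pmap_output, only=("anon", "stack")):
    """Sum the Kbytes, RSS and Dirty columns of `pmap -x` output.

    `only` lists substrings a mapping name must contain to be counted;
    pass an empty tuple to count every mapping.
    """
    per_address = {}
    for line in pmap_output.splitlines():
        fields = line.split()
        # Data lines are: address kbytes rss dirty mode mapping...
        # Header, separator and total lines fail this check and are skipped.
        if len(fields) < 6 or not all(f.isdigit() for f in fields[1:4]):
            continue
        mapping = " ".join(fields[5:]).lower()
        if only and not any(word in mapping for word in only):
            continue
        values = tuple(int(v) for v in fields[1:4])
        previous = per_address.get(fields[0], (0, 0, 0))
        # Count a repeated address once, keeping the largest value per column.
        per_address[fields[0]] = tuple(max(a, b) for a, b in zip(previous, values))
    totals = [sum(column) for column in zip(*per_address.values())] or [0, 0, 0]
    return dict(zip(("kbytes", "rss", "dirty"), totals))


print(sum_pmap_columns(SAMPLE))
# prints {'kbytes': 3832, 'rss': 768, 'dirty': 768}
```

For the four sample lines this gives 2944+888 for Kbytes and 148+620 for RSS and Dirty. Calling it with only=() corresponds to removing the greps, so files and DB shared memory segments are included as well.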

Per the KB https://knowledgebase.progress.com/articles/Article/P77401

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  
11763 test      20   0 90824  13m  11m R 33.2  0.1 609:20.71 _progres                          

you would count 2 MB (RES minus SHR, 13m - 11m), but I'm skeptical. pmap gives a much more detailed view.
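
To track the pmap output through the day as suggested above, two snapshots can be compared region by region. This is only a rough Python sketch under stated assumptions: the helper names (snapshot, dirty_by_address, dirty_growth) are hypothetical, it assumes the Linux pmap -x column layout, and it matches regions by start address, which is approximate because regions can be merged or remapped between snapshots.

```python
import subprocess


def snapshot(pid):
    """Run `pmap -x <pid>` and return its text output."""
    result = subprocess.run(
        ["pmap", "-x", str(pid)], capture_output=True, text=True, check=True
    )
    return result.stdout


def dirty_by_address(pmap_output):
    """Map each address in `pmap -x` output to (dirty_kb, mapping_name)."""
    regions = {}
    for line in pmap_output.splitlines():
        fields = line.split()
        if len(fields) < 6 or not all(f.isdigit() for f in fields[1:4]):
            continue  # header, separator or total line
        dirty = int(fields[3])
        mapping = " ".join(fields[5:])
        # Keep the largest value when an address is listed more than once.
        if fields[0] not in regions or dirty > regions[fields[0]][0]:
            regions[fields[0]] = (dirty, mapping)
    return regions


def dirty_growth(earlier, later):
    """List (address, mapping, growth_kb) for regions whose Dirty column grew.

    Regions that only exist in the later snapshot count as grown from zero.
    The result is sorted with the largest growth first.
    """
    old = dirty_by_address(earlier)
    grown = []
    for address, (dirty, mapping) in dirty_by_address(later).items():
        delta = dirty - old.get(address, (0, ""))[0]
        if delta > 0:
            grown.append((address, mapping, delta))
    return sorted(grown, key=lambda row: row[2], reverse=True)
```

A possible use is to call snapshot() with the agent's PID at the start of the day and again once the agent has slowed down, then pass both results to dirty_growth() to see which regions account for the increase.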

Posted by jbijker on 17-Apr-2020 13:42

Hi Paul

If we were loading more and more persistent procedures that would've shown up in the DynObject messages. That's not the case.

Thanks for the details on pmap. I will try to get something out of it.

Regards

Johan
