All Webspeed agents busy Alert monitoring

Posted by Adrian.wright on 12-Mar-2019 09:57

Hi All 

Currently we have a situation (due to an application/locking issue) where all 55 of our WebSpeed agents go busy and cause both our server and application to become unresponsive.

Is there any way we can monitor, via a simple batch file or via the VSTs, the percentage of WebSpeed agents that are busy?

For example, would there be any way of checking, or creating a metric for, whether say 40 of our 55 WebSpeed agents are busy?

We're just looking to manage the situation while an application fix is being created.

Ade 

 

All Replies

Posted by Pieterm on 12-Mar-2019 10:38

The simplest option would be to create a batch file that runs the relevant "asbman" and "wtbman" commands against the brokers and extracts the summary agent counts.

Command:

${DLC}/bin/wtbman -port 29700 -name wsb_brkliv -query

Output:

OpenEdge Release 11.7.4 as of Wed Oct 10 18:18:59 EDT 2018

Connecting to Progress AdminServer using rmi://localhost:29700/Chimera (8280)
Searching for wsb_brkliv (8288)
Connecting to wsb_brkliv  (8276)

Broker Name                    : wsb_brkliv
Operating Mode                 : Stateless
Broker Status                  :  ACTIVE
Broker Port                    : 36001
Broker PID                     : 25473
Active Agents                  : 5
Busy Agents                    : 0
Locked Agents                  : 0
Available Agents               : 5
Active Clients (now, peak)     : (0, 1)
Client Queue Depth (cur, max)  : (0, 0)
Total Requests                 : 90
Rq Wait (max, avg)             : (0 ms, 0 ms)
Rq Duration (max, avg)         : (1 ms, 0 ms)

PID   State     Port  nRq    nRcvd  nSent  Started          Last Change
25520 AVAILABLE 37101 000019 000019 000019 12 Mar 2019 05:00 12 Mar 2019 09:42
25525 AVAILABLE 37100 000018 000018 000018 12 Mar 2019 05:00 12 Mar 2019 09:41
25527 AVAILABLE 37102 000018 000018 000018 12 Mar 2019 05:00 12 Mar 2019 09:41
25530 AVAILABLE 37103 000017 000017 000017 12 Mar 2019 05:00 12 Mar 2019 09:41
25532 AVAILABLE 37106 000018 000018 000018 12 Mar 2019 05:00 12 Mar 2019 09:41
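
As a minimal sketch of the "extract the summary agent counts" step, the function below reads the query output on stdin and reports busy plus locked agents as a percentage of a ceiling you pass in. The function name busy_percent and the sample numbers are illustrative assumptions, not part of wtbman; only the summary line labels are taken from the output above.

```shell
#!/bin/sh
# busy_percent MAX_AGENTS   (hypothetical helper, not a Progress tool)
# Reads `wtbman ... -query` output on stdin and prints
# "busy=<n> locked=<n> active=<n> pct=<n>", where pct is the share of
# MAX_AGENTS (the ceiling you supply) that is currently busy or locked.
busy_percent() {
  awk -F':' -v max="$1" '
    BEGIN { busy = 0; locked = 0; active = 0 }
    /^ *Active Agents/ { active = $2 + 0 }
    /^ *Busy Agents/   { busy   = $2 + 0 }
    /^ *Locked Agents/ { locked = $2 + 0 }
    END {
      pct = (max > 0) ? int((busy + locked) * 100 / max) : 0
      printf "busy=%d locked=%d active=%d pct=%d\n", busy, locked, active, pct
    }'
}

# Demo on made-up summary lines in the same layout as the output above.
busy_percent 55 <<'EOF'
Active Agents                  : 48
Busy Agents                    : 40
Locked Agents                  : 2
Available Agents               : 6
EOF
# prints: busy=40 locked=2 active=48 pct=76
```

Against a live broker this would be fed from the command above, for example `${DLC}/bin/wtbman -port 29700 -name wsb_brkliv -query | busy_percent 55`, with 55 being the agent count mentioned in the question.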

 

Posted by James Palmer on 12-Mar-2019 10:40

Rather than screen scraping this yourself, ProTop can be configured to do all the hard work for you.

Posted by Adrian.wright on 13-Mar-2019 10:10

Cheers for the response. I'm familiar with ProTop, but is ProTop able to send alerts based on criteria?

For example, with all our database sets I query the C holder status every 5 minutes, and if anything other than "online" is returned it fires out an email alert. I just wondered if something similar could be set up for WebSpeed. I have searched the forums and the OE KB but not found anything.

Obviously I know the issue is the application, but we're just looking to manage the situation.

Posted by Paul Koufalis on 13-Mar-2019 12:12

Hi Adrian,

ProTop can generate alerts on a few hundred different metrics on pretty much everything in and around your OE environment, and we're constantly adding more.

The other cool thing about ProTop is that you can set sensitivities and nag levels, so you can configure things like "only alert me if such-and-such happens 3-out-of-5 samples, and only bug me about it once an hour". You can also tweak severity levels, for example a yellow alert if a threshold is breached once, an orange alert if it's breached 2:3 and a red alert if a metric state persists for say 3:3 samples.

For AppServer and WebSpeed, it can alert based on a number of criteria:

1. Agent status: Available, busy, locked, sending, etc...

2. Agent "stuck": defined as an agent whose "last change" has not changed in x minutes

3. Number of agents in Use and agents in use as a percentage of maxAgent

4. Number of connected clients and percentage of maxClient

5. Max and Current Queue length

6. Max and average request wait

7. Max and average request duration

Ping me offline if you want to know more.
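
For anyone scripting this by hand, the "3-out-of-5 samples" idea can be approximated in plain shell. This is only a rough sketch of the concept and not how ProTop implements it; the function name should_alert and the sample values are made up for illustration.

```shell
#!/bin/sh
# should_alert THRESHOLD NEEDED SAMPLE...   (hypothetical helper)
# Succeeds (exit status 0) when at least NEEDED of the given samples
# are greater than or equal to THRESHOLD.
should_alert() {
  threshold=$1
  needed=$2
  shift 2
  hits=0
  for s in "$@"; do
    if [ "$s" -ge "$threshold" ]; then
      hits=$((hits + 1))
    fi
  done
  [ "$hits" -ge "$needed" ]
}

# Demo: the last five busy-agent counts, alert if 3 of 5 are at 40 or more.
if should_alert 40 3 12 41 44 39 47; then
  echo "ALERT"
else
  echo "ok"
fi
# prints: ALERT
```

In a scheduled job the samples could come from a small history file that the job appends to on every run, keeping only the last five values.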

Posted by jonathan.wilson on 13-Mar-2019 13:43

If you're looking for something quick, just add your email code. It uses reverse logic: if fewer than 2 agents are available, send out an alert...

if [ "$($DLC/bin/wtbman -port 29700 -name wsb_brkliv -query | grep AVAIL | wc -l)" -lt 2 ]
then
 echo oh NO! or send an email whatever...
fi

The following (again quick) can be good for finding issues in code: it prints the agent detail for every BUSY agent.

for x in $($DLC/bin/wtbman -port 29700 -name wsb_brkliv -query | grep BUSY | awk '{print $1}')
do
 $DLC/bin/wtbman -port 29700 -name wsb_brkliv -agentdetail $x | grep -e '--'
done

None of these are ideal for monitoring or alerting; but can be useful in the short-term

Posted by Paul Koufalis on 13-Mar-2019 13:53

Jonathan, unfortunately your solution is not ideal because the number of available agents reported by asbman/wtbman isn't indicative of how many agents are really available. What you need to know is maxAgent minus (busy/sending/locked/etc.). It's when you get close to maxAgent that you potentially have a business-disrupting issue. Your solution returns the number of available agents among the list of running agents, and if you auto-trim, it could generate many false positives.

Posted by jonathan.wilson on 13-Mar-2019 14:22

True Paul, this is a quick check, not a long-term solution for monitoring. The issue was described as "agents all go busy"; if there are already 55 configured it might be enough "to manage the Situation".

Posted by Paul Koufalis on 13-Mar-2019 14:31

You're 100% correct Jonathan, and ProTop originally worked this way. You can still alert on "asAvail < 5" in ProTop if that's good enough, and it is for a lot of sites where the number of running agents never really changes.  

Posted by Adrian.wright on 14-Mar-2019 09:14

Hi Paul, Jonathan

Thanks for the suggestions. Exactly as Jonathan stated, this is not a long-term solution, just something quick and simple to alert us before the system becomes unusable while both ourselves and the developer work out the root cause (we believe it is related to record locking).

Thanks for the script, Jonathan. I will have a play (I need to rewrite it for Windows first).

Paul,

I had a look at ProTop (the last time I used it was via the command line). The installer seems to want to send information to an online website/dashboard. Is there any way this can be configured locally?

Posted by ChUIMonster on 14-Mar-2019 09:47

If you do not wish to have data sent to the portal choose "local install".

Posted by jonathan.wilson on 14-Mar-2019 13:29

lol... I've got one for Windows too! Agents and I are old nemeses. This is a PowerShell script; PROENV needs to be fully configured first. Again, it's a quick script. Also, the login uses a hard-coded plain-text password, which is not very good, but just leave out the emails if that's an issue.

-----------

$CHKAS = "MYWTB_live"      # WebSpeed broker name
$ALERTLIMIT = 3            # alert when fewer than this many agents are AVAILABLE
$CHKSTR = "AVAILABLE"

# Count the agent rows whose state is AVAILABLE (findstr is case-sensitive by default)
$BC = (wtbman -name $CHKAS -query | findstr $CHKSTR | Measure-Object -Line).Lines
#write-host $BC

if ( $BC -lt $ALERTLIMIT ) {
    $TimeStam = $CHKAS + "_" + (Get-Date).ToString("yyyyMMdd_HHmmss") + ".txt"
    # Save the broker status so the file attached below actually exists
    wtbman -name $CHKAS -query | Out-File $TimeStam
    #Not needed for your stuff#wtbman -name $CHKAS -query | ?{$_ -cmatch $CHKSTR} | %{ ($_ -split " ")[0] } | %{wtbman -name $CHKAS -agentdetail $_ } | Out-File $TimeStam
    $tHostname = hostname
    $SendToEmail = "Me@myAddrees", "Me@myAddrees2"
    $Username = "ExchangeLoginID@Domain"  ## a normal user AD account will do here
    $Password = ConvertTo-SecureString 'myPasswordHere' -AsPlainText -Force  ## FYI a plain-text password might be a security issue
    $Livecred = New-Object System.Management.Automation.PSCredential $Username, $Password
    $Subject = "Issue with XXXXX: " + $tHostname + " Agents Blocked issue!! " + $BC
    $OutputStatus = "Hi fix the issue please"
    $ExchangeHostname = "my.exchange.com"
    Send-MailMessage -To $SendToEmail -From $Username -Subject $Subject -SmtpServer $ExchangeHostname -Body $OutputStatus -Credential $Livecred -Attachments $TimeStam
}

Posted by ctoman on 14-Mar-2019 13:30

wow!  So the free version of ProTop will send Broker status alerts?

Posted by Paul Koufalis on 14-Mar-2019 17:59

Sorry ctoman. The free version allows you to see the real-time status of your environment, but to get alerts you need a subscription. I apologize for not making that abundantly clear.

Posted by Jens Dahlin on 10-Jun-2019 10:30

We use a variation of this, not to notify when something is wrong but to push the actual numbers to an external system that handles metrics.

However, I think you should also focus on why this happens. Do you need to increase the number of agents because of high load? Do you have any other issues, etc.?

Note: AVAILABLE might also be low when the system is idle, so you might want to check the number of BUSY agents instead.

#!/bin/bash
DLC=/usr/oe1174
DLCBIN=/usr/oe1174/bin
PATH=$PATH:$DLCBIN

export DLC DLCBIN PATH

# Temporary file holding the summary lines of the broker query
FILE=$(mktemp)

wtbman -name wsbroker1 -query -port [A PORT] | grep -e "Active Agents\|Busy Agents\|Available Agents\|Locked Agents" > "$FILE"

# For each counter, take the value after the colon and strip the spaces
ACTIVE=$(grep "Active Agents" "$FILE" | cut -d":" -f2 | sed 's: ::g')
BUSY=$(grep "Busy Agents" "$FILE" | cut -d":" -f2 | sed 's: ::g')
AVAILABLE=$(grep "Available Agents" "$FILE" | cut -d":" -f2 | sed 's: ::g')
LOCKED=$(grep "Locked Agents" "$FILE" | cut -d":" -f2 | sed 's: ::g')

if [ "$AVAILABLE" -lt 10 ]
then
  echo "Do something"
else
  echo "It's OK"
fi

rm -f "$FILE"

Posted by Dmitri Levin on 03-Sep-2019 20:52

I think a low number of Available agents is not necessarily a problem if the number of Active agents is less than the maximum defined. We need to compare the maximum number of agents defined in ubroker.properties, i.e.

wtbman -name wsbroker1 -listallprops | grep maxSrvrInstance
maxSrvrInstance=25

with the total number of Busy and Locked agents. The number of Active agents is less than or equal to maxSrvrInstance.

And do that for all brokers of the pool, if there is load balance management in place.
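
A rough sketch of that comparison, assuming the property line looks like the maxSrvrInstance=25 output above. The helper names max_agents and headroom, the busy/locked counts and the alert limit of 5 are all illustrative assumptions.

```shell
#!/bin/sh
# max_agents: read `wtbman ... -listallprops` output on stdin and print
# the value of maxSrvrInstance (hypothetical helper).
max_agents() {
  awk -F'=' '$1 ~ /^ *maxSrvrInstance *$/ { print $2 + 0; exit }'
}

# headroom MAX BUSY LOCKED: how many more agents could still take work.
headroom() {
  echo $(( $1 - $2 - $3 ))
}

# Demo with the property line shown above and made-up busy/locked counts.
max=$(echo "maxSrvrInstance=25" | max_agents)
free=$(headroom "$max" 20 3)
if [ "$free" -lt 5 ]; then
  echo "ALERT: only $free of $max agents left"
fi
# prints: ALERT: only 2 of 25 agents left
```

With load balancing in place, the same check would be repeated for each broker name in the pool, taking the busy and locked counts from that broker's -query output.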

This thread is closed