Replication transactions slow - Forum - OpenEdge RDBMS - Progress Community

Replication transactions slow


Hi,

Windows

OpenEdge 10.2B08 with Replication Plus

I ran an index compact on the source database, and the target database has fallen behind. The target was applying transactions very slowly (1 transaction every 4 seconds).

I have restarted replication and it now reports 'Startup synchronization', but the transaction rate is still very slow: 1 transaction every 4 seconds.

Current Source Database Transaction:    4487378093
Last Transaction Applied to Target:     4487371502 <= The value of 'Last Transaction Applied to Target' increases by 1 every 4 seconds.
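
To put the two counters above in perspective, here is a rough sketch of the backlog and how long the target would need to catch up at the observed rate. It assumes the transaction numbers are consecutive and that the source creates no new transactions in the meantime; both are simplifying assumptions, not something the status output guarantees.

```python
# Figures quoted in the post above.
source_txn = 4487378093   # Current Source Database Transaction
target_txn = 4487371502   # Last Transaction Applied to Target
seconds_per_txn = 4.0     # observed apply rate: 1 transaction every 4 seconds

# Assumption: the counters are consecutive, so the difference is the backlog.
backlog = source_txn - target_txn
catch_up_seconds = backlog * seconds_per_txn

print(f"backlog: {backlog} transactions")
print(f"catch-up time at this rate: {catch_up_seconds / 3600:.1f} hours")
```

At this rate even a backlog of a few thousand transactions takes hours to clear.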

We checked the network link between the two servers and could not see any congestion. Both servers are otherwise idle.

So why is it very slow?

Kind regards,

Edwin.

All Replies
  • Did the slow tx processing start after the idxcompact?

    Do you see any obvious bottlenecks on the target? Checkpoint length? BI Cluster Size? Look at the ckpt screen in ProTop or ProMon and post the results.

    Are pica buffers filling on the source side? Look at pica Used% in ProTop or promon - R&D - 1 - 16.

    Paul Koufalis
    White Star Software

    pk@wss.com
    @oeDBA (https://twitter.com/oeDBA)

    ProTop: The #1 Free OpenEdge DB Monitoring Tool
    http://protop.wss.com
  • Our application is not running. I have restarted the database with a different port number, so no processes can create transactions. What we see is that rpserver.exe is reading the AI file at 2 MB per second on the source. The -pica was 50,000. I changed this to 8192 to see if it influences the number of transactions, but everything stays the same. We are now investigating the target server. It is a VM on a NetApp SAN, so perhaps the SAN is very busy.

  • I did not check this before, so I don't know if this is normal or not.

  • Hmmm NetApp....

    In 10.2B08 there are new columns in the ckpt screen: Duration and Sync Time. If Sync Time is high and you are ckpt'ing often, then it is very likely an issue with the disk I/O subsystem.

    Try the Furgal Test on the target file system and share the time results with us please:

    prodb sports sports

    proutil sports -C truncate bi -bi 16384

    time proutil sports -C bigrow 2

    This will create a 6 X 16 MB = 96 MB BI file.
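
The arithmetic behind the test can be sketched as follows. The sketch assumes the BI file ends up with 4 initial clusters plus the clusters added by bigrow, which is the 4 + 2 = 6 count used in this thread; the function names and the 12-second elapsed time are hypothetical and only there for illustration.

```python
INITIAL_CLUSTERS = 4  # assumption: clusters present before bigrow adds more


def bi_file_mb(cluster_kb: int, bigrow_clusters: int) -> float:
    """Size in MB of the BI file produced by the test."""
    return (INITIAL_CLUSTERS + bigrow_clusters) * cluster_kb / 1024


def write_speed_mb_s(size_mb: float, elapsed_s: float) -> float:
    """Average write speed implied by the elapsed time reported by `time`."""
    if elapsed_s <= 0:
        raise ValueError("elapsed time must be positive")
    return size_mb / elapsed_s


size_mb = bi_file_mb(cluster_kb=16384, bigrow_clusters=2)
print(f"BI file size: {size_mb:.0f} MB")

# Hypothetical elapsed time, not a measurement from this thread.
example_elapsed_s = 12.0
print(f"write speed: {write_speed_mb_s(size_mb, example_elapsed_s):.1f} MB/sec")
```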

    Paul Koufalis
    White Star Software

  • We are now sure that the NetApp is the issue (of course ;-). We are going to investigate this. Thanks for your help.

  • Please share what you figure out about NetApp here.  We have quite a few customers running on NetApp and this would help us all.

    Thanks

    Brian

    Brian L. Bowman
    Senior Principal Product Manager
    Progress Software Corporation
    14 Oak Park, Bedford, MA, USA 01730

    Phone: +1 (603) 801-8259
    Email: bowman@progress.com

  • Out of interest, Paul, or anyone else, what sort of time is 'good' for the Furgal test?

  • It depends on the sizes you specify.  In my version of this, I use 32 MB cluster size and add 4 additional clusters, for a total of 256 MB of writes (32 * (4 + 4)).  I consider less than 6 seconds good, and less than 10 is acceptable.  I've seen everything from 3 seconds to several minutes.  So if you're using the sizes Paul specified, divide accordingly.  I'd be interested to see what elapsed times others see in the field.
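
"Divide accordingly" can be sketched like this, assuming elapsed time scales linearly with the amount written (a simplification; real storage may not behave that way). The function name is hypothetical.

```python
def scale_threshold(threshold_s: float, reference_mb: float, test_mb: float) -> float:
    """Scale a time threshold from one test size to another, assuming linear scaling."""
    return threshold_s * test_mb / reference_mb


REFERENCE_MB = 32 * (4 + 4)  # 256 MB: 32 MB clusters, 4 initial + 4 added
TEST_MB = 16 * (4 + 2)       # 96 MB: 16 MB clusters, 4 initial + 2 added

good_s = scale_threshold(6, REFERENCE_MB, TEST_MB)
acceptable_s = scale_threshold(10, REFERENCE_MB, TEST_MB)

print(f"96 MB test: good under {good_s} s, acceptable under {acceptable_s} s")
```

Under that assumption, the 6-second and 10-second marks for the 256 MB test correspond to 2.25 and 3.75 seconds for the 96 MB test.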

  • To be sure we're talking about the same thing:

    _proutil DB -C truncate bi -bi 16384

    time _proutil DB -C bigrow 2 [-zextendSyncIO in 11+]

    This will create a 96 MB BI file.

    10 seconds for a decent SAN.

    0.5 seconds on an SSD.

    Anything more than 10 seconds and I start to break into hives.

    Paul Koufalis
    White Star Software

  • Rob: I routinely see 10-ish seconds on "Enterprise" SANs to create a 96 MB BI file.

    Paul

    Paul Koufalis
    White Star Software

  • Results of the similar tests on the customer's hardware:

    12:16:40 Writing 12.5 MB in unbuffered (O_DSYNC) mode by 1 threads (biWriteTest).
    12:16:45 real	0m3.62s
    12:16:45 3.45 MB/sec
    
    12:18:18 Writing 12.5 MB in unbuffered (O_DSYNC) mode by 32 threads (biWriteTest).
    12:18:44 real	0m1.58s
    12:18:45 2.57 MB/sec
    
    ---
    
    23:00:26 Writing 12.5 MB in unbuffered (O_DSYNC) mode by 1 threads (biWriteTest).
    23:00:27 real	0m0.80s
    23:00:27 15.62 MB/sec
    
    23:00:29 Writing 12.5 MB in unbuffered (O_DSYNC) mode by 64 threads (biWriteTest).
    23:00:30 real	0m0.23s
    23:00:30 17.12 MB/sec
    
    ---
    
    13:53:38: Writing 100 MB in unbuffered (O_DSYNC) synchronous mode by 1 threads (bigrow).
    13:53:52: real 0m14.116s
    13:53:52: 7,08 MB/sec
    
    13:53:52: Writing 100 MB in unbuffered (O_DSYNC) synchronous mode by 2 threads (bigrow).
    13:54:03: real 0m10.938s
    13:54:03: 9,14 MB/sec
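
As a sanity check, the MB/sec figures can be recomputed from size and elapsed time. This sketch covers the two single-thread biWriteTest runs and the two bigrow runs, where the reported rate matches size divided by the "real" time. The 32- and 64-thread biWriteTest lines are left out because it is not clear from the log how their times were aggregated.

```python
# (label, size in MB, "real" seconds, reported MB/sec) copied from the results above.
runs = [
    ("biWriteTest, 1 thread, 12:16", 12.5, 3.62, 3.45),
    ("biWriteTest, 1 thread, 23:00", 12.5, 0.80, 15.62),
    ("bigrow, 1 thread", 100.0, 14.116, 7.08),
    ("bigrow, 2 threads", 100.0, 10.938, 9.14),
]

computed_rates = []
for label, size_mb, real_s, reported in runs:
    computed = size_mb / real_s
    computed_rates.append(computed)
    print(f"{label}: computed {computed:.2f} MB/sec, reported {reported:.2f} MB/sec")
```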
    
  • Testing on a customer site where we're seeing very high I/O response fluctuations in ProTop, and it's taking 10-13 seconds to create the 96 MB file. The whole site has been plagued by various performance issues, so this may well be part of the problem.

  • Using the BIGROW command to grow 6 x 16 MB clusters makes a 96 MB file. Divide the file size by the elapsed time to get the physical write speed. This helps to determine the correct BI Cluster Size setting. IMO cluster formatting should not take more than 2 seconds, because anything longer would be a noticeable pause for application users.

    So when Paul breaks out in hives (96 MB in 10 seconds), the physical write speed is 9.6 MB per second. Two seconds at that speed is about 19.2 MB, so the largest BI Cluster size in this case would be roughly 20 MB (I am a fan of powers of 2, so I would make it 16 MB or 24 MB).
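
The sizing rule in this reply can be sketched as follows, taking the 2-second pause budget from the text. The function names are hypothetical, and rounding down to a power of two is just one way to apply the stated preference.

```python
def max_cluster_mb(write_speed_mb_s: float, max_pause_s: float = 2.0) -> float:
    """Largest cluster that can be formatted within the pause budget."""
    return write_speed_mb_s * max_pause_s


def largest_power_of_two_at_most(limit_mb: float) -> int:
    """Largest power of two (in MB) that does not exceed limit_mb."""
    if limit_mb < 1:
        raise ValueError("limit must be at least 1 MB")
    size = 1
    while size * 2 <= limit_mb:
        size *= 2
    return size


speed = 96 / 10                 # 96 MB written in 10 seconds
limit = max_cluster_mb(speed)   # about 19.2 MB within 2 seconds
print(f"write speed: {speed:.1f} MB/sec, cluster limit: {limit:.1f} MB")
print(f"power-of-two cluster size: {largest_power_of_two_at_most(limit)} MB")
```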

    In rare cases where the update activity requires a larger BI Cluster size than the physical write speed can support, you can pre-grow many BI clusters, typically to 2 times the normal high water mark, to make sure there is no unplanned growth or pauses. The downside is the extra time the online backup needs to back up the BI file, unless you are on 11.5 or higher.

  • When I say high fluctuations...

  • That's bad, James.

    This is a "not great" IBM SAN and it's still 10X faster than your client. That spike on the left is the backup.

    This is local flash at another customer that actually cares about disk I/O:

    Paul Koufalis
    White Star Software
