Performance degradation between / /Power8

  • I don't recall your ever posting what the actual throughput is.

    How many record reads per second does this single threaded process get
    before it tops out?

    On modern AIX a single threaded job /should/ be able to read around
    200,000 records per second on a well configured system that is not IO
    bound. Obviously an IO bound process will not do nearly that well. But
    if you have SSDs that can deliver tens of thousands of IO ops per
    second you shouldn't be crazily off the mark.

    --
    Tom Bascom
    tom@wss.com

  • @Tom: The throughput was very low (<30 MB/s) with a 3 ms response time between 8k reads. I don't have the database statistics.

    My point is that we have 8 database data LUNs that are striped with a queue depth of 128 each, so we can support 1024 queued IOs simultaneously. There are four 16Gbps HBA ports to the SAN for a total real bandwidth of 6.4 GB/s and a command depth of 1024 per port for 4096 simultaneous IOs. Yet only a single IO is ever dispatched by the process at a time.
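
    Purely as an illustration of that point (this is not how the Progress
    client is implemented), a minimal Python sketch of why a synchronous,
    single-threaded reader can never use that queue depth; the file path
    and offsets here are hypothetical:

        import os, time
        from concurrent.futures import ThreadPoolExecutor

        PATH = "/db/data.d1"   # hypothetical data extent
        BLOCK = 8192
        OFFSETS = [i * BLOCK for i in range(0, 100_000, 13)]  # scattered blocks

        fd = os.open(PATH, os.O_RDONLY)

        # Synchronous reader: exactly one IO in flight at any moment,
        # so a device queue depth of 1024 sits idle.
        t0 = time.time()
        for off in OFFSETS:
            os.pread(fd, BLOCK, off)
        print("serial:   %.0f IOPS" % (len(OFFSETS) / (time.time() - t0)))

        # Many requests in flight: now the queue depth actually matters.
        t0 = time.time()
        with ThreadPoolExecutor(max_workers=64) as pool:
            list(pool.map(lambda off: os.pread(fd, BLOCK, off), OFFSETS))
        print("parallel: %.0f IOPS" % (len(OFFSETS) / (time.time() - t0)))

        os.close(fd)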

  • RussellAdams

    @Tom: The throughput was very low (<30 MB/s) with a 3 ms response time between 8k reads. I don't have the database statistics.

    30MB/s? Either the DB is seriously mis-configured or there's a problem with the storage system configuration. 

    Before you continue on about asynch I/O, you need to post some specifics here: the DB version along with the DB server and client startup parameters. 

    If you don't know where those are, look in the db.lg file and get the list of entries from the last time the server was started. 

  • I think that I got your point. And, yes, that is true of Progress. It
    asks for IO one block at a time when it needs data from disk. Progress
    doesn't ask for data when it doesn't need any. And it doesn't try to
    guess what it will need next.

    I was under the impression that those fancy SANs have read-ahead features
    that detect a sequential access pattern automatically (many of your
    posts seem very focused on the sequential access capabilities of the
    system). Of course if the IO is random that won't help at all. And
    might even hurt. Did the IBM and EMC engineers look at how random the
    data requests are? I've been involved in some engagements where they
    took the time to look at that and were quite surprised.
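
    If someone can capture a trace of the block offsets the process asks
    for, a crude sequentiality check is easy to script. A minimal sketch,
    assuming such a trace is available (the sample numbers are made up):

        def sequential_fraction(offsets, block=8192):
            """Fraction of reads that hit the block immediately after
            the previous one -- a rough measure of sequentiality."""
            hits = sum(1 for a, b in zip(offsets, offsets[1:]) if b - a == block)
            return hits / max(len(offsets) - 1, 1)

        # Offsets captured from an OS or array trace tool (hypothetical):
        trace = [0, 8192, 16384, 901120, 909312, 24576]
        print("%.0f%% sequential" % (100 * sequential_fraction(trace)))

    A mostly-sequential trace means the array's read-ahead should be
    earning its keep; a mostly-random one means it can't.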

    The open question, in my mind, is -- is the Progress DB blocked and
    waiting on IO? Or is it asking for IO at a leisurely rate because the
    application is "doing stuff" (executing business logic). If the
    Progress DB is blocked on IO there should be some evidence of that.
    Reading through the thread there are a lot of reasons to think that it
    is more a case of "the application is doing stuff".

    It just isn't clear (to me) if the bottleneck is actually read IO or
    not. It sounds like when an IO is requested it happens quickly. I am
    hearing that it just isn't requested as fast as the system /could/ (in
    theory anyway) provide it. Which suggests to me that read IO is not the
    bottleneck.


    The application profiler is a really good way to find out where the
    application is spending time. If it is spending time doing things other
    than IO ops then it really doesn't matter very much how fast your SAN
    is. Has anyone given the profiler a try?
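
    The OpenEdge profiler works on the ABL side; purely as an analogy for
    what it tells you, here is a Python sketch where the stand-in names
    (read_from_db, business_logic) are invented for illustration:

        import cProfile, pstats, time

        def read_from_db():
            time.sleep(0.003)          # stand-in for a ~3ms record fetch (IO)

        def business_logic():
            sum(i * i for i in range(200_000))   # stand-in for "doing stuff"

        def job():
            for _ in range(100):
                read_from_db()
                business_logic()

        cProfile.run("job()", "prof.out")
        # If cumulative time piles up under business_logic rather than
        # read_from_db, a faster SAN will not make the job finish sooner.
        pstats.Stats("prof.out").sort_stats("cumulative").print_stats(5)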



    But assuming the application is actually only trying to read data as
    fast as it possibly can (in a single thread) and is not actually doing
    much of anything with that data, there are a few more nuggets that can
    maybe be teased out:

    3ms between reads works out to roughly 333 IO ops per second (at 8K
    per read that is well under the 30MB/sec quoted, so those two figures
    can't both describe this one stream). That isn't very impressive. I'm
    sure that your SAN can do better than that. Unless it happens to be
    just a single spindle of rotating rust -- in which case it is doing
    about as many IO ops as it can reasonably be expected to do.

    If the 8K blocks hold 100 records each (I'm making that up -- fewer is
    more likely) then, at best, you are getting about 33,000 records per
    second. Which is not very good, so I can understand being disappointed
    with performance.

    If the data in those records is NOT in the logical order that the
    application needs then the effectiveness of each IO operation isn't
    going to be very good and the useful throughput will be lower. Possibly
    much lower. (This is where a dump & load might help if it reorders the
    records to better fit the usage -- conversely a dump & load that ordered
    them badly will *hurt* performance for this purpose...)

    That all assumes that index blocks are staying in memory. Unless this
    is a fairly new OpenEdge release with cutting-edge code, every record
    find requires both an index block read and a data block read. If index
    blocks are also being read from disk you lose even more of your
    throughput.
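
    Putting those estimates together as arithmetic (every input here is an
    assumption, per the paragraphs above):

        # Back-of-envelope record throughput; all inputs are guesses.
        latency_s         = 0.003  # ~3ms per 8K read, as reported
        records_per_block = 100    # optimistic; fewer is more likely
        useful_fraction   = 0.5    # how well physical order matches logical order
        index_penalty     = 2      # index block + data block both read from disk

        iops = 1 / latency_s                           # ~333 reads/sec
        best = iops * records_per_block                # ~33,000 records/sec ideal
        real = best * useful_fraction / index_penalty  # ordering + index discount

        print(f"ideal: {best:,.0f} rec/s   discounted: {real:,.0f} rec/s")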


    But I really think profiling the app would be a lot more likely to
    identify the source of the problem.

    --
    Tom Bascom
    tom@wss.com

  • You say that you have had a Progress consultant.  Is this someone from Professional Services or one of the independent consultants?

    Consulting in Model-Based Development, Transformation, and Object-Oriented Best Practice  http://www.cintegrity.com

  • ChUIMonster


    This is where a dump & load might help if it reorders the
    records to better fit the usage -- conversely a dump & load that ordered
    them badly will *hurt* performance for this purpose...


    I love that train of thought. Just sayin'

    The consultant was Dan Foreman. Unfortunately I don't have time at the moment to consume the entire contents of this thread, but I can say that I think we found the problem (after 50 hours of travel for 32 hours of onsite consulting).

    The app in question generates dunning letters into a directory (on a NAS) that already has 250k+ letters sitting in it. Simple unix commands like find, grep, and ls are, at times, very very slow when "looking" into this directory.

    The DB itself has zero issues (that I can find) with latch contention, semaphore contention, locking contention, checkpoints, buffer cache hit ratio, et al. Having said that, the DB has not been dumped & loaded in more than 10 years, so fragmentation is really bad. But simple tests like bigrow show a really fast piece of kit, as the Brits say...
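
    If anyone wants to confirm that the directory itself is the drag, a
    quick Python timing sketch (the path is hypothetical). One pass with
    os.scandir just enumerates entries; anything that also stats every
    file -- as ls -l or find with date tests must -- multiplies the cost,
    and on a NAS each stat is a network round trip:

        import os, time

        LETTERS = "/nas/dunning"   # hypothetical directory with 250k+ letters

        t0 = time.time()
        names = [e.name for e in os.scandir(LETTERS)]            # enumerate only
        t1 = time.time()
        sizes = [e.stat().st_size for e in os.scandir(LETTERS)]  # + stat per file
        t2 = time.time()

        print(f"{len(names)} entries: list {t1 - t0:.1f}s, "
              f"list+stat {t2 - t1:.1f}s")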



  • So it boils down to: the db isn't busy, and IO is being requested at a leisurely rate because the app was "doing stuff".

    So there is both a NAS and a SAN in this setup?  That NAS wouldn't happen to be a "filer" would it?

    --
    Tom Bascom
    tom@wss.com