Index Efficiency: Unique or Not unique? - Forum - OpenEdge Development - Progress Community

Index Efficiency: Unique or Not unique?

 Forum

Index Efficiency: Unique or Not unique?

This question is not answered

If I recall correctly, there was a buzz a few versions ago, when Progress declared the ability to use more than 1 index to satisfy a query. Code like:

“for each cust where state = ‘OH’ by uid:”

“for each cust where city = ‘Chicago’ by state by uid:”

“for each cust where state = ‘OH’ and city = ‘Chicago’ by uid:”

“for each cust by state by city by uid:”

 

Would all be served by indexes:

Key1: state (non-unique)

Key2: city (non-unique

Key3: uid (unique, primary)

 

And this would avoid table scans and anything but efficient index scans. In (minimal) practice, I was never convinced/satisfied with the performance (at the time), and began following my own rule of making all indexes unique, by tacking the uid field(s) on the end. That seemed to do what I wanted (except of course in example #3 above, where I would have created a new index: “Key 4: state, city, uid”).

To make a long question short (if that’s still possible), what are the pros/cons and “best practices” today, regarding making all indexes unique?

TIA --

Jim Shepherd, Level 7

All Replies
  • The case for resolving queries using multiple indexes was overblown.

    The worst cases are those where people decided to try to define all of
    their indexes with just a single field. That simply doesn't work.

    In practice there are very few cases where multiple indexes are useful.
    At least as far as I have seen.

    In most actual business usage having a composite index with the
    appropriate components is much, much better.

    --
    Tom Bascom
    603 396 4886
    tom@greenfieldtech.com

    --
    Tom Bascom
    tom@wss.com

  • I don't see any value in making indices artificially unique by adding your uid field to it.

    The Index on state is simply not unique.

    Also for the compiler/dynamic query runtime index selection this won't make any difference - as you'd never specify and uid with a query for state and thus the index would always be considered like a non unique index when it comes to index selection/execution strategy.

    The rules of combining indices are still like those described here:

    knowledgebase.progress.com/.../000012195

    Architect of the SmartComponent Library and WinKit

    Consultingwerk Ltd.

  • One of the huge downsides of making your indexes unique is you make the indexes much larger than they need to be. This can cause performance issues all by itself depending on your table sizes/transaction volume.

    Plus like Tom said the value of multiple index usage (especially with sorting) is vastly overblown. There are of course exceptions... especially if you end up at a place where they drank the Kool-Aid and put a ton of single column indexes out there.

  • One benefit of artificially unique indexes is to provide a consistent sorting when displaying/iterating these records, so they could have some value...

  • Possibly, but in general a unique ID isn't going to mean anything to the end users or anything else observing the sorted results.

  • I'm not saying you should ever show the unique id to the user, just using it to provide a consistent sorting without showing it would do.

  • But what would the point be if you never showed it? I guess I am missing the value of sorting it by a hidden value.

  • The value would be that the results would always be the same, thus not confusing the user.  Admittedly, having the data sorted by a visible value tends to be the best way to do that, but I see a point in having a consistent order even when the records are not sorted by a visible value.

    Think of trying to compare a list from today with one from yesterday to see what has changed.  Easiest is if they are sorted by a visual value because one can focus on that field for additions and deletions.  But, if there is no such sensible value, then having them consistently sorted would be tons easier than having them arbitrarily ordered.

    Consulting in Model-Based Development, Transformation, and Object-Oriented Best Practice  http://www.cintegrity.com

  • I suppose so, but it still sounds like a bad design to me :)

  • Only on the question of should the primary unique key be natural or artificial. Whenever I made a recommendation it was the artificial key. Some thoughts: www.techrepublic.com/.../10-tips-for-choosing-between-a-surrogate-and-natural-primary-key

     Moreover this artificial key is not a sequence but a guid (you can assign the guid on whatever client-type without a connection to the appserver etc.). I would prefer the

    sequential guid www.codeproject.com/.../GUIDs-as-fast-primary-keys-under-multiple-database

    www.siepman.nl/.../ID-Sequential-Guid-COMB-Vs-Int-Identity-using-Entity-Framework.aspx

    as it gives beter performance, but we would need the clustered index in the rdbms for that.  

    --

    Kind regards,

    Stefan Houtzager

    Houtzager ICT consultancy & development

    www.linkedin.com/in/stefanhoutzager