Check for redundant indexes

Posted by tbergman on 01-Aug-2018 06:44

A long time ago I downloaded an ABL program that would scan your database schema and identify indexes (Indices?) that were redundant.

I have been unable to locate this and wonder if anyone has something that does this. 

Thanks

All Replies

Posted by 2356 on 17-Aug-2018 06:06

proutil command with idxcheck qualifier?

Posted by James Palmer on 17-Aug-2018 06:22

What do you mean by a redundant index? Is it one that's not used, or one that is rendered redundant due to other indexes that already exist? Or some other rule?

Posted by tbergman on 17-Aug-2018 07:21

I define a redundant index as one whose fields are already present in another index with the same leading components.

A typical example:

Index1

 SaleDate

 SalesPerson

Index2

 SaleDate

Index2 is redundant

Posted by George Potemkin on 17-Aug-2018 07:25

> I define a redundant index as one whose fields are already present in another index with the same leading components.

Then you're going to break a selection of the multiple indexes.

Posted by tbergman on 17-Aug-2018 07:49

That's a very good point about multiple index use. I had not considered this and we'll have to be careful before removing these indexes.

Posted by kirchner on 17-Aug-2018 08:01

> Then you're going to break a selection of the multiple indexes.

Can you elaborate on that George? I failed to see the problem here.

Posted by George Potemkin on 17-Aug-2018 08:12

> Can you elaborate on that George?

For example, if in addition to the indexes mentioned above there is Index3 on the SaleDep field then the query:

for each Sale where SaleDate eq "..." and SaleDep eq "..."

will use the two indexes: Index2 and Index3

But only if Index2 exists. Otherwise the query will use only one index (either Index1 or Index3 - depending from ...)

Posted by Ruanne Cluer on 28-Aug-2018 04:58

wouldn't it be a matter of collecting _indexstat to find unused / redundant indexes?

Posted by ChUIMonster on 28-Aug-2018 13:17

Run time data will tell you about indexes that are *unused* during the data collection period.  There can be indexes that only get used occasionally (month or year end closing for example) but which are still important.  You could easily miss usage of such indexes if you are not getting complete coverage of all the runtime data for a long enough period.

What Tom was looking for is indexes that are redundant in the sense that they are subsets of other indexes.  IOW they have duplicate leading fields and one is a subset of the other.

Both approaches have value -- even if an index is not redundant in the "duplicate fields" sense it might still be a candidate for removal if it is never actually used.  Of course it might be enforcing a uniqueness constraint too -- so lack of reading isn't quite enough justification to zap it -- but it is enough to justify thinking about zapping it :)

Compile xref would also be useful to check -- does any compiled code reference an index?  You could still miss dynamic queries but, again, the lack of references in compiled code would be a good reason to at last think about whether or not an index is really adding any value.

Posted by Dmitri Levin on 07-Sep-2018 12:57

I would limit the search for  *unused* indexes to only very active tables, i.e. tables with high _TableStat-update

Sort all tables by _TableStat-update and pick the top 20 to consider their indexes only. The rest do not matter.

I would not go with xref, though that is a way to go, because most applications have code that was never used or is never used any more. 

Also if application is provided to you by Application partner, consult with them. Index may be not used because you are not using the module. And after you remove the index, it may be re-created back by a software upgrade.

_indexstat is a good place to look.  I deleted some 10 big indexes that were never used 5 years ago. But to second what Tom wrote above, I had to re-create one of those 10 indexes at the year end. Because it is used in AR program that runs once a year :)

And finally, the is a similar problem to "unused" indexes -- indexes that are defined as unique that do not have to be defined as unique because other index already enforces uniqueness. That is a different story though.

Posted by Rob Fitzpatrick on 28-Sep-2018 15:33

Here is an attempt at this:

/*  redundantindexes.p
 *
 *  An index I is redundant if there exists an index I'
 *  on the same table that satisfies the following conditions:
 *  - the components of I are the leading components of I', in the same order
 *  - the sort order of the components of I (ASC or DESC) are the same in I'
 *  - it is not the case that I is unique and I' is non-unique
 *
 *  Rob Fitzpatrick
 *  09/28/2018
 */

define variable IndexString      as character no-undo.
define variable i                as integer   no-undo.
define variable Index-Name       as character no-undo.
define variable Component-String as character no-undo.
define variable Is-Unique        as logical   no-undo.

define temp-table ttFile no-undo
  field TableName as character format "x(32)"
  field IndexCount as integer
 index TableName
   TableName
.

define temp-table ttIndex no-undo
  field TableName       as character        format "x(32)"
  field IndexName       as character        format "x(32)"
  field ComponentString as character  format "x(80)"
  field IsUnique        as logical
 index ComponentString
  TableName
  ComponentString
.

define buffer bttIndex for ttIndex.

for each dictdb._file no-lock where _file._file-number > 0
                                and _file._file-number < 32000:
  i = 0.

  for each dictdb._index no-lock of _file:

    assign
      i = i + 1
      IndexString = ""
    .

    for each dictdb._index-field no-lock of _index:

      IndexString = IndexString + string( _index-field._field-recid ) + "_" +
        ( if _index-field._ascending then "A" else "D" ) + ",".

    end.

    create ttIndex.
    assign
      ttIndex.TableName       = _file._file-name
      ttIndex.IndexName       = _index._index-name
      ttIndex.ComponentString = IndexString
      ttIndex.Isunique        = _index._unique
    .

  end.

  create ttFile.
  assign
    ttFile.TableName  = _file._file-name
    ttFile.IndexCount = i
  .


end.

for each ttFile where ttFile.IndexCount > 1 with frame a:

  for each ttIndex where ttIndex.TableName = ttFile.TableName with frame a:

    for each bttIndex where ttIndex.TableName = bttIndex.TableName
                        and ttIndex.IndexName <> bttIndex.IndexName
                        and bttIndex.ComponentString begins ttIndex.ComponentString with frame a:

      if ttIndex.IsUnique and not bttIndex.IsUnique then next.

      display
        ttIndex.TableName
        ttIndex.IndexName @ Index-Name              format "x(32)"
        ttIndex.IsUnique  @ Is-Unique
        ttIndex.ComponentString @ Component-String  format "x(120)"
      .
      down with frame a.
      display
        bttIndex.IndexName @ Index-Name             format "x(32)"
        bttIndex.IsUnique  @ Is-Unique
        bttIndex.ComponentString @ Component-String format "x(120)"
       with width 200
      .

      down(1) with frame a.

    end.

  end.

end.

Feel free to critique.

Posted by George Potemkin on 29-Sep-2018 08:41

What to do with the word indexes (_Wordidx eq 1)? I'd compare _Idxmethod for I and I' - just in case. This would handle the word indexes as well (_Idxmethod eq "W").

Posted by SJProgress on 01-Oct-2018 01:25

In my database I get redundent messages about word indexes too (but this can be solved)

BUT biggest Problem is, that this reports redundant Indexes, even if it is a single-field index and a multi-field index starting with same field. But this is not redundant, as multi-field indexes will not be used for selecting multiple Indexes for a query. (still hoping this will change some day, but maybe this will not be fixed by Progress).

Posted by Rob Fitzpatrick on 01-Oct-2018 15:09

> What to do with the word indexes (_Wordidx eq 1)?

>BUT biggest Problem is, that this reports redundant Indexes, even if it is a single-field index and a multi-field index starting with same field. But this is not redundant, as multi-field indexes will not be used for selecting multiple Indexes for a query.

These points both relate to the problem definition.  I'm not sure that code will address this need 100%, nor that everyone will agree on one definition of a "redundant index".  Maybe the best that a program can do is make a list of candidates to be manually evaluated.

So what do people think is a definition of a "redundant index"?  Starting with what I wrote above:

What would you add or change in this definition to make it better?  

An index I is redundant if there exists an index I'
on the same table that satisfies the following conditions:
- the components of I are the leading components of I', in the same order
- the sort order of the components of I (ASC or DESC) are the same in I'
- it is not the case that I is unique and I' is non-unique

E.g.:

- either I and I' are both word indexes or neither are word indexes
- What about "abbreviated" indexes, where the two indexes share the same component list but one is abbreviated and the other is not?  Would they be non-redundant?

Posted by Rob Fitzpatrick on 04-Oct-2018 14:10

> What to do with the word indexes (_Wordidx eq 1)? I'd compare _Idxmethod for I and I' - just in case. This would handle the word indexes as well (_Idxmethod eq "W").

I'm not clear on the meaning of the _idxmethod values.  Values I have seen are B, C, M, N, W.  There might be others.

W is for word indexes; those records have _wordidx = 1.  All other _idxmethod values were in records where _wordidx = ?.

Dan Foreman's book says "B" is for single-component indexes.

Do we know the meaning of C, M, and N?

Posted by George Potemkin on 04-Oct-2018 14:22

> Dan Foreman's book says "B" is for single-component indexes.

> Do we know the meaning of C, M, and N?

"B": single-component index with a non-character field (including "default" index).

"C": multi-component index where all fields are not character ones.

"N": single-component index with character index field.

"M": multi-component index with at least one character index field.

"W": Word index (_Index._Wordidx = 1)

IOW, it's 2x2 matrix + one special case (word index):

(single- or multi-component) x (use a collation table or "native" sort weights)

Posted by Rob Fitzpatrick on 04-Oct-2018 15:40

Thanks George.  I had a feeling you might know.  :)

Posted by George Potemkin on 04-Oct-2018 16:08

I have a feeling there is something missed about _Idxmethod. That is why it could be a good idea to look after it. Just in case. Is it paranoia?  ;-)

This thread is closed