Clarification on logical in query

Posted by jquerijero on 08-Jan-2020 20:33

When writing a query, is there a performance difference between using

table.active = YES vs table.active
table.active = NO vs NOT table.active

Posted by Tim Hutchens on 09-Jan-2020 21:38

In my opinion, the biggest performance impact here is related to index selection. Given that there is an index with table.active as a leading component, what index selection does the COMPILE XREF show?

The XREF for each of the following shows the same index selection of an index with table.active as a leading component:

FOR EACH table WHERE table.active = YES NO-LOCK:

FOR EACH table WHERE table.active NO-LOCK:

FOR EACH table WHERE table.active = NO NO-LOCK:

The XREF for this query shows WHOLE-INDEX:

FOR EACH table WHERE NOT table.active NO-LOCK:

According to Dan Foreman, there is no performance difference between these constructs: pugchallenge.org/.../230_Indexing.ppt (slide 19)

FOR EACH table WHERE table.active = YES NO-LOCK:

FOR EACH table WHERE table.active NO-LOCK:

If there is a difference, I would attribute it to evaluating the equality expression as Torben mentioned.

For the NOT operator, it may have to do with NOT being treated more like a function than an operator as Tom pointed out. I can't find an Order of Operations reference, but that may be the reason the index engine won't deal with the NOT table.active construct.

So I think the main recommendations are to make sure there is an appropriate index for the query that includes the logical field; and to not use the NOT operator on logical fields in queries. After that, test performance of the other constructs against the table you are working on. Not sure what impact the presence of null values may have on performance.

All Replies

Posted by Neil Treeby on 08-Jan-2020 20:37

Yes, it can be significant.  Always try to use:

table.active = YES

table.active = NO

Particularly true if table.active is part of an index; otherwise the clause will not be considered when the index is chosen.

Posted by Patrick Tingen on 09-Jan-2020 08:06

Although this is true, it makes one wonder, why the compiler cannot solve this for us ....

Posted by frank.meulblok on 09-Jan-2020 09:07

In the scenario given, I'd worry more about the functional difference between "table.active = table.active" (always true for all records) vs. "table.active = YES" (not always true) / "table.active = NOT table.active" (never true for any record) vs. "table.active = NO" (not always true).

Comparing a field to itself doesn't often make a lot of sense.

Also, comparisons on variables or table fields *do* get locked in when the WHERE clause is evaluated, and *are* picked up when indexes are selected. COMPILE XREF this:

DEFINE TEMP-TABLE stuff NO-UNDO
FIELD thing AS CHARACTER
FIELD active AS LOGICAL
INDEX thing IS UNIQUE thing
INDEX active active.

DEFINE TEMP-TABLE morestuff NO-UNDO
  FIELD active AS LOGICAL.
  
DEFINE VARIABLE icount AS INT NO-UNDO.
DEFINE VARIABLE lfindme AS LOGICAL NO-UNDO.


PAUSE 0 BEFORE-HIDE.

DO icount = 1 TO 100000:
  CREATE stuff.
  ASSIGN thing = GUID
         active = (RANDOM (20,200) MOD 2 = 0).
END.

CREATE morestuff.

ETIME(TRUE).
FOR EACH stuff WHERE stuff.active = lfindme:
 
   
END.
MESSAGE ETIME.

ETIME(TRUE).
FOR EACH stuff WHERE stuff.active = FALSE:
   
END.
MESSAGE ETIME.

ETIME(TRUE).
FOR EACH stuff WHERE stuff.active = morestuff.active:
 
   
END.
MESSAGE ETIME.

and see that all 3 FOR EACH'es use the same index.

User-defined functions and class methods are a different matter.

Posted by Torben on 09-Jan-2020 12:41

If there is an index on active then my small test show:

Best: table.active

Equal: table.active = YES; table.active = NO (~ 10%)

Worst: NOT table.active (> 200%)

If there is not an index on active then:

Best: table.active; NOT table.active

Worst: table.active = YES; table.active = NO (~ 10%)

Posted by Thomas Mercer-Hursh on 09-Jan-2020 15:33

Frank, I believe the OP was asking about 1) comparing the field to a logical vs 2) just testing the field, not comparing the field to itself.  One might note that some of these tests could produce different result sets were it possible for table.active to take the unknown value.

Posted by ChUIMonster on 09-Jan-2020 15:47

As I recall... the difference between NOT and field = NO comes down to NOT being treated as a function rather than as an operator.

Posted by frank.meulblok on 09-Jan-2020 15:49

It could be that I misread, yes. Hopefully @jquerijero can elaborate ?

Posted by jquerijero on 09-Jan-2020 15:53

[quote user="Torben"]

If there is an index on active then my small test show:

Best: table.active

Equal: table.active = YES; table.active = NO (~ 10%)

Worst: NOT table.active (> 200%)

If there is not an index on active then:

Best: table.active; NOT table.active

Worst: table.active = YES; table.active = NO (~ 10%)

[/quote]

If this is true and since checking for negative is rare, it sounds like direct inspection of the field is better.

Posted by Torben on 09-Jan-2020 16:27

My test was on version 12.1 windows gui 64 bit version. And the difference between 'table.active' and 'table.active = YES' seems to be the overhead for evaluating equality expression.

Posted by ske on 09-Jan-2020 17:13

> "table.active = NOT table.active" (never true for any record)

But this expression CAN be true, if the field contains the unknown value.

(And I suppose that must have some implications for index searches too.)

Posted by jquerijero on 09-Jan-2020 20:31

[quote user="frank.meulblok"]

It could be that I misread, yes. Hopefully @jquerijero can elaborate ?

[/quote]

It's writing a query involving a logical field. Which of the following constructs are better?

FOR EACH table WHERE table.active = YES NO-LOCK:
or
FOR EACH table WHERE table.active NO-LOCK:

FOR EACH table WHERE table.active = NO NO-LOCK:
or
FOR EACH table WHERE NOT table.active NO-LOCK:

Posted by Tim Hutchens on 09-Jan-2020 21:38

In my opinion, the biggest performance impact here is related to index selection. Given that there is an index with table.active as a leading component, what index selection does the COMPILE XREF show?

The XREF for each of the following shows the same index selection of an index with table.active as a leading component:

FOR EACH table WHERE table.active = YES NO-LOCK:

FOR EACH table WHERE table.active NO-LOCK:

FOR EACH table WHERE table.active = NO NO-LOCK:

The XREF for this query shows WHOLE-INDEX:

FOR EACH table WHERE NOT table.active NO-LOCK:

According to Dan Foreman, there is no performance difference between these constructs: pugchallenge.org/.../230_Indexing.ppt (slide 19)

FOR EACH table WHERE table.active = YES NO-LOCK:

FOR EACH table WHERE table.active NO-LOCK:

If there is a difference, I would attribute it to evaluating the equality expression as Torben mentioned.

For the NOT operator, it may have to do with NOT being treated more like a function than an operator as Tom pointed out. I can't find an Order of Operations reference, but that may be the reason the index engine won't deal with the NOT table.active construct.

So I think the main recommendations are to make sure there is an appropriate index for the query that includes the logical field; and to not use the NOT operator on logical fields in queries. After that, test performance of the other constructs against the table you are working on. Not sure what impact the presence of null values may have on performance.

Posted by Shao.Chan on 10-Jan-2020 09:05

If I recall correctly, Gus weighed in on this about 20 years ago.  Part of it depends on whether it is a mandatory logical field.  If not, the unknown value has to be taken into consideration and thus NOT means = TRUE OR = ?.  I am unsure how much the function makes a difference.  I understand if the logical is made mandatory, the performance gap narrows as ? cannot be a possibility (I am unsure of that though).  Either way = TRUE and = FALSE are the best as they are equality matches and NOT is potentially a multi-equality match.

This thread is closed