I think I'm running a 100% identically query, one with SQL and the other in ABL.
TABLE has Index for LastChanged and FIELD. Both contains one field only
select * from TABLE WHERE LastChanged < '2019-01-03 18:17:31:491 + 01:00' and FIELD < 1000 order by LastChanged desc (limited to 1000 records by SQL tool used -> Squirrel)>> This uses the index of FIELD -> very bad because the order is by LastChanged!!
FOR EACH TABLE WHERE LastChanged < DATETIME-TZ('2019-01-03 18:17:31:491 + 01:00') AND FIELD < 1000 by LastChanged desc>> This uses the index of LastChanged -> result very fast
What is the problem that the SQL query uses the wrong index and therefor runs endless on a large table (like fullscan)?
Architect of the SmartComponent Library and WinKit
The SQL query processor uses completely different algorithms for deciding which indexes to use than the 4GL does. Among other things, it uses statistics of the number of rows, key distributions, and other info in its choices.
You can find out what the query tree looks like by querying the _Sql_Qplan table. its contents are slightly difficult to understand at firsat, but not too bad when you get used to it.
Was the sql command issued from the ABL or did it connect to OE SQL
Thanks for your suggentions, very appreciated.
I ran the SQL with the JDBC tool Squirrel and after my post i tried statistics as Mike mentioned too
UPDATE INDEX STATISTICS
Then the Index for LastChanges was used and I got the results in the same time.
"The SQL query processor uses completely different algorithms for deciding which indexes to use than the 4GL does. Among other things, it uses statistics of the number of rows, key distributions, and other info in its choices."
As a SQL dummy I would prefer to ignore statistics and use the same logic as ABL does
This means SQL Performance is poor without running the maintenance commands?
Shall i run this one time a week (maybe Sunday without users)?
UPDATE TABLE STATISTICS AND INDEX STATISTICS AND ALL COLUMN STATISTICS;
Please find the responses below.
>> This means SQL Performance is poor without running the maintenance commands?
In general yes, since SQL evaluates different execution plans for a statement and selects the one with lease cost. So, without statistics, cost of all execution plans might look alike.
However, starting from OE 12.0, SQL will be introducing a feature called "Autonomous Update Statistics" (AUS), which updates the statistics without user intervention.
>> Shall i run this one time a week (maybe Sunday without users)?
In general, statistics should be updated as and when good amount of data changes in tables. If it happened to be the case that, only 5% of total tables has data changes, then only those tables requires new statistics.
>> UPDATE TABLE STATISTICS AND INDEX STATISTICS AND ALL COLUMN STATISTICS;
Above command updates statistics for all tables, indexes and columns present in the database. Below command allows you to update statistics for specific table.
UPDATE TABLE STATISTICS AND INDEX STATISTICS AND ALL COLUMN STATISTICS FOR <table_name>;
SQL stats can be maintained automatically in OE release 12.0 by a special server process.
I ran now the statistic command : UPDATE TABLE STATISTICS AND INDEX STATISTICS AND ALL COLUMN STATISTICS;
Then I have this SQL query, not very difficult:
SELECT field0 from table WHERE field1 = 201903AND field2 = 'XXX'AND field3 = 999AND field4 = 'USD'AND (field5 = 6100 OR field5 = 6200)
Indexes: One for field 1, one for field2, one for field 3, one for field5 and many others
I started the query with my JDBC tool and I'm waiting still, protop shows it's using index for field3.
But field1 reduces the amount of records dramatically.
The same in ABL displays the values at once! Really, at once!Using 3 Indexes of field1, field2 and field3
Meanwhile I got the result:Query 1 of 1, Rows read: 42, Elapsed time (seconds) - Total: 1,649.513, SQL query: 0.026, Reading results: 1,649.487Query Plan Shows the same as I got from protop (vst): Only one index is used, a bad one not efficent enough
Clearly, SQL optimizer should have used index of field1, if it reduces number of DB reads. I see following two possibilities on why it did not pick that index.
1. Once we execute a query, the query plan (access path) will cached, so even after updating statistics, SQL might be using cached plan. In order for SQL to use new plan (based on statistics), we need to re-connect and run the query again.
2. There could be a bug in SQL optimizer, due to which it picked non-efficient index.
If above point #1 is not the reason, then, It would be appreciated (if possible), if you can share data definition (.df) and data (.d) for the above mentioned table. This might help us in understanding the issue and improve SQL access.
And on which version of OE that these queries are being exercised?
note that aside from the ordering available by the key. each index can have a radically different i/o cost, as much as 500 to 1 difference, depending on the blocks where the rows are stored as well as the (less important) seek time between index blocks.
this cost should be measured (probably again when significant changes happen to storage arrangement) on each deployed system.
>SQL stats can be maintained automatically in OE release 12.0 by a special server process.
Thank you very much Richard !
What is the threshold when that "special" server process decides to automatically rebuild statistics? Is that 5% changed? And will it be described in details in 12.0 documentation ?
I had some problems with that process in Oracle. When a new partition starts on the 1-st day of the month statistics that is automatically generated is off, because there is not enough data in a partition. Oracle has a way to copy statistics, so I was copying statistics from the previous month partition to overwrite the generated one on the 1st and 2d day of the month.
So there may be issues with automatically generated statistics ( like with autonomous driving cars)
like with autonomous driving cars
You mean ... safer than leaving it to humans? :)
Consulting in Model-Based Development, Transformation, and Object-Oriented Best Practice http://www.cintegrity.com
Hi Dmitri Levin,
>> What is the threshold when that "special" server process decides to automatically rebuild statistics? Is that 5% changed? And will it be described in details in 12.0 documentation ?
Threshold percentage will be configurable. Yes, details will be described in 12.0 documentation.