After migration to utf-8 our customer noted that the item table is sorted differently.
We need utf-8 because we have to be able to store different languages but we'd like to have the same sorting as in iso8859-1.
I know this is done by specifying the collation, but I can't find documentation explaining which collation table is which.
For example, I expected ICU-be.df to be for Belgium, but then I noticed ICU_48-be.df is for Belarus, so I suspect ICU-be.df is for Belarus as well.
Is there an overview table explaining the different collations and their uses?
What is the difference between the 48 collation tables and the others? Which are the more up-to-date versions?
Is there a table that matches iso8859, or a table that would sort [ ] (square brackets) in the same position as iso8859-1?
/* iso-8859-1 sorting */
/* utf-8 sorting */
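To make the bracket question concrete, here is a minimal Python sketch (not OpenEdge code) of where `[` falls when strings are ordered purely by their ISO-8859-1 byte values. The sample items and both function names are made up for illustration, and it is only an assumption that the old database ordering behaved roughly like this. For comparison, the default Unicode Collation Algorithm ordering places punctuation such as `[` before digits and letters, which may be the kind of change the customer noticed.

```python
# Hypothetical sample data standing in for the item table.
items = ["apple", "[spare]", "Zebra", "123", "banana"]


def latin1_byte_order(text: str) -> bytes:
    """Sort key: raw ISO-8859-1 byte values, case-sensitive."""
    return text.encode("latin-1")


def latin1_caseless_order(text: str) -> bytes:
    """Sort key: byte values with ASCII letters upper-cased.

    '[' is 0x5B, so it lands directly after 'Z' (0x5A).
    """
    return text.encode("latin-1").upper()


by_bytes = sorted(items, key=latin1_byte_order)
by_caseless = sorted(items, key=latin1_caseless_order)

print(by_bytes)     # brackets sit between upper-case and lower-case letters
print(by_caseless)  # brackets sit after all letters
```

In both byte-based orderings the bracketed item comes after the digits and after `Zebra`, never first.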
ICU-be is indeed Belarusian. Collation is per language, not per country, I believe. You can see if ICU-en_BE makes any difference, or ICU-UCA if you are currently using Basic.
ICU_48 is the more recent version; the ICU files without a version listed are version 24.
For the complete list, you can check DLC/prolang/readme.
Reply from Libor
I have replied on the forum, but for whatever reason it rejected the post, saying I need moderator approval.
ICU-be is indeed for Belarusian; it's per language, not per country. You can see if ICU-en_BE makes a difference, or ICU-UCA if you are using the default Basic.
ICU 48 is the more recent version; the non-versioned ICU files are version 24.
You can also check DLC/prolang/README.
I tried all tables in $DLC/prolang/utf8.
They all sort [ ] brackets the same way.
Is it possible to create our own collation table?
The documentation (https://documentation.progress.com/output/ua/OpenEdge_latest/index.html#page/dvint/modifying-openedge-collation-tables.html) states "If you need to modify an International Components for Unicode (ICU) collation, contact Progress Software Technical Support for assistance."
I have never tried it with Unicode, but I once needed to create a custom codepage that would compare characters exactly the same way as some other system, and it was not very difficult.
This helped me a lot:
In my case I didn't have to change the database collation or codepage, just the session collation with the -cpcoll parameter.
I also worked with Progress TS, and they gave me some good directions on what to do.
But yeah, there are no docs on how to do it for Unicode that I'm aware of.