Which utf-8 collation matches iso8859-1 sorting? Is there an overview of the existing collation tables? - Forum - OpenEdge RDBMS - Progress Community

This question is not answered

After migration to utf-8 our customer noted that the item table is sorted differently.

We need utf-8 because we have to be able to store different languages but we'd like to have the same sorting as in iso8859-1.

I know this is done by specifying the collation, but I can't find documentation explaining which collation table is which.

E.g. I expected ICU-be.df to be for Belgium, but then I noticed ICU_48-be.df is for Belarus; I suspect ICU-be.df is for Belarus as well.

Multiple questions:

Is there an overview table explaining the different collations and their uses?

What is the difference between the 48 collation tables and the others? Which are the more up-to-date versions?

Is there a table that matches iso8859-1, or a table that would sort [] (square brackets) in the same position as iso8859-1?

/* iso-8859-1 sorting */
EBRUBARROMA HT60
EBRUBARROMA HT65
EBRUBARROMA HT80
EBRUBARROMA [HT]


/* utf-8 sorting */
EBRUBARROMA [HT]
EBRUBARROMA HT60
EBRUBARROMA HT65
EBRUBARROMA HT80
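
To make the difference concrete, here is a small Python sketch (not ABL, and not how OpenEdge or ICU actually build their sort keys). It assumes the iso8859-1 order above is plain byte-value order, where `[` (0x5B) comes after `H` (0x48), and it uses a made-up `punctuation_first` key as a rough stand-in for Unicode-style collations, which rank punctuation before digits and letters. Both function names are hypothetical.

```python
items = [
    "EBRUBARROMA [HT]",
    "EBRUBARROMA HT80",
    "EBRUBARROMA HT60",
    "EBRUBARROMA HT65",
]


def latin1_byte_order(s):
    # Assumption: the old database compared strings by their iso8859-1 byte values.
    return s.encode("iso-8859-1")


def punctuation_first(s):
    # Hypothetical stand-in for a Unicode-style collation: every
    # non-alphanumeric character ranks before any digit or letter.
    return [(0 if not ch.isalnum() else 1, ch) for ch in s]


by_bytes = sorted(items, key=latin1_byte_order)
by_punct = sorted(items, key=punctuation_first)

print("byte order:")
for item in by_bytes:
    print("  " + item)

print("punctuation first:")
for item in by_punct:
    print("  " + item)
```

With these four items, the byte-order sort puts `[HT]` last, as in the iso-8859-1 listing, and the punctuation-first sort puts it first, as in the utf-8 listing.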
All Replies
  • ICU-be is indeed Belarusian. Collation is per language, not per country, I believe. You can see whether ICU-en_BE makes any difference, or ICU-UCA if you are currently using Basic.

    ICU_48 is the more recent version; the ICU tables without a version in the name are version 24.

    For the complete list you can check DLC/prolang/readme

  • Reply from Libor

    I have replied on the forum, but for whatever reason it rejected the post, saying I need moderator approval.

    ICU-be is indeed for Belarusian; it's per language, not per country. You can see whether ICU-en_BE makes a difference, or ICU-UCA if you are using the default Basic.

    ICU 48 is the more recent version; the non-versioned ICU files are version 24.

    You can also check DLC/prolang/README.

  • I tried all tables in $DLC/prolang/utf8.

    They all sort [] brackets the same way.

    Is it possible to create our own collation table?

    The documentation (https://documentation.progress.com/output/ua/OpenEdge_latest/index.html#page/dvint/modifying-openedge-collation-tables.html) states "If you need to modify an International Components for Unicode (ICU) collation, contact Progress Software Technical Support for assistance."

  • I never tried it with any Unicode collation, but I once needed to create a custom codepage that would compare characters exactly the same way as some other system, and it was not very difficult.

    This helped me a lot:

    documentation.progress.com/.../index.html

    In my case I didn't have to change the database collation or codepage, just the session collation with the -cpcoll parameter.

    I also worked with Progress TS, and they gave me some good directions on what to do.

    But yeah, no docs on how to do it for Unicode that I'm aware of.