Capturing and storing Chinese into progress DB - Forum - OpenEdge General - Progress Community

This question is not answered

Hi Guys

I have an issue with capturing Chinese text into the Progress DB. I followed the article linked below and it works; I just have trouble with the font display.

https://knowledgebase.progress.com/articles/Article/P108864

The client requires a copy-and-paste utility that lets him paste Chinese text into the necessary fields. This raises two problems:

1) How do I convert the Chinese ideographs to UTF-8 at the time of copy and paste?

2) How do I get the fonts installed to show the Chinese ideographs?

I can't change our current DB code page (this is a restriction).

Any guidance and help will be greatly appreciated.

OE 11.7

All Replies
  • What *is* the current DB code page?   Depending on what this is, what you ask could be impossible.

  • We converted everything from iso8859-x/125x code pages to UTF-8 many years ago (database, AppServer, client, source code, input files, output files, and so on).

    This works with Chinese (and with every other code page we have tested, except Turkish).

    Our source code, database, and input/output files all contain Arabic, Chinese, Japanese, Cyrillic, Western European, and other characters.

    To get characters to display correctly, you need to install and use a font that supports all the characters you want to display. Otherwise you will see either nothing or a box in place of the real character.

  • I think we need more information about the database and client configurations in order to help.

    What is the code page of the database, and which code pages (-cpinternal) are the clients using?

    I assume the clients are running on Windows. Is that correct?

    Please note that the Knowledge Base article you referred to was written for version 9, when our Unicode support was limited. In OE 11.7 it should not be necessary to convert the characters within the session; the conversion happens automatically, for example for a CP936 client connected to a UTF-8 database.
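    As a rough illustration of what such an automatic conversion amounts to, here is a minimal Python sketch (Python rather than ABL, and not a description of how OpenEdge implements it). The function name `cp936_to_utf8` is hypothetical; it simply re-encodes bytes from a CP936 client as UTF-8 for storage.

```python
def cp936_to_utf8(client_bytes: bytes) -> bytes:
    """Re-encode CP936 (GBK) bytes as UTF-8 (hypothetical helper)."""
    return client_bytes.decode("cp936").encode("utf-8")


client_bytes = "出".encode("cp936")          # what a CP936 client would send
stored_bytes = cp936_to_utf8(client_bytes)   # what a UTF-8 database would hold

# The same character occupies a different number of bytes in each code page.
print(len(client_bytes), len(stored_bytes))
```

    The point is only that both code pages contain the character, so the conversion is lossless in both directions.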

  • Hi Guys

    Sorry the codepage is iso8859-1

  • Elsworth Burmeister

     the codepage is iso8859-1

    In that case anything you do will only "work" by accident, because the ISO8859-1 code page does not contain any Chinese characters.

    Any end result that looks correct will be the product of two improper or missing code page conversions cancelling each other out (one on write to the database, one on read from the database). This can, and probably will, also affect any indexes on the character fields in question, interfering with sort order and leading to weird and unwanted side effects, especially because AFAIK most code pages that do support Chinese are double-byte, and ISO8859-1 is single-byte (the UTF-* Unicode encodings being the exception).

    You said you're restricted in changing the DB code page, but you really should get that restriction lifted or reconsidered.
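    The "two wrong conversions cancelling out" effect can be reproduced outside OpenEdge. The following is a minimal Python sketch, assuming the stored data is UTF-8 bytes that get treated as ISO8859-1 characters; Python's `latin-1` codec stands in for the database code page here, and OpenEdge's own conversion tables may behave differently.

```python
text = "出口"  # two Chinese characters

# Write path: UTF-8 bytes are stored as if each byte were one ISO8859-1 character.
stored = text.encode("utf-8").decode("latin-1")

# Read path: the same wrong assumption applied in reverse restores the original.
restored = stored.encode("latin-1").decode("utf-8")

print(restored == text)         # the round trip looks correct
print(len(text), len(stored))   # but the stored value has a different length
```

    Anything that inspects the stored value directly (length checks, substring operations, index ordering) sees six unrelated single-byte characters rather than two ideographs, which is why the result only looks right when both mistakes are applied together.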

  • > On Feb 5, 2019, at 9:20 AM, Elsworth Burmeister wrote:

    > Sorry the codepage is iso8859-1

    ISO-8859-1 is an 8-bit (single-byte) code page designed for Western Europe and the US. It does not work for Chinese characters without some sort of custom hack, plus a lot of programming to encode and decode that hack.

    You should use a multi-byte code page that can handle Chinese characters. Two that come to mind are UTF-8 and GB2312. Others are listed here:

    documentation.progress.com/.../openedge-support-for-multi-byte-code-pages.html

    Documentation on using multi-byte character sets is here:

    documentation.progress.com/.../using-multi-byte-code-pages.html
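    To make the coverage difference concrete, here is a small Python sketch (an illustration only, using Python's codecs rather than OpenEdge's conversion tables). The helper `can_encode` is a hypothetical name; it reports whether a code page has any representation for a character.

```python
def can_encode(ch: str, codepage: str) -> bool:
    """Return True if the code page can represent the character."""
    try:
        ch.encode(codepage)
        return True
    except UnicodeEncodeError:
        return False


for cp in ("iso8859-1", "gb2312", "utf-8"):
    print(cp, can_encode("出", cp))

# Byte lengths of the same character in the two multi-byte code pages.
print(len("出".encode("gb2312")), len("出".encode("utf-8")))
```

    ISO8859-1 has no slot for the ideograph at all, while GB2312 and UTF-8 both represent it, using two and three bytes respectively.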

    In older versions I think the Progress executables used a single-byte code page internally.

    This changed with 10.?, where the internal code page became multi-byte (this is NOT -cpinternal; I think it uses UTF-8).

    So when using an 8-bit code page, conversion happens from screen to internal and back from internal to screen (for example, for the € symbol).

    If the characters are not valid in the conversion tables, then the iso8859-1 conversion between screen and internal (UTF-8) will fail.

    Running

    MESSAGE
        CODEPAGE-CONVERT(CODEPAGE-CONVERT("€", "iso8859-1", "utf-8"), "utf-8", "iso8859-1") SKIP
        CODEPAGE-CONVERT(CODEPAGE-CONVERT("€", "iso8859-15", "utf-8"), "utf-8", "iso8859-15") SKIP
        ASC("€") SKIP
        CODEPAGE-CONVERT(CODEPAGE-CONVERT("出", "iso8859-1", "utf-8"), "utf-8", "iso8859-1") SKIP
        CODEPAGE-CONVERT(CODEPAGE-CONVERT("出", "iso8859-15", "utf-8"), "utf-8", "iso8859-15") SKIP
        ASC("出").

    with -cpinternal utf-8 gives:

    ?

    14844588

    ?

    ?

    15042490
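    The two numbers in that output equal the UTF-8 byte sequences of the characters read as one big-endian integer. The Python sketch below only checks that arithmetic and the code page coverage of €; it is not ABL, and `utf8_as_int` is a hypothetical helper name.

```python
def utf8_as_int(ch: str) -> int:
    """Interpret a character's UTF-8 bytes as a single big-endian integer."""
    return int.from_bytes(ch.encode("utf-8"), "big")


print(utf8_as_int("€"))   # 14844588, bytes E2 82 AC
print(utf8_as_int("出"))  # 15042490, bytes E5 87 BA

# The euro sign exists in ISO8859-15 but not in ISO8859-1.
print("€".encode("iso8859-15"))
try:
    "€".encode("iso8859-1")
except UnicodeEncodeError:
    print("€ is not in ISO8859-1")
```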

  • Torben

    In older versions I think the Progress executables used a single-byte code page internally.

    This changed with 10.?, where the internal code page became multi-byte (this is NOT -cpinternal; I think it uses UTF-8).

    So when using an 8-bit code page, conversion happens from screen to internal and back from internal to screen (for example, for the € symbol).

    If the characters are not valid in the conversion tables, then the iso8859-1 conversion between screen and internal (UTF-8) will fail.

    It happened in 10.0A in the Windows GUI client (prowin/prowin32) when it became a Unicode-compliant application. Data which is displayed on screen is converted to/from UTF-16 because that's what Windows uses natively for displaying Unicode characters.
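    For comparison, the same two characters look like this in UTF-8 and in UTF-16. This is a minimal Python sketch of the encodings themselves, assuming little-endian UTF-16 as used on Windows; it says nothing about the client's internals.

```python
for ch in ("€", "出"):
    utf8 = ch.encode("utf-8")
    utf16 = ch.encode("utf-16-le")
    # Both characters need three bytes in UTF-8 but one 16-bit unit in UTF-16.
    print(ch, utf8.hex(), len(utf8), utf16.hex(), len(utf16))
```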