Discussion:
[Foxgui-users] FOX Unicode question.
The Devils Jester
2008-04-15 12:47:50 UTC
I don't use double-wide strings anywhere in my FOX 1.6 application, and
haven't compiled in Unicode support. Yet Unicode seems to work flawlessly
on Linux platforms anyway (and doesn't work at all on Windows
platforms).

Why is this? Can X11 display Unicode with 1-byte chars? Is there a
way to get them to display in Windows without Unicode support?
--
If you make something that any idiot can use, only idiots will use it.
Jeroen van der Zijp
2008-04-15 13:26:19 UTC
In FOX, Unicode support means UTF8-encoded Unicode. When you compile
it on Windows (with -DUNICODE), the strings are converted from UTF8 to
wide-character strings prior to calling WIN32 functions.

On Linux, UTF8 is more or less the standard now, so interfacing with
the system is fairly painless.

Because of this, you can keep using 8-bit character strings in your own
code, and none of your software needs changing except in the few rare
places where you actually care about the content rather than just
punctuation and such.

UTF8 is a pretty neat invention:

- The first 128 code points (U+0000 through U+007F) are single-byte
ASCII. So all common punctuation, numbers, and so on will be
single-byte characters.

- You can recognize the start of a multi-byte character; this means you
can scan forward and backward, even from a random position in
the string.

- It is actually more compact in most cases than wide character
strings, for many European languages.

- It has no byte-order issues.

- It can be encoded/decoded with very little effort.

And many more advantages. It's too bad some systems picked wide characters
instead, because UTF8 is actually much easier to deal with in most common
scenarios.
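
To illustrate the "encoded/decoded with very little effort" point above, here is a minimal sketch in plain C++. The helper names utf8_append and utf8_decode are hypothetical and are not part of FOX; the decoder assumes well-formed input and does no validation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper (not a FOX API): append one code point as UTF-8.
void utf8_append(std::string& out, std::uint32_t cp) {
    assert(cp <= 0x10FFFF);                 // Unicode code space ends at U+10FFFF
    if (cp < 0x80) {                        // 1 byte: plain ASCII
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {                // 2 bytes: 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {              // 3 bytes: 1110xxxx + 2 continuation bytes
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {                                // 4 bytes: 11110xxx + 3 continuation bytes
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Hypothetical helper: decode the code point whose lead byte is at offset pos.
// Assumes valid UTF-8 and that pos really points at a lead byte.
std::uint32_t utf8_decode(const std::string& s, std::size_t pos) {
    unsigned char b = static_cast<unsigned char>(s[pos]);
    if (b < 0x80) return b;                           // ASCII
    int extra = (b >= 0xF0) ? 3 : (b >= 0xE0) ? 2 : 1; // number of continuation bytes
    std::uint32_t cp = b & (0x3F >> extra);           // payload bits of the lead byte
    for (int i = 1; i <= extra; ++i)
        cp = (cp << 6) | (static_cast<unsigned char>(s[pos + i]) & 0x3F);
    return cp;
}
```

With these, an ASCII letter takes one byte, U+00E9 takes two, U+20AC takes three, and code points above U+FFFF take four.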



- Jeroen
The Devils Jester
2008-04-15 13:48:08 UTC
So is UTF8 a compatibility layer, or just another way to implement Unicode?

I am having two issues, the first being Windows not liking Unicode:
to convert it to Unicode, I have to rewrite many, many lines of
code, and actually recreate some functions for certain DLLs that don't
like Unicode. Is there no UTF8 fix for this?

My second issue is with my "core" app (the non-FOX part). Unicode
paths in Linux (that were saved from my FOX app) will not load (or
at least display?) properly in my printf() and X11 drawing routines.

Jeroen van der Zijp
2008-04-15 14:38:47 UTC
Post by The Devils Jester
So UTF8 is a compatibility layer or just another way to implement Unicode?
It's a way to store and manipulate Unicode. UTF8 allows programs to
kind of "ease into" Unicode without major changes to data structures.
Post by The Devils Jester
I am having two issues, the first being Windows not liking Unicode:
to convert it to Unicode, I have to rewrite many, many lines of
code, and actually recreate some functions for certain DLLs that don't
like Unicode. Is there no UTF8 fix for this?
Not sure what you mean here. I can tell you only that my own experience
is that when using UTF8, there is very, very minimal impact on existing
code, since all you'd have to do is make sure your string manipulations
are 8-bit safe [in case of signed chars].
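
As an illustration of what "8-bit safe" can mean in practice, here is a minimal sketch in plain C++. The function names are hypothetical; the point is only that bytes of a multi-byte UTF8 sequence are negative when char is signed, so they should be converted to unsigned char before being compared or passed to the <cctype> functions.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: true if the byte lies outside the ASCII range,
// i.e. it belongs to a multi-byte UTF-8 sequence.
bool is_non_ascii(char c) {
    // Where char is signed, bytes >= 0x80 are negative, so a test such as
    // `c >= 0x80` is never true. Compare as unsigned char instead.
    return static_cast<unsigned char>(c) >= 0x80;
}

// Hypothetical helper: count ASCII letters without ever handing a negative
// value to std::isalpha (which would be undefined behavior).
std::size_t count_ascii_letters(const std::string& s) {
    std::size_t n = 0;
    for (char c : s) {
        unsigned char u = static_cast<unsigned char>(c);
        if (u < 0x80 && std::isalpha(u)) ++n;
    }
    return n;
}
```

The same caution applies to any code that uses a char as a table index.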

In a few cases, where the content matters, you will need to use FXString's
inc() and dec() instead of ++ and --, since characters are no longer one
byte.
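
For code that does not link against FOX, the idea behind stepping by character instead of by byte can be sketched in plain C++ as follows. The names utf8_inc, utf8_dec, and utf8_length are hypothetical stand-ins, not the FXString signatures, and the sketch assumes well-formed UTF8. It relies on the fact that continuation bytes always have the bit pattern 10xxxxxx.

```cpp
#include <cstddef>
#include <string>

// True for UTF-8 continuation bytes (bit pattern 10xxxxxx).
inline bool is_continuation(char c) {
    return (static_cast<unsigned char>(c) & 0xC0) == 0x80;
}

// Hypothetical stand-in for "inc": byte offset of the next character start.
std::size_t utf8_inc(const std::string& s, std::size_t pos) {
    if (pos < s.size()) ++pos;
    while (pos < s.size() && is_continuation(s[pos])) ++pos;
    return pos;
}

// Hypothetical stand-in for "dec": byte offset of the previous character
// start. Also works when pos is in the middle of a multi-byte character.
std::size_t utf8_dec(const std::string& s, std::size_t pos) {
    if (pos > 0) --pos;
    while (pos > 0 && is_continuation(s[pos])) --pos;
    return pos;
}

// Count characters (code points) by stepping with utf8_inc.
std::size_t utf8_length(const std::string& s) {
    std::size_t n = 0;
    for (std::size_t p = 0; p < s.size(); p = utf8_inc(s, p)) ++n;
    return n;
}
```

Because a character start can be recognized from any byte, utf8_dec can be called with an arbitrary offset and still lands on a character boundary.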
Post by The Devils Jester
My second issue is with my "core" app (the non-FOX part). Unicode
paths in Linux (that were saved from my FOX app) will not load (or
at least display?) properly in my printf() and X11 drawing routines.
They will if you go through FXDCWindow and FXFont. If you are curious,
look there to see how to display UTF8 strings. Depending on whether you
use Xft or XLFD fonts, you may have to convert to 16-bit characters [the
Xft layer actually has direct UTF8 API support].

When printing, it helps (1) to set your locale to xx_XX.UTF8 mode,
and (2) to run a terminal program which understands UTF8 (e.g.
rxvt-unicode or any of the modern terminal programs).
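
For the console side, a minimal sketch of the printing advice might look like this. The function name print_utf8_line is hypothetical. The sketch assumes the environment (LANG or LC_ALL) already names a UTF8 locale and that the terminal decodes UTF8; with "%s", printf() passes the bytes through unchanged, so the locale mainly matters to library routines that interpret multi-byte text, such as mbstowcs() or wide-character output.

```cpp
#include <clocale>
#include <cstdio>

// Hypothetical helper for a console program: print a UTF-8 string as-is.
// Returns the number of bytes written, or a negative value on error.
int print_utf8_line(const char* utf8) {
    // Adopt the locale named by the environment (for example en_US.UTF-8).
    // Called on every use only to keep the sketch short; normally this is
    // done once at program start.
    std::setlocale(LC_ALL, "");
    // "%s" copies the bytes unchanged; a UTF-8 terminal renders them.
    return std::printf("%s\n", utf8);
}
```

If the output still looks garbled, the terminal's encoding is a likely suspect rather than the string itself.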


Regards,

- Jeroen
The Devils Jester
2008-04-15 15:01:30 UTC
Post by Jeroen van der Zijp
Not sure what you mean here. I can tell you only that my own experience
is that when using UTF8, there is very, very minimal impact on existing
code, since all you'd have to do is make sure your string manipulations
are 8-bit safe [in case of signed chars].
I am talking about my Windows build. The Linux one is flawless. Is
there some code I can use to enable UTF8 in my Windows FOX?
Post by Jeroen van der Zijp
They will if you go through FXDCWindow and FXFont.
The key part of my comment was my "non-FOX app". I have a parent app
(GUI, FOX-based) and a child "core" app (pure console) that uses no
FOX but still needs to work with what the parent set up. I will look
into the source of those and maybe get an idea of how you did it, though.

Jeroen van der Zijp
2008-04-15 17:01:34 UTC
Post by The Devils Jester
Post by Jeroen van der Zijp
Not sure what you mean here. I can tell you only that my own experience
is that when using UTF8, there is very, very minimal impact on existing
code, since all you'd have to do is make sure your string manipulations
are 8-bit safe [in case of signed chars].
I am talking about my Windows build. The Linux one is flawless. Is
there some code I can use to enable UTF8 in my Windows FOX?
Compile with -DUNICODE, as I have already mentioned. FOX works exclusively
with UTF8 internally.

On Windows, we can either convert this to 16-bit wide characters when
interfacing with native Windows APIs (-DUNICODE), or assume we're only
dealing with ASCII (without -DUNICODE). Many native Windows APIs come
in both an 8-bit character and a 16-bit (UTF16) Unicode version.

Note that UTF16 is *also* a variable-length encoding of Unicode, due to the
fact that the number of code points in Unicode has outgrown the original
65536 code assignments over the years. There are now over 1,000,000 code
points reserved (not all are used, it is sparse...).

See http://www.unicode.org for more details...
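
The variable-length nature of UTF16 mentioned above can be sketched in plain C++. The helper name utf16_encode is hypothetical; it assumes the input is a valid code point and does not reject values in the surrogate range U+D800 through U+DFFF. On Windows itself, a real program would more likely call the Win32 function MultiByteToWideChar() with CP_UTF8 to convert whole strings.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: encode one code point as UTF-16 code units.
std::vector<std::uint16_t> utf16_encode(std::uint32_t cp) {
    assert(cp <= 0x10FFFF);                 // Unicode code space ends at U+10FFFF
    if (cp < 0x10000) {
        // Fits in the original 65536 code assignments: one 16-bit unit.
        return { static_cast<std::uint16_t>(cp) };
    }
    // Beyond that, split the remaining 20 bits into a surrogate pair.
    cp -= 0x10000;
    return { static_cast<std::uint16_t>(0xD800 | (cp >> 10)),     // high surrogate
             static_cast<std::uint16_t>(0xDC00 | (cp & 0x3FF)) }; // low surrogate
}
```

So U+20AC needs one 16-bit unit, while U+1F600 needs two, which is why UTF16 text cannot be indexed by character any more than UTF8 can.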
Post by The Devils Jester
Post by Jeroen van der Zijp
They will if you go through FXDCWindow and FXFont.
The key part of my comment was my "non-FOX app". I have a parent app
(GUI, FOX-based) and a child "core" app (pure console) that uses no
FOX but still needs to work with what the parent set up. I will look
into the source of those and maybe get an idea of how you did it, though.
What I was suggesting is (1) use the FOX APIs so those gory details are
taken care of for you, or (2) look into the FOX source code to see how it's
done when you write your own code.



Good luck,


- Jeroen
