OK, take a deep breath. Look at this page and tell me the world isn't crazy.
Say you want to talk to the world in Unicode, but you want to do
it quickly. Well, obviously you're going to draft C's atoi and friends
to convert numerals to your internal
integer type, right? That's great in theory, but when your code is
running on someone else's webserver that you know little about, things
might get a little tricky.
Haskell's FFI specifies that the functions in the CString module are
subject to the current locale, which renders them unpredictable on the
aforementioned webserver. I can imagine a numeral encoding that, e.g.,
strtol_l understands under today's locale setting but fails to
understand under tomorrow's. I don't think there are enough manpages in
all the world to clarify this problem.
Solution? Use integers only for internal purposes, like user
identifiers, render them in ASCII, and use Unicode strings for
everything else. Don't use the CString module, carefully unpack UTF-8
ByteStrings into Haskell Strings, and don't expect warp speed. If
you're (cough)
putting this stuff in a library, hope like hell your users don't try
anything too weird.
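
For the record, here is roughly what that solution might look like. It is a minimal sketch, not the one true way: it assumes the bytestring and text packages are available, leans on decodeUtf8' from Data.Text.Encoding rather than a hand-rolled decoder, and the names utf8ToString, parseAsciiNat and renderAscii are made up for illustration. Nothing in it consults the C locale.

```haskell
import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as BC
import qualified Data.Text as T
import Data.Text.Encoding (decodeUtf8')

-- Decode UTF-8 bytes into a Haskell String without touching the locale.
-- Malformed input yields Nothing instead of throwing an exception.
-- (Hypothetical helper name; decodeUtf8' is the real library function.)
utf8ToString :: B.ByteString -> Maybe String
utf8ToString bytes = case decodeUtf8' bytes of
  Left _     -> Nothing
  Right text -> Just (T.unpack text)

-- Parse a non-negative decimal integer written in ASCII digits only.
-- Empty input, signs, whitespace and non-ASCII numerals are all rejected.
parseAsciiNat :: B.ByteString -> Maybe Integer
parseAsciiNat bytes
  | B.null bytes        = Nothing
  | B.all isDigit bytes = Just (B.foldl' step 0 bytes)
  | otherwise           = Nothing
  where
    isDigit w  = w >= 48 && w <= 57            -- bytes for '0'..'9'
    step acc w = acc * 10 + fromIntegral (w - 48)

-- Render an integer as ASCII bytes; show on Integer emits only
-- ASCII digits and an optional leading minus sign.
renderAscii :: Integer -> B.ByteString
renderAscii = BC.pack . show
```

Returning Maybe everywhere is a deliberate choice in this sketch: input from the outside world gets rejected loudly rather than half-parsed, which seems like the safer default when you don't control the server.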
One day someone will resolve all the issues of implementing a proper Unicode I/O layer, and I will thank them for it.