Enable definition of Unicode conversion options on the compiler command line.

Three documented pre-processor variables can now be defined on the
compiler command line to avoid editing the FLTK source code. The
default values still apply unchanged.


git-svn-id: file:///fltk/svn/fltk/branches/branch-1.3@11404 ea41ed52-d2ee-0310-a9c1-e6b18d33e121
Albrecht Schlosser 2016-03-23 13:36:50 +00:00
parent 4bfabbd619
commit de74c82d42
2 changed files with 18 additions and 8 deletions


@@ -191,14 +191,14 @@ the following limitations:
 
 \section unicode_illegals Illegal Unicode and UTF-8 sequences
 
-Three pre-processor variables are defined in the source code that
+Three pre-processor variables are defined in the source code [1] that
 determine how %fl_utf8decode() handles illegal UTF-8 sequences:
 
 - if ERRORS_TO_CP1252 is set to 1 (the default), %fl_utf8decode() will
   assume that a byte sequence starting with a byte in the range 0x80
-  to 0x9f represents a Microsoft CP1252 character, and will instead
-  return the value of an equivalent UCS character. Otherwise, it
-  will be processed as an illegal byte value as described below.
+  to 0x9f represents a Microsoft CP1252 character, and will return
+  the value of an equivalent UCS character. Otherwise, it will be
+  processed as an illegal byte value as described below.
 
 - if STRICT_RFC3629 is set to 1 (not the default!) then UTF-8
   sequences that correspond to illegal UCS values are treated as
@@ -210,6 +210,10 @@ determine how %fl_utf8decode() handles illegal UTF-8 sequences:
   byte value is returned unchanged, otherwise 0xFFFD, the Unicode
   REPLACEMENT CHARACTER, is returned instead.
 
+[1] Since FLTK 1.3.4 you may set these three pre-processor variables on
+    your compile command line with -D"variable=value" (value: 0 or 1)
+    to avoid editing the source code.
+
 %fl_utf8encode() is less strict, and only generates the UTF-8
 sequence for 0xFFFD, the Unicode REPLACEMENT CHARACTER, if it is
 asked to encode a UCS value above U+10FFFF.


@@ -67,7 +67,9 @@
   to completely ignore character sets in your code because virtually
   everything is either ISO-8859-1 or UTF-8.
 */
-#define ERRORS_TO_ISO8859_1 1
+#ifndef ERRORS_TO_ISO8859_1
+# define ERRORS_TO_ISO8859_1 1
+#endif
 
 /*!Set to 1 to turn bad UTF-8 bytes in the 0x80-0x9f range into the
   Unicode index for Microsoft's CP1252 character set. You should
@@ -75,7 +77,9 @@
   available text (such as all web pages) are correctly converted
   to Unicode.
 */
-#define ERRORS_TO_CP1252 1
+#ifndef ERRORS_TO_CP1252
+# define ERRORS_TO_CP1252 1
+#endif
 
 /*!A number of Unicode code points are in fact illegal and should not
   be produced by a UTF-8 converter. Turn this on will replace the
@@ -83,7 +87,9 @@
   arbitrary 16-bit data to UTF-8 and then back is not an identity,
   which will probably break a lot of software.
 */
-#define STRICT_RFC3629 0
+#ifndef STRICT_RFC3629
+# define STRICT_RFC3629 0
+#endif
 
 #if ERRORS_TO_CP1252
 /* Codes 0x80..0x9f from the Microsoft CP1252 character set, translated
@@ -103,7 +109,7 @@ static unsigned short cp1252[32] = {
   (adding \e len to \e p will point at the next character).
   If \p p points at an illegal UTF-8 encoding, including one that
-  would go past \e end, or where a code is uses more bytes than
+  would go past \e end, or where a code uses more bytes than
   necessary, then *(unsigned char*)p is translated as though it is
   in the Microsoft CP1252 character set and \e len is set to 1.
   Treating errors this way allows this to decode almost any