author    pfg <pfg@FreeBSD.org>  2014-05-05 14:50:53 +0000
committer pfg <pfg@FreeBSD.org>  2014-05-05 14:50:53 +0000
commit    9fe5eca952a101650cad59dd5bb5256c130c962d
tree      e459c6f9d7671fac72a4c257ffe0a77eb7375a11 /lib/libc/locale
parent    0bbe9c267ffe4ce0f7ad36dc92d99ea5083be996
MFC r265095, r265167;

citrus: Avoid invalid code points.

The UTF-8 decoder should not accept byte sequences which decode to Unicode
code positions U+D800 to U+DFFF (UTF-16 surrogates). [1]

Contrary to the original OpenBSD patch, we do pass U+FFFE and U+FFFF; both
values are valid "non-characters" [2] and must be mapped through UTFs.

[1] http://www.cl.cam.ac.uk/~mgk25/unicode.html#utf-8
[2] http://www.unicode.org/faq/private_use.html

Reported by:    Stefan Sperling [1]
Thanks to:      jilles [2]
Obtained from:  OpenBSD

Diffstat (limited to 'lib/libc/locale')
-rw-r--r-- lib/libc/locale/utf8.c 7
1 files changed, 7 insertions, 0 deletions
diff --git a/lib/libc/locale/utf8.c b/lib/libc/locale/utf8.c
index 40f0e17..cffa241 100644
--- a/lib/libc/locale/utf8.c
+++ b/lib/libc/locale/utf8.c
@@ -203,6 +203,13 @@ _UTF8_mbrtowc(wchar_t * __restrict pwc, const char * __restrict s, size_t n,
 		errno = EILSEQ;
 		return ((size_t)-1);
 	}
+	if (wch >= 0xd800 && wch <= 0xdfff) {
+		/*
+		 * Malformed input; invalid code points.
+		 */
+		errno = EILSEQ;
+		return ((size_t)-1);
+	}
 	if (pwc != NULL)
 		*pwc = wch;
 	us->want = 0;
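
The effect of the added check can be illustrated outside libc with a minimal, standalone sketch. It is not the code in utf8.c: sketch_utf8_decode() is a hypothetical helper written for this note, it returns 0 on failure instead of setting errno, and the upper bound of U+10FFFF is an assumption that is not part of this change. Only the surrogate range test mirrors the hunk above.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of libc): decode one UTF-8 sequence from
 * the n bytes at s into *out. Returns the number of bytes consumed, or 0
 * if the input is malformed or truncated.
 */
static size_t
sketch_utf8_decode(const unsigned char *s, size_t n, uint32_t *out)
{
	uint32_t wch, lbound;
	size_t i, want;

	if (n == 0)
		return (0);

	/* The lead byte determines the length and the smallest legal value. */
	if (s[0] < 0x80) {
		want = 1; wch = s[0]; lbound = 0;
	} else if ((s[0] & 0xe0) == 0xc0) {
		want = 2; wch = s[0] & 0x1f; lbound = 0x80;
	} else if ((s[0] & 0xf0) == 0xe0) {
		want = 3; wch = s[0] & 0x0f; lbound = 0x800;
	} else if ((s[0] & 0xf8) == 0xf0) {
		want = 4; wch = s[0] & 0x07; lbound = 0x10000;
	} else
		return (0);
	if (n < want)
		return (0);

	/* Each continuation byte contributes six payload bits. */
	for (i = 1; i < want; i++) {
		if ((s[i] & 0xc0) != 0x80)
			return (0);
		wch = (wch << 6) | (s[i] & 0x3f);
	}

	if (wch < lbound)
		return (0);	/* redundant (overlong) encoding */
	if (wch >= 0xd800 && wch <= 0xdfff)
		return (0);	/* UTF-16 surrogate, as rejected by this commit */
	if (wch > 0x10ffff)
		return (0);	/* assumption of this sketch, not in the diff */

	*out = wch;
	return (want);
}
```

Under these assumptions, the bytes ED A0 80 (which would decode to U+D800) and ED BF BF (U+DFFF) are rejected, while the neighbours ED 9F BF (U+D7FF) and EE 80 80 (U+E000) are accepted, as are the non-characters EF BF BE (U+FFFE) and EF BF BF (U+FFFF) that the commit message says must still pass.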