Handling untrusted labels
Encoding labels often come from outside the program: HTTP headers, import job configuration, database metadata, or user selection. Treat those labels as untrusted input.
Validate before converting
// Check the label before any conversion work starts.
if (!polycpp::iconv_lite::encodingExists(label)) {
    throw polycpp::TypeError("unsupported encoding");
}
auto text = polycpp::iconv_lite::decode(bytes, label);
encode and decode also throw polycpp::TypeError for unsupported
labels, so explicit validation is mainly useful when you want to produce a
domain-specific error message before conversion starts.
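As a minimal sketch of the domain-specific error case, the check can be wrapped in a small helper. ImportError, requireSupportedLabel, and the source parameter are hypothetical names introduced for illustration; the exists callable stands in for polycpp::iconv_lite::encodingExists so that the sketch compiles on its own.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical domain error for an import pipeline (not part of the library).
struct ImportError : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// `exists` is any callable that returns true for supported labels; in real
// code it would presumably forward to polycpp::iconv_lite::encodingExists.
// `source` is a hypothetical name for where the label came from.
template <typename Exists>
std::string requireSupportedLabel(const std::string& label,
                                  const std::string& source,
                                  Exists exists) {
    if (!exists(label)) {
        // The raw label is left out of the message because it is untrusted.
        throw ImportError(source + ": unsupported encoding label");
    }
    return label;
}
```

A caller might pass a lambda that forwards to the library check and then hand the returned label to decode. Whether to echo the rejected label in the message is an application decision; this sketch omits it.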
Inspect resolution for logs and tests
auto info = polycpp::iconv_lite::inspectEncoding("ISO_8859-5:1988");
// info.canonical == "iso88595"
// info.converter == "iso88595"
Use inspectEncoding for diagnostics, not as a security boundary. The
supported/unsupported decision should use encodingExists or the conversion
exception.
Choose a policy for missing labels
When metadata omits the charset, pick an application-level default rather than guessing blindly. Common choices are:
default to UTF-8 for new protocols you control
require the import job to specify a label
use a per-partner or per-file-format default
reject the payload and ask for explicit metadata
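The choices above can be made explicit in code so that the fallback is a visible decision rather than an accident. The following is a sketch only: MissingLabelPolicy, resolveLabel, and sourceDefault are hypothetical names, and the "utf-8" spelling is an assumption about which label the converter accepts.

```cpp
#include <optional>
#include <stdexcept>
#include <string>

// Hypothetical policy type: what to do when metadata carries no charset.
enum class MissingLabelPolicy { DefaultToUtf8, UseSourceDefault, Reject };

// Returns the label to hand to the converter, or throws when the policy
// requires explicit metadata. `sourceDefault` is a hypothetical per-partner
// or per-file-format fallback.
inline std::string resolveLabel(const std::optional<std::string>& label,
                                MissingLabelPolicy policy,
                                const std::string& sourceDefault = "") {
    if (label && !label->empty()) {
        return *label;  // still untrusted: validate it before converting
    }
    switch (policy) {
        case MissingLabelPolicy::DefaultToUtf8:
            return "utf-8";  // assumed spelling of the UTF-8 label
        case MissingLabelPolicy::UseSourceDefault:
            if (!sourceDefault.empty()) return sourceDefault;
            break;  // no fallback configured: treat as a rejection
        case MissingLabelPolicy::Reject:
            break;
    }
    throw std::invalid_argument("charset label is required");
}
```

A label returned by this helper is only a candidate. It should still go through the same supported/unsupported check as any other label before conversion.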
Replacement characters
Invalid byte sequences and unrepresentable characters are replaced using the library’s current replacement defaults. For tests or compatibility with a specific integration, the defaults can be changed for future converters:
auto old = polycpp::iconv_lite::defaultCharUnicode();   // remember the current default
polycpp::iconv_lite::setDefaultCharUnicode("?");
auto text = polycpp::iconv_lite::decode(bytes, "gbk");  // invalid bytes decode to "?"
polycpp::iconv_lite::setDefaultCharUnicode(old);        // restore; skipped if decode throws
Keep replacement changes tightly scoped. Existing encoder and decoder objects keep their own state; create new converters after changing defaults.
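One way to keep the change scoped even when conversion throws is a small RAII guard. This is a sketch, not part of the library: ScopedDefault is a hypothetical helper, and it assumes the getter and setter exchange values compatible with std::string, which is not confirmed for defaultCharUnicode and setDefaultCharUnicode.

```cpp
#include <functional>
#include <string>
#include <utility>

// Hypothetical helper: applies a temporary default through `set` and puts the
// previous value back when the guard leaves scope, including on exceptions.
class ScopedDefault {
public:
    ScopedDefault(std::function<std::string()> get,
                  std::function<void(const std::string&)> set,
                  const std::string& temporary)
        : set_(std::move(set)), previous_(get()) {
        set_(temporary);
    }

    // Assumes the setter does not throw when restoring the old value.
    ~ScopedDefault() { set_(previous_); }

    ScopedDefault(const ScopedDefault&) = delete;
    ScopedDefault& operator=(const ScopedDefault&) = delete;

private:
    std::function<void(const std::string&)> set_;
    std::string previous_;
};
```

With the library, the getter and setter would presumably be lambdas wrapping defaultCharUnicode and setDefaultCharUnicode, with the converter created inside the guard's scope. The guard does not make the default thread-safe; other threads that create converters while it is active would see the temporary value.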