libgpac
Documentation of the core library of GPAC
UTF and Unicode-related functions.
Macros
#define GF_UTF8_FAIL 0xFFFFFFFF
Functions
u32 gf_utf8_wcstombs (char *dst, size_t dst_len, const unsigned short **srcp)
wide-char to multibyte conversion
u32 gf_utf8_mbstowcs (unsigned short *dst, size_t dst_len, const char **srcp)
multibyte to wide-char conversion
u32 gf_utf8_wcslen (const unsigned short *s)
wide-char string length
GF_Err gf_utf_get_string_from_bom (const u8 *data, u32 size, char **out_ptr, char **result, u32 *res_size)
returns a string from a string starting with a BOM
Bool gf_utf8_is_legal (const u8 *data, u32 size)
checks the validity of a UTF-8 string
Bool gf_utf8_reorder_bidi (u16 *utf_string, u32 len)
string bidi reordering
u32 utf8_to_ucs4 (u32 *ucs4_buf, u32 utf8_len, unsigned char *utf8_buf)
Unicode conversion from UTF-8 to UCS-4
Variables
static const u32 UTF8_MAX_BYTES_PER_CHAR = 4
This section documents the UTF functions of the GPAC framework.
Wide characters in GPAC are unsigned shorts; in other words, GPAC only supports the UTF-8 and UTF-16 encodings.
#define GF_UTF8_FAIL 0xFFFFFFFF
Error code returned on UTF-8 conversion errors.
u32 gf_utf8_wcstombs (char *dst, size_t dst_len, const unsigned short **srcp)
Converts a wide-char string to a multibyte string.
dst | multibyte destination buffer
dst_len | multibyte destination buffer size
srcp | address of the wide-char string. On return, it is set to the next character to convert in the input buffer if the destination ran out of space, or to NULL if the conversion completed.
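The srcp contract described above can be illustrated with a standalone sketch. This is not GPAC's implementation: sketch_wcstombs and SKETCH_UTF8_FAIL are hypothetical names, and the sketch assumes that the return value is the number of bytes written, that dst_len counts bytes including the terminator, and that an unpaired surrogate is reported with the failure code. None of these assumptions is confirmed by this page.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same value as the documented GF_UTF8_FAIL, redefined so that this
   sketch compiles without the GPAC headers. */
#define SKETCH_UTF8_FAIL 0xFFFFFFFFu

/* Hypothetical stand-in for gf_utf8_wcstombs (not GPAC code).
   Encodes the NUL-terminated UTF-16 string *srcp into dst, writing at most
   dst_len bytes including the terminator. Returns the number of bytes
   written (terminator excluded), or SKETCH_UTF8_FAIL on an unpaired
   surrogate. */
static uint32_t sketch_wcstombs(char *dst, size_t dst_len,
                                const unsigned short **srcp)
{
    assert(dst != NULL && srcp != NULL && *srcp != NULL);
    unsigned char *d = (unsigned char *)dst;
    const unsigned short *s = *srcp;
    size_t out = 0;

    if (dst_len == 0) return 0; /* no room even for the terminator */

    while (*s) {
        uint32_t cp = *s;
        size_t units = 1;

        if (cp >= 0xD800 && cp <= 0xDBFF) {
            /* High surrogate: the next unit must be a low surrogate. */
            uint32_t lo = s[1];
            if (lo < 0xDC00 || lo > 0xDFFF) return SKETCH_UTF8_FAIL;
            cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
            units = 2;
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            return SKETCH_UTF8_FAIL; /* low surrogate without a high one */
        }

        size_t need = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
        if (out + need + 1 > dst_len) {
            /* Destination full: terminate and report where to resume. */
            d[out] = 0;
            *srcp = s;
            return (uint32_t)out;
        }

        if (need == 1) {
            d[out++] = (unsigned char)cp;
        } else if (need == 2) {
            d[out++] = (unsigned char)(0xC0 | (cp >> 6));
            d[out++] = (unsigned char)(0x80 | (cp & 0x3F));
        } else if (need == 3) {
            d[out++] = (unsigned char)(0xE0 | (cp >> 12));
            d[out++] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
            d[out++] = (unsigned char)(0x80 | (cp & 0x3F));
        } else {
            d[out++] = (unsigned char)(0xF0 | (cp >> 18));
            d[out++] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
            d[out++] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
            d[out++] = (unsigned char)(0x80 | (cp & 0x3F));
        }
        s += units;
    }

    d[out] = 0;
    *srcp = NULL; /* whole input converted */
    return (uint32_t)out;
}
```

A caller following this pattern would check the result against the failure code first, then test whether the source pointer is NULL to decide if a second call with a fresh buffer is needed.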
u32 gf_utf8_mbstowcs (unsigned short *dst, size_t dst_len, const char **srcp)
Converts a multibyte string to a wide-char string.
dst | wide-char destination buffer
dst_len | wide-char destination buffer size
srcp | address of the multibyte character buffer. On return, it is set to the next character to convert in the input buffer if the destination ran out of space, or to NULL if the conversion completed.
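The reverse direction can be sketched the same way. The code below is an illustration only, not GPAC's implementation: sketch_mbstowcs, sketch_decode_one and SKETCH_UTF8_FAIL are hypothetical names, dst_len is assumed to count 16-bit units including the terminator, and the return value is assumed to be the number of units written. Malformed input (overlong forms, encoded surrogates, values above U+10FFFF) is rejected under the RFC 3629 rules, which may be stricter or looser than what GPAC does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same value as the documented GF_UTF8_FAIL, redefined for the sketch. */
#define SKETCH_UTF8_FAIL 0xFFFFFFFFu

/* Decodes one UTF-8 sequence at s into *cp. Returns the sequence length
   (1 to 4), or 0 if the bytes are not well-formed UTF-8. */
static size_t sketch_decode_one(const unsigned char *s, uint32_t *cp)
{
    size_t len;
    uint32_t min;

    if (s[0] < 0x80) { *cp = s[0]; return 1; }
    else if ((s[0] & 0xE0) == 0xC0) { len = 2; min = 0x80;    *cp = s[0] & 0x1F; }
    else if ((s[0] & 0xF0) == 0xE0) { len = 3; min = 0x800;   *cp = s[0] & 0x0F; }
    else if ((s[0] & 0xF8) == 0xF0) { len = 4; min = 0x10000; *cp = s[0] & 0x07; }
    else return 0;

    for (size_t i = 1; i < len; i++) {
        /* A NUL terminator fails this test, so the loop never reads past it. */
        if ((s[i] & 0xC0) != 0x80) return 0;
        *cp = (*cp << 6) | (s[i] & 0x3F);
    }
    /* Reject overlong forms, surrogates, and values past U+10FFFF. */
    if (*cp < min || *cp > 0x10FFFF || (*cp >= 0xD800 && *cp <= 0xDFFF)) return 0;
    return len;
}

/* Hypothetical stand-in for gf_utf8_mbstowcs (not GPAC code). */
static uint32_t sketch_mbstowcs(unsigned short *dst, size_t dst_len,
                                const char **srcp)
{
    assert(dst != NULL && srcp != NULL && *srcp != NULL);
    const unsigned char *s = (const unsigned char *)*srcp;
    size_t out = 0;

    if (dst_len == 0) return 0; /* no room even for the terminator */

    while (*s) {
        uint32_t cp;
        size_t len = sketch_decode_one(s, &cp);
        if (len == 0) return SKETCH_UTF8_FAIL;

        size_t need = cp >= 0x10000 ? 2 : 1; /* surrogate pair or single unit */
        if (out + need + 1 > dst_len) {
            /* Destination full: terminate and report where to resume. */
            dst[out] = 0;
            *srcp = (const char *)s;
            return (uint32_t)out;
        }
        if (need == 2) {
            cp -= 0x10000;
            dst[out++] = (unsigned short)(0xD800 + (cp >> 10));
            dst[out++] = (unsigned short)(0xDC00 + (cp & 0x3FF));
        } else {
            dst[out++] = (unsigned short)cp;
        }
        s += len;
    }

    dst[out] = 0;
    *srcp = NULL; /* whole input converted */
    return (uint32_t)out;
}
```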
u32 gf_utf8_wcslen (const unsigned short *s)
Gets the length, in characters, of a wide-char string.
s | the wide-char string
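Since wide characters are unsigned shorts here, a length of this kind is most naturally counted in 16-bit units. The sketch below makes that assumption explicit: sketch_wcslen is a hypothetical name, not GPAC code, and under this counting a character outside the Basic Multilingual Plane (stored as a surrogate pair) contributes two units.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for gf_utf8_wcslen (not GPAC code).
   Counts the 16-bit units that precede the terminating zero. */
static size_t sketch_wcslen(const unsigned short *s)
{
    assert(s != NULL);
    const unsigned short *p = s;
    while (*p) p++;
    return (size_t)(p - s);
}
```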
GF_Err gf_utf_get_string_from_bom (const u8 *data, u32 size, char **out_ptr, char **result, u32 *res_size)
Returns a string from data, converting UTF-16 to UTF-8 if needed.
data | the string or wide-char string
size | size of the data buffer
out_ptr | set to an allocated buffer if one is needed for the conversion; it shall be destroyed by the caller. Must not be NULL
result | set to the resulting string. Must not be NULL
res_size | set to the length of the resulting string. May be NULL
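The first step such a function has to take, recognising the byte-order mark, can be shown with a small standalone detector. This is a sketch under stated assumptions, not GPAC's code: the SketchBom enum and sketch_detect_bom are hypothetical names, only the standard UTF-8 and UTF-16 marks are recognised, and how GPAC treats data without a BOM is not stated on this page and is not modeled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical classification of the mark found at the start of a buffer. */
typedef enum {
    SKETCH_BOM_NONE,
    SKETCH_BOM_UTF8,     /* EF BB BF */
    SKETCH_BOM_UTF16_LE, /* FF FE */
    SKETCH_BOM_UTF16_BE  /* FE FF */
} SketchBom;

/* Identifies a byte-order mark at the start of data and stores its length
   in *bom_len, so the caller can skip it before reading the text. */
static SketchBom sketch_detect_bom(const unsigned char *data, size_t size,
                                   size_t *bom_len)
{
    assert(data != NULL && bom_len != NULL);

    if (size >= 3 && data[0] == 0xEF && data[1] == 0xBB && data[2] == 0xBF) {
        *bom_len = 3;
        return SKETCH_BOM_UTF8;
    }
    if (size >= 2 && data[0] == 0xFF && data[1] == 0xFE) {
        *bom_len = 2;
        return SKETCH_BOM_UTF16_LE;
    }
    if (size >= 2 && data[0] == 0xFE && data[1] == 0xFF) {
        *bom_len = 2;
        return SKETCH_BOM_UTF16_BE;
    }
    *bom_len = 0;
    return SKETCH_BOM_NONE;
}
```

With a UTF-16 result, the payload after the mark would still need byte-order handling and a wide-char to multibyte conversion, which is where an allocated output buffer such as out_ptr comes into play.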
Bool gf_utf8_is_legal (const u8 *data, u32 size)
Checks if a given byte sequence is a valid UTF-8 encoding.
data | the byte sequence buffer
size | the length of the byte sequence
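What "valid UTF-8" means can be made concrete with a standalone checker. The sketch below applies the RFC 3629 well-formedness rules (correct lead and continuation bytes, no truncated sequence, no overlong form, no encoded surrogate, nothing above U+10FFFF). sketch_utf8_is_legal is a hypothetical name, and GPAC's own check may accept or reject a different set of sequences.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for gf_utf8_is_legal (not GPAC code).
   Returns true if the size bytes at data form well-formed UTF-8. */
static bool sketch_utf8_is_legal(const unsigned char *data, size_t size)
{
    assert(data != NULL || size == 0);
    size_t i = 0;

    while (i < size) {
        unsigned char b = data[i];
        size_t len;
        uint32_t cp, min;

        if (b < 0x80) { i++; continue; } /* ASCII byte */
        else if ((b & 0xE0) == 0xC0) { len = 2; min = 0x80;    cp = b & 0x1F; }
        else if ((b & 0xF0) == 0xE0) { len = 3; min = 0x800;   cp = b & 0x0F; }
        else if ((b & 0xF8) == 0xF0) { len = 4; min = 0x10000; cp = b & 0x07; }
        else return false; /* stray continuation byte or invalid lead byte */

        if (size - i < len) return false; /* sequence cut off by end of buffer */

        for (size_t k = 1; k < len; k++) {
            if ((data[i + k] & 0xC0) != 0x80) return false;
            cp = (cp << 6) | (data[i + k] & 0x3F);
        }
        /* Overlong form, encoded surrogate, or beyond the Unicode range. */
        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return false;
        i += len;
    }
    return true;
}
```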
Bool gf_utf8_reorder_bidi (u16 *utf_string, u32 len)
Performs a simple reordering of words in the string based on each word's direction, so that glyphs are sorted in display order.
utf_string | the wide-char string
len | the length of the wide-char string
u32 utf8_to_ucs4 (u32 *ucs4_buf, u32 utf8_len, unsigned char *utf8_buf)
Unicode conversion from UTF-8 to UCS-4.
ucs4_buf | the UCS-4 buffer to fill
utf8_len | the length of the UTF-8 buffer
utf8_buf | the buffer containing the UTF-8 data
This code has been adapted from http://www.ietf.org/rfc/rfc2640.txt
Full Copyright Statement
Copyright (C) The Internet Society (1999). All Rights Reserved.
This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this paragraph are included on all such copies and derivative works. However, this document itself may not be modified in any way, such as by removing the copyright notice or references to the Internet Society or other Internet organizations, except as needed for the purpose of developing Internet standards in which case the procedures for copyrights defined in the Internet Standards process must be followed, or as required to translate it into languages other than English.
The limited permissions granted above are perpetual and will not be revoked by the Internet Society or its successors or assigns.
This document and the information contained herein is provided on an "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Acknowledgement
Funding for the RFC Editor function is currently provided by the Internet Society.
static const u32 UTF8_MAX_BYTES_PER_CHAR = 4
Maximum character size in bytes.