libcommonc++  0.7
UTF8Decoder Class Reference

A UTF-8 to UTF-16 string transcoder.

#include <UTF8Decoder.h++>

Public Member Functions

 UTF8Decoder ()
 Construct a new UTF8Decoder.
 
 ~UTF8Decoder ()
 Destructor.
 
int decode (const char **input, int *inputCountLeft, char16_t **output, int *outputCountLeft)
 Transcode UTF-8 data to UTF-16.
 
void setStopDecodingAtNulChar (bool stopDecodingAtNulChar)
 Specifies whether decoding should stop as soon as a NUL character is encountered in the input.
 
void reset ()
 Resets the decoder to an initial state.
 

Static Public Member Functions

static int decodedLength (const char *buf, int length, int maxLength=0)
 Calculate the transcoded length of the UTF-16 data, without actually transcoding the string.
 

Static Public Attributes

static const int STATUS_OK = 0
 A status indicating that all input has been successfully transcoded.
 
static const int STATUS_NEED_MORE_INPUT = -1
 A status indicating that more input must be supplied to complete the transcoding.
 
static const int STATUS_OUTPUT_BUFFER_FULL = -2
 A status indicating that there is not enough room in the output buffer to finish transcoding the input buffer.
 
static const int STATUS_INVALID_INPUT = -3
 A status indicating that transcoding cannot continue because invalid data was encountered in the input buffer.
 

Protected Member Functions

int outputChar (char32_t char32, char16_t **buf, int *length)
 Outputs a UTF-32 character as a single UTF-16 character or surrogate pair.
 

Detailed Description

A UTF-8 to UTF-16 string transcoder.

Since the String class provides facilities for converting text to/from UTF-8, this class will generally not need to be used directly.

Author
Mark Lindner

Constructor & Destructor Documentation

◆ UTF8Decoder()

Construct a new UTF8Decoder.

◆ ~UTF8Decoder()

Destructor.

Member Function Documentation

◆ decode()

int decode ( const char **  input,
int *  inputCountLeft,
char16_t **  output,
int *  outputCountLeft 
)

Transcode UTF-8 data to UTF-16.

Parameters
input: A pointer to a pointer to the input buffer containing the UTF-8 data. On return, the pointer will be incremented by the number of bytes consumed from input.
inputCountLeft: A pointer to the number of bytes remaining to be decoded from input. On return, the value will be decremented by the number of bytes consumed from input.
output: A pointer to a pointer to the output buffer where the UTF-16 data will be written. On return, the pointer will be incremented by the number of UTF-16 code units written to output.
outputCountLeft: A pointer to the number of elements remaining in output. On return, the value will be decremented by the number of elements that were written to output.
Returns
One of the status constants defined in UTFDecoder.
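
The in/out pointer convention described above lends itself to a chunked decoding loop. The following is a minimal sketch of such a loop, not code taken from the library. The `AsciiOnlyDecoder` type and the `decodeAll` helper are hypothetical names introduced only for illustration, and the status values are local copies of the documented constants. The stand-in decoder accepts only 7-bit input so that the example compiles without libcommonc++. Any object exposing the documented `decode()` signature, presumably including a `UTF8Decoder`, should fit the template, though that has not been verified here.

```cpp
#include <cassert>
#include <string>

// Local copies of the documented status values (assumed to match the library).
constexpr int STATUS_OK = 0;
constexpr int STATUS_NEED_MORE_INPUT = -1;
constexpr int STATUS_OUTPUT_BUFFER_FULL = -2;
constexpr int STATUS_INVALID_INPUT = -3;

// Hypothetical stand-in with the documented decode() signature.
// It handles 7-bit ASCII only and is NOT the library's decoder.
struct AsciiOnlyDecoder {
  int decode(const char **input, int *inputCountLeft,
             char16_t **output, int *outputCountLeft) {
    while (*inputCountLeft > 0) {
      unsigned char c = static_cast<unsigned char>(**input);
      if (c >= 0x80) return STATUS_INVALID_INPUT;
      if (*outputCountLeft == 0) return STATUS_OUTPUT_BUFFER_FULL;
      *(*output)++ = c;   // write one UTF-16 code unit and advance
      --*outputCountLeft;
      ++*input;           // consume one input byte and advance
      --*inputCountLeft;
    }
    return STATUS_OK;
  }
};

// Hypothetical helper: drains `text` through `decoder` using a small fixed
// output chunk, re-calling decode() whenever the chunk fills up.
template <typename Decoder>
int decodeAll(Decoder &decoder, const std::string &text,
              std::u16string &result) {
  assert(text.size() <= 0x7fffffff);  // counts are plain ints in this API
  const char *in = text.data();
  int inLeft = static_cast<int>(text.size());
  char16_t chunk[4];
  for (;;) {
    char16_t *out = chunk;
    int outLeft = 4;
    int status = decoder.decode(&in, &inLeft, &out, &outLeft);
    result.append(chunk, 4 - outLeft);  // keep whatever was written
    if (status != STATUS_OUTPUT_BUFFER_FULL) return status;
  }
}
```

A caller would construct a decoder, pass it to `decodeAll` together with the input string and an empty `std::u16string`, and inspect the returned status. A result of `STATUS_NEED_MORE_INPUT` would mean the input ended in the middle of a multi-byte sequence, so more bytes should be supplied before calling `decode()` again.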

◆ decodedLength()

int decodedLength ( const char *  buf,
int  length,
int  maxLength = 0 
)
static

Calculate the transcoded length of the UTF-16 data, without actually transcoding the string.

Parameters
buf: A pointer to the buffer containing the UTF-8 data.
length: The length of the buffer.
maxLength: If non-zero, indicates the maximum decoded length, in UTF-16 characters.
Returns
On success, the decoded length of the UTF-8 string, as a count of UTF-16 code units (each surrogate counts as one unit), not counting the NUL terminator. On failure, one of the status constants defined in UTFDecoder.
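
To illustrate how a UTF-16 length can be derived from UTF-8 data without transcoding, here is a simplified sketch. It is not the library's implementation: the function name `utf16UnitsFor` is hypothetical, the input is assumed to be well-formed UTF-8, and the `maxLength` limit and the error statuses are left out. It relies only on the fact that a 4-byte UTF-8 sequence needs a surrogate pair while shorter sequences need a single UTF-16 code unit.

```cpp
#include <cassert>

// Simplified illustration, NOT the library's implementation: counts the
// UTF-16 code units needed for a UTF-8 buffer assumed to be well-formed.
int utf16UnitsFor(const char *buf, int length) {
  assert(buf != nullptr && length >= 0);
  int units = 0;
  for (int i = 0; i < length; ++i) {
    unsigned char b = static_cast<unsigned char>(buf[i]);
    if ((b & 0xC0) == 0x80) continue;       // continuation byte adds nothing
    units += ((b & 0xF8) == 0xF0) ? 2 : 1;  // 4-byte lead => surrogate pair
  }
  return units;
}
```

A result computed this way can be used to size the output buffer before calling `decode()`, which is presumably the purpose of `decodedLength()` as well.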

◆ outputChar()

int outputChar ( char32_t  char32,
char16_t **  buf,
int *  length 
)
protected, inherited

Outputs a UTF-32 character as a single UTF-16 character or surrogate pair.

Parameters
char32: The character to output.
buf: The output buffer.
length: The remaining number of elements in the output buffer.
Returns
A status code.
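
The split into a single code unit or a surrogate pair follows the standard UTF-16 encoding rule. The sketch below shows that rule using the same parameter shape as `outputChar()`. The function `putUtf16` and the constants `kOk` and `kOutputBufferFull` are hypothetical. Which status values the real method returns is an assumption, and this sketch does not reject values above U+10FFFF or lone surrogates.

```cpp
#include <cassert>

// Assumed to mirror STATUS_OK and STATUS_OUTPUT_BUFFER_FULL from the docs.
constexpr int kOk = 0;
constexpr int kOutputBufferFull = -2;

// Hypothetical stand-in: writes `c` as one UTF-16 code unit or as a
// surrogate pair, advancing *buf and decrementing *length accordingly.
int putUtf16(char32_t c, char16_t **buf, int *length) {
  assert(buf && *buf && length);
  if (c < 0x10000) {
    if (*length < 1) return kOutputBufferFull;
    *(*buf)++ = static_cast<char16_t>(c);
    *length -= 1;
    return kOk;
  }
  if (*length < 2) return kOutputBufferFull;
  char32_t v = c - 0x10000;                                  // 20-bit value
  *(*buf)++ = static_cast<char16_t>(0xD800 + (v >> 10));     // high surrogate
  *(*buf)++ = static_cast<char16_t>(0xDC00 + (v & 0x3FF));   // low surrogate
  *length -= 2;
  return kOk;
}
```

For example, U+1F600 is written as the pair 0xD83D, 0xDE00, while U+0041 is written as the single unit 0x0041.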

◆ reset()

void reset ( )
virtual

Resets the decoder to an initial state.

Implements UTFDecoder.

◆ setStopDecodingAtNulChar()

void setStopDecodingAtNulChar ( bool  stopDecodingAtNulChar)
inline

Specifies whether decoding should stop as soon as a NUL character is encountered in the input.

By default this setting is off, and decoding continues until the entire input buffer has been consumed.

Member Data Documentation

◆ STATUS_INVALID_INPUT

const int STATUS_INVALID_INPUT = -3
static, inherited

A status indicating that transcoding cannot continue because invalid data was encountered in the input buffer.

◆ STATUS_NEED_MORE_INPUT

const int STATUS_NEED_MORE_INPUT = -1
static, inherited

A status indicating that more input must be supplied to complete the transcoding.

◆ STATUS_OK

const int STATUS_OK = 0
static, inherited

A status indicating that all input has been successfully transcoded.

◆ STATUS_OUTPUT_BUFFER_FULL

const int STATUS_OUTPUT_BUFFER_FULL = -2
static, inherited

A status indicating that there is not enough room in the output buffer to finish transcoding the input buffer.

