This namespace contains utilities relevant to the use of UTF-8 in programs.

Classes
class	ConversionError

class	Iterator
	A class which will iterate through a std::string object by reference to Unicode characters rather than by bytes.

class	Reassembler
	A class for reassembling UTF-8 strings sent over pipes and sockets so they form complete valid UTF-8 characters.

class	ReverseIterator
	A class which will iterate in reverse through a std::string object by reference to Unicode characters rather than by bytes.
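
The problem the Reassembler class addresses can be illustrated with a short sketch. A read from a pipe or socket may stop part-way through a multi-byte UTF-8 character, so the receiver has to hold back the incomplete tail until the rest arrives. The function below is a hypothetical helper written for this illustration, not the Reassembler interface or implementation, and it assumes the bytes before the tail are already well-formed UTF-8.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of c++-gtk-utils): returns the length of the
// longest prefix of 'buffer' that does not end in a partial multi-byte
// sequence. Assumes everything before the final character is well-formed.
std::size_t complete_prefix_length(const std::string& buffer) {
    const std::size_t n = buffer.size();
    // The lead byte of an incomplete sequence is at most 3 bytes from the end.
    for (std::size_t back = 1; back <= 3 && back <= n; ++back) {
        const unsigned char c = static_cast<unsigned char>(buffer[n - back]);
        if ((c & 0xC0) == 0x80) continue;  // continuation byte: keep looking
        // Found an ASCII or lead byte: work out how long its sequence should be.
        std::size_t expected = 1;
        if ((c & 0xE0) == 0xC0) expected = 2;
        else if ((c & 0xF0) == 0xE0) expected = 3;
        else if ((c & 0xF8) == 0xF0) expected = 4;
        // If fewer bytes are present than expected, cut before the lead byte.
        return (back < expected) ? n - back : n;
    }
    return n;  // empty buffer, or the last three bytes end a 4-byte sequence
}
```

A caller would pass on the first complete_prefix_length(buffer) bytes and keep the remainder to prepend to the next read.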

Functions
std::wstring	uniwide_from_utf8 (const std::string &input)

std::string	uniwide_to_utf8 (const std::wstring &input)

std::wstring	wide_from_utf8 (const std::string &input)

std::string	wide_to_utf8 (const std::wstring &input)

std::string	filename_from_utf8 (const std::string &input)

std::string	filename_to_utf8 (const std::string &input)

std::string	locale_from_utf8 (const std::string &input)

std::string	locale_to_utf8 (const std::string &input)

bool	validate (const std::string &text)

bool	operator== (const Iterator &iter1, const Iterator &iter2)

bool	operator!= (const Iterator &iter1, const Iterator &iter2)

bool	operator< (const Iterator &iter1, const Iterator &iter2)

bool	operator<= (const Iterator &iter1, const Iterator &iter2)

bool	operator> (const Iterator &iter1, const Iterator &iter2)

bool	operator>= (const Iterator &iter1, const Iterator &iter2)

bool	operator== (const ReverseIterator &iter1, const ReverseIterator &iter2)

bool	operator!= (const ReverseIterator &iter1, const ReverseIterator &iter2)

bool	operator< (const ReverseIterator &iter1, const ReverseIterator &iter2)

bool	operator<= (const ReverseIterator &iter1, const ReverseIterator &iter2)

bool	operator> (const ReverseIterator &iter1, const ReverseIterator &iter2)

bool	operator>= (const ReverseIterator &iter1, const ReverseIterator &iter2)

Detailed Description

#include <c++-gtk-utils/convert.h> (for conversion and validation functions)

#include <c++-gtk-utils/reassembler.h> (for Reassembler class)

See also: convert.h reassembler.h

This namespace contains utilities relevant to the use of UTF-8 in programs. For these functions to work, the program will generally need to have set the locale with either std::locale::global(std::locale("")) (from the C++ standard library) or setlocale(LC_ALL,"") (from the C standard library).
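
As a minimal sketch of the locale set-up described above, the following uses the C library route. The helper name adopt_environment_locale is hypothetical and is not part of the library; it only shows the call a program would typically make once at start-up, before using the conversion functions.

```cpp
#include <clocale>
#include <string>

// Hypothetical helper (not part of c++-gtk-utils): adopts the locale named by
// the environment (LANG, LC_ALL and related variables) and reports the locale
// that is in effect afterwards.
std::string adopt_environment_locale() {
    // An empty name selects the environment's locale. The call returns a null
    // pointer if that locale is not available, in which case the previous
    // locale stays in force.
    std::setlocale(LC_ALL, "");
    // A null second argument only queries the current setting.
    const char* current = std::setlocale(LC_ALL, nullptr);
    return current ? std::string(current) : std::string();
}
```

The C++ alternative, std::locale::global(std::locale("")), throws std::runtime_error if the environment names a locale that is not available, so a program choosing that route may want to catch the exception.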

Function Documentation

◆ filename_from_utf8()

std::string Cgu::Utf8::filename_from_utf8 ( const std::string & input )

Converts text from UTF-8 to the system's filename encoding.

Parameters

input Text in valid UTF-8 format.

Returns: The input text converted to filename encoding.

Exceptions

Cgu::Utf8::ConversionError	This exception will be thrown if conversion fails because the input string is not in valid UTF-8 format, or cannot be converted to filename encoding (e.g. because the input characters cannot be represented by that encoding).
std::bad_alloc	This function might throw std::bad_alloc if memory is exhausted and the system throws in that case.

Note: glib takes the system's filename encoding from the environment variables G_FILENAME_ENCODING and G_BROKEN_FILENAMES. If G_BROKEN_FILENAMES is set to 1 and G_FILENAME_ENCODING is not set, it will be assumed that the filename encoding is the same as the locale encoding. If G_FILENAME_ENCODING is set, then G_BROKEN_FILENAMES is ignored, and filename encoding is taken from the value held by G_FILENAME_ENCODING.

Since 0.9.2

◆ filename_to_utf8()

std::string Cgu::Utf8::filename_to_utf8 ( const std::string & input )

Converts text from the system's filename encoding to UTF-8.

Parameters

input Text in valid filename encoding.

Returns: The input text converted to UTF-8.

Exceptions

Cgu::Utf8::ConversionError	This exception will be thrown if conversion fails because the input string is not in valid filename encoding.
std::bad_alloc	This function might throw std::bad_alloc if memory is exhausted and the system throws in that case.

Note: glib takes the system's filename encoding from the environment variables G_FILENAME_ENCODING and G_BROKEN_FILENAMES. If G_BROKEN_FILENAMES is set to 1 and G_FILENAME_ENCODING is not set, it will be assumed that the filename encoding is the same as the locale encoding. If G_FILENAME_ENCODING is set, then G_BROKEN_FILENAMES is ignored, and filename encoding is taken from the value held by G_FILENAME_ENCODING.

Since 0.9.2
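
The precedence rule in the notes to the two filename functions can be sketched as a small decision function. The helper below is hypothetical and mirrors the rule only as it is worded in those notes; glib's own handling may differ in detail (for example in how it treats values of G_BROKEN_FILENAMES other than 1, or richer values of G_FILENAME_ENCODING). The UTF-8 fallback when neither variable is set is not stated in the notes and is included here as an assumption based on glib's documented default.

```cpp
#include <string>

// Hypothetical helper (not part of c++-gtk-utils or glib): applies the
// precedence rule from the notes to explicit values. Pass nullptr for a
// variable that is not set.
std::string filename_encoding_source(const char* g_filename_encoding,
                                     const char* g_broken_filenames) {
    // G_FILENAME_ENCODING wins whenever it is set; G_BROKEN_FILENAMES is ignored.
    if (g_filename_encoding != nullptr) return g_filename_encoding;
    // Otherwise G_BROKEN_FILENAMES=1 means "same as the locale encoding".
    if (g_broken_filenames != nullptr && std::string(g_broken_filenames) == "1")
        return "(locale encoding)";
    // Assumption: with neither variable set, UTF-8 is taken as the default.
    return "UTF-8";
}
```

In a real program the two arguments would come from std::getenv("G_FILENAME_ENCODING") and std::getenv("G_BROKEN_FILENAMES").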

◆ locale_from_utf8()

std::string Cgu::Utf8::locale_from_utf8 ( const std::string & input )

Converts text from UTF-8 to the system's locale encoding.

Parameters

input Text in valid UTF-8 format.

Returns: The input text converted to locale encoding.

Exceptions

Cgu::Utf8::ConversionError	This exception will be thrown if conversion fails because the input string is not in valid UTF-8 format, or cannot be converted to locale encoding (e.g. because the input characters cannot be represented by that encoding).
std::bad_alloc	This function might throw std::bad_alloc if memory is exhausted and the system throws in that case.

Since 0.9.2

◆ locale_to_utf8()

std::string Cgu::Utf8::locale_to_utf8 ( const std::string & input )

Converts text from the system's locale encoding to UTF-8.

Parameters

input Text in valid locale encoding.

Returns: The input text converted to UTF-8.

Exceptions

Cgu::Utf8::ConversionError	This exception will be thrown if conversion fails because the input string is not in valid locale encoding.
std::bad_alloc	This function might throw std::bad_alloc if memory is exhausted and the system throws in that case.

Since 0.9.2

◆ operator!=() [1/2]

bool Cgu::Utf8::operator!=	(	const Iterator &	iter1,
		const Iterator &	iter2
	)

inline

The comparison operators will not throw provided assigning a std::string::const_iterator object does not throw, as it will not in any sane implementation.

Since 1.0.1

◆ operator!=() [2/2]

bool Cgu::Utf8::operator!=	(	const ReverseIterator &	iter1,
		const ReverseIterator &	iter2
	)

inline

The comparison operators will not throw provided assigning a std::string::const_iterator object does not throw, as it will not in any sane implementation.

Since 1.0.1

◆ operator<() [1/2]

bool Cgu::Utf8::operator<	(	const Iterator &	iter1,
		const Iterator &	iter2
	)

inline

The comparison operators will not throw provided assigning a std::string::const_iterator object does not throw, as it will not in any sane implementation.

Since 1.0.1

◆ operator<() [2/2]

bool Cgu::Utf8::operator<	(	const ReverseIterator &	iter1,
		const ReverseIterator &	iter2
	)

inline

The comparison operators will not throw provided assigning a std::string::const_iterator object does not throw, as it will not in any sane implementation. Ordering is viewed from the perspective of the logical operation (reverse iteration), so that for example an iterator at position std::string::rbegin() is less than an iterator at position std::string::rend().

Since 1.0.1
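
The ordering convention for ReverseIterator matches the one the standard library uses for std::string reverse iterators, which makes for a convenient illustration. The sketch below uses only std::string; the two function names are hypothetical, and nothing here is taken from the ReverseIterator implementation.

```cpp
#include <cstddef>
#include <string>

// Standard-library analogue: for a non-empty string, rbegin() orders before
// rend(), because ordering follows the direction of reverse traversal.
bool rbegin_precedes_rend(const std::string& text) {
    return text.rbegin() < text.rend();
}

// Walks from the last byte to the first, using operator< as the loop
// condition, and counts the steps taken.
std::size_t count_reverse_steps(const std::string& text) {
    std::size_t steps = 0;
    for (auto it = text.rbegin(); it < text.rend(); ++it) ++steps;
    return steps;
}
```

For an empty string rbegin() and rend() compare equal, so rbegin_precedes_rend() returns false and the loop body never runs.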

◆ operator<=() [1/2]

bool Cgu::Utf8::operator<=	(	const Iterator &	iter1,
		const Iterator &	iter2
	)

inline

The comparison operators will not throw provided assigning a std::string::const_iterator object does not throw, as it will not in any sane implementation.

Since 1.0.1

◆ operator<=() [2/2]

bool Cgu::Utf8::operator<=	(	const ReverseIterator &	iter1,
		const ReverseIterator &	iter2
	)

inline

The comparison operators will not throw provided assigning a std::string::const_iterator object does not throw, as it will not in any sane implementation. Ordering is viewed from the perspective of the logical operation (reverse iteration), so that for example an iterator at position std::string::rbegin() is less than an iterator at position std::string::rend().

Since 1.0.1

◆ operator==() [1/2]

bool Cgu::Utf8::operator==	(	const Iterator &	iter1,
		const Iterator &	iter2
	)

inline

The comparison operators will not throw provided assigning a std::string::const_iterator object does not throw, as it will not in any sane implementation.

Since 1.0.1

◆ operator==() [2/2]

bool Cgu::Utf8::operator==	(	const ReverseIterator &	iter1,
		const ReverseIterator &	iter2
	)

inline

The comparison operators will not throw provided assigning a std::string::const_iterator object does not throw, as it will not in any sane implementation.

Since 1.0.1

◆ operator>() [1/2]

bool Cgu::Utf8::operator>	(	const Iterator &	iter1,
		const Iterator &	iter2
	)

inline

The comparison operators will not throw provided assigning a std::string::const_iterator object does not throw, as it will not in any sane implementation.

Since 1.0.1

◆ operator>() [2/2]

bool Cgu::Utf8::operator>	(	const ReverseIterator &	iter1,
		const ReverseIterator &	iter2
	)

inline

The comparison operators will not throw provided assigning a std::string::const_iterator object does not throw, as it will not in any sane implementation. Ordering is viewed from the perspective of the logical operation (reverse iteration), so that for example an iterator at position std::string::rbegin() is less than an iterator at position std::string::rend().

Since 1.0.1

◆ operator>=() [1/2]

bool Cgu::Utf8::operator>=	(	const Iterator &	iter1,
		const Iterator &	iter2
	)

inline

The comparison operators will not throw provided assigning a std::string::const_iterator object does not throw, as it will not in any sane implementation.

Since 1.0.1

◆ operator>=() [2/2]

bool Cgu::Utf8::operator>=	(	const ReverseIterator &	iter1,
		const ReverseIterator &	iter2
	)

inline

The comparison operators will not throw provided assigning a std::string::const_iterator object does not throw, as it will not in any sane implementation. Ordering is viewed from the perspective of the logical operation (reverse iteration), so that for example an iterator at position std::string::rbegin() is less than an iterator at position std::string::rend().

Since 1.0.1

◆ uniwide_from_utf8()

std::wstring Cgu::Utf8::uniwide_from_utf8 ( const std::string & input )

Converts text from UTF-8 to the system's Unicode wide character representation, which will be UCS-4/UTF-32 for systems with a wide character size of 4 (almost all unix-like systems), and UTF-16 for systems with a wide character size of 2.

Parameters

input Text in valid UTF-8 format.

Returns: The input text converted to UCS-4 or UTF-16.

Exceptions

Cgu::Utf8::ConversionError	This exception will be thrown if conversion fails because the input string is not in valid UTF-8 format or the system does not support wide character Unicode strings.
std::bad_alloc	This function might throw std::bad_alloc if memory is exhausted and the system throws in that case.

Since 0.9.2

◆ uniwide_to_utf8()

std::string Cgu::Utf8::uniwide_to_utf8 ( const std::wstring & input )

Converts text from the system's Unicode wide character representation, which will be UCS-4/UTF-32 for systems with a wide character size of 4 (almost all unix-like systems) and UTF-16 for systems with a wide character size of 2, to narrow character UTF-8 format.

Parameters

input Text in valid UCS-4 or UTF-16 format.

Returns: The input text converted to UTF-8.

Exceptions

Cgu::Utf8::ConversionError	This exception will be thrown if conversion fails because the input string is not in valid UCS-4 or UTF-16 format or the system does not support wide character Unicode strings.
std::bad_alloc	This function might throw std::bad_alloc if memory is exhausted and the system throws in that case.

Since 0.9.2
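
How the wide character size decides between UCS-4/UTF-32 and UTF-16 in the two uniwide functions can be shown with a short sketch. Both helpers are hypothetical names for this illustration and are not part of the library. The second one shows the practical consequence: with a 2-byte wchar_t, code points above U+FFFF need a surrogate pair, so a std::wstring may hold more elements than characters.

```cpp
#include <cstddef>

// Hypothetical helper: names the Unicode encoding that fits this platform's
// wchar_t, following the rule in the descriptions above.
const char* unicode_wide_encoding() {
    return sizeof(wchar_t) >= 4 ? "UCS-4/UTF-32" : "UTF-16";
}

// Hypothetical helper: number of wchar_t elements needed for one code point.
std::size_t wide_units_for(char32_t code_point) {
    if (sizeof(wchar_t) >= 4) return 1;   // UCS-4/UTF-32: always one element
    return code_point < 0x10000 ? 1 : 2;  // UTF-16: surrogate pair above U+FFFF
}
```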

◆ validate()

bool Cgu::Utf8::validate ( const std::string & text )

inline

Indicates whether the input text comprises valid UTF-8.

Parameters

text	The text to be tested.

Returns: true if the input text is in valid UTF-8 format, otherwise false.

Exceptions

std::bad_alloc This function might throw std::bad_alloc if std::string::data() throws when memory is exhausted.

Note: #include <c++-gtk-utils/convert.h> for this function.

Since 0.9.2
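
To make concrete what "valid UTF-8" means for validate(), here is a standalone sketch of a validity check. It is not the library's implementation, and the name is_valid_utf8_sketch is hypothetical. It rejects stray or missing continuation bytes, truncated sequences, overlong encodings, surrogate code points and values above U+10FFFF. The library function may apply further rules, such as different treatment of embedded null bytes.

```cpp
#include <cstddef>
#include <string>

// Hypothetical standalone check, not the c++-gtk-utils implementation.
bool is_valid_utf8_sketch(const std::string& text) {
    const std::size_t n = text.size();
    std::size_t i = 0;
    while (i < n) {
        const unsigned char lead = static_cast<unsigned char>(text[i]);
        if (lead < 0x80) { ++i; continue; }  // single-byte (ASCII) character

        std::size_t len = 0;          // total bytes in this sequence
        unsigned long cp = 0;         // code point being decoded
        unsigned long smallest = 0;   // smallest code point allowed at this length
        if ((lead & 0xE0) == 0xC0)      { len = 2; cp = lead & 0x1F; smallest = 0x80; }
        else if ((lead & 0xF0) == 0xE0) { len = 3; cp = lead & 0x0F; smallest = 0x800; }
        else if ((lead & 0xF8) == 0xF0) { len = 4; cp = lead & 0x07; smallest = 0x10000; }
        else return false;  // stray continuation byte or invalid lead byte

        if (n - i < len) return false;  // sequence truncated by end of input
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char c = static_cast<unsigned char>(text[i + k]);
            if ((c & 0xC0) != 0x80) return false;  // not a continuation byte
            cp = (cp << 6) | (c & 0x3F);
        }
        if (cp < smallest) return false;                  // overlong encoding
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;   // UTF-16 surrogate
        if (cp > 0x10FFFF) return false;                  // beyond Unicode range
        i += len;
    }
    return true;
}
```

A typical pattern is to run a check like this on untrusted input before calling the conversion functions, so that bad input is handled as an ordinary condition rather than through a ConversionError exception.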

◆ wide_from_utf8()

std::wstring Cgu::Utf8::wide_from_utf8 ( const std::string & input )

Converts text from UTF-8 to the system's wide character locale representation. For this function to work correctly, the system's installed iconv() must support conversion to a generic wchar_t target, but in POSIX whether it does so is implementation defined (GNU's C library implementation does). On most unix-like systems the wide character representation will be Unicode (UCS-4/UTF-32 or UTF-16); where that is the case, use the uniwide_from_utf8() function instead, which does not rely on the generic target being available.

Parameters

input Text in valid UTF-8 format.

Returns: The input text converted to the system's wide character locale representation.

Exceptions

Cgu::Utf8::ConversionError	This exception will be thrown if conversion fails because the input string is not in valid UTF-8 format, or cannot be converted to the system's wide character locale representation (e.g. because the input characters cannot be represented by that encoding, or the system's installed iconv() function does not support conversion to a generic wchar_t target).
std::bad_alloc	This function might throw std::bad_alloc if memory is exhausted and the system throws in that case.

Since 0.9.2

◆ wide_to_utf8()

std::string Cgu::Utf8::wide_to_utf8 ( const std::wstring & input )

Converts text from the system's wide character locale representation to UTF-8. For this function to work correctly, the system's installed iconv() must support conversion from a generic wchar_t source, but in POSIX whether it does so is implementation defined (GNU's C library implementation does). On most unix-like systems the wide character representation will be Unicode (UCS-4/UTF-32 or UTF-16); where that is the case, use the uniwide_to_utf8() function instead, which does not rely on the generic wchar_t encoding being available.

Parameters

input Text in a valid wide character locale format.

Returns: The input text converted to UTF-8.

Exceptions

Cgu::Utf8::ConversionError	This exception will be thrown if conversion fails because the input string is not in a valid wide character locale format, or cannot be converted to UTF-8 (e.g. because the system's installed iconv() function does not support conversion from a generic wchar_t source).
std::bad_alloc	This function might throw std::bad_alloc if memory is exhausted and the system throws in that case.

Since 0.9.2
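
Whether the installed iconv() supports a generic wchar_t encoding can be probed at run time. The sketch below is an assumption-laden illustration, not the library's code: the helper names are hypothetical, and "WCHAR_T" is the codeset name used by GNU's iconv for the generic wchar_t encoding. Other implementations may use a different name or offer none at all, and some systems need an extra library to be linked for iconv.

```cpp
#include <iconv.h>

// Hypothetical probe: true if iconv_open() accepts this conversion pair.
bool iconv_can_convert(const char* to, const char* from) {
    iconv_t cd = iconv_open(to, from);
    // POSIX specifies (iconv_t)-1 as the failure value.
    if (cd == (iconv_t)(-1)) return false;
    iconv_close(cd);
    return true;
}

// Hypothetical probe for both directions used by wide_from_utf8() and
// wide_to_utf8(). Assumes the GNU codeset name "WCHAR_T".
bool iconv_has_generic_wchar_t() {
    return iconv_can_convert("WCHAR_T", "UTF-8") &&
           iconv_can_convert("UTF-8", "WCHAR_T");
}
```

If such a probe fails on a system whose wide characters are Unicode, the uniwide functions remain the option recommended above.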
