A class for reassembling UTF-8 strings sent over pipes and sockets so they form complete valid UTF-8 characters.
|Cgu::SharedHandle< char * >||operator() (const char *input, size_t size)|
|size_t||get_stored () const|
Utf8::Reassembler is a functor class which takes in a partially formed UTF-8 string and returns a nul-terminated string containing as much of the input as forms complete UTF-8 characters. Any partially formed UTF-8 character left at the end of the input passed in previous calls is inserted at the beginning, and any partial character at the end of the current input is stored for the next call to the functor.
If the input string, after any stored partial character has been prepended, contains invalid UTF-8 (apart from a partially formed character at its end), operator() returns a null Cgu::SharedHandle<char*> object (that is, Cgu::SharedHandle<char*>::get() will return 0). Input is not treated as invalid if it consists only of a single partly formed UTF-8 character which could become valid if further bytes were received and added to it. In that case the returned SharedHandle<char*> object contains an allocated string of zero length, comprising only a terminating \0 character, rather than a null pointer.
This enables UTF-8 strings to be sent over pipes, sockets, etc. and displayed in a GTK+ object at the receiving end.
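To make the idea of "storing any partial character at the end" concrete, the following is a minimal sketch of how the split point between complete characters and a trailing partial character might be located. It is not the library's implementation: the helper names utf8_sequence_length and complete_prefix_length are hypothetical, and the sketch performs no validation of the bytes before the split point.

```cpp
#include <cstddef>

// Expected byte length of a UTF-8 sequence given its first byte,
// or 0 if the byte cannot start a sequence.
inline std::size_t utf8_sequence_length(unsigned char lead) {
    if (lead < 0x80) return 1;             // 0xxxxxxx: ASCII
    if ((lead & 0xE0) == 0xC0) return 2;   // 110xxxxx
    if ((lead & 0xF0) == 0xE0) return 3;   // 1110xxxx
    if ((lead & 0xF8) == 0xF0) return 4;   // 11110xxx
    return 0;                              // continuation or invalid byte
}

// Hypothetical helper (an assumption for illustration): returns how many
// leading bytes of 'data' end on a character boundary. Bytes from that
// offset onwards are a partial character to be kept for the next chunk.
inline std::size_t complete_prefix_length(const char* data, std::size_t size) {
    // A partial character is at most 3 bytes long, so only look back that far.
    const std::size_t max_back = size < 3 ? size : 3;
    for (std::size_t back = 1; back <= max_back; ++back) {
        const unsigned char c = static_cast<unsigned char>(data[size - back]);
        if ((c & 0xC0) == 0x80) continue;  // continuation byte: keep looking for its lead
        const std::size_t need = utf8_sequence_length(c);
        // Lead byte found: incomplete if it needs more bytes than are present.
        return (need > back) ? size - back : size;
    }
    return size;  // no lead byte near the end: leave the decision to validation
}
```

For example, given the two bytes 'a' and 0xC3 (the first byte of a two-byte character), the helper reports a complete prefix of one byte, leaving 0xC3 to be stored until the next chunk arrives.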
Note that for efficiency reasons the memory held in the returned Cgu::SharedHandle<char*> object may be greater than the length of the nul-terminated string that is contained in that memory: just let the Cgu::SharedHandle<char*> object manage the memory, and use the contents like any other nul-terminated string.
This class is not needed if std::getline(), with its default '\n' delimiter, is used to read UTF-8 characters using, say, Cgu::fdistream, because a whole '\n' delimited line of UTF-8 characters will always be complete.
This is an example of its use, reading from a pipe until it is closed by the writer and putting the received text in a GtkTextBuffer object:
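The general shape of such a read loop can be sketched without the library or GTK+. The code below is a self-contained illustration under stated assumptions, not the class's real interface: ToyReassembler is a hypothetical stand-in for Cgu::Utf8::Reassembler that returns a std::unique_ptr<std::string> (null on invalid input) instead of a Cgu::SharedHandle<char*>, each element of chunks stands in for the bytes returned by one read from the pipe, and a std::string stands in for the GtkTextBuffer. The stand-in checks only lead and continuation byte structure, not overlong forms or surrogates.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for Cgu::Utf8::Reassembler, for illustration only.
class ToyReassembler {
    std::string stored_;  // partial character carried over from the previous call
public:
    // Number of bytes of a partial character held for the next call.
    std::size_t get_stored() const { return stored_.size(); }

    // Returns the complete characters, or a null pointer on invalid input.
    std::unique_ptr<std::string> operator()(const char* input, std::size_t size) {
        std::string buf = stored_ + std::string(input, size);
        stored_.clear();
        std::size_t i = 0;
        while (i < buf.size()) {
            const unsigned char lead = static_cast<unsigned char>(buf[i]);
            std::size_t len = 0;
            if (lead < 0x80) len = 1;
            else if ((lead & 0xE0) == 0xC0) len = 2;
            else if ((lead & 0xF0) == 0xE0) len = 3;
            else if ((lead & 0xF8) == 0xF0) len = 4;
            else return nullptr;  // stray continuation or invalid lead byte
            const std::size_t avail = buf.size() - i;
            const std::size_t check = len < avail ? len : avail;
            for (std::size_t k = 1; k < check; ++k) {
                if ((static_cast<unsigned char>(buf[i + k]) & 0xC0) != 0x80)
                    return nullptr;  // expected a continuation byte
            }
            if (len > avail) {    // trailing partial character: keep it for the next call
                stored_ = buf.substr(i);
                buf.erase(i);
                break;
            }
            i += len;
        }
        return std::unique_ptr<std::string>(new std::string(buf));
    }
};

// Passes each chunk through the reassembler and appends the complete
// characters to 'text'. Returns false as soon as invalid UTF-8 is seen.
inline bool append_chunks(const std::vector<std::string>& chunks, std::string& text) {
    ToyReassembler reassembler;
    for (const std::string& chunk : chunks) {
        std::unique_ptr<std::string> out = reassembler(chunk.data(), chunk.size());
        if (!out) return false;  // corresponds to a null SharedHandle
        text += *out;            // a real program would insert into the text buffer here
    }
    return true;
}
```

With the real class, the same loop would test Cgu::SharedHandle<char*>::get() for a null result and pass the returned nul-terminated string to the text buffer.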
The constructor will not throw.
Gets the number of bytes of a partially formed UTF-8 character stored for the next call to operator()(). It will not throw.
|Cgu::SharedHandle<char*> Cgu::Utf8::Reassembler::operator()||(const char *input, size_t size)|
Takes a byte array of wholly or partly formed UTF-8 characters to be converted (after taking account of previous calls to the method) to a valid string of wholly formed characters.
|input||The input array.|
|size||The number of bytes in the input (not the number of UTF-8 characters).|
|std::bad_alloc||The method might throw std::bad_alloc if memory is exhausted and the system throws in that case. It will not throw any other exception.|