There are really only two possible solutions. If you're doing this a
lot, over large distances, you'd be better off converting your
characters to a single-element encoding, using wchar_t (or int32_t, or
whatever is most appropriate). This is not a simple copy, which would
convert each individual char into the target type, but a true
conversion function, which recognizes the multibyte characters and
converts each of them into a single element.
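
As a rough illustration of what such a conversion function might look like, here is a minimal sketch. The name decodeUtf8 and the choice of std::u32string (C++11 char32_t, which is at least 32 bits on every platform) are assumptions made for this example, not part of the answer above. It replaces invalid lead bytes, bad continuation bytes, and truncated sequences with U+FFFD, but it does not reject overlong encodings or surrogate code points, so treat it as a starting point rather than a validating decoder.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not from the original answer): decode UTF-8 into
// one char32_t per code point. Malformed input yields U+FFFD.
std::u32string decodeUtf8(const std::string& in)
{
    std::u32string out;
    std::size_t i = 0;
    while (i < in.size()) {
        unsigned char lead = static_cast<unsigned char>(in[i]);
        std::size_t len = 1;
        char32_t cp = 0;
        if (lead < 0x80)                { len = 1; cp = lead; }
        else if ((lead & 0xE0) == 0xC0) { len = 2; cp = lead & 0x1F; }
        else if ((lead & 0xF0) == 0xE0) { len = 3; cp = lead & 0x0F; }
        else if ((lead & 0xF8) == 0xF0) { len = 4; cp = lead & 0x07; }
        else {
            // Stray continuation byte or invalid lead byte.
            out.push_back(0xFFFD);
            ++i;
            continue;
        }
        if (i + len > in.size()) {
            // Sequence is cut off by the end of the input.
            out.push_back(0xFFFD);
            break;
        }
        bool ok = true;
        for (std::size_t k = 1; k < len; ++k) {
            unsigned char c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xC0) != 0x80) { ok = false; break; }
            cp = (cp << 6) | (c & 0x3F);   // append six payload bits
        }
        if (ok) {
            out.push_back(cp);
            i += len;
        } else {
            out.push_back(0xFFFD);
            ++i;                           // resynchronize on the next byte
        }
    }
    return out;
}
```

Once the text is in this form, advancing n characters is plain index arithmetic, which is why the up-front conversion pays off when you do it often.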
For occasional use or shorter sequences, it's possible to write your own
functions for advancing n bytes. For UTF-8, I use the following:

    #include <stddef.h>
    #include <iterator>

    // Byte (an unsigned 8-bit type) and byteCountTable (the sequence
    // length, indexed by lead byte) are defined elsewhere.
    inline size_t
    size( Byte ch )
    {
        return byteCountTable[ ch ] ;
    }

    // Random access iterators can jump directly.
    template< typename InputIterator >
    InputIterator
    succ(
        InputIterator begin,
        size_t size,
        std::random_access_iterator_tag )
    {
        return begin + size ;
    }

    // All other iterator categories step one element at a time.
    template< typename InputIterator >
    InputIterator
    succ(
        InputIterator begin,
        size_t size,
        std::input_iterator_tag )
    {
        while ( size != 0 ) {
            ++ begin ;
            -- size ;
        }
        return begin ;
    }

    // Advance past one character. Assumes well-formed UTF-8: a
    // truncated final sequence would step past end.
    template< typename InputIterator >
    InputIterator
    succ(
        InputIterator begin,
        InputIterator end )
    {
        if ( begin != end ) {
            begin = succ( begin, size( *begin ),
                typename std::iterator_traits< InputIterator >::iterator_category() ) ;
        }
        return begin ;
    }

    template< typename InputIterator >
    size_t
    characterCount(
        InputIterator begin,
        InputIterator end )
    {
        size_t result = 0 ;
        while ( begin != end ) {
            ++ result ;
            begin = succ( begin, end ) ;
        }
        return result ;
    }

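
The code above relies on Byte and byteCountTable, which are not shown. The following is one possible way to fill that gap, offered as a sketch under assumptions: makeByteCountTable and countCodePoints are hypothetical names, the table is built at startup with std::array (C++11), and continuation or invalid lead bytes are mapped to 1 so that a scan always makes progress. The counting loop here also checks for end on every step, so a truncated final sequence cannot carry the iterator past the end of the buffer.

```cpp
#include <array>
#include <cstddef>
#include <string>

typedef unsigned char Byte;

// Hypothetical table builder: sequence length implied by each lead byte.
// Simplification: 0xC0, 0xC1 and 0xF5-0xF7 are never valid lead bytes in
// UTF-8, but they are given their nominal lengths here.
inline std::array<unsigned char, 256> makeByteCountTable()
{
    std::array<unsigned char, 256> table;
    for (int i = 0; i < 256; ++i) {
        if (i < 0x80)      table[i] = 1;   // ASCII
        else if (i < 0xC0) table[i] = 1;   // continuation byte: skip one
        else if (i < 0xE0) table[i] = 2;
        else if (i < 0xF0) table[i] = 3;
        else if (i < 0xF8) table[i] = 4;
        else               table[i] = 1;   // invalid lead byte: skip one
    }
    return table;
}

static const std::array<unsigned char, 256> byteCountTable =
    makeByteCountTable();

// Count characters in [begin, end), never stepping past end even when
// the last sequence is incomplete.
template <typename InputIterator>
std::size_t countCodePoints(InputIterator begin, InputIterator end)
{
    std::size_t result = 0;
    while (begin != end) {
        std::size_t n = byteCountTable[static_cast<Byte>(*begin)];
        while (n != 0 && begin != end) {
            ++begin;
            --n;
        }
        ++result;
    }
    return result;
}
```

Called as countCodePoints(s.begin(), s.end()) on a std::string holding UTF-8, this counts a truncated trailing sequence as one character. The per-step end check costs the random access shortcut, which is the trade-off against the tag-dispatched version.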
std::wstring is unfortunately implementation dependent (16-bit wide characters on Windows, 32-bit on Linux), therefore it is not sufficient. – Donaugh