std::codecvt::do_in method overloading vs the rest of base methods

I have overridden the do_in method of std::codecvt:

#include <iostream>
#include <locale>
#include <string>

class codecvt_to_upper : public std::codecvt<char, char, std::mbstate_t> {
public:

    explicit codecvt_to_upper(size_t r = 0) : std::codecvt<char, char, 
                                                          std::mbstate_t>(r) {}
protected:
    result do_in(state_type& state, const extern_type* from,
                 const extern_type* from_end, const extern_type*& from_next,
                 intern_type* to, intern_type* to_end, intern_type*& to_next)
                   const;

    result
    do_out(state_type& __state, const intern_type* __from,
            const intern_type* __from_end, const intern_type*& __from_next,
            extern_type* __to, extern_type* __to_end,
            extern_type*& __to_next) const {
        return codecvt_to_upper::ok;
    }

    result
    do_unshift(state_type& __state, extern_type* __to,
            extern_type* __to_end, extern_type*& __to_next) const {
        return codecvt_to_upper::ok;
    }

    int
    do_encoding() const throw () {
        return 1;
    }

    bool
    do_always_noconv() const throw () {
        return false;
    }

    int
    do_length(state_type&, const extern_type* __from,
            const extern_type* __end, size_t __max) const {
        return 1;
    }

    int
    do_max_length() const throw () {
        return 10;
    }
};

codecvt_to_upper::result codecvt_to_upper::do_in(state_type& state, 
                  const extern_type* from, const extern_type* from_end, const 
                  extern_type*& from_next, intern_type* to, intern_type* 
                  to_end, intern_type*& to_next) const {
    codecvt_to_upper::result res = codecvt_to_upper::error;
    const std::ctype<char>& ct = std::use_facet<std::ctype<char> >( 
                                                               std::locale());

    const extern_type* p = from;
    while( p != from_end && to != to_end) {
        *to++ = ct.toupper( *p++);
    }
    from_next = p;
    to_next = to;
    res = codecvt_to_upper::ok;
    return res;
}

and used this way:

int main(int argc, char** argv) {

    std::locale ulocale( std::locale(), new codecvt_to_upper);

    std::cin.imbue( ulocale);

    char ch;
    while ( std::cin >> ch) {
        std::cout << ch;
    }
    return 0;
}

but the do_in override is never called. Have I overridden it correctly? Which member function of std::codecvt<char, char, std::mbstate_t> do I have to change (and how) so that my facet's do_in method is called?

Wilmerwilmette answered 27/3, 2014 at 13:21 Comment(6)
The char => char specialization of std::codecvt<> doesn't define a conversion, therefore do_in() won't be called because it is unnecessary to convert between the two types. Alcestis
If you call std::use_facet<codecvt_to_upper>(ulocale).always_noconv() it should return true, indicating that a conversion isn't needed. Alcestis
@0x499602D2 this sounds reasonable, but this is how it is written in Stroustrup "C++ 3rd...". This is true, always_noconv returns true. Wilmerwilmette
@0x499602D2 Can I force my codec to always perform conversion? Wilmerwilmette
@0x499602D2 you can write this in an answer, I will accept this. Also I know now that yes, every method has to be implemented since they all are pure virtual... Wilmerwilmette
Thanks, but before I answer I would like to research the Standard to confirm this (and other things). Alcestis

I think the first thing that should be addressed is that the std::codecvt family of facets is only used by std::basic_filebuf, because code conversion is only needed when dealing with an external device. In your code you imbue the locale into std::cin, whose buffer does not do code conversion.

Of course, it is still possible to perform code conversion within the program, but what prevented your facet from working is that it inherits from a specialization of std::codecvt<> that does not convert. The char => char specialization of std::codecvt<> defines no conversion, so do_in() won't be called because no conversion between the two types is needed.

I tried running your code with the inherited facet changed to std::codecvt<wchar_t, char, std::mbstate_t> and with wide-character file streams, and it worked.
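
A minimal sketch of that wide-character variant, assuming a facet derived from std::codecvt<wchar_t, char, std::mbstate_t> that treats every external byte as one character. The names upper_codecvt and read_file_upper are hypothetical, and this is not necessarily the exact code that was tested:

```cpp
#include <cstddef>
#include <cwchar>
#include <cwctype>
#include <fstream>
#include <iterator>
#include <locale>
#include <string>

// Hypothetical facet: widens each external byte to wchar_t and uppercases it.
class upper_codecvt : public std::codecvt<wchar_t, char, std::mbstate_t> {
public:
    explicit upper_codecvt(std::size_t refs = 0)
        : std::codecvt<wchar_t, char, std::mbstate_t>(refs) {}

protected:
    result do_in(state_type&, const extern_type* from,
                 const extern_type* from_end, const extern_type*& from_next,
                 intern_type* to, intern_type* to_end,
                 intern_type*& to_next) const override {
        while (from != from_end && to != to_end) {
            // Go through unsigned char so bytes >= 0x80 do not sign-extend.
            unsigned char byte = static_cast<unsigned char>(*from++);
            *to++ = static_cast<wchar_t>(std::towupper(byte));
        }
        from_next = from;
        to_next = to;
        return from == from_end ? ok : partial;
    }

    // One external byte always maps to one internal character.
    int do_encoding() const noexcept override { return 1; }
    bool do_always_noconv() const noexcept override { return false; }
    int do_max_length() const noexcept override { return 1; }

    int do_length(state_type&, const extern_type* from,
                  const extern_type* end, std::size_t max) const override {
        std::size_t avail = static_cast<std::size_t>(end - from);
        return static_cast<int>(avail < max ? avail : max);
    }
};

// Hypothetical helper: reads a whole file through the facet.
std::wstring read_file_upper(const std::string& path) {
    std::wifstream in;
    // Imbue before opening so the file buffer uses the facet from the start.
    in.imbue(std::locale(std::locale::classic(), new upper_codecvt));
    in.open(path);
    return std::wstring(std::istreambuf_iterator<wchar_t>(in),
                        std::istreambuf_iterator<wchar_t>());
}
```

Here the file buffer is a std::basic_filebuf<wchar_t>, which is the stream buffer that consults the imbued codecvt facet, so do_in() does get a chance to run.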

If you want this to work for narrow-character streams as well, I would suggest creating a stream buffer that forwards uppercase characters through underflow().

Alcestis answered 27/3, 2014 at 17:4 Comment(3)
This sounds reasonable, only one thing: deriving from codecvt<char,char> with do_in overridden to convert to upper case, and imbuing it into std::cin, was given as an example by Stroustrup in "C++ 3rd..." Do you think it was just a bare example and he didn't test whether it actually works? Wilmerwilmette
@lizusek Stroustrup is known for making a few mistakes in his examples, so I suppose it's possible. Alcestis
I absolutely agree, everyone makes them, and from about 100 pages indeed it looks as if he wrote the last pages of the book in a hurry or didn't test the code. But even with those few mistakes it is still excellent. Thank you very much. Wilmerwilmette
