C++ read binary file and convert to hex

I'm having some problems reading a binary file and converting its bytes to a hex representation.

What I've tried so far:

ifstream::pos_type size;
char * memblock;

ifstream file (toread, ios::in|ios::binary|ios::ate);
if (file.is_open())
{
    size = file.tellg();
    memblock = new char [size];
    file.seekg (0, ios::beg);
    file.read (memblock, size);
    file.close();

    cout << "the complete file content is in memory" << endl;

    std::string tohexed = ToHex(memblock, true);

    std::cout << tohexed << std::endl;
}

Converting to hex:

string ToHex(const string& s, bool upper_case)
{
    ostringstream ret;

    for (string::size_type i = 0; i < s.length(); ++i)
        ret << std::hex << std::setfill('0') << std::setw(2) << (upper_case ? std::uppercase : std::nouppercase) << (int)s[i];

    return ret.str();
}

Result: 53514C69746520666F726D61742033.

When I open the original file with a hex editor, this is what it shows:

53 51 4C 69 74 65 20 66 6F 72 6D 61 74 20 33 00
04 00 01 01 00 40 20 20 00 00 05 A3 00 00 00 47
00 00 00 2E 00 00 00 3B 00 00 00 04 00 00 00 01
00 00 00 00 00 00 00 00 00 00 00 01 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 05 A3
00 2D E2 1E 0D 03 FC 00 06 01 80 00 03 6C 03 D3

Is there a way to get the same output as the hex editor using C++?

Working solution (by Rob):

...

std::string tohexed = ToHex(std::string(memblock, size), true);

...
string ToHex(const string& s, bool upper_case)
{
    ostringstream ret;

    for (string::size_type i = 0; i < s.length(); ++i)
    {
        int z = s[i]&0xff;
        ret << std::hex << std::setfill('0') << std::setw(2) << (upper_case ? std::uppercase : std::nouppercase) << z;
    }

    return ret.str();
}
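
For output laid out like the hex editor listing (space-separated bytes, 16 per row), a rough sketch might look like the following. `ReadAllBytes` and `HexDump` are names invented for this example and are not part of the original code. The cast through `unsigned char` is one way to avoid sign extension and has the same effect as the `&0xff` above.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <iomanip>
#include <iterator>
#include <sstream>
#include <string>

// Hypothetical helper: read a whole file into a std::string, keeping '\0' bytes.
// Returns an empty string if the file cannot be opened.
std::string ReadAllBytes(const std::string& path)
{
    std::ifstream file(path, std::ios::in | std::ios::binary);
    // istreambuf_iterator copies raw bytes without skipping whitespace.
    return std::string(std::istreambuf_iterator<char>(file),
                       std::istreambuf_iterator<char>());
}

// Hypothetical helper: format bytes as "53 51 4C ..." with bytes_per_row per line.
std::string HexDump(const std::string& data, std::size_t bytes_per_row = 16)
{
    assert(bytes_per_row > 0);
    std::ostringstream out;
    out << std::hex << std::uppercase << std::setfill('0');
    for (std::size_t i = 0; i < data.size(); ++i)
    {
        // Go through unsigned char so bytes >= 0x80 are not sign-extended.
        out << std::setw(2)
            << static_cast<int>(static_cast<unsigned char>(data[i]));
        const bool end_of_row = (i + 1) % bytes_per_row == 0;
        const bool last_byte = (i + 1 == data.size());
        out << ((end_of_row || last_byte) ? '\n' : ' ');
    }
    return out.str();
}
```

Something like `std::cout << HexDump(ReadAllBytes(toread));` would then print the dump, with `toread` being the path variable from the question.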
Anacoluthia answered 8/3, 2012 at 17:17 Comment(9)
"memblock contains only first 15 bytes, stopping at the null byte (16th)" What makes you say that? I don't see where you are printing out the contents of memblock. I suspect that memblock contains the entire file, but that the code you aren't showing us misinterprets its contents. Please reduce your program to the smallest complete program that demonstrates the error, and post that program in the question. sscce.org - Destefano
@Rob okay shall i repost the first 15 bytes for you to be more clear? - Anacoluthia
Assuming that this is a homework or a learning assignment of some sort, here's a couple of hints: (1) you are missing a while loop, (2) calling tellg() on a stream that you have just opened is premature. - Cerated
@develroot - No, please just tell us how you concluded that it only contained the first fifteen bytes. - Destefano
that's what it shows when I run the program. - Anacoluthia
Have you checked that the value of size isn't 15 bytes, just by pure coincidence being the count before the first null byte? ifstream in binary mode certainly isn't supposed to do anything with null bytes... - Billyebilobate
@develroot - "that's what it shows when I run the program". What program? Please post a complete minimal program that demonstrates the error, so that we can help you find the error. sscce.org - Destefano
@sscce.org what should I post? the includes? the return 0; or what? THIS IS THE WHOLE PROGRAM - Anacoluthia
@develroot - That is by no means the entire program, but it was sufficient to find the error. Please read sscce.org to understand why I asked you for a complete program. - Destefano
B
8
char *memblock;
… 
std::string tohexed = ToHex(memblock, true);
…

string ToHex(const string& s, bool upper_case)

There's your problem, right there. The constructor std::string::string(const char*) interprets its input as a nul-terminated string. So, only the characters leading up to '\0' are even passed to ToHex. Try one of these instead:

std::string tohexed = ToHex(std::string(memblock, memblock+size), true);
std::string tohexed = ToHex(std::string(memblock, size), true);
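
As a small illustration of the difference between these constructors, here is a minimal sketch. The buffer contents and the function name `DemoConstructors` are made up for the example; the asserts record what each constructor keeps.

```cpp
#include <cassert>
#include <string>

// Hypothetical demo: compares the three std::string constructors discussed above.
void DemoConstructors()
{
    // Six bytes with an embedded '\0' at index 3, similar to a binary file header.
    const char buf[] = {'S', 'Q', 'L', '\0', '\x04', '\0'};

    std::string truncated(buf);               // const char*: stops at the first '\0'
    std::string full(buf, sizeof buf);        // pointer + count: copies all 6 bytes
    std::string range(buf, buf + sizeof buf); // iterator range: also all 6 bytes

    assert(truncated.size() == 3);
    assert(full.size() == 6);
    assert(range == full);
}
```

Only the last two forms pass the bytes after the embedded `'\0'` on to `ToHex`.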
Bilharziasis answered 8/3, 2012 at 17:30 Comment(3)
okay...that worked for the hex conversion...but i still have a problem: the result isn't exactly the same. 47 is now 00, or 2D is now 05..and so are all the non-ASCII characters. - Anacoluthia
@develroot - if it helps, you have a sign-extension bug in your ToHex routine. Try (s[i]&0xff) instead of (int)s[i]. - Destefano
This is amazing! :) - Leesa
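
The sign-extension point from the comment above can be seen in isolation with a sketch like this one. `HexViaIntCast`, `HexViaMask` and `CheckMasking` are hypothetical names used only here. Where plain `char` is signed, which is common but implementation-defined, a byte such as 0xA3 becomes a negative `int` under the plain cast and typically prints as `FFFFFFA3` with a 32-bit `int`. Masking with `0xff` keeps only the low 8 bits.

```cpp
#include <cassert>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: formats one byte the way the original ToHex did,
// converting a plain char straight to int (may sign-extend bytes >= 0x80).
std::string HexViaIntCast(char c)
{
    std::ostringstream out;
    out << std::hex << std::uppercase << std::setfill('0') << std::setw(2)
        << (int)c;
    return out.str();
}

// Hypothetical helper: masks to the low 8 bits first, as the comment suggests.
std::string HexViaMask(char c)
{
    std::ostringstream out;
    out << std::hex << std::uppercase << std::setfill('0') << std::setw(2)
        << (c & 0xff);
    return out.str();
}

// Hypothetical check; the byte values are taken from the hex editor listing.
void CheckMasking()
{
    assert(HexViaMask(static_cast<char>(0x47)) == "47");
    assert(HexViaMask(static_cast<char>(0xA3)) == "A3");
    assert(HexViaMask(static_cast<char>(0x00)) == "00");
    // Bytes below 0x80 come out the same with either approach.
    assert(HexViaIntCast(static_cast<char>(0x47)) == "47");
}
```

The result of `HexViaIntCast` for bytes of 0x80 and above is deliberately not asserted, since it depends on whether `char` is signed on the platform.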
