Read file into std::vector<std::byte>

I'm trying to read a file in binary format into a std::vector<std::byte>

  std::ifstream fStream(fName, std::ios::binary);

  std::vector<std::byte> file_content((std::istreambuf_iterator<std::byte>(fStream)),
                                        std::istreambuf_iterator<std::byte>());

but I'm getting this error (which to me looks like istreambuf_iterator is missing an overload for std::byte)

error: no matching function for call to ‘std::istreambuf_iterator<std::byte>::istreambuf_iterator(std::ifstream&)’
     std::vector<std::byte> file_content((std::istreambuf_iterator<std::byte>(fStream)),

Am I doing something wrong? If so, what is the best way to do this?

Thanks!

Como answered 24/2, 2018 at 16:22 Comment(3)
Possible duplicate of #47481731 - Dipole
@PasserBy If I try using istream_iterator, I get: error: no match for ‘operator>>’ (operand types are ‘std::istream_iterator<std::byte>::istream_type {aka std::basic_istream<char>}’ and ‘std::byte’) - Show
@ThéoChampion That is because there is no default operator>> that reads a std::byte from a std::istream. - Acetylate

I'm trying to read a file in binary format into a std::vector<std::byte>

You are using std::istream_iterator, which reads from an std::istream using operator>>, which performs a formatted read instead of a binary read by default. Use std::istream::read() to read binary data.

If you want to use std::istream_iterator to read bytes, you would need to define a custom operator>> that calls std::istream::read() or std::istream::get(). But this would be inefficient, since it would read 1 byte at a time. It is better to call read() directly to read blocks of multiple bytes at a time. For instance, query the file size, preallocate the std::vector to that size, and then read() from the std::ifstream directly into the std::vector.

Update: I just noticed that you are using std::istreambuf_iterator instead of std::istream_iterator. std::istreambuf_iterator does not use operator>>, so it would be better suited for reading bytes. However, it still reads 1 byte at a time, so what I said about using std::istream::read() to read multiple bytes at a time still applies.
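The block-read approach described above could look roughly like the following sketch. It assumes C++17 and uses std::filesystem::file_size to query the size, which is one of several possible ways to do so. The function name read_whole_file and the choice to return an empty vector on failure are assumptions made for this example.

```cpp
#include <cstddef>
#include <filesystem>
#include <fstream>
#include <system_error>
#include <vector>

// Hypothetical helper: reads a whole file with a single read() call.
inline std::vector<std::byte> read_whole_file(const std::filesystem::path& path) {
  std::ifstream file(path, std::ios::binary);
  if (!file) {
    return {};  // Assumption: an unreadable file yields an empty vector.
  }

  // Query the size reported by the filesystem (non-throwing overload).
  std::error_code ec;
  const auto size = std::filesystem::file_size(path, ec);
  if (ec || size == 0) {
    return {};
  }

  // Preallocate, then read straight into the vector's storage.
  std::vector<std::byte> buffer(static_cast<std::size_t>(size));
  file.read(reinterpret_cast<char*>(buffer.data()),
            static_cast<std::streamsize>(buffer.size()));

  // Keep only what was actually read, in case the file shrank in between.
  buffer.resize(static_cast<std::size_t>(file.gcount()));
  return buffer;
}
```

The reinterpret_cast to char* is needed because std::istream::read() is declared in terms of char, not std::byte.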

Acetylate answered 24/2, 2018 at 17:34 Comment(2)
If I understand correctly, you mean something like godbolt.org/g/ezWhp2, but this will not work if I use std::byte instead of char, right? - Show
@ThéoChampion Yes, something like that, and yes, it works fine for std::byte, e.g.: file.read((char*)buffer.data(), length); - Acetylate

You should be able to do it like this:

  std::basic_ifstream<std::byte> fStream{fName, std::ios::binary};

  std::vector<std::byte> file_content{ std::istreambuf_iterator<std::byte>(fStream), {} };
Quorum answered 29/5, 2019 at 1:53 Comment(1)
What you mentioned above is not working: terminate called after throwing an instance of 'std::bad_cast' what(): std::bad_cast - Cisalpine

If you need an iterator-based solution, then even in C++23 it does not get better than this:

struct istreambuf_iterator_byte : public std::istreambuf_iterator<char> {
  using base_type = std::istreambuf_iterator<char>;
  
  static_assert(std::is_same_v<base_type::iterator_category, std::input_iterator_tag>, "Stronger iterator requires more methods to be reimplemented here.");
  
  using value_type = std::byte;
  using reference = std::byte;

  std::byte operator*() const noexcept(noexcept(base_type::operator*())) {
    return static_cast<std::byte>(base_type::operator*());
  }
  istreambuf_iterator_byte& operator++() noexcept(noexcept(base_type::operator++())) {
    base_type::operator++();
    return *this;
  }
  istreambuf_iterator_byte operator++(int) noexcept(noexcept(base_type::operator++(int{}))) {
    return istreambuf_iterator_byte{base_type::operator++(int{})};
  }
};

#if defined __cpp_concepts
static_assert(std::input_iterator<istreambuf_iterator_byte>);
#endif

auto read_file(const std::string& file_path) -> std::vector<std::byte> {
  std::ifstream input_file(file_path, std::ios::binary);
  return {istreambuf_iterator_byte{input_file}, istreambuf_iterator_byte{}};
}

At the same time, as Remy Lebeau pointed out, there is a much more efficient way to read raw data: preallocate the buffer and use read() (or readsome()). My tests show that this is 2 to 5 times faster, and practically as fast as assign()'ing from a memory-mapped file, for all file sizes from 5KB to 4GB.

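auto read_file(const std::string& file_path) -> std::vector<std::byte> {
  std::ifstream input_file(file_path, std::ios::binary);

  // Query the size by seeking to the end; tellg() returns -1 on failure.
  input_file.seekg(0, std::ios::end);
  std::streamoff const file_size = input_file.tellg();
  input_file.seekg(0, std::ios::beg);
  if (file_size <= 0) {
    return {};
  }

  std::vector<std::byte> result(static_cast<std::size_t>(file_size));
  input_file.read(reinterpret_cast<char*>(result.data()), file_size);
  // gcount() stays valid after a short read, whereas tellg() would return -1.
  result.resize(static_cast<std::size_t>(input_file.gcount()));
  return result;
}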
Orbital answered 4/9, 2023 at 13:18 Comment(3)
This looks like an illegal iterator to me, since ++ returns a different type. - Lane
@Lane That's right, thank you! I will add this to the answer. - Orbital
I would check the resulting type against the std::input_iterator concept. - Lane

Unfortunately, the standard library streams and their iterators are only specified for character types such as char, so

std::istreambuf_iterator<std::byte>

will cause an error. Use

std::istreambuf_iterator<char>

instead.

Broughton answered 3/11, 2020 at 16:0 Comment(0)
