encode / decode binary data in a qr-code using qrencode and zbarimg in bash

I have some binary data that I want to encode in a QR code and then decode again, all in bash. After a search, it looks like I should use qrencode for encoding and zbarimg for decoding. After a bit of troubleshooting, I still cannot decode what I encoded.

Any idea why? Currently the closest I am to a solution is:

$ dd if=/dev/urandom bs=10 count=1 status=none > data.bin
$ xxd data.bin
00000000: b255 f625 1cf7 a051 3d07                 .U.%...Q=.
$ cat data.bin | qrencode -l H -8 -o data.png
$ zbarimg --raw --quiet data.png | xxd
00000000: c2b2 55c3 b625 1cc3 b7c2 a051 3d07 0a    ..U..%.....Q=..

It looks like I am not very far, but something is still off.

Edit 1: a possible fix is to use base64 wrapping, as explained in the answer by @leagris.

Edit 2: using base64 encoding doubles the size of the message. The reason why I use binary in the first place is to be size-efficient so I would like to avoid that. De-accepting the answer by @leagris as I would like to have it 'full binary', sorry.

Edit 3: as of 2020-03-03 it looks like this is a well-known issue of zbarimg and that a pull request to fix this is on its way:

https://github.com/mchehab/zbar/pull/64

Edit 4: if you know of another command-line tool on Linux that is able to decode QR codes with binary content, please feel free to let me know.

Zonate answered 3/3, 2020 at 11:26 Comment(0)

My pull request has been applied. ZBar version 0.23.1 and newer will be able to decode binary QR codes:

zbarimg --raw --oneshot -Sbinary qr.png
zbarcam --raw --oneshot -Sbinary
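
A quick way to check whether an installed ZBar build handles binary payloads is a round trip. The sketch below assumes qrencode and zbarimg 0.23.1 or newer are installed; the helper name qr_roundtrip and the temporary file names are made up for illustration and are not part of either tool.

```shell
#!/usr/bin/env bash
# Round trip: random bytes -> byte mode QR code -> decoded bytes -> compare.
# qr_roundtrip is a hypothetical helper name used only in this sketch.

qr_roundtrip() {
    local dir status
    dir=$(mktemp -d) || return 2

    # Ten random bytes, the same amount as in the question.
    head -c 10 /dev/urandom > "$dir/data.bin"

    # qrencode: -8 forces byte mode, -l H picks the highest error correction.
    # zbarimg: -Sbinary asks for the raw bytes instead of converted text.
    # cmp -s succeeds only if both files are byte-for-byte identical.
    qrencode -l H -8 -o "$dir/data.png" -r "$dir/data.bin" &&
        zbarimg --raw --oneshot -Sbinary "$dir/data.png" > "$dir/decoded.bin" 2>/dev/null &&
        cmp -s "$dir/data.bin" "$dir/decoded.bin"
    status=$?

    rm -rf "$dir"
    return "$status"
}

if command -v qrencode >/dev/null 2>&1 && command -v zbarimg >/dev/null 2>&1; then
    if qr_roundtrip; then result="match"; else result="differ"; fi
else
    result="skipped"   # at least one of the two tools is not installed
fi
echo "round trip: $result"
```

A result of "differ" with both tools installed most likely means the zbarimg build predates the binary option.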

QR codes have several encoding modes. The simplest, most commonly used and widely supported is the alphanumeric encoding which is suitable for simple text. The byte encoding allows storing arbitrary 8 bit data in the QR code. The ECI mode is like 8 bit mode but with additional metadata that tells the decoder which character set to use in order to decode the binary data back to text. Here's a list of known ECI values and the character encodings they represent. For example, when a decoder encounters an ECI 26 mode QR code it knows to decode the binary data as UTF-8.

The qrencode tool is doing its job correctly: it is creating a byte mode QR code with the data you gave it as its contents. The problem is most decoders were explicitly designed to handle textual data first and foremost. The retrieval of the raw binary data is a detail at best.

By default, the zbar library treats byte mode QR codes as if they were unknown ECI mode QR codes. If a character set isn't specified, it attempts to guess the encoding and converts the data as if it were text in that encoding. This will most likely mangle binary data. As you noted, I brought this up in issue #55 and after some time managed to submit a pull request to improve this. Now that it has been merged, the library has a binary decoder option that instructs decoders to return the raw binary data without converting it. Another source of data mangling is the tendency of the command line tools to append line feeds to the output. I submitted a pull request to allow users to prevent this, and it has also been merged.
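
The hex dump in the question is consistent with this kind of conversion: every byte above 0x7f comes back as the two-byte UTF-8 encoding of the ISO-8859-1 character with the same value, followed by a line feed. The sketch below reproduces that transformation with iconv alone, without any QR tooling. That ISO-8859-1 is what the decoder guessed is an inference from the dump, not something taken from the ZBar source, and the function and variable names are illustrative.

```shell
# The ten bytes from the question, written as octal escapes so that the
# printf call is portable: b2 55 f6 25 1c f7 a0 51 3d 07.
original_bytes() { printf '\262\125\366\045\034\367\240\121\075\007'; }

# Print stdin as one contiguous lowercase hex string.
to_hex() { od -An -tx1 | tr -d ' \n'; }

# Read the bytes as ISO-8859-1 text, re-encode them as UTF-8, and append
# the line feed that the command line tool adds to its output.
mangled_hex=$( { original_bytes | iconv -f ISO-8859-1 -t UTF-8; printf '\n'; } | to_hex )

echo "$mangled_hex"    # c2b255c3b6251cc3b7c2a0513d070a, as in the question

# Converting the UTF-8 text back to ISO-8859-1 recovers the original bytes
# (the extra line feed is left out of this second pipeline).
restored_hex=$( original_bytes | iconv -f ISO-8859-1 -t UTF-8 \
    | iconv -f UTF-8 -t ISO-8859-1 | to_hex )

echo "$restored_hex"   # b255f6251cf7a0513d07
```

Undoing the conversion this way only works when the guessed encoding happens to be a lossless single-byte one, so it is a diagnostic rather than a substitute for the binary option.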

The zxing-cpp library will also try to guess the encoding of binary data in QR codes. The comments suggest that the QR code specification requires that decoders pick an encoding without specifying a default or allowing them to return the raw binary data. In order to make that possible, the binary data is copied to a byte array which can be accessed through the DecoderResult. When I have some free time, I intend to write zximg and zxcam tools with binary decoding support for this library.

It's always possible to encode binary data as base 64 and encode the result as an alphanumeric QR code. However, base 64 encoding will increase the size of the data and the alphanumeric mode doesn't allow use of the QR code's maximum capacity. In a comment, you mentioned what you intend to use binary QR codes for:

I want to have a package to effectively dump some gpg stuff in a format that makes recovery easy.

That is the exact use case I'm attempting to enable with my pull request: an easier-to-restore paperkey. 4096 bit RSA secret keys can be directly QR encoded in 8 bit mode but not in alphanumeric mode as base 64-encoded data.

Acrobat answered 4/3, 2020 at 3:22 Comment(11)
Yes, agree 100%, I also want a 'better paperkey' :)Zonate
Btw @MatheusMoreira I am working on my pure bash little 'qr code paperkey'. Hope to have some code that looks like something within a few days, I will let you know then.Zonate
@Zonate If my patch is merged, a key restoration script will probably be as simple as zbarcam --raw --oneshot -Sbinary | paperkey --pubring public.gpg | gpg --import. I was planning to add these exact instructions to the Arch Wiki's paperkey article after the feature is available in the repositories.Acrobat
Yes, I agree that for this specific case just creating one QR code will be enough, and this will be an extremely welcome addition. But when thinking about it a few days ago, I came to the conclusion that having a 'true' QR code dump bash package would be welcome, so that one can also dump slightly larger things, such as the full content of my password manager with pass, or a full message that I may want to send by post :)Zonate
@Zonate You could split binary data into multiple parts and encode each part as a separate QR code. If you decode them in the correct order, the result should be the original file. You could theoretically QR encode files of any size this way. Several bar code formats support structured append mode which helps the decoder figure out the correct order of each bar code in the sequence. I've never actually seen anyone use this feature though. There's code in the decoders to handle structured append but I don't know how robust it is. I've never tested it either.Acrobat
Yes, this is exactly what I am working on :) I will put my scripts for that in their own repo and try to set up a simple package. I will let you know when it starts to take form :)Zonate
@Zonate By the way, the --oneshot feature is useful even if you're trying to read multiple bar codes in a sequence. When you run zbarcam and place a QR code in front of the camera, there's a chance it will decode that QR code multiple times. This will result in the data being duplicated in the output, corrupting the file you're trying to restore. One shot mode forces zbarcam to terminate after reading exactly one bar code, allowing you to program in a time out before your script tries to read the next one. This gives you time to prepare the next bar code.Acrobat
Thanks. I was thinking about having a (very simple) binary metadata field to take care of all of that :) .Zonate
I have started to work on the 'bash package' to help with dumping into a series of QR codes here: github.com/jerabaul29/qrdump . It is still very primitive / ugly / dysfunctional, but I hope it will be better within a couple of weeks. I will let you know.Zonate
I think that qrdump starts to look like something reasonable. Feel free to comment on API etc, now that I have a working example the plan is to 1) stabilize API 2) refactor the inner code.Zonate
Very interesting - I did write a modern version of a tool I've developed for internal use that uses encrypted QR codes for secrets backup: github.com/yawn/offkey - I think it might make a lot of sense to at least support the option of using binary encodings directly. This will reduce recovery options but enable larger secrets. Thanks for the PR!Gaultiero
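
The multi-part idea from the comments above can be sketched with split and cat. This is only an outline under assumptions: the 1000-byte chunk size, the part_ prefix and the file names are arbitrary choices for illustration, and structured append is not used, so ordering relies on the chunk file names. The QR step runs only if qrencode is installed.

```shell
#!/usr/bin/env bash
# Split a file into fixed-size chunks, one per QR code, then check that
# concatenating the chunks in name order restores the original file.

workdir=$(mktemp -d)
head -c 2500 /dev/urandom > "$workdir/secret.bin"

# split names the chunks part_aa, part_ab, ... so that a sorted glob
# returns them in the original order.
split -b 1000 "$workdir/secret.bin" "$workdir/part_"

chunk_count=0
for chunk in "$workdir"/part_??; do
    chunk_count=$((chunk_count + 1))
    # Encode each chunk as its own byte mode QR code when qrencode exists.
    if command -v qrencode >/dev/null 2>&1; then
        qrencode -l M -8 -o "$chunk.png" -r "$chunk"
    fi
done

# Restoring would decode each code in order (for example with
# zbarimg --raw --oneshot -Sbinary) and concatenate the results.
# Here the chunk files themselves stand in for the decoded output.
cat "$workdir"/part_?? > "$workdir/restored.bin"

if cmp -s "$workdir/secret.bin" "$workdir/restored.bin"; then
    restore_status="identical"
else
    restore_status="different"
fi
echo "$chunk_count chunks, restored file is $restore_status"
rm -rf "$workdir"
```

The part_?? pattern matches only the two-letter chunk suffixes, so the generated .png files are not picked up when the chunks are concatenated.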

See also: Storing binary data in QR codes

It looks like zbarimg only supports printable characters and appends a newline:

printf '%s' 'Hello World!' >data.bin
xxd data.bin
qrencode -l H -8 -o data.png -r data.bin
zbarimg --raw --quiet data.png | xxd

I think a better, more portable option would be to base64 encode your binary data before QR encoding.

Like this:

dd if=/dev/urandom bs=10 count=1 status=none > data.bin
xxd data.bin
base64 <data.bin | qrencode -l H -8 -o data.png
zbarimg --raw --quiet data.png | base64 -d | xxd
Allmon answered 3/3, 2020 at 12:17 Comment(9)
Yes, I was aware of the newline; I guess I should have written that it is not a problem for me. So you think the problem is with zbarimg and not qrencode, is that right?Zonate
This works, many thanks :) A small question: do you know what is the explanation 'deep down' for the need to base64 encode here? :)Zonate
See: #37996601Bethezel
Aah but wait, now this takes twice as much space :( . I want to dump quite large things, this will not work for my part. Sorry, this means I de-accept your answer for now, I would really like to have a true 'optimal' binary qr-code.Zonate
The generated binary QR code is correct. The fault is with zbar, which incorrectly decodes it as a character string. You will have portability issues with your binary QR code, as lots of decoders do not handle binary correctly. See the post I linked, which explains that it can be done with a patched zbarimg. I don't know your implementation plan, but if you have so much data that it won't fit in a decently sized QR code, you should probably QR encode a record ID instead and retrieve the long data from a database using that ID.Bethezel
Ok, thanks. I will leave it open a bit longer in the hope that somebody knows an answer, or maybe another decoder. I will consider having a look at the decoder too and see if I can modify it a bit. I want to have a package to effectively dump some gpg stuff in a format that makes recovery easy. There are some solutions that work 'up to an extent' but not exactly as I want either. This quickly becomes a few kB, so size is important.Zonate
It looks like zbarimg is a dead project with no more development taking place. Do you know something about this @leagris ?Zonate
Let us continue this discussion in chat.Zonate
An alphanumeric QR code containing base 64-encoded data is most reliable and compatible since most decoders are designed to work with text. It should be noted that this method provides a lower capacity compared to directly encoding binary data in an 8 bit QR code. For example, a 4096 bit RSA secret key fits directly in an 8 bit QR code but it doesn't fit in an alphanumeric QR code as base 64-encoded data.Acrobat

base32 encoding is more efficient for QR codes than base64 encoding: the encoded payload takes about 110% of the original size instead of 133%.

Encode: base32 -w0 <FILEIN | tr = + | qrencode --ignorecase ...

Decode: tr + = <DATAFILE | base32 -d >FILEOUT
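
The two pipelines can be checked without a QR code in between. The sketch below confirms that the transformed base32 text only uses characters from the QR alphanumeric set (which includes digits, upper-case letters and +, but not =) and that reversing the substitution restores the input. The sample input and variable names are arbitrary.

```shell
# Use the C locale so that the A-Z range below means exactly A to Z.
LC_ALL=C

# Seven bytes of sample input; this length makes base32 emit '=' padding.
input='paperky'

# Encode side: base32 without line wrapping, '=' padding swapped for '+'
# because '=' is not in the QR alphanumeric character set.
payload=$(printf '%s' "$input" | base32 -w0 | tr = +)
echo "$payload"

# Every character must be an upper-case letter, a digit, or '+'.
case "$payload" in
    *[!A-Z0-9+]*) charset="invalid" ;;
    *)            charset="alphanumeric" ;;
esac
echo "$charset"

# Decode side: swap '+' back to '=' and decode.
output=$(printf '%s' "$payload" | tr + = | base32 -d)

if [ "$output" = "$input" ]; then echo "round trip ok"; fi
```

The substitution is safe because + never occurs in the base32 alphabet, so every + found on the decode side must have been padding.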

base32 yields eight 5.5-bit QR alphanumeric characters for every 5 bytes: 44 bits for 40 bits of data, or 110%.

base64 yields four 8-bit QR byte mode characters for every 3 bytes: 32 bits for 24 bits of data, or 133%.
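
These ratios can be reproduced with shell arithmetic. The sketch assumes, as the answer does, 5.5 bits per character in alphanumeric mode (11 bits per pair) and 8 bits per character in byte mode, and it ignores the mode and length headers of the QR code. The 15-byte sample is chosen because 15 is a multiple of both 3 and 5, so neither encoding needs padding.

```shell
sample='fifteen bytes!!'          # 15 bytes = 120 bits
raw_bits=$(( ${#sample} * 8 ))

b32=$(printf '%s' "$sample" | base32 -w0)   # 24 characters
b64=$(printf '%s' "$sample" | base64 -w0)   # 20 characters

# Alphanumeric mode packs two characters into 11 bits.
b32_bits=$(( ${#b32} * 11 / 2 ))
# Byte mode spends 8 bits on every character.
b64_bits=$(( ${#b64} * 8 ))

# Integer percentages of the encoded size relative to the raw input.
b32_percent=$(( b32_bits * 100 / raw_bits ))
b64_percent=$(( b64_bits * 100 / raw_bits ))

echo "base32: $b32_bits bits, ${b32_percent}% of the input"
echo "base64: $b64_bits bits, ${b64_percent}% of the input"
```

Raw byte mode would need exactly 120 bits for the same sample, which is why the question asks for a decoder that returns binary data unchanged.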

Scandalize answered 22/2, 2024 at 21:27 Comment(1)
As it’s currently written, your answer is unclear. Please edit to add additional details that will help others understand how this addresses the question asked. You can find more information on how to write good answers in the help center.Antiperspirant
