QR code compression [closed]
R

4

6

Is it possible to store about 20 000 characters in QR code? (Or even more? http://blog.qr4.nl/page/QR-Code-Data-Capacity.aspx)

I would like to store only ASCII symbols (letters and digits, plus a few extras such as the dash).

As far as I know, it's possible to compress not-too-complex text by 80-98%, which sounds promising: http://www.maximumcompression.com/index.html

Do you have any experience with this? Thanks for sharing!

Rodneyrodolfo answered 14/6, 2012 at 11:19 Comment(2)
Not the answer you want, but if you want to put lots of data in a 2D barcode, maybe you should have a look at DataMatrix.Displant
The DataMatrix looks good. It doesn't support Unicode and is much more compact. I have a random Android app for scanning codes, and it scanned the DataMatrix successfully. But I'm not sure it's as widely supported, so I'm afraid to use it.Footloose
B
6

QR codes have a special encoding mode for alphanumeric data (upper-case only, plus digits and a few symbols). It uses 5.5 bits per character on average (11 bits per pair of characters) instead of 8, and a code can store 4,296 characters at most in this mode.
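To make the bit count concrete, here is a minimal Python sketch of how the alphanumeric mode packs characters: each pair takes 11 bits and a leftover single character takes 6. The function name `alphanumeric_bits` is made up for illustration, and the count covers the payload only, not the mode indicator or the length field.

```python
# Character set of QR alphanumeric mode: digits, upper-case letters, 9 symbols.
ALPHANUMERIC = set("0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:")


def alphanumeric_bits(text: str) -> int:
    """Payload bits needed in alphanumeric mode (headers excluded)."""
    if not set(text) <= ALPHANUMERIC:
        raise ValueError("text contains characters outside the alphanumeric set")
    pairs, leftover = divmod(len(text), 2)
    # Each pair is packed into 11 bits; a trailing single character takes 6.
    return 11 * pairs + 6 * leftover


print(alphanumeric_bits("HELLO WORLD"))
# Average bits per character at the maximum length quoted above.
print(alphanumeric_bits("A" * 4296) / 4296)
```

Lower-case letters are rejected by this check, which is why ordinary mixed-case text usually ends up in the less dense byte mode.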

This ought to be close to optimal. For simpler data (say, all letters), a compression algorithm like gzip might be able to achieve fewer bits per character. Of course, no standard reader would interpret the gzipped payload as such. Only a special reader would be able to.

Can you get 5x more data into a QR code this way? No, almost surely not, unless it's a trivial case like 20,000 "a"s.

Even if you could, it would create a large complex QR code. Anything holding over a few hundred bytes gets hard to scan in practice. Version 40, the largest, is useless in the real world. Even version 20 is.

Blen answered 14/6, 2012 at 12:34 Comment(1)
Sean, thanks for feedback :) I'm thinking about building my own app to read and unzip "zipQR".Rodneyrodolfo
H
9

If your question is: "Is it possible to store 20K characters in QR Code?", then the answer is yes, it is possible.

If your question is: "Is it possible to guarantee that you'll always be able to store 20K characters in a QR Code thanks to compression?", the answer is no. There is no way to guarantee that, due to the pigeonhole principle: no lossless compressor can shrink every possible input.
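A quick way to see the pigeonhole argument in practice is to feed a compressor input that has no redundancy. The sketch below is only an illustration: it uses pseudo-random bytes as a stand-in for such input and `zlib` as a stand-in for "a compressor".

```python
import random
import zlib

# Pseudo-random bytes stand in for input with no exploitable redundancy.
rng = random.Random(0)
data = bytes(rng.getrandbits(8) for _ in range(20_000))

packed = zlib.compress(data, 9)

# The round trip is lossless, but the "compressed" form is not smaller.
print(len(data), len(packed))
```

Real text is far from random, so this is the worst case rather than the typical one, but it shows why no fixed guarantee is possible.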

If your question is: "Is there a "comfortable zone" in which a text input of up to 20K characters will most probably fit into a QR Code?", the proper answer is: it depends on your input data. And a riskier answer is: if you're dealing with "normal text" data, such as the content of a book, you're probably asking for too much.

The 80-90% compression ratio you refer to is possible because the input data is extremely large (several MB), and the decompression algorithms are very slow. For "small" input data, such as 20K characters, the compression ratio for "normal text" will more likely be in the 50-70% range, depending on the algorithm's strength (PPM, for example, is very suitable for such input data).

Obviously, if your input data is a kind of "log file" with a great many repetitions, then yes, a compression ratio above 95% is easily achievable.
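The dependence on the input can be checked with a few lines of Python. This is a rough sketch under stated assumptions: both inputs are synthetic (a repeated log line and a made-up "name|value" table with pseudo-random content), and the three standard-library codecs merely stand in for "available algorithms"; PPM and 7zip are not part of the standard library.

```python
import bz2
import lzma
import random
import zlib


def savings(data: bytes) -> dict:
    """Fraction of space saved by each codec (higher means better compression)."""
    codecs = {"zlib": zlib.compress, "bz2": bz2.compress, "lzma": lzma.compress}
    return {name: 1 - len(fn(data)) / len(data) for name, fn in codecs.items()}


# Hypothetical log-like input: one 40-byte line repeated 500 times (20,000 bytes).
log_like = b"2012-06-14 12:22:00 INFO request served\n" * 500

# Hypothetical "name|value" table with 400 pseudo-random records.
rng = random.Random(1)
letters = "abcdefghijklmnopqrstuvwxyz"
rows = [
    "".join(rng.choice(letters) for _ in range(8)) + "|" + str(rng.randrange(10**6))
    for _ in range(400)
]
table = "\n".join(rows).encode("ascii")

for label, data in (("log-like", log_like), ("table", table)):
    print(label, len(data), {k: round(v, 2) for k, v in savings(data).items()})
```

Running the same function on a sample of the real data is the only way to know which end of the range it falls in.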

But compression ratio is not the only thing to take into consideration. For "real-life" usage, you'll also have to consider the size of the QR Code and a reasonable level of error correction for the printed code to survive. Betting on "max capacity with lowest possible correction" is a poor bet, at least for real-life scenarios. You'll have to ask around to find out what the "reasonable limits" of your QR Code are. Most probably, printing capabilities will get in the way, and you'll have to settle for something less than the maximum.

Last point: don't forget that compressed data is "binary", not "alphanumeric". As a consequence, the final capacity of your QR Code is the one given in the last column of the capacity table you linked, which is much less than the "alphanumeric" column.
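As a back-of-the-envelope check, the sketch below computes how much 20,000 one-byte characters would have to shrink to fit into the byte mode of a version 40 code at each error-correction level. The capacity figures are the commonly published ones for version 40 and are worth double-checking against the table linked in the question; the function name `required_saving` is made up for illustration.

```python
# Commonly published maximum capacities of a version 40 QR code, per
# error-correction level: (alphanumeric characters, bytes in byte mode).
V40_CAPACITY = {
    "L": (4296, 2953),
    "M": (3391, 2331),
    "Q": (2420, 1663),
    "H": (1852, 1273),
}


def required_saving(n_chars: int, level: str) -> float:
    """Fraction by which n_chars one-byte characters must shrink to fit byte mode."""
    byte_capacity = V40_CAPACITY[level][1]
    return max(0.0, 1 - byte_capacity / n_chars)


for level in "LMQH":
    print(level, f"{required_saving(20_000, level):.1%}")
```

Even at the weakest correction level, the required saving sits above the 50-70% range quoted for normal text, and it only grows as the correction level is raised.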

Homeopathist answered 14/6, 2012 at 12:22 Comment(2)
Thanks for your feedback. Cyan, I have to admit you are completely right about the binary nature of a "zipped" text file. I forgot about that little detail. ;) I would like to insert a table with two or three columns: "name|value" or "name|value|value". Both columns would contain letters and digits. Up to 300-400 records.Rodneyrodolfo
If you are wondering about the compression ratio, you can run some tests on your table using the available algorithms. 7zip, for instance, is known to perform fairly well on "structured" data. This would provide a first estimate.Homeopathist
P
4

In practice, when you want to use a QR code to store huge amounts of data, you simply store a URL pointing to the location of the data.

What is theoretically possible is very different to what is actually possible when you have to support real-life devices. Good luck scanning anything above version 10 (57x57 modules) with a low-end smartphone camera.
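The module counts follow from the version number: a version v code has 4v + 17 modules per side. The sketch below also estimates how small each module gets at a given print width; the 50 mm width is an arbitrary example, and the 4-module quiet zone on each side is the usual recommended margin, taken here as an assumption.

```python
def modules_per_side(version: int) -> int:
    """Side length, in modules, of a QR code of the given version (1 to 40)."""
    if not 1 <= version <= 40:
        raise ValueError("QR versions run from 1 to 40")
    return 4 * version + 17


def module_size_mm(version: int, printed_width_mm: float, quiet_zone: int = 4) -> float:
    """Width of one module when the code plus its quiet zone fills printed_width_mm."""
    return printed_width_mm / (modules_per_side(version) + 2 * quiet_zone)


for version in (10, 20, 40):
    side = modules_per_side(version)
    print(version, f"{side}x{side}", round(module_size_mm(version, 50.0), 2))
```

The smaller the module, the more the result depends on print quality and on the camera's resolution and focus.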

Paleography answered 22/4, 2015 at 22:2 Comment(1)
URL/Pointer looks a lot more promising than I was thinking before. Thank you for your reply @NiloVelez. :)Rodneyrodolfo
G
0

It depends on your data, as others have pointed out. If it's binary, I don't know. If it's pure text, definitely.

The trick is to compress it and then render it into a form a QR encoder can understand (e.g. base64).
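As a rough Python equivalent of that idea, the sketch below compresses text with gzip and wraps the result in base64, then reverses both steps. The helper names `pack` and `unpack` and the sample text are made up for illustration; producing or scanning the QR image itself is left out.

```python
import base64
import gzip


def pack(text: str) -> str:
    """Compress text with gzip, then base64-encode it into printable ASCII."""
    return base64.b64encode(gzip.compress(text.encode("utf-8"))).decode("ascii")


def unpack(payload: str) -> str:
    """Reverse of pack(): base64-decode, then gunzip."""
    return gzip.decompress(base64.b64decode(payload)).decode("utf-8")


# Hypothetical, highly repetitive sample; real text will compress far less.
sample = "Lorem ipsum dolor sit amet, consectetur adipiscing elit. " * 50
payload = pack(sample)

print(len(sample), len(payload))
assert unpack(payload) == sample
```

Two costs are worth keeping in mind: base64 emits 4 characters for every 3 bytes, and its alphabet includes lower-case letters, so the payload cannot use the QR alphanumeric mode and is stored in byte mode.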

Demo of encoding about 173kB of Lorem Ipsum (assumes you have a qr program to generate the QR code; alternatively, copy-paste the base64-encoded string into an online QR generator):

cat in.txt | gzip | base64 --wrap=0 | xclip -f | qr

This renders a QR code (image not included).

You can get the original text back by scanning the QR code and pasting the decoded string into an echo command that pipes it through base64 and gunzip:

echo 'H4sIAAAAAAAAA+3YS3LcNhSF4blWgQWotAdXMnRlksoCIBJqIeajTQJyrNXngt2KFJeTSgapHOH+k8S2Hk3+PBLq64/rluaQz3udw7hO6xb2XEKcU7kNw7rsqaRS7R/jaJ8z5OUU0pTLdhv2NIYxxzks61LnryHlbV7HUNJ8tm+Sl6c81qWEWsIU7+1FQiqXF0hhjqclhjjlz/WrfYO0xfLm+z2tUz2XGu/ChxKe0raGtO7tq+Mw1L19fgm/1r2sYazr9VseH08xbOm+znfhZ7voMNh1xvAp7mM41fu0nba03NrV2ivFUOKnPEf7+B6XoVT7+r2Ej3/Z4u5vPkYnOv37Tjc3P9a8h1jtBu2qppDsk/KWq93P5XPzEh7TMm5psy+yvzzV6VxLLKl9uv1T2ne79XWyV8rp0vZzbXfevlueppcXtVI1PKR6yrGEpU5TDA9xyFPe2+u/JNlem8x2D+3P+Yiyjtmq5NOS9z3P4XPN4X6Ky2hXcN5i2pM9lBY4FnvB5+ctT2FMU1randZTtbtp93m9Erv4diUxf3sl/2A4Q1tO2kJbTn6dzrfLseks+f7Rbjrvx4DyMryZjr3Kd4dz3c2xlnKOxxP6pYQvdm3Bbma2lw1zbn94sr/G+baV2O0l97LVMaTf0jZkm0HJ6xLarc3Dup3tcvdq13q2W5lWm21pX5P3qV3M8br5bF/bpjass13u+vog7wIL+dNCbm5+sqc0ZfvZe/m1YJcS9vbEoj2Je/su9n3zgzUJ6/l4EnbT7RWW/Givmmd7IGM+nv/c9pNHu1rbyByf7cLPUxyODTy0/4fzetxO3Pf2K0Bunv/3ODV+gbXrUPq1zvH3H3f68Hrhf9zxcZ+X/1yKvNxFu6NryOOmjzJ2L9fGrdE1cLvf8pr5EuRa+s3t263an9/c/9Hj8gAsxbXXS4RW5niNlzr1+gRen8k72VP7MfuB+ao9Fjq9j06EYDB0gtkCiJJzDMwWWwjMhtkwu4/jD2bDbI/zpZPHToRgMHSC2QKIknMMzBZbCMyG2TC7j+MPZsNsj/Olk8dOhGAwdILZAoiScwzMFlsIzIbZMLuP4w9mw2yP86WTx06EYDB0gtkCiJJzDMwWWwjMhtkwu4/jD2bDbI/zpZPHToRgMHSC2QKIknMMzBZbCMyG2TC7j+MPZsNsj/Olk8dOhGAwdILZAoiScwzMFlsIzIbZMLuP4w9mw2yP86WTx06EYDB0gtkCiJJzDMwWWwjMhtkwu4/jD2bDbI/zpZPHToRgMHSC2QKIknMMzBZbCMyG2TC7j+MPZsNsj/Olk8dOhGAwdILZAoiScwzMFlsIzIbZMLuP4w9mw2yP86WTx06EYDB0gtkCiJJzDMwWWwjMhtkwu4/jD2bDbI/zpZPHToRgMHSC2QKIknMMzBZbCMyG2TC7j+MPZsNsj/Olk8dOhGAwdILZAoiScwzMFlsIzIbZMLuP4w9mw2yP86WTx06EYDB0gtkCiJJzDMwWWwjMhtkwu4/jD2bDbI/zpZPHToRgMHSC2QKIknMMzBZbCMyG2TC7j+MPZsNsj/Olk8dOhGAwdILZAoiScwzMFlsIzIbZMLuP4w9mw2yP86WTx06EYDB0gtkCiJJzDMwWWwjMhtkwu4/jD2bDbI/zpZPHToRgMHSC2QKIknMMzBZbCMyG2TC7j+MPZsNsj/Olk8dOhGAwdILZAoiScwzMFlsIzIbZMLuP4w9mw2yP86WTx06EYDB0gtkCiJJzDMwWWwjMhtkwu4/jD2bDbI/zpZPHToRgMHSC2QKIknMMzBZbCMyG2TC7j+MPZsNsj/Olk8dOhGAwdILZAoiScwzMFlsIzIbZMLuP4w9mw2yP86WTx06EYDB0gtkCiJJzDMwWWwjMhtkwu4/jD2bDbI/zpZPHToRgMHSC2QKIknMMzBZbCMyG2TC7j+MPZsNsj/Olk8dOhGAwdILZAoiScwzMFlsIzIbZMLuP4w9mw2yP86WTx0
6EYDB0gtkCiJJzDMwWWwjMhtkwu4/jD2bDbI/zpZPHToRgMHSC2QKIknMMzBZbCMyG2TC7j+MPZsNsj/Olk8dOhGAwdILZAoiScwzMFlsIzIbZMLuP4w9mw2yP86WTx06EYDB0gtkCiJJzDMwWWwjMhtkwu4/jD2bDbI/zpZPHToRgMHSC2QKIknMMzBZbCMyG2TC7j+MPZsNsj/Olk8dOhGAwdILZAoiScwzMFlsIzIbZMLuP4w9mw2yP86WTx06EYDB0gtkCiJJzDMwWWwjMhtkwu4/jD2bDbI/zpZPHToRgMHSC2QKIknMMzBZbCMyG2TC7j+MPZsNsj/Olk8dOhGAwdILZAoiScwzMFlsIzIbZMLuP4w9mw2yP86WTx06EYDB0gtkCiJJzDMwWWwjMhtkwu4/jD2bDbI/zpZPHToRgMHSC2QKIknMMzBZbCMyG2TC7j+MPZsNsj/Olk8dOhGAwdILZAoiScwzMFlsIzIbZMLuP4w9mw2yP86WTx06EYDB0gtkCiJJzDMwWWwjMhtkwu4/j7/IxJsSEeKeGd2qYLz/mdOKdGgZDJ7lOGtARcrgchXmnRmwhvFPDOzUwu4/jD2bDbI/zpZPHToRgMHSC2QKIknMMzBZbCMyG2TC7j+MPZsNsj/Olk8dOhGAwdILZAoiScwzMFlsIzIbZMLuP4w9mw2yP86WTx06EYDB0gtkCiJJzDMwWWwjMhtkwu4/j7/IxHt27fXS8Q/JO9sQ7JJKPhU7voxMhGAydeIdEwL9yBOUdErGF8A4J75DwDkkfxx/Mhtke50snj50IwWDoBLMFECXnGJgtthCYDbO/w+zfAVCuR60NowIA' | base64 -d | gunzip
Gujarati answered 26/2, 2023 at 13:42 Comment(1)
Update: It seems my choice of input heavily affected the measurement. A repeating pattern of Lorem Ipsum is a poor choice (low variability). I tried it on normal text and could encode much, much less, on the order of a few kB.Gujarati
