Before trying to perform this task on JPEG, you can try it with a simpler format.
You can, for example, begin with PGM files.
PGM files are grayscale images (shades of gray, from black to white). You can create a very simple PGM file using GIMP (Export As -> PGM -> raw).
For this example, I drew this really simple 4x4 image:
*Be careful! The image linked above is an enlarged JPEG version of my 4x4 PGM image, not the actual file!*
PGM, like every image format, follows a specification.
You can find the specification here.
The most interesting part is this:
Each PGM image consists of the following:
A "magic number" for identifying the file type. A pgm image's magic number is the two characters "P5".
Whitespace (blanks, TABs, CRs, LFs).
A width, formatted as ASCII characters in decimal.
Whitespace.
...
This describes how a PGM file is laid out!
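To make that layout concrete, here is a minimal sketch that builds the bytes of a raw PGM file by hand. The 4x4 pixel values and the names `pixels`, `header`, `raster` and `pgm_data` are made up for illustration, and it assumes a maximum gray value of 255 so that each pixel fits in one byte.

```python
# Hypothetical 4x4 image: 0 is black, 255 is white.
pixels = [
    [0, 255, 0, 255],
    [255, 0, 0, 255],
    [255, 0, 0, 0],
    [255, 0, 255, 255],
]
width, height, max_value = 4, 4, 255

# Header: magic number, width, height and maximum gray value, all written
# as ASCII and separated by whitespace. One whitespace character (here a
# newline) ends the header.
header = f"P5\n{width} {height}\n{max_value}\n".encode("ascii")

# Raster: one byte per pixel (because max_value < 256), stored row by row.
raster = bytes(value for row in pixels for value in row)

pgm_data = header + raster
print(pgm_data[:2])   # the magic number
print(len(pgm_data))  # header length + width * height
```

Writing `pgm_data` to a file opened in `'wb'` mode should give a file that an image viewer can open as a PGM image.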
So now, following this specification, we can write a very simple Python PGM parser!
# Open the PGM file in binary mode. Since this is a raw-encoded file,
# img.read() returns bytes!
img = open('./maze_test.pgm', 'rb')
# The first two bytes are the magic number that identifies a raw PGM file.
# It is encoded in ASCII. Since every ASCII character takes 1 byte,
# we have to read 2 bytes according to the specification
print(img.read(2))
# The rest of the first line: just the newline after the magic number
print(img.readline())
# This line is a comment added by GIMP
print(img.readline())
# This line is ASCII. It contains the width, then a space, and then the
# height, both written as decimal digits
width_height = img.readline().decode('ascii')
# Split the line into a list (split() also drops the trailing newline)
width_height = width_height.split()
# The first element is the width
width = int(width_height[0])
# The second is the height
height = int(width_height[1])
# The max_value encoded in ASCII
max_value = int(img.readline())
# From here on, there is only binary pixel data
pixel_map = []
# Pixels are stored row by row, so the outer loop runs over the height
for row in range(height):
    # Prepare the next row in our list
    pixel_map.append([])
    for column in range(width):
        # The value we read is a single byte. We simply use ord to convert it to int
        pixel_value = ord(img.read(1))
        # Integer division by max_value: 1 for a pixel at max_value, 0 otherwise
        pixel_value = pixel_value // max_value
        pixel_map[row].append(pixel_value)
img.close()
# Here is the pixel map
print(pixel_map)
Outputs:
[[0, 1, 0, 1], [1, 0, 0, 1], [1, 0, 0, 0], [1, 0, 1, 1]]
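The parser above assumes the exact header that GIMP wrote (one comment line, width and height on the same line). As a hedged sketch of a slightly more tolerant variant, the helper below reads the header token by token and skips `#` comments wherever they appear. The names `read_tokens` and `parse_pgm` and the sample bytes are hypothetical, and the sketch only handles images with one byte per pixel (maximum value below 256).

```python
import io


def read_tokens(stream, count):
    """Read `count` whitespace-separated header tokens, skipping '#' comments."""
    tokens = []
    current = b""
    while len(tokens) < count:
        char = stream.read(1)
        if not char:
            raise ValueError("unexpected end of file in PGM header")
        if char == b"#":
            # A comment runs to the end of the line; treat it as whitespace.
            stream.readline()
            char = b"\n"
        if char.isspace():
            if current:
                tokens.append(current)
                current = b""
        else:
            current += char
    return tokens


def parse_pgm(stream):
    """Return (width, height, max_value, rows) for a raw (P5) PGM stream."""
    magic, width, height, max_value = read_tokens(stream, 4)
    if magic != b"P5":
        raise ValueError("not a raw PGM file")
    width, height, max_value = int(width), int(height), int(max_value)
    if max_value > 255:
        raise ValueError("this sketch only handles one byte per pixel")
    rows = []
    for _ in range(height):
        row = stream.read(width)
        if len(row) != width:
            raise ValueError("pixel data is shorter than the header announces")
        # Iterating over bytes yields ints, so no ord() call is needed.
        rows.append(list(row))
    return width, height, max_value, rows


# Made-up 3x2 image held in memory instead of a file on disk.
sample = b"P5\n# made-up comment\n3 2\n255\n" + bytes([0, 255, 0, 255, 0, 255])
width, height, max_value, rows = parse_pgm(io.BytesIO(sample))
print(rows)  # [[0, 255, 0], [255, 0, 255]]
```

With a real file, the same function can be called as `parse_pgm(open('./maze_test.pgm', 'rb'))`. The sample is deliberately non-square to show that rows follow the height and columns follow the width.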