I finally came up with a solution: a Python script, shown below (I've also created a GitHub gist).
Here is the main idea behind the code:
- Iterate over all blog posts.
- Find all img tags in each post.
- Extract the content of the src attribute.
- Open the image and extract its size.
- Write the image size in the img attributes width and height.
The code:
    #!/usr/bin/env python3
    import glob

    from bs4 import BeautifulSoup
    from PIL import Image

    # Path where the posts are, in markdown format
    path = "/ruta/ficheros/*.md"

    # Iterate over all posts
    for fname in glob.glob(path):
        # Open the post and create a BeautifulSoup object to parse it.
        # html.parser does not wrap the content in <html>/<body> tags.
        with open(fname, encoding="utf-8") as f:
            soup = BeautifulSoup(f, "html.parser")

        # For each img tag:
        for img in soup.find_all('img'):
            try:
                if img['src'].startswith("/assets"):
                    # Open the image and get its size
                    with Image.open("/ruta/carpeta/imagenes" + img['src']) as pil:
                        width, height = pil.size
                    # Modify img tag with image size
                    img['width'] = str(width) + "px"
                    img['height'] = str(height) + "px"
            except KeyError:
                # img tag without a src attribute: leave it untouched
                pass

        # Save the updated post
        with open(fname, "w", encoding="utf-8") as out:
            out.write(str(soup))
How to use
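Just save the script on your machine and change the path variable to point to where your posts are. The image folder prefix passed to Image.open ("/ruta/carpeta/imagenes") also needs to match your setup.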
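Since the script rewrites every post in place, it may be worth previewing which images it would touch before running it. The following is only a minimal sketch of such a dry run, not part of the original script: it uses the standard library's html.parser instead of BeautifulSoup, and the names ImgSrcCollector, asset_images and sample are made up for illustration.

```python
from html.parser import HTMLParser


class ImgSrcCollector(HTMLParser):
    """Hypothetical helper: collects the src of every img tag it sees."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; img tags without src are skipped
        if tag == "img":
            src = dict(attrs).get("src")
            if src is not None:
                self.sources.append(src)


def asset_images(text, prefix="/assets"):
    """Return the src values found in text that start with prefix."""
    collector = ImgSrcCollector()
    collector.feed(text)
    collector.close()
    return [src for src in collector.sources if src.startswith(prefix)]


# Made-up post content, just to show the behaviour
sample = (
    "Some *markdown* text\n"
    '<img src="/assets/foo.png" alt="foo">\n'
    '<img src="http://example.com/bar.png"/>\n'
    "<img alt='no source'>\n"
)
print(asset_images(sample))  # prints ['/assets/foo.png']
```

To check real posts, one option is to read each file returned by glob.glob(path) and pass its contents to asset_images; nothing is written to disk.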
Hope it helps. I've also written a blog post about this issue.