How to create JPEG compressed DICOM dataset using pydicom?
Asked Answered
I

2

17

I am trying to create a JPEG compressed DICOM image using pydicom. A nice source material about colorful DICOM images can be found here, but it's mostly theory and C++. In the code example below I create a pale blue ellipsis inside output-raw.dcm (uncompressed) which looks fine like this:

Sample DICOM image

import io
from PIL import Image, ImageDraw
from pydicom.dataset import Dataset
from pydicom.uid import generate_uid, JPEGExtended
from pydicom._storage_sopclass_uids import SecondaryCaptureImageStorage

WIDTH = 100
HEIGHT = 100


def ensure_even(stream):
    # Very important for some viewers
    if len(stream) % 2:
        return stream + b"\x00"
    return stream


def bob_ross_magic():
    image = Image.new("RGB", (WIDTH, HEIGHT), color="red")
    draw = ImageDraw.Draw(image)
    draw.rectangle([10, 10, 90, 90], fill="black")
    draw.ellipse([30, 20, 70, 80], fill="cyan")
    draw.text((11, 11), "Hello", fill=(255, 255, 0))
    return image


ds = Dataset()
ds.is_little_endian = True
ds.is_implicit_VR = True
ds.SOPClassUID = SecondaryCaptureImageStorage
ds.SOPInstanceUID = generate_uid()
ds.fix_meta_info()
ds.Modality = "OT"
ds.SamplesPerPixel = 3
ds.BitsAllocated = 8
ds.BitsStored = 8
ds.HighBit = 7
ds.PixelRepresentation = 0
ds.PhotometricInterpretation = "RGB"
ds.Rows = HEIGHT
ds.Columns = WIDTH

image = bob_ross_magic()
ds.PixelData = ensure_even(image.tobytes())

image.save("output.png")
ds.save_as("output-raw.dcm", write_like_original=False)  # File is OK

#
# Create compressed image
#
output = io.BytesIO()
image.save(output, format="JPEG")

ds.PixelData = ensure_even(output.getvalue())
ds.PhotometricInterpretation = "YBR_FULL_422"
ds.file_meta.TransferSyntaxUID = JPEGExtended

ds.save_as("output-jpeg.dcm", write_like_original=False)  # File is corrupt

At the very end I am trying to create compressed DICOM: I tried setting various transfer syntaxes, compressions with PIL, but no luck. I believe the generated DICOM file is corrupt. If I were to convert the raw DICOM file to JPEG compressed with gdcm-tools:

$ gdcmconv -J output-raw.dcm output-jpeg.dcm

By doing a dcmdump on this converted file we could see an interesting structure, which I don't know how to reproduce using pydicom:

$ dcmdump output-jpeg.dcm

# Dicom-File-Format

# Dicom-Meta-Information-Header
# Used TransferSyntax: Little Endian Explicit
(0002,0000) UL 240                                      #   4, 1 FileMetaInformationGroupLength
(0002,0001) OB 00\01                                    #   2, 1 FileMetaInformationVersion
(0002,0002) UI =SecondaryCaptureImageStorage            #  26, 1 MediaStorageSOPClassUID
(0002,0003) UI [1.2.826.0.1.3680043.8.498.57577581978474188964358168197934098358] #  64, 1 MediaStorageSOPInstanceUID
(0002,0010) UI =JPEGLossless:Non-hierarchical-1stOrderPrediction #  22, 1 TransferSyntaxUID
(0002,0012) UI [1.2.826.0.1.3680043.2.1143.107.104.103.115.2.8.4] #  48, 1 ImplementationClassUID
(0002,0013) SH [GDCM 2.8.4]                             #  10, 1 ImplementationVersionName
(0002,0016) AE [gdcmconv]                               #   8, 1 SourceApplicationEntityTitle

# Dicom-Data-Set
# Used TransferSyntax: JPEG Lossless, Non-hierarchical, 1st Order Prediction
...
... ### How to do the magic below?
...
(7fe0,0010) OB (PixelSequence #=2)                      # u/l, 1 PixelData
  (fffe,e000) pi (no value available)                     #   0, 1 Item
  (fffe,e000) pi ff\d8\ff\ee\00\0e\41\64\6f\62\65\00\64\00\00\00\00\00\ff\c3\00\11... # 4492, 1 Item
(fffe,e0dd) na (SequenceDelimitationItem)               #   0, 0 SequenceDelimitationItem

I tried to use pydicom's encaps module, but I think it's mostly for reading data, not writing. Anyone else have any ideas how to deal with this issue, how to create/encode these PixelSequences? Would love to create JPEG compressed DICOMs in plain Python without running external tools.

Inflow answered 23/10, 2019 at 8:11 Comment(3)
Are you able to read the JPEG compressed image through PyDicom?Riemann
Yes of course I can decompress and read it. Of course, you need a few extra libraries installed, here's the possible combinations: pydicom.github.io/pydicom/stable/image_data_handlers.htmlInflow
Did this use case every get resolved? I'd love to see some documentation on it myself.Mosely
C
7

DICOM requires compressed Pixel Data be encapsulated (see the tables especially). Once you have your compressed image data you can use the encaps.encapsulate() method to create bytes suitable for use with Pixel Data:

from pydicom.encaps import encapsulate

# encapsulate() requires a list of bytes, one item per frame
ds.PixelData = encapsulate([ensure_even(output.getvalue())])
# Need to set this flag to indicate the Pixel Data is compressed
ds['PixelData'].is_undefined_length = True  # Only needed for < v1.4
ds.PhotometricInterpretation = "YBR_FULL_422"
ds.file_meta.TransferSyntaxUID = JPEGExtended

ds.save_as("output-jpeg.dcm", write_like_original=False)
Castile answered 27/12, 2019 at 10:53 Comment(2)
This works, but only for single frame jpeg. Anyone know how to encode a multiframe jpeg?Mosely
Each frame needs to be encoded separately and then all the frames encapsulated with encapsulate([frame1, frame2, ...])Castile
M
4

Trying the solution from @scaramallion, with more detail looks to work:

import numpy as np
from PIL import Image
import io

# set some parameters
num_frames = 4
img_size = 10

# Create a fake RGB dataset
random_image_array = (np.random.random((num_frames, img_size, img_size, 3))*255).astype('uint8')
# Convert to PIL
imlist = []
for i in range(num_frames):   # convert the multiframe image into RGB of single frames (Required for compression)
    imlist.append(Image.fromarray(tmp))

# Save the multipage tiff with jpeg compression
f = io.BytesIO()
        imlist[0].save(f, format='tiff', append_images=imlist[1:], save_all=True, compression='jpeg')
# The BytesIO object cursor is at the end of the object, so I need to tell it to go back to the front
f.seek(0)
img = Image.open(f)

# Get each one of the frames converted to even numbered bytes
img_byte_list = []
for i in range(num_frames):
    try:
        img.seek(i)
        with io.BytesIO() as output:
            img.save(output, format='jpeg')
            img_byte_list.append(output.getvalue())
    except EOFError:
         # Not enough frames in img
         break

ds.PixelData = encapsulate([x for x in img_byte_list])
ds['PixelData'].is_undefined_length = True
ds.is_implicit_VR = False
ds.LossyImageCompression = '01'
ds.LossyImageCompressionRatio = 10 # default jpeg
ds.LossyImageCompressionMethod = 'ISO_10918_1'
ds.file_meta.TransferSyntaxUID = '1.2.840.10008.1.2.4.51'

ds.save_as("output-jpeg.dcm", write_like_original=False)
Mosely answered 29/12, 2019 at 17:33 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.