How can I retrieve images from a .pptx file using MS Open XML SDK?

I started experimenting with Open XML SDK 2.0 for Microsoft Office.

I'm currently able to do certain things such as retrieve all texts in each slide, and get the size of the presentation. For example, I do the latter this way:

using (var doc = PresentationDocument.Open(pptx_filename, false)) {
     var presentation = doc.PresentationPart.Presentation;

     Debug.Print("width: " + (presentation.SlideSize.Cx / 9525.0).ToString());
     Debug.Print("height: " + (presentation.SlideSize.Cy / 9525.0).ToString());
}

Now I'd like to retrieve embedded images in a given slide. Does anyone know how to do this or can point me to some docs on the subject?

Darryl answered 15/8, 2011 at 20:0 Comment(1)
I'm curious - why the " / 9525.0"? The standard divisor for EMU-to-points is " / 12700". - Acalia

First, get the SlidePart that contains the images you want:

public static SlidePart GetSlidePart(PresentationDocument presentationDocument, int slideIndex)
{
    if (presentationDocument == null)
    {
        throw new ArgumentNullException("presentationDocument", "GetSlidePart Method: parameter presentationDocument is null");
    }

    // Get the number of slides in the presentation
    int slidesCount = CountSlides(presentationDocument);

    if (slideIndex < 0 || slideIndex >= slidesCount)
    {
        throw new ArgumentOutOfRangeException("slideIndex", "GetSlidePart Method: parameter slideIndex is out of range");
    }

    PresentationPart presentationPart = presentationDocument.PresentationPart;

    // Verify that the presentation part and presentation exist.
    if (presentationPart != null && presentationPart.Presentation != null)
    {
        Presentation presentation = presentationPart.Presentation;

        if (presentation.SlideIdList != null)
        {
            // Get the collection of slide IDs from the slide ID list.
            var slideIds = presentation.SlideIdList.ChildElements;

            if (slideIndex < slideIds.Count)
            {
                // Get the relationship ID of the slide.
                string slidePartRelationshipId = (slideIds[slideIndex] as SlideId).RelationshipId;

                // Get the specified slide part from the relationship ID.
                SlidePart slidePart = (SlidePart)presentationPart.GetPartById(slidePartRelationshipId);

                return slidePart;
            }
        }
    }

    // No slide found
    return null;
}
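The method above calls CountSlides, which is not shown in this answer. A minimal sketch of such a helper, assuming it simply counts the slide parts attached to the presentation part (the class name SlideCounter is hypothetical and only there to make the snippet compile on its own):

```csharp
using System;
using System.Linq;
using DocumentFormat.OpenXml.Packaging;

public static class SlideCounter
{
    // Assumed implementation of the CountSlides helper used by GetSlidePart.
    public static int CountSlides(PresentationDocument presentationDocument)
    {
        if (presentationDocument == null)
        {
            throw new ArgumentNullException("presentationDocument");
        }

        PresentationPart presentationPart = presentationDocument.PresentationPart;

        // A document without a presentation part has no slides.
        return presentationPart != null ? presentationPart.SlideParts.Count() : 0;
    }
}
```

This counts slide parts rather than entries in SlideIdList; the two normally agree, but GetSlidePart still checks the index against the ID list before using it.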

Then search the slide for the Picture element whose non-visual properties mention the file name of the image you are looking for:

// Requires System.Linq. Returns null if nothing matches; throws if more than one picture matches.
Picture picture = slidePart.Slide.Descendants<Picture>().SingleOrDefault(p => p.NonVisualPictureProperties.OuterXml.Contains(imageFileName));
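If you do not know the file name and just want every image on the slide, one option is to enumerate the slide's image parts and copy their streams to disk. This is only a sketch: the class SlideImageExporter, the method SaveSlideImages and the outputDir parameter are hypothetical names, and it only covers images related directly to the slide, not those on its layout or master.

```csharp
using System.Collections.Generic;
using System.IO;
using DocumentFormat.OpenXml.Packaging;

public static class SlideImageExporter
{
    // Copies every image part related to the slide into outputDir
    // and returns the paths of the files that were written.
    public static List<string> SaveSlideImages(SlidePart slidePart, string outputDir)
    {
        Directory.CreateDirectory(outputDir);
        var saved = new List<string>();

        foreach (ImagePart imagePart in slidePart.ImageParts)
        {
            // The part URI looks like /ppt/media/image1.png; reuse its file name.
            string fileName = Path.GetFileName(imagePart.Uri.OriginalString);
            string target = Path.Combine(outputDir, fileName);

            using (Stream source = imagePart.GetStream())
            using (FileStream destination = File.Create(target))
            {
                source.CopyTo(destination);
            }

            saved.Add(target);
        }

        return saved;
    }
}
```

To cover a whole presentation, the same method could be called for each slide part in doc.PresentationPart.SlideParts.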
Roesch answered 16/8, 2011 at 0:24 Comment(3)
How to convert SlidePart to an actual image which can be in imageList? - Selenodont
This code seems to assume that you know the file name of the image - right? What if I just wanted to retrieve the first image in a PPTX file or all images in a PPTX file? - Purse
Is there any way to convert all slide(s) into image(s) or svg? - Ragged

The simplest way of getting images out of Open XML formats:

A .pptx file is a zip archive, so you can use any zip library to extract the contents of its ppt/media folder, which holds the images embedded in the document. You can also do this by hand: rename the file from .pptx to .zip, extract it, and look in the ppt/media folder.
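As a rough sketch of the zip approach in code, the following uses the framework's own System.IO.Compression classes (on .NET Framework this needs a reference to System.IO.Compression.FileSystem). The class name PptxMediaExtractor and the method ExtractMedia are hypothetical, and note that ppt/media can also contain audio or video files, not just images.

```csharp
using System;
using System.IO;
using System.IO.Compression;

public static class PptxMediaExtractor
{
    // Extracts every file under ppt/media/ into outputDir and returns how many were written.
    public static int ExtractMedia(string pptxPath, string outputDir)
    {
        Directory.CreateDirectory(outputDir);
        int count = 0;

        using (ZipArchive archive = ZipFile.OpenRead(pptxPath))
        {
            foreach (ZipArchiveEntry entry in archive.Entries)
            {
                bool isMedia = entry.FullName.StartsWith("ppt/media/", StringComparison.OrdinalIgnoreCase);

                // entry.Name is empty for directory entries, so skip those.
                if (isMedia && entry.Name.Length > 0)
                {
                    entry.ExtractToFile(Path.Combine(outputDir, entry.Name), true);
                    count++;
                }
            }
        }

        return count;
    }
}
```

This returns every media file in the package without telling you which slide uses it; for that you need the slide's relationships, which is where the SDK comes in.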

Hope this helps.

Draughtboard answered 10/1, 2014 at 18:14 Comment(1)
The question is "How can I retrieve images from a .pptx file using MS Open XML SDK?" and you are giving a manual solution? - Speaker