How to get chart info from an Excel spreadsheet using Apache POI?
Asked Answered
F

4

6

Is it possible to extract chart information from an Office 2007 (xlsx / OpenXML) spreadsheet using Apache POI? I've managed to read in the spreadsheet and even get the part that refers to the chart but not sure how I can retrieve any info from this part e.g. Type of chart, chart data etc.

XSSFWorkbook xwb = new XSSFWorkbook("charts_lines.xlsx");

XSSFSheet sheet = xwb.getSheetAt(0);

I can also iterate through the package parts to retrieve the chart part, but I don't see how I then go on to retrieve any info about the chart?

Note, I'm not interested in creating charts using POI, just read as much chart info as is possible to do...I'm also not saving an xlsx. I simply wish to extract line colours, labels, data, chart type (pie, line, bar etc.)

Fayina answered 8/4, 2011 at 18:28 Comment(0)
B
7

There isn't a high level representation at the moment, so you'll need to drop down into the xmlbeans level and work with the low level CT* objects.

For Chart Sheets, there's XSSFChartSheet which will give you a CTChartsheet object, which has a little bit of info.

For both XSSFChart and XSSFChartSheet (regular and chart sheets), you'll need to go via the drawings to get the charts. Each sheet with charts on it should have one Drawing, and the charts get linked from the drawing, rather than the sheet itself.

As of r1090442 (so POI 3.8 or newer), there's a method on XSSFDrawing to give you all the XSSFChart objects (which are wrappers around the /charts/chart#.xml part). If you're on a really really old version of POI, use the CTDrawing to get the details of the chart, grab the /charts/chart#.xml part that corresponts, and then have xmlbeans give you the CT objects for it. Either way that'll let you get the titles, types, data ranges etc.

It is a bit fiddly though, so do please consider sending in a patch to POI if you get something good worked out for working with the CTChart objects!

Brinker answered 8/4, 2011 at 19:41 Comment(6)
As of r1090442, you should be able to get all the Charts easily from the XSSFDrawing of a XSSFSheet, hope that helps!Brinker
Thanks for that. Just wondering though, I've downloaded 3.8 beta 1 and there doesn't seem to be an XSSFDrawing member in XSSFSheet?Fayina
The method you want is poi.apache.org/apidocs/org/apache/poi/xssf/usermodel/… . You probably also want a nightly svn build for the new method to get charts from the drawing.Brinker
Thanks @Brinker I got the latest from SVN. The issue now is figuring out how to extract the series values etc. Thank you very much :)Fayina
These functions are in the latest version of Apache POI (v3.11). xssfsheet.createDrawingPatriarch().getCharts() is the chain you want.Dogcart
@KevinL getDrawingPatriarch() might be better - that'll avoid creating a new empty drawing if there are noneBrinker
D
5

you can read chart data as XML using XSSFDrawing

like

 XSSFDrawing drawing = ((XSSFSheet)sheet).createDrawingPatriarch();
        System.out.println(drawing.getCTDrawing().toString());

will print whole chart as XMl and also using

drawing.getCharts();

you can add Iterator to it to browse chart

Diamine answered 4/3, 2013 at 10:35 Comment(0)
C
2

I don't know the exact answer to your question, but the OpenXML SDK 2.0 comes with a DocumentReflector.exe tool that will show you exactly how the chart is defined (including all relationships between the SpreadsheetML and the DrawingML packages). There is some more info on this tool in this article.

Chare answered 8/4, 2011 at 18:40 Comment(0)
L
1

Yes, It is possible to read any type of chart using Apache POI. But Before reading any chart information you need to know what XML string you are receiving because this could be different based on the chart type i.e. pie, line, bar, scatter or mixed (a combination of two or more) charts, etc. Therefore you approach will be different for different type of chart.

For a simple bar chart like this:

enter image description here

Your XML will look something like this:

<xml-fragment ...>
  <c:title>
    <c:tx>
      <c:rich>
        ...
        <a:p>
          ...
          <a:r>
            ...
            <a:t>Employee Salary</a:t>
          </a:r>
        </a:p>
      </c:rich>
    </c:tx>
    ...
  </c:title>
  ...
  <c:plotArea>
    ...
    <c:barChart>
      ...
      <c:ser>
        ...
        <c:cat>
          <c:strRef>
            ...
            <c:strCache>
              <c:ptCount val="5"/>
              <c:pt idx="0">
                <c:v>Tom</c:v>
              </c:pt>
              <c:pt idx="1">
                <c:v>John</c:v>
              </c:pt>
              <c:pt idx="2">
                <c:v>Harry</c:v>
              </c:pt>
              <c:pt idx="3">
                <c:v>Sam</c:v>
              </c:pt>
              <c:pt idx="4">
                <c:v>Richa</c:v>
              </c:pt>
            </c:strCache>
          </c:strRef>
        </c:cat>
        <c:val>
          <c:numRef>
            ...
            <c:numCache>
              <c:formatCode>"$"#,##0</c:formatCode>
              <c:ptCount val="5"/>
              <c:pt idx="0">
                <c:v>1000</c:v>
              </c:pt>
              <c:pt idx="1">
                <c:v>700</c:v>
              </c:pt>
              <c:pt idx="2">
                <c:v>300</c:v>
              </c:pt>
              <c:pt idx="3">
                <c:v>900</c:v>
              </c:pt>
              <c:pt idx="4">
                <c:v>800</c:v>
              </c:pt>
            </c:numCache>
          </c:numRef>
        </c:val>
        ...
      </c:ser>
      ...
    </c:barChart>
    ...
  </c:plotArea>
  ...
</xml-fragment>

Now based on the above XML String we can use CT* classes and its various methods to traverse through the whole XML using Apache POI. Let's see how to read Chart Title, Labels(Employee names) and Series(Employee salaries) using POI:

Workbook workbook = new XSSFWorkbook(new File(PATH));
Sheet sheet = workbook.getSheet("GraphSheet");

XSSFSheet xsheet = (XSSFSheet) sheet;
XSSFDrawing drawing = xsheet.getDrawingPatriarch();
if (drawing != null) {
    List<XSSFChart> charts = drawing.getCharts();

    for (int chartIndex = 0; charts != null && chartIndex < (charts.size()); chartIndex++) {

        XSSFChart chart = charts.get(chartIndex);

        CTChart chart2 = chart.getCTChart();
        CTPlotArea plot = chart2.getPlotArea();
        System.out.println("Chart Title :" + chart2.getTitle().getTx().getRich().getPArray(0).getRArray(0).getT());

        CTBarSer[] ctScaSerList = plot.getBarChartArray(0).getSerArray();

        for (CTBarSer ctLineSer : ctScaSerList) {

            CTStrVal[] ctStrVals = ctLineSer.getCat().getStrRef().getStrCache().getPtArray();
            for (int i = 0; i < ctStrVals.length; i++) {
                System.out.print(ctStrVals[i].getV() + ",");
            }
            System.out.println();
            CTNumVal[] ctXNumVal = ctLineSer.getVal().getNumRef().getNumCache().getPtArray();

            for (int i = 0; i < ctXNumVal.length; i++) {
                System.out.print(ctXNumVal[i].getV() + ",");
            }

        }
    }
}

Console:

Chart Title :Employee Salary
Tom,John,Harry,Sam,Richa,
1000,700,300,900,800,

Note: Here, the idea is to first read the XML String(because could be different based on your graph type) and then traverse the whole XML accordingly.

Liable answered 10/5, 2020 at 14:6 Comment(2)
how do i get the position of a charts and colors used?Furan
depends on what type of chart you have, simply your approach should be, first print the XML somewhere, analyze that XML carefully, surely you will find a common pattern in charts (from which you can get the position) and you might also find colors of the chart (about the color I am not 100% sure). Once you get all the information then use CT* classes to get the values of those properties.Liable

© 2022 - 2024 — McMap. All rights reserved.