XML to Fixed width text file with xsl style sheet
G

3

6

I need help formatting this XML as a fixed-width text file using an XSL style sheet. I know very little about XSL and have found very little information online about how this can be done.

Basically I need this xml

<?xml version="1.0" encoding="UTF-8"?>
<Report>
   <table1>
      <Detail_Collection>
         <Detail>
            <SSN>*********</SSN>
            <DOB>1980/11/11</DOB>
            <LastName>user</LastName>
            <FirstName>test</FirstName>
            <Date>2013/02/26</Date>
            <Time>14233325</Time>
            <CurrentStreetAddress1>53 MAIN STREET</CurrentStreetAddress1>
            <CurrentCity>san diego</CurrentCity>
            <CurrentState>CA</CurrentState>
      </Detail_Collection>
   </table1>
</Report>

In this format, all on the same line

*********19801111user         test       201302261423332553 MAIN STREET                                    san diego          CA

These are the fixed widths

FR TO
1   9     SSN
10  17    DOB
18  33    LastName
34  46    FirstName
47  54    Date
55  62    Time
63  90    CurrentStreetAddress1 
91  115   CurrentCity
116 131   CurrentState

All help is much appreciated! Thanks in advance!

Gown answered 29/5, 2013 at 14:12 Comment(1)
Can you use an extension like node-set(), or an extra XML document/file to hold the width and order of your output? – Latish
P
6

The secret to doing this in XSLT 1.0 is to realize that you can combine a "padding strategy" with a "substring strategy" to either pad or cut off a piece of text to a desired width. In particular, XSLT instructions of this form:

substring(concat('value to pad or cut', '       '), 1, 5)

...where concat is used to add a number of padding characters to a string and substring is used to limit the overall width, are helpful. With that said, here's an XSLT 1.0 solution that accomplishes what you want.
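To see the pad-or-cut idiom in isolation before the full solution, here is a minimal sketch. The named template fixed-width and its parameters value and width are hypothetical names chosen for illustration, and the length of the blank run is an assumption (it only needs to be at least as long as the widest field).

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="text"/>

  <!-- Hypothetical helper: pad $value with blanks, then cut it to $width. -->
  <xsl:template name="fixed-width">
    <xsl:param name="value"/>
    <xsl:param name="width"/>
    <!-- Assumption: no field is wider than this run of blanks. -->
    <xsl:variable name="blanks" select="'                                        '"/>
    <xsl:value-of select="substring(concat($value, $blanks), 1, $width)"/>
  </xsl:template>

  <!-- Demo: one value shorter than the width, one longer. -->
  <xsl:template match="/">
    <xsl:text>[</xsl:text>
    <xsl:call-template name="fixed-width">
      <xsl:with-param name="value" select="'abc'"/>
      <xsl:with-param name="width" select="5"/>
    </xsl:call-template>
    <xsl:text>]&#10;[</xsl:text>
    <xsl:call-template name="fixed-width">
      <xsl:with-param name="value" select="'abcdefgh'"/>
      <xsl:with-param name="width" select="5"/>
    </xsl:call-template>
    <xsl:text>]&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

Run against any XML document, this should print [abc  ] (padded to five characters) and [abcde] (cut to five characters), each on its own line.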

Please note that in your expected output, some of the character widths do not match your requirements; for example, according to the requirements, <LastName> should be sized to 16 characters, whereas your output appears to cut it off at 13. That said, I believe my solution below outputs what you expect.

When this XSLT:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="text"/>
  <xsl:strip-space elements="*"/>

  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="Detail">
    <xsl:apply-templates />
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

  <xsl:template match="SSN">
    <xsl:value-of
      select="substring(concat(., '         '), 1, 9)"/>
  </xsl:template>

  <xsl:template match="DOB">
    <xsl:value-of
      select="substring(concat(translate(., '/', ''), '        '), 1, 8)"/>
  </xsl:template>

  <xsl:template match="LastName">
    <xsl:value-of
      select="substring(concat(., '                '), 1, 16)"/>
  </xsl:template>

  <xsl:template match="FirstName">
    <xsl:value-of
      select="substring(concat(., '             '), 1, 13)"/>
  </xsl:template>

  <xsl:template match="Date">
    <xsl:value-of
      select="substring(concat(translate(., '/', ''), '        '), 1, 8)"/>
  </xsl:template>

  <xsl:template match="Time">
    <xsl:value-of
      select="substring(concat(., '        '), 1, 8)"/>
  </xsl:template>

  <xsl:template match="CurrentStreetAddress1">
    <xsl:value-of
      select="substring(concat(., '                            '), 1, 28)"/>
  </xsl:template>

  <xsl:template match="CurrentCity">
    <xsl:value-of
      select="substring(concat(., '                         '), 1, 25)"/>
  </xsl:template>

  <!-- Columns 116-131 are 16 characters wide. -->
  <xsl:template match="CurrentState">
    <xsl:value-of
      select="substring(concat(., '                '), 1, 16)"/>
  </xsl:template>

</xsl:stylesheet>

...is run against the provided XML (with a </Detail> added to make the document well-formed):

<Report>
  <table1>
    <Detail_Collection>
      <Detail>
        <SSN>*********</SSN>
        <DOB>1980/11/11</DOB>
        <LastName>user</LastName>
        <FirstName>test</FirstName>
        <Date>2013/02/26</Date>
        <Time>14233325</Time>
        <CurrentStreetAddress1>53 MAIN STREET</CurrentStreetAddress1>
        <CurrentCity>san diego</CurrentCity>
        <CurrentState>CA</CurrentState>
      </Detail>
    </Detail_Collection>
  </table1>
</Report>

...the wanted result is produced:

*********19801111user            test         201302261423332553 MAIN STREET              san diego                CA
Paulinapauline answered 29/5, 2013 at 14:50 Comment(2)
This is perfect, thank you. When there are multiple detail sections, one for each person, is there a way to insert a line break so each detail section is on a new line? Thanks again. – Gown
@Gown - great question. I just made an update that will accommodate the scenario you describe. – Paulinapauline
L
8

Here is (in my view) a slightly more reliable and maintainable version:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" >
    <xsl:output method="text" indent="no"/>

    <xsl:variable name="some_spaces" select="'                                                                  '" />

    <xsl:template match="/">
        <xsl:apply-templates select="//Detail_Collection/Detail" />
    </xsl:template>

    <xsl:template match="Detail_Collection/Detail">
        <xsl:apply-templates mode="format" select="SSN">
            <xsl:with-param name="width" select="number(9-1)"/>
        </xsl:apply-templates>
        <xsl:apply-templates mode="format_date" select="DOB">
            <xsl:with-param name="width" select="number(17-10)"/>
        </xsl:apply-templates>
        <xsl:apply-templates mode="format" select="LastName">
            <xsl:with-param name="width" select="number(33-18)"/>
        </xsl:apply-templates>
        <xsl:apply-templates mode="format" select="FirstName">
            <xsl:with-param name="width" select="number(46-34)"/>
        </xsl:apply-templates>
        <xsl:apply-templates mode="format_date" select="Date">
            <xsl:with-param name="width" select="number(54-47)"/>
        </xsl:apply-templates>
        <xsl:apply-templates mode="format" select="Time">
            <xsl:with-param name="width" select="number(62-55)"/>
        </xsl:apply-templates>
        <xsl:apply-templates mode="format" select="CurrentStreetAddress1">
            <xsl:with-param name="width" select="number(90-63)"/>
        </xsl:apply-templates>
        <xsl:apply-templates mode="format" select="CurrentCity">
            <xsl:with-param name="width" select="number(115-91)"/>
        </xsl:apply-templates>
        <xsl:apply-templates mode="format" select="CurrentState">
            <xsl:with-param name="width" select="number(131-116)"/>
        </xsl:apply-templates>
        <xsl:text>&#10;</xsl:text>
    </xsl:template>

    <xsl:template  match="node()" mode ="format">
        <xsl:param name="width" />
        <xsl:value-of select="substring(concat(text(),$some_spaces ), 1, $width+1)"/>
    </xsl:template>
    <xsl:template  match="node()" mode="format_date">
        <xsl:param name="width" />
        <xsl:value-of select="substring(concat(translate(text(),'/',''),$some_spaces ), 1, $width+1)"/>
    </xsl:template>

</xsl:stylesheet>

It will create the right output even if the fields in the input are not in the same order as the requested output, or if fields are missing from the input. It also handles more than one Detail entry.

Latish answered 29/5, 2013 at 15:40 Comment(4)
Um, all of the widths are off by one: if CurrentStreet occupies the columns 63 through 90, its width is 28, not 90 - 63 (= 27). And similarly for all the others. – Foggia
@C.M.Sperberg-McQueen: With a little closer look you will find $width+1. ;-) I thought computers were better at calculation than humans. – Latish
Ah; I was too quick. Touché. – Foggia
Hi, I like your solution very much, but what if the data needed to be left-justified (text) or right-justified (numbers)? – Walkerwalkietalkie
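One possible way to handle the right-justified case raised in the last comment, as a sketch only: put the blanks in front of the value and keep the last $width characters. The template name pad-left and its parameters are hypothetical, not part of the answer above, and the blank run is assumed to be at least as long as the widest field.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="text"/>

  <!-- Assumption: no field is wider than this run of blanks. -->
  <xsl:variable name="some_spaces" select="'                                        '"/>

  <!-- Hypothetical helper: right-justify $value in a field of $width characters. -->
  <xsl:template name="pad-left">
    <xsl:param name="value"/>
    <xsl:param name="width"/>
    <xsl:variable name="padded" select="concat($some_spaces, $value)"/>
    <!-- Keep only the last $width characters of the padded string. -->
    <xsl:value-of select="substring($padded, string-length($padded) - $width + 1)"/>
  </xsl:template>

  <!-- Demo: the number 42 in a five-character field. -->
  <xsl:template match="/">
    <xsl:text>[</xsl:text>
    <xsl:call-template name="pad-left">
      <xsl:with-param name="value" select="'42'"/>
      <xsl:with-param name="width" select="5"/>
    </xsl:call-template>
    <xsl:text>]&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

This should print [   42]. Note that a value longer than the field keeps its rightmost characters, so whether that is acceptable for numbers depends on the target format.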
F
2

To pad a string to a given length in XSLT 1.0, I'd use a combination of concat() and substring(). In a template for Detail, for example, I might write something like

<xsl:value-of 
  select="substring(concat(SSN,'          '),1,9)"/>
<xsl:value-of 
  select="substring(concat(DOB,'          '),1,8)"/>
<xsl:value-of 
  select="substring(concat(LastName,'                '),1,16)"/>
...
<xsl:text>&#xA;</xsl:text>

If you know very little about XSLT, you will also need to learn how to construct the stylesheet: XSLT typically uses template matching to drive flow of control in the stylesheet, which is often difficult for people coming from imperative programming languages to get their heads around.

If you know that every Detail element will have the same children in the same sequence (this is one thing DTDs and schemas are good for), then the simplest thing to do is to write a template for each element type that can occur in the input. The following stylesheet illustrates the pattern for some but not all elements:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="1.0">

  <xsl:output method="text"/>

  <!--* The blanks must be quoted so that XPath reads them as a
      * string literal. *-->
  <xsl:variable name="blanks10" select="'          '"/>
  <xsl:variable name="blanks" 
    select="concat($blanks10, $blanks10, $blanks10)"/>

  <!--* For Report, table1, and Detail_Collection, we just 
      * recur on the children *-->
  <xsl:template match="Report | table1 | Detail_Collection">
    <xsl:apply-templates select="*"/>
  </xsl:template>

  <!--* For Detail, we recur on the children and supply a
      * line-ending newline. *-->
  <xsl:template match="Detail">
    <xsl:apply-templates select="*"/>
    <xsl:text>&#xA;</xsl:text>
  </xsl:template>

  <!--* For SSN, DOB, etc., we pad the value with blanks and
      * truncate at the appropriate length. *-->
  <xsl:template match="SSN">
    <xsl:value-of select="substring(concat(.,$blanks),1,9)"/>
  </xsl:template>

  <!--* For DOB, we assume input is yyyy/mm/dd and output should
      * be yyyymmdd. *-->
  <xsl:template match="DOB">
    <xsl:value-of 
      select="substring(concat(translate(.,'/',''),$blanks),1,8)"/>
  </xsl:template>

  <xsl:template match="LastName">
    <xsl:value-of select="substring(concat(.,$blanks),1,16)"/>
  </xsl:template>

  <!--* FirstName etc. left as exercise for the reader. *-->

</xsl:stylesheet>

If Detail can vary in order or population, the variation can be normalized by replacing the call to xsl:apply-templates in the template for Detail with code like that shown in the first code fragment here. That style of code also feels more natural to some procedural programmers; for that reason, I advise you to avoid it consciously while learning XSLT. If you want to learn XSLT well, it pays to become friends with xsl:apply-templates.
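As a rough sketch of that normalized variant, the Detail template below pulls each field by name, so the output order is fixed by the template rather than by the input, and a missing field comes out as blanks. The widths are computed from the column table in the question; the root template that jumps straight to //Detail is an assumption made to keep stray whitespace out of the output.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="text"/>

  <!-- Padding source; must be at least as long as the widest field (28). -->
  <xsl:variable name="blanks10" select="'          '"/>
  <xsl:variable name="blanks" select="concat($blanks10, $blanks10, $blanks10)"/>

  <!-- Assumption: only Detail elements matter, wherever they occur. -->
  <xsl:template match="/">
    <xsl:apply-templates select="//Detail"/>
  </xsl:template>

  <!-- One output line per Detail; fields are selected by name. -->
  <xsl:template match="Detail">
    <xsl:value-of select="substring(concat(SSN, $blanks), 1, 9)"/>
    <xsl:value-of select="substring(concat(translate(DOB, '/', ''), $blanks), 1, 8)"/>
    <xsl:value-of select="substring(concat(LastName, $blanks), 1, 16)"/>
    <xsl:value-of select="substring(concat(FirstName, $blanks), 1, 13)"/>
    <xsl:value-of select="substring(concat(translate(Date, '/', ''), $blanks), 1, 8)"/>
    <xsl:value-of select="substring(concat(Time, $blanks), 1, 8)"/>
    <xsl:value-of select="substring(concat(CurrentStreetAddress1, $blanks), 1, 28)"/>
    <xsl:value-of select="substring(concat(CurrentCity, $blanks), 1, 25)"/>
    <xsl:value-of select="substring(concat(CurrentState, $blanks), 1, 16)"/>
    <xsl:text>&#xA;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

The trade-off is the one described above: this style is compact and order-independent, but it bypasses template matching for the individual fields.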

If you don't care about learning XSLT, then my advice is to hope that someone answers your query by giving you a complete solution to your task.

Foggia answered 29/5, 2013 at 15:46 Comment(0)
