Convert String from ASCII to EBCDIC in Java?
H

10

19

I need to write a 'simple' util to convert from ASCII to EBCDIC.

The ASCII is coming from Java (web) and going to an AS/400. I've had a google around and can't seem to find an easy solution (maybe because there isn't one :( ). I was hoping for an open source or paid-for util that has already been written.

Like this maybe?

Converter.convertToAscii(String textFromAS400)
Converter.convertToEBCDIC(String textFromJava)

Thanks,

Scott

Hallett answered 15/12, 2008 at 14:55 Comment(1)
Do you have to deal with redefines and packed records, or is this a straight translation?Fulmis
C
11

JTOpen, IBM's open source version of their Java toolbox, has a collection of classes to access AS/400 objects, including a FileReader and FileWriter to access native AS/400 text files. That may be easier to use than writing your own conversion classes.

From the JTOpen homepage:

Here are just a few of the many i5/OS and OS/400 resources you can access using JTOpen:

  • Database -- JDBC (SQL) and record-level access (DDM)
  • Integrated File System
  • Program calls
  • Commands
  • Data queues
  • Data areas
  • Print/spool resources
  • Product and PTF information
  • Jobs and job logs
  • Messages, message queues, message files
  • Users and groups
  • User spaces
  • System values
  • System status
Cragsman answered 15/12, 2008 at 16:58 Comment(3)
We are using the JTOpen toolbox and it is doing some of the conversion/mapping; it just seems to map £, $, [ and ^ incorrectly.Hallett
Sounds like your AS/400 is incorrectly configured regarding its native tongue. If it is set up correctly jt400.jar will not require any other tweaking.Meir
Yes, the conversion should happen basically automatically. If it isn't, something isn't set up right.Gaiser
H
35

Please note that a String in Java holds text in Java's native encoding. When holding an ASCII or EBCDIC "string" in memory, prior to decoding it into a String, you'll have it in a byte[].

ASCII -> Java:   new String(bytes, "ASCII")
EBCDIC -> Java:  new String(bytes, "Cp1047")
Java -> ASCII:   string.getBytes("ASCII")
Java -> EBCDIC:  string.getBytes("Cp1047")
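
As a minimal sketch of the four conversions above: the class name EbcdicRoundTrip and the sample text are illustrative, and Cp1047 is only assumed to be the right code page, so check which one your system actually uses. Passing a Charset object instead of a charset name avoids the checked UnsupportedEncodingException.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EbcdicRoundTrip {
    // Assumption: the host uses code page 1047; substitute your system's code page.
    static final Charset EBCDIC = Charset.forName("Cp1047");

    // Java String -> EBCDIC bytes
    static byte[] toEbcdic(String text) {
        return text.getBytes(EBCDIC);
    }

    // EBCDIC bytes -> Java String
    static String fromEbcdic(byte[] bytes) {
        return new String(bytes, EBCDIC);
    }

    public static void main(String[] args) {
        // ASCII bytes -> Java String
        byte[] ascii = "HELLO".getBytes(StandardCharsets.US_ASCII);
        String text = new String(ascii, StandardCharsets.US_ASCII);

        byte[] ebcdic = toEbcdic(text);
        StringBuilder hex = new StringBuilder();
        for (byte b : ebcdic) {
            hex.append(String.format("%02X ", b));
        }
        System.out.println(hex.toString().trim()); // C8 C5 D3 D3 D6
        System.out.println(fromEbcdic(ebcdic));    // HELLO
    }
}
```

The String in the middle is the only place where the text is encoding-neutral; everything on either side of it is a byte[] in one specific encoding.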
Helmut answered 15/12, 2008 at 23:49 Comment(4)
There are many EBCDIC code tables. It is very tedious to get right manually.Meir
The Java character sets that start with "CP" refer to IBM CCSIDs. Some documentation of these can be found at www-03.ibm.com/systems/i/software/globalization/ccsid_list.html and www-03.ibm.com/systems/i/software/globalization/codepages.html CP1047 appears to refer to 01047, "Latin 1/Open Systems".Helmut
@AlanKrueger as of today these links are dead. That's really too bad.Sharpset
@Sharpset - new link is www-01.ibm.com/software/globalization/cp/cp_cpgid.htmlZambia
N
4

You should use either the Java character set Cp1047 (Java 5) or Cp500 (JDK 1.3+).

Use the String constructor: String(byte[] bytes, [int offset, int length,] String enc)
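
The offset/length form of that constructor is handy for pulling individual fields out of a fixed-width record. A rough sketch, using the Charset overload of the constructor; the record layout, the class name RecordFieldDecoder, and the choice of Cp500 are all assumptions made for illustration:

```java
import java.nio.charset.Charset;

class RecordFieldDecoder {
    // Assumption: Cp500; confirm against the host's actual code page.
    static final Charset EBCDIC = Charset.forName("Cp500");

    // Decodes `length` bytes starting at `offset` into a Java String.
    static String field(byte[] record, int offset, int length) {
        return new String(record, offset, length, EBCDIC);
    }

    public static void main(String[] args) {
        // Hypothetical 8-byte record: a 5-byte name field followed by a 3-byte code field.
        byte[] record = {
            (byte) 0xC1, (byte) 0xC2, (byte) 0xC3, 0x40, 0x40, // "ABC" padded with EBCDIC spaces
            (byte) 0xF1, (byte) 0xF2, (byte) 0xF3              // "123"
        };
        System.out.println("[" + field(record, 0, 5) + "]");
        System.out.println("[" + field(record, 5, 3) + "]");
    }
}
```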

Neolamarckism answered 15/12, 2008 at 15:5 Comment(1)
You forgot Cp037 (we have that one). You should suggest that the person verify which character set is being used.Azide
S
4
package javaapplication1;

import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CharsetEncoder;

public class ConvertBetweenCharacterSetEncodingsWithCharBuffer {

    public static void main(String[] args) {

       //String cadena = "@@@@@@@@@@@@@@@ñâæÃÈÄóöó@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ÔÁâãÅÙÃÁÙÄ@ÄÅÂÉã@âæÉãÃÈ@@@@@@@@";
        String cadena = "ñâæÃÈÄóöó";
        System.out.println(Convert(cadena,"CP1047","ISO-8859-1"));
        cadena = "1SWCHD363";
        System.out.println(Convert(cadena,"ISO-8859-1","CP1047"));

    }

    // Encodes strToConvert to bytes using the charset named by "out", then decodes
    // those bytes using the charset named by "in". This reinterprets a String whose
    // characters stand for raw bytes; it returns "" if a character cannot be mapped.
    public static String Convert(String strToConvert, String in, String out) {
        try {
            Charset encodeCharset = Charset.forName(out);
            Charset decodeCharset = Charset.forName(in);

            CharsetEncoder encoder = encodeCharset.newEncoder();
            CharsetDecoder decoder = decodeCharset.newDecoder();

            // String -> raw bytes -> String read in the other encoding
            ByteBuffer bbuf = encoder.encode(CharBuffer.wrap(strToConvert));
            CharBuffer cbuf = decoder.decode(bbuf);
            return cbuf.toString();
        } catch (CharacterCodingException e) {
            //System.out.println("Character Coding Error: " + e.getMessage());
            return "";
        }
    }
}
Strigose answered 1/4, 2015 at 23:31 Comment(1)
Welcome to SO! Explaining your solution is not required, but it is considered good practice, with the nice side effect that people learn to understand and hence upvote your answer. ;)Phrenetic
P
1

You can create one yourself with this translation table.

But here is a site that has a link to a Java example.

Procurable answered 15/12, 2008 at 15:2 Comment(4)
The second link is dead. Do you know where it went? Can you post the example here?Flied
@BilltheLizard web.archive.org/web/20080112153232/https://reply42.com/…. Maybe I should edit the answer but...Laxity
Web Archive links are fine if the original is no longer online. I'd go ahead and edit the answer. Also, you could have waited until Nov 8 to reply back on the 10th anniversary of my comment. ;pFlied
As far as you remember, is the first one something like CP037?Peroxide
W
1

Perhaps, like me, you were not strictly using a JDBC feature (writing to a data queue, in my case), so the automatic encoding didn't apply to you, since we're communicating through multiple APIs.

My issue was similar to @scottyab's issue with certain characters not mapping. In my case, the example code I was referencing worked perfectly, but writing an XML string to a data queue resulted in [ being replaced with £.

As a web developer working with a pre-existing database backend with decades of information, I didn't simply have the ability to "right" the "mis-configuration" as one other commenter suggests.

However, I was able to see which Coded Character Set Identifier the i was likely using by issuing a command to the 400 to display file field information on a known good file: DSPFFD *LIB*/*FILE*.

Doing so gave me good information, including the specific CCSID set: CCSID Identifier

After searching for information on CCSIDs, I ran into an IBM page on EBCDIC. The key information is quoted here (since such pages have a habit of disappearing):

Version 11.0.0 Extended Binary Coded Decimal Interchange Code (EBCDIC) is an encoding scheme that is typically used on zSeries (z/OS®) and iSeries (System i®).

And most helpful:

Some example EBCDIC CCSIDs are 37, 500, and 1047.

Since I had already learned from this question that Cp1047 is another good character set to try (this time, the £ turned into an accented "Y"), I tried Cp37, only to find that no such charset existed, then tried Cp037 and got the right encoding.

It looks like the key is finding which Coded Character Set Identifier (CCSID) is used on your system, and ensuring that your jt400 instance, which otherwise is working perfectly, matches the encoding set on the AS/400 exactly. In my case that encoding was set way before my lifetime and decades of business logic ago.
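
A small sketch of how the difference between code pages can be checked from plain Java before touching the host. The class name CodePageCompare and its helpers are illustrative; the charset names are the ones discussed above. It only shows where '[' lands in Cp037 and Cp1047 and is not a substitute for looking up the real CCSID with DSPFFD.

```java
import java.nio.charset.Charset;

class CodePageCompare {
    // Encodes one character and returns its single byte as an unsigned value.
    static int encode(char ch, String charsetName) {
        byte[] bytes = String.valueOf(ch).getBytes(Charset.forName(charsetName));
        return bytes[0] & 0xFF;
    }

    // Decodes one byte value into the character it represents in the charset.
    static char decode(int byteValue, String charsetName) {
        return new String(new byte[] {(byte) byteValue}, Charset.forName(charsetName)).charAt(0);
    }

    public static void main(String[] args) {
        // isSupported reports whether a name resolves, without throwing for unknown names.
        System.out.println("Cp37 supported:  " + Charset.isSupported("Cp37"));
        System.out.println("Cp037 supported: " + Charset.isSupported("Cp037"));

        System.out.printf("'[' in Cp037:  0x%02X%n", encode('[', "Cp037"));  // 0xBA
        System.out.printf("'[' in Cp1047: 0x%02X%n", encode('[', "Cp1047")); // 0xAD

        // A '[' written as Cp1047 but read by a CCSID 37 system shows up as U+00DD,
        // the accented "Y" mentioned above.
        System.out.printf("0xAD in Cp037: U+%04X%n", (int) decode(0xAD, "Cp037"));
    }
}
```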

Wesle answered 24/1, 2017 at 15:4 Comment(0)
K
1

I wrote some code that converts between the two encodings easily.

import java.nio.charset.Charset;

public class Converter {

    public static void main(String[] args) {

        Charset charsetEBCDIC = Charset.forName("CP037");
        Charset charsetASCII = Charset.forName("US-ASCII");

        String ebcdic = "(((((((";
        System.out.println("String EBCDIC: " + ebcdic);
        System.out.println("String converted to ASCII: " + convertTO(ebcdic, charsetEBCDIC, charsetASCII));

        String ascII = "MMMMMM";
        System.out.println("String ASCII: " + ascII);
        System.out.println("String converted to EBCDIC: " + convertTO(ascII, charsetASCII, charsetEBCDIC));
    }

    // Encodes the text with encodingFrom and decodes the resulting bytes with encodingTo,
    // so each character is replaced by the one that has the same byte value in the other charset.
    public static String convertTO(String dados, Charset encodingFrom, Charset encodingTo) {
        return new String(dados.getBytes(encodingFrom), encodingTo);
    }
}
Kirsti answered 2/2, 2017 at 18:14 Comment(0)
A
1

I want to add on to what Kwebble and Shawn S have said. I can use JTOpen to do this.

I needed to write to a field which was 6 0P (6 bytes, nothing behind the decimal, packed). That's a decimal(11,0) for those of you who don't grok DDM.

    AS400PackedDecimal convertedCustId = new AS400PackedDecimal(11, 0);
    byte[] packedCust = convertedCustId.toBytes((int) custId);

    String packedCustStr = new String(packedCust, "Cp037");

    StringBuilder jcommData = new StringBuilder();
    jcommData.append(String.format("%6s", packedCustStr));

Yes, I used the library Kwebble mentioned. Looking at DSPFFD as Shawn S mentioned, I discovered that the table was using CCSID 37. This worked.

I originally tried using Cp1047, as per Alan Krueger's suggestion. It seemed to work. Unfortunately, if my custId ended with a 5, the data rendered into the file was B0 instead of 5F. Changing it to Cp037 fixed that.
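
One way to see where the B0 comes from, as a sketch rather than a description of what JTOpen does internally: the `pack` helper below is a hypothetical pure-Java stand-in for AS400PackedDecimal (non-negative values only, sign nibble 0xF), and the round trip assumes the bytes are decoded into a String with one code page and written back by a system using CCSID 37.

```java
import java.nio.charset.Charset;

class PackedRoundTrip {
    // Hypothetical helper, not the JTOpen implementation: packs a non-negative value
    // into `digits` decimal digits (one per nibble) followed by a sign nibble of 0xF.
    static byte[] pack(long value, int digits) {
        byte[] out = new byte[(digits + 2) / 2];
        int idx = out.length - 1;
        // The last byte holds the final digit and the sign nibble.
        out[idx] = (byte) (((value % 10) << 4) | 0x0F);
        value /= 10;
        for (idx--; idx >= 0; idx--) {
            int low = (int) (value % 10);
            value /= 10;
            int high = (int) (value % 10);
            value /= 10;
            out[idx] = (byte) ((high << 4) | low);
        }
        return out;
    }

    // Decodes the bytes into a String with one charset and encodes them again with another.
    static byte[] roundTrip(byte[] data, String decodeWith, String encodeWith) {
        String asText = new String(data, Charset.forName(decodeWith));
        return asText.getBytes(Charset.forName(encodeWith));
    }

    public static void main(String[] args) {
        byte[] packed = pack(12345, 11); // 6 bytes, ends in 0x5F
        byte[] viaCp1047 = roundTrip(packed, "Cp1047", "Cp037");
        byte[] viaCp037 = roundTrip(packed, "Cp037", "Cp037");
        System.out.printf("original last byte: %02X%n", packed[5] & 0xFF);    // 5F
        System.out.printf("via Cp1047:         %02X%n", viaCp1047[5] & 0xFF); // B0
        System.out.printf("via Cp037:          %02X%n", viaCp037[5] & 0xFF);  // 5F
    }
}
```

Byte 0x5F is '^' in Cp1047 but '^' is 0xB0 in Cp037, so the packed value only survives the String detour when both sides agree on the code page. Where the API allows it, keeping packed data in a byte[] avoids the issue entirely.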

Albie answered 24/8, 2017 at 15:16 Comment(0)
M
0

It should be fairly simple to write a map for the EBCDIC character set, and one for the ASCII character set, and in each return the character representation of the other. Then just loop over the string to translate, and look up each character in the map and append it to an output string.

I don't know if there are any converters publicly available, but it shouldn't take more than an hour or so to write one.

Metallurgy answered 15/12, 2008 at 15:2 Comment(0)
B
0

This is what I've been using.

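// Requires: import java.io.UnsupportedEncodingException;
// Place these members inside a class.
// Lookup tables indexed by byte value (0..255).
public static final int[] ebc2asc = new int[256];
public static final int[] asc2ebc = new int[256];

static
{
  // Every possible byte value, in order
  byte[] values = new byte[256];
  for (int i = 0; i < 256; i++)
    values[i] = (byte) i;

  try
  {
    // Decode all 256 EBCDIC bytes at once; chars[i] is the character for EBCDIC byte i
    String s = new String (values, "CP1047");
    char[] chars = s.toCharArray ();
    for (int i = 0; i < 256; i++)
    {
      int val = chars[i];
      ebc2asc[i] = val;
      asc2ebc[val] = i;
    }
  }
  catch (UnsupportedEncodingException e)
  {
    e.printStackTrace ();
  }
}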
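
A possible way to use tables like these, sketched as a self-contained class: the names TranslationTables and translate are illustrative, Cp1047 is kept from the snippet above, and the "ASCII" side is really ISO-8859-1, since the table covers all 256 byte values.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class TranslationTables {
    static final int[] EBC2ASC = new int[256];
    static final int[] ASC2EBC = new int[256];

    static {
        byte[] values = new byte[256];
        for (int i = 0; i < 256; i++) {
            values[i] = (byte) i;
        }
        // Assumption: Cp1047; swap in the code page your system uses.
        char[] chars = new String(values, Charset.forName("Cp1047")).toCharArray();
        for (int i = 0; i < 256; i++) {
            EBC2ASC[i] = chars[i];
            // Guard against any character outside the single-byte range.
            if (chars[i] < 256) {
                ASC2EBC[chars[i]] = i;
            }
        }
    }

    // Translates every byte of the input through the given 256-entry table.
    static byte[] translate(byte[] input, int[] table) {
        byte[] out = new byte[input.length];
        for (int i = 0; i < input.length; i++) {
            out[i] = (byte) table[input[i] & 0xFF];
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] ascii = "HELLO".getBytes(StandardCharsets.US_ASCII);
        byte[] ebcdic = translate(ascii, ASC2EBC);
        byte[] back = translate(ebcdic, EBC2ASC);
        System.out.println(new String(back, StandardCharsets.US_ASCII)); // HELLO
    }
}
```

Masking with 0xFF matters because Java bytes are signed; without it, any EBCDIC byte of 0x80 or above would produce a negative array index.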
Benfield answered 2/1, 2015 at 5:46 Comment(0)
