Transforming character-delimited text to XML

To transform data from a character-delimited format, such as comma-separated values (CSV), to an XML data format:

  1. Create a character-delimited format file that contains the data you want to transform.

  2. In a text editor, create an XML schema file that defines how the character-delimited format file maps to an XML file.

    Structure your file similarly to the following example:

    <?xml version="1.0" encoding="UTF-8" ?>
    <TextSchema DataType = "CSV Format">
       <RecordDescription
          FieldSeparator = ","
          RecordSeparator = "&#010;&#013;"
          StringDelimiter = "&quot;"
          HeaderIncluded = "true"
          HeaderLines = "1"
          ElementName = "category"
       >
             <FieldDescription FieldName = "categoryName" FieldPosition = "1" />
             <FieldDescription FieldName = "markForDelete" FieldPosition = "2" />
             <FieldDescription FieldName = "field1" FieldPosition = "3" />
             <FieldDescription FieldName = "field2" FieldPosition = "4" />
       </RecordDescription>
    </TextSchema>
    

    Use the following tags and attributes when creating your XML schema file:

    DataType

    Enter a description of the data format of your character-delimited format file.

    RecordDescription

    This tag describes the structure of your character-delimited format file. This tag uses the following attributes to define the file structure:

    FieldSeparator

    This attribute specifies the character or characters separating fields in the character-delimited format file.

    RecordSeparator

    This attribute specifies the character or characters separating records in the character-delimited format file.

    Special characters must be entered as decimal numeric character references. For example, a line feed or new line (\n) must be entered as &#010; and a carriage return (\r) must be entered as &#013;.

    StringDelimiter

    This attribute specifies the character or characters that enclose each field in a record in your character-delimited format file.

    HeaderIncluded

    Valid values for this attribute are:

    false

    The character-delimited format file does not contain a header line that indicates the field names.

    true

    The character-delimited format file contains a header line that indicates the field names.

    HeaderLines

    This attribute specifies the number of records at the start of the character-delimited format file, as separated by the RecordSeparator, that are not to be considered as data. These lines will not be converted to XML.

    ElementName

    This attribute defines the root element for each record.

    FieldDescription

    This element describes a field in the records in your character-delimited format file. There must be one FieldDescription element for each field in your character-delimited format file. This element uses the following attributes to define each field:

    FieldName

    This attribute defines the name of the field. The value of this attribute is used as the name of the child element that holds the field in each record.

    FieldPosition

    This attribute indicates the position of this field in a record. The first field in a record is in field position 1.

    Save your file as an XML file. This file is not an XSD file.

  3. Create the parameters file that specifies the parameters required by the txttransform utility.

    The order of the values in the parameters file is important. The parameters must be separated by commas and appear in the file in the following order:

    Input file

    Name of the character-delimited format file to be transformed.

    Schema file

    Name of the XML schema file to be used in the transformation.

    Output file

    Name for the output XML file in which the transformed data will be stored.

    Transformation method

    Method to be used in adding the data to the output file. The following methods are valid:

    Create

    Create a new XML file from the text file.

    Append

    Append new XML data to an existing XML file.

    Encoding

    The character encoding scheme of the input file. Any character encoding scheme supported by Java can be specified.

    For example, your parameters file could consist of the following line of text:

    sample.csv,sample_schema.xml,catalog.xml,Create,UTF8

    This parameters file tells the txttransform utility to create a new XML file (catalog.xml) from a UTF-8 encoded comma-separated value (CSV) file (sample.csv) using the schema defined in an XML schema file (sample_schema.xml).

  4. Run the txttransform utility.
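
Before running the utility, it can help to check that the parameters file from step 3 contains the five values in the expected order. The following Python sketch is not part of the txttransform utility; the names TransformParams and parse_params_line are hypothetical, and trimming whitespace around each value and matching the method name exactly as "Create" or "Append" are assumptions made for illustration only.

```python
from typing import NamedTuple


class TransformParams(NamedTuple):
    """The five ordered values of one parameters line (hypothetical helper type)."""
    input_file: str
    schema_file: str
    output_file: str
    method: str      # "Create" or "Append"
    encoding: str    # character encoding of the input file


def parse_params_line(line: str) -> TransformParams:
    """Split a comma-separated parameters line and check its basic shape."""
    # Assumption: surrounding whitespace is not significant.
    parts = [part.strip() for part in line.strip().split(",")]
    if len(parts) != 5:
        raise ValueError(f"expected 5 comma-separated values, got {len(parts)}")
    params = TransformParams(*parts)
    # Assumption: the method name is matched exactly as written in step 3.
    if params.method not in ("Create", "Append"):
        raise ValueError(f"unknown transformation method: {params.method!r}")
    return params


params = parse_params_line("sample.csv,sample_schema.xml,catalog.xml,Create,UTF8")
print(params.input_file, "->", params.output_file, "using", params.schema_file)
```

A line with missing values, or with the values in a different order, would most likely be misread by the utility, so a check of this kind fails early instead.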

Based on the example XML schema file, each record would have the following XML structure:

<category>
  <categoryName>Category_name_value</categoryName>
  <markForDelete>Marked_for_delete_value</markForDelete>
  <field1>field1_value</field1>
  <field2>field2_value</field2>
</category>
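
To make the mapping concrete, the following Python sketch imitates how the example schema turns one record into a category element. It is only an illustration under stated assumptions, not the txttransform implementation: it uses the Python standard library, the sample record is invented, the function names are hypothetical, FieldSeparator and StringDelimiter are assumed to be single characters, and HeaderIncluded is ignored because only HeaderLines is needed to skip the header.

```python
import csv
import xml.etree.ElementTree as ET

# The example schema from step 2 (XML declaration omitted for brevity).
SCHEMA = """<TextSchema DataType="CSV Format">
  <RecordDescription FieldSeparator="," RecordSeparator="&#010;&#013;"
      StringDelimiter="&quot;" HeaderIncluded="true" HeaderLines="1"
      ElementName="category">
    <FieldDescription FieldName="categoryName" FieldPosition="1" />
    <FieldDescription FieldName="markForDelete" FieldPosition="2" />
    <FieldDescription FieldName="field1" FieldPosition="3" />
    <FieldDescription FieldName="field2" FieldPosition="4" />
  </RecordDescription>
</TextSchema>"""


def read_schema(schema_xml):
    """Return the RecordDescription attributes and a {position: name} map."""
    record = ET.fromstring(schema_xml).find("RecordDescription")
    fields = {
        int(f.get("FieldPosition")): f.get("FieldName")
        for f in record.findall("FieldDescription")
    }
    return record.attrib, fields


def transform(text, schema_xml):
    """Build one XML element per data record (hypothetical helper)."""
    attrs, fields = read_schema(schema_xml)
    # Split into records, drop empty ones, then skip the header records.
    records = [r for r in text.split(attrs["RecordSeparator"]) if r]
    data = records[int(attrs["HeaderLines"]):]
    rows = csv.reader(
        data,
        delimiter=attrs["FieldSeparator"],
        quotechar=attrs["StringDelimiter"],
    )
    elements = []
    for row in rows:
        element = ET.Element(attrs["ElementName"])
        for position in sorted(fields):
            # FieldPosition is 1-based, Python lists are 0-based.
            ET.SubElement(element, fields[position]).text = row[position - 1]
        elements.append(element)
    return elements


# Invented sample data, joined with the separator the schema declares.
separator = "\n\r"
sample = separator.join([
    "categoryName,markForDelete,field1,field2",
    '"Tools, Hand",0,a,b',
])

for element in transform(sample, SCHEMA):
    print(ET.tostring(element, encoding="unicode"))
```

The quoted first field shows why StringDelimiter matters: the comma inside "Tools, Hand" is kept as data rather than treated as a field separator.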
