
File splitting

To support files that contain multiple records, the adapter provides an optional file splitting feature. When you use this feature during the Retrieve operation, the adapter divides large files into smaller chunks, which are then retrieved separately.

Depending upon the type of content contained in the file, the file can be split by delimiter or by size.

The value specified in the SplittingFunctionClassName property determines the method that is used, and the SplitCriteria property supplies the delimiter or the maximum chunk size for that method. Splitting by size is the default method.

The default value of the SplitCriteria property is zero, which means that no splitting is performed. You can also leave the values of the SplitCriteria and SplittingFunctionClassName properties empty if no splitting is required.

You can optionally provide a custom file splitter class. Set the SplittingFunctionClassName property to the name of the class.


File splitting by delimiter

When one or more delimiter characters, such as a comma (,), semicolon (;), quotation mark (" or '), brace ({ }), or slash (/ or \), separate the business objects in a file, the adapter can split the file into smaller chunks at each delimiter. You define the delimiter that separates the business objects in the SplitCriteria property.

You can enable file splitting by delimiter by specifying the value of the SplittingFunctionClassName property as com.ibm.j2ca.utils.filesplit.SplitByDelimiter.
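
To make the idea of splitting by delimiter concrete, the following is a minimal sketch in Java. It is not the adapter's SplitByDelimiter class. The class name DelimiterSplitSketch and its split method are hypothetical, and the sketch assumes a single literal delimiter and that empty chunks are discarded, which the adapter might handle differently.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; this is NOT com.ibm.j2ca.utils.filesplit.SplitByDelimiter.
final class DelimiterSplitSketch {

    // Splits content on a literal delimiter (the role played by SplitCriteria).
    // Assumption: empty chunks between adjacent delimiters are dropped.
    static List<String> split(String content, String delimiter) {
        if (delimiter == null || delimiter.isEmpty()) {
            // No delimiter configured: treat the whole content as one chunk.
            return List.of(content);
        }
        List<String> chunks = new ArrayList<>();
        int start = 0;
        int idx;
        while ((idx = content.indexOf(delimiter, start)) >= 0) {
            if (idx > start) {
                chunks.add(content.substring(start, idx));
            }
            start = idx + delimiter.length();
        }
        if (start < content.length()) {
            // Trailing content after the last delimiter.
            chunks.add(content.substring(start));
        }
        return chunks;
    }

    public static void main(String[] args) {
        String file = "<a>1</a>\n<a>2</a>\n<a>3</a>\n";
        for (String record : split(file, "\n")) {
            System.out.println("record: " + record);
        }
    }
}
```

Each element of the returned list corresponds to one business object in the file.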

The following rules apply to the use of delimiters:

Example of a common scenario and the recommended delimiter format:

Table: Delimiter format for a scenario

Data binding: XML

BO content:
<?xml version="1.0" encoding="UTF-8"?>
<customer:Customer xsi:type="customer:Customer"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:customer="http://www.ibm.com/xmlns/prod/websphere/j2ca/flatfile/customer">
<CustomerName>Deepa</CustomerName>
<Address>IBM</Address>
<City>Bangalore</City>
<State>KA</State>
</customer:Customer>

Recommended delimiter format: \n


File splitting by size

The value specified in the SplittingFunctionClassName property determines whether a file is split by size. If the SplittingFunctionClassName property is set to com.ibm.j2ca.utils.filesplit.SplitBySize, the SplitCriteria property must contain a valid number that represents the maximum file size, in bytes. If the file is larger than the value specified in the SplitCriteria property, the file is split into chunks and each chunk is posted to the import separately. If the file is smaller than the SplitCriteria value, the entire file is posted to the import.
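
The size rule described above can be sketched in Java as follows. This is an illustration under stated assumptions, not the adapter's SplitBySize class. The class name SizeSplitSketch is hypothetical, a value of zero is treated as "no splitting", and a file whose size exactly equals the limit is kept whole (the text above does not specify that case).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; this is NOT com.ibm.j2ca.utils.filesplit.SplitBySize.
final class SizeSplitSketch {

    // maxChunkBytes plays the role of SplitCriteria (maximum size in bytes).
    static List<byte[]> split(byte[] content, int maxChunkBytes) {
        List<byte[]> chunks = new ArrayList<>();
        // Zero (or a negative value) means no splitting; a file that fits
        // within the limit is also delivered whole.
        if (maxChunkBytes <= 0 || content.length <= maxChunkBytes) {
            chunks.add(content);
            return chunks;
        }
        for (int offset = 0; offset < content.length; offset += maxChunkBytes) {
            int end = Math.min(offset + maxChunkBytes, content.length);
            chunks.add(Arrays.copyOfRange(content, offset, end));
        }
        return chunks;
    }

    public static void main(String[] args) {
        byte[] file = new byte[10];
        List<byte[]> chunks = split(file, 4);
        for (int i = 0; i < chunks.size(); i++) {
            System.out.println("chunk " + (i + 1) + " of " + chunks.size()
                    + ": " + chunks.get(i).length + " bytes");
        }
    }
}
```

With a 10-byte file and a limit of 4 bytes, the sketch produces three chunks of 4, 4, and 2 bytes, each of which would become a separate business object.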

When event files are split into chunks, each chunk becomes a business object. This means that the value specified for the PollQuantity property and the number of business objects delivered to the import can be different. Although the adapter polls files according to the PollQuantity value, it processes the business objects in each file one at a time.

For example, if an event file is chunked into three parts, one file is polled and three business objects are delivered to the import (because each chunk creates a single business object).

At the import, the adapter does not reassemble the chunked data into a single file, but it provides information about the chunks to enable IBM BPM or WebSphere Enterprise Service Bus to reassemble them into a single file. The chunk information is included in the ChunkFileName property of the FlatFileInputStreamRecord record, and includes the chunk size in bytes and the event ID.

The event ID of a chunk uses the following form: eventFileLocation_/_timestampStr_/_MofN, where M is the current chunk number, N is the total number of chunks, and timestampStr has the format year_month_day_hour_minutes_seconds_milliseconds. An event ID looks like the following example:

C:\flatfile\eventdir\eventfile.in_/_2005_01_10_10_17_49_864_/_3of5
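
Code that reassembles chunks needs the chunk number and the total count from the event ID. The following Java sketch parses an ID of the form eventFileLocation_/_timestampStr_/_MofN. The class ChunkEventIdSketch and its field names are hypothetical and are not part of the adapter API. The sketch assumes the ID follows the documented form exactly.

```java
// Hypothetical helper for illustration; not part of the adapter API.
final class ChunkEventIdSketch {
    final String fileLocation;
    final String timestamp;
    final int chunkNumber;   // M in "MofN"
    final int totalChunks;   // N in "MofN"

    private ChunkEventIdSketch(String fileLocation, String timestamp,
                               int chunkNumber, int totalChunks) {
        this.fileLocation = fileLocation;
        this.timestamp = timestamp;
        this.chunkNumber = chunkNumber;
        this.totalChunks = totalChunks;
    }

    static ChunkEventIdSketch parse(String eventId) {
        final String sep = "_/_";
        // Search from the right so the file location is taken as-is.
        int last = eventId.lastIndexOf(sep);
        int prev = last < 0 ? -1 : eventId.lastIndexOf(sep, last - sep.length());
        if (prev < 0) {
            throw new IllegalArgumentException("Not a chunk event ID: " + eventId);
        }
        String location = eventId.substring(0, prev);
        String timestamp = eventId.substring(prev + sep.length(), last);
        String mOfN = eventId.substring(last + sep.length());
        int of = mOfN.indexOf("of");
        if (of <= 0) {
            throw new IllegalArgumentException("Missing MofN suffix: " + eventId);
        }
        int m = Integer.parseInt(mOfN.substring(0, of));
        int n = Integer.parseInt(mOfN.substring(of + 2));
        return new ChunkEventIdSketch(location, timestamp, m, n);
    }

    public static void main(String[] args) {
        ChunkEventIdSketch id = parse(
            "C:\\flatfile\\eventdir\\eventfile.in_/_2005_01_10_10_17_49_864_/_3of5");
        System.out.println(id.fileLocation + " chunk "
                + id.chunkNumber + " of " + id.totalChunks);
    }
}
```

A consumer could collect chunks that share the same file location and timestamp, order them by chunk number, and concatenate them once all N chunks have arrived.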
