Step 1: Background information
The WebSphere Commerce Rapid Solution Services Offering helps you quickly implement your commerce solution to take advantage of the rich default product features and thereby accelerate your time-to-value. By using this method, you should be able to understand, plan and execute the requirements in order to prepare a store for launch quickly and reliably.The following steps are required:
- Consolidate the environment
- Identify assets
- Consolidate asset templates
- Extract data assets from source
- Transform data assets
- Resolve identifiers on target
- Load data assets on target
- Package store assets
Consolidate the environment
The process of managing data assets in WebSphere Commerce involves running utilities and processing XML documents. The WebSphere Commerce utilities include:
- massextract utility
- txttransform utility
- xmltransform utility
- idresgen utility
- massload utility
These commands can be found in WCDE_installdir\bin. These commands reference a loading utilities found in the WCDE_installdir\lib. They can be best operated in a Windows environment being referenced by batch files. Each batch file would run each of the commands listed above in a command prompt. To allow these files to run from any location within a Windows environment, so that they will set the WebSphere Commerce path, set the database environment and data asset variables. The batch files will set these environment variables by calling a file named env.bat.
The following example shows the env.bat file used for this implementation:
:: Database Variables set DBNAME=shpudb set DBUSER=wcsadmin set DBPWD=xxxxxx set CUSTOMIZEFILE=OracleConnectionCustomizer.properties :: Loading utility input/output files set EXTRACTFILE=Extract.xml set IDENTTRANSFORM=..\identtransform.xsl set IDENTTRANSFORMUSER=identtransform.xsl set TRANSFORMEDFILE=Transform.xml set RESGENFILE=ResGen.xml :: Data Asset Unique References set MEMBERID=-2001 set CATALOGNAME=ConsumerDirect set CATALOGID=10001 set STOREENTID=10001 set TRADEPOSCNID=10001 set STORENAME=ConsumerDirect :: WebSphere Commerce Path Location and Assets set WCSHOME=E:\WebSphere\CommerceDev\Commerce set DTDFILE=%WCSHOME%\schema\xml\wcs.dtd set PATH=.;%WCSHOME%\bin;%PATH% set processType=server set TOOLKITDIR=E:\WCToolkit
Identify assets
After the environment between test and staging has been consolidated, the next step is to consolidate assets. Identifying data assets can start at the subsystem level. These are the WebSphere Commerce subsystems:
Consolidate asset templates
An asset template is a file that is used as a filter to extract data from the source database. This template is in XML format which contains a function definition, an SQL statement and an execute reference.
The following example shows an asset template for users:
<?xml version="1.0" encoding="UTF-8"?> <sqlx> <!--Begin functionDef elements for assetGroup "users"--> <functionDef id="Users" description="Users" schemaentity="users"> <body> select distinct users.* from users, userreg where users.users_id > 0 AND users.users_id = userreg.users_id AND userreg.status = 1 </body> </functionDef> <execute id="Users"/> </sqlx>
The attributes of the functionDef element are used to describe the SQL statement within the child body element. The id attribute is used as a reference from the execute element. The schemaentity attribute is used to cross-reference the DTD file that contains the schema information. This value would be a table that exists in the WebSphere Commerce database. The child body element of the functionDef element consists of an SQL statement. This SQL statement can be designed as a static SQL statement that accepts parameters. For this example, a static SQL statement is used without parameters. Some characters are not permissible in this SQL statement such as the greater than and less than signs. These characters must be replaced with Unicode codes as referenced in the encoding section of the same document (for example, encoding="UTF-8"). In order to avoid ambiguities, > is used as an escape for the > character in this SQL statement.
The problem when consolidating asset templates is twofold:
- managing the referential integrity of the filtered data
- maintaining the size of the output from the filters
Managing the referential integrity
Processing multiple execute statements from the same filter will provide raw data output. In order for this raw data to be prepared and loaded into another database, the primary and foreign keys must be resolved beforehand. Therefore, when designing a filter that will provide output, special care must be considered by the ordering of the tables referenced and the filtering SQL statements.
The ordering of the tables must follow the database tables' top-down orientation. Tables that hold foreign keys must concede to tables with the associated primary keys. In some instances, the parent table, which holds a primary key of a table holding the foreign key in the filter, might not be present. In this case, the foreign key must be hard-coded throughout the data asset management process. Hence, the need for a 'Data Asset Unique Reference' section of the env.bat file. Otherwise, in all other cases, the parent tables must precede the child tables within a filter.
The filter itself could possibly define parameters that hold unique values to be used in the static SQL statement. These parameters are most likely used when restricting data within SQL (predicates of the where clause). The filters used for the examples in this document did not define any parameters in the filters.
Maintaining the size of output
To extract data to represent a subsystem would result in a very large XML output file. Therefore, subsystems are broken up into smaller subsets. This helps to easily manage the output from the extraction process and also improves performance. When breaking up the extraction filters into smaller subsets, the parent-child relationship in the filters can be temporarily ignored during the extraction process, but must merge during the Identifier resolve step. For example, the catalog subsystem can extract groupings and catalog entries in one extract filter and extract attributes in another extract filter. Before Identifier Resolve runs, the attributes output must be merged with the catalog entries. The need for this step is due to the foreign key references embedded within the attributes output that points to the catalog entries.
Extracting data assets from source
The process of extracting data assets is a careful process because problems are not found until the Identification Resolving step. Therefore, foreign key references must be acknowledged or problems might originate from the data that is extracted.
One example is when one subsystem requires primary key references from another subsystem. There are instances when subsystems must be merged. For example, the trading subsystem must be merged with the catalog subsystem. This is because the trading subsystem holds the prices for catalog entries, which preserve foreign keys for resolving the catalog subsystem. The rule is that any foreign key references that require identification resolving would need to have the table present in the same XML file that holds the primary key.
The following example shows a merge between catalog entries and their respective prices:
<catentry catentry_id="@catentry_id_13085" member_id="-2001" itemspc_id="@itemspc_id_12669" catenttype_id="ItemBean" partnumber="1234567-SKU" mfpartnumber="1234567" mfname="CompanyName" markfordelete="0" lastupdate="2003-09-13 16:03:18.000000" buyable="1" state="1"/> <offer offer_id="@offer_id_13306" tradeposcn_id="@tradeposcn_id_10151" catentry_id="@catentry_id_13085" precedence="5.00000000000000E+000" published="1" lastupdate="2003-09-13 16:03:18.000000" flags="1" identifier="1063483398911"/>In the example above, the offer element makes a reference to catentry_id_13085. In order for the IdResGen command to process, the actual base table that holds the primary key must be present in the same XML file. In this case, the catentry element is present in order to resolve the primary key. The offer element was extracted from the trading subsystem while the catentry element was extracted from the catalog subsystem. This illustrates the necessity to merge multiple extracted files together (where each extracted file represents a subset of data).
When the extraction process is ready for execution, the batch file titled Step1-extract.bat is run. This command contains one simple command:
massextract -customizer dbtypecustomizer.properties -dbname %DBNAME% -dbuser %DBUSER% -dbpwd %DBPWD% -filter extract.xml -outfile extractedData.xmlWhen you run this command, output will be sent to two or three files. The extractedData.xml holds the raw data output in XML format. Each row in a table is represented as attributes of an element referenced by its associated table name. Another output file is called trace.txt. This file holds the tracing of the executed statements performed during the extraction process. A third file, messages.txt, is generated only if there are errors during the extraction process.
Transform data assets
The process of transforming data assets prepares the extracted output for the Identification Resolving step. In order to make this preparation successful, translate the extracted data into a format that can be understood by any database instance of WebSphere Commerce. The following example shows the command syntax that accomplishes this:
xmltransform -outfile transformedOutput.xml -infile extractedData.xml -transform %IDENTTRANSFORM% -param "dtdfile=%DTDFILE%" -param "rootelement=import" -param "member_id=%MEMBERID%"As shown in the example above, the IDENTTRANFORM parameter passed from the env.bat file is the value of the actual transformation XSL file. This file contains the logic to transform the extracted data. The -param key-value pairs will be used as parameter values that are used to hard-code references within the transformed file.
The same extracted output file may have to go through different types of transform depending on the requirements imposed by the target data state. idresgen utility handles resolution for tables that have identifiers generated for them by the system. The transform process must prepare table elements in the XML files by imposing a special reference in primary and foreign keys. Also, parameters can be passed into the transform command that can be used as special values while transforming the extracted XML file.
Delimiter
For example, the following excerpt shows an extracted xml file followed by an excerpt from a transformed XML file:
<catentry catentry_id="13085" member_id="7000000000000000051" itemspc_id="12669" catenttype_id="ItemBean" partnumber="1234567-SKU" mfpartnumber="1234567" mfname=" CompanyName " markfordelete="0" lastupdate="2003-09-13 16:03:18.000000" buyable="1" state="1" /> <catentry catentry_id="@catentry_id_13085" member_id="7000000000000000051" itemspc_id="@itemspc_id_12669" catenttype_id="ItemBean" partnumber="1234567-SKU" mfpartnumber="1234567" mfname="CompanyName" markfordelete="0" lastupdate="2003-09-13 16:03:18.000000" buyable="1" state="1"/>In the example above, most attributes with a '_id' in the name have values transformed from original raw data into a special delimited '@' reference.
In the following excerpt from the XSL style sheet used to process this transformation is shown below:
<!-- attribute ends with '_id' --> <xsl:when test="contains(local-name(),'_id') and string-length(substring-after(local-name(),'_id'))=0 and not (.='0' or string-length(.)=0)"> <xsl:attribute name="{local-name()}"> <xsl:value-of select="concat('@',concat(local-name(),concat('_',.)))" /> </xsl:attribute> </xsl:when>In the example above, the transformation XSL will take the value of an ID table and prefix the '@' delimiter with the name of the attribute and postfix the original value of the key.
Alternate foreign references
Transformation may also be required to make changes to ID tables to reference other foreign keys. For example, the following example shows the MBRREL table:
<mbrrel descendant_id="@descendant_id_8802" ancestor_id="@ancestor_id_-2001" sequence="2"/> <mbrrel descendant_id="@member_id_8802" ancestor_id="@member_id_-2001" sequence="2"/>As shown above, both the descendant_id and ancestor_id attributes actually reference the member_id foreign key; therefore, the transformation process must prepare these keys appropriately. This transformation must be performed as a second substep after the first transformation which creates the initial '@' delimiter.
Super class references
To resolve super class references see the tutorial Super Class References in Member Migration.
Resolving identifiers on target
After the transformation steps have succeeded, the transformed output files must be moved to the target machine (unless the local database has been cataloged to a remote database and the database name is specified in the env.bat file). Once the transformed output files are ready for resolving on the target database, the IdResGen command can be executed. The following example shows the command syntax:idresgen -customizer dbtypecustomizer.properties -dbname %DBNAME% -dbuser %DBUSER% -dbpwd %DBPWD% -infile transformedOutput.xml -outfile resolvedOutput.xml -method mixedIn the example above, the mixed method was specified. This is because performance is not a concern; therefore, the LOAD import process will not be used. The LOAD process will put the database into check-pending state which requires DBA expertise to set integrity after any load errors.
When you run this command, output will be sent to two or three files. The resolvedOutput.xml holds the resolved data output in XML format. Each row in a table is represented as attributes of an element referenced by its associated table name. The data values for these attributes are expected to be massloaded directly into the target database. Another output file is called trace.txt. This file holds the tracing of the executed statements performed during the resolved process. A third file, messages.txt, is generated only if there are errors during the idresgen utility process.
Loading data assets on target
After the idresgen utility step has succeeded, the resolved output files must be moved to the target machine (unless the local database has been cataloged to a remote database and the database name is specified in the env.bat file). Once the resolved output files are ready for importing on the target database, the massimport command can be executed. The following example shows the command syntax:
massload -customizer dbtypecustomizer.properties -dbname %DBNAME% -dbuser %DBUSER% -dbpwd %DBPWD% -infile resolvedOutput.xml -method sqlimport -commitcount 20For this example, the method chosen was sqlimport. This method does not put the tables into check-pending state in case of load errors. On the other hand, it has the slowest performance. The commitcount was set to 20, but to improve performance, you can increase this value to 1000 or greater, depending on the number of items in the file. There is also an option that can limit the number of errors to ignore, but it was not used in this example. When you run this command, output will be sent to one or two files. The file trace.txt holds the tracing of the executed statements performed during the resolved process. A second file, messages.txt, is generated only if there are errors during the import process.
Packaging store assets
This tutorial is the recommended guide for packaging data assets into the Store Archive (SAR). The store archive file (.sar) is a ZIP archive file that contains all the assets necessary to create a site or store. It is primarily used as a vehicle for packaging and delivering the stores.Typically, a store archive is composed of the following files:
For the purposes of preparing a store archive for a rapid solution in WebSphere Commerce v5.6, the extracted data assets can be re-used and packaged into a new store archive. The following list contains a common scenario for managing data assets and packaging them into a store archive for launch.
- Web Assets: The files that create your store pages, such as HTML files, JSP files, images, graphics and include files.
- Property resource bundles: Contain the text for your store pages. If your store supports more than one language, the store archive will contain multiple resource bundles, one per supported language, plus a default resource bundle (which does not include a locale).
- Store data assets: The data to be loaded into the database. Store data assets include data such as campaigns, catalog entries, currencies, fulfillment information, pricing, shipping, store, and taxation information. The store data assets in the starter store archives provided with WebSphere Commerce are well-formed XML files valid for the loading utilities. The store archive XML files are intended to be portable and should not contain any generated primary keys that are specific to a particular instance of the database. Instead they use internal aliases, which are resolved by the idresgen utility when the store is published. The use of these conventions enables the portability of the starter store archives.
- Payment assets: Configuration information for WebSphere Commerce Payments. The payment information is not loaded through the loading utilities; it provides input to a command that configures WebSphere Commerce Payments.
- Descriptors: XML files that describe the store archive and contain information about how it should be published.
- A starter store is first published for demo purposes.
- Fit-Gap analysis is performed to find the best fit between the business requirements and the default functionality.
- With the starter stores in place, new data is created through the WebSphere Commerce tools provided in the WebSphere Commerce components.
- Data created from the starter store demo environment is then filtered and extracted.
- The extracted data, in XML format, is then transformed, where the generated primary keys are converted into internal aliases.
- Those transformed XML files are resolved against a test database for Quality Assurance.
- Those transformed XML files are then packaged into a Store Archive.
For a rapid solution, data assets must be prepared for the SAR. This enables portability of the rapid solution. The data assets that have been extracted and transformed to be used for launch must be packaged within the XML files containing the Store data assets of the SAR. A rapid development cycle can be easily supported from packaging data assets into a store archive. From the extract and transform steps provided in this document, prepare the data assets after you have packaged a store.
In the next step of this tutorial, you will learn how to update the env.bat file needed to run the WebSphere Commerce utilities.
(C) Copyright IBM Corporation 1996, 2006. All Rights Reserved.