JDBC mediator generated query
If we do not provide a structured query language (SQL) SELECT statement, then the data mediator service (DMS) generates one using the metadata provided at instance creation.
The internal query engine uses information in the metadata about tables, columns, relationships, filters, and order-bys to construct a query. As with the supplied queries, UPDATE, DELETE, and INSERT statements are automatically generated for each DataObject to be applied when the mediator commits the changes made to the DataGraph back to the database.
Filters
Filters define an SQL WHERE clause that might contain parameter markers. These are added to the DataGraph SELECT statement WHERE clause. Filters are used as is; they are not parsed or interpreted in any way so there is no error checking. If we use the wrong name, predicate, or function, it is not detected and the generated query is not valid. If a Filter WHERE clause contains parameter markers, then the corresponding parameter name and type are defined using Filter arguments. Parameter DataObjects fill in these parameters before the graph is retrieved. An example of the Filters and Parameter DataObjects for generated queries follows.
Limitation: Because of the tree-like nature of the DataGraph, any table at a branch appears in more than one subquery in the final union with the root table appearing in all paths. This means that it is not possible to filter on a table that appears in more than one path independent of all other paths. All filters defined on a particular table are joined by a boolean AND, and used everywhere that table appears.
Parameter DataObjects for generated queries
Clients use a Parameter DataObject to supply arguments applied to the filters provided in the DMS metadata. A Parameter DataObject is a DataObject, but is not part of any DataGraph. It is constructed by the JDBC DMS when requested by the client. The Parameter DataObject for generated queries is created based on the mediator's metadata. Every argument of every filter of every table is put into the Parameter DataObject. Unlike the supplied query Parameter DataObject, the parameters have the name assigned to them by the Filter arguments. The Parameter DataObject uses this name to map to the parameter to be filled in. The following sample code illustrates how a filter is created for a table in the mediator metadata. It also demonstrates the use of a Parameter DataObject to pass filter parameter values to a mediator instance. The sample assumes that the Customer table has already been defined:
// The factory is a MetadataFactory object Filter filter = factory.createFilter(); filter.setPredicate("CUSTSTATE = ? AND CUSTZIP = ?"); FilterArgument arg0 = factory.createFilterArgument(); arg0.setName("customerState"); arg0.setType(Column.String); queryInfo.getFilterArguments().add(arg0); FilterArgument arg1 = factory.createFilterArgument(); arg1.setName("customerZipCode"); arg1.setType(Column.Integer); queryInfo.getFilterArguments().add(arg1); // custTable is the Customer Table object custTable.setFilter(filter); ..... // setting up mediator DataObject parameters = mediator.getParameterDataObject(); // Notice the first parameter is the name given to the // argument by the FilterArgument. parameter.setString("customerState", "NY"); parameter.setInt("customerZipCode", 12345); DataObject graph = mediator.getGraph(parameters);
Order-by
Ordering of query results is specified using OrderBy objects that identify a column from a table to sort the results. This ordering can be either ascending or descending. The OrderBy objects are part of the metadata and are automatically applied to generated queries. An example of this for a customer table results to be sorted by first names is as follows:
// This example assumes that the custTable, a table in // the metadata, and factory, the MetaDataFactory // object, have already been created. Column firstName = ((TableImpl)custTable).getColumn("CUSTFIRSTNAME"); OrderBy orderBy = factory.createOrderBy(); orderBy.setColumn(firstName); orderBy.setAscending(true); metadata.getOrderBys().add(orderBy);Limitation: Even though Order-bys are defined on each table in the metadata, the RDBMS model requires them to be applied to the final query. This has many implications. For example, we cannot order a table and then use that in a join to another table and propagate the ordering in the first table. Because a result set is a union of all the tables in the DataGraph, the nature of the single result set requires that it be padded with nulls, which can affect the order-bys, particularly in the non-root tables. This can give unexpected results.
External tables
An external table is a table defined in the metadata that is not needed in the DataGraph returned by the JDBC DMS. This might be appropriate when we want to filter the result set based on data from a table but that table's data is not needed in the result set. An example of this with the Customers and Orders relationship would be to filter the results to return all customers who ordered items with an order date of the first of the year. In this case, we do not want any order information returned, but we do need to filter on the order information. Making the Orders table external excludes the orders information from the DataGraph and therefore reduces the DataGraph size, improving efficiency. To designate a table as external, you call the setExternal(true) method from a table object in the JDBC DMS metadata. If the client tries to access an external table from the DataGraph, an illegal argument exception occurs.
Limitation: Many RDBMSs require an orderby column appear in the final result set; the columns from an external table cannot in general be used to order a result set. Order-bys are actually applied to the result set (the word "set" is key here), and not to intermediate query results.
General limitations of generated queries
In understanding the limitations of the query generation feature in the JDBC DMS, there are two things to keep in mind. The first is that the DataGraph imposes a model that is a directed, connected graph with no cycles (that is, a model that is a tree) on a relational model that is a non-directed, potentially disconnected graph with cycles. Directed means that the developer chooses the orientation of the graph by picking a root table. Connected means that all tables that are a member of the DataGraph are reachable from the root. Any tables that are not reachable from the root cannot be included in the DataGraph. In order for a table to be reachable from the root, there must be at least one foreign key relationship defined between each pair of tables in the DataGraph. No cycles means that there is only one foreign key relationship between a pair of tables in the DataGraph. The tree nature of the DataGraph determines how the queries are built, and what data is returned from a query.
The second item to keep in mind is the following high level description of how query generation produces read queries for a DataGraph:
- The JDBC DMS creates a single result set (that is, a DataGraph) whether the DataGraph is composed from a single table or from multiple tables.
- Each path through the foreign key relationships in DMS Metadata from root to leaves represents a separate path. The data for that path is retrieved using joins across the foreign keys defined between the tables in the path. The joins are by default inner joins.
- All the paths in a DataGraph are unioned together in order to create a single result set by the query generated by the mediator, and are thus treated independently of one another.
- Any user-defined filtering is done first on the tables. Then the result is joined to the rest of the path.
- Relational databases generally require order-bys to be applied to the entire final result set and not on intermediate results.