Index Load
Index Load is an indexing service that uses the Data Load framework to load data in parallel into one or more search extension indexes. Index Load is used to populate contract prices when performance requirements call for the site to use a separate Price extension index. For example, use Index Load with a Price extension index if the site contains more than 1000 contracts, or if you use an external source to populate prices. Index Load provides the following benefits over populating the catalog entry index with price data:
- It improves indexing performance by using local binding (embedded mode) on the search server to avoid making remote HTTP calls that use HTTPClient.
- The data feed is streamed directly into one or more index columns and no temporary tables are needed. This programming model allows precise data conversion and is easier to customize.
- Metrics can be displayed using the Index Load status command while indexing to help refine tuning parameters and improve performance throughput.
Index Load uses profiles to control the indexing behavior and characteristics for a search extension index. Index Load profiles are defined in the Index Load configuration file.
When you call Index Load, you can pass a profile name through a URL parameter named profile. The value of the profile parameter is used to resolve the name of the configuration file to load from the predefined configuration directory. Both the name pattern and the Index Load configuration directory are defined as servlet initialization parameters in the web.xml of the Index Load servlet (SolrIndexLoadServlet).
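The profile lookup described above can be sketched as follows. This is a minimal illustration, not the servlet's actual code: the class name ProfileResolver, the directory /opt/search/indexload, and the pattern wc-indexload-{0}.xml are all assumptions chosen for the example. The real values are whatever the servlet initialization parameters in web.xml define.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper: maps a "profile" URL parameter to a configuration file. */
public class ProfileResolver {
    private final Path configDir;   // assumed to come from a servlet init parameter
    private final String pattern;   // assumed form: "{0}" marks where the profile goes

    public ProfileResolver(String configDir, String pattern) {
        this.configDir = Paths.get(configDir);
        this.pattern = pattern;
    }

    /** Substitutes the profile name into the pattern and resolves it in the config directory. */
    public Path resolve(String profile) {
        // Accept only simple names so the value cannot escape the configuration directory.
        if (profile == null || !profile.matches("[A-Za-z0-9_-]+")) {
            throw new IllegalArgumentException("Invalid profile name: " + profile);
        }
        String fileName = pattern.replace("{0}", profile);
        return configDir.resolve(fileName).normalize();
    }

    public static void main(String[] args) {
        // Both arguments are illustrative placeholders, not documented defaults.
        ProfileResolver resolver =
                new ProfileResolver("/opt/search/indexload", "wc-indexload-{0}.xml");
        System.out.println(resolver.resolve("price"));
    }
}
```

Restricting the profile value to letters, digits, underscores, and hyphens is a design choice of this sketch; it keeps a request such as profile=../secret from resolving to a file outside the configuration directory.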
For more detailed information on how data flows through the multi-threaded indexing application and which tuning parameters can be used, see Tuning Index Load. The following diagram shows a high-level overview of Index Load.
Index Load contains the following components:
- Index Load Servlet (SolrIndexLoadServlet)
- The Index Load interface. It accepts commands with input information such as profile, catalog, and store. The input information is used to look up the specified configuration files.
- Loader Interface
- Creates loader units to run based on the configured load item (loaditem). Only one loader exists, which can use several load items. Each load item includes a reader and zero or more mediators.
- Loader Item
- The runnable unit for Index Load. You can pass multiple loader items to run in parallel, where every loader item is an independent load unit that is controlled by a single data loader.
Within a loader, there is a data reader, which can read data in multiple threads, and optional mediators. The mediators form a chain, where the output of one mediator is the input of the next, ending with a single data writer. Multiple loader items can target the same core instance or different core instances.
- Reader
- Reads original physical data from data sources in parallel and passes it to the mediator. The SolrIndexLoadQueryReader is used by default to read data from relational databases as specified by the configuration files.
- Mediator
- The BusinessObjectMediator defines a common interface to take the input from the reader and transform it to follow the convert pattern as specified in the configuration files. You can provide zero or more mediators, where the output of one mediator is the input for the following mediator. When all the mediators finish transforming, the physical data writer persists the physical objects into Solr by calling the SolrJ interface.
- Batch Service
- Adds Solr documents and commits them to the Solr server. Only one batch service serves each unique Solr core, with the ability to interact with multiple index writers. The batch service contains an internal queue for buffering unfinished documents from various writers. Once the input document is ready for indexing, it is dispatched to the Solr runtime service.
The batch service is used by default to populate the Price extension index when indexing contract prices by using Index Load.
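The reader, mediator, and writer flow described in the component list can be sketched as a simple chain. This is a simplified, single-threaded illustration under stated assumptions: it does not use the real BusinessObjectMediator interface or SolrJ, a list stands in for the data writer, and the column and field names (CATENTRY_ID, PRICE, price_USD) are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

/** Illustrative sketch of a load item: rows from a reader pass through a mediator chain. */
public class MediatorChainSketch {

    /** Applies each mediator in order; the output of one is the input of the next. */
    static List<Map<String, Object>> run(
            List<Map<String, Object>> rows,
            List<UnaryOperator<Map<String, Object>>> mediators) {
        List<Map<String, Object>> written = new ArrayList<>();
        for (Map<String, Object> row : rows) {
            Map<String, Object> current = row;
            for (UnaryOperator<Map<String, Object>> mediator : mediators) {
                current = mediator.apply(current);
            }
            written.add(current); // stand-in for the single data writer
        }
        return written;
    }

    /** Hypothetical mediator: converts the PRICE column from text to a number. */
    static Map<String, Object> parsePrice(Map<String, Object> in) {
        Map<String, Object> out = new HashMap<>(in);
        out.put("PRICE", Double.valueOf((String) in.get("PRICE")));
        return out;
    }

    /** Hypothetical mediator: renames the source column to an index field name. */
    static Map<String, Object> renameToIndexField(Map<String, Object> in) {
        Map<String, Object> out = new HashMap<>(in);
        out.put("price_USD", out.remove("PRICE"));
        return out;
    }

    public static void main(String[] args) {
        Map<String, Object> row = new HashMap<>();
        row.put("CATENTRY_ID", 10001L);
        row.put("PRICE", "12.50");
        List<Map<String, Object>> rows = new ArrayList<>();
        rows.add(row);

        List<UnaryOperator<Map<String, Object>>> chain = new ArrayList<>();
        chain.add(MediatorChainSketch::parsePrice);
        chain.add(MediatorChainSketch::renameToIndexField);

        System.out.println(run(rows, chain));
    }
}
```

With an empty mediator list, run passes each row to the writer unchanged, which matches the case of a load item that is configured with zero mediators.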
Limitations
Be aware of the following Index Load limitation:
- Index Load supports only extension indexes. Index Load does not support the Product, Category, or Unstructured indexes.
- Indexing contract prices using Index Load
You can use Index Load to index contract prices from a CSV file.
- Customizing Index Load
You can customize Index Load to suit your business needs. For example, you can customize Index Load to read from multiple sources, or change how the source input is transformed.
- Monitoring Index Load
You can monitor Index Load and use the metrics to tune Index Load for optimal performance.
- Index Load configuration files for indexing from database
Index Load requires configuration files before it can be run from a web browser.
- Index Load configuration files for indexing from CSV files
You can load index information from a CSV file. Index Load requires configuration files before it can be run from a web browser.
- Index Load configuration files for merging indexes
Index Load requires configuration files before it can merge search indexes.