SearchCellConfig commands
Overview
Use the SearchCellConfig commands to:
- Configure the location of the Search index and the IBM LanguageWare dictionaries used by Search
- Configure the file download and conversion service used when indexing file attachments
SearchCellConfig commands
To run the commands, initialize the Search configuration environment.
For SearchCellConfig commands that modify data, check out search-config.xml using
SearchCellConfig.checkOutConfig()
After making edits, check in the changes using
SearchCellConfig.checkInConfig()
The changes take effect the next time the server restarts. Each of these changes requires the indexes to be rebuilt.
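The check-out, edit, check-in sequence described above can be sketched as a small helper. This is an illustrative sketch, not part of the product: `apply_search_config_change` and the `RecordingConfig` stand-in are hypothetical names, and the working directory and cell name are placeholder values. In a real wsadmin session you would pass the actual SearchCellConfig object instead of the stand-in.

```python
class RecordingConfig(object):
    """Hypothetical stand-in for SearchCellConfig that only records calls."""

    def __init__(self):
        self.calls = []

    def __getattr__(self, name):
        # Any command name resolves to a function that logs its invocation.
        def record(*args):
            self.calls.append((name, args))
        return record


def apply_search_config_change(config, working_dir, cell_name, change):
    """Check out search-config.xml, apply one change, then check it in."""
    config.checkOutConfig(working_dir, cell_name)
    # If the change raises, the check-in below is skipped on purpose,
    # so a half-edited file is never checked in.
    change(config)
    config.checkInConfig()


config = RecordingConfig()
apply_search_config_change(
    config, "/tmp", "myCell01", lambda c: c.setMaxCrawlerThreads("3"))
for name, args in config.calls:
    print(name, args)
```

The helper deliberately does not check in when the change fails, so the active copy of the configuration file is left untouched in that case.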
SearchCellConfig.checkInConfig()
Check in the Search configuration file. The edited copy of the Search configuration file, search-config.xml, is validated against the XSD schema definition file, search-config.xsd.
The checkInConfig command copies the updated configuration file from the temporary directory to the location of the active copy of these files, overwriting the existing XML file.
For example:
SearchCellConfig.checkInConfig()
SearchCellConfig.checkOutConfig(String /tmp, String cell_name)
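Check out the Search configuration file. Run this command before making changes to the Search configuration file. The command takes two arguments: the working directory to which the configuration file is copied, and the name of the cell.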
SearchCellConfig.disableAttachmentHandling()
Disable the indexing of file content in the Files, Wikis, and Library (ECM Files) applications. This command does not take any input parameters.
SearchCellConfig.disableDictionary(String languageCode)
Disable the specified LanguageWare dictionary. This command accepts one argument:
languageCode The language code for the dictionary to disable. The language code typically comprises two letters conforming to the ISO standard 639-1:2002 that identify the primary language of the dictionary. However, some codes additionally define a country or variant, in which case the constituent parts are separated by an underscore. For example, Portuguese has two variants, one for Portugal (pt_PT) and one for Brazil (pt_BR). When using a code that also specifies a country, separate the language code and the country code with an underscore rather than a hyphen; otherwise an error is generated.
For example:
SearchCellConfig.disableDictionary("fr")
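The underscore rule for language codes can be checked before a command is run. The following is a minimal sketch under stated assumptions: `is_valid_language_code` is a hypothetical helper, and the pattern (two lowercase letters, optionally followed by an underscore and a country or variant part) is only an approximation of the codes that the product accepts.

```python
import re

# Assumed shape: two lowercase letters, optionally followed by an
# underscore and a country or variant part. The product may accept
# other forms; this pattern is only an illustration.
_LANGUAGE_CODE = re.compile(r"^[a-z]{2}(_[A-Za-z0-9]+)?$")


def is_valid_language_code(code):
    """Return True if code looks like 'fr' or 'pt_BR' (never 'pt-BR')."""
    return _LANGUAGE_CODE.match(code) is not None


for candidate in ("fr", "pt_BR", "pt-BR"):
    print(candidate, is_valid_language_code(candidate))
```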
SearchCellConfig.disableEcmPostFiltering()
Disable post-filtering for community libraries. Post-filtering is enabled by default.
SearchCellConfig.disableVerboseLogging()
Disable verbose logging.
Verbose logging fills SystemOut.log with detailed output that can occupy an increasing amount of disk space, unless you have configured the deployment to retain only a limited number of the most recent log files. A high turnover of logs might be a problem when you are trying to track down the cause of an issue and the log file that you are interested in has been deleted. For this reason, you might want to disable verbose logging. The performance impact of having verbose logging enabled is negligible.
SearchCellConfig.enableAttachmentHandling()
Enable the indexing of file attachments in the Files and Wikis applications.
If attachment handling was disabled during the last indexing, rebuild the index after enabling attachment handling; otherwise, this command does not take effect. This command does not take any input parameters.
SearchCellConfig.enableDictionary(String languageCode, String dictionaryPath)
Enable support for the specified LanguageWare dictionary. This command accepts two arguments.
languageCode The language code for the dictionary to enable. The language code typically comprises two letters conforming to the ISO standard 639-1:2002 that identify the primary language of the dictionary. However, some codes additionally define a country or variant, in which case the constituent parts are separated by an underscore. For example, Portuguese has two variants, one for Portugal (pt_PT) and one for Brazil (pt_BR). When using a code that also specifies a country, separate the language code and the country code with an underscore rather than a hyphen; otherwise an error is generated.
dictionaryPath The path to the directory containing the dictionary file.
For example:
SearchCellConfig.enableDictionary("fr","/opt/IBM/Connections/data/shared/search/dictionary")
We can also specify the path using a WebSphere environment variable. In the following example, the "${SEARCH_DICTIONARY_DIR}" value is used to point to the shared Search dictionary directory.
SearchCellConfig.enableDictionary("fr","${SEARCH_DICTIONARY_DIR}")
SearchCellConfig.enableEcmPostFiltering()
Enable post-filtering for community libraries. Post-filtering is enabled by default.
SearchCellConfig.enableVerboseLogging()
Enable more detailed status reporting during crawling and indexing in the form of more verbose logging to SystemOut.log.
Verbose logging is automatically enabled when Connections is installed.
We can use the following commands to tune the frequency with which status information is logged to SystemOut.log during different stages of the crawling and indexing process:
- SearchCellConfig.setVerboseInitialLoggingInterval(int interval)
- SearchCellConfig.setVerboseSeedlistRequestLoggingInterval(int interval)
- SearchCellConfig.setVerboseIncrementalCrawlingLoggingInterval(int interval)
- SearchCellConfig.setVerboseIncrementalBuildingLoggingInterval(int interval)
SearchCellConfig.excludeInactiveProfilesSearchResults()
Documents corresponding to inactive user profiles are excluded from search results. In a default installation of Connections, inactive user profiles are automatically excluded from search results.
SearchCellConfig.includeInactiveProfilesSearchResults()
Documents corresponding to inactive user profiles are included in search results. In a default installation of Connections, inactive user profiles are automatically excluded from search results.
SearchCellConfig.listDictionaries()
List the LanguageWare dictionaries configured for Search. These dictionaries are used by common Search to support indexing multilingual content and searching in multiple languages. This command does not take any input parameters.
SearchCellConfig.setBackupType(String type)
Type of backup to create.
Specify the backup type:
new Create a new index backup every time.
dual Create dual copies and overwrite the oldest existing backup.
overwrite Overwrite the existing index backup.
For example:
SearchCellConfig.setBackupType("new")
SearchCellConfig.setDefaultDictionary(String languageCode)
Configure the default LanguageWare dictionary used by the Search application. The default dictionary must be one of the enabled dictionaries. This command takes a single argument:
languageCode The language code for the dictionary to use as the default. The language code typically comprises two letters conforming to the ISO standard 639-1:2002 that identify the primary language of the dictionary. However, some codes additionally define a country or variant, in which case the constituent parts are separated by an underscore. For example, Portuguese has two variants, one for Portugal (pt_PT) and one for Brazil (pt_BR). When using a code that also specifies a country, separate the language code and the country code with an underscore rather than a hyphen; otherwise an error is generated. A matching dictionary must exist in the list of configured dictionaries for the language specified as a parameter.
For example:
SearchCellConfig.setDefaultDictionary("fr")
SearchCellConfig.setDeletePersistedPages(String enabled)
Specify whether to delete the persisted pages after a successful incremental index. By default, the value is set to true. This command takes a single argument:
enabled A string that determines whether persisted pages are to be deleted after a successful incremental index. This string represents a boolean, that is, it should be set to true or false.
When this functionality is enabled, persisted pages from the initial index creation are also deleted after a successful incremental index.
For example:
SearchCellConfig.setDeletePersistedPages("false")
SearchCellConfig.setDownloadThrottle(long downloadThrottle)
Set the duration of the rest period between successive file downloads in a single file-download thread.
Specify the download throttle in milliseconds. The download throttle is set to 500 by default.
Decreasing the download throttle can increase the load on the Files server.
For example:
SearchCellConfig.setDownloadThrottle("500")
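To get a feel for how the throttle relates to load on the Files server, a rough upper bound on the download rate can be computed. This is a back-of-the-envelope sketch that assumes each download takes no time, so real rates are lower; `max_downloads_per_minute` is a hypothetical helper, not a product command.

```python
def max_downloads_per_minute(throttle_ms, threads=1):
    """Upper bound on downloads per minute, ignoring the download time itself."""
    if throttle_ms <= 0:
        raise ValueError("throttle_ms must be greater than zero")
    # Each thread rests throttle_ms between downloads, so it can start
    # at most 60000 // throttle_ms downloads in one minute.
    return threads * (60 * 1000 // throttle_ms)


print(max_downloads_per_minute(500))      # default throttle, one thread
print(max_downloads_per_minute(500, 3))   # default throttle, three threads
```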
SearchCellConfig.setEcmPostFilteringConnectionTimeOut(connectionTimeOutInMillis)
Connection timeout value for post-filtering.
If the timeout occurs, community library documents are removed from the search results. Results for community documents that have no access control are still shown.
connectionTimeOutInMillis The connection timeout for post-filtering in milliseconds.
For example:
SearchCellConfig.setEcmPostFilteringConnectionTimeOut(1000)
SearchCellConfig.setEcmPostFilteringMaxGapSize(maxGapSize)
Maximum gap size allowed for post-filtering.
If a user uses the pagination controls in the Search user interface, post-filtering calculation is performed when jumping from page 1 of the search results to, for example, page 4. However, you may not want to allow post-filtering calculation when jumping to page 100 for performance reasons. This command specifies the maximum gap allowed for post-filtering calculations between the current page and the requested page.
Parameter:
maxGapSize The maximum gap allowed between the current page (for which the accurate index is known) and the requested page for post-filtering calculations.
For example:
SearchCellConfig.setEcmPostFilteringMaxGapSize(250)
SearchCellConfig.setEcmPostFilteringMultiplier(multiplier)
Multiplier for post filtering.
When a user requests a certain page size for their search results, the Search application attempts to populate the page with the specified number of results. For example, if the user requests a page size of 10, the Search application checks more than 10 documents. However, a limit is required to avoid performance issues. A multiplier of 3 specifies that up to 30 documents are loaded to identify 10 documents to which the user has access. In most cases, statistically, this should be enough to fill the page. If the page cannot be fully populated after checking all 30 documents, a page with fewer search results is returned to the user.
If you frequently receive partially filled search result pages in Connections, you should change this parameter.
Parameter:
multiplier Factor applied to the requested page size to determine the maximum number of documents checked in the attempt to populate the search results page.
For example:
SearchCellConfig.setEcmPostFilteringMultiplier(20)
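The arithmetic behind the multiplier can be written out directly. This sketch only restates the calculation described above (page size times multiplier); `max_documents_checked` is a hypothetical helper name.

```python
def max_documents_checked(page_size, multiplier):
    """Maximum number of documents loaded to fill one results page."""
    return page_size * multiplier


# A page size of 10 with a multiplier of 3 allows up to 30 documents
# to be checked before a partially filled page is returned.
print(max_documents_checked(10, 3))
print(max_documents_checked(10, 20))
```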
SearchCellConfig.setEcmPostFiltering(multiplier,maxGapSize,connectionTimeOutInMillis,socketDataTimeOutInMillis)
Enable post-filtering settings for community libraries with the values specified.
Parameters:
multiplier Factor applied to the requested page size to determine the maximum number of documents checked in the attempt to populate the search results page.
maxGapSize The maximum gap allowed between the current page (for which the accurate index is known) and the requested page for post-filtering calculations.
connectionTimeOutInMillis The connection timeout for post-filtering in milliseconds.
socketDataTimeOutInMillis The socket data timeout for post-filtering in milliseconds.
For example:
SearchCellConfig.setEcmPostFiltering(20,100,250,1000)
SearchCellConfig.setEcmPostFilteringSocketDataTimeOut(socketDataTimeOutInMillis)
Set the socket data timeout value for post-filtering.
If the timeout occurs, community library documents are removed from the search results. Results for community documents that have no access control are still shown.
Parameter:
socketDataTimeOutInMillis The socket data timeout for post-filtering in milliseconds.
For example:
SearchCellConfig.setEcmPostFilteringSocketDataTimeOut(3000)
SearchCellConfig.setIndexingResumptionAllowed(boolean allowed)
Enable or disable resumption of interrupted or failed indexing tasks that have not reached a resume point.
allowed A boolean value.
For example, to enable indexing resumption:
SearchCellConfig.setIndexingResumptionAllowed("true")
SearchCellConfig.setMaxCrawlerThreads(String maxThreadNumber)
Maximum number of seedlist threads that can be used when crawling. By default, the value is set to 2.
For example:
SearchCellConfig.setMaxCrawlerThreads("3")
SearchCellConfig.setMaximumAttachmentSize(int maxAttachmentSize)
Set the limit on the size of files that can be downloaded for indexing. Files that are greater than the configured maximum attachment size are not downloaded or processed for content indexing. By default, the limit is set to 50 MB, which means that files over 50 MB are not indexed.
Files that are under the specified size are downloaded to a temporary directory located in the index directory, where they go through the text extraction process. The extracted text is then indexed. The temporary directory size available must be greater than the maximum file size allowed for content indexing. This command accepts one argument:
maxAttachmentSize The maximum file size in bytes of any file attachment eligible for indexing. This is an integer value.
For example:
SearchCellConfig.setMaximumAttachmentSize("52428800")
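Because the argument is given in bytes while the default is stated in megabytes, a small conversion helper avoids arithmetic slips. This sketch assumes 1 MB = 1024 * 1024 bytes, which matches the 52428800 value used for 50 MB in the example above; `megabytes_to_bytes` is a hypothetical helper name.

```python
BYTES_PER_MB = 1024 * 1024


def megabytes_to_bytes(megabytes):
    """Convert a size in megabytes to a whole number of bytes."""
    return int(megabytes * BYTES_PER_MB)


print(megabytes_to_bytes(50))   # the default 50 MB limit, in bytes
```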
SearchCellConfig.setMaximumConcurrentDownloads(int maxConcurrentDownloads)
Maximum number of threads that perform file downloading on a Search server.
Specify the maximum number of threads. The argument must be an integer greater than zero. Default is 3. The value of the maxConcurrentDownloads argument must not exceed the maximum number of threads set for the DefaultWorkManager Work Manager resources at the Search server scope.
For example:
SearchCellConfig.setMaximumConcurrentDownloads("10")
SearchCellConfig.setMaxIndexerThreads(String maxThreadNumber)
Maximum number of indexer threads that can be used when indexing. By default, the value is set to 1.
Specify the number of threads allowed.
For example:
SearchCellConfig.setMaxIndexerThreads("3")
SearchCellConfig.setMaximumTempDirSize(int maxTempDirSize)
Maximum size of a temporary directory used by a Search server for the files conversion process.
Specify the maximum size in bytes. The argument must be an integer greater than zero. Default is 100 MB.
Files are downloaded to a temporary directory, which is located in the index directory. The temporary directory size available must be greater than the maximum file size allowed for content indexing.
For example:
SearchCellConfig.setMaximumTempDirSize("51200")
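The constraint that the temporary directory must be larger than the maximum attachment size can be checked before either value is changed. This is a minimal sketch; `temp_dir_is_large_enough` is a hypothetical helper, and the sizes shown are the documented defaults (100 MB and 50 MB) expressed in bytes.

```python
def temp_dir_is_large_enough(max_temp_dir_size, max_attachment_size):
    """Return True if the temp directory limit exceeds the attachment limit."""
    return max_temp_dir_size > max_attachment_size


default_temp_dir_size = 100 * 1024 * 1024    # 100 MB in bytes
default_attachment_size = 50 * 1024 * 1024   # 50 MB in bytes
print(temp_dir_is_large_enough(default_temp_dir_size, default_attachment_size))
```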
SearchCellConfig.setMaxPagePersistenceAge(String maxAgeInHours)
Maximum age for persisted pages in a seedlist persistence directory. By default, the value is set to 720 hours (30 days).
If the pages are older than the maximum age, they are ignored when building an index or resuming a crawl. This command takes a single argument:
maxAgeInHours A string representing an integer that specifies the maximum age in hours of the persisted pages.
For example:
SearchCellConfig.setMaxPagePersistenceAge("42")
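Since the argument is a string holding a number of hours, converting from days and checking whether a page has aged out can be sketched as follows. Both function names are hypothetical, and the strict "older than" comparison is an assumption based on the wording above.

```python
def days_to_hours_argument(days):
    """Return the number of hours in the given days as a string argument."""
    return str(int(days * 24))


def is_page_expired(page_age_hours, max_age_hours):
    """Assumed rule: a page is ignored only when strictly older than the maximum."""
    return page_age_hours > max_age_hours


print(days_to_hours_argument(30))   # the default maximum age of 30 days
print(is_page_expired(721, 720))
```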
SearchCellConfig.setPostBackupScript(String script)
Specify the shell script or third-party application that runs on completion of the backup task.
Specify the name of the shell script or application file.
For example:
SearchCellConfig.setPostBackupScript("backup.sh")
To disable the script, run the command again with an empty string as the argument. For example:
SearchCellConfig.setPostBackupScript("")
SearchCellConfig.setSandIndexerTuning(String indexer, int iterations)
Set the number of iterations used by a specified social analytics job.
Arguments:
indexer Name of the social analytics indexing job. Valid values: evidence, graph, manageremployees, and tags.
iterations Number of iterations for the specified social analytics indexing job.
For example:
SearchCellConfig.setSandIndexerTuning("manageremployees",200)
SearchCellConfig.setSandIndexerTuning("graph",400)
SearchCellConfig.setVerboseIncrementalBuildingLoggingInterval(int incrementalBuildingInterval)
Control the frequency with which update indexing progress is logged to SystemOut.log. Update indexing of a Connections application, or set of applications, is an indexing job that updates an index that already has content from all applications that are to be indexed as part of the current indexing job.
Parameter:
incrementalBuildingInterval Number of documents. For example, if an interval of 20 is specified, then for every 20 documents that have been indexed, the number of documents indexed when indexing a particular application during the current indexing job is logged. The incrementalBuildingInterval parameter is set to 100 by default.
We can find additional logging information about update indexing progress in SystemOut.log by searching for occurrences of the CLFRW0600I logging message. For example:
CLFRW0600I: Search is continuing to build the index for blogs: 40 documents indexed.
For example:
SearchCellConfig.setVerboseIncrementalBuildingLoggingInterval(100)
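The effect of a logging interval can be illustrated by listing the document counts at which a progress message would be written. This sketch assumes a message is logged exactly at each multiple of the interval; the precise behavior of Search may differ, and `logged_document_counts` is a hypothetical helper.

```python
def logged_document_counts(total_documents, interval):
    """Document counts at which progress would be logged (assumed behavior)."""
    if interval <= 0:
        raise ValueError("interval must be greater than zero")
    return list(range(interval, total_documents + 1, interval))


# With an interval of 20, indexing 100 documents yields five log entries.
print(logged_document_counts(100, 20))
```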
SearchCellConfig.setVerboseIncrementalCrawlingLoggingInterval(int incrementalCrawlingInterval)
Control the frequency with which seedlist update crawling progress is logged to SystemOut.log. An update crawl of an application fetches data that was created, updated, or deleted since the previous crawl of that application began.
incrementalCrawlingInterval Number of seedlist entries. For example, if an interval of 100 is specified, then, for every 100 entries that have been crawled, the number of entries that have been crawled for a particular application during the current indexing job is logged. The incrementalCrawlingInterval parameter is set to 100 by default.
We can find additional logging information about update crawling progress in SystemOut.log by searching for occurrences of the CLFRW0589I logging message. For example:
CLFRW0589I: Search is continuing to build the index for profiles: 1,600 seedlist entries indexed.
For example:
SearchCellConfig.setVerboseIncrementalCrawlingLoggingInterval(100)
SearchCellConfig.setVerboseInitialLoggingInterval(int initialInterval)
Control the frequency with which initial index creation progress is logged to SystemOut.log.
initialInterval Number of seedlist entries. A seedlist entry is an indexing instruction that specifies an action, such as the creation, deletion, or update of a specified document in the Search index. For example, if an interval of 500 is specified, then for every 500 entries processed, the number of seedlist entries indexed so far for an application by the current indexing job is logged. The initialInterval parameter is set to 250 by default.
We can find additional logging information about initial index creation in SystemOut.log by searching for occurrences of the CLFRW0581I logging message. For example:
CLFRW0581I: Search is continuing to build the index for activities: 3500 seedlist entries indexed.
For example:
SearchCellConfig.setVerboseInitialLoggingInterval(500)
SearchCellConfig.setVerboseLogging(int initialInterval, int seedlistRequestInterval, int incrementalCrawlingInterval, int incrementalBuildingInterval)
Enable verbose logging with the specified initial interval, seedlist request interval, incremental crawling interval, and incremental building interval.
Running this command has the same net effect as calling the following commands in sequence:
- SearchCellConfig.enableVerboseLogging()
- SearchCellConfig.setVerboseInitialLoggingInterval(initialInterval)
- SearchCellConfig.setVerboseSeedlistRequestLoggingInterval(seedlistRequestInterval)
- SearchCellConfig.setVerboseIncrementalCrawlingLoggingInterval(incrementalCrawlingInterval)
- SearchCellConfig.setVerboseIncrementalBuildingLoggingInterval(incrementalBuildingInterval)
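The equivalent command sequence listed above can be sketched as a single function. `RecordingConfig` is a hypothetical stand-in that only records the calls it receives, and `set_verbose_logging_stepwise` is an illustrative name; in a wsadmin session the real SearchCellConfig object would be passed instead. The interval values used are the documented defaults.

```python
class RecordingConfig(object):
    """Hypothetical stand-in for SearchCellConfig that only records calls."""

    def __init__(self):
        self.calls = []

    def __getattr__(self, name):
        def record(*args):
            self.calls.append((name, args))
        return record


def set_verbose_logging_stepwise(config, initial, seedlist_request,
                                 crawling, building):
    """Issue the individual commands in the documented order."""
    config.enableVerboseLogging()
    config.setVerboseInitialLoggingInterval(initial)
    config.setVerboseSeedlistRequestLoggingInterval(seedlist_request)
    config.setVerboseIncrementalCrawlingLoggingInterval(crawling)
    config.setVerboseIncrementalBuildingLoggingInterval(building)


config = RecordingConfig()
set_verbose_logging_stepwise(config, 250, 1, 100, 100)
for name, args in config.calls:
    print(name, args)
```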
SearchCellConfig.setVerboseSeedlistRequestLoggingInterval(int seedlistRequestInterval)
Control the frequency with which seedlist crawling progress is logged to SystemOut.log.
seedlistRequestInterval Number of seedlist page requests. A seedlist crawl is a sequence of seedlist page requests, which are HTTP GET operations that fetch seedlist pages. A seedlist page can contain zero or more seedlist entries up to a specified maximum. For example, if an interval of 1 is specified, then after every seedlist request, the crawling progress of the application being currently crawled is logged. The seedlistRequestInterval parameter is set to 1 by default.
We can find additional logging information about seedlist crawling in SystemOut.log by searching for occurrences of the CLFRW0604 logging message. For example:
CLFRW0604 : Current seedlist state: Finish Date: Thu May 12 10:14:58 IST 2011; Start Date: Thu Jan 01 01:00:00 GMT 1970; Type: 1; Last Modified: Thu Jan 01 01:00:00 GMT 1970; Finished: false; Started: true; ACL Start: 0; Offset: 0;
For example:
SearchCellConfig.setVerboseSeedlistRequestLoggingInterval(1)
SearchCellConfig.addFieldBoost(string boostName, string fieldName, float boost)
Add a field with a specific boost to be taken into account at query time.
boostName Name of the boost element to apply changes to.
fieldName Name of the indexing field to which the boost is to be applied.
boost The relevance score associated with the specified fields in the index. The default values are 2.0 and 3.0 for tag and title.
Example:
SearchCellConfig.addFieldBoost("content", "author", "2.0")
SearchCellConfig.addRecencyBoost(string boostName, float rboost)
Add a tag element that enables boosting results by their recency, so that more recently created or edited documents are given a higher score. Recency boost values range from 0 to 1. The smaller the recency value, the higher the score that new or edited documents get.
Arguments:
boostName Name of the boost element to apply changes to.
rboost The recency boost value to apply (0 - 1).
Example:
SearchCellConfig.addRecencyBoost("content", "0.2")
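The 0 - 1 range for recency boost values can be validated before the command is run. This is a minimal sketch that assumes both endpoints are allowed, which the text above does not state explicitly; `check_recency_boost` is a hypothetical helper.

```python
def check_recency_boost(rboost):
    """Return rboost as a float, or raise ValueError if outside 0 - 1."""
    value = float(rboost)
    # Assumption: the range is inclusive at both ends.
    if not 0.0 <= value <= 1.0:
        raise ValueError("recency boost must be between 0 and 1: %r" % (rboost,))
    return value


print(check_recency_boost("0.2"))
```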
SearchCellConfig.updateFieldBoost(string boostName, string fieldName, float boost)
Update the value of an existing field tag.
Arguments:
- boostName. The name of the boost element to apply changes to.
- fieldName. The name of the indexing field to which the boost is to be applied.
- boost. The relevance score associated with the specified fields in the index. The default values are 2.0 and 3.0 for tag and title
Example:
SearchCellConfig.updateFieldBoost("content", "author", "3.2")
SearchCellConfig.updateRecencyBoost(string boostName, float rboost)
Update an existing recency boost value.
Arguments:
- boostName. The name of the boost element to apply changes to.
- rboost. The recency boost value to apply (0 - 1).
Example:
SearchCellConfig.updateRecencyBoost("content", "0.2")
SearchCellConfig.deleteRecencyBoost(string boostName)
Delete a recency boost tag.
Arguments:
- boostName. The name of the boost element to apply changes to.
Example:
SearchCellConfig.deleteRecencyBoost("content")
SearchCellConfig.deleteFieldBoost(string boostName, string fieldName)
Delete a field boost tag.
Arguments:
- boostName. The name of the boost element to apply changes to.
- fieldName. The name of the indexing field from which the relevant tag is deleted from the configuration.
Example:
SearchCellConfig.deleteFieldBoost("content", "author")
SearchCellConfig.deleteBoosts(string boostName)
Delete the boost element with the specified boostName.
boostName is the name of the boost element to apply changes to.
Example:
SearchCellConfig.deleteBoosts("content")
SearchCellConfig.deleteBoostSettings()
Delete all the boosting parameters from the configuration file.
Example:
SearchCellConfig.deleteBoostSettings()