Search service configuration parameters
The values set for the parameters of a portal search service apply to that search service and all of its collections. They do not affect other search services of the portal or their search collections. If you modify a search service parameter that affects search collections, the modification applies only to search collections that are created in the search service after the change. Existing search collections are not affected by the updated parameter value. SOAP support for remote search services was deprecated in WebSphere Portal V8.0. EJB is still supported.
The parameter lists in both the search services panel of the Manage Search portlet and the following list show several parameters that end with the suffix _EXAMPLE. The portal does not use these parameters. Each one serves as an example for the parameter of the same name without the suffix _EXAMPLE and gives a sample value that you might use. Deleting these parameters or modifying their values has no effect.
To set a parameter that is listed here, but not in the portlet, just add it. To do this, type the parameter and the value in the entry fields Parameter key: and New parameter value: and click the Add Parameter button.
In the following list the abbreviation pse in parameters or values stands for Portal Search Engine.
Search service parameters
CLEAN_UP_TIME_OF_DAY_HOURS Time of day at which the portal runs the maintenance process for search collections to remove outdated files and broken links. Possible values are integers from 0 to 24 for the full hours of the day. Default is 0, which runs the cleanup at midnight. Modified values are applied only to newly created collections of the search service. You cannot update this parameter for existing search collections.
DefaultCollectionsDirectory Default directory for search collections. If you use Portal Search locally, this parameter is optional. If you specify no value for this parameter, the default collection directory is WP_PROFILE/PortalServer/collections
If you set up a remote search service, this parameter is mandatory.
CONFIG_FOLDER_PATH Determines where the configuration data for search collections is stored. The default is WP_PROFILE/CollectionsConfig
EJB If you set up an EJB remote search service, use this parameter to specify the EJB name in JNDI. Example: ejb/com/ibm/hrl/portlets/WsPse/WebScannerLiteEJBHome
If you set this parameter, you must also set the IIOP_URL parameter.
EJB_Example Example value for the parameter EJB: ejb/com/ibm/hrl/portlets/WsPse/WebScannerLiteEJBHome
HTTP_MAX_BODY_SIZE_MB Limits how much content is fetched during a crawl from application files, such as PDF or Microsoft Word files. The specified unit is MB. Default is 20 MB. If a file exceeds the specified limit, the document is truncated, and Portal Search indexes as much of the fetched portion as possible. However, indexing might fail on truncated documents; in this case the document is not listed in the search results at all. If modified, the new value is applied only to newly created collections of the search service. You cannot update this parameter for existing search collections. DCS might not be able to convert the content of truncated application files. If DCS fails to convert a truncated application file, it logs an error to the SystemErr.log file. If tracing is enabled for the portal, Portal Search logs a warning message to the portal log file.
HTTP_MAX_SEEDLIST_SIZE_MB Limits how much portal content is fetched during a crawl from your own portal site. It determines the amount of space that is reserved for listing portal site resources or managed Web content resources. The specified unit is MB. Default is 4 MB. If a crawl exceeds the limit set for this parameter, the crawl fails, and Portal Search logs an error message. In this case, or if the returned search results do not represent the complete extent of the portal site resources, increase this value. If modified, the new value is applied only to newly created collections of the search service. You cannot update this parameter for existing search collections.
HTTP_NON_APPL_MAX_BODY_SIZE_MB Limits how much content of each HTML page is fetched from Web sites of collections that belong to this search service. The specified unit is MB. Default is 0.2 MB. This means that the amount of content sent for indexing is always the first 0.2 MB of text. If modified, the new value is applied only to newly created collections of the search service. You cannot update this parameter for existing search collections.
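As a rough illustration of how the three size limits above differ in effect, the following Python sketch models them. It is not Portal Search code: the SizeLimits class and the functions limit_bytes, fetch_body, and check_seedlist are hypothetical names, and the conversion of 1 MB to 1024 * 1024 bytes is an assumption, because this page does not say how the portal counts a megabyte. Only the parameter names and default values are taken from the descriptions above.

```python
from dataclasses import dataclass
from typing import Tuple

# Assumption: 1 MB = 1024 * 1024 bytes. The page does not define the unit further.
BYTES_PER_MB = 1024 * 1024


@dataclass(frozen=True)
class SizeLimits:
    """Default values as documented for the three size parameters."""
    http_max_body_size_mb: float = 20.0           # application files such as PDF
    http_max_seedlist_size_mb: float = 4.0        # portal site seedlist
    http_non_appl_max_body_size_mb: float = 0.2   # HTML pages


def limit_bytes(limit_mb: float) -> int:
    """Convert a limit in MB to a whole number of bytes."""
    return int(limit_mb * BYTES_PER_MB)


def fetch_body(content: bytes, limit_mb: float) -> Tuple[bytes, bool]:
    """Model truncation: return the fetched portion and whether it was cut off."""
    limit = limit_bytes(limit_mb)
    return content[:limit], len(content) > limit


def check_seedlist(size_bytes: int, limits: SizeLimits) -> None:
    """Model the seedlist limit: exceeding it fails the crawl, it does not truncate."""
    if size_bytes > limit_bytes(limits.http_max_seedlist_size_mb):
        raise RuntimeError("seedlist exceeds HTTP_MAX_SEEDLIST_SIZE_MB; crawl fails")
```

The sketch highlights the behavioral difference: the two body size limits truncate what is fetched, whereas exceeding the seedlist limit makes the whole crawl fail.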
IIOP_URL If you set up a remote search service using EJB, use this parameter to specify the IIOP URL. Example: iiop://localhost:2811
IIOP_URL_Example Example value for the parameter IIOP_URL: iiop://localhost:2811
PSE_TYPE Type of search service. Possible values are localhost, ejb, and soap. The default value is localhost for a local search service. If you use Portal Search locally, this parameter is optional. If you set up a remote search service, this parameter is mandatory. In this case specify the type of remote service that you use, EJB or SOAP. If you specify ejb here, you must also specify values for the parameters EJB and IIOP_URL. If you specify soap here, you must also specify a value for the parameter SOAP_URL.
SEARCH_SECURITY_MODE Defines how access control is enforced during search. Three filter modes are supported. Specify one of the following values, depending on the filter mode to use:
SECURITY_MODE_PREFILTER Pre-filtering provides the fastest filtering, as it is performed at the search index level. An additional advantage is that remote secured content sources can be searched from the portal. However, as it is based on the search index only, the search result list can be temporarily inconsistent with user access rights if these access rights were changed after the last crawl:
- Users whose access rights were restricted after the last crawl might get search results listed to which they had access before, but to which they no longer have access. When these users click such a link in the search result list, they cannot access the document.
- If a user was given access rights on documents after the last crawl, the user will not get these documents listed among the search results until after the next crawl.
If the search service contains Portal content (a collection containing a content source of type Portal site), then this security mode is invalid and must not be used.
SECURITY_MODE_POSTFILTER Post-filtering provides the safest but most costly filtering approach. It checks access permission in real time for each returned search result against Portal Access Control. As a result, you can use it only for local content sources. This was the only filtering mode available before portal V7.0.
SECURITY_MODE_PRE_POST_FILTER Default. Pre-post-filtering combines the two filter modes above. It is a balanced method for enforcing access control. It filters out most irrelevant documents in the pre-filtering phase based on the search index. This results in fewer rejections in the post-filtering phase. As it still uses post-filtering, you can apply it only to local content sources. As it uses pre-filtering, search result lists might be temporarily inconsistent with users' access rights until after the next crawl.
SEEDLIST_PAGE_TIMEOUT Timeout for fetching the seedlist page. The specified unit for the value is seconds. Default is 150 seconds. This means that Portal Search attempts to fetch the seedlist main URL for 150 seconds. If modified, the new value is applied only to newly created collections of the search service. You cannot update this parameter for existing search collections.
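The difference between the three SEARCH_SECURITY_MODE values described above can be sketched in Python as follows. This is a simplified model, not the Portal Search implementation: the Hit class, its indexed_acl field (standing for the access rights recorded in the index at the last crawl), the filter_hits function, and the can_access_now callback (standing for a real-time check against Portal Access Control) are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, List


@dataclass(frozen=True)
class Hit:
    doc_id: str
    # Hypothetical field: users who had access when the document was last crawled.
    indexed_acl: FrozenSet[str]


def filter_hits(hits: List[Hit], user: str, mode: str,
                can_access_now: Callable[[str, str], bool]) -> List[Hit]:
    """Apply the chosen filter mode to an unfiltered result list."""
    if mode == "SECURITY_MODE_PREFILTER":
        # Index data only: fast, but possibly stale until the next crawl.
        return [h for h in hits if user in h.indexed_acl]
    if mode == "SECURITY_MODE_POSTFILTER":
        # Real-time check for every hit: always current, but costly.
        return [h for h in hits if can_access_now(user, h.doc_id)]
    if mode == "SECURITY_MODE_PRE_POST_FILTER":
        # Cheap index filter first, then a real-time check on the remaining hits.
        candidates = [h for h in hits if user in h.indexed_acl]
        return [h for h in candidates if can_access_now(user, h.doc_id)]
    raise ValueError(f"unknown SEARCH_SECURITY_MODE value: {mode}")
```

In this model, a document whose access was revoked after the last crawl still appears under pre-filtering but not under the other two modes, while a document to which access was granted after the last crawl appears only under post-filtering. This matches the temporary inconsistencies described for each mode.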
SOAP_URL If you set up a remote search service using SOAP, use this parameter to specify the SOAP URL. Example: http://localhost:10000/WebScannerSOAP/servlet/rpcrouter
SOAP_URL_Example Example value for the parameter SOAP_URL: http://localhost:10000/WebScannerSOAP/servlet/rpcrouter
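The dependencies between PSE_TYPE and the other connection parameters described above can be checked before the values are entered in the portlet. The following Python sketch shows one possible way to do so. The missing_parameters function, the REQUIRED_BY_TYPE table, and the representation of the parameters as a dictionary are assumptions made for illustration; only the parameter names and the rules themselves come from this page.

```python
from typing import Dict, List

# Parameters that must accompany each PSE_TYPE value, per the descriptions above.
# DefaultCollectionsDirectory is mandatory for any remote search service.
REQUIRED_BY_TYPE: Dict[str, List[str]] = {
    "localhost": [],
    "ejb": ["EJB", "IIOP_URL", "DefaultCollectionsDirectory"],
    "soap": ["SOAP_URL", "DefaultCollectionsDirectory"],
}


def missing_parameters(params: Dict[str, str]) -> List[str]:
    """Return the required parameters that are absent or empty."""
    # PSE_TYPE is optional for a local search service and defaults to localhost.
    pse_type = params.get("PSE_TYPE", "localhost")
    if pse_type not in REQUIRED_BY_TYPE:
        raise ValueError(f"unsupported PSE_TYPE value: {pse_type}")
    return [key for key in REQUIRED_BY_TYPE[pse_type] if not params.get(key)]
```

For example, a dictionary that sets PSE_TYPE to ejb and provides only the EJB parameter would be reported as missing IIOP_URL and DefaultCollectionsDirectory.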
The following parameters are reserved for internal use only. Do not change their values.
CONTENT_SOURCE_TYPE_FEATURE_NAME Reserved for internal use only. Do not change its value. Default is ContentSourceType.
CONTENT_SOURCE_TYPE_FEATURE_VAL_PORTAL Reserved for internal use only. Do not change its value. Default is Portal.
CONTENT_SOURCE_TYPE_FEATURE_VAL_WEB Reserved for internal use only. Do not change its value. Default is Web.
SecurityResolverId Reserved for internal use only. Do not change its value. Default is com.ibm.lotus.search.plugins.provider.core.PortalSecurityResolverFactory.
SetProperties Reserved for internal use only. Do not change its value. Possible values are on or off. The default value is on.
startup Reserved for internal use only. Do not change its value. Default is false.
VALIDATE_COOKIE Reserved for internal use only. Do not change its value. Default is 123.
WORK_MANAGER Specifies the work manager. Reserved for internal use only. Do not change its value. Default is wps/searchIndexWM.
WORK_MANAGER_DEPLOY Example of the deployed WORK_MANAGER parameter. Example: wps/searchIndexWM
WORK_MANAGER_NATIVE Example of the parameter WORK_MANAGER for native threads, for debug purposes only. Example: force.hrl.work.manager.use.native.threads
WORK_MANAGER_NAME JNDI name of the work manager that Portal Search uses.
Related:
Manage search services
Configure the default location for search collections