Search service configuration parameters
Overview
The values set for the parameters of a portal search service apply to that search service and all of its collections. They do not affect other search services of the portal or their search collections.
If we delete a search service, the portal does not delete the search collections that belong to it. Delete the search collections by using the Manage Search administration portlet. If we delete the default search service, it is re-created when we restart the portal.
The Manage Search portlet lists the Default Portal Search Service and its collections, such as Portal Content, in the default portal language, not in the language that the user selected as the preferred portal language or set in the browser. For example, if the portal default language is English and the user selected German as the preferred portal language or set the browser language to German, the Default Portal Search Service and its collections are shown in English.
Search service configuration parameters
The parameter list in the search services pane of the Manage Search portlet, and the list in the following information, both show several parameters that end with the suffix _EXAMPLE, such as EJB_Example. The portal does not use these parameters. Each one only shows a sample value for the parameter of the same name without the suffix. Deleting these parameters or modifying their values has no effect.
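As an illustration of the rule above, the following minimal sketch separates the example entries from the parameters that take effect. The helper name active_parameters and the dictionary representation are assumptions made for this sketch; they are not part of any portal API.

```python
def active_parameters(params: dict[str, str]) -> dict[str, str]:
    """Return the parameters that remain after dropping the *_EXAMPLE entries.

    Hypothetical helper: it only mirrors the documented rule that
    parameters ending in _EXAMPLE (in any letter case) are ignored.
    """
    return {
        name: value
        for name, value in params.items()
        if not name.upper().endswith("_EXAMPLE")
    }


# Sample values taken from the parameter list in this topic.
service_params = {
    "EJB": "",
    "EJB_Example": "ejb/com/ibm/hrl/portlets/WsPse/WebScannerLiteEJBHome",
    "IIOP_URL_Example": "iiop://localhost:2811",
    "PSE_TYPE": "localhost",
}

print(active_parameters(service_params))
```

Only EJB and PSE_TYPE remain, because the two entries with the example suffix are filtered out.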
In the following list, the abbreviation pse in parameters or values stands for Portal Search Engine.
boostingSettings Specify which metadata fields are given extra weight in the overall rank score during a search, and how much the selected metadata fields contribute to the relevance calculation. In addition to title, description, and keywords, the setting can include any other metadata field with string-based values. Specify each boost value in the range 1.0 to 10.0, and use it with care; the suggested range is 1.0 to 3.0. See also: Search Integration in HCL WebSphere Portal
fieldBoost Define which metadata fields have extra weight when search results are returned, and how much extra weight is given to the specified fields. Provide the following attributes:
field The string-based metadata field that search should favor. Field values include title, description, and keywords.
boost The relative amount of extra weight added to the rank score. Set a value between 1.0 and 10.0. The suggested range is 1.0 to 3.0.
phraseBoost Optional variable that focuses search results on specified languages, for example, English.
boostingSetting_Example phraseBoost: {Enabled:true}, fieldBoost: {field:title, boost:3.0}, {field:description, boost:3.0}, {field:keywords, boost:2.0}
CLEAN_UP_TIME_OF_DAY_HOURS Time of day at which the portal runs the maintenance process for search collections to remove outdated files and broken links. Possible values are the integers 0 - 24, for the full hours of the day. The default value is 0, which runs the cleanup at midnight. If we modify the value for this parameter, the new value is applied only to newly created collections of the search service. We cannot update this parameter for existing search collections.
DefaultCollectionsDirectory Specify the default directory for search collections. For a local Portal Search service, this parameter is optional. If we specify no value for this parameter, the default collection directory is WP_PROFILE/PortalServer/collections. If we set up a remote search service, this parameter is mandatory.
DEFAULT_SEARCH_OPERATOR Specify how the Portal search engine responds to search queries with two or more terms. The default value is or: a document is listed in the search results if it contains at least one of the search terms. Change this value to and to retrieve only those documents that contain all of the search terms in the query. After changing this parameter, restart the Portal server and the remote search service.
CONFIG_FOLDER_PATH Determine where the configuration data for search collections is stored. The default is WP_PROFILE/CollectionsConfig.
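The difference between the two DEFAULT_SEARCH_OPERATOR values described above can be sketched as follows. This is an illustration of the or/and semantics only, assuming simple whitespace tokenization; the function name matches is hypothetical and does not reflect how the Portal search engine evaluates queries internally.

```python
def matches(document_text: str, query: str, operator: str = "or") -> bool:
    """Decide whether a document satisfies a multi-term query.

    operator "or":  at least one query term must occur in the document.
    operator "and": every query term must occur in the document.
    """
    words = set(document_text.lower().split())
    hits = [term in words for term in query.lower().split()]
    if operator == "or":
        return any(hits)
    if operator == "and":
        return all(hits)
    raise ValueError(f"unsupported operator: {operator!r}")


document = "portal search engine"
print(matches(document, "search index", "or"))   # one term is enough
print(matches(document, "search index", "and"))  # "index" is missing
```

With or, the document is returned because it contains "search". With and, it is not returned because it does not contain "index".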
EJB If we set up a remote search service using EJB, specify the EJB name in JNDI. For example: ejb/com/ibm/hrl/portlets/WsPse/WebScannerLiteEJBHome
If set, also set the IIOP_URL parameter.
EJB_Example ejb/com/ibm/hrl/portlets/WsPse/WebScannerLiteEJBHome
ExternalSecurityResolverUrl Configure the Portal Search service with the URL of an external security resolver. Required for security filtering of IBM Connections resources to function properly. An example value of the resolver URL is https://host:port/ConnectionsResourceId/seedlist/authverify/getACLTokens where ConnectionsResourceId is any IBM Connections resource identifier.
HTTP_MAX_BODY_SIZE_MB Limit how much content is fetched during a crawl from application files, such as PDF or Microsoft Word files. The specified unit is MB. The default value is 20 MB. If a file exceeds the specified limit, the document is truncated, and Portal Search indexes the fetched portion as far as possible. However, indexing might fail on truncated documents; in this case the document is not listed in the search results at all.
- If we modify the value for this parameter, the new value is applied only to newly created collections of the search service. We cannot update this parameter for existing search collections.
- Document Conversion Services might not be able to convert the content of truncated application files. If Document Conversion Services fails to convert a truncated application file, it logs an error to the SystemErr.log file. If tracing is enabled for the portal, Portal Search logs a warning message to the portal log file.
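The truncation behavior described for HTTP_MAX_BODY_SIZE_MB can be pictured with the following minimal sketch. It assumes that 1 MB means 1024 * 1024 bytes, which this topic does not state, and the function names are hypothetical; the sketch does not model the crawler itself.

```python
BYTES_PER_MB = 1024 * 1024  # assumption: the topic does not define the MB unit


def fetch_limit_bytes(limit_mb: float) -> int:
    """Convert a limit given in MB into a byte count."""
    return int(limit_mb * BYTES_PER_MB)


def truncate_body(body: bytes, limit_mb: float = 20) -> tuple[bytes, bool]:
    """Return the fetched portion of a file and whether it was truncated."""
    limit = fetch_limit_bytes(limit_mb)
    if len(body) <= limit:
        return body, False
    return body[:limit], True


# A file slightly larger than a 1 MB limit is cut off at the limit.
oversized = b"x" * (BYTES_PER_MB + 1)
fetched, was_truncated = truncate_body(oversized, limit_mb=1)
print(len(fetched), was_truncated)
```

A truncated file is the case in which conversion or indexing might fail, so a result with was_truncated set is a candidate for the errors described in the list above.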
HTTP_MAX_SEEDLIST_SIZE_MB Limit how much portal content is fetched during a crawl from our own portal site. It determines the amount of space reserved for listing portal site resources or managed web content resources. The specified unit is MB. The default value is 4 MB. If a crawl exceeds the limit set for this parameter, the crawl fails, and Portal Search logs an error message. In this case, or if the returned search results do not represent the complete extent of the portal site resources, increase this value. If we modify the value for this parameter, the new value is applied only to newly created collections of the search service. We cannot update this parameter for existing search collections.
HTTP_NON_APPL_MAX_BODY_SIZE_MB Limit how much content of each HTML page is fetched from websites of collections that belong to this search service. The specified unit is MB. The default value is 0.2 MB. The amount of content sent for indexing is always the first 0.2 MB of text. If we modify the value for this parameter, the new value is applied only to newly created collections of the search service. We cannot update this parameter for existing search collections.
IIOP_URL If we set up a remote search service using EJB, use this parameter to specify the IIOP URL. An example value is iiop://localhost:2811.
IIOP_URL_Example iiop://localhost:2811
PSE_TYPE Type of the Portal Search Engine search service. Possible values are localhost, ejb, and soap. The default value is localhost for a local search service. For a local Portal Search service, this parameter is optional. If we set up a remote search service, this parameter is mandatory. In this case, specify the type of remote service that we use, EJB or SOAP. If we specify ejb here, we also need to specify values for the parameters EJB and IIOP_URL. If we specify soap here, we also need to specify a value for the parameter SOAP_URL.
SEARCH_SECURITY_MODE Define access control enforcement during search. Three filter modes are supported.
Specify one of the following values, depending on the filter mode to use:
SECURITY_MODE_PREFILTER Use pre-filtering mode. Pre-filtering provides the fastest filtering, as it is performed at the search index level. An extra advantage of this filtering mode is that remote secured content sources can be searched from the portal. However, as it is based on the search index only, the search result list can be temporarily inconsistent with user access rights if these access rights were changed after the last crawl:
- Users whose access rights were restricted after the last crawl might still get search results for documents to which they had access before, but to which they no longer have access. When these users click such a link in the search result list, they cannot access the document.
- If a user was given access rights on documents after the last crawl, the user will not get these documents listed among the search results until after the next crawl.
If the search service contains Portal content (a collection containing a content source of type Portal site), then this security mode is invalid and must not be used.
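The two temporary inconsistencies of pre-filtering described above can be reproduced with a small model. This is a sketch under stated assumptions: the IndexedDocument class, the prefilter function, and the sample users and URLs are all hypothetical, and the index is modeled as a list that stores a snapshot of the access rights taken at crawl time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexedDocument:
    url: str
    allowed_users: frozenset[str]  # access rights as they were at the last crawl


def prefilter(index: list[IndexedDocument], user: str) -> list[str]:
    """Filter at the index level only, without a live access check."""
    return [doc.url for doc in index if user in doc.allowed_users]


# State of the search index after the last crawl (hypothetical data).
index = [
    IndexedDocument("/docs/budget", frozenset({"alice", "bob"})),
    IndexedDocument("/docs/roadmap", frozenset({"alice"})),
]

# Access rights as they are now, changed after the last crawl.
current_acl = {
    "/docs/budget": {"alice"},          # bob lost access after the crawl
    "/docs/roadmap": {"alice", "bob"},  # bob gained access after the crawl
}

results_for_bob = prefilter(index, "bob")

# Listed although bob can no longer open it.
stale = [url for url in results_for_bob if "bob" not in current_acl[url]]

# Accessible to bob now, but not listed until the next crawl.
missing = [
    url for url, users in current_acl.items()
    if "bob" in users and url not in results_for_bob
]

print(results_for_bob, stale, missing)
```

Both lists become empty again once a new crawl refreshes the access rights stored in the index.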
SECURITY_MODE_POSTFILTER Use post-filtering mode. Post-filtering is the safest but also the most costly filtering approach. It checks the access permission in real time for each returned search result against Portal Access Control. As a result, use it only for local content sources. This was the only filtering mode available before portal V7.0.
SECURITY_MODE_PRE_POST_FILTER Use pre-post-filtering mode. This is the default. Pre-post-filtering combines the two filter modes previously mentioned. It provides a balanced method for enforcing access control. It filters most irrelevant documents at the pre-filtering phase based on the search index. This results in fewer rejections in the post-filtering phase. As it still uses post-filtering, we can apply it only for local content sources. As it uses pre-filtering, search result lists might be temporarily inconsistent with users' access rights until after the next crawl.
SEEDLIST_PAGE_TIMEOUT Set the timeout for fetching the seedlist page. The specified unit for the value is seconds. The default value is 150 seconds, which means that Portal Search attempts to fetch the seedlist main URL for 150 seconds. If we modify the value for this parameter, the new value is applied only to newly created collections of the search service. We cannot update this parameter for existing search collections.
SOAP_URL If we set up a remote search service using SOAP, use this parameter to specify the SOAP URL. An example value is http://localhost:10000/WebScannerSOAP/servlet/rpcrouter.
SOAP_URL_Example http://localhost:10000/WebScannerSOAP/servlet/rpcrouter
The following parameters are reserved for internal use only. Do not change their values.
CONTENT_SOURCE_TYPE_FEATURE_NAME Reserved for internal use only. Do not change its value. The default value is ContentSourceType.
CONTENT_SOURCE_TYPE_FEATURE_VAL_PORTAL Reserved for internal use only. Do not change its value. The default value is Portal.
CONTENT_SOURCE_TYPE_FEATURE_VAL_WEB Reserved for internal use only. Do not change its value. The default value is Web.
SecurityResolverId Reserved for internal use only. Do not change its value. The default value is com.ibm.lotus.search.plugins.provider.core.PortalSecurityResolverFactory.
SetProperties Reserved for internal use only. Do not change its value. Possible values are on or off. The default value is on.
startup Reserved for internal use only. Do not change its value. The default value is false.
VALIDATE_COOKIE Reserved for internal use only. Do not change its value. The default value is 123.
WORK_MANAGER Specify the work manager. Reserved for internal use only. Do not change its value. The default value is wps/searchIndexWM.
WORK_MANAGER_DEPLOY Example: wps/searchIndexWM.
WORK_MANAGER_NATIVE WORK_MANAGER for native threads, for debug purposes only. Example: force.hrl.work.manager.use.native.threads
WORK_MANAGER_NAME JNDI name of the work manager that Portal Search uses.
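The dependencies between PSE_TYPE and the other parameters described in this list can be checked before a search service is saved. The following is a minimal sketch of such a check; the function name missing_parameters and the dictionary representation are assumptions for illustration and are not part of the portal.

```python
# Parameters that this topic describes as mandatory for each PSE_TYPE value.
REQUIRED_BY_TYPE = {
    "localhost": [],
    "ejb": ["EJB", "IIOP_URL", "DefaultCollectionsDirectory"],
    "soap": ["SOAP_URL", "DefaultCollectionsDirectory"],
}


def missing_parameters(params: dict[str, str]) -> list[str]:
    """List the mandatory parameters that have no value for the chosen type."""
    # PSE_TYPE is optional for a local service and defaults to localhost.
    pse_type = params.get("PSE_TYPE") or "localhost"
    if pse_type not in REQUIRED_BY_TYPE:
        raise ValueError(f"PSE_TYPE must be localhost, ejb, or soap, got {pse_type!r}")
    return [name for name in REQUIRED_BY_TYPE[pse_type] if not params.get(name)]


remote_ejb = {
    "PSE_TYPE": "ejb",
    "EJB": "ejb/com/ibm/hrl/portlets/WsPse/WebScannerLiteEJBHome",
}
print(missing_parameters(remote_ejb))
```

For the sample input, the check reports IIOP_URL and DefaultCollectionsDirectory, because both are required for a remote EJB search service and neither has a value.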
Example
RESOURCE_ENVIRONMENT_PROVIDER_NAME SearchPropertiesService
facetedFields
WORK_MANAGER_DEPLOY wps/searchIndexWM
EJB_Example ejb/com/ibm/hrl/portlets/WsPse/WebScannerLiteEJBHome
DefaultCollectionsDirectory
CONTENT_SOURCE_TYPE_FEATURE_NAME ContentSourceType
EJB
MAX_BUILD_BATCH_SIZE 10000
fieldTypes
WORK_MANAGER_NATIVE force.hrl.work.manager.use.native.threads
IIOP_URL
VALIDATE_COOKIE 123
WORK_MANAGER_NAME wps/searchIndexWM
PortalCollectionSourceName Portal Content Source
CONTENT_SOURCE_TYPE_FEATURE_VAL_PORTAL Portal
PSE_TYPE localhost
HTTP_MAX_BODY_SIZE_MB 20
MAX_BUILD_INTERVAL_TIME_SECONDS 300
SetProperties on
startup false
PortalCollectionName Default Search Collection
IIOP_URL_Example iiop://localhost:2811
CLEAN_UP_TIME_OF_DAY_HOURS 0
mappedFields
SOAP_URL_Example http://localhost:10000/WebScannerSOAP/servlet/rpcrouter
OPEN_WCM_WINDOW /wps/myportal/wcmContent?WCM_GLOBAL_CONTEXT=
SecurityResolverId com.ibm.lotus.search.plugins.provider.core.PortalSecurityResolverFactory
DEFAULT_acls_FIELDINFO contentSearchable=false, fieldSearchable=true, returnable=true, sortable=false, supportsExactMatch=true, parametric=false, typeAhead=false
SOAP_URL
CONTENT_SOURCE_TYPE_FEATURE_VAL_UPLOAD Upload
CONTENT_SOURCE_TYPE_FEATURE_VAL_WEB Web
OpenResultMode new
SEARCH_SECURITY_MODE SECURITY_MODE_PRE_POST_FILTER
boostingSettings
Specify which metadata fields are given extra weight in the overall rank score during a search, and how much the selected metadata fields contribute to the relevance calculation. In addition to title, description, and keywords, the setting can include any other metadata field with string-based values. Specify each boost value in the range 1.0 to 10.0, and use it with care; the suggested range is 1.0 to 3.0. See also: Search Integration in HCL WebSphere Portal
fieldBoost Define which metadata fields have extra weight when search results are returned, and how much extra weight is given to the specified fields. Provide the following attributes:
field The string-based metadata field that search should favor. Field values we can include are title, description, and keywords.
boost The relative amount of extra weight added to the rank score. Set a value between 1.0 and 10.0. The suggested range is 1.0 to 3.0.
phraseBoost Optional variable that focuses search results on specified languages, for example, English.
boostingSettings examples:
- phraseBoost: {Enabled:true}, fieldBoost: {field:title, boost:3.0}, {field:publishdate, boost:4.0}, {field:description, boost:3.0}, {field:keywords, boost:2.0}
- phraseBoost: {Enabled:true}, fieldBoost: {field:title, boost:3.0}, {field:creation, boost:4.0}, {field:description, boost:3.0}, {field:keywords, boost:2.0}
- phraseBoost: {Enabled:true}, fieldBoost: {field:title, boost:3.0}, {field:lastmodified, boost:4.0}, {field:description, boost:3.0}, {field:keywords, boost:2.0}
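A value in the format of the examples above can be assembled and range-checked with a short helper before it is entered in the Manage Search portlet. The following is a minimal sketch; the function name build_boosting_settings is hypothetical, and the output format is copied from the documented examples rather than from a formal syntax definition.

```python
def build_boosting_settings(field_boosts: dict[str, float]) -> str:
    """Build a boostingSettings value in the format of the documented examples.

    Rejects boost values outside the documented range of 1.0 to 10.0.
    The phraseBoost part is emitted exactly as it appears in the examples.
    """
    parts = []
    for field, boost in field_boosts.items():
        if not 1.0 <= boost <= 10.0:
            raise ValueError(
                f"boost for {field!r} must be between 1.0 and 10.0, got {boost}"
            )
        parts.append(f"{{field:{field}, boost:{boost:.1f}}}")
    return "phraseBoost: {Enabled:true}, fieldBoost: " + ", ".join(parts)


settings = build_boosting_settings(
    {"title": 3.0, "description": 3.0, "keywords": 2.0}
)
print(settings)
```

For the sample input, the result is the same string as the value shown for boostingSetting_Example in the parameter list.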
If we change the boostingSettings parameter, the modification applies only to search collections that are newly created in the search service. Existing search collections are not affected by the updated parameter value.
To apply changed boostingSettings to an existing search collection:
- Document the settings of the existing search collection.
- Export the existing collection as a backup.
- Change boostingSettings for the search service.
- Delete the existing search collection.
- Re-create the search collection.
The new search collection should use the new boostingSettings.