Search issues

See also: support.hcltechsw.com
CRAWLING_RESULT_HTTP_ERROR

Issue: Cannot create a search collection content source. The following errors are received: EJPJO0046E and CRAWLING_RESULT_HTTP_ERROR with HTTP status code 200.

Error EJPJO0046E: Failed to connect to content source WCM Content Source. Either an incorrect URL is specified, the authentication information for the content source is incorrect, or the site is blocked by robot.txt.

com.ibm.hrl.portlets.WsPse.PortalWebScannerException: EJPJO0046E: Failed to connect to content source WCM Content Source. Either an incorrect URL is specified, the authentication information for the content source is incorrect, or the site is blocked by robot.txt.
at com.ibm.hrl.portlets.WsPse.GeneralUtils.throwPortalWebScannerException(GeneralUtils.java:268)
more........
com.ibm.lotus.search.engine.SearchAdminException: SearchAdminService.verifyCrawler() returned result code CRAWLING_RESULT_HTTP_ERROR and http status code 200
at com.ibm.lotus.search.engine.SearchEngineWebScannerLiteImp.checkCrawler(SearchEngineWebScannerLiteImp.java:709)

Cause: The most common cause is that Portal is unable to negotiate an SSL connection with the target Portal. In another case, this was caused by a firewall blocking connections to the Portal server from the remote search server on port 443.
Resolution:

HTTP URLs for the seedlist will be redirected to a secure HTTPS connection. If using the remote search service, verify that you have imported the certificate from the Portal Server into the Remote Search Server. If using a standalone Portal, verify that you have imported that Portal's certificate into its own truststore.
WAS Admin Console > Security > SSL Certificate and Key Management > Key stores and certificates > NodeDefaultTrustStore > Signer Certificates > Retrieve from Port (usually port 443)

This issue has been seen multiple times when the incorrect host name is used for the credentials in the Security Tab.

When creating the content source, click the Security Tab and verify an admin userid and password are entered as well as the CORRECT host name for the server. After entering this info, be sure to hit the "Create" button in the upper right corner of the panel before hitting the "Create" button at the bottom of the panel to actually create the content source.
Check whether a firewall is blocking connections to the Portal server from the remote search server on port 443. Try "telnet host port".
Try the seedlist URL in a browser to verify that it works (no hidden spaces, etc.).
Uncheck the robots.txt option under the Advanced tab (only available for some content source types).
Check whether the wrong content source type was selected, e.g. Web Site instead of WCM Site.
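
Where telnet is not installed, the firewall check above can be scripted. The following is a minimal Python sketch and not part of the product; the function name can_connect and the host portal.example.com are hypothetical placeholders. It only verifies that a TCP connection can be opened. It does not validate the certificate or the seedlist itself.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

Run from the remote search server, a call such as can_connect("portal.example.com", 443) that returns False would point to a firewall or listener problem rather than a certificate problem.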

Issue with deleting a Portal search collection

Issue: You are encountering problems deleting a portal search collection.
Cause: The deletion fails via the icon in the Manage Search UI because the collection is faulty.
Resolution: There are several different ways to delete a portal search collection.

Manage Search Portlet
First try deleting the collection using the trash can icon under...
Manage Search > Collections from all Services

Manage Faulty Collections
Check if any collections are listed under...
Manage Search > Search Services > (select search service)

Look for a link at the bottom of the page "Manage faulty collections". If present, use that link. If not, proceed to the steps below.

ConfigEngine Task

Take a co-ordinated backup of the Portal file system and database. It is always a good idea to have a recent backup anyway.
Remove the Default Search Service and associated collections:
ConfigEngine action-delete-search-services-and-collections-wp.search.service

The ConfigEngine task above will not delete any remote search services. It was designed to delete the Default Search Service and its associated collections after a remote search service has been configured. After running the task, the Default Search Service will not be recreated during Portal restart.
If the task above does not completely remove the collections see: Removing search collections
If additional steps are needed to delete the collections in question, proceed to the manual steps below.

Delete Collections from the file system

Backup then remove the actual collection directories.
Sample location/contents:
ls -l /opt/HCL/wp_profile/PortalServer/collections
drwxr-xr-x 4 portal portal 256 Dec 16 2015 DefaultSearchCollection.6a1ad4be

Note that it may be necessary to stop the Portal server or remote search server if these directories cannot be deleted.
Delete the collections from the collections backup directory IF there is one configured in the search service. For example:
<PortalServer_root>/collections_config_backup

To determine the location of the backup directory, click the Edit/Pencil icon next to the search service and look for parameter: RECOVERY_BACKUP_LOCATION
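
The "backup then remove" step for a collection directory can be sketched as follows. This is an illustrative Python helper, not a product tool; the function name backup_and_remove and the archive naming are assumptions. It writes a compressed archive of the directory first and deletes the original only after the archive has been closed.

```python
import shutil
import tarfile
from pathlib import Path


def backup_and_remove(collection_dir: Path, backup_dir: Path) -> Path:
    """Archive collection_dir into backup_dir, then delete the original."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    archive = backup_dir / (collection_dir.name + ".tar.gz")
    with tarfile.open(str(archive), "w:gz") as tar:
        # arcname keeps paths inside the archive relative to the collection.
        tar.add(str(collection_dir), arcname=collection_dir.name)
    # Remove the directory only once the archive has been written and closed.
    shutil.rmtree(str(collection_dir))
    return archive
```

The backup directory should be outside the collections directory so that the archive is not picked up as part of a collection.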

JCR

If deleting JCR Collections, backup then remove the JCR collection properties files. Sample location and directory contents:
ls -l /opt/HCL/wp_profile/PortalServer/jcr/searchIndexes
-rw-r--r-- 1 portal portal 400 Nov 15 10:44 JCRCollection1.properties

PSE
Check if there are any remaining "PSE Sources":
As wpsadmin, go to:
Administration (pencil icon) > Resource Permissions > PSE Sources

If there are any present that need to be deleted, it will require use of SQL against the database. Check what is currently in the tables with these queries:
SELECT * FROM release.PSE_SOURCE;
SELECT * FROM release.PSE_DESC;
SELECT * FROM release.PROT_RES WHERE RES_TYPE IN (43);

Open a case with HCL Support and include the output from the above queries so that Support can suggest how to remove stale PSE Sources. In the meantime, verify that you have a current database backup, per the recommendation above.

java.net.SocketTimeoutException reading seedlist

Issue: The following error message occurs in the SystemOut.log of the HCL WebSphere Portal Server:
SeedlistCrawl E Failed to execute http request for url http://host:port/authoring/seedlist/server?Action=GetDocuments&Format=ATOM&Locale=en_US&Range=100&Source=com.ibm.lotus.search.plugins.seedlist.retriever.jcr.JCRRetrieverFactory&Start=0&SeedlistId=1@OOTB_CRAWLER1

java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method) more...
at com.ibm.lotus.search.engine.SeedlistCrawler.processSeedlistPage(SeedlistCrawler.java:389)

Cause: The seedlist crawler does not have enough time to fetch documents.
Resolution: Edit the collection under...
Administration > Manage Search

...and increase the value...
Stop fetching a document after (sec)

High availability of the remote search service in an HCL WebSphere Portal cluster

Question: What is the support statement for high availability of the remote search service in an HCL WebSphere Portal cluster?
Cause: To support Portal Search in a clustered environment, you must install and configure the remote search service on a WebSphere Application Server node that is not part of the HCL WebSphere Portal server cluster.
Answer: At this time, there is no supported configuration for high availability of the remote search service in an HCL WebSphere Portal cluster.
An HCL WebSphere Portal remote search service (WebScannerSoap.ear or WebScannerEjbEar.ear) cannot be in a cluster itself. It does not benefit from HCL WebSphere Portal cluster scalability improvements. If you add new nodes to the cluster, you do not improve the search capacity.

Passing multiple keywords with single quotes produces an invalid query statement exception

Observed Behavior: When a customer specifies multiple keywords that use single quotes for abbreviations, they are not handled properly by the WCM/JCR query parser. It only works when a single keyword is specified.
Expected Behavior: When a customer specifies multiple keywords with single quotes, the behavior should be the same as when a single keyword with single quotes is specified: the WCM objects that match the specified keywords are found.
Defect Status: Resolved
Designated/Resolved version: CF212
Problem Resolution: Apply CF212 which resolves defect DXQ-27192.

Why does the Portal Search REST API not show the exact number of results?

Question: Why does the Portal Search REST API not indicate the exact number of results returned?
Enter a URL such as this:
http://hostname.mydomain.com:10039/wps/mycontenthandler/searchfeed/search?index=Default+Search+Service::C:\IBM\WebSphere\wp_profile\PortalServer\collections\articles&query=article&pageSize=2&page=2&sortkey=relevance&lang=en&facet={%22id%22:%20%22category%22,%20%22count%22:%20%22ALL%22,%20%22sortOrder%22:%20%22DESC%22}

The opensearch:totalResults field contains the number of results, but the actual number of results is higher. Why?
Answer: To have exactMatch set to true, you have to change the security filtering mode. In the search service properties, set SEARCH_SECURITY_MODE to SECURITY_MODE_PREFILTER. Security filtering is then done on the search server side only, using the ACL information stored in the search collection, which should be sufficient in the majority of cases. The default mode is pre- and post-filtering. This is why exactMatch is set to false by default: the client-side filtering could potentially still remove a few entries.
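
The sample request above mixes raw backslashes and hand-encoded JSON, which is easy to get wrong when testing. The following Python sketch builds the same kind of URL with every parameter value percent-encoded. The endpoint path and parameter names are copied from the sample URL; the function name build_search_url and its defaults are assumptions for illustration only.

```python
import json
from urllib.parse import urlencode


def build_search_url(base: str, index: str, query: str,
                     page: int = 1, page_size: int = 10) -> str:
    """Build a searchfeed URL with every parameter value percent-encoded."""
    # Facet definition taken from the sample request above.
    facet = {"id": "category", "count": "ALL", "sortOrder": "DESC"}
    params = {
        "index": index,
        "query": query,
        "pageSize": page_size,
        "page": page,
        "sortkey": "relevance",
        "lang": "en",
        "facet": json.dumps(facet),
    }
    # urlencode escapes spaces, colons, backslashes, and the JSON braces.
    return base + "/wps/mycontenthandler/searchfeed/search?" + urlencode(params)
```

Fetching the resulting URL before and after changing SEARCH_SECURITY_MODE is one way to compare the reported opensearch:totalResults values.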

Directory Search via the WCM UI throws error

Issue: While performing a Directory Search via the WCM UI the following error appears on the UI...
Too many names were found. Type more characters of the name, then search again.

Searching for other strings is successful, but when searching for this particular string it gives the error above.
Cause: This can be caused by the "picker settings" search limits being reached.
To confirm, enable tracing and recreate the problem.
Then search the trace output for this type of message:
[1/13/21 10:52:55:777 EST] 000001b5 ProfileManage < com.ibm.ws.wim.ProfileManager searchRepository RETURN returning 53 entities

Resolution: To resolve the problem, increase the search limits in the "picker settings". This change needs to be performed on each node in the cluster to ensure that all nodes pick up the setting.
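
When the trace file is large, the relevant entries can be extracted with a short script. The Python sketch below is illustrative only; it assumes the trace entries have the same wording as the sample message above, and the names RETURN_PATTERN and entity_counts are hypothetical.

```python
import re

# Matches the ProfileManager trace entry quoted above and captures the count.
RETURN_PATTERN = re.compile(r"searchRepository RETURN returning (\d+) entities")


def entity_counts(trace_lines):
    """Return the entity count from every matching trace line, in order."""
    counts = []
    for line in trace_lines:
        match = RETURN_PATTERN.search(line)
        if match:
            counts.append(int(match.group(1)))
    return counts
```

Comparing the returned counts with the configured picker limit shows whether the failing search string exceeds it.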

Portal Search Crawler returns no documents or a subset of the expected number of documents

Issue: Portal Search Crawler returns no documents or a subset of the expected number of documents. When pasting the seedlist URL into a browser it returns the expected number of documents.
Cause: The problem is caused by a misconfiguration.
Items to check:

Access the seedlist settings in the Integrated Solutions Console (WAS Admin Console) under...
Resources > Resource Environment > Resource Environment Providers > WCM_SearchService > custom properties

Reduce the value of SearchService.DefaultSeedPageSize from 200 to 50
Set...
SearchService.SearchSeed.ExcludeFileAttachments=true

In many cases the business does not require documents referenced by content items to be crawled.
Try using a large "range" parameter at the end of the seedlist URL, e.g. &range=3000
Verify the user configured on the "Security" Tab for the Content Source has assigned roles/access to the items that are missing.
Access the WCM Authoring Portlet and verify that one of the missing documents can be previewed when logged in as the user configured on the "Security" tab. Check what happens when clicking the preview button: a message may appear stating that no valid Presentation Template / Authoring Template mapping exists for that content. Adding a valid mapping to the Site Area that contains the content helps to find all expected documents during the next crawl.
Verify that all site areas you wish to include in the collection have the box checked "Include this item in search collections. Changes in that section will take effect next time the collection is updated." To see this checkbox, click the option "Show Hidden Fields".
Verify the authoring template for the content you wish to crawl has the box "Search Collection Visibility" checked under "Default content properties"
Increase SEEDLIST_PAGE_TIMEOUT in the Search Service Configuration, especially if the seedlist URL is returning a large number of items.
For details, please check: Search service configuration parameters
Edit the content source then experiment with different values for the setting "Stop fetching a document after"

Review Hints and tips for Portal Search crawls
Re-crawl the collection or click the run icon for the content source after each config change
If all above steps do not help, the collection may be faulty. Delete and recreate the collection and content sources.
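
To compare what the seedlist returns with what the crawler indexed, the entries in a saved seedlist page can be counted. The Python sketch below assumes that the page is a standard Atom feed (the seedlist URL shown elsewhere in this document uses Format=ATOM) and that each document is an Atom entry element; the function name count_entries is hypothetical.

```python
import xml.etree.ElementTree as ET

# Standard Atom namespace; assumed to be the one used by the seedlist output.
ATOM_NS = "{http://www.w3.org/2005/Atom}"


def count_entries(seedlist_xml: str) -> int:
    """Count the Atom entry elements in one seedlist page."""
    root = ET.fromstring(seedlist_xml)
    # iter() walks the whole tree, so nesting depth does not matter.
    return sum(1 for _ in root.iter(ATOM_NS + "entry"))
```

If the count for a page is well below the requested range, the missing items are more likely filtered by access rights or search visibility settings than by a timeout.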

While attempting to upgrade the Remote Search Server using IIM the following errors occur...WASX7070E and ADMC0016E

Issue: The Remote Search Server was previously installed using the manual procedure and has been running with no problems. We recently attempted to upgrade the Remote Search Server using IIM and the following error occurred....
[wsadmin] WASX7015E: Exception running command: "source "/opt/WebSphere/9.0/AppServer/profiles/prs_profile/ConfigEngine/config/work/was/curJaclScript.jacl""; exception information:
[wsadmin] com.ibm.ws.scripting.ScriptingException: WASX7070E: The configuration service is not available.
[wplc-create-ear] Exception found when executing wsadmin
BUILD FAILED
/opt/WebSphere/9.0/PortalRemoteSearch/dcs/wp.dcs.remotedcs/config/includes/wp.dcs.remotedcs_cfg.xml:37: Exception found when executing wsadmin

Total time: 9 seconds

stop-wsadmin-listener:
Mon Apr 12 09:58:09 EDT 2021
[wplc-discard-wsadmin-session] [04/12/21 09:58:09.117 EDT] FFDC1007I: FFDC Provider Installed: com.ibm.ffdc.util.provider.FfdcOnDirProvider@94c98012
[wplc-discard-wsadmin-session] [04/12/21 09:58:09.232 EDT] CWPKI0041W: One or more key stores are using the default password.
[wplc-discard-wsadmin-session] [04/12/21 09:58:09.234 EDT] CWPKI0051I: The process has the java security property jdk.certpath.disabledAlgorithms set to [MD2, MD5, SHA1 jdkCA & usage TLSServer, RSA keySize < 1024, DSA keySize < 1024, EC keySize < 224]. The WebSphere Application server is setting the java security property jdk.certpath.disabledAlgorithms to [MD2, RSA keySize < 1024, MD5].
[wplc-discard-wsadmin-session] [04/12/21 09:58:09.235 EDT] CWPKI0051I: The process has the java security property jdk.tls.disabledAlgorithms set to [SSLv3, RC4, DES, MD5withRSA, DH keySize < 1024, DESede, EC keySize < 224, 3DES_EDE_CBC, anon, NULL, DES_CBC]. The WebSphere Application server is setting the java security property jdk.tls.disabledAlgorithms to [SSLv3, RC4, DH keySize < 768, MD5withRSA].
[wplc-discard-wsadmin-session] [04/12/21 09:58:09.236 EDT] CWPKI0027I: Disabling default hostname verification for HTTPS URL connections.
[wplc-discard-wsadmin-session] [04/12/21 09:58:09.251 EDT] CWSCF0002I: The client code is attempting to load the security configuration the server and this operation is not allowed.
/opt/WebSphere/9.0/PortalRemoteSearch/ConfigEngine/config/includes/default_cfg.xml:142: Unable to create remote administration client: ADMC0016E: The system cannot create a SOAP connector to connect to host your_host.com at port 9043.

Cause: The upgrade failed because it was attempted with IIM. Since it was originally installed manually, the upgrade needs to be done manually too.
Resolution:
If the Remote Search Server was originally installed manually, then subsequent upgrades must also be performed manually.
"If you originally installed the remote service by using manual steps, then you must use manual steps to upgrade it after you apply the Combined Cumulative Fix on the portal server."

Search crawler deletes resources

Problem: Certain configurations allow search crawlers to delete HCL WebSphere Portal resources.
Symptom: HCL WebSphere Portal resources (pages, portlets, etc.) are systematically deleted.
Cause:
An unidentified search crawler accesses the administration pages. In traversing the site, it may attempt to follow every link, including Manage Pages' delete, Resource Permissions' delete user from role, etc. Depending on how the search crawler user agent behaves, confirmation pop-ups may be ignored or accepted, resulting in deleted resources, access controls, etc.
Diagnosis: Immediately prior to the failure, HCL WebSphere Portal repeatedly writes the following to SystemOut.log:
SystemOut O WARNING: Unknown User Browser to WCL DeviceContext. Dump UserAgent: UnidentifiedAgent

where "UnidentifiedAgent" is the agent that the search crawler advertises to the site.
Resolution: Restore any deleted or modified resources according to your disaster recovery plan.
If the agent can be identified as a search crawler, HCL WebSphere Portal will not serve it action URLs. For all search crawlers which traverse your site, specify their user agents as search engines according to the documentation. If you allow search crawlers with unidentified agents to crawl sensitive pages, you should configure the default agent to be a search crawler.
As an additional safeguard, if the crawler accesses HCL WebSphere Portal as an authenticated user, we may restrict its access to sensitive resources.
For details, see: Search engines and authenticated pages

Cannot find index PSE.localhost

Issue: You find this in the logs:
SecurityManag W com.ibm.hrl.portlets.WsPse.SecurityManager hasAccess Cannot find index PSE.localhost.D:\HCL\WebSphere\wp_profile\PortalServer\jcr\searchIndexes\JCRCollection5
SeedlistIndex E TS0171E: Error during search service initialization for workspace with id 5 com.ibm.siapi.SiapiException: Message0:
SEVERITY_ERROR: Message ID: [SIAPI0009E] Resource Bundle:[com.ibm.siapi.SiapiResources] Message Text: [] Message Arguments: Arg0:
String:[D:\HCL\WebSphere\wp_profile\PortalServer\jcr\searchIndexes\JCRCollection5]
at com.ibm.hrl.wp.siapi.pseAdapter.util.PSEutil.verifyCollectionSecurityAccess(UnknownSource)
SearchRequest E TS0025E: Error processing text search request.com.ibm.icm.ts.tss.TextEngineException: TS0172E: Error Search Factory or
Search service is not properly initialized,refer to earlier logs.

Cause: You are using a remote search server for the JCR collection and have not configured these settings in JCR_ConfigService_PortalContent:
jcr.textsearch.PSE.type
jcr.textsearch.EJB.IIOP.URL
jcr.textsearch.EJB.EJBName
jcr.textsearch.enabled

Resolution: Set the following properties:
Change the jcr.textsearch.indexdirectory property to point to a directory on the remote search server. For example,
jcr.textsearch.indexdirectory=C:/JCR

Change the jcr.textsearch.PSE.type property to EJB.
Change the jcr.textsearch.EJB.IIOP.URL property to the URL of the naming service that is used to access the WebScanner EJB. For example,
iiop://localhost:2811

Change the jcr.textsearch.EJB.EJBName property to the name of the WebScanner EJB. For example, ejb/com/ibm/hrl/portlets/WsPse/WebScannerLiteEJBHome.
Change the jcr.textsearch.enabled value to true.
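
The settings above can be checked against an exported copy of the properties before restarting. The following Python sketch is illustrative and not a product utility; the function name check_jcr_textsearch is hypothetical, and it assumes the properties have already been loaded into a dictionary of strings.

```python
# Properties that must have a specific value, per the resolution steps above.
REQUIRED_VALUES = {
    "jcr.textsearch.PSE.type": "EJB",
    "jcr.textsearch.enabled": "true",
}

# Properties that must be set, but whose value depends on the environment.
REQUIRED_NON_EMPTY = (
    "jcr.textsearch.indexdirectory",
    "jcr.textsearch.EJB.IIOP.URL",
    "jcr.textsearch.EJB.EJBName",
)


def check_jcr_textsearch(props: dict) -> list:
    """Return a list of human-readable problems; empty means all checks pass."""
    problems = []
    for key, expected in REQUIRED_VALUES.items():
        actual = props.get(key)
        if actual != expected:
            problems.append(f"{key} should be {expected!r}, found {actual!r}")
    for key in REQUIRED_NON_EMPTY:
        if not props.get(key):
            problems.append(f"{key} is not set")
    return problems
```

The check only confirms that values are present; it cannot confirm that the IIOP URL or the index directory is reachable from the Portal server.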

Remote search causes error "WSVR0068E: Attempt to start EnterpriseBean PSEStandalone#PSEStandaloneEJB.jar#WebScannerLiteEJB failed"

Issue: When setting up Portal Remote Search the following exception is thrown in the SystemOut.log while starting the PSEStandalone App on the Remote Search server:
WSVR0068E: Attempt to start EnterpriseBean PSEStandalone#PSEStandaloneEJB.jar#WebScannerLiteEJB failed with exception: com.ibm.ejs.container.EJBConfigurationException: Bean class com.ibm.hrl.portlets.WsPse.WebScannerLiteEJB could not be loaded
at com.ibm.ws.metadata.ejb.EJBMDOrchestrator.loadCustomerProvidedClasses(EJBMDOrchestrator.java:4151)
Bean class com.ibm.hrl.portlets.WsPse.WebScannerLiteEJB could not be found or loaded
Caused by: java.lang.ClassNotFoundException: javax.ejb.EJBObject

In addition, the contents of <wp_profile-root>/installableApps/extract/lib were outdated.
Resolution:

Unzip the PSELibs.zip file from <portal-server-root>/search/wp.search.libs to <wp_profile-root>/installableApps/extract/lib
Restart the server. This should resolve the issue.
Also check whether the PSE Shared Library has been mapped to the PSEServlet instead of the PSEStandalone application.
From remote search server console go to:
Applications > Enterprise Applications > PSEStandalone > Shared library references

Verify that the PSE Shared Library is mapped to "PSEStandalone".
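
The unzip step can be scripted where no unzip tool is available on the remote search server. This is a minimal Python sketch under stated assumptions: the function name extract_pse_libs is hypothetical, the two paths must be supplied by the caller, and files with the same name in the target directory are overwritten while other existing files are left in place.

```python
import zipfile
from pathlib import Path


def extract_pse_libs(zip_path: Path, target_dir: Path) -> list:
    """Extract the archive into target_dir and return the names it contained."""
    target_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(str(zip_path)) as archive:
        # extractall overwrites files that already exist with the same name.
        archive.extractall(str(target_dir))
        return archive.namelist()
```

Comparing the returned names with a directory listing afterwards helps to spot outdated files that the archive did not replace.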

How to include the files (e.g. PDF or DOC) located in WCM File Resource Components in search results?
Question: How can files (e.g. PDF or DOC files) located in Web Content Management (WCM) File Resource Components be included in search results?
Answer: The WCM Seedlist is based on WCM Content Items, which implicitly provide a "rendering context", that is, a path for a URL to render the content on the page. To include files from File Resource Components, they must be linked to content items in some manner. The options are:

Add a file resource element within a content item and upload the file to that element instead of using a File Resource Component
Add a file resource reference element to the content item to point to the File Resource Component
Use the WCM API to generate URLs to all the library file resource components, then point a crawl URL to the .jsp
Use WCM API to create one piece of content per library file resource component

If the first two options above are not working as expected, check the setting for "SearchService.SearchSeed.ExcludeFileAttachments" located in:
Resources > Resource Environment > Resource Environment Providers > WCM_SearchService > custom properties

Cannot create Portal search collection content source
Issue: You try to create a Portal search collection content source and it fails. These exceptions are in the logs:
EJPJO0105E: Failed to execute content source myCompany in collection /usr/PSEcollections/WPS8.5/myCollection.com.ibm.hrl.portlets.WsPse.PortalWebScannerException:
EJPJO0105E: Failed to execute content source myCompany ....
java.security.PrivilegedActionException: java.lang.reflect.InvocationTargetException
at java.security.AccessController.doPrivileged (AccessController.java:462) ....
com.ibm.lotus.search.engine.SearchAdminException: Failed to start crawler as SearchAdminService.verifyCrawler() returned result code CRAWLING_RESULT_BAD_HTTP_STATUS .......

Cause: This issue is commonly caused by failing to import the certificate from the Portal server into the truststore on the remote search server.
Resolution: This can be resolved by importing the certificate from the Portal server to the remote search server.

From the WAS admin console of the remote search service server
Security > SSL Certificate and Key Management > Key stores and certificates > NodeDefaultTrustStore > Signer Certificates > Retrieve from Port

Enter the portal server host, its SSL port, and an alias.

Click Retrieve Signer Information.

Click OK.

Note: use the port "WC_adminhost_secure" on the Portal server. We can determine this port number by logging into the Deployment Manager on the Portal:
Cluster > Servers > select the server > Ports

More Information:
https://help.hcltechsw.com/digital-experience/9.5/admin-system/sso_portal_rss.html

After installing the latest cumulative fix Portal search application does not start

Issue: After installing the latest cumulative fix Portal search application does not start.
Logs show the following error:
PortalCollect E com.ibm.hrl.portlets.WsPse.PortalCollectionsService
createWebScannerLite EJPJO0032E: Unable to create Webscanner com.ibm.lotus.search.engine.SearchAdminException: Failed to get crawler status from persistence layer.
at com.ibm.lotus.search.engine.SearchAdminService.init(SearchAdminService.java:688)
...
Caused by:
com.ibm.lotus.search.engine.PersistenceException: Failed to parse CrawlerStatus xml.
at com.ibm.lotus.search.engine.PersistenceService.getCrawlerStatus(PersistenceService.java:313)
at com.ibm.lotus.search.engine.SearchAdminService.init(SearchAdminService.java:686)
...
Caused by: org.xml.sax.SAXParseException: Premature end of file.
at org.apache.xerces.parsers.DOMParser.parse(Unknown Source)

SystemErr.log:
SystemErr R [Fatal Error] :-1:-1: Premature end of file.

Cause: The above exception means that parsing of the crawler status is failing. The crawler statuses are kept in plain XML files on disk, and the server should tolerate the state in which the files are not present on disk.
Resolution: To resolve the issue, remove the bad "crawler status" XML files from their location and restart the server. Here is the sample path:
wp_profile_root/PortalServer/CollectionsConfig

Only remove the files named crawlers_status_***.xml. The server should then start cleanly.
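
Before removing anything, it can help to identify which status files actually fail to parse. The Python sketch below is illustrative only; the function name find_broken_status_files is hypothetical, and it assumes the files follow the crawlers_status_ naming shown above. It reports files and does not delete them.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def find_broken_status_files(config_dir: Path) -> list:
    """Return the crawler status files in config_dir that are not valid XML."""
    broken = []
    for path in sorted(config_dir.glob("crawlers_status_*.xml")):
        try:
            ET.parse(str(path))
        except ET.ParseError:
            # An empty or truncated file ends up here.
            broken.append(path)
    return broken
```

An empty file is the typical case behind a "Premature end of file" parse error, so zero-length files in the result are the first candidates for removal.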

Error "collection already exists" trying to recreate a collection

Issue: You try to recreate a collection that was previously deleted and you receive the error: "collection already exists"
Cause: This can happen if you have manually deleted the collection directory and removed the entry for that collection from the Search Admin Portlet. The Search Admin Portlet may also indicate that no corrupt collections exist. However, there may be a backup collection directory configured in the search service. This backup copy prevents the recreation of the search collection.
Resolution: Determine the location of the backup configuration. For example:
../portalserver/collections_config_backup

To determine the location of the backup directory, click the Edit/Pencil icon next to the search service and look for parameter:
RECOVERY_BACKUP_LOCATION

In addition, check if the PSE Source for the collection still exists under:
Administration > Access > Resource Permissions > PSE Sources

If the PSE Source is still there, open a case with HCL Support to assist with deletion of the resource.

How to identify large file resources in the JCR Database?

When using HCL WebSphere Portal, it might be necessary to identify large file resources in the JCR database. This document provides hints for finding such large files.
It is possible to use SQL queries, as in the following example, to find items larger than 10 MB (the example uses Microsoft SQL Server syntax):
select componentid, itemid, versionid, compkey, attr0000001057, attr0000001092, attr0000001151, attr0000001183
from wcm_icmadmin.icmut01345001 with (nolock)
where (datalength(wcm_icmadmin.icmut01345001.attr0100600126) > 10000000)
order by attr0000001057 desc, itemid, versionid

Use the WCM "findlargeresources" module. This module is included out of the box for Portal 8.x. Here is a sample invocation of the module that returns a list of all file resources in the WCM Database over 1000 bytes in size:
http://myserver.mycompany.com:10039/wps/wcm/myconnect?MOD=findlargeresources&min_size=1000

Also check:
PK75187: UTILITY TO FIND LARGE RESOURCE FILES (IMAGES AND OTHER FILES) STORED IN WCM

Exclude portlets or pages when building Search Collection with Portal Seedlist

In HCL WebSphere Portal it is possible to build search collections. In some situations it might be necessary to exclude some portlets or pages from those collections. This document provides the steps needed to exclude the portlets and pages.
To exclude portlets from "Portal" Seedlist output configure the portlet under Administration > Portlets and add a parameter:

Parameter name: INCLUDE_IN_SEARCH_INDEX
Parameter value: false

All portlets on a page must have this config setting for the page to be excluded by the Portal Seedlist.

Configure search with external security TAI

Issue: You have a Portal search collection that cannot use a content source URL that is protected by the external security manager.
Cause: The Portal Search crawler has no mechanism to authenticate into an external security manager. When configuring a Portal search collection, the content source URL must not be protected by the external security manager. The external security manager considers the HCL WebSphere Portal search engine crawler to be an anonymous user.
Resolution:
An example of the auto-generated URL for a Portal site content source is:
http://hostname.domain.com:10039/wps/portal/!ut/p/c0/04_Sj9CP1I8y04_MydQvDHVUBAAnJsjH/?WPSRedirectURL=http%3A%2F%2F%2Fwps%2Fmyportal%2F%21ut%2Fp%2Fc0%2F04_SB8K8xLLM9MSSzPy8xBz9CP0os3g_f6NQNxNPQ0MLM1dDAyMzDxMnnzBPA39vQ_2CbEdFAMYKaQ4%21%2F
If the Tivoli Access Manager for e-business standard WebSEAL junction is /portaljunction, then based on the above auto-generated URL, the following URL must be unprotected by the external security manager:
http://hostname.domain.com:10039/portaljunction/wps/myportal/!ut/p/c0/04_SB8K8xLLM9MSSzPy8xBz9CP0os3g_f6NQNxNPQ0MLM1dDAyMzDxMnnzBPA39vQ_2CbEdFAMYKaQ4!

Otherwise, you will not retrieve the intended content and might receive a message similar to the following:
EJPJP0009E: Wrong root url for Portal site crawler.

What follows are some sample Tivoli Access Manager for e-business pdadmin commands that make the above content source URL available to anonymous users such as the Portal Search Engine crawler:

pdadmin sec_master> acl create acl_websphere_portal_search_crawler
pdadmin sec_master> acl modify acl_websphere_portal_search_crawler set any-other Tr
pdadmin sec_master> acl modify acl_websphere_portal_search_crawler set unauthenticated Tr
pdadmin sec_master> acl show acl_websphere_portal_search_crawler
ACL Name: acl_websphere_portal_search_crawler
Description:
Entries:
User sec_master TcmdbsvaBRl
Any-other Tr
Unauthenticated Tr
pdadmin sec_master> acl attach /WebSEAL//portaljunction/wps/myportal/!ut/p/c0/04_SB8K8xLLM9MSSzPy8xBz9CP0os3g_f6NQNxNPQ0MLM1dDAyMzDxMnnzBPA39vQ_2CbEdFAMYKaQ4! acl_websphere_portal_search_crawler
pdadmin sec_master> acl find acl_websphere_portal_search_crawler
/WebSEAL//portaljunction/wps/myportal/!ut/p/c0/04_SB8K8xLLM9MSSzPy8xBz9CP0os3g_f6NQNxNPQ0MLM1dDAyMzDxMnnzBPA39vQ_2CbEdFAMYKaQ4!

When the end user executes a search query and gets a list of search results, WebSEAL will automatically rewrite the URL links within the HTML response. WebSEAL will ensure the URL links sent from back-end HCL WebSphere Portal servers across standard WebSEAL junctions are reconstructed appropriately for use by clients. For example, in the URL links, the hostname will change from HCL WebSphere Portal to WebSEAL and the junction name will be inserted.

Attempting to upgrade the Remote Search Server. Error occurs....'CRIMA1006E...cannot exist in the same package group'

Issue: While trying to upgrade the Remote Search Server using IIM from the command line....
./imcl install com.ibm.websphere.PORTAL.REMOTESEARCH.v85 -repositories /path/to/repository.config -installationDirectory portal_remotesearch_root -acceptLicense

...the following error occurs....
CRIMA1006E ERROR: The following errors were generated while installing.
CRIMA1006E ERROR: Packages Portal Remote Search 8.5.0.0 CF19 and IBM SDK, Java Technology Edition, Version 8 8.0.6.25 cannot exist in the same package group

Resolution: This error condition can be avoided by either using the IIM GUI or, if you want to use the command line, running IIM in console mode. These steps can be found at the link below under the 'Use Console Mode Interface' section:
https://help.hcltechsw.com/digital-experience/9.5/overview/ccf_95_remote_search.html

Sample steps for implementing a search query using WCM HTML Component and Search Component

Create an HTML Component with these contents:
<script>
function addFilter(queryIn)
{
return queryIn;
}
</script>

<form onSubmit="this.search_query.value=addFilter(this.query.value)">
Query: <input name="query"/>
<input type="hidden" name="search_query"/>
</form>

Next create a Search Component with these contents:
Header:
<table>

Results:
<tr><td>
[AttributeResource attributeName="namelink" separator=","]<br>
[AttributeResource attributeName="summary" separator=","] </td></tr>

Footer:
<tr><td>
</td></tr>
</table>

Separator:
<tr><td bgcolor="#FFFAA" colspan="2"/></tr>

NoResultDesign:
There are no results for your query. Please refine your search and try again.

Be sure to select the desired collection to search when creating the Search Component.
For this to work, you need to reference both components created above in the SAME presentation template.
