Search issues

See also: support.hcltechsw.com

CRAWLING_RESULT_HTTP_ERROR

Issue: Cannot create a search collection content source. The errors EJPJO0046E and CRAWLING_RESULT_HTTP_ERROR are reported together with HTTP status code 200.

Cause: The most common cause is that Portal is unable to negotiate an SSL connection with the target Portal. In another case, a firewall was blocking connections from the remote search server to the Portal server on port 443.
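
To tell the two causes apart, it can help to check separately whether a TCP connection to the Portal port can be opened and whether a TLS handshake then succeeds. The following is a minimal Python sketch, not an HCL tool. The function name probe_tls and the host name used in the note below are hypothetical. The handshake is verified against the trust store of the Python runtime, not the WebSphere truststore, so a failure here only hints at where to look.

```python
import socket
import ssl


def probe_tls(host, port=443, timeout=5.0):
    """Return a short status string: 'ok: ...', 'tcp_failed: ...' or 'tls_failed: ...'."""
    # Step 1: plain TCP connect. A failure here suggests a firewall or routing problem.
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError as exc:
        return f"tcp_failed: {exc}"

    # Step 2: TLS handshake with certificate and host name verification.
    # Assumption: the Python default trust store is used, not the WebSphere truststore.
    try:
        context = ssl.create_default_context()
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return f"ok: {tls.version()}"
    except (ssl.SSLError, OSError) as exc:
        return f"tls_failed: {exc}"
    finally:
        sock.close()
```

Run it from the remote search server, for example probe_tls("portal.example.com", 443). A tcp_failed result points toward a firewall or network problem; a tls_failed result points toward certificates or protocol settings.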

Resolution:


Issue with deleting a Portal search collection

Issue: You are encountering problems deleting a portal search collection.

Cause: The deletion fails via the icon in the Manage Search UI because the collection is faulty.

Resolution: There are several different ways to delete a portal search collection.


java.net.SocketTimeoutException reading seedlist

Issue: The following error message occurs in the SystemOut.log of the HCL WebSphere Portal Server:

Cause: The seedlist crawler does not have enough time to fetch documents.

Resolution: Edit the collection under...

...and increase the value...


High availability of the remote search service in an HCL WebSphere Portal cluster

Question: What is the support statement for high availability of the remote search service in an HCL WebSphere Portal cluster?

Cause: To support Portal Search in a clustered environment, you must install and configure the remote search service on a WebSphere Application Server node that is not part of the HCL WebSphere Portal server cluster.

Answer: At this time, there is no supported configuration for high availability of the remote search service in an HCL WebSphere Portal cluster.

An HCL WebSphere Portal remote search service (WebScannerSoap.ear or WebScannerEjbEar.ear) cannot be in a cluster itself. It does not benefit from HCL WebSphere Portal cluster scalability improvements. If you add new nodes to the cluster, you do not improve the search capacity.


Passing multiple keywords with single quotes produces invalid query statement exception

Observed Behavior: When a customer specifies multiple keywords that use single quotes for abbreviations, the WCM/JCR query parser does not handle them properly. It only works when a single keyword is specified.

Expected Behavior: When a customer specifies multiple keywords with single quotes, the behavior should be the same as for a single keyword with single quotes: the query finds the WCM objects that match the specified keywords.

Defect Status: Resolved

Designated/Resolved version: CF212

Problem Resolution: Apply CF212 which resolves defect DXQ-27192.


Why does the Portal Search REST API not show the exact number of results?

Question: Why does the Portal Search REST API not indicate the exact number of results returned?

Enter a URL such as this:

The opensearch:totalResults field contains the number of results.

But the actual number of results is higher. Why?

Answer: To get 'exactmatch=true', you have to change the security filtering mode. In the search service properties, set SEARCH_SECURITY_MODE to SECURITY_MODE_PREFILTER. Security filtering is then done only on the search server side, using the ACL information stored in the search collection, which should be sufficient in the majority of cases. The default mode is pre- and post-filtering. With that mode, exactMatch is set to false because the client-side filtering could still remove a few entries.
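
As an illustration of how a client might read the opensearch:totalResults field and treat it as approximate under the default filtering mode, here is a minimal Python sketch. It assumes that the response is an Atom feed using the standard OpenSearch 1.1 namespace. The function name summarize_feed and the sample feed are hypothetical and are not taken from an actual Portal response.

```python
import xml.etree.ElementTree as ET

# Assumption: standard Atom and OpenSearch 1.1 namespaces.
NS = {
    "atom": "http://www.w3.org/2005/Atom",
    "opensearch": "http://a9.com/-/spec/opensearch/1.1/",
}

# Hypothetical sample feed, for illustration only.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:opensearch="http://a9.com/-/spec/opensearch/1.1/">
  <opensearch:totalResults>25</opensearch:totalResults>
  <entry><title>First result</title></entry>
  <entry><title>Second result</title></entry>
</feed>"""


def summarize_feed(feed_xml):
    """Return (reported_total, entries_on_this_page) for an Atom search feed."""
    root = ET.fromstring(feed_xml)
    reported = int(root.findtext("opensearch:totalResults", default="0", namespaces=NS))
    entries = len(root.findall("atom:entry", NS))
    return reported, entries
```

With the default pre- and post-filtering mode, the reported total is best shown to users as an approximate count rather than used for exact paging arithmetic.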


Directory Search via the WCM UI throws error

Issue: While performing a Directory Search via the WCM UI the following error appears on the UI...

Searching for other strings is successful, but when searching for this particular string it gives the error above.

Cause: This can be caused by the "picker settings" search limits being reached.

To confirm, set the traces and recreate the problem.

Search the trace output for this type of message...

Resolution: To resolve the problem, increase the search limits in the "picker settings". This change needs to be performed on each node in the cluster to ensure all nodes pick up the setting successfully.


Portal Search Crawler returns no documents or a subset of the expected number of documents

Issue: Portal Search Crawler returns no documents or a subset of the expected number of documents. When pasting the seedlist URL into a browser it returns the expected number of documents.

Cause: The problem is caused by a misconfiguration.

Items to check:

  1. Access the seedlist settings in the Integrated Solutions Console (WAS Admin Console) under...

      Resources > Resource Environment > Resource Environment Providers > WCM_SearchService > custom properties

  2. Reduce the value of SearchService.DefaultSeedPageSize from 200 to 50

  3. Set...

      SearchService.SearchSeed.ExcludeFileAttachments=true

    In many cases the business does not require documents referenced by content items to be crawled.

  4. Try using a large "range" parameter at the end of the seedlist URL, for example &range=3000

  5. Verify that the user configured on the "Security" tab for the Content Source has been assigned roles/access to the items that are missing.

  6. Access the WCM Authoring Portlet and verify that you can preview one of the missing documents when logged in as the user configured on the "Security" tab. Check if a user can be found when clicking the preview button. A message may appear stating that no valid Presentation Template / Authoring Template mapping exists for that content. Adding a valid mapping to the Site Area that contains the content helps the crawler find all expected documents during the next crawl.

  7. Verify that all site areas you wish to include in the collection have the box "Include this item in search collections. Changes in that section will take effect next time the collection is updated." checked. To see this checkbox, click the option "Show Hidden Fields".

  8. Verify that the authoring template for the content you wish to crawl has the box "Search Collection Visibility" checked under "Default content properties".

  9. Increase SEEDLIST_PAGE_TIMEOUT in the Search Service Configuration, especially if the seedlist URL is returning a large number of items.

    For details, please check: Search service configuration parameters

  10. Edit the content source then experiment with different values for the setting "Stop fetching a document after"

  11. Set...

  12. Review Hints and tips for Portal Search crawls

  13. Re-crawl the collection or click the run icon for the content source after each config change

  14. If all above steps do not help, the collection may be faulty. Delete and recreate the collection and content sources.
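
For step 4 in the list above, the "range" parameter can be appended to a seedlist URL by hand or with a small helper. The following Python sketch is one way to do it. The function name with_range is hypothetical, and the URL in the usage note is a placeholder, not a real seedlist address.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_range(seedlist_url, size):
    """Return seedlist_url with its 'range' query parameter set to size."""
    parts = urlsplit(seedlist_url)
    # Keep every existing parameter (including empty ones) except an old 'range'.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "range"
    ]
    query.append(("range", str(size)))
    return urlunsplit(parts._replace(query=urlencode(query)))
```

For example, with_range("https://portal.example.com/seedlist?Action=GetDocuments", 3000) returns the same URL with &range=3000 appended. Paste the result into a browser and compare the number of documents with what the crawler reports.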


While attempting to upgrade the Remote Search Server using IIM the following errors occur...WASX7070E and ADMC0016E

Issue: The Remote Search Server was previously installed using the manual procedure and has been running with no problems. We recently attempted to upgrade the Remote Search Server using IIM and the following error occurred....

Cause: The upgrade failed because it was attempted with IIM. Since it was originally installed manually, the upgrade needs to be done manually too.

Resolution:

If the Remote Search Server was originally installed manually, then subsequent upgrades must also be performed manually.


Search crawler deletes resources

Problem: Certain configurations allow search crawlers to delete HCL WebSphere Portal resources.

Symptom: HCL WebSphere Portal resources (pages, portlets, etc.) are systematically deleted.

Cause:

An unidentified search crawler accesses the administration pages. In traversing the site, it may attempt to follow every link, including Manage Pages' delete, Resource Permissions' delete user from role, etc. Depending on how the search crawler user agent behaves, confirmation pop-ups may be ignored or accepted, resulting in deleted resources, access controls, etc.

Diagnosis: Immediately prior to the failure, HCL WebSphere Portal repeatedly writes the following to SystemOut.log:

where "UnidentifiedAgent" is the agent that the search crawler advertises to the site.

Resolution: Restore any deleted or modified resources according to your disaster recovery plan.

If the agent can be identified as a search crawler, HCL WebSphere Portal will not serve it action URLs. For all search crawlers which traverse your site, specify their user agents as search engines according to the documentation. If you allow search crawlers with unidentified agents to crawl sensitive pages, you should configure the default agent to be a search crawler.

As an additional safeguard, if the crawler accesses HCL WebSphere Portal as an authenticated user, you can restrict its access to sensitive resources.

Search engines and authenticated pages


Cannot find index PSE.localhost

Issue: You find this in the logs:

Cause: You are using a remote search server for JCRCollection and have not configured these settings in JCR_ConfigService_PortalContent:

Resolution: Set the following properties:

Change the jcr.textsearch.indexdirectory property to point to a directory on the remote search server. For example,

Change the jcr.textsearch.PSE.type property to EJB.

Change the jcr.textsearch.EJB.IIOP.URL property to the URL of the naming service that is used to access the WebScanner EJB. For example,

Change the jcr.textsearch.EJB.EJBName property to the name of the WebScanner EJB. For example, ejb/com/ibm/hrl/portlets/WsPse/WebScannerLiteEJBHome.

Change the jcr.textsearch.enabled value to true.
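
The five properties above can be checked together before restarting. The following Python sketch only validates a dictionary of property names and values. How the values are exported from JCR_ConfigService_PortalContent is not covered here, and the function name check_remote_search_settings is hypothetical.

```python
# Property names as listed in the resolution above.
REQUIRED = (
    "jcr.textsearch.enabled",
    "jcr.textsearch.indexdirectory",
    "jcr.textsearch.PSE.type",
    "jcr.textsearch.EJB.IIOP.URL",
    "jcr.textsearch.EJB.EJBName",
)


def check_remote_search_settings(props):
    """Return a list of problems found in props; an empty list means none were found."""
    problems = [
        f"missing or empty: {name}"
        for name in REQUIRED
        if not props.get(name, "").strip()
    ]
    pse_type = props.get("jcr.textsearch.PSE.type", "").strip()
    if pse_type and pse_type != "EJB":
        problems.append("jcr.textsearch.PSE.type should be EJB")
    enabled = props.get("jcr.textsearch.enabled", "").strip().lower()
    if enabled and enabled != "true":
        problems.append("jcr.textsearch.enabled should be true")
    return problems
```

The check only confirms that each property is present and that the two fixed values match; it cannot tell whether the index directory or the naming service URL is reachable from the Portal server.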


Remote search causes error "WSVR0068E: Attempt to start EnterpriseBean SEStandalone#PSEStandaloneEJB.jar#WebScannerLiteEJB failed"

Issue: When setting up Portal Remote Search the following exception is thrown in the SystemOut.log while starting the PSEStandalone App on the Remote Search server:

In addition, the contents of <wp_profile-root>/installableApps/extract/lib were outdated.

Resolution:

  1. Unzip the PSELibs.zip file from <portal-server-root>/search/wp.search.libs to the location <wp_profile-root>/installableApps/extract/lib

  2. Restart the server. This should resolve the issue.

  3. Also check whether the PSE Shared Library has been mapped to the PSEServlet instead of the PSEStandalone Application.

    From remote search server console go to:

      Applications > Enterprise Applications > PSEStandalone > Shared library references

    Verify that the PSE Shared Library is mapped to "PSEStandalone".


How to include the files (e.g. PDF or DOC) located in WCM File Resource Components in search results?

This section describes how to include files (e.g. PDF or DOC files) located in Web Content Management (WCM) File Resource Components in search results.

The WCM Seedlist is based on WCM Content Items, which implicitly provide a "rendering context", that is, a path for a URL to render the content on the page. To include the files from File Resource Components, they must be linked to content items in some manner. The options are:

If the first two options above are not working as expected, check the setting for "SearchService.SearchSeed.ExcludeFileAttachments" located in:


Cannot create Portal search collection content source

Issue: You try to create a Portal search collection and it fails. These exceptions are in the logs:

Cause: This issue is commonly caused by failing to import the certificate from the Portal server into the truststore on the remote search server.

Resolution: This can be resolved by importing the certificate from the Portal server to the remote search server.

  1. From the WAS admin console of the remote search service server, go to:

      Security > SSL Certificate and Key Management > Key stores and certificates > NodeDefaultTrustStore > Signer Certificates > Retrieve from Port

  2. Enter the portal server host, its SSL port, and an alias.

  3. Click Retrieve Signer Information.

  4. Click OK.

Note: Use the "WC_adminhost_secure" port on the Portal server. You can determine this port number by logging in to the Deployment Manager on the Portal:

More Information:


After installing the latest cumulative fix Portal search application does not start

Issue: After installing the latest cumulative fix Portal search application does not start.

Logs show the following error:

SystemErr.log:

Cause: The above exception means that parsing of the crawler status is failing. The crawler statuses are kept in plain XML files on disk, and the server should tolerate the state in which these files are not present.

Resolution: To resolve the issue, remove the bad "crawler status" XML files from their location and restart the server. Here is a sample path:

Only remove the files whose names start with crawlers_status_ and end with .xml (crawlers_status_***.xml). The server should then start cleanly.
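
The cleanup can be sketched in Python as shown below. The directory is passed in by the caller because the actual location depends on the installation. The function name remove_crawler_status_files is hypothetical, and by default the function only lists the matching files so the selection can be reviewed first.

```python
from pathlib import Path


def remove_crawler_status_files(directory, dry_run=True):
    """List (and optionally delete) crawlers_status_*.xml files directly in directory.

    Returns the sorted names of the matching files. No file is deleted unless
    dry_run is False. Other files in the directory are never touched.
    """
    matches = sorted(Path(directory).glob("crawlers_status_*.xml"))
    if not dry_run:
        for path in matches:
            path.unlink()
    return [path.name for path in matches]
```

Stop the server before deleting, call the function once with the default dry_run=True to review the list, then call it again with dry_run=False and restart the server.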


Error "collection already exists" trying to recreate a collection

Issue: You try to recreate a collection that was previously deleted and you receive the error: "collection already exists"

Cause: This can happen if you have manually deleted the collection directory and removed the entry for that collection from the Search Admin Portlet. The Search Admin Portlet may also indicate that no corrupt collections exist. However, there may be a backup collection directory configured in the search service. This backup copy prevents the recreation of the search collection.

Resolution: Determine the location of the backup configuration. For example:

To determine the location of the backup directory, click the Edit/Pencil icon next to the search service and look for parameter:

In addition, check if the PSE Source for the collection still exists under:

If the PSE Source is still there, open a case with HCL Support to assist with deletion of the resource.


How to identify large files resources in the JCR Database?

When using HCL WebSphere Portal, you may need to identify large file resources in the JCR database. This document provides hints for finding such large files.

It is possible to use SQL queries, as described in the following example, to find items larger than 10 MB:

Alternatively, use the WCM "findlargeresources" module. This module is included out of the box for Portal 8.x. Here is a sample invocation of the module that returns a list of all file resources in the WCM database over 1000 bytes in size:

Also check:


Exclude portlets or pages when building Search Collection with Portal Seedlist

In HCL WebSphere Portal it is possible to build search collections. In some situations you may need to exclude some portlets or pages from those collections. This document provides the steps needed to exclude them.

To exclude portlets from "Portal" Seedlist output configure the portlet under Administration > Portlets and add a parameter:

All portlets on a page must have this config setting for the page to be excluded by the Portal Seedlist.


Configure search with external security TAI

Issue: You have a Portal search collection that cannot use a content source URL that is protected by the external security manager.

Cause: The Portal Search crawler has no mechanism to authenticate into an external security manager. When configuring a Portal search collection, the content source URL must not be protected by the external security manager. The external security manager considers the HCL WebSphere Portal search engine crawler to be an anonymous user.

Resolution:

An example of the auto-generated URL for a Portal site content source is:

If the Tivoli Access Manager for e-business standard WebSEAL junction is /portaljunction, then based on the above auto-generated URL, the following URL must be unprotected by the external security manager:

Otherwise, you will not retrieve the intended content and might receive a message similar to the following:

What follows are some sample Tivoli Access Manager for e-business pdadmin commands that make the above content source URL available to anonymous users such as the Portal Search Engine crawler:

When the end user executes a search query and gets a list of search results, WebSEAL will automatically rewrite the URL links within the HTML response. WebSEAL will ensure the URL links sent from back-end HCL WebSphere Portal servers across standard WebSEAL junctions are reconstructed appropriately for use by clients. For example, in the URL links, the hostname will change from HCL WebSphere Portal to WebSEAL and the junction name will be inserted.


Attempting to upgrade the Remote Search Server. Error occurs....'CRIMA1006E...cannot exist in the same package group'

Issue: While trying to upgrade the Remote Search Server using IIM from the command line....

...the following error occurs....

Resolution: This error can be avoided either by using the IIM GUI or, if you want to use the command line, by running IIM in console mode. The steps can be found at the link below under the 'Use Console Mode Interface' section.


Sample steps for implementing a search query using WCM HTML Component and Search Component

  1. Create an HTML Component with these contents:

      <script>
      // Adjust the query here if needed; by default it is passed through unchanged.
      function addFilter(queryIn)
      {
      return queryIn;
      }
      </script>

      <form onSubmit="this.search_query.value=addFilter(this.query.value)">
      Query: <input name="query"/>
      <input type="hidden" name="search_query"/>
      </form>

  2. Next create a Search Component with these contents:

    Header:

      <table>

    Results:

      <tr><td>
      [AttributeResource attributeName="namelink" separator=","]<br>
      [AttributeResource attributeName="summary" separator=","] </td></tr>

    Footer:

      <tr><td>
      </td></tr>
      </table>

    Separator:

      <tr><td bgcolor="#FFFAA" colspan="2"/></tr>

    NoResultDesign:

    There are no results for your query. Please refine your search and try again.

Be sure to select the desired collection to search when creating the Search Component.

For this to work, you need to reference both components created above in the SAME presentation template.

