
Search collections


Configure

To configure search collections, click...

Options...

We cannot create additional search collections for the default Content Model search service. When we specify the directory location for the collection, be aware that creating the collection can overwrite files in that directory.


Set up a JCR search collection

A JCR search collection is a special-purpose search collection used by WebSphere Portal applications. It is not designed to be used alongside user-defined search collections and requires a special setup, including the creation of a new content source for the search collection.

The portal installation creates a default JCR search collection named JCRCollection1. If this collection is removed or does not exist for other reasons, we can manually re-create the JCR search collection.

WCM Authoring search requires a JCR search collection to be available, paired with its content source. If the JCR search collection is deleted, search is not possible in the Authoring portlet. The JCR search collection can only be used by a search portlet that knows how to present and handle its search results, because the returned information is of no use in a more generic search context.

The JCR search collection is flagged so that it does not participate in searches that use the All Sources search scope. An administrator cannot manually add it to that scope. The JCR search collection is a special-purpose search collection that the JCR requires so that specialized applications can perform low-level searches in the repository.

To create a new JCR search collection:

  1. Go to...

      Portal Administration | Search Administration | Manage Search | Search collections | New collection

    ...and specify the following values...

    Search Service: For stand-alone environments, select Default Portal Service. For clustered environments, select Remote Search Service.
    Location of collection: For example, /opt/IBM/Portal/WebSphere1/PortalServer/JCR
    Name of collection: The name of the collection should be JCRCollection1.
    Description of collection: Optional. Specify JCR seedlist collection.
    Specify Collection language: Default is English (United States).

    After you create the new collection, its name appears in the list of collections.

  2. Double-click the collection that you have created.

  3. To create the content source for the new search collection, click New Content Source and set...

    Content source type: Seedlist Provider
    Content Source Name: JCRSource
    Collect documents linked from this URL: http://server:10039/seedlist/server?Action=GetDocuments&Format=ATOM&Locale=en_US&Range=100&Source=com.ibm.lotus.search.plugins.seedlist.retriever.jcr.JCRRetrieverFactory&Start=0&SeedlistId=1@OOTB_CRAWLER1

    For virtual portal...

    http://server:port_number/seedlist/server/vpcontext?Action=GetDocuments&Format=ATOM&Locale=en_US&Range=100&Source=com.ibm.lotus.search.plugins.seedlist.retriever.jcr.JCRRetrieverFactory&Start=0&SeedlistId=1@OOTB_CRAWLER1

    In the URL above, the Range parameter specifies the number of documents in one page of a session.

    On the Security panel, set User Name and Password, then click Create

  4. Click Create to create the new content source.

    If the Content Source was created successfully, the following message will be displayed on the page:

      EJPJB0025I: Content source source_name in collection collection_name is OK.

  5. To start the crawler manually, navigate to the content source and click the Start Crawler button for the content source.

    To schedule the seedlist crawler, click the Edit Content Source button, and click the Scheduler tab. Specify the date and time and the frequency for the crawl. The crawler will be triggered automatically at the time that you scheduled.
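The content source URL from step 3 is an ordinary query string, so it can be inspected or adjusted with standard tooling before it is pasted into the content source form. The following is a minimal sketch, not part of the product: the helper names `seedlist_params` and `with_range` are hypothetical, and the only thing it assumes about the seedlist service is the parameter layout shown in the URL above.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Example URL copied from step 3; "server" is a placeholder host name.
URL = (
    "http://server:10039/seedlist/server?Action=GetDocuments&Format=ATOM"
    "&Locale=en_US&Range=100"
    "&Source=com.ibm.lotus.search.plugins.seedlist.retriever.jcr.JCRRetrieverFactory"
    "&Start=0&SeedlistId=1@OOTB_CRAWLER1"
)


def seedlist_params(url):
    """Return the query parameters of a seedlist URL as a dict."""
    return dict(parse_qsl(urlsplit(url).query, keep_blank_values=True))


def with_range(url, page_size):
    """Return a copy of the URL with the Range (documents per page) replaced."""
    parts = urlsplit(url)
    params = dict(parse_qsl(parts.query, keep_blank_values=True))
    params["Range"] = str(page_size)
    # safe="@" keeps the "@" in SeedlistId unescaped, as in the original URL.
    return urlunsplit(parts._replace(query=urlencode(params, safe="@")))


if __name__ == "__main__":
    print(seedlist_params(URL)["Range"])
    print(with_range(URL, 50))
```

Changing only Range leaves every other parameter, including SeedlistId, exactly as it was entered.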

Verify that icm.properties contains jcr.textsearch.enabled=true.
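The check on icm.properties can be scripted. The sketch below is an illustration only: it uses a simplified `key=value` parser rather than the full Java properties syntax, the function names are hypothetical, and the location of icm.properties is not assumed, so the file path must be supplied by the caller.

```python
import sys
from pathlib import Path


def read_properties(text):
    """Parse simple key=value lines, skipping blank lines and comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def text_search_enabled(text):
    """Return True if jcr.textsearch.enabled is set to true."""
    value = read_properties(text).get("jcr.textsearch.enabled", "")
    return value.lower() == "true"


if __name__ == "__main__":
    # Usage: python check_icm.py <path to icm.properties>
    content = Path(sys.argv[1]).read_text()
    print("enabled" if text_search_enabled(content) else "NOT enabled")
```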


Web Content Manager

If we use Web Content Manager, the JCRCollection1 collection is created the first time you create a web content item, if it does not already exist. In this case, it might not be necessary to create the collection manually, although it is fine to create it manually first, if required. It is used by the search feature within the Web Content Manager authoring portlet.

If you delete this search collection, you will not be able to search for items within the authoring portlet.

If you create the virtual portal with content, the portal creates the JCR collection for the virtual portal by default. If you create only the virtual portal and add no content to it, the portal creates no JCR collection with it. The collection is created only when content is added to the virtual portal.

We can view the URL of the JCR collection in the search administration portlet Manage Search of the virtual portal. The URL looks as follows:

http://host_name:port_number/seedlist/server?Action=GetDocuments&Format=ATOM&Locale=en_US&Range=100&Source=com.ibm.lotus.search.plugins.seedlist.retriever.jcr.JCRRetrieverFactory&Start=0&SeedlistId=wsid@ootb_crawlerwsid

...where wsid is the actual workspace ID of the virtual portal. The workspace ID is the identifier of the workspace in which the content item is created, stored and maintained.

For example, if the workspace ID of the virtual portal is 10, then the URL looks as follows:

http://host_name:port_number/seedlist/server?Action=GetDocuments&Format=ATOM&Locale=en_US&Range=100&Source=com.ibm.lotus.search.plugins.seedlist.retriever.jcr.JCRRetrieverFactory&Start=0&SeedlistId=10@ootb_crawler10
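The substitution of the workspace ID into SeedlistId can be expressed as a small helper. This is a sketch under the assumption that the URL follows exactly the pattern shown above; the function name `jcr_seedlist_url` is hypothetical, and the host, port, and workspace ID are values you supply.

```python
from urllib.parse import urlencode

# Retriever class name as it appears in the URLs above.
JCR_RETRIEVER = (
    "com.ibm.lotus.search.plugins.seedlist.retriever.jcr.JCRRetrieverFactory"
)


def jcr_seedlist_url(host, port, wsid, page_size=100):
    """Build the JCR seedlist URL for the workspace ID of a virtual portal."""
    params = {
        "Action": "GetDocuments",
        "Format": "ATOM",
        "Locale": "en_US",
        "Range": page_size,
        "Source": JCR_RETRIEVER,
        "Start": 0,
        # Pattern from the text: wsid@ootb_crawlerwsid
        "SeedlistId": f"{wsid}@ootb_crawler{wsid}",
    }
    # safe="@" keeps the "@" in SeedlistId unescaped.
    return f"http://{host}:{port}/seedlist/server?" + urlencode(params, safe="@")


if __name__ == "__main__":
    print(jcr_seedlist_url("host_name", "port_number", 10))
```

Called with workspace ID 10, the helper reproduces the example URL above, which makes it easy to compare against the value shown in Manage Search.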


Manage the content sources of a search collection

To work with content sources of a collection select...

Then select a search collection by clicking the collection name link. Portal Search displays the Content Sources panel. It shows the status of the selected search collection and lists its content sources and their status. It shows information related to the individual content sources, and lets you perform tasks on these content sources.

We can select the following option icons and perform the following tasks in relation to the search collection which we selected from the Search Collections list:

See also...


Export and import search collections

  1. From Manage Search portlet, select option...

      Import or Export Collection

  2. On the source portal export the search collections.

    This exports the configuration data and document URLs of the search collections.

    1. Verify the portal application process has write access to the target directory.
    2. When we specify the target directory location for the export, be aware that the export can overwrite files in that directory.

  3. For each collection, document...

    • The target file names and directory locations to which you export the collections.
    • The location, name, description, and language of the collection.

  4. Create the search collections on the target portal.

    This task creates the empty shell for the search collection. Complete the following data entry fields...

    Location of Collection: New collection location.
    Name of Collection: The name can match the old setting, but does not have to match it.
    Description of Collection: The description can match the old setting, but does not have to match it.
    Specify Collection Language: Select this to match the old setting.
    Select Summarizer: You do not need to select this option. The value will be overwritten by the import.

    You do not have to add content sources or documents; that will be completed by the import task.

  5. Import the data of the search collections into the target portal.

    For the import source information use the documented file names and directory locations to which you exported the collections before the portal upgrade.

  6. Configure Portal Search on the target portal.

  7. Verify the portal application process has write access to the target directory.

  8. Import collection data only into an empty collection.

    Do not import collection data into a target collection that has content sources or documents already.

    When you import search collection data into a collection, most of the collection configuration data are also imported, including...

    • content sources
    • schedulers
    • filters
    • language settings

    If we configured such settings when creating the new collection, they are overwritten by the imported settings.

  9. To migrate from one portal version to a higher version, delete the search collections between the export and the re-import.

  10. When you import a portal site collection from a V5.1 portal to a V6 portal, the collection configuration data are imported, but not the documents. Therefore, to enable users to search the portal site collection on the target portal, we can either...

    This restriction does not apply if you migrate the portal site search collections between Version 6 portals.

  11. When you import a collection, a background process fetches, crawls, and indexes all documents which are listed by URL in the previously exported file. Therefore be aware of the memory and time required for crawls.
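The two directory checks in the steps above (write access, and the risk of overwriting existing files) can be run before an export. The following is a minimal sketch with a hypothetical function name. It tests access for the user who runs the script, which is only meaningful if that is the same operating system user as the portal application process.

```python
import os
from pathlib import Path


def export_preflight(target_dir):
    """Return a list of problems found with an export target directory."""
    target = Path(target_dir)
    problems = []
    if not target.is_dir():
        problems.append(f"{target} is not an existing directory")
        return problems
    if not os.access(target, os.W_OK):
        problems.append(f"{target} is not writable by the current user")
    existing = sorted(p.name for p in target.iterdir())
    if existing:
        problems.append(
            f"{target} already contains {len(existing)} entries "
            "that an export could overwrite"
        )
    return problems


if __name__ == "__main__":
    import sys

    # Usage: python export_preflight.py <export target directory>
    for problem in export_preflight(sys.argv[1]):
        print("WARNING:", problem)
```

An empty result means that neither problem was detected. It does not guarantee that the export will succeed.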

Related:
Prepare for Portal Search
Portal Search
Administer Portal Search
Reset the default search collection
Delayed cleanup of deleted portal pages
Tips for using Portal Search
Seedlist 1.0 REST service API