Create WebSphere Portal search collections

 


 

To administer search collections...

Launch | Main Menu | Administration | Search Administration | Manage Search | Search Collections

From this panel you can create, update, and remove search collections, and perform other administrative tasks on them.

 

Manage search collections

When you select Search Collections, Manage Search displays the Search Collections panel. It lists the search collections in the portal and related information, and it allows you to select options and perform tasks on the search collections and their content sources.

The selectable options that are displayed and available for collections and content sources depend on their type and setup.

In the Search Collections panel, you can select the following option icons and perform the following tasks:

  • Change the search collection with which you want to work. To do this, select another search collection from the pull-down list.

  • New collection. Select this option to create a new search collection.

  • Refresh the list of collections.

  • Locate a collection and perform one of the following tasks by clicking the appropriate icon for that collection:

    • Search and Browse Collection.

      Work with the documents of the selected collection. You can perform the following administrative tasks:

      • Browse the documents of the selected collection.
      • View the individual documents of the selected collection.
      • Search the documents of the selected collection.
      • Edit the fields of the documents in the selected collection.
      • Delete documents from the selected collection.

      The panel design of the Browse Documents page is similar to that of the Search and Browse portlet that users use to search documents.

    • Import or Export Collection.

      Use this option to prevent data loss when you upgrade to a new software version:

      1. Export all search collection data to XML files.
      2. Upgrade the software.
      3. Use the exported files to import the search collection data into the new software version.

      Before exporting, verify that the portal application process has write access to the target directory.

      You can import collection data only into an empty collection. You cannot import collection data into a target collection that already has content sources or documents.

      When you import collection data into a collection, the settings of the target collection are overwritten by any settings contained in the imported data. For example, the language setting is overwritten.

      When you import a collection, a background process fetches, crawls, and indexes all documents that are listed by URL in the previously exported file. Understand the implications of this process...

    • Refresh Collection Data.

      Manually refresh the selected search collection. The index performs a complete re-crawl on all the content sources of the search collection.

    • Add Document.

      Manually add a new document to a collection.

    • Pending Documents.

      The documents returned by a crawl of the selected search collection are sent to the Pending Documents box if you disable the option for adding them to the collection automatically. Use the Pending Documents panel to accept or reject these documents. By accepting documents, you make them available for search by users. When you accept a document, you can also edit its metadata.

      To send crawled documents to the Pending Documents box, disable the option Add all documents to collection automatically for a content source in the Manage Search portlet. Documents that result from a crawl of that content source are then moved to the Pending Documents box.

      The Pending Documents icon appears for a collection only if pending documents from one of its content sources are available.

    • Category Tree.

      If you are using a rule-based taxonomy for the selected search collection, use this option to manage that taxonomy, that is, to work with categories and filter rules.

      The Category Tree icon appears for a collection only if a user-defined categorizer has been defined for that collection.

    • Delete Collection.

      Delete the selected search collection.

  • Select a collection by clicking the collection name link. Portal Search displays the Content Sources and the Status of the selected collection. You can select the following option icons and perform the following tasks:

    • New Content Source.

      Create a new content source for this collection. You can create more than one content source for a search collection.

    • Refresh the list of content sources and the status shown for this collection.

    • Work with the content sources of the collection.

    • View the Collection Status information of the selected search collection. The status fields show the following data that changes over the lifetime of the search collection:

      Search Collection Name: Name of the selected search collection.
      Search Collection Location: Location of the selected search collection in the file system. This is the full path where all data and related information of the search collection is stored.
      Collection Description: Description of the selected search collection if available.
      Search Collection Language: Language for which the search collection and its index are optimized. The index uses this language to analyze the documents when indexing, if no other language is specified for the document. This feature enhances the quality of search results for users, as it allows them to use spelling variants, including plurals and inflections, for the search keyword.
      Categorizer used: Categorizer that is used by the search collection.
      Summarizer used: Whether a static summarizer is enabled for this search collection.
      Remove common words from queries: Whether the indexer and the search filter out common words from documents, such as and, the, of.
      Last update completed: Date when a content source defined for the search collection was last updated by a scheduled update.
      Next update scheduled: Date when the next update of a content source defined for the search collection is scheduled.
      Number of active documents: Number of active documents in the search collection, that is, all documents that are available for search by users.

      To update the status information, click Refresh. Clicking the refresh button of the browser does not update the status information.

      If you delete a portlet from the portal after a crawl of the portal site, the deleted portlet is no longer listed in the search results. However, refreshing the view does not update the status information about the Number of active documents. This information is not updated until after the next cleanup of portal resources.
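
The export step under Import or Export Collection requires that the portal application process can write to the target directory. The following minimal Python sketch shows one way to check this before starting an export. It is not part of WebSphere Portal; the function name can_write_to and the command-line usage are assumptions made for illustration. The result is only meaningful if the script runs as the same operating system user as the portal process.

```python
import os
import sys
import tempfile


def can_write_to(directory: str) -> bool:
    """Return True if the current OS user can create a file in `directory`.

    Hypothetical helper for illustration; run it as the same OS user
    that runs the portal application process.
    """
    if not os.path.isdir(directory):
        return False
    try:
        # Creating a real temporary file is a more direct test than
        # os.access(), which can misreport on network file systems.
        # The file is removed automatically when the block exits.
        with tempfile.TemporaryFile(dir=directory):
            pass
    except OSError:
        return False
    return True


if __name__ == "__main__":
    # Usage: python check_export_dir.py <export target directory>
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    state = "writable" if can_write_to(target) else "NOT writable"
    print(f"{target}: {state}")
```

If the check reports that the directory is not writable, correct the directory permissions or choose a different target directory before you export.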

 
