Create search collections


Overview

To administer search collections, click...

The Search Collections panel lists the search collections. The options that are available for collections and content sources depend on their type and setup.

Locate a collection and perform one of the following tasks by clicking the appropriate icon for that collection:

  • Refresh Collection Data

    Manually refresh the selected search collection. The index performs a complete re-crawl on all the content sources of the search collection.

  • Add Document

    Manually add a new document to a collection. You can specify the new document either as a file, by its file location, or as a web document, by its URL. Depending on whether you selected File or URL, update the document location in the panel for editing the document content information:

    • For content specified by file location, the field Edit Document Information for URL - Update machine name and driver for this URL has a partial file location filled in, based on the file location that you entered, as follows: file://[machine name]/C$/your_path/your_file_name. Update the contents of the field to a valid file location by which users can access the document. To do this, replace the string [machine name] with the name of the machine on which the document resides.

        For security reasons, some browsers prevent access to the file system. If your environment requires searching files, look for information on the Internet about how to configure the browser to access the file system.

    • For content specified by URL, the field Edit Document Information for URL - Update machine name and driver for this URL has a document URL filled in, based on the URL that you entered. Update this URL as necessary to a valid URL by which users can access the document.

      The document that you add must be accessible to the crawler and to the users who will search the document. For example, a document specified by file location must be available in a public share, if you want anonymous users to be able to search it.
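As a minimal sketch of the placeholder replacement described for file locations, the following Python snippet fills in the [machine name] part of the prefilled value. The function name and the machine name docserver01 are hypothetical and used only for illustration; in the product you make this change by editing the field in the panel.

```python
PLACEHOLDER = "[machine name]"


def fill_machine_name(prefilled_location: str, machine_name: str) -> str:
    """Return the file location with the [machine name] placeholder replaced."""
    if PLACEHOLDER not in prefilled_location:
        raise ValueError("location does not contain the [machine name] placeholder")
    return prefilled_location.replace(PLACEHOLDER, machine_name)


# Prefilled value in the form shown in the panel; docserver01 is a made-up machine name.
prefilled = "file://[machine name]/C$/your_path/your_file_name"
print(fill_machine_name(prefilled, "docserver01"))
```

The resulting location must still point to a share that both the crawler and the searching users can reach.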

  • Pending Documents

    If you disable the option Add all documents to collection automatically for a content source in the Manage Search portlet, the documents returned by a crawl of the selected search collection are sent to the Pending Documents box. Use the Pending Documents panel to accept or reject these documents. Accepting a document makes it available for search by users. When you accept a document, you can also edit its metadata. The Pending Documents icon appears for a collection only if pending documents from a content source of that collection are available.
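The pending-documents flow described above can be pictured with a small model. This is only an illustrative sketch: the class, method, and document names are hypothetical and do not correspond to any Portal Search API.

```python
from dataclasses import dataclass, field
from typing import Iterable


@dataclass
class CollectionModel:
    """Toy model of a search collection with active and pending documents."""
    active: list = field(default_factory=list)
    pending: list = field(default_factory=list)

    def receive_crawl_results(self, docs: Iterable[str], add_automatically: bool) -> None:
        # With the automatic option disabled, crawl results wait for review.
        target = self.active if add_automatically else self.pending
        target.extend(docs)

    def accept(self, doc: str) -> None:
        # Accepting a pending document makes it searchable.
        self.pending.remove(doc)
        self.active.append(doc)

    def reject(self, doc: str) -> None:
        # Rejecting a pending document discards it without making it searchable.
        self.pending.remove(doc)

    @property
    def shows_pending_icon(self) -> bool:
        # The icon is shown only while pending documents exist.
        return bool(self.pending)


collection = CollectionModel()
collection.receive_crawl_results(["a.html", "b.html"], add_automatically=False)
collection.accept("a.html")
collection.reject("b.html")
print(collection.active, collection.pending, collection.shows_pending_icon)
```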

  • Category Tree

    If you are using a rule-based taxonomy for the selected search collection, use this option to manage that taxonomy, that is, to work with categories and filter rules.

    The Category Tree icon appears for a collection only if a user-defined categorizer has been defined for that collection.

  • Delete Collection

    Delete the selected search collection.

  • Select a collection by clicking the collection name link. Portal Search displays the Content Sources and the Status of the selected collection. You can select the following option icons and perform the following tasks:

    • New Content Source

      Create a new content source for this collection. You can create more than one content source for a search collection.

    • Refresh the list of content sources and the status shown for this collection.

    • Work with the content sources of the collection.

        The status of the selected collection includes the following fields:

        Search Collection Name:
          Shows the name of the selected search collection.

        Search Collection Location:
          Shows the location of the selected search collection in the file system. This is the full path where all data and related information of the search collection is stored.

        Collection Description:
          Shows the description of the selected search collection, if available.

        Search Collection Language:
          Shows the language for which the search collection and its index are optimized. The index uses this language to analyze the documents when indexing, if no other language is specified for the document. This feature enhances the quality of search results for users, as it allows them to use spelling variants, including plurals and inflections, for the search keyword.

        Categorizer used:
          Shows the categorizer used by the search collection.

        Summarizer used:
          Shows whether a static summarizer is enabled for this search collection.

        Remove common words from queries:
          Shows whether the indexer and the search filter out common words from documents, such as "and", "the", and "of".

          These words are also called stop words. The following words are filtered out for English: about all also am an and any are as at be been but by can de did do does for from had has have he her him his how if in into is it its may more my nbsp new no non not of on one or other our she so some than that the their then there these they this those thus to up us use was we were what when where which while why will with would you yours.

        Last update completed:
          Shows the date when a content source defined for the search collection was last updated by a scheduled update.

        Next update scheduled:
          Shows the date when the next update of a content source defined for the search collection is scheduled.

        Number of active documents:
          Shows the number of active documents in the search collection, that is, all documents that are available for search by users.

        Notes:

        1. To update the status information, click Refresh.

          Clicking the refresh button of the browser does not update the status information.

        2. If you delete a portlet from the portal after a crawl of the portal site, the deleted portlet is no longer listed in the search results. However, refreshing the view does not update the status information about the Number of active documents. This information is not updated until after the next cleanup run of portal resources.
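To illustrate what the Remove common words from queries setting does, here is a minimal Python sketch of stop-word filtering that uses the English word list given above. It assumes simple lowercase, whitespace-based tokenization, which is an assumption for illustration only and not a description of how Portal Search tokenizes text. The function name is hypothetical.

```python
# English stop words as listed in the status field description.
STOP_WORDS = frozenset(
    """about all also am an and any are as at be been but by can de did do does
    for from had has have he her him his how if in into is it its may more my
    nbsp new no non not of on one or other our she so some than that the their
    then there these they this those thus to up us use was we were what when
    where which while why will with would you yours""".split()
)


def remove_stop_words(query: str) -> list:
    """Return the query terms that remain after stop words are dropped."""
    return [term for term in query.lower().split() if term not in STOP_WORDS]


print(remove_stop_words("the status of the search collection"))
# ['status', 'search', 'collection']
```

Because terms such as "the" and "of" are dropped, a query made up only of stop words leaves nothing to search for in this model.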




    Related tasks


    Manage the content sources of a search collection
    Export and import search collections
    Hints and tips for using Portal Search

     

