Create search collections
Overview
To administer search collections, click:
Administration | Search Administration | Manage Search | Search Collections
The Search Collections panel lists the search collections. The selectable options that are displayed and available for collections and content sources depend on their type and setup.
- Change the search collection with which you want to work. To do this, select another search collection from the pull-down list.
- New collection. Create a new search collection.
You cannot create additional search collections for the default Content Model search service.
When you specify the directory location for the collection, be aware that creating the collection can overwrite files in that directory.
- Refresh the list of collections.
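Because creating a collection can overwrite files in the directory you specify, it can help to look at that directory first. The following is a minimal sketch, not part of Portal Search: the function name `existing_entries` is hypothetical, and it assumes you can read the target directory from the machine where you run it.

```python
import os


def existing_entries(directory: str) -> list[str]:
    """Return the names already present in `directory`.

    An empty list means the directory is missing or empty, so creating
    a collection there would not overwrite anything that exists now.
    """
    if not os.path.isdir(directory):
        return []
    return sorted(os.listdir(directory))
```

If the returned list is not empty, consider choosing a different location or moving the listed files before you create the collection.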
Locate a collection and perform one of the following tasks by clicking the appropriate icon for that collection:
- Search and Browse Collection
Work with the documents of the selected collection. You can perform the following administrative tasks:
- Browse the documents of the selected collection.
- View the individual documents of the selected collection.
- Search the documents of the selected collection.
- Edit the fields of the documents in the selected collection.
- Delete documents from the selected collection.
The panel design of the Browse Documents page is similar to that of the Search and Browse portlet that users use to search documents.
- Import or Export Collection
Import or export the selected search collection. Portal Search provides a Portal Search XML interface for this feature. The export and import operations are useful when you upgrade to a software level whose data storage format is not necessarily compatible with that of older versions of the software. To prevent loss of data, export all data of your search collections to XML files before you upgrade the software. After the upgrade, use the previously exported files to import the search collection data into the new software level.
Before you export a collection, verify the portal application process has write access to the target directory location. Otherwise you might get an error message, such as File not found.
You can import collection data only into an empty collection. You cannot import collection data into a target collection that has content sources or documents already.
When you import collection data into a collection, any collection settings contained in the imported data overwrite the existing settings of the collection. For example, the language setting is overwritten.
When you import a collection, a background process fetches, crawls, and indexes all documents that are listed by URL in the previously exported file.
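The write-access requirement for the export target can be checked before you start the export. The following is a sketch under stated assumptions, not a Portal Search API: the function name `can_export_to` is hypothetical, and the check is only meaningful if you run it as the same operating system user as the portal application process.

```python
import os
import tempfile


def can_export_to(directory: str) -> bool:
    """Return True if `directory` exists and this process can create files in it."""
    if not os.path.isdir(directory):
        return False
    try:
        # Creating (and automatically deleting) a temporary file probes
        # real write access, including restrictions that a plain
        # permission-bit check can miss.
        with tempfile.NamedTemporaryFile(dir=directory):
            pass
    except OSError:
        return False
    return True
```

A result of False suggests that the export would fail with an error such as File not found.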
- Refresh Collection Data
Manually refresh the selected search collection. The index performs a complete re-crawl of all the content sources of the search collection.
- Add Document
Manually add a new document to a collection. You can specify the new document either as a file, by a file location, or as a Web document, by a URL. Depending on whether you selected File or URL, you need to update the document location in the panel for editing the document content information:
- For content specified by file location, the field Edit Document Information for URL - Update machine name and driver for this URL has a partial file location filled in, based on the file location that you entered as follows: file:// [machine name]/C$/your_path/your_file_name. Update the contents of the field to a valid file location by which users can access the document. To do this, replace the string [machine name] by the name of the machine on which the document resides.
For security reasons, some browsers prevent access to the file system. If your environment requires searching files, look on the Internet for information about how to configure the browser to access the file system.
- For content specified by URL, the field Edit Document Information for URL - Update machine name and driver for this URL has a document URL filled in, based on the URL that you entered. Update this URL as necessary to a valid URL by which users can access the document.
The document that you add must be accessible to the crawler and to the users who will search the document. For example, a document specified by file location must be available in a public share, if you want anonymous users to be able to search it.
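The placeholder replacement described for file locations can be sketched as follows. This is an illustration only: the function name `fill_machine_name` and the host name `fileserver01` are hypothetical, and the sketch assumes that the pre-filled value contains the literal placeholder [machine name] directly after file://.

```python
from urllib.parse import urlparse

PLACEHOLDER = "[machine name]"


def fill_machine_name(prefilled: str, machine: str) -> str:
    """Replace the [machine name] placeholder with a real host name."""
    machine = machine.strip()
    if not machine:
        raise ValueError("machine name must not be empty")
    if PLACEHOLDER not in prefilled:
        raise ValueError("pre-filled location has no [machine name] placeholder")
    location = prefilled.replace(PLACEHOLDER, machine)
    # Sanity check: the result should be a file URL whose host part
    # is exactly the machine name that was supplied.
    parsed = urlparse(location)
    if parsed.scheme != "file" or parsed.netloc != machine:
        raise ValueError("result is not a file URL on the given machine")
    return location


print(fill_machine_name(
    "file://[machine name]/C$/your_path/your_file_name", "fileserver01"))
```

The resulting location still has to be reachable by both the crawler and the users who search the document.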
- Pending Documents
If you disable the option Add all documents to collection automatically for a content source in the Manage Search portlet, the documents returned by a crawl of that content source are moved to the Pending Documents box instead of being added to the collection. Use the Pending Documents panel to accept or reject these documents. By accepting documents you make them available for search by users. When you accept a document, you can also edit its metadata.
The Pending Documents icon appears for a collection only if pending documents from a content source of that collection are available.
- Category Tree
If you are using a rule-based taxonomy for the selected search collection, use this option to manage that taxonomy, that is, to work with categories and filter rules.
The Category Tree icon appears for a collection only if a user-defined categorizer has been defined for that collection.
- Delete Collection
Delete the selected search collection.
Select a collection by clicking the collection name link. Portal Search displays the Content Sources and the Status of the selected collection. You can select the following option icons and perform the following tasks:
- New Content Source
Create a new content source for this collection. You can create more than one content source for a search collection.
- Refresh the list of content sources and the status shown for this collection.
- Work with the content sources of the collection.
Search Collection Name: Shows the name of the selected search collection.
Search Collection Location: Shows the location of the selected search collection in the file system. This is the full path where all data and related information of the search collection is stored.
Collection Description: Shows the description of the selected search collection, if available.
Search Collection Language: Shows the language for which the search collection and its index are optimized. The index uses this language to analyze the documents when indexing, if no other language is specified for the document. This feature enhances the quality of search results for users, as it allows them to use spelling variants, including plurals and inflections, for the search keyword.
Categorizer used: Shows the categorizer used by the search collection.
Summarizer used: Shows whether a static summarizer is enabled for this search collection.
Remove common words from queries: Shows whether the indexer and the search filter out common words from documents, such as and, the, of.
These words are also called stop words. The following words are filtered out for English: about all also am an and any are as at be been but by can de did do does for from had has have he her him his how if in into is it its may more my nbsp new no non not of on one or other our she so some than that the their then there these they this those thus to up us use was we were what when where which while why will with would you yours.
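The effect of stop word removal on a query can be illustrated with a short sketch. It uses the English word list given above. How Portal Search actually tokenizes and normalizes text is not described here, so the lowercasing and whitespace splitting are assumptions, and the function name `remove_common_words` is hypothetical.

```python
# English stop words as listed in this topic.
STOP_WORDS = frozenset("""
about all also am an and any are as at be been but by can de did do does
for from had has have he her him his how if in into is it its may more my
nbsp new no non not of on one or other our she so some than that the their
then there these they this those thus to up us use was we were what when
where which while why will with would you yours
""".split())


def remove_common_words(query: str) -> list[str]:
    """Return the query terms that are not stop words.

    Assumption: terms are separated by whitespace and compared
    case-insensitively.
    """
    return [term for term in query.lower().split() if term not in STOP_WORDS]


print(remove_common_words("How to use the search collections of the portal"))
# prints ['search', 'collections', 'portal']
```

Only the remaining terms contribute to matching, which is why a query that consists entirely of stop words cannot be expected to return useful results in this model.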
Last update completed: Shows the date when a content source defined for the search collection was last updated by a scheduled update.
Next update scheduled: Shows the date when the next update of a content source defined for the search collection is scheduled.
Number of active documents: Shows the number of active documents in the search collection, that is, all documents that are available for search by users.
Notes:
- To update the status information, click Refresh.
Clicking the refresh button of the browser does not update the status information.
- If you delete a portlet from the portal after a crawl of the portal site, the deleted portlet is no longer listed in the search results. However, refreshing the view does not update the status information about the Number of active documents. This information is not updated until after the next cleanup run of portal resources.
Parent
Set up search collections
Set up JCR search collections
Delayed cleanup of deleted portal pages
Related tasks
Manage the content sources of a search collection
Export and import search collections
Hints and tips for using Portal Search