Manage Search
Use the Manage Search portlet to administer search.
To manage Portal Search, click the Launch icon for
the main menu on the header bar, then . The portal displays the
administration portlet Manage Search.
You can use it to view the following search related resources and work
with them:
- Search Services. Search Services
represent separate instances of the search engine provided by WebSphere Portal
for use by the Search Center.
- Search Collections and content sources. A search
collection contains one or more content sources of Web content or portal content
and the related full-text indexes. This allows searches of the content by
end users.
- Search Scopes and Custom Links. Search scopes
limit search results to specific content locations and specific document types.
Administrators define search scopes to enable users to target their searches.
Custom Links are Web link shortcuts that let users run searches directly
on popular Web search engines, such as Google or Yahoo.
Notes:
- This portlet help gives instructions for using the Manage Search portlet
only. For more information about search services, collections, and scopes,
planning considerations, and how to configure search in your portal, refer to
the WebSphere Portal Information Center > Portal Search.
- For some portlet panels Manage Search shows a bread
crumb trail of your navigation path below the portlet title bar. If you want
to return to a previous panel of the portlet, click the appropriate link in
the bread crumb trail.
- When you work with the Manage Search portlet, data
entry fields marked with a red asterisk ( * ) are mandatory.
Search Services
Search Services allows
you to view and manage the WebSphere Portal search services. Search Services
represent separate instances of the search engine provided by WebSphere Portal
that can be used for searching content by means of the Search Center. When
you create a search collection, you have to select a search service. That
search service will be used to perform searches that end users request on
that collection. A search service can be used for searching multiple search
collections. You can set parameters to configure a portal search service.
This allows you to set up separate instances of search services with different
configurations. You can also set up multiple portal search services and thereby
distribute the search load over several nodes. The following Search Services
are provided by WebSphere Portal by default:
- Portal Search Service
- Select the Portal Search Service to manage search collections that contain
portal pages, content managed by Web Content Management, or indexed Web pages.
For a cluster portal environment you need to set up a remote search service.
For details about how to do this refer to the Portal Search documentation
in the WebSphere Portal Information Center. Note: The HTTP crawler of the Portal
Search Service does not support JavaScript. Text that is generated by JavaScript
might not be available for search.
- Content Model Search Service
- Select the Content Model Search Service to manage the search collection
that contains content stored in the Java Content Repository (JCR). Note: At
this time the Content Model Search Service has only one search collection.
This search collection is provided with the portal installation by default.
You cannot modify this default Content Model search collection or create additional
search collections under the Content Model Search Service.
You can also create additional custom search services
and add them to your portal.
Managing search services
To
manage search services, click Search Services. Manage
Search shows the Search Services page. It lists the Search Services in your
portal and their status, that is whether they are available or not. Select
the following options or icons and perform the following tasks on search services:
Creating a new search service
To
create a new search service, click the New Search Service button. Manage
Search displays the New Search Service page. Enter the required
data in the fields and select from the available options:
- Service name
- Enter a name for the new search service. The name must be unique within
the current portal or virtual portal. This field
is mandatory.
- Service implementation
- Select the required search service implementation from the drop-down menu.
- Service parameters
- Existing service parameters are listed in the table. Select from the following
tasks:
- Adding a new service parameter
- If required, enter a new service parameter key and its value, and click
the Add Parameter button. Manage Search refreshes the
parameter list with the new parameter added.
- Editing a parameter
- To edit a parameter, proceed as follows:
- Locate that parameter in the list and click the Edit icon.
Manage Search shows the Edit parameter page.
- Enter a new value for the parameter as required. (The Parameter Key field
is blocked from updates.)
- Click OK to save your update, or click Cancel to
return and keep the previous value.
For more details about the parameters, refer to the Portal Search
topics in the Information Center.
- Deleting a parameter
- To delete a service parameter, locate that parameter in the list and click
the Delete icon. When the confirmation prompt shows,
confirm by clicking OK, or click Cancel to
return without deleting the service parameter.
When you have completed the data entry and selection
of options, click OK to save the new search service.
To return without saving, click Cancel.
Managing the collections of a search service
To manage the collections of
a search service, click the name of that search service in the services list.
You can also select Search Collections from the main
Manage Search portlet panel. Manage Search displays the Search Collections
page. It lists the search collections of the selected search service. You
can now manage these search collections and their content sources. For details
about search collections and how to manage them refer to Search Collections and content sources.
Editing a search service
To edit a search
service, locate that search service in the list and click the Edit icon. Manage
Search displays the Edit Search Service page. Update the service data and
select from the available options as required:
- Service name
- Update the name for the search service as required. The name must be unique
within the current portal or virtual portal.
For the other data entry fields and options, proceed
as described under Creating
a new search service.
Deleting a search service
To delete a search service, locate that search service in
the list and click the Delete icon. When the confirmation
prompt shows, confirm by clicking OK, or click Cancel to
return without deleting the search service.
Search Collections and content sources
Search
Collections allows you to view and manage the search collections and their
content sources in the portal. You can build and maintain search collections
of Web content, Web Content Management content, and portal content. Users can
then search these collections by using
the portal Search Center. If you configure a copy of the Search and Browse
portlet for a search collection, users can perform advanced searches on that
collection.
A search collection can have one or more content sources
with content such as Web pages, Web Content Management content, or portal
pages and portlets.
During the search collection build process, content
is retrieved for indexing through a crawler (robot) from the content sources.
The search collection stores keywords and metadata, and maps them to their
original source. It thereby allows fast processing of requests from the Search
and Browse or Search Center portlets.
Searchable resources can be stored
on the local portal server or on remote content sources. Content can be processed
by the crawlers, if it is accessible through the HTTP protocol. For example,
this can be portal pages, Web Content Management content, and documents and
content hosted by Web servers. The documents can be of different types, for
example, editable text files, office suite documents, such as Microsoft Office and
OpenOffice documents, or PDF files.
Note: In order to make documents available for
search by users, make sure you perform the following tasks:
- To add documents collected by a crawl, do either of the following:
- Select global acceptance of documents returned by a crawl. You do this
by enabling the option Add all documents to collection automatically when
adding a new content source to a collection.
- Accept documents individually after a crawl: Click the View
Pending Documents icon and accept the required documents from
the document list resulting from the crawl. For details refer to Working
with Pending Documents.
- The Search Center is part of the default portal installation. If you want
your users to be able to perform more advanced searches by using the Search
and Browse end user portlet, install the Search and Browse portlet, configure
it with the name of the search collection that you created, and add it to
the users’ pages. For details refer to the Search topic in the Information
Center, section Configuring the Search and Browse portlet for users.
For more details about how to work with search collections
and content sources, refer to the following sections:
Managing Search Collections
To
manage search collections and their content sources, click Search
Collections. Manage Search shows the Search Collections page.
It lists the search collections in your portal, together with related information,
such as the following:
- The name of the search collection
- The description of the search collection, if available
- The search service by which the collection is indexed and searched
- The number of documents in the collection
- The icons for performing tasks on the search collection.
From the Search Collections panel, select the following options
or icons and perform the following tasks on search collections:
- Search Service. If you clicked Search
Collections from the main Manage Search panel, the Search Collections
panel lists all the search collections in your portal. To restrict the list
to search collections of one search service, select that search service from
the search services pull-down list. If you entered the Search Collections
panel by clicking a search service name in the list of search services, the
list shows only the collections for that service. If you want to view other
collections, select the search service as required from the pull-down list.
- New collection. Select this option to create
a search collection.
- Refresh. Select this option to refresh the list
of search collections. This updates the information and the available option
icons for the collections. Examples:
- If a crawl is running or was completed, the number of documents is updated.
- If a crawl was completed on a collection since the last refresh, option
icons can appear, such as View Pending Documents or Search
and Browse the Collection.
- If another administrator also worked on search collections at the same
time, the information is updated accordingly.
- Arrow icons. To go to a different page in the list
of search collections, click the required arrow icon, or enter a page number
in the page number entry field and click the Go icon.
Both options are available at the top and the bottom of the search collections
list.
- Click one of the links or icons for a specific search collection and perform
one of the following tasks. Note: The icons for some tasks are only available
if the current user can perform the specific task on the search collection.
Creating a search collection
To
create a new search collection, proceed as follows.
Note: The
parameters that you select here when you create the search collection cannot
be changed later. Therefore plan well ahead and apply special care when you
create a new search collection. If you want to change parameters for a search
collection, you have to create a new search collection and select the required
parameters for it. You can then export the data from the old collection and
import it into the new collection. For details about how to do this refer
to Exporting a search collection and Importing a search collection.
- Click New Collection. Manage Search displays the
Create Collection panel.
- Location of Collection. Use this entry field to
type the directory path where you want the new search collection to be created
and the related data to be saved. This field is mandatory as indicated by
the red asterisk ( * ). The location of a collection is the directory in which
the collection data is stored. It can be a full path or a path relative to
the Collections Locations search service parameter. Depending
on what you type, the search collection is created in the following location:
- If you type a name of your choice, the location for the new search collection
is combined from the default directory for search locations and the name you
type. Example: If you type my_collection_location,
the new search collection is created under the directory wp_root/collections/my_collection_location .
For details about the default directory for search collections and how you
configure it refer to the Portal Search topic in the WebSphere Portal Information
Center under Configuring the Manage Search portlet.
- If you want to create the search collection in a location that is different
from the default search collection location, type the full directory location
as required. The new search collection will be created under the directory
location that you specified.
- Name of Collection. Use this entry field to type
the name that you want to give to the new search collection. The name that
you enter here will show for the search collection in the search collection
list and in the hierarchy tree of available content sources when you select
locations for scopes. If you do not enter a name, the location that you entered
in the previous field is used as a name for the search collection.
- Description of Collection. Use this entry field
to type a description for the new search collection. The description that
you enter here will show for the search collection in the search collection
list.
- Specify Collection Language. Use this pull-down
selection list to select the required language for the search collection.
The search collection and its index are optimized for this language. This feature
enhances the quality of search results for users, as it allows them to use
spelling variants, including plurals and inflections, for the search keyword.
Portal search uses this language for indexing if there is no language defined
for the document. Select one of the Unspecified options
in order to index documents without any stemming of the words. Note: This
setting is not overwritten when you import a search collection, for example,
during the migration of a search collection. If you create the search collection
for the purpose of migrating an existing search collection, fill this in to
match the setting in the source collection that you want to migrate.
- Select Categorizer. Use this pull-down selection
list to select the required categorizer for the search collection. Possible
values are:
- None.
- Pre-Defined. This categorizer is based on a static taxonomy.
- User-Defined. This categorizer is rule-based.
- Select Summarizer. Use this pull-down selection
list to select the required summarizer for the search collection. Possible
values are:
- None
- No summary is generated for documents. If you select this option, the
Search Center uses the description metadata from the document, if the document
has one.
- Automatic
- An automatic summarizer is used.
- Remove common words from queries. If you want the
index of the search collection to filter out common words, mark the checkbox
for this option. If you select this option, the indexer and the search will
filter out common words from indexed documents and search strings. Examples
for English are: and, or, the, of, in, on. Note: This
setting is not overwritten when you import a search collection, for example,
during the migration of a search collection. If you create the search collection
for the purpose of migrating an existing search collection, fill this in to
match the setting in the source collection that you want to migrate.
- Click OK to save your updates, or click Cancel if
you do not want to save the updates.
- Manage Search returns to the previous panel. If you clicked OK,
the Search Collection list shows the new search collection by the name that
you specified. If you did not specify a name, the list shows the directory
path location that you specified.
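The way the Location of Collection and Name of Collection entries are resolved, as described in the steps above, can be sketched as follows. This is only an illustration of the documented rule, not portal code. The function names are hypothetical, and the default directory wp_root/collections is taken from the example in this help; the actual default depends on the Collections Locations search service parameter.

```python
from pathlib import PurePosixPath

# Assumed default, taken from the example in this help. The real value
# comes from the Collections Locations search service parameter.
DEFAULT_COLLECTIONS_DIR = "wp_root/collections"


def resolve_collection_location(typed, default_dir=DEFAULT_COLLECTIONS_DIR):
    """Return the directory in which the collection would be created."""
    path = PurePosixPath(typed)
    if path.is_absolute():
        # A full path is used exactly as typed.
        return str(path)
    # A plain name or relative path is appended to the default directory.
    return str(PurePosixPath(default_dir) / path)


def displayed_collection_name(name, location):
    """Fall back to the location when no collection name was entered."""
    return name.strip() or location


print(resolve_collection_location("my_collection_location"))
print(resolve_collection_location("/data/search/my_collection_location"))
print(displayed_collection_name("", "my_collection_location"))
```

The sketch uses POSIX-style paths only; a portal installed on Windows would need the equivalent handling for drive letters.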
Viewing the status of a search collection
To
view the status of the search collection, click the collection
name in the list of search collections. Manage Search shows the Content
Sources and the Search collection status information of
the selected search collection. The status fields show the following data
that changes over the lifetime of the search collection:
- Search collection name:
- Shows the name of the selected search collection. If you did not enter
a name for the Search collection, the collection location is shown here instead.
- Search collection location:
- Shows the location of the selected search collection in the file system.
This is the full path where all data and related information of the search
collection is stored.
- Collection description:
- Shows the description of the selected search collection, if available.
- Search collection language:
- Shows the language for which the search collection and its index are optimized.
The index uses this language to analyze the documents when indexing, if no
other language is specified for the document. This feature enhances the quality
of search results for users, as it allows them to use spelling variants, including
plurals and inflections, for the search keyword.
- Categorizer used:
- Shows the categorizer that is used by the search collection. For more
information about setting a rule-based categorizer for a search collection
see Configuring the
Destination Categories. For more information about setting a static
categorizer for a search collection see the Taxonomy Manager portlet help.
- Summarizer used:
- Shows whether a static summarizer is enabled for this search collection.
- Remove common words from queries:
- Shows whether common words are filtered out during indexing this search
collection.
- Last update completed:
- Shows the date when a content source defined for the search collection
was last updated by a scheduled crawl and indexed. Note: The timeout that you
might have set under Stop collecting after (minutes): works
as a fuzzy time limit. It might be exceeded by some percentage, as indexing
the documents after the crawl takes additional time. Therefore allow some
tolerance.
- Next update scheduled:
- Shows the date for which the next crawl of a content source defined for the
search collection is scheduled.
- Number of active documents:
- Shows the number of active documents in the search collection, that is,
all documents that are available for search by users.
Note: To view updated status information about the search
collection, click the Refresh button of the browser.
On the same
panel you can also manage
the content sources of the search collection.
If you have a faulty
search collection in your portal, the portlet shows a link that takes you
to that faulty collection.
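The effect of the Remove common words from queries setting shown in the status fields above can be illustrated with a small sketch. It is an assumption-laden simplification: the function name is hypothetical, the word list contains only the English examples named in this help, and the tokenization used by the actual indexer is not documented here.

```python
# Example common words for English, as listed in this help.
COMMON_WORDS = {"and", "or", "the", "of", "in", "on"}


def remove_common_words(text, common_words=COMMON_WORDS):
    """Return the words of text that are not in the common-word list."""
    return [word for word in text.lower().split() if word not in common_words]


# Only the remaining words would be indexed or searched for.
print(remove_common_words("the history of search in the portal"))
```

Because the same filter is applied to indexed documents and to search strings, a query that consists only of common words is left with nothing to match.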
Working with Pending Documents
By
default an indexer crawl on a search collection makes the returned documents
available for search by users.
If you want to select and approve these
documents before they are made available for search by users, remove the check
mark from the option Add all documents to collection automatically for
the content source under the Advanced Parameter tab
when adding a new content source.
Documents that result from a crawl on that content source are then moved to
the Pending Documents box for approval. The documents are not indexed and
cataloged until an administrator processes them in the Pending Documents panel.
The
Pending Documents panel contains a list with all documents that the index
crawler collected. This includes documents from all content sources defined
for the selected search collection, except for those content sources for which
the option Add all documents to collection automatically was
enabled. In the Pending Documents panel you can edit and accept, or reject
the documents individually. To perform these tasks, proceed as follows:
- Locate the search collection for which you want to accept or reject documents.
- Click the View Pending Documents icon next to that
search collection. Manage Search displays the Pending Documents panel. If
the list has more than one page of pending documents, use the arrows or the
pull-down list to select other pages.
- To view a document, click the document title in the list. Manage Search
displays the document in a new window, depending on whether the appropriate
viewer for that document type is configured for the browser.
- To modify the information for a document, click Edit for
the document which you want to modify. Manage Search displays the panel for
editing the document information. This panel has two boxes. One shows the Document
content (Read only) as it was returned by the crawler. The fields
in this box are blocked. The other box is named Updated content.
The fields in this box are empty. You enter the new information as required.
You can modify the following:
- The Title of the document.
- The Author of the document.
- The Subject of the document.
- The Modification date, that is, the date when the
document was last modified.
- The Destination Categories of the document. You
can add or remove categories associated with the document. This option is
only available if a categorizer was selected when the collection was created.
- The Description of the document.
- The Keywords of the document.
- Enter your updates as required.
- Click Copy to copy the data from Document Content
to Updated Content. Use this option if you want to keep some of the document
information and only make additions or minor changes to it. You can still
overwrite the copied information under Updated Content. Note: If you fill in
one or more of the fields in the Updated Content and you click OK,
all data under Document Content are overwritten by the data in the fields
under Updated Content, even if some of these fields are left empty.
- Click Reset to cancel your updates and return to
the original state of the panel.
- Click OK to save your updates and return to the
previous panel.
Click Cancel to return without saving
the updates.
- Manage Search returns to the previous panel.
- Select Accept for the documents that you want to
make available to users for search.
Select Reject for
the documents that you do not want to make available to users for search.
- Click Reset to cancel your selection and return
to the original state of the Pending Documents panel. Clicking Reset works
only if you have not clicked Apply yet after you made
your selection.
- Click Accept All to accept all listed documents.
- Click Reject All to reject all listed documents.
- Click Apply to make your selections become effective.
Manage Search enters the documents you accept into the system, and indexes
and catalogs them. Manage Search discards the documents you reject. Once
you click Apply, you cannot use Reset to
reset the list of documents.
- Click Refresh to refresh the list of pending documents.
This updates the list with the new documents that came in while you were working
on the pending documents.
- Click the appropriate link in the bread crumb trail at the top of the
portlet to return to the list of search collections.
If a document is changed on its original content source, for example
on the HTTP server where it is stored, it will appear again under Pending
Documents after the next crawl. You can then modify, accept, or reject that
document again from the Pending Documents panel.
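The overwrite behavior of the edit panel described above (Copy, then OK) can be sketched as follows. This is a minimal model under stated assumptions, not the portlet's implementation: the field names and function names are hypothetical, Destination Categories are left out, and the case where no field is filled in is assumed to leave the document unchanged.

```python
FIELDS = ("title", "author", "subject", "modification_date",
          "description", "keywords")


def copy_to_updated(document_content):
    """Mimic the Copy button: start the update from the crawled values."""
    return {field: document_content.get(field, "") for field in FIELDS}


def apply_update(document_content, updated_content):
    """Mimic clicking OK on the edit panel."""
    if not any(updated_content.get(field, "") for field in FIELDS):
        # Assumption: nothing was entered, so the crawled values stay.
        return dict(document_content)
    # At least one field was filled in: every field is taken from the
    # updated content, including the ones that were left empty.
    return {field: updated_content.get(field, "") for field in FIELDS}


crawled = {"title": "Annual report", "author": "J. Smith"}

# Typing only a new title loses the author.
print(apply_update(crawled, {"title": "Annual report (final)"})["author"])

# Copying first keeps the author while the title is changed.
updated = copy_to_updated(crawled)
updated["title"] = "Annual report (final)"
print(apply_update(crawled, updated)["author"])
```

This is why Copy is the safer starting point when only minor changes are needed.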
Searching and browsing a Search Collection
To browse a search collection
proceed as follows:
- Locate the search collection which you want to browse.
- Click the Search and Browse Collection icon for
that collection. The Browse Documents panel is displayed.
From the Browse Documents panel you can browse through the entire
search collection. If a collection is associated with a category tree, you
can navigate the tree and see which documents are associated with each category.
You can also delete documents and edit the metadata associated with documents
as in the Pending Documents panel. For more information
about these operations refer to Working
with pending documents. Use the Search feature
to perform a search on the collection. To return to the list of collections,
click the appropriate link in the bread crumb trail at the top of the portlet.
Migrating search collections
When you upgrade
to a higher version of WebSphere Portal, the data storage format is not necessarily
compatible with the older version. To prevent loss of data, export all data
of search collections to XML files before upgrading. After the upgrade you
create a new search collection and use the previously exported data to import
the search collection data back into your upgraded portal.
Notes:
- If you do not perform these steps, the search collections are lost after
you upgrade your WebSphere Portal.
- When you create the search collection on the upgraded portal, type data
and make selections as follows:
- Fill the location, the name, and the description of the new collection
in as required. You can match the old settings or type new ones.
- For Remove common words from queries and Specify
Collection Language: Select these settings to match the settings
of the old search collection. Note: These settings are not overwritten when
you import a search collection, for example, during the migration of a search
collection. If you create the search collection for the purpose of migrating
an existing search collection, select these to match the setting in the source
collection that you want to migrate.
- You do not need to select a categorizer and summarizer. These settings
are overwritten by the settings when importing the data from the source search
collection.
- You cannot migrate a portal site collection between different versions
of WebSphere Portal. If you upgrade your portal from one version to another,
you need to re-create the portal site collection. Proceed as follows:
- Document the configuration data of your portal site content source.
- Delete the existing portal site content source.
- Upgrade your portal.
- On the upgraded portal create a new portal site content source. Use the
documented configuration data as required.
- Start a crawl on the new portal site content source.
Portlets that were crawled in the portal before the upgrade, but do
not exist in the upgraded portal, are not returned by a search.
For
more detailed information about these tasks refer to the topics about migrating,
importing, and exporting search collections in the portal Information Center.
For
details about how to export and import search collections refer to Exporting
a search collection and Importing
a search collection.
Exporting a search collection
To export a search collection and its data, proceed
as follows:
- Before you export a collection, make sure that the portal application
process has write access to the target directory location. Otherwise you might
get an error message, such as File not found.
- Locate the search collection that you want to export.
- Click the Import or Export Collection icon next
to the search collection in the list. Manage Search displays the Import and
Export Search Collection panel.
- In the entry field Specify Location (full path with XML extension):
type the full directory path and XML file name to which you want to export
the search collection and its data. Document the names of the collections
and the directory locations and target file names to which you export the
collections for the import that follows.
- Click Export to export the search collection data.
Manage Search writes the complete search collection data to an XML file and
stores it in the directory location that you specified. You can use this file
later as the source of an import operation to import the search collection
into another portal.
- To return to the previous panel without exporting the search collection,
click the appropriate link in the bread crumb trail at the top of the portlet.
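Before you type the export location, the prerequisites named in the steps above (a full path, an XML extension, and write access to the target directory) can be checked with a short script. The following is a sketch only. The function name is hypothetical, and the result is meaningful only if the script runs under the same operating system user as the portal application process.

```python
import os
import tempfile


def check_export_target(xml_path):
    """Return a list of problems found with an export target path."""
    problems = []
    if not xml_path.lower().endswith(".xml"):
        problems.append("file name does not have an XML extension")
    if not os.path.isabs(xml_path):
        problems.append("path is not a full path")
    directory = os.path.dirname(xml_path)
    if not os.path.isdir(directory):
        problems.append("target directory does not exist")
    elif not os.access(directory, os.W_OK):
        problems.append("target directory is not writable")
    return problems


# Example run against a temporary directory; an empty list means
# that no problem was found.
with tempfile.TemporaryDirectory() as tmp:
    print(check_export_target(os.path.join(tmp, "my_collection.xml")))
    print(check_export_target(os.path.join(tmp, "my_collection.txt")))
```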
Importing a search collection
To
import the data of a search collection, proceed as follows:
- Before you can import the collection data, you need to create the empty
shell for the search collection. You do this by creating
a search collection. You only need to fill in the mandatory data entry
field Location of Collection. Do not add content sources
or documents, as that will be completed by the import.
- On the search collection list locate the search collection into which
you want to import the search collection data.
- Click the Import or Export icon next to the search
collection in the list. Manage Search displays the Import and Export Search
Collection panel.
- In the entry field Specify Location (full path with XML extension):,
type the full directory path and XML file name of the search collection data
which you want to import into the selected search collection.
- Click Import to import the search collection data.
Manage Search imports the complete search collection data from the specified
XML file into the selected search collection.
- To return to the previous panel without importing a search collection,
click the appropriate link in the bread crumb trail at the top of the portlet.
- If required, you can now add content sources and documents to the search
collection.
Note: When importing a collection, be aware of the following:
- Import collection data only into an empty collection. Do not import
collection data into a target collection that has content sources or documents
already.
- When you import collection data into a collection, all collection settings
are overwritten by the corresponding settings in the imported data, where
present. For example, the language setting is overwritten, or a user-defined
categorizer is added, if it was specified for the imported search collection.
- When you import a collection, a background process fetches, crawls, and
indexes all documents that are listed by URL in the previously exported file.
This process is asynchronous. It can therefore take considerable time until
the documents become available.
- When you import a collection that contains a portal site content source
created in a previous version of WebSphere Portal, you need to regather the
portal content by deleting the existing portal site content source, creating
a new portal site content source, and starting a crawl on it.
Refreshing collection data
Refreshing
the data of a search collection updates that collection by renewed crawling
of all the content sources that are associated with it. To refresh a search
collection, click the icon Refresh Collection Data (regathering) for
that collection. Manage Search performs complete new crawls over all its content
sources. To verify progress and completion of the regathering, click the collection
and view the Collection Status information. Note: This might require a considerable
amount of system resources, as all content sources of the search collection
are crawled at the same time.
Adding documents
to a search collection
You can manually add documents to a search
collection. To do this, proceed as follows:
- Click the Add Document icon for that collection.
Manage Search displays the Add Document panel.
- Select whether you want to load the document by File or
by URL.
- Enter the location of the document that you want to add to the search
collection:
- For a file enter the directory location and file name in the entry field Specify
file location:. Use the Browse button if
required.
- For a Web document enter the URL in the entry field for URL.
- Click Continue. Manage Search displays the panel
for editing the document information.
- Update the document location, depending on whether you selected File or
URL in the previous panel:
- For content specified by file location in the previous panel, the field Edit
Document Information for URL - Update machine name and driver for this URL has
a partial file location filled in, based on the file location that you entered
as follows: file://[machine name]/your_file_path/your_file_name .
Update the contents of the field to a valid file location by which users can
access the document. To do this, replace the string [machine name] by
the name of the machine on which the document resides.
- For content specified by URL in the previous panel, the field Edit
Document Information for URL - Update machine name and driver for this URL has
a document URL filled in, based on the URL that you entered. Update this URL
as necessary to a valid URL by which users can access the document.
Note: The document that you add must be accessible to the crawler and
to the users who will search the document. For example, a document specified
by file location must be available in a public share, if you want anonymous
users to be able to search it.
- The other fields and options under the Document Content tab
are similar to those listed under Working
with Pending Documents. Proceed as described there.
- If you are using a rule-based categorizer taxonomy for the search collection
to which you are adding the document, the panel shows a Destination
Categories tab. Click this tab. Manage Search displays the panel
for selecting destination categories. Select the categories to associate them
with the document as required.
- Click OK. Manage Search adds the document to the
collection, indexes it, and returns to the search collections list.
- Add further documents as required.
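The file location pattern described above, file://[machine name]/your_file_path/your_file_name, can be illustrated with a short sketch. The function name and the sample machine, path, and file names are hypothetical and serve only to show how the placeholder is replaced; Manage Search itself expects you to edit the field by hand.

```python
def build_file_url(machine_name: str, file_path: str, file_name: str) -> str:
    """Fill in the file://[machine name]/path/name pattern shown in the panel."""
    template = "file://[machine name]/{path}/{name}"
    # Strip stray slashes so the path segments are joined exactly once.
    url = template.format(path=file_path.strip("/"), name=file_name)
    # Replace the placeholder with the machine on which the document resides.
    return url.replace("[machine name]", machine_name)


# Hypothetical values for illustration only.
print(build_file_url("docserver01", "shared/reports", "q1.pdf"))
```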
Deleting a search collection
To delete a search collection, proceed as follows:
- Click the Delete icon for the search collection
which you want to delete.
- Confirm that you want to delete the search collection by clicking OK.
Manage Search deletes the search collection and removes it from the list.
If you do not want to delete the collection, click Cancel.
Note: If you delete the search collection before an upgrade to a higher
version of WebSphere Portal, make sure you export the search collection for
later import before you delete it. For details refer to Migrating
search collections.
Managing
the user-defined taxonomy for a search collection
If you associated
a search collection with a user-defined rule-based categorizer at creation
time, you can define its categories and create filter rules per category.
For details about this refer to Configuring
the Destination Categories.
Note: Performing category specific searches
on search collections with a taxonomy is only supported by the Search and
Browse portlet.
Rules determine which documents are associated with
categories. They control which of the documents that are fetched from the
content sources enter the search collection, and to which categories they
are assigned:
- Documents which pass at least one rule in a category are automatically
associated with that category.
- Documents can be associated with more than one category. If a document
passes rules in several categories, it is associated with all these categories.
- If no rules are defined for a category, then all fetched documents are
automatically associated with that category. This can be of benefit if you
need a category of type uncategorized for further refining
that taxonomy.
- If a document does not pass any of the rules of a category, it is not
associated with that category at all. A document which is not associated with
any category does not enter the system and is not indexed or cataloged.
- Each content source is associated with a fixed list of categories. Documents
from a specific content source can only be associated with the pre-defined
categories of that content source, depending on whether they pass the defined
rules.
- If you did not mark Add all documents to collection automatically when
creating the content source, you can change the association of documents with
categories in the Pending Documents panel. You can also change the category
of a document in the Browse Documents panel.
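The association behavior described in the list above can be summarized in a short sketch. It is illustrative only: the function name, the data layout, and the use of plain Python predicates as rules are assumptions made for this example and do not reflect how Manage Search stores or evaluates rules internally.

```python
from typing import Callable, Dict, List, Set

# A rule is modeled as a predicate over a document (hypothetical representation).
Rule = Callable[[dict], bool]


def assign_categories(
    document: dict,
    rules_by_category: Dict[str, List[Rule]],
    destination_categories: Set[str],
) -> Set[str]:
    """Return the categories that a fetched document is associated with.

    Only the destination categories of the document's content source are
    considered. An empty result means the document does not enter the
    collection and is not indexed.
    """
    assigned = set()
    for category in destination_categories:
        rules = rules_by_category.get(category, [])
        # A category without rules accepts every fetched document;
        # otherwise the document must pass at least one rule.
        if not rules or any(rule(document) for rule in rules):
            assigned.add(category)
    return assigned


# Hypothetical categories and rules for illustration.
rules = {
    "HR": [lambda d: "/hr/" in d["url"]],
    "Finance": [lambda d: "/finance/" in d["url"]],
    "Uncategorized": [],  # no rules defined: every document passes
}
doc = {"url": "http://myco.com/internal/hr/local/default.htm"}
print(sorted(assign_categories(doc, rules, {"HR", "Finance"})))
print(sorted(assign_categories(doc, rules, {"Finance"})))
```

In this sketch the second call returns an empty set, which corresponds to a document that is not associated with any category and therefore is not indexed.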
The categories that are defined per content source are a subset
of the entire category tree. The category tree is arranged in a hierarchy.
The tree starts with the Root category. All other categories stem from the
Root category.
You can select categories for the content sources that
you select for search scopes.
If you do not have the option Add
all documents to collection automatically enabled, you can always
change the automated association created by the system between a document
and a category. You perform this change from the Pending Documents panel,
before the document is indexed and cataloged.
To manage the categories
for a search collection associated with a rule-based categorizer, proceed
as follows:
- Locate the required search collection on the search collection list. This
search collection needs to have a rule-based categorizer.
- Click the Manage Collection Taxonomy icon for that search
collection. Manage Search displays the Manage Category Tree panel. It shows
the following.
- A Category Tree; it shows a hierarchical tree view
of the categories. Categories with subcategories have a box. Click the box
to collapse or expand that part of the tree hierarchy in the view.
- A Manage Categories box; use this box to manage
the categories for the taxonomy.
- A Manage Category Rules box; use this box to manage
the rules for the taxonomy.
- Proceed with one of the tasks described in the following:
Managing categories
To manage categories,
click one of the categories that are shown in the Category Tree. Managing
categories for the selected search collection comprises the following tasks:
- Renaming a category
- To rename a category in the taxonomy tree, proceed as follows:
- From the category tree view, select the required category which you want
to rename.
- Enter the new name for the category name in the entry field Current
category.
- Click Rename. Manage Search renames the category
and shows the new name in the tree view.
- Deleting a category
- To delete a category from the taxonomy tree, proceed as follows:
- From the category tree view, select the required category which you want
to delete.
- Click Delete. You get a prompt to confirm the deletion.
- Confirm that you want to delete the category by clicking OK.
Manage Search removes the category with all its subcategories from the taxonomy
and the tree view.
- Creating a new category
- To add a new category to a search collection associated with a user-defined
rule-based categorizer, proceed as follows:
- From the category tree view, select the required parent category under
which you want to add a new category.
- Enter the name for the new category in the entry field Sub-category
name.
- Click Create. Manage Search adds the new category
to the taxonomy and shows it in the tree view.
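The effect of the three category tasks on the tree can be pictured with a minimal in-memory model. This is a sketch under stated assumptions: the class and method names are hypothetical, category names are assumed to be unique across the tree, and the Root category is assumed to be undeletable. None of this is taken from the Manage Search implementation.

```python
class CategoryTree:
    """Minimal category hierarchy rooted at "Root" (illustrative only)."""

    def __init__(self) -> None:
        # Maps each category name to the names of its direct subcategories.
        self.children = {"Root": []}

    def create(self, parent: str, name: str) -> None:
        """Add a new category under the selected parent category."""
        if parent not in self.children:
            raise KeyError(f"unknown parent category: {parent}")
        if name in self.children:
            raise ValueError(f"category already exists: {name}")
        self.children[parent].append(name)
        self.children[name] = []

    def rename(self, old: str, new: str) -> None:
        """Rename a category and keep its subcategories."""
        if old not in self.children:
            raise KeyError(f"unknown category: {old}")
        if new in self.children:
            raise ValueError(f"category already exists: {new}")
        self.children[new] = self.children.pop(old)
        for subs in self.children.values():
            if old in subs:
                subs[subs.index(old)] = new

    def delete(self, name: str) -> None:
        """Remove a category together with all of its subcategories."""
        if name == "Root":
            raise ValueError("the Root category is not deleted in this sketch")
        for sub in list(self.children[name]):
            self.delete(sub)
        del self.children[name]
        for subs in self.children.values():
            if name in subs:
                subs.remove(name)


tree = CategoryTree()
tree.create("Root", "HR")
tree.create("HR", "Benefits")
tree.rename("HR", "Human Resources")
tree.delete("Human Resources")  # also removes "Benefits"
print(tree.children)
```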
Managing category rules
Rules
are applied as filters to documents when inserting them into a collection.
There are two types of rules:
- URL rule
- A URL rule applies to the document's URL. It is expressed as a pseudo "regular
expression". It describes a partial URL. All documents which have the rule
text as a substring in their URL pass the rule.
Example: if the rule text
is *hr*, then the URL http://myco.com/internal/hr/local/default.htm passes
the rule. http://myco.com/internal/finance/local/default.htm does
not pass the rule.
- Content rule
- A content rule is applied to the text of the document. It is expressed
in the same format as a query. If the document is valid for this query, it
passes the rule. For the query syntax, refer to the help of the Search and
Browse portlet that you can use for search of documents.
Examples: the
rule hr "human resources" specifies documents that
contain the term hr or the phrase human resources.
The rule +hr -benefits specifies documents that contain
the term hr but not the term benefits. This
applies to words in their stemmed form. If you selected Unspecified for the
language of the search collection, it applies to words in non-stemmed form.
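The two rule types can be approximated as follows. This is a simplified sketch, not the Portal Search matcher: it assumes that the asterisk in a URL rule behaves like a shell-style wildcard, it supports only the +term, -term, plain term, and quoted phrase forms shown in the examples above, and it does not perform any stemming. The function names are hypothetical.

```python
import fnmatch
import re
import shlex


def passes_url_rule(rule_text: str, url: str) -> bool:
    """Treat the rule text as a wildcard pattern over the whole URL."""
    return fnmatch.fnmatchcase(url, rule_text)


def passes_content_rule(rule_text: str, text: str) -> bool:
    """Simplified query check: +term required, -term excluded, others optional."""
    words = re.findall(r"\w+", text.lower())
    padded = " " + " ".join(words) + " "

    def contains(term: str) -> bool:
        # Works for single terms and quoted phrases; no stemming is applied.
        normalized = " ".join(re.findall(r"\w+", term.lower()))
        return " " + normalized + " " in padded

    required, excluded, optional = [], [], []
    for token in shlex.split(rule_text):
        if token.startswith("+"):
            required.append(token[1:])
        elif token.startswith("-"):
            excluded.append(token[1:])
        else:
            optional.append(token)

    if any(contains(term) for term in excluded):
        return False
    if not all(contains(term) for term in required):
        return False
    if optional and not required:
        # Without required terms, at least one optional term must occur.
        return any(contains(term) for term in optional)
    return True


print(passes_url_rule("*hr*", "http://myco.com/internal/hr/local/default.htm"))
print(passes_content_rule("+hr -benefits", "HR benefits overview"))
```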
The Manage Category Rules box
lists the rules that apply as filters to the selected category. Use the minus
( - ) and plus ( + ) signs to collapse and expand the filters table. You can
perform the following tasks with Manage Category Rules:
- Creating a rule
- To create a rule for a category, proceed as follows:
- Select the category for which you want to create a rule from the tree
view.
- Click Create in the Manage Category Rules box.
Manage Search displays the Create Category rule box.
- In the Rule name entry field, type the name for
the rule.
- Depending on the rule type that you want to create, select whether you
want to apply the new rule to URL text or Content.
- In the Select documents containing entry field
type the details of the rule. For a content rule, type the strings to be applied.
For a URL rule, type the partial URL string.
- Click the Create icon to create the rule.
- The new rule is added to the list in the Rules box.
- Click OK to save your updates and return to the
Manage Category Tree panel. Click Cancel to return
to the Manage Category Tree panel without saving.
- If you click OK, the new rule is added to the list
in the Manage Category Rules box. You can now select
it and associate it with a category. It will then be used during crawling
and indexing.
- Associating a rule with a category
- To associate a rule with a category, proceed as follows:
- From the tree view, select the required category with which you want to
associate a rule.
- Click Select in the Manage Category
Rules box. Manage Search displays the Select box
for selecting rules.
- Select a rule from the rule list.
- Click Add to add the rule to the selected rule
list for the category.
- Select additional rules as required.
- Click OK when you have finished selecting rules.
Manage Search associates the selected rules with the category and returns
to the Manage Category Tree panel.
- Dissociating a rule from a category
- To dissociate a rule from a category, proceed as follows:
- From the tree view, select the category from which you want to dissociate
the rule.
- Click Delete for the rule which you want to dissociate
from the category from the Manage Category Rules box. You get a prompt to
confirm the deletion.
- Confirm that you want to delete the filter rule by clicking OK.
This rule is no longer associated with the category.
- Managing rules
- To add rules to the system or delete rules from the system, click the Manage
Rules button. After creating the rule you can associate the rule
with a category.
Managing the content sources of
a search collection
To work with the content sources of a search
collection, click the collection name in the list of
search collections. Manage Search lists the Content Sources and
the Search collection status information of the selected
search collection. A search collection can be configured to cover more than
one content source. The list shows the following information for the listed
content sources:
- The name of the content source
- Status information for the content source
- The icons for performing tasks on the content sources.
From the Content Sources panel, you can
select the following options or icons and perform the following tasks on content
sources:
- Search collection: To change to the content sources
of a different search collection and work with them, select the required search
collection from this pull-down list.
- New Content Source. Click this option to add a new content source to
the search collection.
- Refresh. Click this icon to refresh the status
information about the content source. While a crawl on the content source
is running, this updates the information about the crawl run time and the
documents collected so far.
- View the status information for the content source:
- Documents
- The number of documents in the content source. If you click Refresh during
a crawl, this shows how many documents the crawler has fetched so far from
the content source.
- Run Time
- The Run Time of the last crawler run on the content sources. If you click Refresh during
a crawl, this shows how much time the crawler has used so far to crawl the
content source.
- Last Run
- The date and time at which the last crawl of the content source
started.
- Next Run
- The date and time of the Next Run by which the content source will be
crawled, if scheduled.
- Status
- The Status of the content source, that is, whether the content source
is idle or a crawl is currently Running on
the content source.
- Select one of the icons for a specific content source and perform one
of the following tasks:
- View Content Source Schedulers. This icon is displayed
only if you defined scheduled crawls for this content source. If you click
this icon, the portlet lists the scheduled crawls, together with the following
information:
- Start Date
- Start Time
- Repeat Interval
- Next Run Date
- Next Run Time
- Status. This can be disabled or enabled. You can
click the link to toggle between enabling and disabling the scheduler.
- Start Crawler. Click this icon to start a crawl
on the content source. This updates the contents of the content source by
a new run of the crawler. While a crawl on the content source is running,
the icon changes to Stop Crawler. Click this icon to
stop the crawl. For details refer to Starting
to collect documents from a content source.
- Verify Address of Content Source. Click this icon
to verify that the URL of the content source is still live and available.
Manage Search returns a message about the status of the content source. For
details refer to Verifying
the address of a content source.
- Edit Content Source. Click this icon to edit a
content source. This includes configuring parameters, schedules, categories,
and filters for the selected content source. For details refer to Editing
a content source.
- Delete Content Source. Click this icon to delete
the selected content source. For details refer to Deleting
a content source.
On the same panel you can also view
the status of the search collection.
Adding
a new content source
When you create a new content source for a
search collection, that content source will be crawled and the search collection
will be populated with documents from that content source. You can determine
where the crawler will crawl and what kind of information it will fetch. To
create a new content source for a search collection, proceed as follows:
- Click New Content Source in the Content Sources
panel. Manage Search displays the panel named Create a New Content
Source. The title bar also shows the search collection for which
you create the content source.
- Select the type of the content source that you want to create from the
pulldown list:
- Web site. Select this option for all remote sites.
This includes Web sites and remote portal sites. Note that only anonymous
pages can be indexed and searched on remote portal sites.
- Portal site. Select this option if the content
source is your local portal site.
- Managed Web Content site. To make a content source
of this type available to Portal Search, you need to create it in the Web
Content Management Authoring portlet. You select the appropriate option to
make it searchable and specify the search collection to which it belongs.
When you have completed creating the Managed Web Content site, it will be
listed among the content sources for the search collection that you specified.
For more details about this refer to the Web Content Management documentation.
Your selection determines some of the entry fields and options that
are available for creating the content source. For example, the option Obey
Robots.txt under the tab Advanced Parameters is
available only if you select Web site as the content
source type.
- Select the tabs to configure various types of parameters of the content
source:
- Set the General Parameters
- Set the Advanced Parameters
- Configuring the Schedulers
- Configuring the Filters
- Configuring Security
- Configuring the
Destination Categories. This tab is only available if you selected
User-Defined Categorizer when creating the search collection.
- After you have set all required parameters, click Create to
create the new content source with the parameters you have selected.
Click Cancel if
you do not want to create the new content source or save your updates.
- Manage Search takes you back to the main panel. If you clicked Create,
it displays the new content source in the content source list, using the URL
you gave as the content source location.
Set the general parameters for a content
source
To set the general parameters for the content source, proceed
by filling in the entry fields and making your selections in the Create a
New Content Source box. The available fields and options differ, depending
on the type of content source that you select:
- Click the General Parameters tab.
- Content Source Name: Enter the name for the content
source in this entry field.
- Collect documents linked from this URL: Type the
required Web URL or portal URL in this entry field. This determines the root
URL from which the crawler starts. This field is mandatory. For portal content
sources, the value for this field is filled in by Manage Search.
Note: A crawler failure can be caused by URL redirection problems. If this
occurs, try editing this field, for example, by changing the URL to the
redirected URL.
- Make your selection from the following options by selecting from the drop-down
lists. The available fields and options differ, depending on the type of content
source that you selected.
- Levels of links to follow:
- For crawling Web sites: This determines the crawling depth, that is the
maximum number of levels of nested links which the crawler will follow from
the root URL while crawling.
- Number of linked documents to collect:
- For crawling Web sites: This determines the maximum number of documents
that will be indexed by the crawler during each crawling session. The number
of indexed documents includes documents that are re-indexed as their content
or category have changed.
- Portal user ID:
- For crawling secured portal sites: Type the user ID that you want the
crawler to use for crawling the portal site.
- Portal user password:
- For crawling secured portal sites: Type the password for the crawler user
ID.
- Stop collecting after (minutes):
- This sets the maximum number of minutes the crawler may run in a single
session.
Note: The timeout that you set here works as a fuzzy time limit. It
might be exceeded by some percentage. Therefore allow some tolerance.
- Stop fetching document after (seconds):
- This sets the maximum time limit in seconds for completing the initial
phase of the HTTP connection, that is for receiving the HTTP headers. This
time limit must be finite as it is used to prevent the crawler from getting
stuck infinitely on a bad connection. However, it allows the crawler to fetch
large files which take a long time to fetch, for example ZIP files.
- Links expire after (days):
- This parameter determines the number of days a document
will be kept in the search collection since the last time it was found by
a crawler. It is initialized for each document at the time the document is
fetched by the crawler. This means that each time a crawler finds a document,
the document is time stamped. This applies even if the crawler finds the document,
but does not necessarily index it, for example, because it has not changed.
In that case the time stamp of the document is still renewed.
When the time stamp expires, the document is removed from the search collection
at the time of the next cleanup. The cleanup daemon is scheduled to run once
a day.
- Remove broken links after (days):
- This parameter determines the number of days a document will be kept in
the system after it becomes a "broken link". A document is considered to be
a broken link if it is not found any more in a crawling session by any of
the crawlers that previously found this document. In this case the crawler
puts a time stamp on the document. When this time stamp expires, the document
is removed from the search collection during the next cleanup. The cleanup
daemon is scheduled to run once a day.
If all the content sources that previously
contained this document are deleted from the system, then no crawler can determine
that the document is a broken link. In this case the document is removed when
its links expire.
- Click the next tab to set more parameters for the content source.
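The interplay of Links expire after (days) and Remove broken links after (days) during the daily cleanup can be sketched as follows. The data class, its field names, and the cleanup function are assumptions made for illustration; the actual cleanup process is internal to Portal Search and may differ in detail.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class IndexedDocument:
    url: str
    # Renewed each time a crawler finds the document, even if it is not re-indexed.
    last_found: datetime
    # Set when a crawl no longer finds the document (it became a broken link).
    broken_since: Optional[datetime] = None


def cleanup(
    documents: List[IndexedDocument],
    now: datetime,
    links_expire_days: int,
    remove_broken_days: int,
) -> List[IndexedDocument]:
    """Return the documents that survive one cleanup pass."""
    kept = []
    for doc in documents:
        expired = now - doc.last_found > timedelta(days=links_expire_days)
        broken_too_long = (
            doc.broken_since is not None
            and now - doc.broken_since > timedelta(days=remove_broken_days)
        )
        if not (expired or broken_too_long):
            kept.append(doc)
    return kept


# Hypothetical documents and settings for illustration.
now = datetime(2024, 1, 31)
docs = [
    IndexedDocument("http://myco.com/a.htm", last_found=datetime(2024, 1, 30)),
    IndexedDocument("http://myco.com/b.htm", last_found=datetime(2024, 1, 1)),
    IndexedDocument(
        "http://myco.com/c.htm",
        last_found=datetime(2024, 1, 25),
        broken_since=datetime(2024, 1, 26),
    ),
]
survivors = cleanup(docs, now, links_expire_days=14, remove_broken_days=3)
print([doc.url for doc in survivors])
```

A document whose content sources were all deleted never receives a broken-link time stamp in this model, so it is removed only through the expiry check, which matches the behavior described above.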
Set the advanced parameters for a content
source
To set the advanced parameters for the content source, proceed
as follows in the Create a New Content Source box:
- Click the Advanced Parameters tab.
- Make your selection from the following options by selecting from the drop-down
lists, marking the checkboxes, or entering data as required:
- Number of parallel processes:
- This determines the number of threads the crawler uses in a crawling session.
- Default character encoding:
- This sets the default character set that the crawler uses if it cannot
determine the character set of a document. Note: The entry field for the Default
character encoding contains the initial default value windows-1252,
regardless of the setting for the Default Portal Language under . Enter the required default
character encoding, depending on your portal language. Otherwise documents
might be displayed incorrectly under Browse Documents.
- Always use default character encoding:
- If you check this option, the crawler always uses the default character
set, regardless of the document character set. If you do not check this option,
the crawler tries to determine the character sets of the documents.
- Add all documents to collection automatically:
- If you check this option, the crawler puts all documents directly in their
destination folders and indexes them.
If you do not check this option,
the crawler puts all documents in the Pending Documents box. The documents
are only put in their destination folders and indexed after an administrator
manually approves them. For more information about Pending Documents and manual
approval see Working with Pending
Documents.
- Obey Robots.txt
- If you select this option, the crawler observes the restrictions specified
in the file robots.txt when accessing URLs for documents. This option is only
available if the content source type is Web site.
- Proxy server: and Port:
- The HTTP proxy server and port used by the crawler. If you leave this
value empty, the crawler does not use a proxy server.
- Socks server: and Port:
- The socks server and port used by the crawler. If you leave this value
empty, the crawler does not use a socks server.
- Click the next tab to set more parameters for the content source.
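The effect of the two character encoding options can be shown with a small sketch. The function and parameter names are hypothetical, and the sketch assumes that the detected character set, if any, is passed in by the caller; how the crawler actually detects character sets is not described here.

```python
from typing import Optional


def decode_document(
    raw: bytes,
    detected_encoding: Optional[str] = None,
    default_encoding: str = "windows-1252",
    always_use_default: bool = False,
) -> str:
    """Decode fetched bytes according to the encoding options of the content source."""
    if always_use_default or detected_encoding is None:
        # Either the option forces the default, or no character set was determined.
        encoding = default_encoding
    else:
        encoding = detected_encoding
    return raw.decode(encoding, errors="replace")


utf8_bytes = "café".encode("utf-8")
print(decode_document(utf8_bytes, detected_encoding="utf-8"))
print(decode_document(utf8_bytes, detected_encoding="utf-8", always_use_default=True))
```

The second call shows why an unsuitable default can lead to incorrectly displayed documents: UTF-8 bytes decoded as windows-1252 produce garbled characters.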
Configuring the Schedulers
To
configure a schedule, click the Schedulers tab. The
Scheduler shows two boxes:
- Define Schedule. Use this box to add a new schedule.
- Scheduled Updates. This box shows a list of schedules
at which crawls are performed.
You can perform the following tasks with the Scheduler:
- Adding a schedule
- To add a schedule, perform the following steps in the Define
Schedule box:
- From the From: and At: drop-down
menus, select the date and time for the first execution of the crawler.
- Under Update every: specify the interval at which
you want the crawler to run. Type the number of time units and select the
type of time unit, for example 2 and week(s) for
a bi-weekly schedule.
- Click the Create icon in the Define
Schedule box. The scheduler shows the newly created schedule in
the Scheduled Updates box.
Note: The time interval between the crawler runs must be more than the
maximum crawler execution time. The reason is that a crawler cannot be executed
if it is currently running. If a crawler job is started while the crawler
is running, this execution is ignored, and the crawler is only executed at
the next scheduled time, provided that it is not running already.
- Deleting a schedule
- To delete a schedule, perform the following steps:
- Select the schedule which you want to delete from the schedule list.
- Click Delete. The Scheduler prompts you to confirm
the deletion.
- Confirm that you want to delete the schedule by clicking OK.
The Scheduler removes the schedule from the list.
After you have configured the scheduler, click the next
tab to set more parameters for the content source.
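The note about the interval between crawler runs can be illustrated with a small simulation. It assumes a fixed crawl duration and uses hypothetical names throughout; it models only the rule that a scheduled start is ignored while the crawler is still running.

```python
from datetime import datetime, timedelta
from typing import List, Tuple


def simulate_schedule(
    first_run: datetime,
    interval: timedelta,
    crawl_duration: timedelta,
    triggers: int,
) -> List[Tuple[datetime, bool]]:
    """Return (trigger_time, started) pairs for a crawl of fixed duration."""
    results = []
    busy_until = None
    for i in range(triggers):
        trigger = first_run + i * interval
        if busy_until is not None and trigger < busy_until:
            # The crawler is still running, so this execution is ignored.
            results.append((trigger, False))
        else:
            results.append((trigger, True))
            busy_until = trigger + crawl_duration
    return results


start = datetime(2024, 1, 1, 0, 0)
runs = simulate_schedule(start, timedelta(hours=1), timedelta(minutes=90), 4)
for trigger, started in runs:
    print(trigger.isoformat(), "started" if started else "ignored")
```

With an hourly schedule and a 90-minute crawl, every second trigger is ignored in this model. Choosing an interval longer than the maximum crawl time avoids the skipped runs.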
Configuring
the Filters
The crawler filters control the crawler progress and
the type of documents that are indexed and cataloged. To configure filters,
click the Filters tab. This tab is only available if
the content source type is Web site. You can define new filters in the Define
Filter Rules box. The defined filters are listed in the Filtering
Rules box.
Crawler filters are divided into the following
two types:
- URL filters
- They control which documents are crawled and indexed, based on the URL
where the documents are found.
- Type filters
- They control which documents are crawled and indexed, based on the document
type.
If you define no filters at all, all documents from
a content source will be fetched and crawled. If you define include filters,
only those documents which pass the include filters are crawled and indexed.
If you define exclude filters, they override the include filters, or, if you
define no include filters, they limit the number of documents that are crawled
and indexed. More specifically, if a document passes one of the include filters,
but also passes one of the exclude filters, it is not crawled, indexed, or
cataloged.
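The precedence of exclude filters over include filters can be expressed in a few lines. This is a sketch for URL filters only; type filters are not modeled. It assumes that the asterisk acts as a shell-style wildcard, and the function name and sample patterns are hypothetical.

```python
import fnmatch
from typing import Iterable


def passes_filters(
    url: str,
    include: Iterable[str] = (),
    exclude: Iterable[str] = (),
) -> bool:
    """Decide whether a URL is crawled, given include and exclude URL filters."""
    # Exclude filters always win, whether or not include filters are defined.
    if any(fnmatch.fnmatchcase(url, pattern) for pattern in exclude):
        return False
    include = list(include)
    if include:
        # With include filters defined, the URL must pass at least one of them.
        return any(fnmatch.fnmatchcase(url, pattern) for pattern in include)
    # No include filters: everything that is not excluded is crawled.
    return True


# Hypothetical URLs and patterns for illustration.
print(passes_filters("http://myco.com/internal/hr/policy.htm", include=["*/hr/*"]))
print(passes_filters(
    "http://myco.com/internal/hr/private/pay.htm",
    include=["*/hr/*"],
    exclude=["*/private/*"],
))
```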
You can perform the following tasks with the Filters box:
- Creating a filter
- To add a new filter, perform the following steps:
- Enter the filter name in the entry field Rule name:.
- Make the required selection from the following radio button options:
- Apply rule while: Collecting documents or Adding
documents to index
- Rule type: Include or Exclude
- Rule basis: URL text or File
Type.
- This step depends on your selection for the rule basis in the previous
step:
- If you selected URL text as the rule basis, enter the URL filter, for
example */hr/*.
- If you selected File Type as the rule basis, select the required document
type from the pull-down list.
Note: When you use the option Apply rule while Collecting
documents with Rule type: Include, make
sure that the URL in the field Collect documents linked from this
URL: fits the specified rule; otherwise no documents will be
collected. For instance, crawling the URL http://www.ibm.com/products with
the URL filter */products/* will not give any results,
because the rule has a trailing slash, but the URL does not. But either crawling http://www.ibm.com/products/ with
the URL filter */products/* (both with trailing slash)
or crawling http://www.ibm.com/products with the URL
filter */products* (no trailing slash) will work.
- Click the Create icon in the Define Filter Rules
box. The new filter appears in the appropriate list of filters. The filters
are listed in separate boxes, depending on whether the filter was created
as an include or exclude filter, and whether it was defined for crawling or
indexing.
- Continue adding the filters that you need.
- If you want to delete a filter from the list, select that filter, and
click Delete.
- Deleting a filter
- To delete a filter from the list, perform the following steps:
- Select the filter which you want to delete from the list.
- Click Delete. You get a prompt to confirm the deletion.
- Confirm that you want to delete the filter by clicking OK.
The filter is removed from the list.
After you have configured the filters, click the next
tab to set more parameters for the content source.
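The trailing-slash behavior described in the note on include filters can be reproduced with Python's shell-style wildcard matching. This is an analogy under the assumption that the URL filter behaves like such a wildcard pattern; it is not the Portal Search matcher itself.

```python
import fnmatch

# The three URL and filter combinations from the note on include filters.
cases = [
    ("http://www.ibm.com/products", "*/products/*"),
    ("http://www.ibm.com/products/", "*/products/*"),
    ("http://www.ibm.com/products", "*/products*"),
]
results = [fnmatch.fnmatchcase(url, pattern) for url, pattern in cases]
for (url, pattern), matched in zip(cases, results):
    print(f"{url} with filter {pattern}: {'match' if matched else 'no match'}")
```

Only the first combination fails to match, because the filter requires a slash after products that the start URL does not contain.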
Configuring
security for a content source
You can configure the security for
indexing secured content sources and repositories that require authentication.
To configure the security for a content source, click the Security tab.
Manage Search shows two boxes:
- Define Security Realm. Use this box to add new
secured content sources.
- Security realms. This box shows a list of existing
security realms.
In the Define Security Realm box fill in the following data entry
fields:
- Host name. Fill in the URL of the secured content
source or repository.
- User Name. Fill in the user ID with which you access
the secured content source or repository.
- Password. Fill in the password for the user ID
you filled in under User Name.
- Realm. Fill in the realm of the secured content
source or repository.
After you have filled in all required data, click the Create icon
in the Define Security Realm box. The list in the Security Realms box now
shows the security realm which you configured for the content source.
After
you have configured security, click another tab to set more parameters for
the content source. If you have set all required parameters and made all required
updates, click Create to create the new content source
with the parameters you have selected.
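The four fields of a security realm correspond to the information that a generic HTTP client needs for authentication. The following sketch uses the password manager from the Python standard library as an analogy. The realm name, host, user ID, and password are hypothetical, and the sketch makes no claim about how the crawler stores or applies these values.

```python
from urllib.request import HTTPPasswordMgr

# One entry per secured content source: realm, location, user name, password.
realms = HTTPPasswordMgr()
realms.add_password(
    realm="Intranet Docs",                    # hypothetical realm
    uri="http://intranet.example.com/docs/",  # hypothetical host and path
    user="crawler",
    passwd="example-password",
)


def credentials_for(realm: str, url: str):
    """Look up the (user, password) pair for a URL inside a known realm."""
    return realms.find_user_password(realm, url)


print(credentials_for("Intranet Docs", "http://intranet.example.com/docs/hr/index.html"))
print(credentials_for("Other Realm", "http://intranet.example.com/docs/hr/index.html"))
```

A lookup succeeds only when both the realm and the location match, which is why the Realm field must match the realm that the secured server announces.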
Configuring
the Destination Categories
Manage Search displays the Destination
Categories tab only for search collections for which you selected
a user-defined rule-based categorizer during creation. You can use this tab
to associate categories with the content source that you are creating. If
you do this, all documents that arrive from that content source are associated
with the categories you selected, depending on whether they pass the existing
filters. A category which is associated with a content source is also called
a destination category.
The Destination Categories panel shows the Category
Tree that you created by using the Category Tree option on the main Manage
Search panel. The category nodes have checkboxes next to them. You can select
the categories that you want to associate with the content source by marking
the checkboxes. Categories with subcategories have small boxes. Click these
boxes to collapse or expand parts of the tree hierarchy.
The category
tree also has a pop up menu with the following options:
- Manage Category Tree. Click this to manage the
category tree. For example, you can add, edit, or delete categories. After
you have completed managing the category tree, you return to the Create a
New Content Source or Edit Content Source Configuration box.
- Select all categories. Click this to associate
all categories with the content source.
- Clear all selected. Click this to clear all selections
from the tree nodes. You must have at least one category selected, otherwise
you get an error message when you click OK.
The Destination Categories panel also shows the Destination
Category List box. It lists all categories that are associated
with the content source. In the case of a large category tree, this list might
give you a better overview of the selected categories. Click the plus ( +
) and minus ( - ) signs to expand and collapse the Destination Categories
List.
Completing the creation of a content source
- After you have set all required parameters and made all required updates,
click Create to create the new content source with
the parameters you have selected. Click Cancel if
you do not want to create the new content source or save your updates.
- Manage Search takes you back to the main panel. If you clicked Create,
it displays the new content sources in the search collection list, using the
URLs you gave as the content source locations.
Editing a content source
To edit a content source, follow these steps:
- Click Edit Content Source for the content source
that you want to edit. Manage Search opens the Edit Content Source
Configuration box. It looks just like the Create a New Content
Source box, but shows the configuration data that you entered when creating
the content source.
- Update the parameter options as required.
- When you have made all your updates, click Save.
Manage Search returns to the previous panel. All updates you made are now
enabled.
- To return without saving your updates, click Cancel.
Note: If you modify a content source that belongs to a search scope,
update the scope manually to make sure that the scope still covers that content
source. Especially if you changed the name of the content source, edit the
scope and make sure that it is still listed there. If not, add it again.
Deleting a content source
To delete a content source, follow these steps:
- Click Delete Content Source for the content source
that you want to delete. You get a prompt to confirm the deletion.
- Confirm that you want to delete the content source by clicking OK.
The content source is removed from the content source list.
Note: Documents that were collected from this content source remain
available for search by users under all scopes that included the content
source before it was deleted. These documents remain available until their
expiration time ends, as specified under Links expire after (days).
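As a rough illustration of the expiration behavior described in the note above, the following Python sketch checks whether a document collected from a deleted content source would still be searchable. The function name, the last-collected timestamp, and the assumption that the expiration period is counted from the time the crawler last collected the document are hypothetical; they are not taken from the portal implementation.

```python
from datetime import datetime, timedelta

def is_still_searchable(last_collected: datetime,
                        expire_after_days: int,
                        now: datetime) -> bool:
    """Return True while the document has not yet reached its expiration time.

    Assumption: the period set under "Links expire after (days)" is counted
    from the moment the crawler last collected the document.
    """
    return now < last_collected + timedelta(days=expire_after_days)

collected = datetime(2024, 1, 1)
# Two weeks after collection, with a 30-day expiration period
print(is_still_searchable(collected, 30, datetime(2024, 1, 15)))
# Six weeks after collection, with the same expiration period
print(is_still_searchable(collected, 30, datetime(2024, 2, 15)))
```

The first call reports that the document is still searchable; the second reports that it has expired.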
Starting to collect documents from a content source
You can start an update from a content source manually. To do this, follow these steps:
- Click Start Crawler for the content source for
which you want to start the update. This starts the crawl. Documents are fetched
from this content source. If they are new or modified, they are updated in
the search collection.
- To view the updated status information about the progress of the crawl
process, click Refresh. The following status information
is updated:
- Documents
- Shows how many documents the crawler has fetched so far from the selected
content source.
- Run time
- Shows how much time the crawler has used so far to crawl the content source.
- Status
- Shows whether the crawler for the content source is running or idle.
To update the status information, click the Refresh icon.
You can also stop a running update of a content source manually. To do this,
follow these steps:
- Locate the content source for which you want to stop the update from the
content sources list. Make sure you select a content source for which the
status information shows Running.
- Click Stop Collecting for that content source.
This stops the crawl.
Verifying the address of a content source
Use the Verify Address option to verify the URL
of a selected content source.
Locate the content source that you want
to verify and click Verify Address for that content
source. If the Web content source is available and not blocked by a robots.txt
file, Manage Search returns the message Content Source is OK.
If the content source is invalid, inaccessible, or blocked, Manage Search
returns an error message.
When you create a new content source, Manage
Search invokes the Verify Address feature.
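The robots.txt part of such a check can be sketched in Python with the standard library. This is only an illustration of how a crawler might decide that a URL is blocked; it is not the Verify Address implementation, and the function name, the example host, and the robots.txt content are assumptions made for the example.

```python
from urllib.robotparser import RobotFileParser

def is_blocked_by_robots(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if the given robots.txt content disallows fetching the URL."""
    parser = RobotFileParser()
    # Parse the robots.txt text directly instead of downloading it,
    # so that the example runs without network access.
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)

# Hypothetical robots.txt of the site that hosts the content source
robots = """User-agent: *
Disallow: /private/
"""

print(is_blocked_by_robots(robots, "http://www.example.com/private/report.html"))
print(is_blocked_by_robots(robots, "http://www.example.com/public/index.html"))
```

In this sketch, the first URL is reported as blocked and the second as allowed. A complete check would also have to confirm that the address itself is valid and reachable.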
Search Scopes and Custom Links
Search Scopes allows you to view and manage search scopes and custom
links. The search scopes are displayed to end users as search options in the
drop-down list of the search box in the banner and in the Search Center portlet.
Users can select the scope relevant for their search queries. You can configure
scopes by one of the following:
- One or more search locations (content sources).
- Document features or characteristics, such as the document type.
WebSphere Portal is shipped with the following scopes:
- All Sources
- This scope includes documents with all features from all content sources
in a user's search.
- Managed Web Content
- This restricts the search to sites that were created by Web Content Management.
You can add your own custom search scopes. You can add
an icon to each scope. Users will see this icon for the scope in the pull-down
selection list of scopes.
You can also add new custom links to search
locations. This includes links to external Web locations, such as Google or
Yahoo. The Search Center global search lists these custom links for users
in the selection menu of search options.
Managing Search Scopes and Custom Links
To manage search scopes and custom
links, click Search Scopes. Manage Search shows the
Search Scopes and Custom Links panel. It lists the search scopes and custom
links and related information:
- For search scopes:
- The name of the search scope
- The description of the search scope
- The status of the search scope, for example, whether it is active and
available to users for selection
- The icons for performing tasks on the scopes.
- For custom links:
- The name of the custom link
- The URL for the custom link
- The status of the custom link, for example, whether it is active and available
to users for their searches
- The icons for performing tasks on the custom links.
Use the following options and icons to work with
search scopes and custom links:
- New Scope. Click this option to create a new search
scope. For details refer to Creating
a new search scope.
- Refresh. Click this option to refresh the list
of search scopes. This updates the information for the scopes, for example,
the status of scopes, or updates that another administrator might have made
on scopes.
- Move Down and Move Up arrows.
Click these arrows in the list to move search scopes up and down in the list.
This determines the order in which the scopes are listed in the drop-down
menu from which users select search options for their searches with the Search
Center portlet.
- Edit Search Scope. Click this icon to work with
a search scope or modify it. For details refer to Editing
a search scope.
- Delete Search Scope. Click this icon to delete
a search scope.
- New Custom Link. Click this option to add a new
custom link. For details refer to Adding
a new custom link.
- Edit Custom Link. Click this icon to work with
a custom link or modify it. For details refer to Editing
a custom link.
- Delete Custom Link. Click this icon to delete
a custom link.
Creating a new search scope
To create a new search scope, click the New Scope button.
Manage Search displays the New Search Scope page. Enter the required data
in the fields and select from the available options:
- Scope Name:
- Enter a name for the new search scope. The name must be unique within
the current portal or virtual portal. This field
is mandatory.
- Description:
- If required, enter a description for the search scope.
- Custom Icon URL:
- Enter the URL location where the portal can locate the scope icon that
you want to be displayed with the search options for end users. If the icon
file exists in the default icon directory wps/images/icons,
you only need to type the icon file name. If the icon file is located in a
different directory path, type the absolute file path with the file name.
Click Check icon path to ensure that the icon is available
at the URL you specified.
- Status:
- Set the status of the search scope as you require. To make the scope available
to users, set the status to Active.
- Visible to anonymous users:
- Select Yes to make the search scope available to
users who use your portal without logging in. Select No to
make the scope available to authenticated users only.
- Query text (optional):
- Enter a query text. This query text will be invisibly appended to all
searches in this scope. Search by users will return results that match both
the user search and the query text that you enter in this field. Both sets
of results will be weighted with the same relevance in the result list. The
query text that you enter must conform to the syntax rules of entering a query
in the Search Center. For more details about these query syntax rules refer
to the Search Center portlet help.
- Select Features
- Click this button to select document features. Manage Search displays
the Add Feature page.
- Select the feature(s) as required. These features will be applied as additional
filters when users select this scope for their search.
- When you have completed selecting features, click OK to
save these features to the new search scope. To return without saving, click Cancel.
- Select Locations
- Click this button to select document locations. Manage Search displays
the Add Locations page.
- Select the location(s) as required. Only documents from these search locations
or content sources will be searched when users select this scope for their
search.
- When you have completed selecting locations, click OK to
save them to the new search scope. To return without saving, click Cancel.
When you have completed the data entry and selected
the options as required, click OK to save the new search
scope. To return without saving, click Cancel.
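The combined effect of the Query text, Select Features, and Select Locations settings can be pictured with a small Python model. It is a simplified sketch, not the search engine's implementation: the Document and SearchScope classes, the plain word matching, and the sample data are all assumptions made for illustration, and the real query syntax is the one described in the Search Center portlet help.

```python
from dataclasses import dataclass, field

@dataclass
class Document:
    title: str
    text: str
    location: str   # content source that the document came from
    doc_type: str   # a document feature, for example "pdf" or "html"

@dataclass
class SearchScope:
    name: str
    locations: set = field(default_factory=set)  # empty set means all locations
    features: set = field(default_factory=set)   # empty set means all document types
    query_text: str = ""                          # appended to every user query

def search(docs, scope, user_query):
    """Return the titles of documents that pass the scope filters
    and contain every term of the user query and the scope query text."""
    terms = (user_query + " " + scope.query_text).lower().split()
    results = []
    for doc in docs:
        if scope.locations and doc.location not in scope.locations:
            continue  # document is outside the selected locations
        if scope.features and doc.doc_type not in scope.features:
            continue  # document does not have a selected feature
        words = doc.text.lower().split()
        if all(term in words for term in terms):
            results.append(doc.title)
    return results

docs = [
    Document("Install guide", "portal install guide", "intranet", "pdf"),
    Document("Install FAQ", "portal install faq", "intranet", "html"),
    Document("Press release", "portal install news", "public-site", "pdf"),
]
intranet_pdfs = SearchScope("Intranet PDFs", {"intranet"}, {"pdf"}, "portal")
print(search(docs, intranet_pdfs, "install"))
print(search(docs, SearchScope("All Sources"), "install"))
```

With the restricted scope, only the install guide is returned; with a scope that has no locations, features, or query text, all three documents match.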
To set names and descriptions for the search scope in other locales, you must
first create and save the scope. Then locate the scope in the scopes list and edit
the scope by clicking the Edit icon. The option for
setting names and descriptions in other locales is available only on the Edit
Search Scope page.
Note: If you modify a content source that belongs to
a search scope, update the scope manually to make sure that the scope still
covers that content source. Especially if you changed the name of the content
source, edit the scope and make sure that it is still listed there. If not,
add it again.
Editing a search scope
To edit a search scope, locate that scope in the list and click the Edit icon
for that scope. Manage Search displays the Edit Search Scope page. Update
the scope data and select from the available options as required:
- Scope name
- Update the name for the search scope. The name must be unique within the
current portal or virtual portal.
- I want to set names and descriptions.
- Click this link to set names and descriptions for other locales.
For the other data entry fields and options, proceed
as described under Creating
a new search scope.
Deleting a search scope
To delete a search scope, locate that scope in the list and
click the Delete icon for that scope. When the confirmation
prompt appears, confirm by clicking OK, or click Cancel to
return without deleting the search scope.
Adding a new custom link
You can add custom links to let users search popular Web search engines,
such as Google or Yahoo, directly. To
add a new custom link, click the New Custom Link button.
Manage Search displays the New Custom Link page. Enter the required data in
the fields and select from the available options:
- Status:
- Set the status of the custom link as required. To make the link available
to users, set the status to Active.
- Custom link name:
- Enter a name for the new custom link. The name must be unique within the
current portal or virtual portal. This field is mandatory.
- Link URL:
- Enter the URL to the target Web search engine for the new custom link.
This field is mandatory. Be careful to use the correct format for the URL,
as the user queries are appended to the URL. For the correct Web interface
syntax refer to the help documentation of the target search engine. In some
cases it might be possible to determine the Web interface syntax as follows:
- Perform a search with some distinctive search text on the target search
engine, for example, an unusual name.
- Review the browser URL field and locate your search string. The part of
the URL that precedes your search string is likely to be the Link URL for
your target search engine.
- If your search string is not at the end of the URL, it might be helpful
to edit the URL and experiment with different versions with a search string
added.
Examples for Web interface syntax are:
- For Google: http://www.google.com/search?&q=
- For Yahoo: http://search.yahoo.com/search?p=
- Custom icon URL:
- Enter the URL location where the portal can find the icon that you want
to be displayed with the new custom link. Click Check icon path to
ensure that the icon is available at the URL you specified.
When you have completed the data entry and selected
the options as required, click OK to save the new custom
link. To return without saving, click Cancel.
To set names and descriptions for the custom link in other locales, you must
first create and save the link. Then locate the custom link in the list and edit
the link by clicking the Edit icon. The option for
setting names and descriptions in other locales is available only on the Edit
Custom Link page.
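The way a user query is appended to the Link URL, and the way a Link URL can be derived from a result page URL, can be sketched in Python as follows. Both function names are hypothetical, and the sketch assumes that the query is URL-encoded before it is appended; the portal's actual handling might differ.

```python
from urllib.parse import quote_plus

def build_search_url(link_url: str, user_query: str) -> str:
    """Append the URL-encoded user query to the configured Link URL."""
    return link_url + quote_plus(user_query)

def derive_link_url(browser_url: str, search_string: str) -> str:
    """Return the part of a result page URL that precedes the search string."""
    position = browser_url.find(quote_plus(search_string))
    if position == -1:
        raise ValueError("search string not found in URL")
    return browser_url[:position]

# Link URL taken from the Yahoo example above
print(build_search_url("http://search.yahoo.com/search?p=", "portal search"))
# http://search.yahoo.com/search?p=portal+search

# Distinctive search text makes the query easy to locate in the browser URL
print(derive_link_url("http://search.yahoo.com/search?p=zyxquark", "zyxquark"))
# http://search.yahoo.com/search?p=
```

If the search string is followed by further parameters in the browser URL, the derived prefix might still need manual adjustment, as described in the steps above.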
Editing a custom link
To edit a custom link, locate that custom link in the list and click the Edit icon.
Manage Search displays the Edit Custom Link page. Update the custom link data
and select from the available options as required. To set names for other
locales, click I want to set names.
Deleting a custom link
To delete a custom link, locate that link in the
list and click the Delete icon. When the confirmation
prompt appears, confirm by clicking OK, or click Cancel to
return without deleting the link.