Search and crawl


Configure the local portal site, and crawl remote portal sites, so that they are searchable by users. Run crawlers against other, external Web sites to make them searchable by local portal users.

Users of the portal can search across various types of sites. In addition to searching the local portal site, we can crawl remote portal sites, and external Web sites, to make search results from those sites available to the local portal users. Examples of search scenarios include:


Search the local portal

View information on setting up the local portal for your users to search.

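The portal default search collection combines two content sources and their related crawlers: the Portal Content Source, which contains portal pages and portlets, and the WCM Content Source, which contains web content.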


Reset the default search collection

Under certain circumstances we might want to change the configuration of the portal site search collection. Because search collections cannot be modified, this requires recreating the collection.

The portal site default search collection is created the first time an administrator navigates to the search administration portlet Manage Search. As a result, the order in which you perform the configuration tasks for the portal and Portal Search matters. An example scenario is a portal database transfer, for example, from the default database to a different database. In this case create the portal site collection by navigating to the Manage Search portlet before you transfer the database. Otherwise the portal site collection will not be available after the database transfer.

If you created the portal site collection by navigating to the Manage Search portlet before you completely configured the portal and Portal Search, we might need to recreate the search collection. Example scenarios are as follows:

In such a scenario...

  1. Perform the required configuration tasks, for example, for the language or path settings.

  2. Create a new search collection with the appropriate configuration settings.

  3. Export the content sources from the default search collection.

    In a default portal installation these are the Portal Content Source, which contains portal pages and portlets, and the WCM Content Source, which contains web content.

  4. Import these exported content sources into the new search collection. For details, refer to Portal Search or the Manage Search portlet help.

  5. We can now delete the default search collection.

Portal Search performs a new crawl on the portal site search collection.

  1. On a multilingual portal site we can create multiple collections in different languages. For details refer to Crawl a multilingual portal site.

  2. When you start the crawl for the first time, this might result in a warning message. We can ignore this message.


Crawl a remote portal site

Configure Portal Search to crawl and index a remote, public portal site.

We can enable search on other portal sites. However, only the public pages of other portals can be searched.

To have Portal Search crawl and index a public portal site:

  1. Create a new content source using the Manage Search portlet.

  2. Select Web site from the pull-down menu.

  3. Enter the URL of the portal site that you want to make searchable for the users.

When you start the crawl, the public portion of the portal site is crawled. The search collection will only contain public pages.


Crawl an external site using a seedlist provider

The seedlist crawler is a special HTTP crawler used to crawl external sites that publish their content using the seedlist format. The seedlist format is an ATOM/XML-based format designed for publishing application content, including all its metadata. The format supports publishing only the content that was updated between crawling sessions, which makes crawling more efficient. Configure the seedlist crawler with general parameters, filters and schedulers, then run the crawler.
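To illustrate the idea of picking up only updated content from an ATOM-based feed, the following is a minimal Python sketch. It is not the Portal Search implementation and does not reproduce the actual seedlist schema: the sample feed uses only standard Atom elements (feed, entry, id, title, link, updated), and the feed content, the URLs, and the helper names parse_time and entries_updated_since are assumptions made for this example.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

# Standard Atom namespace (RFC 4287), in ElementTree's {uri}tag notation.
ATOM = "{http://www.w3.org/2005/Atom}"

# Hypothetical sample feed. A real seedlist carries additional,
# product-specific metadata that is not modeled here.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example seedlist</title>
  <updated>2024-03-02T09:30:00Z</updated>
  <entry>
    <id>doc-1</id>
    <title>Welcome page</title>
    <link href="http://example.com/docs/1"/>
    <updated>2024-03-01T12:00:00Z</updated>
  </entry>
  <entry>
    <id>doc-2</id>
    <title>Release notes</title>
    <link href="http://example.com/docs/2"/>
    <updated>2024-03-02T09:30:00Z</updated>
  </entry>
</feed>
"""


def parse_time(text):
    """Parse a UTC timestamp of the form 2024-03-01T12:00:00Z."""
    parsed = datetime.strptime(text, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def entries_updated_since(feed_xml, last_crawl):
    """Return the feed entries whose <updated> time is after last_crawl."""
    root = ET.fromstring(feed_xml)
    changed = []
    for entry in root.findall(ATOM + "entry"):
        updated = parse_time(entry.findtext(ATOM + "updated"))
        if updated > last_crawl:
            changed.append({
                "id": entry.findtext(ATOM + "id"),
                "title": entry.findtext(ATOM + "title"),
                "url": entry.find(ATOM + "link").get("href"),
                "updated": updated,
            })
    return changed


# Assume the previous crawling session ended at this time.
last_crawl = datetime(2024, 3, 1, 18, 0, tzinfo=timezone.utc)
changed = entries_updated_since(SAMPLE_FEED, last_crawl)
for doc in changed:
    print(doc["id"], doc["url"])
```

In this sketch the filtering happens on the consumer side by comparing timestamps. A seedlist provider may instead publish only the changed entries itself, so treat the comparison as a stand-in for the incremental behavior described above.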

Before configuring the seedlist crawler, collect the following information:

To configure and create the seedlist crawler:

  1. Click...

      Manage Search | Search Services | Portal Search Service | search_collection | New Content Source | Content source type | Seedlist provider

  2. Under the tabs General Parameters, Advanced parameters, Schedulers and Security, provide the information in the fields and select options as required. For details refer to the topic Manage and administer Portal Search.

  3. Click Create.

  4. To run the crawler, click the start crawler icon (right-pointing arrow) next to the content source name on the Content Sources page. If you have defined a crawler schedule under the Schedulers tab, the crawler will start at the next possible time specified.


Parent: Portal Search
Related: Manage and administer Portal Search
Configure a crawler to search the local portal site
Crawl a multilingual portal site
Configure search on a secured portal site
Configure the default location for search collections
Tips for using Portal Search
Related reference:
Apply filter rules