Retrieve file content 

Use SearchService commands to perform file content retrieval tasks.


Before you begin

To edit configuration files, use the IBM WAS wsadmin client. See Starting the wsadmin client for details.


About this task

Depending on the number of files being indexed in your deployment, it can take a long time to retrieve file content. To ensure that all content is retrieved and indexed, you can run the getFileContentNow and indexNow commands manually, either before the document indexing service finishes or after it has finished.

For example, to manually index files and all file content, you might run the following commands:

wsadmin>SearchService.indexNow("files")
wsadmin>SearchService.getFileContentNow("files")
wsadmin>SearchService.indexNow("files")
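
The three commands above can be wrapped in a small Jython helper so that they always run in the same order. This is only a sketch: the function name index_files_with_content is hypothetical, and the service argument is assumed to be the SearchService object that is available after searchAdmin.py has been loaded. The helper makes no assumption about whether each command waits for its task to finish.

```python
def index_files_with_content(service, application_names="files"):
    """Index, retrieve file content, then index again (hypothetical helper).

    `service` is assumed to be the SearchService object that wsadmin
    exposes after execfile("searchAdmin.py").
    """
    # First indexing pass.
    service.indexNow(application_names)
    # Content retrieval: downloads and converts files that have no content.
    service.getFileContentNow(application_names)
    # Second indexing pass, run after content retrieval.
    service.indexNow(application_names)
```

At the wsadmin prompt, the helper might be called as index_files_with_content(SearchService, "files").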

The document indexing service can now run on multiple nodes, making the download and conversion process faster. When the document indexing task is scheduled, the Search application sends a message to all the nodes to tell them to start the document indexing process locally. Each Search server starts taking files from the cache and downloading and converting them. When a node retrieves a file, it flags the file in the cache as claimed so that other nodes do not try to get content for that file.
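
The claim mechanism described above can be pictured with a toy model. The sketch below is not the Search application's implementation: the FileCache class, its fields, and the node names are all invented for illustration. It only shows why a claimed flag keeps two nodes from retrieving content for the same file.

```python
class FileCache:
    """Toy stand-in for the shared file cache (hypothetical)."""

    def __init__(self, file_ids):
        # Maps each file ID to the node that claimed it (None = unclaimed).
        self.claimed_by = dict((file_id, None) for file_id in file_ids)

    def claim_next(self, node):
        """Flag the next unclaimed file for `node`; return None when done."""
        for file_id in sorted(self.claimed_by):
            if self.claimed_by[file_id] is None:
                self.claimed_by[file_id] = node
                return file_id
        return None


def run_nodes(cache, nodes):
    """Let the nodes take turns claiming files until the cache is drained."""
    work = dict((node, []) for node in nodes)
    active = list(nodes)
    while active:
        for node in list(active):
            file_id = cache.claim_next(node)
            if file_id is None:
                # Nothing left to claim, so this node stops.
                active.remove(node)
            else:
                # A real node would download and convert the file here.
                work[node].append(file_id)
    return work


cache = FileCache(["f1", "f2", "f3", "f4", "f5"])
work = run_nodes(cache, ["node1", "node2"])
```

Every file ends up in exactly one node's work list. This single-threaded model sidesteps concurrency; with several servers working at once, the claim step would need to be atomic.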


Procedure

To perform file content retrieval tasks, complete the following steps:

  1. From the dmgr host:

      cd $DMGR_PROFILE/bin
      ./wsadmin.sh -jython
      execfile("searchAdmin.py")

      If prompted to specify a service to connect to, type 1 to pick the first node in the list. Most commands can run on any node. If a command reads from or writes to a file using a local file path, pick the node where the file is stored.

  2. Use the following commands to perform file content retrieval tasks.

      SearchService.getFileContentNow(String applicationNames)

        Launches the file content retrieval task. This command iterates over the file cache, downloading and converting files that don't have any content.

        This command takes a string value, which is the name of the application whose content is to be retrieved. The examples in this topic use the values files and wikis.

        For example:

        SearchService.getFileContentNow("files")

      SearchService.retryContentFailuresNow(String applicationNames)

        Retries failed attempts at downloading and converting files for the specified application.

        This command takes a string value, which is the name of the application whose content is to be downloaded and converted. To specify multiple applications, use a comma-delimited list, as in the example for this command.

        A file download or conversion task can fail for a number of reasons, for example, hardware or network issues. Failures are flagged in the cache and can be retried.

        For example:

        SearchService.retryContentFailuresNow("wikis,files")

      SearchService.addFileContentTask(String taskName, String schedule, String startBy, String applicationNames, failuresOnly)

        Creates a scheduled file content retrieval task.

        This command takes the following arguments:

        • taskName. The name of the scheduled task. This argument is a string value, which must be unique.

        • schedule. The time at which the scheduled task starts. This argument is a string value that must be specified in Cron format. For more information about the Cron schedule, see Scheduling tasks.

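        • startBy. The time by which the task must fire; a task that has not fired by this time is automatically canceled. This argument is a string value that must be specified in Cron format. In the example that follows, the task is scheduled for 1:00 a.m. on weekdays and is canceled if it has not fired by 1:10 a.m. For more information about the Cron schedule, see Scheduling tasks.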

        • applicationNames. The name (or names) of the IBM Connections application to be indexed when the task is triggered. This argument is a string value. To index multiple applications, use a comma-delimited list, for example, "wikis,files".

        • failuresOnly. A flag that indicates whether to retrieve content only for files whose download and conversion tasks failed. This argument is a boolean value; the example that follows passes it as the string "true".

        For example:

        SearchService.addFileContentTask("mine", "0 0 1 ? * MON-FRI", "0 10 1 ? * MON-FRI", "wikis,files", "true")

      SearchService.listFileContentTasks()

        Lists all the scheduled file content retrieval tasks.

        This command does not take any input parameters.

      SearchService.enableFileContentTask(String taskName)

        Enables the specified task.

        This command takes a single argument:

        • taskName. The name of the task to be enabled. This argument is a string value.

        For example:

        SearchService.enableFileContentTask("mine")

      SearchService.disableFileContentTask(String taskName)

        Disables the specified task.

        This command takes a single argument:

        • taskName. The name of the task to be disabled. This argument is a string value.

        For example:

        SearchService.disableFileContentTask("mine")
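
The task-management commands in step 2 can be combined into a short Jython helper. The sketch below is an illustration only: both function names are hypothetical, service is assumed to be the SearchService object loaded by searchAdmin.py, and passing failuresOnly as the string "true" or "false" simply mirrors the addFileContentTask example above rather than a confirmed requirement.

```python
def add_file_content_task(service, task_name, schedule, start_by,
                          applications, failures_only=False):
    """Create a scheduled file content retrieval task (hypothetical helper)."""
    # applicationNames is a single comma-delimited string, e.g. "wikis,files".
    application_names = ",".join(applications)
    # Assumption: the flag is passed as "true"/"false", as in the example.
    if failures_only:
        flag = "true"
    else:
        flag = "false"
    service.addFileContentTask(task_name, schedule, start_by,
                               application_names, flag)
    return application_names, flag


def run_with_task_disabled(service, task_name, action):
    """Disable a task, run `action`, then enable the task again."""
    service.disableFileContentTask(task_name)
    try:
        return action()
    finally:
        # Re-enable the task even if `action` raised an error.
        service.enableFileContentTask(task_name)
```

For instance, add_file_content_task(SearchService, "mine", "0 0 1 ? * MON-FRI", "0 10 1 ? * MON-FRI", ["wikis", "files"], True) would issue the same call as the addFileContentTask example. The second helper is one way to keep a scheduled task from firing while other work is in progress.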


Parent topic

Manage the Search index

Related concepts
Scheduling tasks
Configure scheduled tasks


   

 
