Seedlist response
Understand what is returned in the seedlist response.
Seedlists use a modified Atom XML format and follow the Atom paging model. A seedlist combines standard Atom elements with field and field metadata extension elements. The content to be indexed is carried in multiple field Atom extension elements and in standard Atom elements, such as the updated element. Field metadata descriptors in fieldInfo Atom extension elements specify how the content in the field elements should be indexed.
The extension elements are in the http://www.ibm.com/wplc/atom/1.0 namespace, shown in this topic with the wplc prefix.
<wplc:action>
Identifies the reason why the item is included in the seedlist by specifying the action type in its do attribute. The options are as follows:
delete
Indicates that the associated document was deleted. This entry provides enough information to enable you to delete the document from the index. For example:
<atom:entry xml:lang="en_US">
  <atom:id>dad9504f-50c4-47c5-b274-7245b21cd308</atom:id>
  <atom:updated>2008-06-18T10:08:38+01:00</atom:updated>
  <wplc:action do="delete"/>
</atom:entry>
Important: The Bookmarks application handles deletions differently from the other applications. Bookmarks are rolled up into cumulative entries for multiple instances of bookmark links to the same URL. The Bookmarks seedlist API only includes delete entries when the last bookmark for a given URL is deleted. Otherwise, deleted content is removed from the rolled-up content of an update action.
insert
Indicates that the associated document was added.
update
Indicates that the associated document was changed.
wplcAction = element wplc:action { attribute do {"insert" | "delete" | "update"} }
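A crawler typically branches on the do attribute to decide whether to add, refresh, or remove a document in its index. The following Python sketch illustrates that dispatch using only the standard library. It assumes that the atom prefix is bound to the standard Atom namespace (http://www.w3.org/2005/Atom), which this topic does not state, and the function name classify_entries and the choice to skip entries without a wplc:action element are illustrative assumptions, not part of the seedlist API.

```python
import xml.etree.ElementTree as ET

# Namespace bindings. The wplc URI comes from this topic; the Atom URI is
# the standard Atom namespace and is assumed to be what the atom prefix means.
NS = {
    "atom": "http://www.w3.org/2005/Atom",
    "wplc": "http://www.ibm.com/wplc/atom/1.0",
}

# Small hand-written sample, not real server output.
SAMPLE = """\
<atom:feed xmlns:atom="http://www.w3.org/2005/Atom"
           xmlns:wplc="http://www.ibm.com/wplc/atom/1.0">
  <atom:entry>
    <atom:id>doc-1</atom:id>
    <wplc:action do="insert"/>
  </atom:entry>
  <atom:entry>
    <atom:id>dad9504f-50c4-47c5-b274-7245b21cd308</atom:id>
    <wplc:action do="delete"/>
  </atom:entry>
</atom:feed>
"""


def classify_entries(feed_xml):
    """Group entry ids by the value of wplc:action/@do (hypothetical helper)."""
    root = ET.fromstring(feed_xml)
    buckets = {"insert": [], "update": [], "delete": []}
    for entry in root.findall("atom:entry", NS):
        action = entry.find("wplc:action", NS)
        if action is None:
            continue  # assumption: ignore entries that carry no action
        do = action.get("do")
        if do not in buckets:
            raise ValueError(f"unexpected action: {do!r}")
        entry_id = entry.findtext("atom:id", default="", namespaces=NS).strip()
        buckets[do].append(entry_id)
    return buckets


if __name__ == "__main__":
    print(classify_entries(SAMPLE))
```

An indexer could then remove every id in the delete bucket and index or re-index the ids in the other two buckets.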
<wplc:fieldInfo>
Makes the seedlist Atom feeds self-describing by listing the content fields provided by each IBM Connections seedlist. Field metadata also gives the indexer hints about how the fields should be treated. Field metadata can be part of the feed definition or of the entry definition. If it appears in both, the entry definition takes precedence. Field metadata can be defined for custom fields and also for the following existing Atom elements: author, summary, title, published, and updated.
The <wplc:fieldInfo> element contains the following attributes:
contentSearchable
Indicates whether to search the value of the field. When true, the value of the field is available for search, even if no specific field is specified in the query. For example, if there is a document with the title "my title," and this attribute is set to true for the title, the API matches queries like "my title." Values are true or false. The default value is false.
description
Text that describes the content of the field.
fieldSearchable
Indicates whether the value of the field will be available for search in the context of the field. For example, if there is a document with the title "my title," and this attribute is set to true for the title, the API matches queries that search for the value of that specific field, such as "title:\"my title\"" or "title:my". Values are true or false. The default value is false.
id
Text that identifies the field that this information is associated with. This ID corresponds to the id attribute in the <wplc:field> element.
name
Text that identifies the name of the field. This value is not displayed anywhere.
parametric
Indicates whether this field should answer range constraints in the query. Setting the value to true can incur additional indexing, storage, and query processing overhead on the search index, so set it with caution. IBM Connections seedlists do not currently set this parameter to true. Values are true or false. The default value is false.
returnable
Indicates whether the content of the field should be returned together with search results. This operation often means additional overhead in indexing and query processing. Values are true or false. The default value is false.
sortable
Indicates whether the result set can be sorted by the value of this field. Values are true or false. The default value is false.
supportsExactMatch
Indicates whether the field should be stored as is. This attribute only takes effect if contentSearchable or fieldSearchable is set to true. Values are true or false. The default value is false.
type
Identifies the type of content in the field from among the following options:
- Boolean
- date
- double
- fields
- int
- long
- string
wplcFieldInfo = element wplc:fieldInfo {
  attribute id { text },
  attribute name { text }?,
  attribute description { text }?,
  attribute type { "Boolean" | "string" | "date" | "double" | "int" | "long" | "fields" }?,
  attribute contentSearchable { "true" | "false" }?,
  attribute fieldSearchable { "true" | "false" }?,
  attribute parametric { "true" | "false" }?,
  attribute returnable { "true" | "false" }?,
  attribute sortable { "true" | "false" }?,
  attribute supportsExactMatch { "true" | "false" }?,
  wplcFieldInfos ?
}
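The two rules above that are easiest to get wrong are the defaults (every Boolean attribute is false when omitted) and the precedence of entry-level metadata over feed-level metadata. The following Python sketch shows one way to apply both. The helper names read_field_info and effective_field_info are hypothetical, the Atom namespace URI is assumed to be the standard one, and nested fieldInfo elements (wplcFieldInfos) are not handled.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"        # assumed binding of the atom prefix
WPLC = "http://www.ibm.com/wplc/atom/1.0"   # namespace given in this topic

# Boolean attributes of wplc:fieldInfo; each defaults to false when absent.
FLAGS = (
    "contentSearchable", "fieldSearchable", "parametric",
    "returnable", "sortable", "supportsExactMatch",
)

# Small hand-written sample, not real server output.
SAMPLE = """\
<atom:feed xmlns:atom="http://www.w3.org/2005/Atom"
           xmlns:wplc="http://www.ibm.com/wplc/atom/1.0">
  <wplc:fieldInfo id="title" type="string" returnable="true"/>
  <wplc:fieldInfo id="summary" type="string" contentSearchable="true"/>
  <atom:entry>
    <wplc:fieldInfo id="title" type="string" returnable="false"/>
  </atom:entry>
</atom:feed>
"""


def read_field_info(parent):
    """Read the fieldInfo elements that are direct children of parent."""
    infos = {}
    for el in parent.findall(f"{{{WPLC}}}fieldInfo"):
        info = {flag: el.get(flag, "false") == "true" for flag in FLAGS}
        info["type"] = el.get("type")  # optional; no default is documented
        info["name"] = el.get("name")
        infos[el.get("id")] = info
    return infos


def effective_field_info(feed, entry):
    """Merge feed-level and entry-level metadata; the entry wins on conflict."""
    merged = read_field_info(feed)
    merged.update(read_field_info(entry))
    return merged


if __name__ == "__main__":
    feed = ET.fromstring(SAMPLE)
    entry = feed.find(f"{{{ATOM}}}entry")
    for field_id, info in effective_field_info(feed, entry).items():
        print(field_id, info)
```

In this sketch an entry-level fieldInfo replaces the feed-level one for the same id as a whole. Whether the server intends attribute-by-attribute merging instead is not stated in this topic.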
<wplc:field>
Provides the content to be indexed. The following types of fields can be included in seedlist responses:
Complex
Contains other wplc:field elements.
Primitive
Contains a simple text value.
Each wplc:field element contains the following attribute:
id
Text value that identifies the field and associates it with the appropriate <wplc:fieldInfo> element.
wplcFields = { wplcField+ }
wplcField = element wplc:field {
  attribute id { text },
  extensionAttribute*,
  ( text | wplcFields | atomContent )
}
extensionAttribute = attribute * - (wplc:* | local:*) { text }
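Because a complex field nests other wplc:field elements, a consumer usually reads fields recursively. The following Python sketch illustrates one way to do that: a primitive field becomes a string and a complex field becomes a dictionary keyed by the id of each child. The function name field_value and the sample field ids are hypothetical, the atomContent alternative from the schema is not handled, and child fields that repeat the same id would overwrite one another in this simplified representation.

```python
import xml.etree.ElementTree as ET

WPLC = "http://www.ibm.com/wplc/atom/1.0"  # namespace given in this topic

# Small hand-written sample, not real server output.
SAMPLE = """\
<wplc:field xmlns:wplc="http://www.ibm.com/wplc/atom/1.0" id="FIELD_OWNER">
  <wplc:field id="FIELD_OWNER_NAME">quikradm</wplc:field>
  <wplc:field id="FIELD_OWNER_NOTE">
    This is comment for file f1040prh.pdf
  </wplc:field>
</wplc:field>
"""


def field_value(el):
    """Return a str for a primitive field or a dict for a complex field."""
    children = el.findall(f"{{{WPLC}}}field")
    if children:
        # Complex field: recurse into each nested wplc:field element.
        return {child.get("id"): field_value(child) for child in children}
    # Primitive field: plain text, trimmed of layout whitespace.
    return (el.text or "").strip()


if __name__ == "__main__":
    root = ET.fromstring(SAMPLE)
    print(root.get("id"), field_value(root))
```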
<wplc:acl>
Specifies a security token that can be used later to filter search results. This element contains a text value that represents the token as an opaque string. Add one or more security tokens to a feed or entry to prefilter search results. If the <wplc:acl> element is missing, the crawler can assume that the content is publicly available and can be displayed to unauthenticated users. A user's security tokens can be retrieved at search time to enable searching over access-controlled or secure documents. The APIs that retrieve a user's tokens return the same opaque security token strings that are used in the <wplc:acl> element of the seedlist document. See Securing access to seedlist SPIs for more details.
wplcACL = element wplc:acl { text }
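The prefiltering described above can be pictured as a set comparison between the tokens stored with a document and the tokens retrieved for the searching user. The following Python sketch illustrates the idea. The rule that one shared token is enough to grant visibility is an assumption made for illustration, as are the helper names entry_tokens and is_visible. Only entry-level tokens are considered, and the Atom namespace URI is assumed to be the standard one.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"        # assumed binding of the atom prefix
WPLC = "http://www.ibm.com/wplc/atom/1.0"   # namespace given in this topic

# Small hand-written sample, not real server output.
SAMPLE = """\
<atom:feed xmlns:atom="http://www.w3.org/2005/Atom"
           xmlns:wplc="http://www.ibm.com/wplc/atom/1.0">
  <atom:entry>
    <atom:id>restricted-doc</atom:id>
    <wplc:acls>
      <wplc:acl>uid=quikradm,o=default organization</wplc:acl>
      <wplc:acl>cn=wpsadmins,o=default organization</wplc:acl>
    </wplc:acls>
  </atom:entry>
  <atom:entry>
    <atom:id>public-doc</atom:id>
  </atom:entry>
</atom:feed>
"""


def entry_tokens(entry):
    """Collect the opaque token strings from every wplc:acl under the entry."""
    return {el.text.strip() for el in entry.iter(f"{{{WPLC}}}acl") if el.text}


def is_visible(entry, user_tokens):
    """Public when no acl is present; otherwise require a shared token."""
    tokens = entry_tokens(entry)
    if not tokens:
        return True  # missing wplc:acl: content is treated as public
    return bool(tokens & set(user_tokens))  # assumption: any match suffices


if __name__ == "__main__":
    feed = ET.fromstring(SAMPLE)
    user = ["cn=wpsadmins,o=default organization"]
    for entry in feed.findall(f"{{{ATOM}}}entry"):
        print(entry.findtext(f"{{{ATOM}}}id"), is_visible(entry, user))
```

The tokens are compared as exact strings because the topic describes them as opaque; the sketch does not try to interpret them as distinguished names.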
Searching file attachments
The seedlist format enables indexing of content stored in file attachments in the Files and Wikis applications through the use of <atom:link> and <atom:content> elements. The <atom:entry> element representing the file attachment has an <atom:content> element in which the src attribute specifies the location from which the file can be downloaded by the crawler. Additional information about the file is present in the corresponding <atom:link> element in the entry. For example:
<atom:entry xml:lang="en">
  <atom:id>13bcb9ed-bca5-4dc8-b60d-d5b90144c8a8</atom:id>
  <atom:link
    href="http://www.seedlistsample.com:9080/files/app/file/13bcb9ed-bca5-4dc8-b60d-d5b90144c8a8"
    rel="via" type="application/pdf" hreflang="en"
    title="SWGAB web 2.0 Series IBM Connections Wikis & File Sharing.pdf"/>
  <atom:content xml:lang="en" type="application/pdf"
    src="https://www.seedlistsample.com:9443/files/seedlist/document/contents/13bcb9ed-bca5-4dc8-b60d-d5b90144c8a8"/>
  ...
</atom:entry>
The following XML snippet is an example of a seedlist response:
<atom:feed>
  <atom:id>
    http://www.seedlistsample.com:10038/seedlist/myserver?Action=GetChildren&Format=com.ibm.lotus.search.plugins.seedlist.ATOMFormatterFactory&Locale=en&Range=3&SeedlistId=com.ibm.jcr,localhost:e802bd80468869da9d93dd95f6dc8031&Source=com.ibm.lotus.search.plugins.seedlist.retriever.jcr.JcrRetrieverFactory&Start=10
  </atom:id>
  <atom:link
    href="/seedlist/myserver?Range=3&Format=com.ibm.lotus.search.plugins.seedlist.ATOMFormatterFactory&Locale=en&SeedlistId=com.ibm.jcr%2Clocalhost%3Ae802bd80468869da9d93dd95f6dc8031&Source=com.ibm.lotus.search.plugins.seedlist.retriever.jcr.JcrRetrieverFactory&Action=GetDocuments&Start=13&State=MTAwfC9jb250ZW50Um9vdC9pY2%3D%3D"
    rel="next" type="application/atom+xml" title="Next page"/>
  <atom:link
    href="/seedlist/myserver?Range=3&Format=com.ibm.lotus.search.plugins.seedlist.ATOMFormatterFactory&Locale=en&SeedlistId=com.ibm.jcr%2Clocalhost%3Ae802bd80468869da9d93dd95f6dc8031&Source=com.ibm.lotus.search.plugins.seedlist.retriever.jcr.JcrRetrieverFactory&Action=GetDocuments&Start=7"
    rel="previous" type="application/atom+xml" title="Previous page"/>
  <atom:generator xml:lang="en" version="1.0">
    Seedlist Service Backend System
  </atom:generator>
  <wplc:timestamp>AAABFoVf8zs=</wplc:timestamp>
  <atom:category term="ContentSourceType/PDM"
    scheme="com.ibm.wplc.taxonomy://application_taxonomy" label="PDM"/>
  <atom:title xml:lang="en">
    PDM Retriever : 5 entries of Seedlist Root
  </atom:title>
  <atom:updated>2007-11-28T10:29:26+02:00</atom:updated>
  <wplc:action do="update"/>
  <wplc:fieldInfo id="title" name="Title" description="Title" type="string"
    contentSearchable="true" fieldSearchable="true" parametric="false"
    returnable="true" sortable="false" supportsExactMatch="true"/>
  <wplc:fieldInfo id="summary" name="Description" description="Description" type="string"
    contentSearchable="true" fieldSearchable="true" parametric="false"
    returnable="true" sortable="false" supportsExactMatch="true"/>
  <wplc:fieldInfo id="updated" name="Last Update Date" description="Last modified date" type="date"
    contentSearchable="true" fieldSearchable="true" parametric="false"
    returnable="true" sortable="false" supportsExactMatch="true"/>
  <wplc:fieldInfo id="published" name="Creation Date" description="Creation date" type="date"
    contentSearchable="true" fieldSearchable="true" parametric="false"
    returnable="true" sortable="false" supportsExactMatch="true"/>
  <wplc:fieldInfo id="author" name="Author" description="Author" type="string"
    contentSearchable="true" fieldSearchable="true" parametric="false"
    returnable="true" sortable="false" supportsExactMatch="true"/>
  <wplc:fieldInfo id="FIELD_PDM_COMMENT" name="PDM Comment" description="PDM Comment" type="string"
    contentSearchable="true" fieldSearchable="true" parametric="false"
    returnable="true" sortable="false" supportsExactMatch="true"/>

  <!-- Seedlist2 : toplevel0 -->
  <atom:entry xml:lang="en">
    <atom:id>
      com.ibm.jcr,localhost!25803e8046886a0d9decdd95f6dc8031
    </atom:id>
    <atom:link
      href="/seedlist/myserver?Range=3&Format=com.ibm.lotus.search.plugins.seedlist.ATOMFormatterFactory&Locale=en&SeedlistId=com.ibm.jcr%2Clocalhost%3A25803e8046886a0d9decdd95f6dc8031&Source=com.ibm.lotus.search.plugins.seedlist.retriever.jcr.JcrRetrieverFactory&Start=0&Action=GetChildren"
      rel="alternate" type="application/atom+xml" title="Seedlist Children URL"/>
    <atom:link wplc:repeatable="false"
      href="/seedlist/myserver?Range=3&Format=com.ibm.lotus.search.plugins.seedlist.ATOMFormatterFactory&Locale=en&SeedlistId=com.ibm.jcr%2Clocalhost%3A25803e8046886a0d9decdd95f6dc8031&Source=com.ibm.lotus.search.plugins.seedlist.retriever.jcr.JcrRetrieverFactory&Start=0&Action=GetDocuments"
      rel="via" type="application/atom+xml" title="Seedlist Documents URL"/>
    <wplc:securityId>
      6QReDeHHCG3QS6H1E03QO6O1ECJGHCK9ECMQGC4JPIJQOCM1P66S06J9C0
    </wplc:securityId>
    <atom:author>
      <atom:name>quikradm</atom:name>
      <atom:uri>uid=quikradm,o=default organization</atom:uri>
      <atom:email>quikradm@us.ibm.com</atom:email>
    </atom:author>
    <atom:category term="com.ibm.jcr,localhost!e802bd80468869da9d93dd95f6dc8031"
      scheme="com.ibm.wplc.taxonomy://location_taxonomy" label="pdfs"/>
    <atom:category term="com.ibm.jcr,localhost!1611c180468868649d8cdd95f6dc8031"
      scheme="com.ibm.wplc.taxonomy://location_taxonomy" label="Performance"/>
    <atom:category term="com.ibm.jcr,localhost!92644b804679c0cb9a3dbfb7b3c312f0"
      scheme="com.ibm.wplc.taxonomy://pdm_categories_taxonomy" label=".public"/>
    <atom:category term="com.ibm.jcr,localhost!56a2010046c2db58ac80af58af525af5"
      scheme="com.ibm.wplc.taxonomy://pdm_categories_taxonomy" label="movies"/>
    <atom:title xml:lang="en">toplevel0</atom:title>
    <atom:updated>2007-08-21T18:03:45+03:00</atom:updated>
    <wplc:action do="update"/>
    <wplc:acls>
      <wplc:acl>uid=quikradm,o=default organization</wplc:acl>
      <wplc:acl>cn=wpsadmins,o=default organization</wplc:acl>
    </wplc:acls>
    <atom:published>2007-08-20T03:39:23+03:00</atom:published>
    <atom:summary xml:lang="en">toplevel0</atom:summary>
  </atom:entry>

  <!-- Seedlist1.DOC1 : f1040prh.pdf -->
  <atom:entry xml:lang="en">
    <atom:id>
      com.ibm.jcr,localhost:efeb960046886d29a393e795f6dc8031
    </atom:id>
    <atom:link href="/lotus/mypoc?uri=dm:efeb960046886d29a393e795f6dc8031&verb=view"
      rel="via" type="application/pdf" hreflang="en" title="f1040prh.pdf"/>
    <atom:content xml:lang="en" type="application/pdf"
      src="/lotus/mypoc?uri=dm:efeb960046886d29a393e795f6dc8031&verb=download"/>
    <wplc:securityId>
      6QReDe5JPA6H57M1C03QO6O1EC3I96P9O6JSC65RDIJQOCM1P66S06J9C0
    </wplc:securityId>
    <atom:author>
      <atom:name>quikradm</atom:name>
      <atom:uri>uid=quikradm,o=default organization</atom:uri>
      <atom:email>quikradm@us.ibm.com</atom:email>
    </atom:author>
    <atom:category term="com.ibm.jcr,localhost!25803e8046886a0d9decdd95f6dc8031"
      scheme="com.ibm.wplc.taxonomy://location_taxonomy" label="toplevel0"/>
    <atom:category term="com.ibm.jcr,localhost!e802bd80468869da9d93dd95f6dc8031"
      scheme="com.ibm.wplc.taxonomy://location_taxonomy" label="pdfs"/>
    <atom:category term="com.ibm.jcr,localhost!1611c180468868649d8cdd95f6dc8031"
      scheme="com.ibm.wplc.taxonomy://location_taxonomy" label="Performance"/>
    <atom:title>f1040prh.pdf</atom:title>
    <atom:updated>2007-01-17T11:19:32+02:00</atom:updated>
    <wplc:action do="insert"/>
    <wplc:acls>
      <wplc:acl>uid=quikradm,o=default organization</wplc:acl>
      <wplc:acl>cn=wpsadmins,o=default organization</wplc:acl>
    </wplc:acls>
    <wplc:field xml:lang="en" id="FIELD_PDM_COMMENT">
      This is comment for file f1040prh.pdf
    </wplc:field>
    <atom:published>2007-01-17T11:19:32+02:00</atom:published>
    <atom:summary>f1040prh.pdf</atom:summary>
  </atom:entry>

  <!-- Seedlist1.DOC2 : f1098e03.pdf -->
  <atom:entry xml:lang="en">
    <atom:id>
      com.ibm.jcr,localhost:3bfeb80046886a569e8bdf95f6dc8031
    </atom:id>
    <atom:link href="/lotus/mypoc?uri=dm:3bfeb80046886a569e8bdf95f6dc8031&verb=view"
      rel="via" type="application/pdf" hreflang="en" title="f1098e03.pdf"/>
    <atom:link href="/lotus/mypoc?uri=dm:960b860046886b079fbedf95f6dc8031&verb=download"
      rel="related" type="application/pdf" hreflang="en" title="f2290_03.pdf"/>
    <atom:content xml:lang="en" type="application/pdf"
      src="/lotus/mypoc?uri=dm:3bfeb80046886a569e8bdf95f6dc8031&verb=download"/>
    <wplc:securityId>
      6QReDePHD03H17M1C03QO6O1EC3H16N9EC6HLC4JPIJQOCM1P66S06J9C0
    </wplc:securityId>
    <atom:author>
      <atom:name>wpsadmin</atom:name>
      <atom:uri>uid=wpsadmin,o=default organization</atom:uri>
      <atom:email>wpsadmin@us.ibm.com</atom:email>
    </atom:author>
    <atom:category term="com.ibm.jcr,localhost!25803e8046886a0d9decdd95f6dc8031"
      scheme="com.ibm.wplc.taxonomy://location_taxonomy" label="toplevel0"/>
    <atom:category term="com.ibm.jcr,localhost!e802bd80468869da9d93dd95f6dc8031"
      scheme="com.ibm.wplc.taxonomy://location_taxonomy" label="pdfs"/>
    <atom:category term="com.ibm.jcr,localhost!1611c180468868649d8cdd95f6dc8031"
      scheme="com.ibm.wplc.taxonomy://location_taxonomy" label="Performance"/>
    <atom:title>f1098e03.pdf</atom:title>
    <atom:updated>2007-01-17T11:19:33+02:00</atom:updated>
    <wplc:action do="update"/>
    <wplc:acls>
      <wplc:acl>uid=wpsadmin,o=default organization</wplc:acl>
      <wplc:acl>cn=wpsadmins,o=default organization</wplc:acl>
    </wplc:acls>
    <atom:published>2006-08-21T18:03:45+03:00</atom:published>
    <atom:summary>f1098e03.pdf</atom:summary>
  </atom:entry>
</atom:feed>
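Two things a crawler needs from a response like the one above are the link to the next page (the Atom paging model) and, for file attachments, the download location in the src attribute of the atom:content element. The following Python sketch extracts both. The helper names next_page_href and attachment_sources are hypothetical, the Atom namespace URI is assumed to be the standard one, and the embedded sample is a shortened, hand-written feed in which ampersands are escaped as &amp;amp; so that a standard XML parser accepts it.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"  # assumed binding of the atom prefix

# Shortened hand-written sample, not real server output.
SAMPLE = """\
<atom:feed xmlns:atom="http://www.w3.org/2005/Atom">
  <atom:link href="/seedlist/myserver?Action=GetDocuments&amp;Start=13"
             rel="next" type="application/atom+xml" title="Next page"/>
  <atom:link href="/seedlist/myserver?Action=GetDocuments&amp;Start=7"
             rel="previous" type="application/atom+xml" title="Previous page"/>
  <atom:entry>
    <atom:id>folder-entry</atom:id>
  </atom:entry>
  <atom:entry>
    <atom:id>file-entry</atom:id>
    <atom:content type="application/pdf"
                  src="/lotus/mypoc?uri=dm:efeb9600&amp;verb=download"/>
  </atom:entry>
</atom:feed>
"""


def next_page_href(feed):
    """Return the href of the feed-level rel="next" link, or None if absent."""
    for link in feed.findall(f"{{{ATOM}}}link"):
        if link.get("rel") == "next":
            return link.get("href")
    return None


def attachment_sources(feed):
    """Return (src, type) for each entry whose atom:content has a src."""
    found = []
    for entry in feed.findall(f"{{{ATOM}}}entry"):
        content = entry.find(f"{{{ATOM}}}content")
        if content is not None and content.get("src"):
            found.append((content.get("src"), content.get("type")))
    return found


if __name__ == "__main__":
    feed = ET.fromstring(SAMPLE)
    print(next_page_href(feed))
    print(attachment_sources(feed))
```

A crawl loop would request the returned href, resolved against the server address, until no next link is present. Authentication and the actual HTTP requests are outside the scope of this sketch.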
Parent topic
Crawling data
Related tasks
Crawling data for the first time
Subsequently crawling data
Securing access to seedlist SPIs