URL normalization for search of portal pages by external search engines
The following properties are used to configure the normalization of the URL of the portal. URL normalization is required to enable external search engines to crawl the content of the portal.
URL normalization...
- Removes all elements from a portal page URL used for portal internal purposes. For example, it removes actions that are coded into the URL and change the portal state.
- Reduces the portal page URL to those elements required for a crawler to read the URL and crawl the portal page.
- com.ibm.wps.state.outputmediators.OutputMediatorFactory.normalization_xsl_file = ( UrlNormalization_MIN.xsl )
- XSLstyle sheet file that defines the transformation that should be used in order to normalize the portal URL. This property needs to be set all on one line and concatenated. The default value is UrlNormalization_MIN.xsl. The following two files are available to allow for a minimum or maximum transformation:
- UrlNormalization_MIN.xsl
- This XSL style sheet contains the states for portlet-mode, window-state, renderparameters, selection, and locale in the normalized URL. This transformation represents the minimum set of states that have to be defined in the URL. All other states are removed from the URL. Default.
- UrlNormalization_MAX.xsl
- This XSL style sheet contains the states for portlet-mode, window-state, renderparameters, selection, solo, locale, and screen-template. This maximum transformation represents the set of states that can be defined in a normalized URL for a crawler. All other states are removed from the URL.
The meaning of the different states listed for the minimum and maximum normalization style sheets is as follows:
portlet-mode Allow a portlet to display a different user interface, depending on the task the user performs with the portlet. A portlet has five modes of display: view, help, edit, edit_defaults, config. window-state Portlet states allow users to change how the portlet window is displayed within the portal. Users can choose from three different states: maximized, minimized, normal. renderparameters Parameters set to render a portal page. selection Selected portal page. solo A portlet can also be displayed in solo state. Solo state hides the portal theme elements, such as a that is, page navigation, or tool bar. locale Language in which the page is presented. screen-template Screen used on the portal page. theme-template Theme used on the portal page.
We can also set up our own URL normalization. To implement a URL normalization different than those provided by the two XSL style sheets that come with the portal, create our own XSL style sheet and set it as the value for the URL normalization parameter:
- Here is an example for creating our own XSL style sheet:
<?xml version="1.0" encoding="UTF-8"?> <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"> <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/> <xsl:template match="text()"> </xsl:template> <!-- Traverse through the tree starting at the root element --> <xsl:template match="root"> <xsl:copy> <xsl:copy-of select="@*"/> <!-- Search for the state node with the attribute type = navigational --> <xsl:apply-templates select="state[@type='navigational']"/> </xsl:copy> </xsl:template> <!-- Selection of all states which should stay coded in the URL --> <!-- Allowed States: portlet-mode, window-state, renderparameters (param, value, text), selection, solo, locale , screen-template --> <xsl:template match="state"> <xsl:copy> <xsl:copy-of select="@*"/> <xsl:apply-templates select=" . . . "/> . . . </xsl:copy> </xsl:template> . . . . . . . . . </xsl:stylesheet>
- Set the name of the new XSL style sheet as the value for the URL normalization parameter:
com.ibm.wps.state.outputmediators.OutputMediatorFactory.normalization_xsl_file = UrlNormalization_Your_Own_Style_Sheet_File_Name.xsl
Parent State Manager Service