Displaying Web Pages
Both "Connect" tags and URL requests can be used to retrieve data from external Web Pages. The syntax used to call the data is as follows:
MOD=WEB SRV=HTML ACTION=URL
Example:
http://[HOST]:[PORT]/wps/wcm/connect/?MOD=WEB&SRV=HTML&ACTION=http://www.server.com/page.htm
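The request format above can be assembled programmatically. The following is a minimal sketch, not part of the product: the function name `build_connect_url` and the sample host and port are assumptions made for illustration. It mirrors the documented example by appending the target URL as-is; whether the server also accepts a percent-encoded ACTION value is not stated here.

```python
def build_connect_url(host: str, port: int, target_url: str) -> str:
    """Build a request URL of the documented form (hypothetical helper).

    The target URL is appended unencoded, matching the example in this topic.
    """
    base = f"http://{host}:{port}/wps/wcm/connect/"
    return f"{base}?MOD=WEB&SRV=HTML&ACTION={target_url}"


if __name__ == "__main__":
    # Host and port are placeholder values chosen for this sketch.
    print(build_connect_url("localhost", 9081, "http://www.server.com/page.htm"))
```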
Processing Retrieved HTML
When "Connect" tags are processed, the data identified by the tag replaces the tag. For HTML content, Header elements are also retrieved. An HTML page that looks like this:
<HTML>
<HEAD>Original Header Code</HEAD>
<BODY>
Original Body Code
<connect MOD="WEB" SRV="HTML" ACTION="http://www.xyz.com">
Failure text
</connect>
</BODY>
</HTML>
would, after processing, look like this:
<HTML>
<HEAD>Original Header Code plus www.xyz.com Header Code</HEAD>
<BODY>Original Body Code plus www.xyz.com body code</BODY>
</HTML>
This is not the only processing that the Web Content Management application performs on the retrieved page. Any links and references to other URLs found in the retrieved document are adjusted so that, when a user follows them, the Web Content Management application also retrieves the data they represent.
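The header and body merge shown in the before and after pages above can be sketched as follows. This is an illustration of the described result only, not the product's implementation: the function `merge_retrieved` and its regex-based approach are assumptions, and real pages would need a proper HTML parser.

```python
import re


def merge_retrieved(page: str, retrieved: str) -> str:
    """Merge a retrieved HTML page into the page holding a connect tag (sketch)."""

    def inner(tag: str, html: str) -> str:
        # Return the content between <tag ...> and </tag>, or "" if absent.
        m = re.search(rf"<{tag}\b[^>]*>(.*?)</{tag}>", html, re.I | re.S)
        return m.group(1).strip() if m else ""

    head = inner("HEAD", retrieved)
    body = inner("BODY", retrieved)

    # The connect element, including its failure text, is replaced by the retrieved body.
    merged = re.sub(r"<connect\b[^>]*>.*?</connect>", lambda _: body,
                    page, count=1, flags=re.I | re.S)
    # The retrieved header code is appended to the original header.
    merged = re.sub(r"</HEAD>", lambda _: " " + head + "</HEAD>",
                    merged, count=1, flags=re.I)
    return merged
```

In this sketch the failure text is simply discarded on success; how the product decides when to show it is not covered here.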
Note: Web Content Management can be configured to adjust URLs only for certain hosts via the RedirectHosts option. URLs for hosts that are not to be redirected are left unchanged. See the Installation Guide for more details.
Example:
If a user is accessing http://www.myserver.com via the Web Content Management application, the full URL passed to the Web Content Management application might be:
http://[HOST]:[PORT]/wps/wcm/connect/?MOD=WEB&SRV=HTML&ACTION=http://www.myserver.com/main.html
Suppose the requested page, http://www.myserver.com/main.html, contains a relative link to a page named news.html, as follows:
<HTML>
<HEAD>
<TITLE>Main page</TITLE>
</HEAD>
<BODY>
...
<A HREF="news.html">Click here for news</A>
...
</BODY>
</HTML>
To ensure that news.html is accessed via the Web Content Management application, the application adjusts the HTML code in main.html so that it appears to the browser as follows:
<HTML>
<HEAD>
<TITLE>Main page</TITLE>
</HEAD>
<BODY>
...
<A HREF="http://[HOST]:[PORT]/wps/wcm/connect/?MOD=WEB&SRV=HTML&ACTION=http://www.myserver.com/news.html">Click here for news</A>
...
</BODY>
</HTML>
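The link adjustment in this example amounts to resolving each relative link against the retrieved page's URL and then prefixing the request syntax. A rough sketch follows, under stated assumptions: `qualify_link`, `rewrite_hrefs`, and the `redirect_hosts` parameter are hypothetical names, the host check is only a guess at how the RedirectHosts option behaves, and only HREF attributes are handled.

```python
import re
from urllib.parse import urljoin, urlsplit

# Prefix taken from the example above; [HOST] and [PORT] are left as placeholders.
CONNECT_PREFIX = "http://[HOST]:[PORT]/wps/wcm/connect/?MOD=WEB&SRV=HTML&ACTION="


def qualify_link(base_url, link, redirect_hosts=None):
    """Return the link rewritten to go through the connect URL (sketch)."""
    absolute = urljoin(base_url, link)  # resolve relative links against the page URL
    host = urlsplit(absolute).hostname
    # Assumption: links to hosts outside the redirect list are left unchanged.
    if redirect_hosts is not None and host not in redirect_hosts:
        return link
    return CONNECT_PREFIX + absolute


def rewrite_hrefs(html, base_url, redirect_hosts=None):
    """Rewrite every HREF="..." attribute value in the given HTML text."""
    def repl(m):
        return m.group(1) + qualify_link(base_url, m.group(2), redirect_hosts) + m.group(3)
    return re.sub(r'(\bHREF=")([^"]*)(")', repl, html, flags=re.I)
```

Calling `rewrite_hrefs('<A HREF="news.html">Click here for news</A>', "http://www.myserver.com/main.html")` yields the adjusted anchor shown in the example above.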
Note: Web Content Management can only merge content of the same MIME type, and merging is limited to HTML and XML. So, for example, plain text cannot be merged into HTML.
HTML page elements that are processed in this manner include:
A (link) tags: <A HREF="news.html"></A> (as above)
BODY tags: <BODY BACKGROUND="background.jpg"></BODY>
IMG (image) tags: <IMG SRC="arrow.jpg"></IMG>
Forms: <FORM METHOD="GET" ACTION="http://www.xy.com"></FORM>
LINK tags: <LINK TYPE="text/css" HREF="mystyle.css"></LINK>
FRAME tags: <FRAME SRC="main.html"></FRAME>
BASE tags: <BASE HREF="http://www.server.com/"></BASE>
The complete list of HTML tags that are processed is given in the connect.cfg file, under the <HtmlElement> section. If a particular element should not be processed, remove the reference to that element from connect.cfg.
Some examples of HTML elements in connect.cfg:
<HtmlElement>
<IMG value="com.ibm.workplace.Connect.data.xml.html.HTMLImage"/>
<HEAD value="com.ibm.workplace.Connect.data.xml.html.HTMLElement"/>
<BODY value="com.ibm.workplace.Connect.data.xml.html.HTMLBody"/>
<FORM value="com.ibm.workplace.Connect.data.xml.html.HTMLForm"/>
<LINK value="com.ibm.workplace.Connect.data.xml.html.HTMLLink"/>
<connect value="com.ibm.workplace.Connect.data.xml.html.HTMLConnect"/>
<connectEXCLUDE value="com.ibm.workplace.Connect.data.xml.html.HTMLConnectExclude"/>
</HtmlElement>
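To check which elements are currently listed, or to preview the effect of removing one, the <HtmlElement> section can be inspected with a short script. This is a sketch that assumes the section is well-formed XML when taken on its own; the helper names `processed_elements` and `remove_element` are hypothetical, and the sample uses a subset of the entries above.

```python
import xml.etree.ElementTree as ET

# A subset of the <HtmlElement> section shown above, used as sample input.
HTML_ELEMENT_SECTION = """
<HtmlElement>
  <BODY value="com.ibm.workplace.Connect.data.xml.html.HTMLBody"/>
  <FORM value="com.ibm.workplace.Connect.data.xml.html.HTMLForm"/>
  <LINK value="com.ibm.workplace.Connect.data.xml.html.HTMLLink"/>
</HtmlElement>
"""


def processed_elements(section_xml):
    """Map each listed element name to the value attribute of its entry."""
    root = ET.fromstring(section_xml.strip())
    return {child.tag: child.get("value") for child in root}


def remove_element(section_xml, tag):
    """Return the section with the entry for one element removed."""
    root = ET.fromstring(section_xml.strip())
    for child in list(root):  # copy the child list so removal is safe while iterating
        if child.tag == tag:
            root.remove(child)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(sorted(processed_elements(HTML_ELEMENT_SECTION)))
    print(sorted(processed_elements(remove_element(HTML_ELEMENT_SECTION, "FORM"))))
```

Edit a copy of connect.cfg rather than the live file when trying this, since the sketch does not preserve comments or formatting.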
The ConnectExclude Tag
The ConnectExclude tag is used to exclude elements from being processed. There are two parsing options:
Parse=true: Elements are parsed, but URLs are not qualified, regardless of configuration settings.
Parse=false: No processing is done. This is the default.
Using Absolute URLs
By default, any links generated using "Connect" tags are relative. However, if you are viewing Web Content Management content via a WebSphere Portal Server portlet, then these links will appear to be broken. To use URLs generated by "Connect" tags in a portlet, the URLs must be absolute. The Web Content Management framework can be configured to use absolute URLs instead of relative URLs. To do this, the following setting must be added to the first section of connect.cfg:
<AlwaysUseAbsoluteURLs value=true />
IBM Workplace Web Content Management - V5.1.0.1 -