Web Services Made Easy

A couple of weeks ago, while I was on my way home, my cell phone rang and I was greeted by one of my favorite customers, who sounded like he had had better days. He had just left a meeting with the CIO and received his annual development budget for the following year. The problem was that the CIO was unable to justify a new set of Web service initiatives around a set of just-completed internal Web sites.

The CIO and upper management felt that it was too early to redevelop these sites. After all, as he explained, "the users had just been trained and were just starting to take advantage of these sites." It wasn't that they failed to see the technical advantages of Web services; the business value of rebuilding so soon just wasn't there. "Until we can get some return on our investments for these sites, they will stay as they are," was how the CIO later phrased it to me.

During our conversation I started to realize that we all too often forget how important it is to leverage existing assets in infrastructure and technology - and that we can use a combination of Web services and the .NET Framework to realize that. As I did for this customer, I will demonstrate how, using the built-in HTML parsing solution within .NET, you can parse existing content from a remote HTML page and then programmatically expose the resulting data in a Web service.

Developing a Web service that parses content follows a different paradigm from traditional ASP.NET Web service development. In both cases the service is described by a Web Services Description Language (WSDL) file. The real difference is that with traditional ASP.NET development we never worry about generating the WSDL; the framework handles that for us. With a parser-based service we spend our time almost exclusively on creating the WSDL by hand.

The trick, as I will show next, is that additional XML elements are added to the WSDL to specify both the input parameters and the data returned from the parsed page: you provide a target URL and a regular expression that retrieves the requested data. Even with these additional elements, the resulting XML document must still adhere to the WSDL specification (www.w3.org/TR/wsdl). Once the WSDL file is created, a utility that ships with the .NET Framework (wsdl.exe) generates the proxy class for ASP.NET applications. This built-in support is what allows companies like my customer's to easily transition their existing investments in Web sites into Web services.

To demonstrate this technique, I created a simple HTML page that I will expose as a Web service callable from an ASP.NET application. There are two main caveats that I want to pass on. First, always make sure that you get proper permission before trying this on a site. Second, always remember that any change to the layout of the target Web pages can break the Web service, because the patterns depend on that layout. In this article, I will show how you can retrieve both the <TITLE> and the <H1> elements of this simple document.

<html>
<head>
<title>Sample title</title>
</head>
<body>
<H1>Some Heading</H1>
</body>
</html>

Creating Custom WSDL
I like to think of WSDL as an XML format that describes the network services offered by a server. A WSDL file identifies the services the server provides and the set of operations each service supports. For each operation, it describes the format the client must follow when making a request. The document therefore acts as a contract between the two sides: the server commits to providing a service only if the client sends a properly formatted request (typically a SOAP message). With a parsing service, both the parsing instructions and the implementation details are part of the WSDL document, and the two combined return the requested information.

Within Visual Studio .NET, creating a custom WSDL file is fairly easy but not completely straightforward. The problem is that VS.NET doesn't directly support the creation of a WSDL file as part of its standard wizards. To add a WSDL file to an existing ASP.NET application, add a text file and then rename it with a .wsdl extension. Once this is done you're ready to add the necessary XML elements.

Within the WSDL file there are a couple of basic elements. First is the <service> element. A service is a set of <port> elements, each of which associates a physical location (a URL) with a <binding> element. Even though this is a one-to-one relationship, you can specify additional <port> elements for the same <binding>; these are used for alternate locations. It isn't uncommon to have multiple <service> elements within a document. This provides a couple of features, including the ability to group HTTP ports in one service and SMTP ports in another, so that client applications can search for the specific <service> elements they need. It also provides a built-in redirection mechanism: a client application can redirect requests to another <service> element and continue processing without any changes. For our sample I created a <service> element whose port points to the local machine. Obviously, within a production application you would need to reset the URL to a valid location.

<service name="GetTitle">
  <port name="GetTitleHttpGet" binding="s0:GetTitleHttpGet">
    <http:address location="http://localhost/WebInfo" />
  </port>
</service>

Within a WSDL document the "name" attribute of the <service> element is used to uniquely distinguish one service from another. Names become even more important when you have multiple ports in a service: the name attribute of each <port> makes it unique and distinguishable from the others.
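
The relationship between a service, its ports, and their bindings can be seen by loading the fragment above with the System.Xml classes and listing each port. This is only a sketch under stated assumptions: the <definitions> wrapper, the module name, and the http://tempuri.org/ URI bound to the s0 prefix are not part of the original sample and are added here so the fragment parses on its own.

```vb
Imports System
Imports System.Xml

Module ServicePortDemo
    Sub Main()
        ' The <service> fragment from the article, wrapped in a <definitions>
        ' root. The s0 namespace URI is an assumption made for this sketch.
        Dim wsdl As String = _
            "<definitions xmlns='http://schemas.xmlsoap.org/wsdl/'" & _
            " xmlns:http='http://schemas.xmlsoap.org/wsdl/http/'" & _
            " xmlns:s0='http://tempuri.org/'>" & _
            "<service name='GetTitle'>" & _
            "<port name='GetTitleHttpGet' binding='s0:GetTitleHttpGet'>" & _
            "<http:address location='http://localhost/WebInfo' />" & _
            "</port></service></definitions>"

        Dim doc As New XmlDocument()
        doc.LoadXml(wsdl)

        ' XPath needs prefixes for the WSDL and HTTP binding namespaces.
        Dim ns As New XmlNamespaceManager(doc.NameTable)
        ns.AddNamespace("w", "http://schemas.xmlsoap.org/wsdl/")
        ns.AddNamespace("h", "http://schemas.xmlsoap.org/wsdl/http/")

        ' Each port ties one binding to one physical location.
        For Each port As XmlNode In doc.SelectNodes("//w:service/w:port", ns)
            Dim address As XmlNode = port.SelectSingleNode("h:address", ns)
            Console.WriteLine("{0} -> {1} at {2}", _
                port.Attributes("name").Value, _
                port.Attributes("binding").Value, _
                address.Attributes("location").Value)
        Next
    End Sub
End Module
```

Run as a console application, this lists the single port, its binding, and its address. A second <port> for the same binding would show up as one more line with a different location.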

Within our WSDL file we also have the <message> elements. These are used to define the input and output parameters. Within this element is a <part> child element that represents a particular parameter. A <part> carries a name attribute, which contains the unique name of the parameter being passed, and a type or element attribute, which identifies the data type of that parameter. WSDL isn't limited to simple types only. If you want to define more complex types using XSD, they can be defined within the <types> section of the service description and then referenced as the data type of the parameter. For our example I am using a simple string and defining "Body" as the parameter name.

<message name="TestHeadersHttpGetOut">
<part name="Body" element="s0:string"/>
</message>

Using Regular Expressions
Of course, all of these elements are needed for a properly formatted WSDL document, but the most important element for parsing is the <match> element. This element contains the actual parsing instruction and the data the .NET Framework requires to properly generate the proxy classes. The <match> element is a child of the namespace-qualified <text> element, which sits inside the <output> element of an <operation> within a specific <binding>. The <match> element supports a variety of attributes (see Table 1).

 

By far the most important is the pattern attribute. This contains a regular expression syntax pattern that will be applied against the parsed page and will determine the return value. By definition, a regular expression is a series of characters that define a pattern. The pattern is then compared against a target string to determine whether there is a match to the pattern in the target string.

The real power of these expressions is in the use of metacharacters to indicate character positioning, grouping and even repetition. The easiest example of a metacharacter is the "*" from the old DOS days. The .NET Framework contains a fairly extensive set of expressions that can be used when parsing pages. For more information and examples of syntax, take a look at the .NET SDK. For our example, I attempted to locate both the <TITLE> and the <H1> tags within the base HTML elements.

<output>
  <text xmlns="http://microsoft.com/wsdl/mime/textMatching/">
    <match name='Title' pattern='TITLE&gt;(.*?)&lt;'/>
    <match name='H1' pattern='H1&gt;(.*?)&lt;'/>
  </text>
</output>
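
To see what these two patterns capture, they can be tried directly against the sample page with the System.Text.RegularExpressions classes. This is a minimal sketch, not the framework's own parsing code: it assumes the parsing support applies each pattern much as Regex.Match does, the module name is made up, and RegexOptions.IgnoreCase is switched on so that TITLE also matches the lowercase <title> tag.

```vb
Imports System
Imports System.Text.RegularExpressions

Module MatchPatternDemo
    Sub Main()
        ' The sample page from this article, flattened into one string.
        Dim html As String = _
            "<html><head><title>Sample title</title></head>" & _
            "<body><H1>Some Heading</H1></body></html>"

        ' Same patterns as the <match> elements. IgnoreCase is an assumption
        ' added here so that TITLE also matches the lowercase <title> tag.
        Dim title As Match = Regex.Match(html, "TITLE>(.*?)<", RegexOptions.IgnoreCase)
        Dim heading As Match = Regex.Match(html, "H1>(.*?)<", RegexOptions.IgnoreCase)

        ' Groups(1) holds the text captured by the parentheses.
        Console.WriteLine(title.Groups(1).Value)    ' Sample title
        Console.WriteLine(heading.Groups(1).Value)  ' Some Heading

        ' Without the "?" the capture is greedy and runs on to the last "<".
        Dim greedy As Match = Regex.Match(html, "TITLE>(.*)<", RegexOptions.IgnoreCase)
        Console.WriteLine(greedy.Groups(1).Value.Length > title.Groups(1).Value.Length)  ' True
    End Sub
End Module
```

The "?" after ".*" is what keeps each capture short: it stops at the first "<" that follows the tag instead of swallowing the rest of the page.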

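One thing I learned while writing this sample is that case sensitivity is important. The sample page, for instance, writes <title> in lowercase and <H1> in uppercase. So, for this example and your own code, make sure that you either turn on case insensitivity (the ignoreCase attribute of the <match> element) or know exactly how the HTML tags are written.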
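
The effect of case sensitivity is easy to reproduce outside the Web service. The sketch below runs the Title pattern against the lowercase <title> tag of the sample page twice, once with default options and once with RegexOptions.IgnoreCase. It uses the Regex class directly, on the assumption that the parsing support treats case the same way; the module and variable names are illustrative only.

```vb
Imports System
Imports System.Text.RegularExpressions

Module CaseSensitivityDemo
    Sub Main()
        ' The title line of the sample page, written in lowercase.
        Dim html As String = "<title>Sample title</title>"

        ' Default options are case sensitive, so TITLE does not match title.
        Dim strict As Match = Regex.Match(html, "TITLE>(.*?)<")

        ' IgnoreCase lets the uppercase pattern match the lowercase tag.
        Dim relaxed As Match = Regex.Match(html, "TITLE>(.*?)<", RegexOptions.IgnoreCase)

        Console.WriteLine(strict.Success)           ' False
        Console.WriteLine(relaxed.Success)          ' True
        Console.WriteLine(relaxed.Groups(1).Value)  ' Sample title
    End Sub
End Module
```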

Generating Proxy Classes
The job of the service description file is to define how to communicate with the Web service. XML Web services allow communication over a network using a variety of protocols. In the most common case, the client and the Web service communicate using SOAP messages that encapsulate both the input and the output parameters as XML. It is up to the proxy class of a Web service client to handle the work of mapping parameters to the XML elements defined within the service description file and then sending the message over the network.

Within the .NET Framework a proxy class is generated using the Wsdl.exe utility. This utility examines the WSDL file and creates proxy classes that can be invoked to communicate with the target Web service over the network. The service in turn processes both the incoming and outgoing SOAP messages. By default, the Wsdl.exe utility assumes SOAP over HTTP to communicate with Web services. The utility also provides the ability to generate classes that can communicate with Web services using either the HTTP-GET or HTTP-POST protocol.

Wsdl.exe is run from the command prompt. The utility supports a wide variety of switches that allow you to define such things as the language, passwords, and even namespaces. For a complete listing of the available options, run "wsdl.exe /?" from the command prompt. For my example, I was interested in creating a Visual Basic .NET class in a specific output file. From the command prompt I ran the following:

Wsdl.exe /l:vb /out:datareturn.vb http://localhost/Webinfo/datareturn.wsdl

Running Wsdl.exe created a file called datareturn.vb. This file contains a proxy class that exposes both synchronous and asynchronous methods for each of the methods in the Web service. In this example the generated methods were TestHeaders, BeginTestHeaders, and EndTestHeaders. The TestHeaders method provides synchronous connectivity to the Web service, while BeginTestHeaders and EndTestHeaders together provide asynchronous connectivity.

Consume the Web Service
Once the generated proxy class is added to the project and a Web reference is set to the WSDL file, you are ready to start using the service. Within an ASP.NET Web page you can call the proxy class and return the requested parsed data from the Web service using the code:

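' Create the proxy class generated by Wsdl.exe
Dim Getdata As New localhost.GetTitle()
Dim match As localhost.TestHeadersMatches

' Synchronous call: the remote page is fetched and parsed here
match = Getdata.TestHeaders()
TextBox1.Text = match.Title
TextBox2.Text = match.H1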

Summary
As I said at the beginning of this article, this is a simple example of what you can do. As I spoke with my customer over the next weeks, he started to understand the value the CIO and upper management were looking for. He developed a Web services strategy that relied on current investments and leveraged them where appropriate. His strategy was centered on a gradual transition that leveraged the full power of his existing infrastructure.

As you download the source code (located at www.sys-con.com/webservices/sourcec.cfm) provided with the article, I challenge you to do the same thing. Use existing Web sites when appropriate and integrate and enhance them with the power of a Web service.

About Thom Robbins
Thom Robbins is a senior technology specialist with Microsoft. He is a frequent contributor to various magazines, including .NET Developer's Journal and SOA Web Services Journal. Thom is also a frequent speaker at a variety of events that include VS Live and others. When he's not writing code and helping customers, he spends his time with his wife at their home in New Hampshire.
