Breaking XML to Optimize Performance
Why must developers jump through so many hoops to improve the performance of XML?

As XML becomes ubiquitous throughout the enterprise, it increasingly taxes the systems that must deal with it. Even though a wide range of hardware and software solutions is coming to market to alleviate XML's performance bottlenecks (see ZapThink's XML Proxies Report), many developers are nevertheless resorting to a variety of tactics to improve the performance of XML processing and transmission that are… well… creative. Many of these creative approaches simplify certain aspects of XML in order to shrink document size, improve parser performance, and speed the mapping of XML document components to application objects.

Why must developers jump through so many hoops to improve the performance of XML? Simply put, XML is not a particularly efficient format for representing information. It is a text-based, human-readable, metadata-encoded markup language that operates on the principle that the metadata describing a message's meaning and context accompanies the content of the message. As a result, XML documents can easily be ten to twenty times larger than an equivalent binary representation of the same information. Despite this inefficiency, XML's numerous advantages are increasing its use for ever broader and more mission-critical functions. (See ZapThink's popular Pros and Cons of XML report for a detailed discussion of the merits and challenges of XML.)
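To make the size overhead concrete, here is a minimal sketch that encodes the same record once as packed binary and once as XML. The record layout and the element names (lineItem, productId, unitPrice, quantity) are assumptions chosen for illustration, and the ratio it produces depends entirely on the data and tag names chosen.

```python
import struct

# Hypothetical record: a 32-bit id, a 64-bit float price, and a 16-bit quantity.
record = (1001, 19.95, 3)

# "<" selects little-endian with no padding between fields.
binary = struct.pack("<IdH", *record)

# The same three values with descriptive (illustrative) element names.
xml = (
    "<lineItem>"
    f"<productId>{record[0]}</productId>"
    f"<unitPrice>{record[1]}</unitPrice>"
    f"<quantity>{record[2]}</quantity>"
    "</lineItem>"
).encode("utf-8")

print(len(binary), "bytes in the packed binary form")
print(len(xml), "bytes as XML")
```

The binary form carries no description of itself, so both sides must already agree on the field layout. That prior agreement is exactly what XML's inline metadata avoids.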

While XML's verbosity may be acceptable for situations with moderate transaction volumes, XML's processing overhead, storage requirements, and bandwidth consumption become quite problematic when transaction volumes are high. As a result, many companies are resorting to potentially dangerous tactics for squeezing every last drop of performance out of XML. Three common tactics include compressing XML, ignoring XML validity, and changing the parsing rules for XML.

Compressing and Squeezing XML

One obvious approach to optimizing XML is to compress it. Since XML is a text-based format, common binary compression formats like gzip can squeeze over 90% of the volume out of XML data files. The problem with compression, however, is that both ends of the communication pipeline must understand the compression format and be able to decompress the document on the fly without introducing extra latency.

A straightforward alternative to binary compression is simply to avoid long element names. Many developers have resorted to naming their XML elements "<g>" or "<bx1>". While such short tags are definitely smaller than tags like "<SOAPInspectionSecurityHandler>", the resulting XML is for all practical purposes no longer human readable.
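Both tactics can be measured with a short sketch. The document shape and the element names below are invented for illustration, and the savings it prints depend heavily on how repetitive the sample data is, so treat the numbers as indicative only.

```python
import gzip

def build_orders(tag_order: str, tag_id: str, tag_amount: str, count: int = 500) -> bytes:
    """Build a repetitive sample XML document using the given element names."""
    rows = "".join(
        f"<{tag_order}><{tag_id}>{i}</{tag_id}>"
        f"<{tag_amount}>{i * 10}</{tag_amount}></{tag_order}>"
        for i in range(count)
    )
    return f"<root>{rows}</root>".encode("utf-8")

# Same data, once with descriptive names and once with terse ones.
verbose = build_orders("PurchaseOrderRecord", "PurchaseOrderIdentifier", "PurchaseOrderAmount")
terse = build_orders("o", "i", "a")

verbose_gz = gzip.compress(verbose)
terse_gz = gzip.compress(terse)

for label, raw, packed in (("verbose", verbose, verbose_gz), ("terse", terse, terse_gz)):
    saved = 100 * (1 - len(packed) / len(raw))
    print(f"{label}: {len(raw)} bytes raw, {len(packed)} bytes gzipped ({saved:.0f}% smaller)")

# The receiver has to know the payload is gzipped and decompress it before parsing.
assert gzip.decompress(verbose_gz) == verbose
```

Note that once gzip is applied, long repeated tag names compress very well, so shortening them buys much less than it does on the raw document.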

Another, more subtle approach to XML compression is the all-attribute approach. Rather than creating long, complex trees of elements, developers can use a small number of elements with a large number of attributes to dramatically reduce document size. However, the all-attribute approach does not work for complex tree structures, especially those with highly repeating elements. Even with all these creative tricks for reducing the size of XML messages, compression really only reduces network bandwidth utilization and storage requirements; it does not improve XML processing performance. In fact, such compression techniques might even reduce performance, since the receiver must decompress the document before it can parse it.
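A minimal sketch of the all-attribute approach, using a made-up customer record: the same three fields are written once as child elements and once as attributes, and both are read back with Python's standard ElementTree parser.

```python
import xml.etree.ElementTree as ET

# Illustrative record, once as child elements...
element_style = (
    "<customer>"
    "<name>Ada</name>"
    "<city>London</city>"
    "<country>UK</country>"
    "</customer>"
)
# ...and once with every field as an attribute.
attribute_style = '<customer name="Ada" city="London" country="UK"/>'

# Both forms are standard XML; only the access pattern differs.
from_elements = {child.tag: child.text for child in ET.fromstring(element_style)}
from_attributes = dict(ET.fromstring(attribute_style).attrib)

assert from_elements == from_attributes
print(len(element_style), "bytes as elements")
print(len(attribute_style), "bytes as attributes")
```

The limitation mentioned above shows up quickly: an attribute name may appear only once per element and cannot contain nested structure, so repeating or hierarchical data still needs child elements.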

Ignoring XML Validity

Simply skipping the step of validating XML documents is another approach to improving XML performance. In fact, ZapThink's research has shown that few businesses use XML validation of any type (either DTD or W3C XML Schema) as part of runtime XML-based business processes. Instead, developers check their XML for validity only during the test or design phases of an implementation, and then simply trust that the documents remain valid thereafter. After all, validating a document against a schema does not remove the need to check its contents at the application level anyway. Since validity checking slows down XML parsers, it's often the first thing to go when optimizing XML performance.
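The pattern described here, a non-validating parse followed by application-level checks, might look like the following sketch. The order document, its id attribute, and its amount element are hypothetical; ElementTree checks only well-formedness and performs no DTD or XML Schema validation.

```python
import xml.etree.ElementTree as ET

def read_order(xml_text: str) -> dict:
    """Parse without schema validation, then apply application-level checks."""
    root = ET.fromstring(xml_text)  # well-formedness only, no DTD/XSD validation
    if root.tag != "order":
        raise ValueError(f"unexpected root element: {root.tag}")
    order_id = root.get("id")
    if not order_id:
        raise ValueError("order is missing its id attribute")
    amount_text = root.findtext("amount")
    if amount_text is None:
        raise ValueError("order is missing its amount element")
    amount = float(amount_text)  # raises ValueError on non-numeric content
    if amount < 0:
        raise ValueError("amount must not be negative")
    return {"id": order_id, "amount": amount}

print(read_order('<order id="42"><amount>19.95</amount></order>'))
```

Every rule a schema would have enforced now lives in application code, which is where the risk lies: documents that drift from the agreed structure are caught only if someone remembered to write the corresponding check.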

Rewriting the Parser - and Changing the Rules

Since compression and skipping validity checks only minimally improve overall XML processing, developers are increasingly taking more drastic approaches to improving XML processing performance. One drastic approach is to rewrite the rules that XML parsers follow. The XML specification is relatively simple - its W3C specification is 80 pages long, which is quite brief considering the power and flexibility of the language. Even at that length, much of the XML specification is superfluous for most applications. When was the last time you used ENTITY or NOTATION declarations, or CDATA sections, anyway? Maybe once in a while those features come in handy, but they usually aren't necessary. As a result, developers can recompile their XML parsers so that they support only a subset of all available XML functionality, which can increase parser performance by as much as a factor of three.

Yet the drastic measures to improve XML parsing don't stop there. Some developers are taking an even more dangerous approach and rewriting the rules of XML itself. For example, some developers eliminate the need for end tags or remove case-sensitivity within XML documents. These are potentially very dangerous moves that sacrifice interoperability and developer-friendliness for speed. In essence, these developers are creating new, proprietary markup languages of their own.
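The interoperability cost is easy to demonstrate: a conforming parser rejects such documents outright. The sketch below feeds three illustrative documents to Python's standard ElementTree parser; the sample documents are invented for this example.

```python
import xml.etree.ElementTree as ET

samples = {
    "standard": "<order><id>42</id></order>",
    "missing end tag": "<order><id>42</order>",        # <id> is never closed
    "mismatched case": "<Order><id>42</id></order>",   # XML names are case-sensitive
}

def accepted_by_standard_parser(text: str) -> bool:
    """Return True if a conforming XML parser accepts the document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

results = {label: accepted_by_standard_parser(text) for label, text in samples.items()}
for label, ok in results.items():
    print(f"{label}: {'accepted' if ok else 'rejected'}")
```

Only the first document is well-formed XML. The other two can be read only by a custom parser that both parties have agreed to use.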

The ZapThink Take

There is no doubt that standard XML is an inefficient data representation format, and the increasing layers of complexity that Web Services add to the core XML language further exacerbate the problem. Nevertheless, ZapThink sees many of the above trends for optimizing XML performance as a tremendous step backwards for interoperability and standards-based computing. At what point does a compressed, stripped-down, non-validating "XML-like" format leave the standards behind and become a proprietary data format? Keep in mind that all of the tactics discussed above require that both ends of a given communication path agree on the optimization mechanism. As a result, the loosely-coupled, implementation-agnostic XML format becomes a tightly-coupled, proprietary implementation, and virtually all of the advantages of using XML for system-to-system communication are lost. ZapThink, therefore, cannot recommend any of the techniques described above.

That being said, we realize that something must be done about improving overall XML performance. The solution is relatively straightforward - more standards. As a set of requirements emerges for optimizing XML performance, so too will a set of agreed-upon conventions and standards for implementing them. Perhaps the WS-I will recommend a particular compression scheme or minimal set of parsing requirements in order to assure interoperability among parties. Maybe a standards body like OASIS or W3C will start work on a new "WS-Compression" specification. Another approach to the problem of XML inefficiency is to leave optimization up to the implementation - basically, trust the application server, XML Proxy, or Web Services management platform to perform the required optimization while leaving the endpoints fully compliant with XML. Companies will then achieve the best of both worlds - high performance without compromising interoperability.
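One way to picture optimization handled by the infrastructure rather than the endpoints is standard HTTP content negotiation, in which an intermediary compresses a payload only when the peer has advertised support for it. This is an assumption chosen for illustration, not a mechanism the article prescribes; the function names encode_for_peer and decode_from_peer are hypothetical, and the sketch ignores q-values in the Accept-Encoding header.

```python
import gzip

def encode_for_peer(xml_body: bytes, accept_encoding: str) -> tuple[bytes, dict]:
    """Hypothetical intermediary step: compress only when the peer allows it."""
    # Simplified parsing: strips parameters such as ";q=0.8" and ignores their values.
    offered = {token.split(";")[0].strip().lower() for token in accept_encoding.split(",")}
    if "gzip" in offered:
        return gzip.compress(xml_body), {"Content-Encoding": "gzip"}
    return xml_body, {}  # fall back to plain, standard XML

def decode_from_peer(body: bytes, headers: dict) -> bytes:
    """Undo whatever encoding the header declares before handing XML to the endpoint."""
    if headers.get("Content-Encoding") == "gzip":
        return gzip.decompress(body)
    return body

document = b"<order><id>42</id><amount>19.95</amount></order>"
wire_body, wire_headers = encode_for_peer(document, "deflate, gzip;q=0.8")

# The endpoint application still receives the original, fully standard XML.
assert decode_from_peer(wire_body, wire_headers) == document
```

Because the encoding is declared in a standard header and there is an uncompressed fallback, a peer that knows nothing about the optimization still interoperates, which is the property the proprietary tactics above give up.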

About Ron Schmelzer
Ron Schmelzer is founder and senior analyst of ZapThink. A well-known expert in the field of XML and XML-based standards and initiatives, Ron has been featured in and written for periodicals and has spoken on the subject of XML at numerous industry conferences.
