Seven Ways to Mess Up with XML

A successful XML publishing project inspired this article. The project's leader, who claims that the financial return gained for his company "made his career" there, achieved success for two reasons: he focused on the right goals and executed the project in the right way.

This article focuses on two things: how to establish the right goals for an XML-based publishing project and the most common mistakes made. We explore the topic by discussing how to go about it the wrong way.

Mistake #1: Plan too little
Everyone knows the importance of upfront planning, right? Yet, even though "everyone knows," we regularly see projects marred by inadequate and superficial planning.

Why does this happen? Two common reasons emerge. First, most people responsible for planning grew up with word-processing and desktop-publishing software. As a result, they typically think that implementing an XML-based system primarily involves a substitution of technologies and file formats.

In reality, using XML for publishing involves new and unfamiliar concepts - it's a true paradigm shift. Unless someone with XML publishing experience helps with the planning, you will likely invest too little in the upfront work.

Second, the decision to launch an XML publishing project can take too long (doesn't it always?). But because the deadline doesn't change, planning gets squeezed to leave more time for implementing the wrong thing. Dilbert cartoons routinely illustrate this problem quite effectively.

Complicating this problem, it's also possible to go overboard on planning. This occurs much less often, but it's still costly because it delays the realization of benefits. Six to eight weeks for planning is about right. If that's not sufficient, then you're probably making mistake #2.

Mistake #2: Try to do too much at once
Once bitten by the XML publishing bug, it's easy to identify opportunities for dramatic improvement everywhere in your organization. So much waste! So much redundancy! So much inaccuracy! How could we have been so blind?

But you must resist trying to change everything at once. Too many people, too many processes, and too many document types exist to tackle everything at once. Instead, start with one group, one process, and one set of related document types.

Some words of caution: make sure you take the long view when planning so that phase VII of your project works well with phase I. You don't want every phase to require going back and changing previously completed phases.

Mistake #3: Try to change too little
Here's a surefire way to fail: start with the aim of creating "minimum disruption." Sounds good - won't work. You want to leave the same tools and processes in place and get a different result? You don't want to affect anyone or change anything but you want to achieve great benefits?

No magic beans exist. If you want to achieve dramatic results, expect to make dramatic changes. Since people naturally resist change, you will need to sell them on the organizational and individual benefits of the changes.

Mistake #4: Try to automatically convert all existing content to XML
Here's one of the most dangerous misunderstandings in publishing: existing processes and tools produce information that is sufficiently consistent to allow automatic conversion to XML. No matter how many times we have encountered that belief - and no matter how insistently it is expressed - it is always wrong.

Word-processing and desktop-publishing tools survive precisely because of the flexibility and freedom they provide to authors. These product attributes are diametrically opposed to the primary purpose of creating XML content, which involves constraining the author to create content according to a set of rules.

Is it hopeless to convert existing content to XML? Not at all. Tools are available that can convert existing content to XML. But you must accept that manual cleanup will be required, so design your process accordingly.

If you're contemplating a one-time conversion of existing information to XML, that's a subject for another article. In this article, we're focusing on building a new system that uses ongoing conversions from word processors.

In such cases, for simple documents or simple content, the manual cleanup may be minimal and, therefore, reasonable. But for long, complex documents, the cleanup cost may be excessive.

You should carefully avoid presenting a cost justification for your system that depends on ongoing, fully automatic conversion of long, complex information to XML.
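
The advice in this section, to plan for manual cleanup instead of assuming fully automatic conversion, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a real converter: the style names, the `STYLE_TO_TAG` mapping, and the `convert` function are all hypothetical, and the input is assumed to arrive as simple (style, text) pairs exported from a word processor.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from word-processor style names to XML element names.
STYLE_TO_TAG = {
    "Heading 1": "title",
    "Body Text": "paragraph",
    "List Bullet": "item",
}

def convert(paragraphs):
    """Convert (style, text) pairs to XML; return (root, needs_review).

    Paragraphs whose style is not in STYLE_TO_TAG are not guessed at.
    They are collected in needs_review so a person can fix them by hand.
    """
    root = ET.Element("document")
    needs_review = []
    for index, (style, text) in enumerate(paragraphs):
        tag = STYLE_TO_TAG.get(style)
        if tag is None:
            needs_review.append((index, style, text))
            continue
        ET.SubElement(root, tag).text = text
    return root, needs_review

sample = [
    ("Heading 1", "Installing the pump"),
    ("Body Text", "Remove the cover."),
    ("Normal + Bold 14pt", "Warning"),  # ad hoc formatting with no reliable meaning
]
root, needs_review = convert(sample)
print(ET.tostring(root, encoding="unicode"))
print(f"{len(needs_review)} of {len(sample)} paragraphs need manual cleanup")
```

The design choice worth noting is the explicit review queue: the size of `needs_review` over a representative batch of documents gives a rough measure of the ongoing cleanup cost before you commit to a cost justification.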

Mistake #5: Try to convert word-processing tools to XML editors
We have seen companies waste millions of dollars building applications on top of word processors in an attempt to force authors to conform consistently to a set of rules. Why? Because the tools do not provide the architecture that absolute conformance to a data model requires.

Fortunately, word processors and desktop-publishing software are becoming increasingly XML-aware and a few are even XML-capable. These tools offer a greater chance of success, especially if you arm yourself with expert assistance to dissect vendors' claims.

We'll explore this topic in greater detail in a future article.

Mistake #6: Set up too many rules
We're referring to the data model - the DTD or schema - that guides the author in creating and editing content. The problem of "too many rules" has two dimensions: the data model can be too restrictive, and it can have too many tags.

Many novices begin by designing highly restrictive data models with lots of tags. Such data models involve too many subsequent changes, which cost time and money, and require authors to spend a long time learning them.

To make a model overly restrictive, you would be very careful about limiting where tags can be used and how they can be used. For example, you may decide that a <partnumber> tag can appear only in a <paragraph> tag. But later you may realize that you have to allow a <title> tag to contain <partnumber> as well. And then you'll find still more places where you need to be able to use <partnumber>.
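
The part-number example can be made concrete with a small sketch. It assumes a hypothetical content model expressed as a Python dictionary (a stand-in for a DTD or schema, not a replacement for one), and it writes element names without spaces, such as `partnumber`, because XML names cannot contain spaces. The `violations` function and both models are illustrative only.

```python
import xml.etree.ElementTree as ET

# Hypothetical content model: parent element -> child elements it may contain.
# The first version allows <partnumber> only inside <paragraph>.
STRICT_MODEL = {
    "section": {"title", "paragraph"},
    "title": set(),
    "paragraph": {"partnumber"},
    "partnumber": set(),
}

def violations(root, model):
    """Return (parent, child) tag pairs that the model does not allow."""
    found = []
    for parent in root.iter():
        allowed = model.get(parent.tag, set())
        for child in parent:
            if child.tag not in allowed:
                found.append((parent.tag, child.tag))
    return found

doc = ET.fromstring(
    "<section>"
    "<title>Replacing <partnumber>A-100</partnumber></title>"
    "<paragraph>Order <partnumber>A-100</partnumber> first.</paragraph>"
    "</section>"
)

# The strict model rejects a perfectly reasonable title.
print(violations(doc, STRICT_MODEL))

# Relaxing the model means revising it, and every such revision costs time.
RELAXED_MODEL = dict(STRICT_MODEL, title={"partnumber"})
print(violations(doc, RELAXED_MODEL))
```

Each rejected (parent, child) pair in real content is a candidate for another model revision, which is where the time and money go.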

To create a problem of too many tags, give authors somewhere between 200 and 300 tags to learn so that they reach their maximum productivity just about the time that they move on to another job. If you want an overly broad generalization, shoot for 30 tags.
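
A quick way to see where a data model stands against that rough figure is to count its element declarations. The sketch below assumes the model is a DTD held in a string; the sample DTD, the `declared_elements` function, and the `TAG_BUDGET` name are hypothetical. The regular expression is only a rough count: it does not expand parameter entities or skip declarations inside comments.

```python
import re

TAG_BUDGET = 30  # the rough rule of thumb from the text, not a hard limit

# Matches the element name in declarations such as <!ELEMENT title (...)>.
ELEMENT_DECL = re.compile(r"<!ELEMENT\s+([^\s>]+)")

def declared_elements(dtd_text):
    """Return the sorted, de-duplicated element names declared in a DTD."""
    return sorted(set(ELEMENT_DECL.findall(dtd_text)))

dtd = """
<!ELEMENT manual (title, section+)>
<!ELEMENT section (title, paragraph+)>
<!ELEMENT title (#PCDATA | partnumber)*>
<!ELEMENT paragraph (#PCDATA | partnumber)*>
<!ELEMENT partnumber (#PCDATA)>
"""

names = declared_elements(dtd)
print(len(names), "elements declared:", ", ".join(names))
print("within budget" if len(names) <= TAG_BUDGET else "over budget")
```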

Mistake #7: Use too many moving parts
The problem with too many moving parts is that you must do a lot of work to choose them, integrate them, test them, and keep them all working.

In traditional publishing processes that involve a lot of manual work, this usually doesn't cause trouble. Many moving parts may exist, but human intervention integrates them and keeps the whole machine working. For example, contributing authors may use word processors while the technical publications department uses desktop-publishing software and manually imports the word-processor files as needed.

In an XML publishing system, however, one of the goals is to eliminate human intervention and make everything work together automatically. Fulfilling this goal requires tight integration among the various software products.

XML publishing systems must also deliver more functionality and productivity than the traditional systems they replace, so a key project requirement usually includes the implementation of a content management system as well.

No single vendor offers a complete system that delivers all of the functionality needed in support of every type of content. That leaves customers with the task of selecting vendors for each piece of functionality needed.

The short answer is to limit the number of vendors involved - choose enough to accomplish your goals (both immediate and future!) but no more. The long answer is to get some expert assistance to help you match your current and future needs with the products available.

About PG Bartlett
PG Bartlett is vice president of product marketing at Arbortext, where he is responsible for corporate positioning, marketing strategy, and product direction. Bartlett joined Arbortext in 1994, bringing more than 18 years of experience in both technical and marketing positions at leading-edge high technology companies. He is a frequent presenter at major industry events and has been invited to speak and chair sessions at Comdex, Seybold Seminars, XML conferences, AIIM conferences, and others.

Reader Feedback

A few ruminations on your article;

Planning instead of building is an age-old concept in software (as well as buildings), which with every passing month seems to be reiterated in one passing methodology fad or another.

Most of the points you raise are generally applicable to 'all things' software. I would respectfully point out that there are a few other, possibly more important issues when designing with XML.

I will list some further alternate ways of messing up with xml:

- not recognizing the differences in relational vs hierarchical data; for 20+ years RDBMS have been king....

- not identifying document centric vs data centric data in one's usage of xml

- XML should be human readable; the moment it becomes opaque to human inspection is the moment it becomes hard to debug, read, or see if it's correctly doing its job

- don't be afraid to cook your own xml vocabulary, but always look around to see if someone else has done it before you. We see too many people replicating effort, where enhancing an existing xml vocabulary is much less effort

- just because you like XML, don't force a declarative processing model on all your publishing processes; sometimes it's easier to just pass a filter through all of your data using classic parser techniques; hybrid approaches tend to be more successful than a 'golden hammer'

- don't force XML on domain experts; if they are comfortable with existing methods, then just take their output and xml'ify it at the end of the publishing workflow

- recognize that the biggest impact of XML is Unicode, ubiquitous usage, and the sheer utility of an easily understandable short term data format

- early taxonomisation of xml is a pitfall, there is little need to initially absolutely define a vocabulary with all the expressive power of XML Schema.

- Publishing can reflect pipelines of processing; take a look at existing XML application servers...I see many people replicating functionality where Cocoon, AxKit, or Ant may be appropriate.

and lastly use xml:lang.

regards,
James Fuller