Disaster Recovery in a Web 2.0 World
The changing face of the Web
Oct. 1, 2007 08:00 AM
Everyone in IT understands that there are disasters and then
there are disasters. Regardless of the scale of any interruption in operations,
disaster recovery plans generally describe how IT will
accomplish the two most important tasks it faces in the event of a
disaster: maintaining business continuity and recovering lost data.
While being “down” and “disconnected” from the rest of the world can be
financially devastating, losing the data upon which the business relies is
equivalent to a monarch losing the crown jewels. Now that’s a disaster, no matter
what the underlying cause.
Before Web 2.0 made its way onto the corporate stage, a
backup – or two – kept us convinced that, should we lose data for some reason,
we could always get it back and, more important, get it back in such a state
that we’d have lost nothing more than time. With Web 2.0, however, that task
has become a bit trickier. There’s more data, more often, that needs to be
backed up and replicated, and only so many hours in the day (the dreaded
maintenance window) in which we can accomplish this important task.
More, More, More
Blogs, wikis, and discussion forums have become fairly
standard as part of corporate communications. Internally and externally, they
are used to exchange information and provide support for customers and partners
alike. The frequency with which data flows through these systems varies, but it
is often on par with that of database transactions and other high-volume,
mission-critical applications. For example, F5’s own customer-facing Web 2.0
community, DevCentral, sees approximately 30–50 new posts a day, and our internal
Web 2.0 community sees a similar daily average. All of these posts
contain important nuggets of information that are valuable and would be
difficult to re-create should they be lost due to a disaster.
Many forum posts, blog posts, and wiki entries
are, or should be, considered essential data. They contain information
that’s important to the business and customers alike, and therefore should be
protected by the same mechanisms through which corporate databases are protected
– primarily regular backups and replication.
Most data protection policies require that backups be stored
offsite, and the truly disaster-ready know that they should be stored in two places
offsite – one that is not related in any way to the business (e.g., a reputable
data-storage facility) and one that is digitally accessible to enable quick
retrieval and restoration of said data if at all possible.
The latter option usually ends up being a second data center
or remote office location. The two sites are often connected via a
dedicated WAN link, but can also be connected using a less expensive VPN over
the public Internet. In either case, bandwidth is limited and is used for
more than just backups and replication. Remote locations usually have
employees who need access to corporate resources, such as applications and
files, and require that some of that bandwidth be available – the more the
better.
There’s a conflict here that requires a delicate balancing
act between ensuring that backups and replication complete in a timely fashion
and meeting the needs of the users. That delicate balancing act is quickly
becoming more difficult as the volume of data grows and requires more and more
bandwidth – and time – to move across these lower-capacity pipes. Sure, you can
schedule those backups in the middle of the night, but as the volume grows so
does the time required to transfer backups from one location to another, and
eventually – likely sooner rather than later – your maintenance window will be
creeping into the daylight hours and frustrating remote employees.
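To make the squeeze concrete, a back-of-the-envelope calculation helps. The sketch below estimates how long a backup takes to cross a link and whether it fits a maintenance window. The function names and every figure used (data volume, link speed, window length, share of the link reserved for backups) are hypothetical, and the arithmetic ignores protocol overhead and latency, so treat the result as an optimistic lower bound rather than a prediction.

```python
def transfer_hours(data_gb: float, link_mbps: float, backup_share: float = 1.0) -> float:
    """Hours needed to move data_gb (decimal gigabytes) over a link of
    link_mbps megabits per second, when backups may use only
    backup_share (a fraction in (0, 1]) of that link."""
    if link_mbps <= 0:
        raise ValueError("link_mbps must be positive")
    if not 0 < backup_share <= 1:
        raise ValueError("backup_share must be in (0, 1]")
    megabits = data_gb * 8 * 1000          # 1 GB = 8,000 megabits
    seconds = megabits / (link_mbps * backup_share)
    return seconds / 3600


def fits_window(data_gb: float, link_mbps: float, window_hours: float,
                backup_share: float = 1.0) -> bool:
    """True if the transfer finishes inside the maintenance window."""
    return transfer_hours(data_gb, link_mbps, backup_share) <= window_hours


# Hypothetical figures: a 10 Mbps link and an 8-hour overnight window.
print(f"{transfer_hours(50, 10):.1f} hours")   # 11.1 hours
print(fits_window(30, 10, 8))                  # True
print(fits_window(50, 10, 8))                  # False
```

With these assumed numbers, growth from 30 GB to 50 GB per night is enough to push the transfer past the window, which is the creep into daylight hours described above.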
Less, Less, Less
One of the obvious solutions to this problem is to decrease
the amount of data being transferred, to place some sort of control on it so
that backup data doesn’t consume all the available bandwidth, or to do both.
You can, of course, attempt to back up less data, but then you’re likely to be
caught in the crossfire between line-of-business owners who all emphatically
state that their data is essential and therefore must have priority. You can
schedule the backup and/or replication of Web 2.0-related data after the
transfer of other backups, but you then run the risk of running out of time and
ending up without all that information safely stored in another location.
Which puts you right back into the position of somehow
pushing an increasing volume of data through the same pipe, in the same amount
of time.
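The "control" option mentioned above, capping how much of the link backup traffic may use, is often implemented with a rate limiter. The article does not name a mechanism; the token bucket below is one common choice, shown here only as a minimal sketch. The class name, the byte rates, and the explicit `now` timestamps (passed in so the example is deterministic) are all assumptions for illustration.

```python
class TokenBucket:
    """Minimal token-bucket limiter: tokens are bytes the sender may transmit."""

    def __init__(self, rate_bytes_per_s: float, capacity_bytes: float):
        self.rate = rate_bytes_per_s      # refill rate (the sustained cap)
        self.capacity = capacity_bytes    # largest burst allowed
        self.tokens = capacity_bytes      # start with a full bucket
        self.last = 0.0                   # timestamp of the previous call

    def allow(self, nbytes: int, now: float) -> bool:
        """Return True and spend tokens if nbytes may be sent at time `now`."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if nbytes <= self.tokens:
            self.tokens -= nbytes
            return True
        return False


# Hypothetical cap: 1,000 bytes/s sustained, bursts up to 2,000 bytes.
bucket = TokenBucket(rate_bytes_per_s=1000, capacity_bytes=2000)
print(bucket.allow(2000, now=0.0))   # True: the initial burst drains the bucket
print(bucket.allow(500, now=0.1))    # False: only about 100 bytes have refilled
print(bucket.allow(500, now=1.0))    # True: about 1,000 bytes are available
```

A limiter like this keeps backups from starving remote users, but it does nothing to shrink the backup itself, so on its own it lengthens the transfer rather than shortening it.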
One of the simplest solutions is to take advantage of a WAN
optimization controller (WOC). These devices, deployed in pairs – one at the
data center and one at the target remote location – use a number of
technologies to combat the inherent characteristics of WAN and Internet links that
degrade the transfer rates of data, as well as advanced data reduction techniques
to reduce the amount of data in transit, resulting in better utilization and
faster transfers.
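The article does not say which data reduction techniques a given WOC uses. As a generic illustration only, the sketch below combines two widely used ideas: chunk-level deduplication (send a short digest instead of a chunk the far side has already seen) and compression of new chunks. The function name, the fixed 4 KB chunk size, and the use of SHA-256 and zlib are assumptions for this example, not a description of any product.

```python
import hashlib
import zlib

CHUNK_SIZE = 4096  # hypothetical fixed chunk size, in bytes


def bytes_to_send(payload: bytes, seen: set, chunk_size: int = CHUNK_SIZE) -> int:
    """Estimate how many bytes cross the link for `payload`.

    `seen` holds digests of chunks the remote side already stores and is
    updated in place. A known chunk costs only its 32-byte digest; a new
    chunk costs its digest plus its zlib-compressed contents.
    """
    sent = 0
    for start in range(0, len(payload), chunk_size):
        chunk = payload[start:start + chunk_size]
        digest = hashlib.sha256(chunk).digest()
        if digest in seen:
            sent += len(digest)                      # reference only
        else:
            seen.add(digest)
            sent += len(digest) + len(zlib.compress(chunk))
    return sent


# Deterministic 64 KB test payload made of 16 distinct chunks.
payload = b"".join(hashlib.sha256(str(i).encode()).digest() for i in range(2048))
remote_chunks = set()
first = bytes_to_send(payload, remote_chunks)    # everything is new
second = bytes_to_send(payload, remote_chunks)   # everything is already known
print(first, second)
```

The second transfer of the same data costs only 16 digests, which is why repeated backups of largely unchanged content benefit most from this kind of reduction.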
A WOC can also optimize application traffic such as file
transfers over CIFS/SMB while reducing the amount of data traversing the
wire. This has the effect of improving bandwidth utilization and ensuring that
backups are completed in a timely manner without imposing any delay in
delivering application and file traffic to remote office users. Optimization
and acceleration technologies implemented by a WOC can also help speed up the
backup process, as most of the techniques and software used to transfer backups
rely on protocols such as FTP, and on TCP underneath them, which a WOC can
optimize to improve the efficiency of the transfer.
Conclusion – Doing More with Less
There’s no doubt that while Web 2.0 is changing the face of the Web,
it’s also changing the amount of data that needs to be protected in the event
of a disaster. A WOC can assist in completing backups and the replication of
this critical new source of information over the WAN without requiring longer
maintenance windows or impacting the performance of applications accessed by
remote users.
About Lori MacVittie
Lori MacVittie is responsible for education and evangelism of application services available across F5’s entire product suite. Her role includes authorship of technical materials and participation in a number of community-based forums and industry standards organizations, among other efforts. MacVittie has extensive programming experience as an application architect, as well as network and systems development and administration expertise. Prior to joining F5, MacVittie was an award-winning Senior Technology Editor at Network Computing Magazine, where she conducted product research and evaluation focused on integration with application and network architectures, and authored articles on a variety of topics aimed at IT professionals. Her most recent area of focus included SOA-related products and architectures. She holds a B.S. in Information and Computing Science from the University of Wisconsin at Green Bay, and an M.S. in Computer Science from Nova Southeastern University.