Linux Clustering File Systems
Comparing read and write performance

Clustering is a way to create a system where computers gain access to each other's data and resources. In principle, this adds computing power and redundancy to the system; in practice, implementations often consume more resources because of the overhead of synchronizing facilities across different computers.

One type of clustering system is one where file systems form a cluster to better serve clients of the data stored in the files. For instance, many Internet servers and telecommunications systems can benefit from such a setup, as an error in one computer does not necessarily harm the whole cluster. Instead, the failing computer can simply be removed from the cluster, and the others can continue their normal operations.

A number of open source clustering file systems exist for Linux. In this paper, we compare the read and write performance of the file systems listed in Table 1. The rest of the paper is structured as follows: first, we define the test setup; second, we introduce the tests we executed and provide their results; and finally, we present some concluding points.

Test Setting
General Test Setup and Associated Hardware
The hardware environment was a two-node cluster of identical HP dc7600SFF P630 machines. Each node had an Intel Pentium 4 630 3.0 GHz processor, 1024 MB of main memory, and 160 GB of disk space on two physical HP 80 GB SATA 3.0 Gb/s disks. Both nodes ran either Ubuntu Linux 5.10 [11] or Fedora Core 5 [3], depending on the tested file system. Gigabit Ethernet connected the nodes to enable clustering. The exact tools, auxiliaries, and their version numbers are listed in Table 2.

Two different test setups were constructed because the clustering file systems work in different ways. The file systems were tested only for read and write performance, which is straightforward to measure even on new file systems that are under constant development. No more complicated test cases were run, so this is only one, very simple, way to compare file systems. The test configurations are shown in Figures 1 and 2.

Test Setup for NFS Derivatives
Figure 1 describes the environment used to test the file systems based on NFS [8]: CacheFS, UnionFS, and plain NFS. The environment consists of two computers. One is a server machine whose hard disk contains an Ext3 file system; an NFS server exports that file system to the network. The other is a client node, which mounts the exported file system. This test setup is smaller than a basic cluster, which normally includes more cluster nodes. The CacheFS [2] tests used the Fedora Core kernel; the UnionFS [12] and NFS tests used Linux kernel 2.6.16.

GFS and OCFS2 Test Setup
In contrast to the NFS derivatives, GFS [4] and OCFS2 [9] were tested in a different environment, shown in Figure 2, which consists of a cluster node and a storage system (a PC). The storage system exports one disk to the storage area network (SAN) using iSCSI. The iSCSI target daemon [6] on the PC shares the hard drive, and the Open-iSCSI initiator [10] on each node makes the device visible (/dev/sdx). The Logical Volume Manager (LVM) [7] creates one logical volume on top of /dev/sdx, and the logical volume contains the GFS or OCFS2 file system. The file system is created on one node only; after that, the device must be mounted on every machine in the cluster. A cluster normally includes more nodes, each of which uses the PC's storage, but this test setup is reduced to the bare minimum. GFS was tested with the Ubuntu Breezy kernel and OCFS2 with Linux kernel 2.6.16.
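The "create once, mount everywhere" ordering described above can be sketched as a dry-run plan in Python. This is only an illustration for the OCFS2 case, not the authors' procedure: the portal address, target name, volume group, logical volume, and mount point are hypothetical placeholders, and the commands use current iscsiadm, LVM, and ocfs2-tools syntax, which may differ from the 2006 tool versions. Steps such as bringing the OCFS2 cluster stack online are left out.

```python
# Hypothetical placeholders, not values from the test setup.
PORTAL = "192.168.0.10"
TARGET = "iqn.2006-01.example:storage.disk1"
DEVICE = "/dev/sdx"            # device node as named in the text
VG, LV = "clustervg", "clusterlv"
MOUNT_POINT = "/cluster"


def node_plan(first_node):
    """Return the commands one node would run, as argument lists.

    Nothing is executed; the plan only shows which steps run on every
    node and which run on a single node.
    """
    lv_path = f"/dev/{VG}/{LV}"
    plan = [
        # Every node: discover the exported disk and log in to the target.
        ["iscsiadm", "-m", "discovery", "-t", "sendtargets", "-p", PORTAL],
        ["iscsiadm", "-m", "node", "-T", TARGET, "-p", PORTAL, "--login"],
    ]
    if first_node:
        # One node only: build the logical volume and create the file system.
        plan += [
            ["pvcreate", DEVICE],
            ["vgcreate", VG, DEVICE],
            ["lvcreate", "-n", LV, "-l", "100%FREE", VG],
            ["mkfs.ocfs2", lv_path],
        ]
    else:
        # Other nodes: activate the volume group created by the first node.
        plan.append(["vgchange", "-a", "y", VG])
    # Every node: mount the shared volume.
    plan.append(["mount", "-t", "ocfs2", lv_path, MOUNT_POINT])
    return plan


if __name__ == "__main__":
    for label, first in (("first node", True), ("other nodes", False)):
        print(label)
        for cmd in node_plan(first):
            print("  " + " ".join(cmd))
```

Keeping the plan as data rather than running it makes the asymmetry easy to inspect: only the first node's plan contains the volume and file system creation steps.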

Test Cases
Iozone
Iozone [5] is a file system benchmark tool that measures file I/O performance for various operations. We executed Iozone's write, rewrite, read, and reread tests. The write test measures the performance of writing a new file. This includes "metadata" overhead, which consists of directory information, space allocation, and other data that is associated with a file but is not part of the data contained in the file. The rewrite test measures the performance of writing a file that already exists, and for that reason, it requires less metadata work. The read test measures the performance of reading an existing file. The reread test measures the performance of reading a file that was recently read. The Iozone tests were executed with the following command:

$ iozone -Racb output.xls -g 2G -i 0 -i 1

The command runs the write and read tests in automatic mode, with file sizes up to 2 GB and a range of record sizes, and stores the output in a binary spreadsheet file (output.xls). The extra "-c" option includes close() in the measurement, which helps to reduce the client-side cache effects of NFS version 3.
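To make the four measurements concrete, here is a minimal Python sketch of what write, rewrite, read, and reread passes look like when close() is counted in the timing. It is not Iozone's implementation; the file name, the default sizes, and the use of Python's buffered file objects are assumptions made for illustration.

```python
import os
import tempfile
import time


def timed_pass(path, mode, size, record):
    """Run one sequential pass and return throughput in bytes per second.

    mode "wb" creates the file (write test), "r+b" overwrites an existing
    file in place (rewrite test), and "rb" reads it (read and reread tests).
    size is assumed to be a multiple of record.
    """
    block = b"\0" * record
    start = time.perf_counter()
    with open(path, mode) as f:
        for _ in range(size // record):
            if mode == "rb":
                f.read(record)
            else:
                f.write(block)
    # The with-block has closed the file here, so close() is part of the
    # measured time, which is what the -c option asks Iozone to do.
    elapsed = max(time.perf_counter() - start, 1e-9)
    return size / elapsed


def run_four_tests(directory, size=1 << 20, record=4096):
    """Run write, rewrite, read, and reread on one scratch file."""
    path = os.path.join(directory, "iozone_sketch.tmp")
    results = {}
    for name, mode in [("write", "wb"), ("rewrite", "r+b"),
                       ("read", "rb"), ("reread", "rb")]:
        results[name] = timed_pass(path, mode, size, record)
    os.remove(path)
    return results


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        for name, rate in run_four_tests(scratch).items():
            print(f"{name}: {rate / 1e6:.1f} MB/s")
```

On a local disk with a small file, all four numbers mostly reflect the page cache, which is why the real runs used files up to 2 GB.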

Bonnie++
Bonnie++ [1] is a benchmark suite that performs a number of simple tests to exercise the combination of storage and file system. This study uses Bonnie++'s sequential output and input tests. The per-character tests use the putc() or getc() stdio macros, and the loop should be small enough to fit into any reasonable cache. The block tests use the write(2) or read(2) system calls to write or read files. The rewrite test reads a file with read(2), changes a few bytes, and writes the data back with write(2).
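The rewrite pattern can be illustrated with a short Python sketch that uses the os module's thin wrappers around read(2), lseek(2), and write(2). This is not Bonnie++'s code: the 8192-byte chunk size and the choice to change only the first byte of each chunk are assumptions for illustration.

```python
import os
import tempfile


def rewrite_pass(path, chunk_size=8192):
    """Read each chunk, change one byte, and write the chunk back in place.

    Returns the number of chunks rewritten. The file size does not change.
    """
    fd = os.open(path, os.O_RDWR)
    chunks = 0
    try:
        while True:
            data = os.read(fd, chunk_size)           # read(2)
            if not data:
                break                                # end of file
            dirty = bytearray(data)
            dirty[0] = (dirty[0] + 1) % 256          # change one byte
            os.lseek(fd, -len(data), os.SEEK_CUR)    # step back over the chunk
            os.write(fd, dirty)                      # write(2)
            chunks += 1
    finally:
        os.close(fd)
    return chunks


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        path = os.path.join(scratch, "rewrite_sketch.tmp")
        with open(path, "wb") as f:
            f.write(b"\0" * (4 * 8192))
        print(rewrite_pass(path), "chunks rewritten")
```

Because every chunk is read before it is written, this pass stresses the file system's read-modify-write path rather than pure sequential output.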

Bonnie++ tests were executed with the following command:

$ bonnie++ -u root -d /cluster -r 1000 > output.txt

The command runs all Bonnie++ tests in the /cluster directory, where the tested file system is mounted, and writes the results to output.txt. Bonnie++ uses a test file that is twice the size of main memory; the main memory size, in megabytes, is given with the "-r" option.
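The sizing rule is simple arithmetic, sketched here in Python together with a helper that rebuilds the command line shown above. The function names are hypothetical and exist only for this example; a file twice the size of RAM cannot be held entirely in the page cache, so the tests have to reach the storage.

```python
def bonnie_file_size_mb(ram_mb):
    """Test file size in MB when the file is twice the stated RAM size."""
    return 2 * ram_mb


def bonnie_command(directory, ram_mb, user="root"):
    """Build the Bonnie++ argument list used in this article."""
    return ["bonnie++", "-u", user, "-d", directory, "-r", str(ram_mb)]


if __name__ == "__main__":
    print(bonnie_file_size_mb(1000), "MB test file")
    print(" ".join(bonnie_command("/cluster", 1000)))
```

With "-r 1000", as in the command above, the test file is 2000 MB.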

Executing Tests
The tests were executed a number of times, but the results were almost identical, and only minor deviations were observed. Therefore, each figure is based on a single representative test run, not an average of all performed tests. The Iozone tests are shown in Figures 3, 4, 5, and 6. The Bonnie++ tests are shown in Figures 7, 8, 9, 10, and 11. Ext3 in each figure is the same test run on a local file system.
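One way to check that repeated runs are close enough to justify reporting a single run is sketched below in Python. This is an assumption about how such a check could be done, not the procedure used in this study, and the throughput values in the usage example are made-up placeholders rather than measured results.

```python
import statistics


def relative_spread(values):
    """Return (max - min) / mean for a list of per-run throughputs."""
    return (max(values) - min(values)) / statistics.mean(values)


def representative_index(values):
    """Return the index of the run closest to the median throughput."""
    median = statistics.median(values)
    return min(range(len(values)), key=lambda i: abs(values[i] - median))


if __name__ == "__main__":
    runs = [100.0, 102.0, 98.0]   # hypothetical throughputs, not measured data
    print(f"spread: {relative_spread(runs):.1%}")
    print("representative run:", representative_index(runs))
```

A small relative spread supports showing one run per figure; a large one would suggest reporting an average or a range instead.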

Conclusion
In this study, we have compared open source clustering file systems, and we close with some final points. First, under certain conditions, some new cluster file systems can be considerably faster than plain NFS (Figures 3 and 4). Second, the tests were performed with two different architectures because the clustering file systems have different goals, and this may have affected the results. Both OCFS2 and GFS target clusters where data is located in a storage area network and a storage device is visible to all cluster nodes over the network. All the cluster nodes then play similar roles, and there is no client/server architecture of the kind common in NFS-related file systems, which is also reflected in our tests. While iSCSI plays a role like that of an NFS server, the resulting system is fundamentally different. Still, NFS is in many ways so mature and optimized that new file systems may find it difficult to challenge (see Figure 10).

One problem we experienced during the test series was the instability of the cluster file systems. NFS and GFS were the only file systems that caused no problems when running the tests. The most problematic file system was CacheFS, which also builds on NFS. It had many stability problems, as the NFS client that included CacheFS crashed many times. UnionFS and OCFS2 also showed some minor stability problems during the tests. The lack of documentation was another problem when setting up the different environments. However, support via IRC channels and e-mail worked surprisingly well, and in the end, someone found a way to solve each problem. An additional observation is that the communities around the clustering file systems differ in size, and their development capacity varies a lot: some are supported by companies and others by a few active developers and small communities, which can affect the output of the community.

Finally, it seems that the tested clustering file systems are developing rapidly, and the associated communities have been very active recently. The development of CacheFS has been very active since last spring, and its functionality has increased a lot during this period. OCFS2 is now included in the Linux kernel (2.6.16 and later), which is a big step for its development. UnionFS has also released specific versions for each kernel version.

References
1.  Bonnie++: www.coker.com.au/bonnie++/.

2.  CacheFS: www.redhat.com/archives/linux-cachefs/.

3.  Fedora Core: http://fedora.redhat.com/.

4.  GFS: www.redhat.com/software/rha/gfs/.

5.  Iozone File system Benchmark: www.iozone.org/.

6.  iSCSI Enterprise Target: http://iscsitarget.sourceforge.net/.

7.  Logical Volume Manager: http://sourceware.org/lvm2/.

8.  NFS: http://nfs.sourceforge.net/.

9.  OCFS2: http://oss.oracle.com/projects/ocfs2/.

10.  Open-iSCSI project: www.open-iscsi.org/.

11.  Ubuntu: www.ubuntu.com/.

12.  UnionFS: www.unionfs.org/.

About Matti Kosola
Matti Kosola is 25 years old and about to graduate this fall as a Master of Science from Tampere University of Technology in Finland. He has a major in software engineering and minors in communications engineering and industrial management. He started his career as a research assistant at Tampere U of Tech in 2005, working on a development tool project for the Nokia 770. Since spring 2006 he has been researching Linux clustering file systems.

About Tommi Mikkonen
Prof. Tommi Mikkonen (MSc 1992, Lic. Tech. 1995, Dr. Tech. 1999, all from Tampere University of Technology, Tampere, Finland) works on software architectures, software engineering, and open source software development at the Institute of Software Systems at Tampere U of Tech. Over the years, he has written a number of research papers and supervised theses and research projects on software engineering. At present, he is also supervising the software engineering track of a multi-disciplinary project on open source software, where the focus is on supporting software architecture and the development process.

About Jyke Jokinen
Jyke Jokinen is a Teaching Research Scientist at Tampere University of Technology. His current research interests include distributed systems, concurrent programming, and programming languages. Jokinen received his MSc in Information Technology (main subject Software Systems, subsidiary subject Engineering Physics) from Tampere University of Technology.


Reader Feedback

Kevin Closson wrote: This is a good article. I know this is about Open Source "CFS" technology, but readers may also be interested in other testing that has been done on Linux systems with Open Source and closed source commercial products. There is a lot of such information on my blog: kevinclosson.wordpress.com

sniper wrote: Could you elaborate on the individual test results? As in, what was being tested. Also, what were the parameters for the individual filesystems? Like blocksize, etc.

