VMmark is a virtualization throughput benchmark developed by VMware to test its products’ performance on compatible hardware configurations. Its job is to stress the CPU/memory subsystem of a server hosting virtual machines and index its performance at its maximum acceptable workload. Vendors document VMmark tests with VMware products (normally ESX) on a given hardware/software configuration and submit the results to VMware, which publishes them on its web site.
VMmark came out of beta with version 1.0 in July 2007. To date, Dell, HP, IBM, and Sun have submitted results that have been published by VMware. The results cover the AMD Opteron and Intel Xeon server platforms, which all four server vendors now provide to varying degrees. It’s been a useful resource for me, since the competition between AMD and Intel over the last few years has resulted in each vendor taking turns leading in virtualization performance in the 2-socket and 4-socket x86 server spaces. Regardless of which vendor submits a VMmark result for a particular processor/memory/chipset combination, the result can usually be inferred to be similar to what would be obtained on another vendor’s implementation of that combination. Based on a recent conversation I had with HP, they expect that customers will make that inference. I had approached them twice about HP’s lack of up-to-date VMmark results for their flagship virtualization platforms, and was told that they hadn’t submitted recent benchmarks due to their reluctance to publish results with non-production VMware ESX builds and/or hardware that wasn’t yet available to customers. Because other vendors were publishing results on current or upcoming platforms sooner, HP apparently didn’t see much return on going through the trouble and cost of performing and documenting VMmarks on their implementation of similar platforms.
Note that when I described VMmark, I mentioned compatible, not supported, hardware configurations; that’s because VMware has published results from vendors that used pre-release, unsupported software and/or hardware. I think this is the most likely reason Dell was the first to release a quad-core-Opteron-based VMmark. If you look at the disclosure for that submission, you’ll see that it was run on a PowerEdge R905 with 2.5GHz quad-core Opterons (model 8360 SE), a processor model that isn’t available for purchase in that server today. The fastest available R905 today has model 8356 (2.3GHz) processors. Dell’s submitted result for their PowerEdge R900 with Xeon 7350 processors used a beta version of VMware ESX Server v3.5, build 62773, and was tested on November 16, 2007: a few weeks before the production release of ESX 3.5, build 64607, on December 10th. In fact, of the 16 total VMmark results published as of today, the only ones that used hardware or software unavailable at the time of publishing were submitted by Dell.
To better reflect the version and status of hardware and software used to obtain the published results, I think VMware should:
- refuse to publish results that use pre-release hardware and/or software
- clearly state the availability and/or versions of the tested hardware and software in the system descriptions on the results page
That would allow customers like me to better determine the veracity of a published score without having to be a detective. As VMmark evolves and future SPEC-sanctioned virtualization benchmarks come to market, it would be nice to be able to see more, relevant benchmarks from more vendors rather than gamed, dubious benchmarks from a few.




8 comments
May 16, 2008 at 11:56 am
Bruce Herndon
As you note, all of the information on component availability is provided for the reader in the full disclosure reports – nothing is hidden or “gamed”. VMmark follows the same guidelines as SPEC in requiring that all benchmarked components be available within 90 days of the benchmark’s publication. (TPC requires availability within a more generous 180 days.) If availability is not met, the result will be marked as invalid. This is a longtime standard benchmarking policy that the major benchmarking organizations follow and that many vendors, including HP, take advantage of at times.
May 16, 2008 at 12:56 pm
aharden
Bruce, thanks for your reply.
I wasn’t aware of the SPEC and TPC solution availability policies you specify, but they don’t alter my concern. While I completely agree that the disclosures are full, VMware could easily implement my second suggestion. That may not alter the state of things, but I think it would help customers.
As a customer, I’d like to see more vendors disclose VMmark results, even if the underlying platforms overlap. I believe I’ve identified a reason that HP doesn’t actively submit results. It may hold for the other vendors involved; HP’s the only one I’ve spoken with. I just wanted to get my thoughts and suggestions on the issue out to see if there might be others with opinions on the subject.
May 16, 2008 at 2:29 pm
Greg Kopczynski
> While I completely agree that the disclosures
> are full, VMware could easily implement my
> second suggestion.
It’s not clear to me what more is reasonable to expect. Since it appears to me that all version information is disclosed, I’ll assume, then, that it is availability dates that concern you. However, in the Dell submission you cited, under “Configuration,” both “Hardware Availability Date” and “Software Availability Date” are listed in Month-Year format. Other availability dates are also disclosed, as required. This seems consistent with other industry-standard benchmarks.
I would concede that it might be an easier read for those specifically interested in release dates if all release dates for all components were lumped together in one place in the disclosure, but that would just move the burden of scrolling around the disclosure from the reader who cares most about the release dates to the reader who would prefer release dates for specific components be listed with that component description. So, for example, under such a new arrangement the person who is interested in the specifics of the “Virtualization Platform” would have to go scrolling to another area of the disclosure (the presumed new common release dates area) to find the release date for the virtualization platform used. The only way to avoid this would be duplication of fields in the disclosure, which has its own, obvious drawbacks.
> I believe I’ve identified a reason that HP
> doesn’t actively submit results.
Yes, but what I continue to find unclear is why HP submits results under TPC and SPEC when they are subject to very similar (if not identical) drawbacks as those you noted for VMmark submissions. In other words, I do not believe these industry-standard benchmarks do any better job of addressing HP’s concerns than does VMmark, and yet HP submits in these spaces. (I’m also pretty sure they publish in these spaces using pre-release components from time to time.)
So if, the next time you speak with HP, they can clarify this seeming inconsistency, I would find that highly enlightening.
May 16, 2008 at 3:26 pm
aharden
Specifically, my second suggestion is to add more information about the hardware and software used (and estimated availability dates if they’re not available at the time of publishing) to the main table on the VMmark results page.
My HP account team was passed the link to this post, so if they want to reply here they’re welcome to.
May 16, 2008 at 3:58 pm
Greg Kopczynski
True. If only they could reposition the right-most column on the results page that would free up some width for more of that kind of data. But given the constraints of the current layout (the results page layout already stretches across most of my 1024-pixel-width screen), I agree with the choice of fields.
May 17, 2008 at 11:44 pm
Bruce Herndon
The SPEC summary pages do not include an availability date while the TPC summary pages do. So, opinions and practices seem mixed. VMmark has always been more closely in line with SPEC practices. Also, keep in mind that the VMmark results page was designed to give a high-level summary of results. Greg is correct that we are faced with the practical issue of page size. Adding too much detail there would likely make the results page much harder to parse and less useful in general. A thorough reading of a full disclosure is really the best way for an interested party to get a full understanding of a particular result. I think educating customers about this is truly the answer, which is why I welcome discussions like this one.
I’d also like to point out that the Dell R900 result you mention above was matched exactly by a later result from Sun using similar hardware and the production release of ESX Server 3.5. I hope such independent confirmation of results would put to rest any concerns that the use of pre-release builds shows any measurable performance variance and is in any way unfair.
Full disclosure for me: I work for VMware and am one of the original developers of VMmark. Prior to that, I worked in a benchmarking team at HP where I published a few results with future availability dates.
May 18, 2008 at 6:32 am
aharden
Thanks to both of you for the discussion.
December 29, 2008 at 1:42 pm
VMmark Review Panel Formed « cygweb
[...] agree. I’ve blogged about the veracity of VMmark results before and it generated some good [...]