A VMmark Review Panel comprised of several of our partners has recently been formed in order to provide additional rigor and transparency to the benchmark reviews. The founding members of the review panel are AMD, Dell, and HP, in addition to VMware. (We hope to add more partners soon.)
This broader and more open review process will produce an even greater level of confidence in the accuracy and compliance of published results.
I agree. I’ve blogged about the veracity of VMmark results before and it generated some good discussion.
VMmark is a virtualization throughput benchmark developed by VMware to test its products' performance on compatible hardware configurations. Its job is to stress the CPU/memory subsystem of a server hosting virtual machines and index its performance at its maximum acceptable workload. Vendors document VMmark tests with VMware products (normally ESX) on a given hardware/software configuration and submit the results to VMware, which publishes them on its website.
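VMmark's actual workloads, reference values, and scoring rules are defined in its run and reporting rules; as a rough illustration of how a tile-based throughput index can be computed, here is a minimal Python sketch. The workload names and reference throughputs below are hypothetical, and the aggregation (geometric mean of normalized throughputs per tile, summed across tiles) is an assumption about the general approach, not VMmark's exact formula.

```python
from math import prod

# Hypothetical reference throughputs for the workloads in one tile.
# (The real benchmark's workloads and reference values come from its
# run rules; these names and numbers are illustrative only.)
REFERENCE = {"mail": 100.0, "web": 200.0, "database": 50.0}

def tile_score(measured):
    """Geometric mean of each workload's throughput, normalized to reference."""
    ratios = [measured[w] / REFERENCE[w] for w in REFERENCE]
    return prod(ratios) ** (1.0 / len(ratios))

def benchmark_score(tiles):
    """Aggregate score: sum of per-tile scores across all running tiles."""
    return sum(tile_score(t) for t in tiles)

# Two tiles of VMs, each reporting measured throughput per workload.
tiles = [
    {"mail": 110.0, "web": 190.0, "database": 55.0},
    {"mail": 105.0, "web": 210.0, "database": 48.0},
]
print(round(benchmark_score(tiles), 3))
```

Normalizing each workload against a fixed reference is what lets scores from different vendors' submissions be compared, which is why the inference described below (that similar hardware yields similar scores regardless of who submits) is plausible.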
VMmark came out of beta with version 1.0 in July 2007. To date, Dell, HP, IBM, and Sun have submitted results that have been published by VMware. The results cover the AMD Opteron and Intel Xeon server platforms, which all four server vendors now provide to varying degrees. It's been a useful resource for me, since the competition between AMD and Intel over the last few years has resulted in each vendor taking turns leading in virtualization performance in the 2-socket and 4-socket x86 server spaces. Regardless of which vendor submits a VMmark result for a particular processor/memory/chipset combination, the result can usually be inferred to be similar to what would be obtained on another vendor's implementation of that combination. Based on a recent conversation I had with HP, they expect that customers will make that inference. I had approached them twice about HP's lack of up-to-date VMmark results for their flagship virtualization platforms, and was told that they hadn't submitted recent benchmarks due to their reluctance to publish results with non-production VMware ESX builds and/or hardware that wasn't yet available to customers. Because other vendors were publishing results on current or upcoming platforms sooner, HP apparently didn't see much return in going through the trouble and cost of performing and documenting VMmark runs on their implementation of similar platforms.
Note that when I described VMmark, I mentioned compatible, not supported, hardware configurations; that's because VMware has published results from vendors that used pre-release, unsupported software and/or hardware. I think this is the most likely reason Dell was the first to release a quad-core-Opteron-based VMmark. If you look at the disclosure for that submission, you'll see that it was run on a PowerEdge R905 with 2.5GHz quad-core Opterons (model 8360 SE), a processor model that isn't available for purchase in that server today. The fastest available R905 today has model 8356 (2.3GHz) processors. Dell's submitted results for their PowerEdge R900 with Xeon 7350 processors were obtained with a beta version of VMware ESX Server v3.5, build 62773, and were tested on November 16, 2007: a few weeks before the production release of ESX 3.5, build 64607, on December 10th. In fact, of the 16 total VMmark results published as of today, the only vendor that has submitted results with hardware or software unavailable at the time of publishing is Dell.
To better reflect the version and status of hardware and software used to obtain the published results, I think VMware should:
- refuse to publish results that use pre-release hardware and/or software
- clearly state the availability and/or versions of the tested hardware and software in the system descriptions on the results page
That would allow customers like me to better determine the veracity of a published score without having to play detective. As VMmark evolves and future SPEC-sanctioned virtualization benchmarks come to market, it would be nice to see more relevant benchmarks from more vendors rather than gamed, dubious ones from a few.