By John | June 23, 2008
Ok boys, slow down a bit there. I know we are all in a hurry to capitalize on the new industry buzz around the “Clouds”. However, after taking a brief look at Hyperic’s new CloudStatus, it looks like they are riding on all the “hype” associated with the cloud rather than looking for the real business impact that cloud monitoring could deliver. CloudStatus appears to provide non-customer-specific synthetic monitors for five major components of Amazon Web Services (AWS):
- Elastic Compute Cloud (EC2)
- Simple Storage Service (S3)
- Simple Queue Service (SQS)
- SimpleDB (SDB)
- Flexible Payments Service (FPS)
The first issue I have with the CloudStatus offering is that it doesn’t appear to be specific to any one client’s AWS implementation. It looks like they are doing synthetic monitoring of the generic AWS services from their own public and private servers. This might give the blog-o-sphere something interesting to discuss when Amazon has an outage; however, it tells a specific customer very little about the impact on their own business or service. Taking generic snapshots of parts of Amazon’s infrastructure will not give any one customer a bird’s eye view of how their services are affected. We have already seen cases where AWS services were unavailable in some parts of Amazon’s network and available in others. I am not sure how the synthetic and isolated nature of CloudStatus latency aggregation will be useful to any one particular customer’s service. Let’s take a look at each of the specific monitored areas provided by Hyperic’s new CloudStatus.
For EC2 Health, CloudStatus measures the time it takes to start a small EC2 instance. They measure from the time the instance is started until it is available. This is clearly a rookie mistake. What do they mean by available? An instance that is available is not a service that is available. Here again I go back to my point: unless you are measuring from the customer’s perspective, this measurement adds little more than “Hype”. The real measure should be when the service is available to the customer. Also, a service is usually not made up of just one instance. It is made up of tiers of servers and clusters. Yes Virginia, even in the clouds. The lag between an instance being registered as up (or being ping-able) and a customer service being available can be a lifetime. Telling a customer when a service made up of multiple configured systems is up… now that is something worth monitoring.
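To make the instance-versus-service distinction concrete, here is a minimal Python sketch of timing a whole service rather than a single instance. Everything in it is an assumption for illustration: the function name `time_until_service_ready`, the idea of one health check per tier, and the polling parameters are mine, not anything CloudStatus or AWS provides. The clock and sleep functions are passed in only so the logic can be exercised without waiting in real time.

```python
import time
from typing import Callable, Dict, Optional


def time_until_service_ready(
    tier_checks: Dict[str, Callable[[], bool]],
    timeout_s: float = 300.0,
    poll_interval_s: float = 1.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[float]:
    """Return seconds until every tier check passes, or None on timeout.

    tier_checks maps a hypothetical tier name (e.g. "web", "db") to a
    callable that returns True once that tier can serve real requests.
    """
    start = clock()
    while True:
        # The service counts as up only when all tiers report healthy,
        # not when the first instance answers a ping.
        if all(check() for check in tier_checks.values()):
            return clock() - start
        if clock() - start >= timeout_s:
            return None
        sleep(poll_interval_s)
```

In practice each check would be whatever proves the tier is usable from the customer’s side, such as an HTTP request that exercises the application or a test query against the database. The number that comes back is the one a customer actually cares about.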
For S3 Health, CloudStatus simulates puts and gets to Amazon’s S3 from within EC2 instances. They simulate from both the US and Europe. Since most of the AWS services are housed on the east coast, it is no surprise that the EU times are significantly higher. What does that tell us? I am not privy to the architecture of Amazon’s S3 topology; however, I would think that picking one or two points within their network and checking latency against a specific S3 bucket would not yield really meaningful information. It is a bit like reading and writing to one SAN in a very large enterprise and using that as a metric for the health of the whole IT infrastructure. Would this simulation give you reasonable results to base business decisions upon? Also, wouldn’t competing resources on the requesting application server skew the results? Unless this kind of monitoring is done from within a specific customer’s infrastructure, I am not sure how much the results can be relied upon to show business impact.
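For what it is worth, the same kind of put/get probe becomes more interesting when it runs from the customer’s own application hosts against the customer’s own bucket. Below is a rough Python sketch of that idea. The function name `probe_round_trip` and the `put`/`get` callables are hypothetical placeholders; in a real setup they would wrap whatever storage client the application already uses, which I am deliberately not showing here.

```python
import statistics
import time
from typing import Callable, Dict, List


def probe_round_trip(
    put: Callable[[str, bytes], None],
    get: Callable[[str], bytes],
    payload: bytes,
    runs: int = 5,
    clock: Callable[[], float] = time.perf_counter,
) -> Dict[str, float]:
    """Time put-then-get round trips and summarize the latencies in seconds."""
    if runs < 1:
        raise ValueError("runs must be at least 1")
    latencies: List[float] = []
    for i in range(runs):
        key = f"probe-{i}"  # hypothetical key naming, one object per run
        start = clock()
        put(key, payload)
        if get(key) != payload:
            raise RuntimeError(f"read back different data for {key}")
        latencies.append(clock() - start)
    return {
        "min_s": min(latencies),
        "median_s": statistics.median(latencies),
        "max_s": max(latencies),
    }
```

Run from the hosts that actually serve customers, the spread between the median and the maximum says more about what a given application is experiencing than a single global average does.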
SQS Health simulates the round trip time it takes to put and get messages to and from the AWS SQS queues. A much better approach to monitoring SQS would have been to monitor a real queue manager from a customer’s perspective. This is something SmugMug has done for their own queue manager (they use EC2/S3 but not SQS). They monitor their queues, and that kind of monitoring provides real business value. If a vendor is going to monitor SQS, they should look at how other vendors monitor queue managers. When I first heard about Hyperic’s CloudStatus announcement, I was hoping they would include monitoring that gives customers this kind of information: How many jobs are backing up on the queue? Are the queues in a storm? Are there any dead queues? … For all the reasons listed above, I believe Hyperic’s SQS Health is a novelty at best unless they can provide true business-level feedback.
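The questions above (backing up, storms, dead queues) can be answered from a handful of queue samples taken over time. Here is a minimal Python sketch of that kind of check. The `QueueSample` fields, the `assess_queue` name, and both thresholds are assumptions I made up for illustration; they are not SmugMug’s implementation or an SQS API, and real thresholds would depend on the workload.

```python
from dataclasses import dataclass
from typing import Dict, Sequence


@dataclass
class QueueSample:
    timestamp_s: float  # when the sample was taken
    depth: int          # messages currently waiting
    enqueued: int       # cumulative messages put so far
    dequeued: int       # cumulative messages taken so far


def assess_queue(
    samples: Sequence[QueueSample],
    storm_rate_per_s: float = 50.0,  # assumed threshold for a "storm"
    dead_after_s: float = 120.0,     # assumed window with no consumers
) -> Dict[str, object]:
    """Compare the first and last sample and flag common queue problems."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    first, last = samples[0], samples[-1]
    elapsed = last.timestamp_s - first.timestamp_s
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    growth = (last.depth - first.depth) / elapsed
    inflow = (last.enqueued - first.enqueued) / elapsed
    return {
        "backing_up": growth > 0,
        "depth_growth_per_s": growth,
        "storm": inflow >= storm_rate_per_s,
        # Dead: work is waiting but nothing was consumed for the whole window.
        "dead": last.depth > 0
        and last.dequeued == first.dequeued
        and elapsed >= dead_after_s,
    }
```

A round trip time for one synthetic message cannot tell you any of this, which is the point: the useful signals live in the customer’s own queues.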
SDB Health simulates gets and puts to Amazon’s SimpleDB. I really shouldn’t have to explain why this is pure “Hype”. However, here is my spin… This is the equivalent of doing a select him,her,them from youseguysDB and saying you are monitoring Oracle. Even ignoring the fact that SimpleDB adoption rates are low, I just don’t see how this adds any value to the cloud conversation. The first vendors who figure out how to analyze applications like SimpleDB, CouchDB and BigTable the way vendors monitor Oracle and DB2 today are the ones who are going to lead the pack in cloud monitoring.
As for FPS Health, this is outright silly. No comment… except, did you notice all the zeros?
I will use the analogy of the movie Get Smart for this one. I went to see Get Smart this weekend with my boys, and I had very high expectations for this movie. The movie turned out to be funny and basically amusing; however, I was very disappointed because it wasn’t great. That is how I feel about Hyperic’s CloudStatus announcement. Hyperic and Zenoss, IMHO, have been leading the non-hype charge for open source monitoring, and until this announcement Hyperic has really been walking the walk when it comes to aligning business activities with IT management. When I heard about the Hyperic announcement last week, I was really excited. I knew they could not solve all the “Cloud” problems in a first offering, but I was disappointed in how they went for the quick marketing win (did you see their nonsense video?). CloudStatus is free and that is a good thing; however, I wish they had taken a more serious approach to this endeavor. I was hoping to see more of an emerging open source type of project where they would lay out a few gems and then try to get industry input and involvement to grow the space. Instead they opted for the quick win and used a massive PR machine to add a tremendous amount of hype and make this service look like it has some real business value. All this in an already overhyped space (“The Cloud”).