Many of us who work in IT/IS/DS have been directly involved in decisions over whether virtual servers can provide as robust a solution as a properly designed and maintained physical one. Perhaps it is even possible to improve on the physical model?
Virtual *services* have been around for many years. HTTP, for example, quickly developed the ability to serve more than one website from a single host, later basing the decision solely on the name of the site requested. The next logical step is to virtualise the entire server.
Early attempts were obviously going to be more of a cost-saving exercise, compromising performance for lower TCO. But the growth of services and their expected performance are proportional over time: the more something is used, the more we expect of it.
Take water. If you only bathe once a week it doesn’t matter how long the bath takes to fill (as long as the water is still hot), but if you take a shower every day then the rate of flow of said water can become a major inconvenience.
Data services, and therefore servers, are the same. So virtualisation had to improve.
When I started to work in virtual models it was the respected and accepted belief that rapid data access systems (high-end DBs, MS Exchange and the like) could not be virtualised. This is to do with peaks in usage patterns.
Virtualisation works as a cost-effective model because most physical systems are rarely working at over 50% utilisation of the CPU, or possibly even the RAM. Thus that available slack can be taken up by, say, another system running side by side.
Now we hit the crunch. What happens when the first service gets *spiked*? There is a sudden influx of requests, the CPU needs to work harder, but the second service is simply preventing this.
Massive issues ensue, sometimes with unpredictable consequences for the integrity of information, if not the raw data. For example, a report may time out and return only a partial result, or be calculated over only a partial dataset.
Of course very careful resource management can negate a lot of this risk but then the virtual environment approaches the physical model.
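The crunch above can be sketched in a few lines. This is a minimal toy model, not how any real hypervisor schedules CPU: a host simply scales every VM's demand back proportionally when the total exceeds capacity, which is enough to show how one VM's spike starves its neighbour.

```python
def allocate_cpu(host_capacity, demands):
    """Share a host's CPU among VM demands (illustrative sketch).

    If total demand fits within capacity, every VM gets what it asks for.
    If not, the host scales all demands back proportionally, so a spike
    on one VM shorts both it and the VMs sharing the host.
    """
    total = sum(demands.values())
    if total <= host_capacity:
        return dict(demands)
    scale = host_capacity / total
    return {vm: demand * scale for vm, demand in demands.items()}

# Quiet day: two VMs at 40 units each on a 100-unit host. All demand met.
print(allocate_cpu(100, {"web": 40, "db": 40}))

# The web VM spikes to 90: 130 units demanded, only 100 available.
# Now web gets roughly 69 and db roughly 31; both fall short.
print(allocate_cpu(100, {"web": 90, "db": 40}))
```

Real schedulers use shares, reservations and limits rather than a flat proportional cut, but the underlying arithmetic of overcommitment is the same.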
Time moves on and now things are very different.
Technologies such as VMware make it possible to virtualise almost all components of a corporation’s IS, right from the simplest FTP server up to full AD/Exchange, even transactional RDBMSs.
It is now possible to have the virtual environment analyse the load on services and decide whether a service needs to be *moved* onto a more powerful system to ensure continued service delivery. Similarly, perhaps a service has a period of inactivity and doesn’t need as many resources as during normal operational hours; then it could be moved to a lower-specification system, freeing resources for other systems, possibly backup maintenance or data consolidation.
This movement could be a simple backend downgrading of the resources available: say, reducing the RAM from 2GB to 1GB if the server is consistently using less. The real advantage is that the virtual system doesn’t know that it has less memory. It still thinks it has 2GB. If the system requests more, the controlling system can reprovision the memory and away we go.
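The trick of the guest still "seeing" 2GB while the host backs less of it can be sketched as a toy class. All the names here are hypothetical; real hypervisors do this with a balloon driver inside the guest, but the accounting looks roughly like this:

```python
class VirtualMachine:
    """Toy model of host-side memory reprovisioning (hypothetical names).

    The guest always sees its configured RAM. The host quietly backs only
    what the guest actually touches, and grows the backing again on demand.
    """

    def __init__(self, configured_mb):
        self.configured_mb = configured_mb  # what the guest believes it has
        self.backed_mb = configured_mb      # what the host actually commits
        self.in_use_mb = 0

    def guest_visible_ram(self):
        return self.configured_mb           # never changes from the guest's view

    def touch(self, mb):
        """Guest starts using more memory; host must back it."""
        self.in_use_mb = min(self.configured_mb, self.in_use_mb + mb)
        if self.in_use_mb > self.backed_mb:
            self.backed_mb = self.in_use_mb  # host reprovisions on demand

    def reclaim_slack(self):
        """Host shrinks the backing down to what the guest actually uses."""
        self.backed_mb = max(self.in_use_mb, 512)  # keep a 512 MB floor

vm = VirtualMachine(2048)
vm.touch(800)
vm.reclaim_slack()
print(vm.guest_visible_ram(), vm.backed_mb)  # 2048 800 -- guest none the wiser
vm.touch(700)            # demand climbs past the reclaimed backing
print(vm.backed_mb)      # 1500 -- host quietly grows it again
```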
All of this means that the available resources are used more efficiently. But the not-so-hidden cost is the very careful capacity management required to allow all these services to survive in a single virtualised environment.
If a service is thought to be mission critical it can be provisioned with a guarantee on, say, the RAM. But that means less is available for other systems, even if the first system isn’t currently using said memory.
Some systems have peaks and troughs at the same time: a DB service and a report manager, for example. It makes sense to ensure these systems don’t fight for the same resources, so you set up rules to ensure that cannot happen. This will again restrict the capacity of the overall environment.
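The cost of a guarantee shows up clearly in a toy admission check. This is a sketch with made-up names, not any real capacity manager: reserved RAM is counted against the pool whether or not the owning VM is using it, so guarantees for a few mission-critical VMs shrink what the rest of the environment can take on.

```python
def admit(host_ram_mb, reservations, new_vm, new_reservation_mb):
    """Admission check for guaranteed RAM (illustrative sketch).

    A reservation is subtracted from the pool permanently, even when the
    owning VM is idle. That is the price of the guarantee.
    """
    committed = sum(reservations.values())
    if committed + new_reservation_mb > host_ram_mb:
        return False                      # would break an existing guarantee
    reservations[new_vm] = new_reservation_mb
    return True

# A 24 GB host with two mission-critical VMs guaranteed 8 GB each.
pool = {"exchange": 8192, "sql": 8192}
print(admit(24576, pool, "reports", 4096))  # True: 4 GB guarantee still fits
print(admit(24576, pool, "build", 8192))    # False: pool exhausted on paper,
                                            # even if exchange/sql sit idle
```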
All of this can be managed, however, as long as there are skilled hands on the wheel. Knowing how and when your services will need greater resources (perhaps permanently, if they suddenly become more popular) requires dedication, analysis and communication. All the planning in the world won’t keep your web server up if someone in PR appears on peak-time TV promoting your service without forewarning the capacity management team. Similarly, spikes are evident on corporate websites after every major announcement, report or the like.
But few companies are yet so joined up. PR is obviously accountable to the corporate engine, as are capacity management and IS, but rarely do the two confer on requirements.
Sad but true.
I have spoken primarily of memory and CPU resources so far. Of course, the same is true of hard disk resources. There are two metrics with hard disk: size and responsiveness.

Virtualisation allows the disk size apparently available to a machine to be wildly different from the actual disk space used. Mainly due to space compression, the virtualised disk can take up a fraction of the space apparently available. But this compression can come at the cost of responsiveness, and this was one of the major issues with earlier virtualised environments. Once the used capacity in the virtual system approached the physically available amount, the physical environment had to adjust the virtual disk, sometimes rapidly, to provision more space. The virtualised system isn’t expecting any delay, and thus problems ensue with IO locks and waits and, especially with DBs, the potential for significant disruption.

Current virtualisation environments are able to accommodate these changes far more rapidly. With the introduction of Storage Area Networks (SANs) and better frontend management, these issues are almost negated, except in extreme circumstances. It is even possible to have different types of disk for different systems: faster, more expensive ones only where required. As a side note, I left hard disk last because, excepting the SAN and management systems, it is by far and away the cheapest of the three main resources.
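The gap between apparent and actual disk space is the essence of thin provisioning, and it can be sketched in a few lines. Again a toy model with hypothetical names, not any real datastore: the guest sees the full provisioned size, while the backing store only grows as blocks are actually written.

```python
class ThinDisk:
    """Thin-provisioned virtual disk (illustrative sketch).

    The guest sees the full provisioned size; the backing store only
    holds the blocks actually written. Many such disks can oversubscribe
    one datastore, which works until they all start filling up at once.
    """

    def __init__(self, provisioned_gb):
        self.provisioned_gb = provisioned_gb
        self.written = set()              # which 1 GB "blocks" hold data

    def write_block(self, block):
        if not 0 <= block < self.provisioned_gb:
            raise ValueError("write past end of virtual disk")
        self.written.add(block)           # rewriting a block costs nothing new

    def guest_size_gb(self):
        return self.provisioned_gb        # what the VM believes it has

    def datastore_usage_gb(self):
        return len(self.written)          # what the storage actually holds

disk = ThinDisk(100)
for b in range(12):
    disk.write_block(b)
print(disk.guest_size_gb(), disk.datastore_usage_gb())  # 100 12
```

The danger the post describes is exactly the moment `datastore_usage_gb` across all disks approaches the real capacity: the backend must grow or migrate storage at a speed the guests never planned for.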
So today we are at a stage where, yes, extreme unexpected peaks are problematic, but the vast majority of the time almost any service can be virtualised. Additionally, those peaks are theoretically better managed by the virtualised environment: a physical server requires significant downtime to be improved upon.
Having said all this, there are still some systems that require almost a one-to-one virtual-to-physical relationship: anything with an external or attached hardware requirement (an SMS modem, for example), or extreme DBs. But even these gain from the other two major advantages of virtualisation: ease of backup and ease of disaster recovery.
In the physical world the requirements of offline storage, be it another system elsewhere, a local tape drive or a networked tape library, are a major issue. From the virtualised environment, however, a background backup can be taken with almost imperceptible impact on the running system. A snapshot can be taken, then written to disk and so on, without the running system even pausing. Well, that last statement is not entirely accurate: there is a pause, but now it is better managed and smaller, sometimes taking the form of a very slight slowdown of the system as the running version and the snapshot version are brought back into sync.
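The snapshot-then-sync dance described above is essentially copy-on-write, and a toy version makes the "brought back into sync" step concrete. This is a sketch with invented names, not a real hypervisor's snapshot format: taking a snapshot just freezes the base block map, later writes land in a delta layer, and merging the delta afterwards is the small slowdown the post mentions.

```python
class CowDisk:
    """Copy-on-write snapshot sketch (hypothetical, simplified).

    snapshot() freezes the base near-instantly; the backup job can then
    stream the frozen base while the VM keeps writing into a delta layer.
    merge() folds the delta back once the backup is done.
    """

    def __init__(self):
        self.base = {}       # block -> data as of the snapshot
        self.delta = None    # writes made after the snapshot

    def snapshot(self):
        self.delta = {}      # freeze the base; no data is copied

    def write(self, block, data):
        if self.delta is not None:
            self.delta[block] = data   # VM keeps running, base untouched
        else:
            self.base[block] = data

    def read(self, block):
        if self.delta is not None and block in self.delta:
            return self.delta[block]
        return self.base.get(block)

    def merge(self):
        """Fold the delta back into the base: the brief post-backup sync."""
        self.base.update(self.delta)
        self.delta = None

d = CowDisk()
d.write(0, "v1")
d.snapshot()
d.write(0, "v2")                  # lands in the delta layer
backup_view = dict(d.base)        # stable view for the backup job
print(backup_view[0], d.read(0))  # v1 v2: backup and VM see different data
d.merge()
print(d.base[0])                  # v2: back in sync
```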
Virtualisation will, with very careful and skilled management, provide everything that is required to run almost every service conceivable.
Why then does my latest project run so badly in virtualisation? Hmm.
Well, really this is the emerging science and giant that is Cloud Computing. This truly is a behemoth. Systems at this level can control every aspect of their resources, normally without any intervention from us mere humans at all. Go SkyNet, go!
– Posted using BlogPress from my iPhone