Virtualisation plays a huge role in almost all of today’s fastest-growing software-based industries. It is the foundation for most cloud computing, the go-to methodology for cross-platform development, and it has made its way to ‘the edge’: the Internet of Things (IoT). This article is the first in a series where we explain what virtualisation is and how it works. Here, we start with the broad strokes. Anything that goes beyond the scope of a 101 article will be covered in subsequent blog posts. Let’s get into it.
What is virtualisation?
Virtualisation technology creates virtualised hardware environments. It uses software to create an ‘abstraction layer’ on top of hardware to divide up parts of a single computer’s resources, such as processors, memory, storage, etc, between multiple virtual computers. The result can be virtual machines (VMs) or containers. Both allow you to create isolated, secure environments for testing, debugging, legacy software, and for specific needs that do not require all of the resources on the physical hardware.
Today, virtualisation is standard practice in enterprise IT architectures, in software development and at the edge. You can virtualise numerous parts of a computer’s ‘stack’ for a myriad of reasons. You can virtualise:
- Data centres
Each of these scenarios enables providers to serve users, or individual VMs, and means users only need the exact computational resources necessary for a given workload. This could be anything from virtualising single machines to more complex setups like full virtual data centre environments.
What is a virtual machine?
A virtual machine is a resource that uses software to run workloads and deploy apps. Each VM runs its own operating system (the guest OS) and behaves like an independent computer that uses a portion of the resources of the underlying computer (the host). VMs allow users to run numerous different operating systems on one machine, each with potentially different applications and libraries inside. There are numerous tools and methodologies for managing VMs in different places. The first layer of management comes from either a ‘hypervisor’ or ‘application virtualisation’.
What is a hypervisor?
A hypervisor is a layer of software that sits between VMs and hardware to manage resource allocation, general VM to hardware communications, and to make sure VMs don’t interfere with each other. There are two types of hypervisors:
- Type 1: ‘Bare-metal’ hypervisors which interact directly with the underlying hardware and take the place of the OS, except that you only really interact with them through the virtualisation tool. Some examples are: VMware ESXi and Microsoft Hyper-V.
- Type 2: Hypervisors that run as an application on top of the existing OS. Some examples are: Parallels Desktop for Mac, QEMU and VirtualBox.
Each operating system, macOS, Windows, Linux, and so on, uses different hypervisors for different things. Windows ships with Hyper-V and Linux with KVM as built-in hypervisors that are usually classed as ‘type 1’, while macOS provides the Hypervisor framework, which tools such as HyperKit build on. But there are lots of organisations that offer type 1 and type 2 solutions. For example, VirtualBox is a type 2 hypervisor that is popular on both Windows and macOS. VMware specialises in many different kinds of virtualisation (server, desktop, networking and storage), with different hypervisor offerings for each. The details of how hypervisors work are beyond the scope of this article.
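To make the resource-allocation role of a hypervisor more concrete, here is a minimal Python sketch. It is a toy model and not how any real hypervisor is implemented: the `Host` class, its fields and the `create_vm` method are hypothetical names invented for illustration, and the sketch assumes a strict policy in which resources are never overcommitted, whereas real hypervisors often allow overcommitment.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    """Toy model of a physical machine whose resources are divided among VMs."""
    cpus: int
    memory_mb: int
    vms: dict = field(default_factory=dict)  # VM name -> (cpus, memory_mb)

    def free(self):
        """Return the (cpus, memory_mb) not yet assigned to any VM."""
        used_cpus = sum(c for c, _ in self.vms.values())
        used_mem = sum(m for _, m in self.vms.values())
        return self.cpus - used_cpus, self.memory_mb - used_mem

    def create_vm(self, name, cpus, memory_mb):
        """Reserve resources for a VM, refusing requests that would overcommit."""
        if name in self.vms:
            raise ValueError(f"VM {name!r} already exists")
        free_cpus, free_mem = self.free()
        if cpus > free_cpus or memory_mb > free_mem:
            raise RuntimeError(f"not enough free resources for {name!r}")
        self.vms[name] = (cpus, memory_mb)


host = Host(cpus=8, memory_mb=16384)
host.create_vm("web", cpus=2, memory_mb=4096)
host.create_vm("db", cpus=4, memory_mb=8192)
print(host.free())  # (2, 4096)
```

Each VM only ever sees the share it was given, and a request that does not fit is rejected rather than being allowed to interfere with the VMs that are already running.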
What is application-virtualisation?
Application-based virtualisation uses an application (such as Parallels RAS) to effectively stream applications from a virtual environment on another server or host system. Instead of logging into a host computer, users gain access to the application virtually. This separates applications from the operating system and allows the user to run almost any application on other hardware. Users don’t have to worry about local storage, and multiple applications can be run this way while barely touching the host system.
What is virtual networking?
A key part of virtualisation is allowing virtual machines to talk to the rest of the world. VMs need the ability to talk to other VMs, internally with the host, and externally with things outside of the virtual environment. This is done with a virtual network between the virtual machine(s) and the host OS. The network is a line of communication between the VMs and the hardware in the physical environment. There is a lot more to it than that, but the details are beyond the scope of this particular article.
There are many ways to implement a virtual network. Two of the most common are “bridged networking” and “network address translation” (NAT). Using NAT, virtual machines are represented on external networks using the IP address of the host system. Virtual machines in the virtual environment are therefore not directly visible to the outside, which is why virtual machines behind NAT are considered protected. When a connection is made between an address inside the virtual environment and an address outside it, the NAT system forwards the traffic to the correct VM.
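The forwarding step can be sketched as a simple lookup table. The following Python example is an illustrative toy, not the implementation used by any particular virtualisation tool: the `NatTable` class, the starting port of 40000 and the example IP addresses are all assumptions made for the sketch.

```python
class NatTable:
    """Toy NAT table mapping host ports to (vm_ip, vm_port) pairs."""

    def __init__(self, host_ip, first_port=40000):
        self.host_ip = host_ip
        self.next_port = first_port
        self.by_host_port = {}    # host_port -> (vm_ip, vm_port)
        self.by_vm_endpoint = {}  # (vm_ip, vm_port) -> host_port

    def outbound(self, vm_ip, vm_port):
        """Rewrite a VM's source address so it appears as the host's address."""
        key = (vm_ip, vm_port)
        if key not in self.by_vm_endpoint:
            self.by_vm_endpoint[key] = self.next_port
            self.by_host_port[self.next_port] = key
            self.next_port += 1
        return self.host_ip, self.by_vm_endpoint[key]

    def inbound(self, host_port):
        """Find which VM a reply on host_port belongs to, or None if unknown."""
        return self.by_host_port.get(host_port)


nat = NatTable(host_ip="203.0.113.10")
print(nat.outbound("10.0.2.15", 51000))  # ('203.0.113.10', 40000)
print(nat.outbound("10.0.2.16", 51000))  # ('203.0.113.10', 40001)
print(nat.inbound(40001))                # ('10.0.2.16', 51000)
print(nat.inbound(12345))                # None
```

The outside world only ever sees the host address, and traffic arriving on a port with no table entry has nowhere to go, which is the sense in which VMs behind NAT are shielded.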
Bridged networking connects the VMs directly onto the physical network that the host is using. The DHCP server on that network can then assign each VM its own IP address, which makes the VM visible on the network. Once connected, the VM is accessible over the network and can access other machines on the network as if it were a physical machine.
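As a rough illustration of the bridged case, the sketch below models a DHCP server handing out addresses from a single pool to physical and virtual machines alike. It is a simplification under stated assumptions: the `DhcpPool` class, the subnet, the reserved router address and the MAC addresses are hypothetical, and real DHCP involves lease times and a message exchange that are omitted here.

```python
import ipaddress


class DhcpPool:
    """Toy DHCP pool that leases one address per MAC address."""

    def __init__(self, network, reserved=()):
        self.network = ipaddress.ip_network(network)
        self.reserved = {ipaddress.ip_address(a) for a in reserved}
        self.leases = {}  # MAC address -> leased IP address

    def lease(self, mac):
        """Return the existing lease for mac, or hand out the next free address."""
        if mac in self.leases:
            return self.leases[mac]
        taken = self.reserved | set(self.leases.values())
        for addr in self.network.hosts():
            if addr not in taken:
                self.leases[mac] = addr
                return addr
        raise RuntimeError("address pool exhausted")


# Assumed setup: a small subnet whose first address belongs to the router.
pool = DhcpPool("192.168.1.0/29", reserved=["192.168.1.1"])
host_ip = pool.lease("aa:bb:cc:00:00:01")  # the physical host
vm_ip = pool.lease("52:54:00:12:34:56")    # a bridged VM
print(host_ip, vm_ip)  # 192.168.1.2 192.168.1.3
```

From the pool's point of view the VM is just another machine on the network, which is why a bridged VM can be reached like a physical one.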
What are containers?
Containers are standardised units of software that bundle code and all its dependencies into one modular package. While each VM brings its own OS, containers share the kernel of the host machine, even when they bring their own userspace. As a result, they are more lightweight, you can deploy a lot more at once, and they are low(er) maintenance, with everything you need in one place. We typically recommend three types of containers for different use cases:
Linux containers focus on being system containers: containers which create an environment as close to a VM as possible without the overhead of running a separate kernel and virtualising the hardware. These are considered more robust because they are closer to being a machine with all the services in place, and so are used in a lot of traditional operations. Linux containers come from the Linux containers project (LXC), an open source container platform that provides a userspace interface, along with tools, templates, libraries and bindings, for creating and managing containers.
Docker containers are the most popular kind of container among developers for cross-platform deployments in data centres or serverless environments. Docker containers use Docker Engine and numerous other container technologies (early versions of Docker were built on LXC) to create developer-friendly environments that are reproducible regardless of the underlying infrastructure. They are standalone executable packages that include everything needed to run an application: code, runtime, system tools, libraries and settings.
Snaps are containerised software packages that focus on being singular application containers. Where LXC could be seen as a machine container and Docker as a process container, snaps can be seen as application containers. Snaps package code and dependencies in a similar way to containers to keep the application content isolated and immutable. They have a writable area that is separated from the rest of the system, but they are visible to the host via interfaces defined by the application, and they behave more like traditional Debian apt packages.
Snaps are designed for when you want to deploy to a single machine. Applications are built and packaged as snaps using a tool called snapcraft, which incorporates different container technologies to create a secure and easy-to-update way to package applications for workstations or for fleets of IoT devices. There are a few ways to develop snaps. Developers can even configure a snap to run unconfined while they put it together, and containerise everything later when pushing to production. Read more about the different ways snaps can be configured in another article.
Virtual machines vs Containers
Whether you should use a VM or a container depends on your use case. They’re both great technologies for separate reasons, not necessarily competitors. Virtual machines allow users to run multiple OSes on the same hardware, and containers allow users to deploy multiple applications on the same OS, on a single machine.
Pros and cons of VMs
The benefits of using a VM include, but are not limited to:
- More efficient use of computational resources by teams.
- Support for larger, more complex applications that need full OS functionality on a single server.
- The ability to turn one server into many.
- Isolation of potentially risky work from the host environment.
- The ability to run multiple versions of the same OS on the same machine.
- Support for legacy applications that only work on outdated OSes.
- Disaster recovery features that abstract important data from problems on the host.
And of course there are several caveats that include, but are also not limited to:
- Running multiple VMs on a single host can cause unstable performance and overload the host’s resources if unconstrained.
- With some providers, especially at scale, you may need to pay licensing costs for each VM.
- Virtualisation has inherent performance differences simply as a result of the abstraction from hardware, and this can make it harder to troubleshoot timing-dependent issues.
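- Hosts whose CPUs lack hardware virtualisation extensions may not be able to run unmodified guests efficiently or give them access to specific resources. One workaround is paravirtualisation, where the guest OS is modified to cooperate with the hypervisor, but that is beyond the scope of this article.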
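On x86 Linux hosts, the hardware extensions mentioned in the last caveat show up as CPU flags in /proc/cpuinfo: `vmx` for Intel VT-x and `svm` for AMD-V. The following Python sketch checks for them. The function name and the sample text are made up for illustration, and the check assumes the x86 `flags` line format, so it will not recognise other architectures.

```python
def virtualisation_extension(cpuinfo_text):
    """Return 'Intel VT-x', 'AMD-V', or None based on x86 CPU flags."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            flags = set(value.split())
            if "vmx" in flags:
                return "Intel VT-x"
            if "svm" in flags:
                return "AMD-V"
    return None


# Hypothetical excerpt in the style of /proc/cpuinfo.
sample = "model name\t: Example CPU\nflags\t\t: fpu vme sse2 vmx\n"
print(virtualisation_extension(sample))  # Intel VT-x
```

On a real Linux machine you could pass in the contents of /proc/cpuinfo instead of the sample string. A result of None may also mean that the extensions are disabled in firmware or hidden from a guest, not only that the CPU lacks them.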
Pros and cons of containers
The benefits of containers include but are not limited to:
- Security; by default, containers limit what is exposed to the host system and the internet, and the extra layer provided by the container increases the level of security.
- Scalability; large applications and services can be broken down to run in isolated containers that can be spread across multiple resources.
- Manageability; when applications are broken down into containers developers can focus on features and individual aspects of the application rather than worrying about the whole thing.
- Portability; containers run on most infrastructure and can be used almost anywhere in the stack, so the same container environment can be used from development to production.
And of course there are several caveats that include, but are also not limited to:
- Setup and organisation can be difficult because users need to develop a strategy around how they want to operate their particular environment.
- The compartmentalised approach of containers can lead to issues where changes in one container have a negative impact on the rest of the application.
- Support and maintenance of applications, or application parts, inside containers becomes more difficult the more applications are broken down.
- Since containers can share the same operating system, they share all the security threats and vulnerabilities of that OS too.
Virtualisation can exist anywhere computation is important. It is used to isolate whatever is being done from the host computer and to utilise specific resources more efficiently. There are two major kinds of virtualisation: virtual machines, and containers. Each has its pros and cons and can be used independently or together but both have the aim of providing flexibility and efficiency in deploying and managing applications. In our next article we will talk about some of the topics touched on here in more detail.