What is Virtual GPU and how does it work?

Virtual GPU

A virtual graphics processing unit (vGPU) is a computer processor that renders graphics on a virtual machine's (VM's) host server rather than on a physical endpoint device. A virtual GPU reduces lag when delivering graphics to remote users and provides performance comparable to what they would get from a PC. This is especially useful for users who need computer-aided design or 3D graphics applications.

Virtual desktop infrastructure (VDI) is a great way to deliver desktops and apps to workers, but it's not ideal for delivering the kind of performance power users need for apps that display complex graphics. That's where virtual graphics processing unit (vGPU) cards come in. NVIDIA introduced the first virtual GPU in 2012 to help solve that problem. Virtualized GPU power also changes things on the company's back end: offloading graphics work saves CPU cycles, which leads to savings in hardware, floor space, and cooling. There are also savings on the front end in computer, network, and cooling costs. Virtual GPU is a good way to serve multiple users from a single GPU, and Intel, AMD, and NVIDIA all continue to develop the technology.

Fundamental types of virtualized GPUs

API Intercept

The oldest of the three approaches, API Intercept, works at the OpenGL and DirectX level. It intercepts the application's graphics commands, sends them to the GPU, then gets the results back and shows them to the user. Since this is all done in software, no GPU features are exposed to the VM. This also means that the software tends to lag behind the GPU in terms of which APIs it supports. API Intercept typically has good performance when it works. It's the only method that supports vMotion.
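To make the interception idea a bit more concrete, here is a minimal sketch in Python. It is a toy model, not how any real product is implemented: the class names, the call names ("clear", "draw_triangles"), and the fake backend are all made up for illustration and are not real OpenGL or DirectX entry points. It only shows why an intercept layer can forward the calls it knows about and nothing else.

```python
class FakeGpuBackend:
    """Hypothetical stand-in for the host GPU driver; records what it was asked to do."""

    def __init__(self):
        self.executed = []

    def execute(self, call_name, args):
        self.executed.append((call_name, args))
        return f"rendered:{call_name}"


class ApiInterceptLayer:
    """Hypothetical software layer sitting between the guest app and the host GPU."""

    def __init__(self, backend, supported_calls):
        self.backend = backend
        self.supported_calls = set(supported_calls)

    def call(self, call_name, *args):
        # Only calls the interception software knows about can be forwarded.
        # Anything newer than the layer itself is rejected, even if the
        # physical GPU could handle it.
        if call_name not in self.supported_calls:
            raise NotImplementedError(f"{call_name} is not supported by the intercept layer")
        return self.backend.execute(call_name, args)


if __name__ == "__main__":
    backend = FakeGpuBackend()
    layer = ApiInterceptLayer(backend, supported_calls={"clear", "draw_triangles"})
    print(layer.call("clear", 0))
    try:
        layer.call("tessellate")
    except NotImplementedError as err:
        print("rejected:", err)
```

The rejected call is the "lag behind the GPU" problem in miniature: the hardware may support a feature, but the guest cannot reach it until the intercept software catches up.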


Pass-through

Pass-through, which if memory serves has been around longer than Virtualized GPU, connects a virtual machine directly to a GPU. If you have two cards in your server, you get to connect two VMs to GPUs while everyone else gets nothing. This is great for the highest-end workloads, since each VM gets access to all of the GPU and its features, and application compatibility is great. Pass-through is the most expensive option by far, and apart from the high-end use case, the only other uses are GPGPU or a reward for good performance at work.
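The one-GPU-per-VM rule is easy to see in a small sketch. The function and the GPU and VM names below are hypothetical and only illustrate the mapping; a real hypervisor assigns PCI devices through its own configuration.

```python
def assign_passthrough(gpus, vms):
    """Give each physical GPU to at most one VM, in order; remaining VMs get None."""
    assignment = {}
    for index, vm in enumerate(vms):
        assignment[vm] = gpus[index] if index < len(gpus) else None
    return assignment


if __name__ == "__main__":
    # Two cards in the server, three VMs asking for one.
    print(assign_passthrough(["gpu0", "gpu1"], ["vm-a", "vm-b", "vm-c"]))
```

With two cards and three VMs, the third VM ends up with no GPU at all, which is the scaling limit that makes pass-through expensive.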

Virtualized GPU

This is the hottest spot in desktop virtualization today, save maybe storage and hyper-converged infrastructure. With Virtualized GPU, users get direct access to a part of the GPU. This is preferable to API Intercept because the OS uses the real AMD/NVIDIA/Intel drivers, which means applications can use native graphics calls as opposed to a genericized subset of them. It has better performance than API Intercept. Though it gives applications direct access to the GPU, each user only gets a portion of the GPU, so it can still be limited in certain situations. That said, application compatibility is good, but vMotion is not supported.
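A rough way to think about "a portion of the GPU" is in terms of graphics memory. The sketch below assumes every VM on a card gets the same fixed-size slice of framebuffer; that assumption, the function name, and the sizes used are illustrative only, and real products define their own profile sizes and limits.

```python
def vgpus_per_card(framebuffer_mb, profile_mb):
    """How many equal-size vGPU slices fit in one card's framebuffer (illustrative)."""
    if framebuffer_mb < 0 or profile_mb <= 0:
        raise ValueError("framebuffer_mb must be >= 0 and profile_mb must be > 0")
    return framebuffer_mb // profile_mb


if __name__ == "__main__":
    # Hypothetical 8 GB card, carved into 1 GB or 2 GB slices.
    print(vgpus_per_card(8192, 1024))  # 8 users, each with a small slice
    print(vgpus_per_card(8192, 2048))  # 4 users, each with a bigger slice
```

The trade-off is visible straight away: bigger slices mean fewer users per card, and smaller slices mean each user can run out of headroom sooner.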

Features of vGPU

True hardware virtualization with NVIDIA vGPU

Graphics applications have direct access to the GPU, reducing system latency and improving performance with complex 3D workloads.

Native graphics hardware stack

Driver support for the latest OpenGL and DirectX libraries for maximum application compatibility and performance, thanks to the end-to-end graphics stack from NVIDIA (hardware, Windows driver, hypervisor manager).

Leverage FlexCast services

Balance high performance and optimum scalability by choosing to deliver just the apps or a full desktop, using the FlexCast delivery model in XenDesktop and XenApp, with HDX optimizations for low-bandwidth connections and a broad range of client devices.

Hypervisor Requirements and Hypervisor Support

Both Intel and NVIDIA require a software manager to be installed in the hypervisor. This isn’t a big deal, since both GPUs are certified to run on certain platforms, but it is an extra step. AMD uses SR-IOV, which essentially means the card is designed to present itself to the BIOS in such a way that the BIOS treats it as if it’s several cards, so you don’t need a software component in the hypervisor itself.
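If you want to check whether a card presents itself as several devices, a Linux host exposes the standard SR-IOV attributes `sriov_totalvfs` and `sriov_numvfs` in each PCI device's sysfs directory. The sketch below simply reads those two files. It assumes a Linux host, the helper name is made up, and the PCI address in the usage example is a placeholder; whether a particular GPU and driver populate these files is something to verify on your own hardware.

```python
from pathlib import Path


def read_sriov_counts(device_dir):
    """Return (enabled_vfs, total_vfs) for a PCI device directory, or None without SR-IOV."""
    device_dir = Path(device_dir)
    total_file = device_dir / "sriov_totalvfs"
    enabled_file = device_dir / "sriov_numvfs"
    # Devices that do not support SR-IOV simply do not have these files.
    if not total_file.exists() or not enabled_file.exists():
        return None
    enabled = int(enabled_file.read_text().strip())
    total = int(total_file.read_text().strip())
    return enabled, total


if __name__ == "__main__":
    # Placeholder address; substitute the address of your own card.
    print(read_sriov_counts("/sys/bus/pci/devices/0000:03:00.0"))
```

A result such as `(4, 16)` would mean four virtual functions are currently enabled out of sixteen the card can offer; `None` means the device does not advertise SR-IOV at that path.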

How does vGPU work?

A graphics processing unit has thousands of computing cores that process workloads efficiently in parallel. 3D apps, video, and image rendering are all massively parallel tasks, and the GPU’s ability to handle parallel tasks is what makes it so good at accelerating computer-aided applications. Engineers rely on that performance for heavy-duty work like computer-aided engineering (CAE), computer-aided design (CAD), and computer-aided manufacturing (CAM), but plenty of other consumer and enterprise applications benefit too. Any processor can render graphics; four, eight, or 16 cores could do the job. But with the thousands of specialized cores on a GPU, there’s no long wait. Applications simply run faster and interactively, the way they’re supposed to run.

vGPU software delivers GPU-accelerated, graphics-rich virtual desktops and workstations. It takes a physical GPU installed in a server and creates virtual GPUs that can be shared across multiple virtual machines. It’s no longer a one-to-one relationship from GPU to user, but one-to-many.
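To put the core counts above in perspective, here is a back-of-the-envelope calculation of how many pixels each core has to handle for a single frame if the work is split evenly. It is an idealized sketch: it ignores clock speed, memory bandwidth, and the fact that CPU and GPU cores are not equivalent, and the 2048-core figure is just an illustrative number, not a specific card.

```python
def pixels_per_core(width, height, cores):
    """Pixels each core processes for one frame, assuming a perfectly even split."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    total_pixels = width * height
    # Integer ceiling division: the busiest core gets the rounded-up share.
    return (total_pixels + cores - 1) // cores


if __name__ == "__main__":
    for cores in (4, 8, 16, 2048):
        print(cores, "cores ->", pixels_per_core(1920, 1080, cores), "pixels each")
```

For a 1920x1080 frame, eight cores would each work through 259,200 pixels, while 2048 cores would each handle about a thousand. That gap is why per-pixel work suits a GPU so well.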


It's true that many administrators and businesses that use virtual desktop infrastructure (VDI) for normal office tasks might assume that virtualized GPU power is not a significant value proposition, but GPUs are actually more important than ever.


I hope this article helped clear up your doubts about virtual GPU. You can share your experiences in the comments section. Thank you!
