Commit 84cc667

Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
Pull virtio updates from Michael Tsirkin:

 - device feature provisioning in ifcvf, mlx5
 - new SolidNET driver
 - support for zoned block device in virtio blk
 - numa support in virtio pmem
 - VIRTIO_F_RING_RESET support in vhost-net
 - more debugfs entries in mlx5
 - resume support in vdpa
 - completion batching in virtio blk
 - cleanup of dma api use in vdpa
 - now simulating more features in vdpa-sim
 - documentation, features, fixes all over the place

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (64 commits)
  vdpa/mlx5: support device features provisioning
  vdpa/mlx5: make MTU/STATUS presence conditional on feature bits
  vdpa: validate device feature provisioning against supported class
  vdpa: validate provisioned device features against specified attribute
  vdpa: conditionally read STATUS in config space
  vdpa: fix improper error message when adding vdpa dev
  vdpa/mlx5: Initialize CVQ iotlb spinlock
  vdpa/mlx5: Don't clear mr struct on destroy MR
  vdpa/mlx5: Directly assign memory key
  tools/virtio: enable to build with retpoline
  vringh: fix a typo in comments for vringh_kiov
  vhost-vdpa: print warning when vhost_vdpa_alloc_domain fails
  scsi: virtio_scsi: fix handling of kmalloc failure
  vdpa: Fix a couple of spelling mistakes in some messages
  vhost-net: support VIRTIO_F_RING_RESET
  vhost-scsi: convert sysfs snprintf and sprintf to sysfs_emit
  vdpa: mlx5: support per virtqueue dma device
  vdpa: set dma mask for vDPA device
  virtio-vdpa: support per vq dma device
  vdpa: introduce get_vq_dma_device()
  ...
2 parents 49d5759 + deeacf3 commit 84cc667

47 files changed

+3535
-502
lines changed

Documentation/driver-api/index.rst

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -108,6 +108,7 @@ available subsections can be seen below.
108108
vfio-mediated-device
109109
vfio
110110
vfio-pci-device-specific-driver-acceptance
111+
virtio/index
111112
xilinx/index
112113
xillybus
113114
zorro
Lines changed: 11 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,11 @@
1+
.. SPDX-License-Identifier: GPL-2.0
2+
3+
======
4+
Virtio
5+
======
6+
7+
.. toctree::
8+
:maxdepth: 1
9+
10+
virtio
11+
writing_virtio_drivers
Lines changed: 145 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,145 @@
1+
.. SPDX-License-Identifier: GPL-2.0
2+
3+
.. _virtio:
4+
5+
===============
6+
Virtio on Linux
7+
===============
8+
9+
Introduction
10+
============
11+
12+
Virtio is an open standard that defines a protocol for communication
13+
between drivers and devices of different types; see Chapter 5 ("Device
14+
Types") of the virtio spec (`[1]`_). Originally developed as a standard
15+
for paravirtualized devices implemented by a hypervisor, it can be used
16+
to interface any compliant device (real or emulated) with a driver.
17+
18+
For illustrative purposes, this document will focus on the common case
19+
of a Linux kernel running in a virtual machine and using paravirtualized
20+
devices provided by the hypervisor, which exposes them as virtio devices
21+
via standard mechanisms such as PCI.
22+
23+
24+
Device - Driver communication: virtqueues
25+
=========================================
26+
27+
Although the virtio devices are really an abstraction layer in the
28+
hypervisor, they're exposed to the guest as if they were physical devices
29+
using a specific transport method -- PCI, MMIO or CCW -- that is
30+
orthogonal to the device itself. The virtio spec defines these transport
31+
methods in detail, including device discovery, capabilities and
32+
interrupt handling.
33+
34+
The communication between the driver in the guest OS and the device in
35+
the hypervisor is done through shared memory (that's what makes virtio
36+
devices so efficient) using specialized data structures called
37+
virtqueues, which are actually ring buffers [#f1]_ of buffer descriptors
38+
similar to the ones used in a network device:
39+
40+
.. kernel-doc:: include/uapi/linux/virtio_ring.h
41+
:identifiers: struct vring_desc
42+
43+
All the buffers the descriptors point to are allocated by the guest and
44+
used by the host either for reading or for writing but not for both.
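
To make the "read or write, but not both" rule concrete, here is a minimal userspace sketch of a two-descriptor chain: one device-readable buffer followed by one device-writable buffer. The field layout and the two flag values follow include/uapi/linux/virtio_ring.h; everything prefixed with ``demo_`` (the struct name, both helpers, the addresses and sizes) is hypothetical and only serves the illustration. The kernel declares the fields with the endian-annotated types ``__virtio64``, ``__virtio32`` and ``__virtio16``; plain fixed-width integers are used here as a simplifying assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Flag values as defined in include/uapi/linux/virtio_ring.h. */
#define VRING_DESC_F_NEXT  1 /* the chain continues at index 'next' */
#define VRING_DESC_F_WRITE 2 /* buffer is device-writable, else device-readable */

/* Userspace mirror of struct vring_desc (hypothetical name, same layout). */
struct demo_vring_desc {
	uint64_t addr;  /* guest-physical address of the buffer */
	uint32_t len;   /* length of the buffer in bytes */
	uint16_t flags; /* VRING_DESC_F_* bits */
	uint16_t next;  /* next descriptor index, valid if F_NEXT is set */
};

/*
 * Hypothetical helper: walk a chain starting at 'head' and add up the
 * bytes the device may read and the bytes it may write.
 */
static void demo_chain_bytes(const struct demo_vring_desc *table,
			     uint16_t table_len, uint16_t head,
			     uint32_t *readable, uint32_t *writable)
{
	uint16_t i = head;
	uint16_t seen = 0;

	*readable = 0;
	*writable = 0;
	for (;;) {
		assert(i < table_len);    /* index must stay inside the table */
		assert(seen < table_len); /* a valid chain cannot loop */
		seen++;

		if (table[i].flags & VRING_DESC_F_WRITE)
			*writable += table[i].len;
		else
			*readable += table[i].len;

		if (!(table[i].flags & VRING_DESC_F_NEXT))
			break;
		i = table[i].next;
	}
}

/*
 * Hypothetical request: a 16-byte header the device reads, followed by
 * a 512-byte buffer the device fills in. Addresses are made up.
 */
static void demo_build_request(struct demo_vring_desc table[2])
{
	table[0].addr = 0x1000;
	table[0].len = 16;
	table[0].flags = VRING_DESC_F_NEXT;
	table[0].next = 1;

	table[1].addr = 0x2000;
	table[1].len = 512;
	table[1].flags = VRING_DESC_F_WRITE; /* last descriptor, no F_NEXT */
	table[1].next = 0;
}
```

Each descriptor carries exactly one direction: the absence of ``VRING_DESC_F_WRITE`` means the device only reads the buffer, its presence means the device only writes it.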
45+
46+
Refer to Chapter 2.5 ("Virtqueues") of the virtio spec (`[1]`_) for the
47+
reference definitions of virtqueues and to the "Virtqueues and virtio ring: How
48+
the data travels" blog post (`[2]`_) for an illustrated overview of how
49+
the host device and the guest driver communicate.
50+
51+
The :c:type:`vring_virtqueue` struct models a virtqueue, including the
52+
ring buffers and management data. Embedded in this struct is the
53+
:c:type:`virtqueue` struct, which is the data structure that's
54+
ultimately used by virtio drivers:
55+
56+
.. kernel-doc:: include/linux/virtio.h
57+
:identifiers: struct virtqueue
58+
59+
The callback function pointed to by this struct is triggered when the
60+
device has consumed the buffers provided by the driver. More
61+
specifically, the trigger will be an interrupt issued by the hypervisor
62+
(see vring_interrupt()). Interrupt request handlers are registered for
63+
a virtqueue during the virtqueue setup process (transport-specific).
64+
65+
.. kernel-doc:: drivers/virtio/virtio_ring.c
66+
:identifiers: vring_interrupt
67+
68+
69+
Device discovery and probing
70+
============================
71+
72+
In the kernel, the virtio core contains the virtio bus driver and
73+
transport-specific drivers like `virtio-pci` and `virtio-mmio`. Then
74+
there are individual virtio drivers for specific device types that are
75+
registered to the virtio bus driver.
76+
77+
How a virtio device is found and configured by the kernel depends on how
78+
the hypervisor defines it. Take the `QEMU virtio-console
79+
<https://gitlab.com/qemu-project/qemu/-/blob/master/hw/char/virtio-console.c>`__
80+
device as an example: when using PCI as a transport method, the device
81+
will present itself on the PCI bus with vendor 0x1af4 (Red Hat, Inc.)
82+
and device id 0x1003 (virtio console), as defined in the spec, so the
83+
kernel will detect it as it would do with any other PCI device.
84+
85+
During the PCI enumeration process, if a device is found to match the
86+
virtio-pci driver (according to the virtio-pci device table, any PCI
87+
device with vendor id = 0x1af4)::
88+
89+
/* Qumranet donated their vendor ID for devices 0x1000 thru 0x10FF. */
90+
static const struct pci_device_id virtio_pci_id_table[] = {
91+
{ PCI_DEVICE(PCI_VENDOR_ID_REDHAT_QUMRANET, PCI_ANY_ID) },
92+
{ 0 }
93+
};
94+
95+
then the virtio-pci driver is probed and, if the probing goes well, the
96+
device is registered to the virtio bus::
97+
98+
static int virtio_pci_probe(struct pci_dev *pci_dev,
99+
const struct pci_device_id *id)
100+
{
101+
...
102+
103+
if (force_legacy) {
104+
rc = virtio_pci_legacy_probe(vp_dev);
105+
/* Also try modern mode if we can't map BAR0 (no IO space). */
106+
if (rc == -ENODEV || rc == -ENOMEM)
107+
rc = virtio_pci_modern_probe(vp_dev);
108+
if (rc)
109+
goto err_probe;
110+
} else {
111+
rc = virtio_pci_modern_probe(vp_dev);
112+
if (rc == -ENODEV)
113+
rc = virtio_pci_legacy_probe(vp_dev);
114+
if (rc)
115+
goto err_probe;
116+
}
117+
118+
...
119+
120+
rc = register_virtio_device(&vp_dev->vdev);
121+
122+
When the device is registered to the virtio bus the kernel will look
123+
for a driver in the bus that can handle the device and call that
124+
driver's ``probe`` method.
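
The lookup described above can be sketched in userspace as follows. This is a simplified model of how a bus can match a device against a driver's ID table, not the kernel's code: the ``demo_`` names are hypothetical, ``DEMO_DEV_ANY_ID`` is assumed to mirror the wildcard ``VIRTIO_DEV_ANY_ID``, and the table is assumed to end with an entry whose device field is 0, as in the ``id_table`` arrays virtio drivers declare.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Wildcard value, assumed to mirror VIRTIO_DEV_ANY_ID. */
#define DEMO_DEV_ANY_ID 0xffffffffu

/* Hypothetical stand-in for struct virtio_device_id. */
struct demo_virtio_id {
	uint32_t device; /* device type, e.g. 3 for a virtio console */
	uint32_t vendor; /* vendor id, or the wildcard */
};

/* True if a single table entry accepts the device. */
static bool demo_id_match(const struct demo_virtio_id *dev,
			  const struct demo_virtio_id *entry)
{
	if (entry->device != dev->device && entry->device != DEMO_DEV_ANY_ID)
		return false;
	return entry->vendor == DEMO_DEV_ANY_ID || entry->vendor == dev->vendor;
}

/*
 * Scan a driver's table, which ends with an entry whose device field
 * is 0, and return the first matching entry or NULL.
 */
static const struct demo_virtio_id *
demo_find_match(const struct demo_virtio_id *dev,
		const struct demo_virtio_id *table)
{
	assert(dev != NULL && table != NULL);

	for (; table->device; table++)
		if (demo_id_match(dev, table))
			return table;
	return NULL;
}
```

With a table containing only ``{ 3, DEMO_DEV_ANY_ID }``, a console device from any vendor matches, while a network device (type 1) does not, so the bus would not call that driver's ``probe`` for it.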
125+
126+
At this point, the virtqueues will be allocated and configured by
127+
calling the appropriate ``virtio_find`` helper function, such as
128+
virtio_find_single_vq() or virtio_find_vqs(), which will end up calling
129+
a transport-specific ``find_vqs`` method.
130+
131+
132+
References
133+
==========
134+
135+
_`[1]` Virtio Spec v1.2:
136+
https://docs.oasis-open.org/virtio/virtio/v1.2/virtio-v1.2.html
137+
138+
.. Check for later versions of the spec as well.
139+
140+
_`[2]` Virtqueues and virtio ring: How the data travels
141+
https://www.redhat.com/en/blog/virtqueues-and-virtio-ring-how-data-travels
142+
143+
.. rubric:: Footnotes
144+
145+
.. [#f1] that's why they may be also referred to as virtrings.
Lines changed: 197 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,197 @@
1+
.. SPDX-License-Identifier: GPL-2.0
2+
3+
.. _writing_virtio_drivers:
4+
5+
======================
6+
Writing Virtio Drivers
7+
======================
8+
9+
Introduction
10+
============
11+
12+
This document serves as a basic guideline for driver programmers that
13+
need to hack a new virtio driver or understand the essentials of the
14+
existing ones. See :ref:`Virtio on Linux <virtio>` for a general
15+
overview of virtio.
16+
17+
18+
Driver boilerplate
19+
==================
20+
21+
As a bare minimum, a virtio driver needs to register in the virtio bus
22+
and configure the virtqueues for the device according to its spec; the
23+
configuration of the virtqueues on the driver side must match the
24+
virtqueue definitions in the device. A basic driver skeleton could look
25+
like this::
26+
27+
#include <linux/virtio.h>
28+
#include <linux/virtio_ids.h>
29+
#include <linux/virtio_config.h>
30+
#include <linux/module.h>
31+
32+
/* device private data (one per device) */
33+
struct virtio_dummy_dev {
34+
struct virtqueue *vq;
35+
};
36+
37+
static void virtio_dummy_recv_cb(struct virtqueue *vq)
38+
{
39+
struct virtio_dummy_dev *dev = vq->vdev->priv;
40+
char *buf;
41+
unsigned int len;
42+
43+
while ((buf = virtqueue_get_buf(dev->vq, &len)) != NULL) {
44+
/* process the received data */
45+
}
46+
}
47+
48+
static int virtio_dummy_probe(struct virtio_device *vdev)
49+
{
50+
struct virtio_dummy_dev *dev = NULL;
51+
52+
/* initialize device data */
53+
dev = kzalloc(sizeof(struct virtio_dummy_dev), GFP_KERNEL);
54+
if (!dev)
55+
return -ENOMEM;
56+
57+
/* the device has a single virtqueue */
58+
dev->vq = virtio_find_single_vq(vdev, virtio_dummy_recv_cb, "input");
59+
if (IS_ERR(dev->vq)) {
60+
int err = PTR_ERR(dev->vq); /* read the error before freeing dev */
61+
kfree(dev);
62+
return err;
63+
}
64+
vdev->priv = dev;
65+
66+
/* from this point on, the device can notify and get callbacks */
67+
virtio_device_ready(vdev);
68+
69+
return 0;
70+
}
71+
72+
static void virtio_dummy_remove(struct virtio_device *vdev)
73+
{
74+
struct virtio_dummy_dev *dev = vdev->priv;
75+
char *buf;
76+
/*
77+
* disable vq interrupts: equivalent to
78+
* vdev->config->reset(vdev)
79+
*/
80+
virtio_reset_device(vdev);
81+
82+
/* detach unused buffers */
83+
while ((buf = virtqueue_detach_unused_buf(dev->vq)) != NULL) {
84+
kfree(buf);
85+
}
86+
87+
/* remove virtqueues */
88+
vdev->config->del_vqs(vdev);
89+
90+
kfree(dev);
91+
}
92+
93+
static const struct virtio_device_id id_table[] = {
94+
{ VIRTIO_ID_DUMMY, VIRTIO_DEV_ANY_ID },
95+
{ 0 },
96+
};
97+
98+
static struct virtio_driver virtio_dummy_driver = {
99+
.driver.name = KBUILD_MODNAME,
100+
.driver.owner = THIS_MODULE,
101+
.id_table = id_table,
102+
.probe = virtio_dummy_probe,
103+
.remove = virtio_dummy_remove,
104+
};
105+
106+
module_virtio_driver(virtio_dummy_driver);
107+
MODULE_DEVICE_TABLE(virtio, id_table);
108+
MODULE_DESCRIPTION("Dummy virtio driver");
109+
MODULE_LICENSE("GPL");
110+
111+
The device id ``VIRTIO_ID_DUMMY`` here is a placeholder; virtio drivers
112+
should be added only for devices that are defined in the spec, see
113+
include/uapi/linux/virtio_ids.h. Device ids need to be at least reserved
114+
in the virtio spec before being added to that file.
115+
116+
If your driver doesn't have to do anything special in its ``init`` and
117+
``exit`` methods, you can use the module_virtio_driver() helper to
118+
reduce the amount of boilerplate code.
119+
120+
The ``probe`` method does the minimum driver setup in this case
121+
(memory allocation for the device data) and initializes the
122+
virtqueue. virtio_device_ready() is used to enable the virtqueue and to
123+
notify the device that the driver is ready to manage the device
124+
("DRIVER_OK"). If the driver does not call it, the core enables the
125+
virtqueues automatically after ``probe`` returns.
126+
127+
.. kernel-doc:: include/linux/virtio_config.h
128+
:identifiers: virtio_device_ready
129+
130+
In any case, the virtqueues need to be enabled before adding buffers to
131+
them.
132+
133+
Sending and receiving data
134+
==========================
135+
136+
The virtio_dummy_recv_cb() callback in the code above will be triggered
137+
when the device notifies the driver after it finishes processing a
138+
descriptor or descriptor chain, either for reading or writing. However,
139+
that's only the second half of the virtio device-driver communication
140+
process, as the communication is always started by the driver regardless
141+
of the direction of the data transfer.
142+
143+
To configure a buffer transfer from the driver to the device, first you
144+
have to add the buffers -- packed as `scatterlists` -- to the
145+
appropriate virtqueue using one of virtqueue_add_inbuf(),
146+
virtqueue_add_outbuf() or virtqueue_add_sgs(), depending on whether you
147+
need to add one input `scatterlist` (for the device to fill in), one
148+
output `scatterlist` (for the device to consume) or multiple
149+
`scatterlists`, respectively. Then, once the virtqueue is set up, a call
150+
to virtqueue_kick() sends a notification that will be serviced by the
151+
hypervisor that implements the device::
152+
153+
struct scatterlist sg[1];
154+
sg_init_one(sg, buffer, BUFLEN);
155+
virtqueue_add_inbuf(dev->vq, sg, 1, buffer, GFP_ATOMIC);
156+
virtqueue_kick(dev->vq);
157+
158+
.. kernel-doc:: drivers/virtio/virtio_ring.c
159+
:identifiers: virtqueue_add_inbuf
160+
161+
.. kernel-doc:: drivers/virtio/virtio_ring.c
162+
:identifiers: virtqueue_add_outbuf
163+
164+
.. kernel-doc:: drivers/virtio/virtio_ring.c
165+
:identifiers: virtqueue_add_sgs
166+
167+
Then, after the device has read or written the buffers prepared by the
168+
driver and has notified it, the driver can call virtqueue_get_buf() to
169+
read the data produced by the device (if the virtqueue was set up with
170+
input buffers) or simply to reclaim the buffers if they were already
171+
consumed by the device:
172+
173+
.. kernel-doc:: drivers/virtio/virtio_ring.c
174+
:identifiers: virtqueue_get_buf_ctx
175+
176+
The virtqueue callbacks can be disabled and re-enabled using the
177+
virtqueue_disable_cb() and the family of virtqueue_enable_cb() functions
178+
respectively. See drivers/virtio/virtio_ring.c for more details:
179+
180+
.. kernel-doc:: drivers/virtio/virtio_ring.c
181+
:identifiers: virtqueue_disable_cb
182+
183+
.. kernel-doc:: drivers/virtio/virtio_ring.c
184+
:identifiers: virtqueue_enable_cb
185+
186+
But note that some spurious callbacks can still be triggered under
187+
certain scenarios. The way to disable callbacks reliably is to reset the
188+
device or the virtqueue (virtio_reset_device()).
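
A common way to combine these calls in a callback is to disable callbacks, drain the returned buffers, and then re-enable callbacks, repeating if more buffers arrived in the meantime. The sketch below shows that loop against userspace stand-ins so it can be compiled outside the kernel. All ``demo_`` names are hypothetical mocks, not the kernel API; the only behaviour carried over from the real functions is that the enable call reports ``false`` when buffers are still pending, which is what lets the loop close the race between the last check and re-enabling callbacks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_RING 8

/* Hypothetical mock of a virtqueue: a small ring of returned buffers. */
struct demo_vq {
	void *used[DEMO_RING];
	unsigned int head; /* next buffer the driver will take */
	unsigned int tail; /* next slot the "device" will fill */
	bool cb_enabled;
};

static void demo_vq_init(struct demo_vq *vq)
{
	vq->head = 0;
	vq->tail = 0;
	vq->cb_enabled = true;
}

/* Mock device side: hand a processed buffer back to the driver. */
static void demo_device_return(struct demo_vq *vq, void *buf)
{
	assert(buf != NULL);
	assert(vq->tail - vq->head < DEMO_RING); /* ring must not overflow */
	vq->used[vq->tail++ % DEMO_RING] = buf;
}

/* Mock of virtqueue_disable_cb(). */
static void demo_disable_cb(struct demo_vq *vq)
{
	vq->cb_enabled = false;
}

/* Mock of virtqueue_enable_cb(): false means buffers are still pending. */
static bool demo_enable_cb(struct demo_vq *vq)
{
	vq->cb_enabled = true;
	return vq->head == vq->tail;
}

/* Mock of virtqueue_get_buf(): next returned buffer, or NULL if none. */
static void *demo_get_buf(struct demo_vq *vq)
{
	if (vq->head == vq->tail)
		return NULL;
	return vq->used[vq->head++ % DEMO_RING];
}

/* The drain loop a callback would run; returns the buffers handled. */
static unsigned int demo_drain(struct demo_vq *vq)
{
	unsigned int handled = 0;

	do {
		demo_disable_cb(vq);
		while (demo_get_buf(vq) != NULL)
			handled++; /* process the buffer here */
	} while (!demo_enable_cb(vq));

	return handled;
}
```

In this single-threaded mock the outer loop runs once; in a real driver the device can return more buffers between the inner loop and the enable call, in which case the loop runs again instead of waiting for a notification that may never come.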
189+
190+
191+
References
192+
==========
193+
194+
_`[1]` Virtio Spec v1.2:
195+
https://docs.oasis-open.org/virtio/virtio/v1.2/virtio-v1.2.html
196+
197+
Check for later versions of the spec as well.

MAINTAINERS

Lines changed: 5 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -22057,6 +22057,7 @@ S: Maintained
2205722057
F: Documentation/ABI/testing/sysfs-bus-vdpa
2205822058
F: Documentation/ABI/testing/sysfs-class-vduse
2205922059
F: Documentation/devicetree/bindings/virtio/
22060+
F: Documentation/driver-api/virtio/
2206022061
F: drivers/block/virtio_blk.c
2206122062
F: drivers/crypto/virtio/
2206222063
F: drivers/net/virtio_net.c
@@ -22077,6 +22078,10 @@ IFCVF VIRTIO DATA PATH ACCELERATOR
2207722078
R: Zhu Lingshan <[email protected]>
2207822079
F: drivers/vdpa/ifcvf/
2207922080

22081+
SNET DPU VIRTIO DATA PATH ACCELERATOR
22082+
R: Alvaro Karsz <[email protected]>
22083+
F: drivers/vdpa/solidrun/
22084+
2208022085
VIRTIO BALLOON
2208122086
M: "Michael S. Tsirkin" <[email protected]>
2208222087
M: David Hildenbrand <[email protected]>
