In this article, we will describe how to set up communication between VMs (virtual machines) using the virtio_vdpa module.
1.Overview
1-1.Environment
IA server : ProLiant DL360p Gen8 or DL360 Gen9
System ROM : P71 01/22/2018
NIC : Mellanox ConnectX-6 Dx (MCX623106AS-CDAT)
OS : CentOS8.3(2011)
Kernel : 5.11.11-1.el8.elrepo.x86_64
Installed Environment Groups : @^graphical-server-environment @container-management @development @virtualization-client @virtualization-hypervisor @virtualization-tools
Mellanox OFED : v5.2-2.2.0.0
qemu-kvm : v6.0.0-rc1
DPDK : v21.02
ovs : v2.14.1
1-2.Overall flow
Advance preparation
Kernel update
Building qemu
Building dpdk
Change to SR-IOV switchdev mode
Configure ovs-dpdk and VM : Different from previous article
Operation check : Different from previous article
Note
Since many items are the same as in the previous article, the items that differ are marked "Different from previous article".
If your environment is already set up in the previous article, please reboot the host OS and start reading from "Change to SR-IOV switchdev mode".
1-3.Overall structure
The following points are different from the previous article.
this article | (1) | /tmp/sock-virtio0 |
previous article | (1) | /dev/vhost-vdpa-0 |
fig.1
fig.1 is a simplified description and omits the internal architecture. The actual configuration is closer to the following.
fig.2
Quoted from Red Hat's Blog
vDPA kernel framework part 3: usage for VMs and containers
The orange dotted lines (A) and (B) correspond to fig.1 and fig.2, respectively.
Furthermore, in fig.2, the actual traffic flow is described in blue and red letters. *1
In fig.2, PF and VF of SR-IOV are written respectively, and "VF rep" is written in addition to them.
It should be noted that the bsf (Bus, Slot, Function) numbers of PF and VF rep are the same.
PF | VF0 | VF0 rep |
ens2f0 | ens2f0v0 | ens2f0_0 |
07:00.0 | 07:00.2 | 07:00.0 |
rep=representor is an interface specific to switchdev mode in SR-IOV, and is created by enabling switchdev mode.
In contrast to switchdev mode, the conventional SR-IOV VF operation is called legacy mode, and the two must be clearly distinguished.
In addition, switchdev mode is a mandatory requirement for ConnectX-6 Dx to enable the vDPA HW offload.
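Because the PF and the VF rep share the same bsf number, the PCI address alone cannot tell them apart. One way to distinguish them on a running host is the phys_port_name attribute in sysfs, which for mlx5 representors typically reads like pf0vf0. The following is a minimal sketch under that assumption; the function name and the optional directory argument are illustrative and not part of the original setup.

```shell
#!/usr/bin/env bash
# list_representors [SYSFS_NET_DIR]
# Prints "<ifname> <phys_port_name>" for every interface whose
# phys_port_name looks like a VF representor (pf<N>vf<M>).
# SYSFS_NET_DIR defaults to /sys/class/net; the argument exists only so
# the function can be tried against a test directory (an assumption).
list_representors() {
    local root="${1:-/sys/class/net}" dev name
    for dev in "$root"/*; do
        [ -r "$dev/phys_port_name" ] || continue
        # Some drivers expose the file but fail the read; skip those.
        name="$(cat "$dev/phys_port_name" 2>/dev/null)" || continue
        case "$name" in
            pf[0-9]*vf[0-9]*) echo "$(basename "$dev") $name" ;;
        esac
    done
}

# Usage (after switchdev mode has been enabled):
# list_representors
```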
2.Advance preparation
Although not specifically mentioned, SELinux disabling, FW disabling, and NTP time synchronization settings are done in advance.
2-1.Enabling HugePage and IOMMU
sed -i -e "/GRUB_CMDLINE_LINUX=/s/\"$/ default_hugepagesz=1G hugepagesz=1G hugepages=16\"/g" /etc/default/grub
sed -i -e "/GRUB_CMDLINE_LINUX=/s/\"$/ intel_iommu=on iommu=pt pci=realloc\"/g" /etc/default/grub
grub2-mkconfig -o /etc/grub2.cfg
Next, implement the mount settings for HugePage. It will be mounted automatically the next time the OS boots.
vi /etc/fstab

nodev /dev/hugepages hugetlbfs pagesize=1GB 0 0
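After the next reboot, it can be useful to confirm that the kernel parameters appended to GRUB_CMDLINE_LINUX actually reached the running kernel. The following is a small sketch of such a check; the function name is hypothetical, and it simply looks for each parameter as a whole word in the given file.

```shell
#!/usr/bin/env bash
# missing_boot_params CMDLINE_FILE PARAM...
# Prints every PARAM that does not appear as a whole word in CMDLINE_FILE
# and returns non-zero if at least one is missing.
missing_boot_params() {
    local file="$1" param rc=0
    shift
    for param in "$@"; do
        if ! tr ' ' '\n' < "$file" | grep -qxF -- "$param"; then
            echo "$param"
            rc=1
        fi
    done
    return "$rc"
}

# Usage after the reboot:
# missing_boot_params /proc/cmdline default_hugepagesz=1G hugepagesz=1G \
#     hugepages=16 intel_iommu=on iommu=pt pci=realloc
```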
2-2.SR-IOV VF settings
Configure the SR-IOV VF settings; you can increase the number of VFs, but for the sake of simplicity, we have set the number of VFs to "1". In addition, setting the MAC address is mandatory. *2
vi /etc/rc.local

echo 1 > /sys/class/net/ens2f0/device/sriov_numvfs
echo 1 > /sys/class/net/ens2f1/device/sriov_numvfs
sleep 1
ip link set ens2f0 vf 0 mac 00:11:22:33:44:00
ip link set ens2f1 vf 0 mac 00:11:22:33:44:10
sleep 1
exit 0

chmod +x /etc/rc.d/rc.local
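If you later change the number of VFs, note that the kernel does not accept a direct change from one non-zero sriov_numvfs value to another; the value has to go back to 0 first. The sketch below wraps that rule in a helper. The function name is an assumption for illustration, and the rc.local above does not need it when the VF count stays at 1.

```shell
#!/usr/bin/env bash
# set_numvfs DEVICE_DIR COUNT
# DEVICE_DIR is e.g. /sys/class/net/ens2f0/device.
# Refuses counts above sriov_totalvfs and resets to 0 before writing
# a different non-zero value.
set_numvfs() {
    local dir="$1" want="$2" total current
    total="$(cat "$dir/sriov_totalvfs")" || return 1
    current="$(cat "$dir/sriov_numvfs")" || return 1
    if [ "$want" -gt "$total" ]; then
        echo "requested $want VFs, device supports only $total" >&2
        return 1
    fi
    # Nothing to do when the requested count is already active.
    [ "$current" -eq "$want" ] && return 0
    [ "$current" -ne 0 ] && echo 0 > "$dir/sriov_numvfs"
    echo "$want" > "$dir/sriov_numvfs"
}

# Usage:
# set_numvfs /sys/class/net/ens2f0/device 1
```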
2-3.Install the Mellanox driver (OFED)
You can download the iso file from the Mellanox website (Mellanox Download Site).
Please save the downloaded iso file to /root/tmp/.
The following command will install the Mellanox driver, but it will also install ovs v2.14.1 at the same time.
dnf -y install tcl tk unbound && \
mount -t iso9660 -o loop /root/tmp/MLNX_OFED_LINUX-5.2-2.2.0.0-rhel8.3-x86_64.iso /mnt && \
/mnt/mlnxofedinstall --upstream-libs --dpdk --ovs-dpdk --with-mft --with-mstflint
After the installation is complete, reboot.
reboot
After the reboot is complete, check the HugePage.
cat /proc/meminfo | grep Huge
grep hugetlbfs /proc/mounts

[root@c83g155 ~]# cat /proc/meminfo | grep Huge
AnonHugePages: 452608 kB
ShmemHugePages: 0 kB
FileHugePages: 0 kB
HugePages_Total: 16
HugePages_Free: 16
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 1048576 kB
Hugetlb: 16777216 kB
[root@c83g155 ~]# grep hugetlbfs /proc/mounts
nodev /dev/hugepages hugetlbfs rw,relatime,pagesize=1024M 0 0
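The same HugePage check can be scripted so that it fails loudly when the values are not what this article expects (16 pages of 1 GiB). This is only a sketch; the function name is hypothetical and the expected values are the ones configured in 2-1.

```shell
#!/usr/bin/env bash
# hugepages_ok MEMINFO_FILE EXPECTED_TOTAL EXPECTED_SIZE_KB
# Succeeds only when HugePages_Total and Hugepagesize match.
hugepages_ok() {
    local file="$1" want_total="$2" want_kb="$3" total size
    total="$(awk '/^HugePages_Total:/ {print $2}' "$file")"
    size="$(awk '/^Hugepagesize:/ {print $2}' "$file")"
    [ "$total" = "$want_total" ] && [ "$size" = "$want_kb" ]
}

# Usage:
# hugepages_ok /proc/meminfo 16 1048576 || echo "unexpected HugePage setup" >&2
```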
3.Kernel update
As of April 8, 2021, the vDPA-related modules are updated at a high frequency, so install the latest Kernel.
3-1.Installing elrepo
rpm --import https://www.elrepo.org/RPM-GPG-KEY-elrepo.org dnf -y install https://www.elrepo.org/elrepo-release-8.el8.elrepo.noarch.rpm
3-2.Installation of Kernel
dnf list installed | grep kernel
dnf -y --enablerepo=elrepo-kernel install kernel-ml kernel-ml-devel
dnf list installed | grep kernel
reboot
Check the currently installed Kernel.
Install kernel-ml and kernel-ml-devel *3
Check the installed Kernel.
Reboot
3-3.Install Kernel headers, etc.
uname -r
dnf -y swap --enablerepo=elrepo-kernel kernel-headers -- kernel-ml-headers && \
dnf -y remove kernel-tools kernel-tools-libs && \
dnf -y --enablerepo=elrepo-kernel install kernel-ml-tools kernel-ml-tools-libs
dnf list installed | grep kernel
Check the currently running Kernel Version.
Swap kernel-headers for kernel-ml-headers.
Remove the existing kernel-tools and kernel-tools-libs.
Install kernel-ml-tools and kernel-ml-tools-libs.
Check the installed Kernel.
If you get the following output, you are good to go.
[root@c83g155 ~]# dnf list installed | grep kernel
kernel.x86_64 4.18.0-240.el8 @anaconda
kernel-core.x86_64 4.18.0-240.el8 @anaconda
kernel-devel.x86_64 4.18.0-240.el8 @anaconda
kernel-ml.x86_64 5.11.11-1.el8.elrepo @elrepo-kernel
kernel-ml-core.x86_64 5.11.11-1.el8.elrepo @elrepo-kernel
kernel-ml-devel.x86_64 5.11.11-1.el8.elrepo @elrepo-kernel
kernel-ml-headers.x86_64 5.11.11-1.el8.elrepo @elrepo-kernel
kernel-ml-modules.x86_64 5.11.11-1.el8.elrepo @elrepo-kernel
kernel-ml-tools.x86_64 5.11.11-1.el8.elrepo @elrepo-kernel
kernel-ml-tools-libs.x86_64 5.11.11-1.el8.elrepo @elrepo-kernel
kernel-modules.x86_64 4.18.0-240.el8 @anaconda
kmod-kernel-mft-mlnx.x86_64 4.16.1-1.rhel8u3 @System
kmod-mlnx-ofa_kernel.x86_64 5.2-OFED.5.2.2.2.0.1.rhel8u3 @System
mlnx-ofa_kernel.x86_64 5.2-OFED.5.2.2.2.0.1.rhel8u3 @System
mlnx-ofa_kernel-devel.x86_64 5.2-OFED.5.2.2.2.0.1.rhel8u3 @System
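Both the 4.18 and the 5.11 kernels remain installed, so it is worth confirming that the host actually booted the new one. The sketch below compares version strings with GNU sort -V. The function name is hypothetical, and the 5.11 threshold is simply the kernel used in this article, not a documented minimum for vDPA.

```shell
#!/usr/bin/env bash
# version_at_least HAVE WANT
# Succeeds when HAVE sorts at or after WANT in GNU "sort -V" order.
version_at_least() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]
}

# Usage:
# version_at_least "$(uname -r)" 5.11 || echo "still running the old kernel" >&2
```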
4.Building qemu
4-2.Install the necessary packages
In addition to qemu, we have also installed the packages that are required for the dpdk build.
dnf -y install cmake gcc libnl3-devel libudev-devel make numactl numactl-devel \
pkgconfig valgrind-devel pandoc libibverbs libmlx5 libmnl-devel meson ninja-build \
glibc-utils glib2 glib2-devel pixman pixman-devel zlib zlib-devel \
usbredir-devel spice-server-devel && \
wget https://cbs.centos.org/kojifiles/packages/pyelftools/0.26/1.el8/noarch/python3-pyelftools-0.26-1.el8.noarch.rpm && \
dnf -y localinstall python3-pyelftools-0.26-1.el8.noarch.rpm
4-3.Building qemu
cd /usr/src && \
git clone https://github.com/qemu/qemu.git && \
cd qemu/ && \
git checkout v6.0.0-rc1 && \
mkdir build && \
cd build/ && \
../configure --enable-vhost-vdpa --target-list=x86_64-softmmu && \
make -j && \
make install
Checking Version after Installation
/usr/local/bin/qemu-system-x86_64 --version

[root@c83g155 ~]# /usr/local/bin/qemu-system-x86_64 --version
QEMU emulator version 5.2.91 (v6.0.0-rc1)
Copyright (c) 2003-2021 Fabrice Bellard and the QEMU Project developers
5.Building dpdk
5-1.Building dpdk
cd /usr/src/ && \
git clone git://dpdk.org/dpdk && \
cd dpdk && \
git checkout v21.02 && \
meson -Dexamples=all build && \
ninja -C build && \
ninja -C build install
5-2.Links to dpdk-related libraries
Create a new file with vi and include the path of lib.
vi /etc/ld.so.conf.d/libdpdk.conf

/usr/src/dpdk/build/lib
After running ldconfig, make sure the libs are linked.
ldconfig
ldconfig -p |grep dpdk
If the libraries are listed as follows, the setup is OK.
[root@c83g155 dpdk]# ldconfig -p |grep dpdk
librte_vhost.so.21 (libc6,x86-64) => /usr/src/dpdk/build/lib/librte_vhost.so.21
librte_vhost.so (libc6,x86-64) => /usr/src/dpdk/build/lib/librte_vhost.so
librte_timer.so.21 (libc6,x86-64) => /usr/src/dpdk/build/lib/librte_timer.so.21
============ s n i p ============
Now, reboot once again.
reboot
6.Change to SR-IOV switchdev mode
6-1.Check the current operation mode.
lshw -businfo -c network
devlink dev eswitch show pci/0000:07:00.0
devlink dev eswitch show pci/0000:07:00.1
Check the bsf (bus, slot, function) number of the PCI device.
Check the status of 07:00.0 (ens2f0)
Check the status of 07:00.1 (ens2f1)
The output will look like the following
[root@c83g155 ~]# lshw -businfo -c network
Bus info          Device    Class    Description
========================================================
pci@0000:04:00.0  ens1f0    network  82599ES 10-Gigabit SFI/SFP+ Network Connection
pci@0000:04:00.1  ens1f1    network  82599ES 10-Gigabit SFI/SFP+ Network Connection
pci@0000:03:00.0  eno1      network  NetXtreme BCM5719 Gigabit Ethernet PCIe
pci@0000:03:00.1  eno2      network  NetXtreme BCM5719 Gigabit Ethernet PCIe
pci@0000:03:00.2  eno3      network  NetXtreme BCM5719 Gigabit Ethernet PCIe
pci@0000:03:00.3  eno4      network  NetXtreme BCM5719 Gigabit Ethernet PCIe
pci@0000:07:00.0  ens2f0    network  MT2892 Family [ConnectX-6 Dx]
pci@0000:07:00.1  ens2f1    network  MT2892 Family [ConnectX-6 Dx]
pci@0000:07:00.2  ens2f0v0  network  ConnectX Family mlx5Gen Virtual Function
pci@0000:07:01.2  ens2f1v0  network  ConnectX Family mlx5Gen Virtual Function
[root@c83g155 ~]# devlink dev eswitch show pci/0000:07:00.0
pci/0000:07:00.0: mode legacy inline-mode none encap disable
[root@c83g155 ~]# devlink dev eswitch show pci/0000:07:00.1
pci/0000:07:00.1: mode legacy inline-mode none encap disable
6-2.Changing the operating mode
Note that the bsf numbers are slightly different.*4
echo 0000:07:00.2 > /sys/bus/pci/drivers/mlx5_core/unbind && \
echo 0000:07:01.2 > /sys/bus/pci/drivers/mlx5_core/unbind && \
devlink dev eswitch set pci/0000:07:00.0 mode switchdev && \
devlink dev eswitch set pci/0000:07:00.1 mode switchdev && \
echo 0000:07:00.2 > /sys/bus/pci/drivers/mlx5_core/bind && \
echo 0000:07:01.2 > /sys/bus/pci/drivers/mlx5_core/bind
Unbind the mlx5_core driver for VF.
07:00.2 | ens2f0v0 |
07:01.2 | ens2f1v0 |
Change the PF operation mode to switchdev.
07:00.0 | ens2f0 |
07:00.1 | ens2f1 |
Rebind the mlx5_core driver of VF.
07:00.2 | ens2f0v0 |
07:01.2 | ens2f1v0 |
6-3.Check the operation mode after the change.
devlink dev eswitch show pci/0000:07:00.0
devlink dev eswitch show pci/0000:07:00.1
Changed to switchdev mode.
[root@c83g155 ~]# devlink dev eswitch show pci/0000:07:00.0
pci/0000:07:00.0: mode switchdev inline-mode none encap enable
[root@c83g155 ~]# devlink dev eswitch show pci/0000:07:00.1
pci/0000:07:00.1: mode switchdev inline-mode none encap enable
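Since the mode falls back to legacy after a host reboot (which is why this section is repeated each time), a script may want to read the mode instead of checking it by eye. The following sketch extracts the word after "mode" from the devlink output format shown above; the function name is hypothetical.

```shell
#!/usr/bin/env bash
# eswitch_mode
# Reads "devlink dev eswitch show" output on stdin and prints the mode
# (for example "legacy" or "switchdev").
eswitch_mode() {
    awk '{ for (i = 1; i < NF; i++) if ($i == "mode") { print $(i + 1); exit } }'
}

# Usage:
# [ "$(devlink dev eswitch show pci/0000:07:00.0 | eswitch_mode)" = "switchdev" ] \
#     || echo "07:00.0 is not in switchdev mode" >&2
```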
The VF representors (ens2f0_0 and ens2f1_0) have been added.
[root@c83g155 ~]# lshw -businfo -c network
Bus info          Device    Class    Description
========================================================
pci@0000:04:00.0  ens1f0    network  82599ES 10-Gigabit SFI/SFP+ Network Connection
pci@0000:04:00.1  ens1f1    network  82599ES 10-Gigabit SFI/SFP+ Network Connection
pci@0000:03:00.0  eno1      network  NetXtreme BCM5719 Gigabit Ethernet PCIe
pci@0000:03:00.1  eno2      network  NetXtreme BCM5719 Gigabit Ethernet PCIe
pci@0000:03:00.2  eno3      network  NetXtreme BCM5719 Gigabit Ethernet PCIe
pci@0000:03:00.3  eno4      network  NetXtreme BCM5719 Gigabit Ethernet PCIe
pci@0000:07:00.0  ens2f0    network  MT2892 Family [ConnectX-6 Dx]
pci@0000:07:00.1  ens2f1    network  MT2892 Family [ConnectX-6 Dx]
pci@0000:07:00.2  ens2f0v0  network  ConnectX Family mlx5Gen Virtual Function
pci@0000:07:01.2  ens2f1v0  network  ConnectX Family mlx5Gen Virtual Function
pci@0000:07:00.0  ens2f0_0  network  Ethernet interface
pci@0000:07:00.1  ens2f1_0  network  Ethernet interface
In addition, make sure that the HW offload function of the NIC is enabled.
ethtool -k ens2f0 |grep tc
ethtool -k ens2f1 |grep tc

[root@c83g155 ~]# ethtool -k ens2f0 |grep tc
tcp-segmentation-offload: on
tx-tcp-segmentation: on
tx-tcp-ecn-segmentation: off [fixed]
tx-tcp-mangleid-segmentation: off
tx-tcp6-segmentation: on
hw-tc-offload: on
[root@c83g155 ~]# ethtool -k ens2f1 |grep tc
tcp-segmentation-offload: on
tx-tcp-segmentation: on
tx-tcp-ecn-segmentation: off [fixed]
tx-tcp-mangleid-segmentation: off
tx-tcp6-segmentation: on
hw-tc-offload: on
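Because "grep tc" also matches the tcp-segmentation lines, the one line that matters here (hw-tc-offload) is easy to overlook. The sketch below tests only that feature; the function name is hypothetical.

```shell
#!/usr/bin/env bash
# hw_tc_offload_on
# Reads "ethtool -k <dev>" output on stdin and succeeds only when
# the hw-tc-offload feature is reported as on.
hw_tc_offload_on() {
    grep -Eq '^[[:space:]]*hw-tc-offload:[[:space:]]*on'
}

# Usage:
# for dev in ens2f0 ens2f1; do
#     ethtool -k "$dev" | hw_tc_offload_on || echo "$dev: hw-tc-offload is off" >&2
# done
```

If the feature is reported as off, `ethtool -K <dev> hw-tc-offload on` is the usual way to turn it on, provided the driver does not mark it as fixed.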
7.Configure ovs-dpdk and VM : Different from previous article
7-1.Overall Flow - Overview -
Configure the settings in the order (1)-(9) described in fig.1 below.
fig.1
- Enabling the virtio_vdpa module and configuring dpdk-vdpa : (1) : Different from previous article
- Initial configuration of ovs
- Configuration of br30-ovs: (2)(3)(4)
- Configuration of br31-ovs: (5)(6)(7)
- Configure and start virtual machine c77g153: (8) : Different from previous article
- Configure and start virtual machine c77g159: (9) : Different from previous article
7-2.Overall flow - Commands only -
Run the following commands in order.
Detailed explanations will follow, but if you don't need the explanations, just execute the commands.
1.Enabling the virtio_vdpa module and configuring dpdk-vdpa
(1)
modprobe virtio_vdpa
/usr/src/dpdk/build/examples/dpdk-vdpa \
--socket-mem 1024,1024 \
-a 0000:07:00.2,class=vdpa \
-a 0000:07:01.2,class=vdpa \
--log-level=pmd,debug -- -i
create /tmp/sock-virtio0 0000:07:00.2
create /tmp/sock-virtio1 0000:07:01.2
2.Initial configuration of ovs
systemctl start openvswitch
ovs-vsctl set Open_vSwitch . other_config:dpdk-init=true
ovs-vsctl set Open_vSwitch . other_config:hw-offload=true other_config:tc-policy=none
ovs-vsctl set Open_vSwitch . other_config:dpdk-socket-mem=1024,1024
ovs-vsctl set Open_vSwitch . other_config:vhost-iommu-support=true
ovs-vsctl set Open_vSwitch . other_config:dpdk-extra=" \
-w 0000:07:00.0,representor=[0],dv_flow_en=1,dv_esw_en=1,dv_xmeta_en=0 \
-w 0000:07:00.1,representor=[0],dv_flow_en=1,dv_esw_en=1,dv_xmeta_en=0"
systemctl restart openvswitch
3.Configuration of br30-ovs
(2)
ovs-vsctl add-br br30-ovs -- set bridge br30-ovs datapath_type=netdev
(3)
ovs-vsctl add-port br30-ovs ens2f0 -- set Interface ens2f0 type=dpdk options:dpdk-devargs=0000:07:00.0
(4)
ovs-vsctl add-port br30-ovs ens2f0_0 -- set Interface ens2f0_0 type=dpdk options:dpdk-devargs=0000:07:00.0,representor=[0]
4.Configuration of br31-ovs
(5)
ovs-vsctl add-br br31-ovs -- set bridge br31-ovs datapath_type=netdev
(6)
ovs-vsctl add-port br31-ovs ens2f1 -- set Interface ens2f1 type=dpdk options:dpdk-devargs=0000:07:00.1
(7)
ovs-vsctl add-port br31-ovs ens2f1_0 -- set Interface ens2f1_0 type=dpdk options:dpdk-devargs=0000:07:00.1,representor=[0]
5.Configure and start virtual machine c77g153
(8)
virsh edit c77g153
<currentMemory unit='KiB'>4194304</currentMemory>
<memoryBacking>
<hugepages>
<page size='1048576' unit='KiB'/>
</hugepages>
</memoryBacking>
<cpu mode='custom' match='exact' check='partial'>
<numa>
<cell id='0' cpus='0-1' memory='4194304' unit='KiB' memAccess='shared'/>
</numa>
</cpu>
virt-xml c77g153 --edit --qemu-commandline='-mem-prealloc'
virt-xml c77g153 --edit --qemu-commandline='-chardev'
virt-xml c77g153 --edit --qemu-commandline='socket,id=charnet1,path=/tmp/sock-virtio0'
virt-xml c77g153 --edit --qemu-commandline='-netdev'
virt-xml c77g153 --edit --qemu-commandline='vhost-user,chardev=charnet1,queues=16,id=hostnet1'
virt-xml c77g153 --edit --qemu-commandline='-device'
virt-xml c77g153 --edit --qemu-commandline='virtio-net-pci,mq=on,vectors=6,netdev=hostnet1,id=net1,mac=00:11:22:33:44:00,addr=0x6,page-per-vq=on,rx_queue_size=1024,tx_queue_size=1024'
6.Configure and start virtual machine c77g159
(9)
virsh edit c77g159
<currentMemory unit='KiB'>4194304</currentMemory>
<memoryBacking>
<hugepages>
<page size='1048576' unit='KiB'/>
</hugepages>
</memoryBacking>
<cpu mode='custom' match='exact' check='partial'>
<numa>
<cell id='0' cpus='0-1' memory='4194304' unit='KiB' memAccess='shared'/>
</numa>
</cpu>
virt-xml c77g159 --edit --qemu-commandline='-mem-prealloc'
virt-xml c77g159 --edit --qemu-commandline='-chardev'
virt-xml c77g159 --edit --qemu-commandline='socket,id=charnet2,path=/tmp/sock-virtio1'
virt-xml c77g159 --edit --qemu-commandline='-netdev'
virt-xml c77g159 --edit --qemu-commandline='vhost-user,chardev=charnet2,queues=16,id=hostnet2'
virt-xml c77g159 --edit --qemu-commandline='-device'
virt-xml c77g159 --edit --qemu-commandline='virtio-net-pci,mq=on,vectors=6,netdev=hostnet2,id=net1,mac=00:11:22:33:44:10,addr=0x7,page-per-vq=on,rx_queue_size=1024,tx_queue_size=1024'
7-3.Enabling the virtio_vdpa module and configuring dpdk-vdpa:(1) : Different from previous article
Enabling the virtio_vdpa module
We will check the changes before and after executing the modprobe virtio_vdpa command.
Before running modprobe virtio_vdpa
lsmod |grep vd
ls -Fal /sys/bus/vdpa/drivers/virtio_vdpa

[root@c83g155 ~]# lsmod |grep vd
mlx5_vdpa 45056 0
vhost_iotlb 16384 2 vhost,mlx5_vdpa
vdpa 16384 1 mlx5_vdpa
mlx5_core 1216512 2 mlx5_vdpa,mlx5_ib
[root@c83g155 ~]# ls -Fal /sys/bus/vdpa/drivers/virtio_vdpa
ls: cannot access '/sys/bus/vdpa/drivers/virtio_vdpa': No such file or directory
After running modprobe virtio_vdpa
modprobe virtio_vdpa
lsmod |grep vd
ls -Fal /sys/bus/vdpa/drivers/virtio_vdpa

[root@c83g155 ~]# lsmod |grep vd
virtio_vdpa 16384 0
mlx5_vdpa 45056 0
vhost_iotlb 16384 1 mlx5_vdpa
vdpa 16384 2 virtio_vdpa,mlx5_vdpa
mlx5_core 1216512 2 mlx5_vdpa,mlx5_ib
[root@c83g155 ~]# ls -Fal /sys/bus/vdpa/drivers/virtio_vdpa
total 0
drwxr-xr-x 2 root root 0 Apr 12 21:00 ./
drwxr-xr-x 3 root root 0 Apr 12 21:00 ../
--w------- 1 root root 4096 Apr 12 21:00 bind
lrwxrwxrwx 1 root root 0 Apr 12 21:00 module -> ../../../../module/virtio_vdpa/
--w------- 1 root root 4096 Apr 12 21:00 uevent
--w------- 1 root root 4096 Apr 12 21:00 unbind
lrwxrwxrwx 1 root root 0 Apr 12 21:00 vdpa0 -> ../../../../devices/pci0000:00/0000:00:03.0/0000:07:00.2/vdpa0/
lrwxrwxrwx 1 root root 0 Apr 12 21:00 vdpa1 -> ../../../../devices/pci0000:00/0000:00:03.0/0000:07:01.2/vdpa1/
From the above output results, we can confirm the following.
- 0000:07:00.2/vdpa0 and 0000:07:01.2/vdpa1 are controlled by the virtio_vdpa driver
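The binding can also be confirmed from a script by listing the vdpaN symlinks in the driver directory shown above. This is a minimal sketch; the function name and the optional directory argument are assumptions added for illustration.

```shell
#!/usr/bin/env bash
# vdpa_bound_devices [DRIVER_DIR]
# Prints the vDPA devices (vdpa0, vdpa1, ...) that appear as symlinks
# in the driver directory; fails if the directory does not exist,
# which is the state before "modprobe virtio_vdpa".
vdpa_bound_devices() {
    local dir="${1:-/sys/bus/vdpa/drivers/virtio_vdpa}" entry
    if [ ! -d "$dir" ]; then
        echo "driver directory $dir not found" >&2
        return 1
    fi
    for entry in "$dir"/vdpa*; do
        [ -L "$entry" ] && basename "$entry"
    done
    return 0
}

# Usage:
# vdpa_bound_devices
```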
Configuring dpdk-vdpa
Next, run the dpdk-vdpa command.
/usr/src/dpdk/build/examples/dpdk-vdpa \
--socket-mem 1024,1024 \
-a 0000:07:00.2,class=vdpa \
-a 0000:07:01.2,class=vdpa \
--log-level=pmd,debug -- -i
When the prompt changes to "vdpa>", execute the following command.
create /tmp/sock-virtio0 0000:07:00.2
create /tmp/sock-virtio1 0000:07:01.2
Connect to the host OS via ssh in another terminal and confirm that the sock file has been generated using the following command.
[root@c83g155 ~]# ls -Fal /tmp
total 36
drwxrwxrwt. 17 root root 4096 Apr 12 21:08 ./
dr-xr-xr-x. 17 root root 244 Apr 7 20:30 ../
-rw-r--r-- 1 root root 1874 Apr 7 20:30 anaconda.log
===================== s n i p =====================
srwxr-xr-x 1 root root 0 Apr 12 21:08 sock-virtio0=
srwxr-xr-x 1 root root 0 Apr 12 21:08 sock-virtio1=
drwx------ 3 root root 17 Apr 12 19:56 systemd-private-f5b122148a7c4019be8cf0116bd9f2cc-chronyd.service-IEe7hb/
===================== s n i p =====================
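The VMs configured later connect to these sock files, so they must exist before the VMs are started. The sketch below polls for a UNIX socket with a timeout instead of checking the directory listing by eye; the function name and the default timeout are assumptions.

```shell
#!/usr/bin/env bash
# wait_for_socket PATH [TIMEOUT_SECONDS]
# Polls once per second until PATH exists and is a UNIX socket.
# Returns non-zero if the timeout (default 10 s) expires first.
wait_for_socket() {
    local path="$1" timeout="${2:-10}" waited=0
    until [ -S "$path" ]; do
        [ "$waited" -ge "$timeout" ] && return 1
        sleep 1
        waited=$((waited + 1))
    done
}

# Usage:
# wait_for_socket /tmp/sock-virtio0 && wait_for_socket /tmp/sock-virtio1 \
#     || echo "dpdk-vdpa sockets are missing" >&2
```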
Note
The following is an example of output from the dpdk-vdpa command.
[root@c83g155 ~]# /usr/src/dpdk/build/examples/dpdk-vdpa \
> --socket-mem 1024,1024 \
> -a 0000:07:00.2,class=vdpa \
> -a 0000:07:01.2,class=vdpa \
> --log-level=pmd,debug -- -i
EAL: Detected 16 lcore(s)
EAL: Detected 2 NUMA nodes
EAL: Detected static linkage of DPDK
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
EAL: Selected IOVA mode 'VA'
EAL: No available 2048 kB hugepages reported
EAL: Probing VFIO support...
EAL: Probe PCI driver: mlx5_pci (15b3:101e) device: 0000:07:00.2 (socket 0)
mlx5_vdpa: Checking device "mlx5_3"..
mlx5_vdpa: Checking device "mlx5_2"..
mlx5_vdpa: PCI information matches for device "mlx5_2".
common_mlx5: Netlink "devlink" family ID is 20.
common_mlx5: ROCE is enabled for device "0000:07:00.2".
common_mlx5: Device 0000:07:00.2 ROCE was disabled by Netlink successfully.
common_mlx5: Device "0000:07:00.2" was reloaded by Netlink successfully.
mlx5_vdpa: ROCE is disabled by Netlink successfully.
mlx5_vdpa: Checking device "mlx5_3"..
mlx5_vdpa: Checking device "mlx5_1"..
mlx5_vdpa: Checking device "mlx5_0"..
mlx5_vdpa: Checking device "mlx5_2"..
mlx5_vdpa: event mode is 1.
mlx5_vdpa: event_us is 0 us.
mlx5_vdpa: no traffic time is 2 s.
EAL: Probe PCI driver: mlx5_pci (15b3:101e) device: 0000:07:01.2 (socket 0)
mlx5_vdpa: Checking device "mlx5_3"..
mlx5_vdpa: PCI information matches for device "mlx5_3".
common_mlx5: Netlink "devlink" family ID is 20.
common_mlx5: ROCE is enabled for device "0000:07:01.2".
common_mlx5: Device 0000:07:01.2 ROCE was disabled by Netlink successfully.
common_mlx5: Device "0000:07:01.2" was reloaded by Netlink successfully.
mlx5_vdpa: ROCE is disabled by Netlink successfully.
mlx5_vdpa: Checking device "mlx5_1"..
mlx5_vdpa: Checking device "mlx5_0"..
mlx5_vdpa: Checking device "mlx5_2"..
mlx5_vdpa: Checking device "mlx5_3"..
mlx5_vdpa: event mode is 1.
mlx5_vdpa: event_us is 0 us.
mlx5_vdpa: no traffic time is 2 s.
EAL: No legacy callbacks, legacy socket not created
Interactive-mode selected
vdpa>  <<<< After executing the command, the prompt changes to "vdpa>".
vdpa> create /tmp/sock-virtio0 0000:07:00.2
VHOST_CONFIG: vhost-user server: socket created, fd: 83
VHOST_CONFIG: bind to /tmp/sock-virtio0
vdpa> create /tmp/sock-virtio1 0000:07:01.2
VHOST_CONFIG: vhost-user server: socket created, fd: 86
VHOST_CONFIG: bind to /tmp/sock-virtio1
vdpa>
Please keep this terminal as it is, as we will use it in the operation check later.
7-4.Initial configuration of ovs
Since ovs has already been installed, start the service from systemctl.*5
systemctl start openvswitch
ovs-vsctl set Open_vSwitch . other_config:dpdk-init=true
ovs-vsctl set Open_vSwitch . other_config:hw-offload=true other_config:tc-policy=none
ovs-vsctl set Open_vSwitch . other_config:dpdk-socket-mem=1024,1024
ovs-vsctl set Open_vSwitch . other_config:vhost-iommu-support=true
ovs-vsctl set Open_vSwitch . other_config:dpdk-extra=" \
-w 0000:07:00.0,representor=[0],dv_flow_en=1,dv_esw_en=1,dv_xmeta_en=0 \
-w 0000:07:00.1,representor=[0],dv_flow_en=1,dv_esw_en=1,dv_xmeta_en=0"
systemctl restart openvswitch
Start the ovs service
Initialize dpdk
HW offload and tc-policy configuration
Memory allocation
IOMMU configuration for vhost
Configure the representors
Restart the ovs service (to reflect the above settings)
Use the following command to check the settings.
ovs-vsctl get Open_vSwitch . other_config

[root@c83g155 ~]# ovs-vsctl get Open_vSwitch . other_config
{dpdk-extra=" -w 0000:07:00.0,representor=[0],dv_flow_en=1,dv_esw_en=1,dv_xmeta_en=0 -w 0000:07:00.1,representor=[0],dv_flow_en=1,dv_esw_en=1,dv_xmeta_en=0", dpdk-init="true", dpdk-socket-mem="1024,1024", hw-offload="true", tc-policy=none, vhost-iommu-support="true"}
Note 1:
Here is a supplementary explanation of other_config:dpdk-extra.
There is the following correspondence between the output results of "lshw -businfo -c network" and the commands configured in "other_config:dpdk-extra".
0000:07:00.0 ens2f0_0 | -w 0000:07:00.0,representor=[0] |
0000:07:00.1 ens2f1_0 | -w 0000:07:00.1,representor=[0] |
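Since each PF contributes one "-w <bsf>,representor=[0],..." entry with identical device arguments, the dpdk-extra value can be assembled from the list of PF addresses. The following sketch does that; the function name is hypothetical, and the device arguments are copied from the commands in this section.

```shell
#!/usr/bin/env bash
# dpdk_extra_for PCI_ADDR...
# Prints a dpdk-extra value with one "-w" entry per PF address, using
# the same representor and dv_* arguments as this article.
dpdk_extra_for() {
    local devargs="representor=[0],dv_flow_en=1,dv_esw_en=1,dv_xmeta_en=0"
    local out="" pci
    for pci in "$@"; do
        out="$out -w $pci,$devargs"
    done
    # Drop the leading space added by the first iteration.
    echo "${out# }"
}

# Usage:
# ovs-vsctl set Open_vSwitch . \
#     other_config:dpdk-extra="$(dpdk_extra_for 0000:07:00.0 0000:07:00.1)"
```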
Note 2:
Here is a supplementary explanation of other_config:tc-policy.
The following options can be set for tc-policy.
none | adds a TC rule to both the software and the hardware (default) |
skip_sw | adds a TC rule only to the hardware |
skip_hw | adds a TC rule only to the software |
Note 3:
If you want to remove a setting, execute the command as follows.
In this example "dpdk-extra" is the key; replace it with whichever key you want to delete, such as "dpdk-init" or "hw-offload".
ovs-vsctl remove Open_vSwitch . other_config dpdk-extra
7-5.Configuration of br30-ovs : (2)(3)(4)
Create the first bridge.
(2)
ovs-vsctl add-br br30-ovs -- set bridge br30-ovs datapath_type=netdev
(3)
ovs-vsctl add-port br30-ovs ens2f0 -- set Interface ens2f0 type=dpdk options:dpdk-devargs=0000:07:00.0
(4)
ovs-vsctl add-port br30-ovs ens2f0_0 -- set Interface ens2f0_0 type=dpdk options:dpdk-devargs=0000:07:00.0,representor=[0]
(2) Create a bridge
(3) Create the uplink (specify PF and set the interface for the external NW)
(4) Create the downlink (specify the VF representor and set up the interface for the VM)
Check the settings with the following command.
[root@c83g155 ~]# ovs-vsctl show
59a34ea2-ca80-48b9-8b14-a656c79bc451
Bridge br30-ovs
datapath_type: netdev
Port br30-ovs
Interface br30-ovs
type: internal
Port ens2f0_0
Interface ens2f0_0
type: dpdk
options: {dpdk-devargs="0000:07:00.0,representor=[0]"}
Port ens2f0
Interface ens2f0
type: dpdk
options: {dpdk-devargs="0000:07:00.0"}
ovs_version: "2.14.1"
7-6.Configuration of br31-ovs : (5)(6)(7)
Create the second bridge.
(5)
ovs-vsctl add-br br31-ovs -- set bridge br31-ovs datapath_type=netdev
(6)
ovs-vsctl add-port br31-ovs ens2f1 -- set Interface ens2f1 type=dpdk options:dpdk-devargs=0000:07:00.1
(7)
ovs-vsctl add-port br31-ovs ens2f1_0 -- set Interface ens2f1_0 type=dpdk options:dpdk-devargs=0000:07:00.1,representor=[0]
Same as (2), (3), and (4).
Check the settings with the following command. The br31-ovs bridge is the part that has been added.
[root@c83g155 ~]# ovs-vsctl show
59a34ea2-ca80-48b9-8b14-a656c79bc451
Bridge br31-ovs
datapath_type: netdev
Port ens2f1_0
Interface ens2f1_0
type: dpdk
options: {dpdk-devargs="0000:07:00.1,representor=[0]"}
Port ens2f1
Interface ens2f1
type: dpdk
options: {dpdk-devargs="0000:07:00.1"}
Port br31-ovs
Interface br31-ovs
type: internal
Bridge br30-ovs
datapath_type: netdev
Port br30-ovs
Interface br30-ovs
type: internal
Port ens2f0_0
Interface ens2f0_0
type: dpdk
options: {dpdk-devargs="0000:07:00.0,representor=[0]"}
Port ens2f0
Interface ens2f0
type: dpdk
options: {dpdk-devargs="0000:07:00.0"}
ovs_version: "2.14.1"
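The two bridges differ only in the bridge name, the PF name, and the PF bsf number, so the three commands per bridge can be generated from those values. The sketch below only prints the commands so that they can be reviewed before being run. The function name is hypothetical, and it assumes the representor follows the "<PF name>_0" naming seen in this environment.

```shell
#!/usr/bin/env bash
# bridge_cmds BRIDGE PF_NAME PF_PCI
# Prints the add-br, uplink (PF) and downlink (VF representor) commands
# for one bridge, mirroring steps (2)-(4) and (5)-(7).
bridge_cmds() {
    local br="$1" pf="$2" pci="$3"
    echo "ovs-vsctl add-br $br -- set bridge $br datapath_type=netdev"
    echo "ovs-vsctl add-port $br $pf -- set Interface $pf type=dpdk options:dpdk-devargs=$pci"
    echo "ovs-vsctl add-port $br ${pf}_0 -- set Interface ${pf}_0 type=dpdk options:dpdk-devargs=$pci,representor=[0]"
}

# Usage (prints six commands for review):
# bridge_cmds br30-ovs ens2f0 0000:07:00.0
# bridge_cmds br31-ovs ens2f1 0000:07:00.1
```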
7-7.Configure and start virtual machine c77g153 : (8) : Different from previous article
Please upload the qcow2 file to "/var/lib/libvirt/images/".
In this article, the qcow2 file with CentOS7.7 installed was prepared beforehand.
Additionally, once you have created a virtual machine with virt-manager, you will edit it with the "virsh edit" and "virt-xml" commands.*6
Log in to the host OS via VNC or other means, and start virt-manager.
When creating a new virtual machine, delete the following [1]-[5] devices.*7
After booting the VM, shut it down once.
After shutdown, the device configuration should look like the following.
The NICs listed here are not used for vDPA, but they let you reach the VM over ssh, so assign a management IP to them if needed.
After shutdown, use the virsh edit command to perform the following settings.
(8)
virsh edit c77g153

<currentMemory unit='KiB'>4194304</currentMemory>
<memoryBacking>
<hugepages>
<page size='1048576' unit='KiB'/>
</hugepages>
</memoryBacking>
<cpu mode='custom' match='exact' check='partial'>
<numa>
<cell id='0' cpus='0-1' memory='4194304' unit='KiB' memAccess='shared'/>
</numa>
</cpu>
After returning to the shell, configure the following additional settings using the virt-xml command.
(8)
virt-xml c77g153 --edit --qemu-commandline='-mem-prealloc'
virt-xml c77g153 --edit --qemu-commandline='-chardev'
virt-xml c77g153 --edit --qemu-commandline='socket,id=charnet1,path=/tmp/sock-virtio0'
virt-xml c77g153 --edit --qemu-commandline='-netdev'
virt-xml c77g153 --edit --qemu-commandline='vhost-user,chardev=charnet1,queues=16,id=hostnet1'
virt-xml c77g153 --edit --qemu-commandline='-device'
virt-xml c77g153 --edit --qemu-commandline='virtio-net-pci,mq=on,vectors=6,netdev=hostnet1,id=net1,mac=00:11:22:33:44:00,addr=0x6,page-per-vq=on,rx_queue_size=1024,tx_queue_size=1024'
-mem-prealloc | We haven't been able to confirm the details, but it seems to be a mandatory setting since it is used for exchanging virtqueue with PlatformIOMMU from fig.2. |
path=/tmp/sock-virtio0 | Explicitly specify the sock file for dpdk-vdpa. |
mq=on | This is the setting for using multi-queue. |
page-per-vq=on | This setting is required to use virtqueue. |
Note
When you run the virt-xml command, you will see the following WARNING message; please ignore it.
WARNING XML did not change after domain define. You may have changed a value that libvirt is setting by default.
7-8.Configure and start virtual machine c77g159 : (9) : Different from previous article
Same as 7-7, except for the VM name, the sock file (/tmp/sock-virtio1), the chardev and netdev IDs, the MAC address, and the PCI address.
(9)
virsh edit c77g159

<currentMemory unit='KiB'>4194304</currentMemory>
<memoryBacking>
<hugepages>
<page size='1048576' unit='KiB'/>
</hugepages>
</memoryBacking>
<cpu mode='custom' match='exact' check='partial'>
<numa>
<cell id='0' cpus='0-1' memory='4194304' unit='KiB' memAccess='shared'/>
</numa>
</cpu>
After returning to the shell, configure the following additional settings using the virt-xml command.
(9)
virt-xml c77g159 --edit --qemu-commandline='-mem-prealloc'
virt-xml c77g159 --edit --qemu-commandline='-chardev'
virt-xml c77g159 --edit --qemu-commandline='socket,id=charnet2,path=/tmp/sock-virtio1'
virt-xml c77g159 --edit --qemu-commandline='-netdev'
virt-xml c77g159 --edit --qemu-commandline='vhost-user,chardev=charnet2,queues=16,id=hostnet2'
virt-xml c77g159 --edit --qemu-commandline='-device'
virt-xml c77g159 --edit --qemu-commandline='virtio-net-pci,mq=on,vectors=6,netdev=hostnet2,id=net1,mac=00:11:22:33:44:10,addr=0x7,page-per-vq=on,rx_queue_size=1024,tx_queue_size=1024'
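The virt-xml sequences for the two VMs differ only in the VM name, the sock file, the MAC address, the PCI address, and the numeric suffix of the chardev and netdev IDs. As a sketch, they can be generated from those five values. The function name and argument order are assumptions, and the output is printed for review rather than executed.

```shell
#!/usr/bin/env bash
# vm_vdpa_cmds VM SOCKET MAC PCI_ADDR ID_SUFFIX
# Prints the seven virt-xml commands used in 7-7 and 7-8 for one VM.
vm_vdpa_cmds() {
    local vm="$1" sock="$2" mac="$3" addr="$4" n="$5" arg
    for arg in \
        "-mem-prealloc" \
        "-chardev" \
        "socket,id=charnet$n,path=$sock" \
        "-netdev" \
        "vhost-user,chardev=charnet$n,queues=16,id=hostnet$n" \
        "-device" \
        "virtio-net-pci,mq=on,vectors=6,netdev=hostnet$n,id=net1,mac=$mac,addr=$addr,page-per-vq=on,rx_queue_size=1024,tx_queue_size=1024"
    do
        echo "virt-xml $vm --edit --qemu-commandline='$arg'"
    done
}

# Usage (prints the commands for both VMs):
# vm_vdpa_cmds c77g153 /tmp/sock-virtio0 00:11:22:33:44:00 0x6 1
# vm_vdpa_cmds c77g159 /tmp/sock-virtio1 00:11:22:33:44:10 0x7 2
```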
8.Operation check : Different from previous article
8-1.Advance preparation
Prepare five consoles on hostOS c83g155.
ConsoleA | Already activated at 7-3 | To refer to the dpdk-vdpa log |
ConsoleB | watch ovs-ofctl -O OpenFlow14 dump-ports br30-ovs | To check the packet count on c77g153 |
ConsoleC | watch ovs-ofctl -O OpenFlow14 dump-ports br31-ovs | To check the packet count on c77g159 |
ConsoleD | virsh start c77g153; virsh console c77g153 | For the console of virtual machine c77g153 |
ConsoleE | virsh start c77g159; virsh console c77g159 | For the console of virtual machine c77g159 |
8-2.Booting the VM
ConsoleA has been running in debug mode since the dpdk-vdpa command was executed in 7-3.
For ConsoleB and C, please run the above commands before starting the VM.
Then, on ConsoleD, start c77g153 with the above command.
After waiting for a few seconds, start c77g159 on ConsoleE with the above command.
Send a ping from c77g153 or c77g159.
As an example, follow fig.1 and execute ping 192.168.30.159 -f from c77g153.
fig.1
The following is the output result. The points of interest are in red.
ConsoleA
The ConsoleA log is an excerpt.
The full output has been saved to this link.
vdpa> VHOST_CONFIG: new vhost user connection is 87
VHOST_CONFIG: new device, handle is 0
VHOST_CONFIG: read message VHOST_USER_GET_FEATURES
VHOST_CONFIG: read message VHOST_USER_GET_PROTOCOL_FEATURES
VHOST_CONFIG: read message VHOST_USER_SET_PROTOCOL_FEATURES
===================== s n i p =====================
VHOST_CONFIG: read message VHOST_USER_SET_FEATURES
VHOST_CONFIG: negotiated Virtio features: 0x140601803
VHOST_CONFIG: read message VHOST_USER_SET_MEM_TABLE
VHOST_CONFIG: guest memory region size: 0x80000000
         guest physical addr: 0x0
         guest virtual addr: 0x7faa40000000
         host virtual addr: 0x7f8080000000
         mmap addr : 0x7f8080000000
         mmap size : 0x80000000
         mmap align: 0x40000000
         mmap off : 0x0
VHOST_CONFIG: guest memory region size: 0x80000000
         guest physical addr: 0x100000000
         guest virtual addr: 0x7faac0000000
         host virtual addr: 0x7f8000000000
         mmap addr : 0x7f7f80000000
         mmap size : 0x100000000
         mmap align: 0x40000000
         mmap off : 0x80000000
VHOST_CONFIG: read message VHOST_USER_SET_VRING_NUM
VHOST_CONFIG: read message VHOST_USER_SET_VRING_BASE
===================== s n i p =====================
new port /tmp/sock-virtio0, device : 0000:07:00.2
mlx5_vdpa: Cannot get vhost MTU - -95.
mlx5_vdpa: MTU cannot be set on device 0000:07:00.2.
mlx5_vdpa: Region 0: HVA 0x7f8080000000, GPA 0x0, size 0x80000000.
mlx5_vdpa: Region 1: HVA 0x7f8000000000, GPA 0x100000000, size 0x80000000.
mlx5_vdpa: Indirect mkey mode is KLM Fixed Buffer Size.
mlx5_vdpa: Memory registration information: nregions = 2, mem_size = 0x180000000, GCD = 0x80000000, klm_fbs_entries_num = 0x3, klm_entries_num = 0x3.
mlx5_vdpa: Dump fill Mkey = 1792.
mlx5_vdpa: Registered error interrupt for device0.
mlx5_vdpa: VAR address of doorbell mapping is 0x7f8157669000.
mlx5_vdpa: vid 0: Init last_avail_idx=0, last_used_idx=0 for virtq 0.
mlx5_vdpa: Register fd 123 interrupt for virtq 0.
mlx5_vdpa: vid 0 virtq 0 was created successfully.
mlx5_vdpa: Virtq 0 notifier state is enabled.
mlx5_vdpa: Ring virtq 0 doorbell.

mlx5_vdpa: vid 0: Init last_avail_idx=0, last_used_idx=0 for virtq 1.
mlx5_vdpa: Register fd 89 interrupt for virtq 1.
mlx5_vdpa: vid 0 virtq 1 was created successfully.
mlx5_vdpa: Virtq 1 notifier state is enabled.
mlx5_vdpa: Ring virtq 1 doorbell.
mlx5_vdpa: vDPA device 0 was configured.
VHOST_CONFIG: read message VHOST_USER_SET_VRING_CALL
VHOST_CONFIG: vring call idx:1 file:127
mlx5_vdpa: Update virtq 1 status enable -> disable.
mlx5_vdpa: vid 0 virtq 1 was stopped.
mlx5_vdpa: Query vid 0 vring 1: hw_available_idx=0, hw_used_index=0
mlx5_vdpa: Update virtq 1 status disable -> enable.
mlx5_vdpa: vid 0: Init last_avail_idx=0, last_used_idx=0 for virtq 1.
mlx5_vdpa: Register fd 89 interrupt for virtq 1.
mlx5_vdpa: vid 0 virtq 1 was created successfully.
VHOST_CONFIG: read message VHOST_USER_SET_VRING_ENABLE
VHOST_CONFIG: set queue enable: 1 to qp idx: 0
===================== s n i p =====================
mlx5_vdpa: Update virtq 2 status disable -> enable.
mlx5_vdpa: vid 0: Init last_avail_idx=0, last_used_idx=0 for virtq 2.
mlx5_vdpa: Register fd 90 interrupt for virtq 2.
mlx5_vdpa: vid 0 virtq 2 was created successfully.
mlx5_vdpa: Virtq 2 notifier state is enabled.
mlx5_vdpa: Ring virtq 2 doorbell.
VHOST_CONFIG: read message VHOST_USER_SET_VRING_ENABLE
VHOST_CONFIG: set queue enable: 1 to qp idx: 3
mlx5_vdpa: Update virtq 3 status disable -> enable.
mlx5_vdpa: vid 0: Init last_avail_idx=0, last_used_idx=0 for virtq 3.
mlx5_vdpa: Register fd 91 interrupt for virtq 3.
mlx5_vdpa: vid 0 virtq 3 was created successfully.
===================== s n i p =====================
VHOST_CONFIG: read message VHOST_USER_SET_VRING_ENABLE
VHOST_CONFIG: set queue enable: 0 to qp idx: 31
mlx5_vdpa: Virtq 3 notifier state is enabled.
mlx5_vdpa: Ring virtq 3 doorbell.
mlx5_vdpa: Device 0000:07:00.2 virtq 3 cq 2277 event was captured. Timer is off, cq ci is 1.
mlx5_vdpa: Device 0000:07:00.2 virtq 1 cq 2270 event was captured. Timer is on, cq ci is 1.

mlx5_vdpa: Device 0000:07:00.2 traffic was stopped.
mlx5_vdpa: Device 0000:07:00.2 virtq 3 cq 2277 event was captured. Timer is off, cq ci is 18.
mlx5_vdpa: Device 0000:07:00.2 traffic was stopped.
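If the log above is saved to a file, the key messages can be counted instead of read by eye. The following is a minimal sketch, not part of DPDK: the function name summarize_vdpa_log and the log file path are assumptions made for illustration, and the patterns are taken only from the messages shown in this log.

```shell
#!/usr/bin/env bash
# Sketch only: summarize_vdpa_log is a hypothetical helper, not part of DPDK.

summarize_vdpa_log() {
    local log="$1"
    local created configured
    # grep -o prints each match on its own line, so the count does not
    # depend on how the saved log is wrapped. sort -u collapses virtqs
    # that were stopped and then created again.
    created=$(grep -oE 'virtq [0-9]+ was created successfully' "$log" | sort -u | wc -l)
    configured=$(grep -oE 'vDPA device [0-9]+ was configured' "$log" | wc -l)
    # $(( )) strips any padding that some wc implementations add.
    echo "virtqs_created=$((created)) device_configured=$((configured))"
}
```

For example, `summarize_vdpa_log /tmp/vdpa.log` (the path is an assumption) prints the number of distinct virtqs that were created and how many times a vDPA device was reported as configured.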
ConsoleB
[root@c83g155 ~]# ovs-ofctl -O OpenFlow14 dump-ports br30-ovs
OFPST_PORT reply (OF1.4) (xid=0x2): 3 ports
  port ens2f0: rx pkts=159317, bytes=15614385, drop=0, errs=0, frame=?, over=?, crc=?
           tx pkts=159318, bytes=15614457, drop=0, errs=0, coll=?
           duration=173.964s
           rx rfc2819 broadcast_packets=2,
           tx rfc2819 multicast_packets=53, broadcast_packets=1,
           CUSTOM Statistics
                      ovs_tx_failure_drops=0, ovs_tx_mtu_exceeded_drops=0, ovs_tx_qos_drops=0,
                      ovs_rx_qos_drops=0, ovs_tx_invalid_hwol_drops=0, rx_missed_errors=0,
                      rx_errors=0, tx_errors=0, rx_mbuf_allocation_errors=0,
                      rx_q0_errors=0, rx_wqe_errors=0, rx_phy_crc_errors=0,
                      rx_phy_in_range_len_errors=0, rx_phy_symbol_errors=0, tx_phy_errors=0,
                      tx_pp_missed_interrupt_errors=0, tx_pp_rearm_queue_errors=0, tx_pp_clock_queue_errors=0,
                      tx_pp_timestamp_past_errors=0, tx_pp_timestamp_future_errors=0,
  port LOCAL: rx pkts=0, bytes=0, drop=0, errs=0, frame=0, over=0, crc=0
           tx pkts=0, bytes=0, drop=54, errs=0, coll=0
           duration=173.957s
  port "ens2f0_0": rx pkts=159318, bytes=15614457, drop=0, errs=0, frame=?, over=?, crc=?
           tx pkts=159317, bytes=15614385, drop=0, errs=0, coll=?
           duration=173.729s
           CUSTOM Statistics
                      ovs_tx_failure_drops=0, ovs_tx_mtu_exceeded_drops=0, ovs_tx_qos_drops=0,
                      ovs_rx_qos_drops=0, ovs_tx_invalid_hwol_drops=0, rx_missed_errors=0,
                      rx_errors=0, tx_errors=0, rx_mbuf_allocation_errors=0,
                      rx_q0_errors=0, tx_pp_missed_interrupt_errors=0, tx_pp_rearm_queue_errors=0,
                      tx_pp_clock_queue_errors=0, tx_pp_timestamp_past_errors=0, tx_pp_timestamp_future_errors=0,
ConsoleC
[root@c83g155 ~]# ovs-ofctl -O OpenFlow14 dump-ports br31-ovs
OFPST_PORT reply (OF1.4) (xid=0x2): 3 ports
  port ens2f1: rx pkts=159318, bytes=15614493, drop=0, errs=0, frame=?, over=?, crc=?
           tx pkts=159317, bytes=15614349, drop=0, errs=0, coll=?
           duration=180.549s
           rx rfc2819 broadcast_packets=2,
           tx rfc2819 multicast_packets=53, broadcast_packets=1,
           CUSTOM Statistics
                      ovs_tx_failure_drops=0, ovs_tx_mtu_exceeded_drops=0, ovs_tx_qos_drops=0,
                      ovs_rx_qos_drops=0, ovs_tx_invalid_hwol_drops=0, rx_missed_errors=0,
                      rx_errors=0, tx_errors=0, rx_mbuf_allocation_errors=0,
                      rx_q0_errors=0, rx_wqe_errors=0, rx_phy_crc_errors=0,
                      rx_phy_in_range_len_errors=0, rx_phy_symbol_errors=0, tx_phy_errors=0,
                      tx_pp_missed_interrupt_errors=0, tx_pp_rearm_queue_errors=0, tx_pp_clock_queue_errors=0,
                      tx_pp_timestamp_past_errors=0, tx_pp_timestamp_future_errors=0,
  port LOCAL: rx pkts=0, bytes=0, drop=0, errs=0, frame=0, over=0, crc=0
           tx pkts=0, bytes=0, drop=54, errs=0, coll=0
           duration=181.910s
  port "ens2f1_0": rx pkts=159317, bytes=15614349, drop=0, errs=0, frame=?, over=?, crc=?
           tx pkts=159318, bytes=15614493, drop=0, errs=0, coll=?
           duration=180.861s
           CUSTOM Statistics
                      ovs_tx_failure_drops=0, ovs_tx_mtu_exceeded_drops=0, ovs_tx_qos_drops=0,
                      ovs_rx_qos_drops=0, ovs_tx_invalid_hwol_drops=0, rx_missed_errors=0,
                      rx_errors=0, tx_errors=0, rx_mbuf_allocation_errors=0,
                      rx_q0_errors=0, tx_pp_missed_interrupt_errors=0, tx_pp_rearm_queue_errors=0,
                      tx_pp_clock_queue_errors=0, tx_pp_timestamp_past_errors=0, tx_pp_timestamp_future_errors=0,
ConsoleD
[root@c77g153 ~]# ping 192.168.30.159 -f
PING 192.168.30.159 (192.168.30.159) 56(84) bytes of data.
.
--- 192.168.30.159 ping statistics ---
159288 packets transmitted, 159288 received, 0% packet loss, time 24357ms
rtt min/avg/max/mdev = 0.069/0.086/60.812/0.202 ms, pipe 5, ipg/ewma 0.152/0.101 ms
Note
mlx5_vdpa: Cannot get vhost MTU - -95. | This message about the MTU is output, but it does not affect operation. |
mlx5_vdpa: vid 0 virtq 0 was created successfully. | Indicates that the creation of virtq was successful. |
mlx5_vdpa: Device 0000:07:00.2 traffic was stopped. | This message appears some time after the virtual machine starts. It does not mean that sending and receiving traffic has stopped, so there is no problem. |
ens2f0 "ens2f0_0" | You can see that the tx/rx packet count and byte count for each port are increasing. |
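The counter check in the note above can also be scripted. The following is a minimal sketch that reads `ovs-ofctl dump-ports` output from standard input and prints the rx or tx packet count of one port. The function name port_pkts is hypothetical, and the parsing assumes the output layout shown in ConsoleB and ConsoleC.

```shell
#!/usr/bin/env bash
# Sketch only: port_pkts is a hypothetical helper, not an OVS command.
# Usage: ovs-ofctl -O OpenFlow14 dump-ports br30-ovs | port_pkts ens2f0 rx

port_pkts() {
    # $1 = port name as shown by dump-ports, $2 = "rx" or "tx"
    awk -v port="$1" -v dir="$2" '
    {
        for (i = 1; i <= NF; i++) {
            # A token ending in ":" right after the word "port" names the
            # port that the following counters belong to.
            if (i > 1 && $(i - 1) == "port" && $i ~ /:$/) {
                cur = $i
                gsub(/[":]/, "", cur)
            }
            # "rx pkts=N," or "tx pkts=N," for the requested port.
            if ($i == dir && cur == port && $(i + 1) ~ /^pkts=/) {
                val = $(i + 1)
                gsub(/^pkts=|,$/, "", val)
                print val
                exit
            }
        }
    }'
}
```

In the output shown in ConsoleB, the rx packet count of ens2f0 (159317) equals the tx packet count of ens2f0_0, and vice versa, which is what you would expect when traffic is forwarded between the PF and the VF representor. On a live system the two values are read at slightly different moments if you run the command twice, so small differences between separate runs are not necessarily a problem.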
That's all.
9.Finally
We referred to the following websites.
https://www.redhat.com/en/blog?search=vdpa
https://docs.mellanox.com/pages/viewpage.action?pageId=43718786
https://community.mellanox.com/s/article/Basic-Debug-utilities-with-OVS-DPDK-offload-ASAP-Direct
https://static.sched.com/hosted_files/dpdkuserspace2020/ab/vDPA%20-%20DPDK%20Userspace%202020.pdf
https://netdevconf.info/1.2/slides/oct6/04_gerlitz_efraim_introduction_to_switchdev_sriov_offloads.pdf
https://www.mail-archive.com/dev@dpdk.org/msg175938.html
https://www.spinics.net/lists/netdev/msg693858.html
http://yunazuno.hatenablog.com/entry/2018/07/08/215118
https://ameblo.jp/makototgc/entry-12579674054.html
https://www.jianshu.com/p/091b60ea72dc
https://doc.dpdk.org/guides/sample_app_ug/vdpa.html#build <-added
In the next article, an extra chapter, we plan to describe how to procure NICs, how to configure setups other than ovs-dpdk, and the issues we are facing.
No | vm(qemu)/k8s | k8s Pod/VMI | vDPA Framework | vDPA Type | SR-IOV mode | Related Articles |
1 | vm | - | kernel | vhost | legacy | Not started |
2 | vm | - | kernel | vhost | switchdev | How to set up vDPA with vhost_vdpa for VMs - Metonymical Deflection |
3 | vm | - | kernel | virtio | legacy | Not started |
4 | vm | - | kernel | virtio | switchdev | Not started |
5 | vm | - | dpdk | vhost | legacy | Not started |
6 | vm | - | dpdk | vhost | switchdev | Not started |
7 | vm | - | dpdk | virtio | legacy | Not started |
8 | vm | - | dpdk | virtio | switchdev | How to set up vDPA with virtio_vdpa for VMs - Metonymical Deflection (this article) |
9 | k8s | pod | kernel | vhost | legacy | How to set up vDPA with vhost_vdpa for Kubernetes - Metonymical Deflection |
10 | k8s | pod | kernel | vhost | switchdev | How to set up vDPA with vhost_vdpa for Kubernetes + Accelerated Bridge CNI - Metonymical Deflection |
11 | k8s | pod | kernel | virtio | legacy | Not started |
12 | k8s | pod | kernel | virtio | switchdev | Not started |
13 | k8s | pod | dpdk | client | legacy | Not started |
14 | k8s | pod | dpdk | client | switchdev | Not started |
15 | k8s | pod | dpdk | server | legacy | Not started |
16 | k8s | pod | dpdk | server | switchdev | Not started |
Other related articles
How to set up vDPA - appendix - - Metonymical Deflection
*1:This is a description of what I understand. If the content is incorrect, please point it out.
*2:We have confirmed that if the MAC address is not set, the VM will not recognize the VF after VM startup.
*3:core and modules will be installed at the same time
*4:The "0000" in front of the bsf number is called the Domain number. I have never seen a value other than "0000", so you probably do not need to worry too much about it.
*5:It has already been installed in 2-3.
*6:We will describe the details in the extra chapter, but in the case of vhost_vdpa in the previous article, we were able to start the virtual machine with virt-manager but were not able to communicate with it. For this reason, in vhost_vdpa, we booted directly from qemu-kvm.
*7:This is because related packages such as spice were not installed when qemu was built, and the virtual machine could not be started without removing these devices. Since this is not directly related to vDPA, we will not discuss how to deal with these issues.
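As a supplement to footnote *4, the following minimal sketch splits an address such as 0000:07:00.2 into its Domain, Bus, Slot, and Function parts. The function name parse_bsf is hypothetical. Treating an address written without a domain (for example 07:00.2) as domain 0000 is an assumption made here for convenience.

```shell
#!/usr/bin/env bash
# Sketch only: parse_bsf is a hypothetical helper name.

parse_bsf() {
    local addr="$1"
    local hex='[0-9a-fA-F]'
    # Full form: domain:bus:slot.function, e.g. 0000:07:00.2
    local full="^(${hex}{4}):(${hex}{2}):(${hex}{2})\.([0-7])$"
    # Short form: bus:slot.function, e.g. 07:00.2
    local short="^(${hex}{2}):(${hex}{2})\.([0-7])$"

    if [[ $addr =~ $full ]]; then
        echo "domain=${BASH_REMATCH[1]} bus=${BASH_REMATCH[2]} slot=${BASH_REMATCH[3]} function=${BASH_REMATCH[4]}"
    elif [[ $addr =~ $short ]]; then
        # Assumption: an address without a domain belongs to domain 0000.
        echo "domain=0000 bus=${BASH_REMATCH[1]} slot=${BASH_REMATCH[2]} function=${BASH_REMATCH[3]}"
    else
        echo "not a bsf address: $addr" >&2
        return 1
    fi
}
```

For example, `parse_bsf 0000:07:00.2` splits the VF address used in this article into domain 0000, bus 07, slot 00, and function 2.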