From: Sascha Hauer <s.hauer@pengutronix.de>
To: Ahmad Fatoum <a.fatoum@pengutronix.de>
Cc: barebox@lists.infradead.org, ejo@pengutronix.de
Subject: Re: [PATCH 5/5] virtio: don't use DMA API unless required
Date: Mon, 14 Oct 2024 14:48:04 +0200
Message-ID: <Zw0TBKwNve_cG9wm@pengutronix.de>
In-Reply-To: <20241009060511.4121157-6-a.fatoum@pengutronix.de>
On Wed, Oct 09, 2024 at 08:05:11AM +0200, Ahmad Fatoum wrote:
> We have no Virt I/O drivers that make use of the streaming DMA API, but
> the virtqueues are currently always allocated using the coherent DMA
> API.
>
> The coherent DMA API (dma_alloc_coherent/dma_free_coherent) doesn't yet
> take a device pointer in barebox, unlike Linux, and as such it
> unconditionally allocates uncached memory.
>
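(For context: as the call in vring_alloc_queue() below shows, the
barebox prototype is currently

    void *dma_alloc_coherent(size_t size, dma_addr_t *dma_handle);

while Linux knows which device is doing the DMA:

    void *dma_alloc_coherent(struct device *dev, size_t size,
                             dma_addr_t *dma_handle, gfp_t gfp);

Without the struct device argument, the allocator cannot tell that a
device is dma-coherent, so it has to map the memory uncached to stay on
the safe side.)
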
> When run normally under QEMU, this doesn't matter. But once we enable
> KVM, using uncached memory for the virtqueues has a considerable
> performance impact.
>
> To avoid this, let's mimic what Linux does and just sidestep the DMA
> API if the Virt I/O device tells us that this is ok.
>
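(Side note: "tells us that this is ok" refers to the
VIRTIO_F_ACCESS_PLATFORM feature bit. In Linux, the check used below
boils down to the following, and I'd assume the barebox copy of
virtio_has_dma_quirk() is equivalent:

    static inline bool virtio_has_dma_quirk(const struct virtio_device *vdev)
    {
            /*
             * Note the reverse polarity: the device has the DMA quirk,
             * i.e. it accesses guest memory directly by physical
             * address, when VIRTIO_F_ACCESS_PLATFORM is *absent*.
             */
            return !virtio_has_feature(vdev, VIRTIO_F_ACCESS_PLATFORM);
    }

So the DMA API is only sidestepped for devices that never go through an
IOMMU or any other address translation.)
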
> Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
> ---
> drivers/virtio/virtio_ring.c | 85 ++++++++++++++++++++++++++++++++----
> include/linux/virtio_ring.h  |  1 +
> 2 files changed, 78 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> index 0efe1e002506..787b04a766e9 100644
> --- a/drivers/virtio/virtio_ring.c
> +++ b/drivers/virtio/virtio_ring.c
> @@ -299,14 +299,81 @@ static struct virtqueue *__vring_new_virtqueue(unsigned int index,
>  	return vq;
>  }
>
> -static void *vring_alloc_queue(size_t size, dma_addr_t *dma_handle)
> +/*
> + * Modern virtio devices have feature bits to specify whether they need a
> + * quirk and bypass the IOMMU. If not there, just use the DMA API.
> + *
> + * If there, the interaction between virtio and DMA API is messy.
> + *
> + * On most systems with virtio, physical addresses match bus addresses,
> + * and it _shouldn't_ particularly matter whether we use the DMA API.
> + *
> + * However, barebox's dma_alloc_coherent doesn't yet take a device pointer
> + * as argument, so even for dma-coherent devices, the virtqueue is mapped
> + * uncached on ARM. This has a considerable impact on Virt I/O performance,
> + * so we really want to avoid using the DMA API if possible for the time being.
> + *
> + * On some systems, including Xen and any system with a physical device
> + * that speaks virtio behind a physical IOMMU, we must use the DMA API
> + * for virtio DMA to work at all.
> + *
> + * On other systems, including SPARC and PPC64, virtio-pci devices are
> + * enumerated as though they are behind an IOMMU, but the virtio host
> + * ignores the IOMMU, so we must either pretend that the IOMMU isn't
> + * there or somehow map everything as the identity.
> + *
> + * For the time being, we preserve historic behavior and bypass the DMA
> + * API.
> + *
> + * TODO: install a per-device DMA ops structure that does the right thing
> + * taking into account all the above quirks, and use the DMA API
> + * unconditionally on data path.
> + */
> +
> +static bool vring_use_dma_api(const struct virtio_device *vdev)
>  {
> -	return dma_alloc_coherent(size, dma_handle);
> +	return !virtio_has_dma_quirk(vdev);
>  }
>
> -static void vring_free_queue(size_t size, void *queue, dma_addr_t dma_handle)
> +static void *vring_alloc_queue(struct virtio_device *vdev,
> +			       size_t size, dma_addr_t *dma_handle)
>  {
> -	dma_free_coherent(queue, dma_handle, size);
> +	if (vring_use_dma_api(vdev)) {
> +		return dma_alloc_coherent(size, dma_handle);
> +	} else {
> +		void *queue = memalign(PAGE_SIZE, PAGE_ALIGN(size));
> +
> +		if (queue) {
> +			phys_addr_t phys_addr = virt_to_phys(queue);
> +			*dma_handle = (dma_addr_t)phys_addr;
> +
> +			/*
> +			 * Sanity check: make sure we didn't truncate
> +			 * the address. The only arches I can find that
> +			 * have 64-bit phys_addr_t but 32-bit dma_addr_t
> +			 * are certain non-highmem MIPS and x86
> +			 * configurations, but these configurations
> +			 * should never allocate physical pages above 32
> +			 * bits, so this is fine. Just in case, throw a
> +			 * warning and abort if we end up with an
> +			 * unrepresentable address.
> +			 */
> +			if (WARN_ON_ONCE(*dma_handle != phys_addr)) {
> +				free(queue);
> +				return NULL;
> +			}
> +		}
> +		return queue;
> +	}
> +}
> +
> +static void vring_free_queue(struct virtio_device *vdev,
> +			     size_t size, void *queue, dma_addr_t dma_handle)
> +{
> +	if (vring_use_dma_api(vdev))
> +		dma_free_coherent(queue, dma_handle, size);
> +	else
> +		free(queue);
>  }
>
> struct virtqueue *vring_create_virtqueue(unsigned int index, unsigned int num,
> @@ -327,7 +394,7 @@ struct virtqueue *vring_create_virtqueue(unsigned int index, unsigned int num,
>  
>  	/* TODO: allocate each queue chunk individually */
>  	for (; num && vring_size(num, vring_align) > PAGE_SIZE; num /= 2) {
> -		queue = vring_alloc_queue(vring_size(num, vring_align), &dma_addr);
> +		queue = vring_alloc_queue(vdev, vring_size(num, vring_align), &dma_addr);
>  		if (queue)
>  			break;
>  	}
> @@ -337,7 +404,7 @@ struct virtqueue *vring_create_virtqueue(unsigned int index, unsigned int num,
>  
>  	if (!queue) {
>  		/* Try to get a single page. You are my only hope! */
> -		queue = vring_alloc_queue(vring_size(num, vring_align), &dma_addr);
> +		queue = vring_alloc_queue(vdev, vring_size(num, vring_align), &dma_addr);
>  	}
>  	if (!queue)
>  		return NULL;
> @@ -347,7 +414,7 @@ struct virtqueue *vring_create_virtqueue(unsigned int index, unsigned int num,
>  
>  	vq = __vring_new_virtqueue(index, vring, vdev);
>  	if (!vq) {
> -		vring_free_queue(queue_size_in_bytes, queue, dma_addr);
> +		vring_free_queue(vdev, queue_size_in_bytes, queue, dma_addr);
>  		return NULL;
>  	}
>  	vq_debug(vq, "created vring @ (virt=%p, phys=%pad) for vq with num %u\n",
> @@ -355,13 +422,15 @@ struct virtqueue *vring_create_virtqueue(unsigned int index, unsigned int num,
>
>  	vq->queue_dma_addr = dma_addr;
>  	vq->queue_size_in_bytes = queue_size_in_bytes;
> +	vq->use_dma_api = vring_use_dma_api(vdev);
What's vq->use_dma_api good for? It's unused.
Sascha
--
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |
Thread overview:
2024-10-09 6:05 [PATCH 0/5] ARM64: make barebox compatible with KVM Ahmad Fatoum
2024-10-09 6:05 ` [PATCH 1/5] ARM64: io: implement I/O accessors in assembly Ahmad Fatoum
2024-10-09 6:05 ` [PATCH 2/5] ARM64: board-dt-2nd: grow stack down from start of binary Ahmad Fatoum
2024-10-09 6:05 ` [PATCH 3/5] mtd: cfi-flash: use I/O accessors for reads/writes of MMIO regions Ahmad Fatoum
2024-10-09 6:05 ` [PATCH 4/5] ARM64: mmu: flush cacheable regions prior to remapping Ahmad Fatoum
2024-10-09 6:05 ` [PATCH 5/5] virtio: don't use DMA API unless required Ahmad Fatoum
2024-10-14 12:48 ` Sascha Hauer [this message]
2024-10-14 13:05 ` Ahmad Fatoum
2024-10-14 13:06 ` [PATCH] fixup! " Ahmad Fatoum
2024-10-15 6:54 ` [PATCH 0/5] ARM64: make barebox compatible with KVM Sascha Hauer