`CpuBufferPool` slower than glium #1434

KeyboardDanni · 2020-11-07T19:08:19Z

Version of vulkano: 0.19.0
OS: Arch Linux
GPU (the selected PhysicalDevice): GeForce GTX 1060 6GB
GPU Driver: nVidia 455.38 (Driver version 0x71c98000, Vulkan version 1.2.142)
Upload of a reasonably minimal complete main.rs file that demonstrates the issue: Run the Keeshond Doggymark example from the latest git. Reference code for the Vulkan renderer is here: https://gitlab.com/cosmicchipsocket/keeshond/-/blob/c91fbb2a011be18cb462a8173725730ce2052ceb/keeshond/src/renderer/vulkan.rs#L272
Sorry I can't provide anything more concise, but it was hard enough setting up vulkano for this project as-is.

Issue

When using vulkano to draw instanced quads, the overhead for each draw is actually larger than with glium, which defeats the whole point of using vulkano in the first place.

I need low draw call overhead for my 2D sprite-based engine for scenarios where ordered draws involve lots of texture changes, as these can't be batched.

For each draw operation I do the following:

let chunk = self.instance_buffer.chunk(self.instance_buffer_src.clone()).expect("Failed to allocate buffer chunk");
let vertex_slice = self.vertex_buffer.clone();

builder.draw_indexed(self.pipeline.clone(), &self.dynamic_state,
                     vec![vertex_slice, Arc::new(chunk)],
                     self.index_buffer.clone(), (), ()).expect("Failed to draw buffer");

The problem is twofold:

1. chunk() seems to be performing a lot of allocations, or is otherwise taking a long time to figure out which chunk to use and whether to allocate.
2. I have to create a new Arc for every call to draw_indexed().

If I remove batching, callgrind reports 23.97% of time spent in vulkano::buffer::cpu_pool::CpuBufferPool<T,A>::try_next_impl and 49.89% of time spent in __memcpy_avx_unaligned_erms, which is called from core::ptr::drop_in_place'2 and seems to come from dropping the Arc. Without this overhead I suspect vulkano would be quite fast, but right now this bottleneck is blocking me from working on the rest of the renderer.
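For comparison, here is a rough sketch of what a near-zero-cost per-draw allocation could look like: a bump allocator that hands out byte ranges from one large buffer allocated up front, so each draw costs only a little arithmetic. This is not vulkano's API and not how CpuBufferPool works internally; `BumpAllocator`, `Range`, `alloc` and `reset` are hypothetical names, and binding the backing GPU buffer and synchronizing with the GPU are left out entirely.

```rust
/// Hypothetical bump-style sub-allocator over one large, preallocated buffer.
/// It only tracks byte offsets; the backing GPU buffer itself is not modeled.
pub struct BumpAllocator {
    capacity: usize,
    head: usize,
}

/// A sub-range of the backing buffer, in bytes.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Range {
    pub offset: usize,
    pub len: usize,
}

impl BumpAllocator {
    pub fn new(capacity: usize) -> Self {
        Self { capacity, head: 0 }
    }

    /// Reserve `len` bytes aligned to `align` (must be a power of two).
    /// Returns `None` when the buffer has no room left for this frame.
    pub fn alloc(&mut self, len: usize, align: usize) -> Option<Range> {
        debug_assert!(align.is_power_of_two());
        // Round the current head up to the requested alignment.
        let offset = self.head.checked_add(align - 1)? & !(align - 1);
        let end = offset.checked_add(len)?;
        if end > self.capacity {
            return None;
        }
        self.head = end;
        Some(Range { offset, len })
    }

    /// Start over. Only safe to call once the GPU has finished reading
    /// everything that was written through previously returned ranges.
    pub fn reset(&mut self) {
        self.head = 0;
    }
}
```

With something like this, each draw would call `alloc` for its instance data and bind the same backing buffer at the returned offset, and `reset` would run once per frame after the GPU is done with it.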

I tried to do buffering myself using a Vec of CpuAccessibleBuffer objects, but I ran into #1429 and #1433 while trying to implement this.

Any help on this is greatly appreciated.


KentaTheBugMaker · 2020-11-12T15:39:34Z

If you want to draw 2D sprites, try using blit image or copy image with a single quad plane.

KeyboardDanni · 2020-11-16T02:43:07Z

> If you want to draw 2D sprites, try using blit image or copy image with a single quad plane.

I need to be able to apply transforms, blending, and shaders, so this is a no-go. Additionally, I would have to issue a separate command for every single draw operation, which I don't think would scale well to ~400k sprites. I need to be able to fill an 8k instance buffer so that I can give each sprite a different transform, alpha, etc. within the shader and still have everything be fast, and draw all that at once.
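To make the batching constraint concrete, here is a minimal sketch of an ordered sprite batcher: consecutive sprites that share a texture are collected into one instance batch, and a new batch starts whenever the texture changes or the batch is full. All names (`Instance`, `Batch`, `Batcher`) and the instance layout are assumptions made for illustration; they are not taken from Keeshond or vulkano.

```rust
/// Hypothetical per-sprite instance data; the real layout lives in the renderer.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Instance {
    pub transform: [f32; 6],
    pub alpha: f32,
}

/// One draw call's worth of work: a texture id and the instances that use it.
#[derive(Debug, PartialEq)]
pub struct Batch {
    pub texture: u32,
    pub instances: Vec<Instance>,
}

/// Groups sprites into batches while preserving submission order.
pub struct Batcher {
    capacity: usize,
    current: Option<Batch>,
    flushed: Vec<Batch>,
}

impl Batcher {
    /// `capacity` is the maximum number of instances per batch.
    pub fn new(capacity: usize) -> Self {
        assert!(capacity > 0);
        Self { capacity, current: None, flushed: Vec::new() }
    }

    /// Append a sprite, closing the open batch first if the texture
    /// changed or the batch already holds `capacity` instances.
    pub fn push(&mut self, texture: u32, instance: Instance) {
        let must_flush = match &self.current {
            Some(b) => b.texture != texture || b.instances.len() >= self.capacity,
            None => false,
        };
        if must_flush {
            self.flush();
        }
        let capacity = self.capacity;
        self.current
            .get_or_insert_with(|| Batch {
                texture,
                instances: Vec::with_capacity(capacity),
            })
            .instances
            .push(instance);
    }

    /// Close the open batch, if any.
    pub fn flush(&mut self) {
        if let Some(batch) = self.current.take() {
            self.flushed.push(batch);
        }
    }

    /// Close the open batch and return all batches in submission order.
    pub fn finish(mut self) -> Vec<Batch> {
        self.flush();
        self.flushed
    }
}
```

Each returned `Batch` would map to one instanced draw. When draw order forces frequent texture changes, the batches shrink toward one sprite each, which is the case where per-draw overhead dominates.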

Rua · 2021-01-23T10:34:58Z

I recently noticed when profiling my code that try_next_impl is taking up an awful lot of time. That seems to be connected to this issue.

AustinJ235 added the "status: needs investigation" and "type: performances" labels Nov 10, 2020

Eliah-Lakhin mentioned this issue Nov 11, 2020 in "Vulkano needs a new active maintainer" #1435 (closed)

marc0246 mentioned this issue Nov 2, 2022 in "CpuBufferPool revamp" #2076 (merged)

Rua closed this as completed in #2076 Nov 5, 2022
