When using vulkano to draw instanced quads, the overhead for each draw is actually larger than with glium, defeating the whole point of using vulkano in the first place.
I need low draw call overhead for my 2D sprite-based engine for scenarios where ordered draws involve lots of texture changes, as these can't be batched.
For each draw operation I do the following:
// Allocate a sub-buffer from the pool and fill it with this draw's instance data.
let chunk = self.instance_buffer.chunk(self.instance_buffer_src.clone())
    .expect("Failed to allocate buffer chunk");
let vertex_slice = self.vertex_buffer.clone();
// Record the draw; the chunk has to be wrapped in a fresh Arc each time.
builder.draw_indexed(self.pipeline.clone(), &self.dynamic_state,
    vec![vertex_slice, Arc::new(chunk)],
    self.index_buffer.clone(), (), ()).expect("Failed to draw buffer");
There are two problems: (1) chunk() seems to be performing a lot of allocations, or is otherwise taking a long time to figure out which chunk to use and whether to allocate, and (2) I have to create a new Arc for every call to draw_indexed(). If I remove batching, callgrind reports 23.97% of the time spent in vulkano::buffer::cpu_pool::CpuBufferPool<T,A>::try_next_impl and 49.89% spent in __memcpy_avx_unaligned_erms, which is called from core::ptr::drop_in_place'2 and seems to come from dropping the Arc. Without this overhead I suspect vulkano would be quite fast, but right now this bottleneck is blocking me from working on the rest of the renderer.
I tried to do buffering myself using a Vec of CpuAccessibleBuffer objects, but I ran into #1429 and #1433 while trying to implement this.
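To make the "buffering myself" idea concrete, here is a minimal sketch of a round-robin ring of preallocated buffers in plain Rust. It assumes nothing about vulkano: `InstanceRing` and its methods are hypothetical names, and each `Vec<u8>` only stands in for a CPU-accessible GPU buffer. A real renderer would also have to make sure the GPU has finished reading a slot before it is overwritten, which this sketch does not model.

```rust
/// Hypothetical ring of preallocated buffers, reused in round-robin order
/// so that no allocation happens per draw.
struct InstanceRing {
    /// Stand-ins for CPU-accessible GPU buffers (assumption for illustration).
    buffers: Vec<Vec<u8>>,
    /// Index of the slot the next write will use.
    next: usize,
}

impl InstanceRing {
    /// Allocate every buffer once, up front.
    fn new(buffer_count: usize, buffer_size: usize) -> Self {
        assert!(buffer_count > 0, "the ring needs at least one buffer");
        Self {
            buffers: (0..buffer_count).map(|_| vec![0u8; buffer_size]).collect(),
            next: 0,
        }
    }

    /// Copy `data` into the next slot and return that slot's index.
    /// Returns `None` (and does not advance) if `data` does not fit.
    fn write(&mut self, data: &[u8]) -> Option<usize> {
        let slot = self.next;
        let buffer = &mut self.buffers[slot];
        if data.len() > buffer.len() {
            return None;
        }
        buffer[..data.len()].copy_from_slice(data);
        self.next = (slot + 1) % self.buffers.len();
        Some(slot)
    }
}
```

The point of the design is that the per-draw cost is one copy and one index increment; whether that maps cleanly onto vulkano's buffer types is exactly what the linked issues are about.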
Any help on this is greatly appreciated.
If you want to draw 2D sprites, try using blit image or copy image and one quad plane.
I need to be able to apply transforms, blending, and shaders, so this is a no-go. Additionally, I would have to issue a separate command for every single draw operation, which I don't think would scale well to ~400k sprites. I need to be able to fill an 8k instance buffer so that I can give each sprite a different transform, alpha, etc. within the shader and still have everything be fast, and draw all that at once.
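As a rough illustration of why ordered draws with many texture changes are the hard case, here is a minimal sketch of splitting an ordered sprite list into draw batches. Everything in it is hypothetical (`SpriteInstance`, `DrawBatch`, `build_batches`), and it reads "8k" as 8192 instances, which is an assumption. It is not the code from the Keeshond renderer.

```rust
/// Hypothetical per-sprite data that would go into the instance buffer.
#[derive(Clone, Copy, Debug, PartialEq)]
struct SpriteInstance {
    transform: [f32; 6], // 2D affine transform
    alpha: f32,
}

/// One draw call: a contiguous run of instances that share a texture.
#[derive(Debug, PartialEq)]
struct DrawBatch {
    texture_id: u32,
    first_instance: usize,
    instance_count: usize,
}

/// Assumed capacity of the instance buffer ("8k" read as 8192 instances).
const MAX_INSTANCES: usize = 8192;

/// Walk the sprites in draw order and start a new batch whenever the
/// texture changes or the current batch has reached `max_instances`.
fn build_batches(sprites: &[(u32, SpriteInstance)], max_instances: usize) -> Vec<DrawBatch> {
    let mut batches: Vec<DrawBatch> = Vec::new();
    for (index, (texture_id, _instance)) in sprites.iter().enumerate() {
        match batches.last_mut() {
            // Same texture and room left: extend the current batch.
            Some(last)
                if last.texture_id == *texture_id && last.instance_count < max_instances =>
            {
                last.instance_count += 1;
            }
            // Otherwise this sprite starts a new draw call.
            _ => batches.push(DrawBatch {
                texture_id: *texture_id,
                first_instance: index,
                instance_count: 1,
            }),
        }
    }
    batches
}
```

Because draw order has to be preserved, sprites that alternate textures produce one batch each, so in the worst case the number of draw calls equals the number of sprites. That is the scenario where per-draw overhead dominates.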
main.rs file that demonstrates the issue: Run the Keeshond Doggymark example in latest git. Reference code for the Vulkan renderer is here: https://gitlab.com/cosmicchipsocket/keeshond/-/blob/c91fbb2a011be18cb462a8173725730ce2052ceb/keeshond/src/renderer/vulkan.rs#L272
Sorry I can't provide anything more concise, but it was hard enough setting up vulkano for this project as-is.