[WIP] compute shader support #1200
Draft: floooh wants to merge 43 commits into master from sgcompute
+1,974
−883
…be zero-initialized)
…nd of compute pass
…ll be zero-initialized)
…inor code cleanup
…ER_BIT to GL loader
25 tasks
…ads_per_threadgroup (since it's only needed in Metal)
...that way they don't show up in the generated zig-docs.
This was referenced Feb 22, 2025
This was referenced Feb 23, 2025
This was referenced Feb 25, 2025
Related sokol-shdc PR: floooh/sokol-tools#173
- example
- Metal backend: remove the managed-buffer-synchronization and use the same granular approach as the memory barrier code in GL => this wouldn't work because the buffer sync needs to happen outside a compute pass
- change the internal names of the various `*_info` structs to not be abbreviated so much
- `sg_features.storage_buffer` to `sg_features.compute`
- drive-by: remove return value from `_sg_*_apply_bindings()`, all implementations always return true anyway => it's actually used in the wgpu backend
- `d3d11_shutdown()`
- cleared when bound for read-only
- on `sg_shader_storage_buffer`
- rename `hlsl_register_t_n` to `hlsl_register_t_or_u_n`
- `threadgroup` variables (e.g. SPIRVCross turns GLSL `shared` variables into statically allocated threadgroup variables)
- NOTE: concurrent compute pass encoders require bumping the minimal supported macOS base version to 10.14
- "Buffer resources used for output from the compute shader must be created with the D3D11_BIND_RENDER_TARGET flag. Such resources may be read from, however."
- "Buffer resources created with the D3D11_BIND_SHADER_RESOURCE flag may only be used as inputs to the compute shader."
- assume iPhone8 (e.g. non-uniform thread size feature) (probably don't need this with the below changes)
- `sg_dispatch()` in render passes
- `num_groups_*` in the dispatch call doesn't exceed 0xFFFF (see D3D11 functional spec)
- `sg_draw()`, `sg_apply_viewport`, `sg_apply_scissor_rect` in compute passes
- `sg_begin_pass()`: don't set attachments or swapchain in compute passes
- `sg_apply_bindings()`: don't bind vertex/index buffers, textures and samplers (textures and samplers should be fine)
- in `sg_make_shader()`: don't mix texture, sampler and writable storage buffer bindings

Questions and open problems:
[YES] support thread-shared memory? (requires special CPU side support in Metal: https://developer.apple.com/documentation/metal/mtlcomputecommandencoder/setthreadgroupmemorylength(_:index:)?language=objc) => setThreadGroupMemoryLength is only needed when the length isn't known to the shader (e.g. the shared memory isn't statically allocated in the shader itself; we don't need that since SPIRVCross always statically allocates)
[YES] should we just allow multiple writes to the same resource and insert the necessary barriers in the APIs which require it?
[DONE] in HLSL, GLSL and WGSL the shader defines the workgroup size, but in Metal that's defined on the CPU side in the dispatch call.
GLSL: layout(local_size_x = X, local_size_y = Y, local_size_z = Z) in;
HLSL: [numthreads(x, y, z)]
WGSL: @workgroup_size(x,y,z)
dispatchWorkgroups thing, e.g. if workgroup_size is 64, then: passEncoder.dispatchWorkgroups(Math.ceil(numParticles / 64));
[LATER] storage textures now or later?
[LATER] split SG_BUFFERTYPE into a bool-flags struct? (to allow storage+vertex buffers), alternatively: just do vertex pulling from storage buffers...?
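
For illustration only, here is a rough C sketch of the two dispatch-size points above: rounding the item count up to whole workgroups (the `Math.ceil(numParticles / 64)` idea) and checking the per-dimension group count against the 0xFFFF limit mentioned for D3D11. The names `MAX_GROUPS_PER_DIM`, `num_groups_for()` and `dispatch_size_valid()` are hypothetical helpers made up for this sketch; they are not part of the sokol API or of this PR.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical constant: per-dimension group-count limit (0xFFFF, as in the D3D11 note above). */
#define MAX_GROUPS_PER_DIM (0xFFFFu)

/* Hypothetical helper: number of workgroups needed to cover num_items,
   i.e. an integer ceil(num_items / workgroup_size). Returns 0 for a
   zero workgroup size instead of dividing by zero. */
static uint32_t num_groups_for(uint32_t num_items, uint32_t workgroup_size) {
    if (workgroup_size == 0) {
        return 0;
    }
    /* divide first, then add 1 for a partial group; avoids overflow of (a + b - 1) */
    return (num_items / workgroup_size) + ((num_items % workgroup_size) != 0 ? 1u : 0u);
}

/* Hypothetical helper: true if every dimension stays within the limit. */
static bool dispatch_size_valid(uint32_t groups_x, uint32_t groups_y, uint32_t groups_z) {
    return (groups_x <= MAX_GROUPS_PER_DIM)
        && (groups_y <= MAX_GROUPS_PER_DIM)
        && (groups_z <= MAX_GROUPS_PER_DIM);
}
```

With a workgroup size of 64, `num_groups_for(1000, 64)` yields 16 groups, so the last group is only partially filled and the shader would need its own bounds check on the invocation index.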