Batch kernels for forward pass of Preprocessing #2

Merged: 57 commits into dist on May 11, 2024

@sandeepnmenon sandeepnmenon commented Apr 28, 2024

Changes

  1. New CUDA kernels for a batched version of preprocessCUDA
  2. New package-level API: preprocess_gaussians_batched
  3. New rasterization settings class: GaussianRasterizerBatches
  4. Test file that tests and compares the batched and non-batched preprocess forward kernels

Results

Tests were run on a V100 GPU.

num_gaussians = 1000000
num_batches = 64
SH_ACTIVE_DEGREE = 3

Time taken by test_batched_gaussian_rasterizer: 81.2411 ms
Time taken by test_batched_gaussian_rasterizer_batch_processing: 33.5708 ms
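
To put the two timings above in relative terms, here is a minimal sketch that computes the ratio between them. It assumes the first timing corresponds to the per-camera (non-batched) path and the second to the new batched kernels, which is how the test names read; the variable names are illustrative and not taken from the test file.

```python
# Timings reported above, in milliseconds (V100, 1,000,000 Gaussians, 64 batches).
# Assumption: the first test loops over cameras one by one, the second uses
# the batched preprocess kernel.
loop_ms = 81.2411      # test_batched_gaussian_rasterizer
batched_ms = 33.5708   # test_batched_gaussian_rasterizer_batch_processing

speedup = loop_ms / batched_ms    # how many times faster the batched path is
saved_ms = loop_ms - batched_ms   # absolute time saved per run

print(f"speedup: {speedup:.2f}x, saved: {saved_ms:.4f} ms")
```

Under that assumption, the batched forward pass is roughly 2.4 times faster for this configuration.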

sandeepnmenon and others added 30 commits April 20, 2024 19:17
…of math.tan in test_batched_gaussian_rasterizer_batch_processing function
…asterizer_batch_processing functions in rasterization_tests.py
…ao/diff-gaussian-rasterization into mlsys/batched_preprocess

@prapti19 prapti19 left a comment

We get a significant speedup. Looks good.

elapsed_time_ms = start_event.elapsed_time(end_event)
print(f"Time taken by test_batched_gaussian_rasterizer_batch_processing: {elapsed_time_ms:.4f} ms")

# TODO: make the below work
Collaborator

This should work:

for means2D in batched_means2D:
    means2D.retain_grad()
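
For context on the suggestion above, here is a small CPU-only sketch of what retain_grad() changes. The tensors are hypothetical stand-ins, not the ones from the test file, and the sketch assumes each entry of batched_means2D is a non-leaf tensor (computed from another tensor). On a leaf tensor that already requires grad, retain_grad() is a no-op.

```python
import torch

# Hypothetical stand-ins for the test's tensors: one leaf tensor of 3D means
# and a list of per-camera 2D outputs derived from it (non-leaf tensors).
means3D = torch.randn(8, 3, requires_grad=True)
batched_means2D = [means3D[:, :2] * (i + 1) for i in range(4)]

# Non-leaf tensors do not keep .grad after backward() unless asked to.
for means2D in batched_means2D:
    means2D.retain_grad()

# Any scalar loss will do for the illustration.
loss = sum(m.sum() for m in batched_means2D)
loss.backward()

# Each per-camera tensor now exposes its own gradient.
grads_kept = [m.grad is not None for m in batched_means2D]
print(grads_kept)
```

Calling retain_grad() on each element, rather than on the list, is needed because the list itself is a plain Python container with no autograd state.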

@TarzanZhao TarzanZhao left a comment

I've reviewed the code, and I think it's good; it can be merged.

viewpoint_camera.image_width = 512
viewpoint_camera.world_view_transform = torch.eye(4).cuda()
viewpoint_camera.full_proj_transform = torch.eye(4).cuda()
viewpoint_camera.camera_center = torch.zeros(3).cuda()

@TarzanZhao TarzanZhao May 11, 2024

Let's change torch.zeros to non-zero values and test it in tomorrow's meeting.

Collaborator Author

Fix the bug for non-zero camera centers, and test with non-identity view transforms.

Collaborator Author

Tested both cases. I will check those tests in with the backward kernel PR #3.
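
For reference, one way to build the kind of test camera discussed in this thread (non-identity view transform, non-zero camera center) is sketched below. It is not the code from the PR: make_test_camera is a hypothetical helper, and the sketch assumes the transposed (row-vector) matrix layout used by the reference 3D Gaussian Splatting code, where the camera center is read from the last row of the inverted view transform. The .cuda() calls and full_proj_transform are left out so the sketch runs on CPU.

```python
import math
import torch

def make_test_camera(yaw_rad, translation):
    """Hypothetical helper: return (world_view_transform, camera_center).

    Assumes the transposed, row-vector layout of the reference 3DGS code.
    """
    c, s = math.cos(yaw_rad), math.sin(yaw_rad)
    rotation = torch.tensor([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

    # Column-vector world-to-view matrix [R | t; 0 0 0 1].
    world_to_view = torch.eye(4)
    world_to_view[:3, :3] = rotation
    world_to_view[:3, 3] = torch.tensor(translation)

    # Store it transposed, then derive the center so the two stay consistent.
    world_view_transform = world_to_view.transpose(0, 1)
    camera_center = world_view_transform.inverse()[3, :3]
    return world_view_transform, camera_center

world_view_transform, camera_center = make_test_camera(math.pi / 6, [0.5, -1.0, 2.0])

# Sanity check: the camera center must map to the view-space origin.
center_h = torch.cat([camera_center, torch.ones(1)])
center_in_view = center_h @ world_view_transform
print(camera_center, center_in_view)
```

Deriving camera_center from the transform, instead of setting the two independently, keeps the test inputs geometrically consistent, which matters when comparing batched and non-batched outputs.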

@sandeepnmenon sandeepnmenon merged commit 13e4cb0 into dist May 11, 2024