Commit

Update Cutlass to v3.4.1
tridao committed Feb 21, 2024
1 parent b32efb1 commit 4d6b794
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion csrc/cutlass
Submodule cutlass updated 61 files
+7 −2 CHANGELOG.md
+24 −3 CMakeLists.txt
+7 −0 PUBLICATIONS.md
+9 −4 README.md
+0 −38 cmake/version.h.in
+34 −0 cmake/version_extended.h.in
+1 −0 examples/02_dump_reg_shmem/CMakeLists.txt
+2 −2 examples/08_turing_tensorop_gemm/turing_tensorop_gemm.cu
+7 −7 examples/56_hopper_ptr_array_batched_gemm/56_hopper_ptr_array_batched_gemm.cu
+10 −8 examples/56_hopper_ptr_array_batched_gemm/CMakeLists.txt
+96 −49 examples/57_hopper_grouped_gemm/57_hopper_grouped_gemm.cu
+10 −0 examples/57_hopper_grouped_gemm/CMakeLists.txt
+1 −1 include/cute/arch/copy_sm90_desc.hpp
+2 −0 include/cute/atom/mma_atom.hpp
+2 −2 include/cute/util/print.hpp
+3 −0 include/cute/util/type_traits.hpp
+4 −0 include/cutlass/arch/mma_sm90.h
+1 −0 include/cutlass/bfloat16.h
+35 −1 include/cutlass/detail/layout.hpp
+12 −7 include/cutlass/epilogue/collective/builders/sm90_builder.inl
+1 −0 include/cutlass/epilogue/collective/default_epilogue.hpp
+32 −18 include/cutlass/epilogue/collective/default_epilogue_array.hpp
+76 −38 include/cutlass/epilogue/collective/sm90_epilogue_tma_warpspecialized.hpp
+1 −2 include/cutlass/epilogue/dispatch_policy.hpp
+28 −0 include/cutlass/epilogue/fusion/sm90_callbacks_tma_warpspecialized.hpp
+1 −0 include/cutlass/epilogue/fusion/sm90_visitor_store_tma_warpspecialized.hpp
+57 −12 include/cutlass/epilogue/thread/linear_combination.h
+0 −183 include/cutlass/epilogue/threadblock/default_epilogue_tensor_op_row_broadcast.h
+0 −519 include/cutlass/epilogue/threadblock/predicated_tile_iterator_row_broadcast.h
+4 −8 include/cutlass/gemm/collective/builders/sm90_gmma_builder.inl
+45 −29 include/cutlass/gemm/collective/sm90_mma_array_tma_gmma_ss_warpspecialized.hpp
+0 −514 include/cutlass/gemm/device/gemm_sparse_row_broadcast.h
+4 −7 include/cutlass/gemm/dispatch_policy.hpp
+12 −0 include/cutlass/gemm/group_array_problem_shape.hpp
+0 −191 include/cutlass/gemm/kernel/default_gemm_sparse_row_broadcast.h
+30 −35 include/cutlass/gemm/kernel/sm90_gemm_array_tma_warpspecialized_cooperative.hpp
+5 −7 include/cutlass/gemm/kernel/sm90_gemm_tma.hpp
+5 −7 include/cutlass/gemm/kernel/sm90_gemm_tma_warpspecialized.hpp
+5 −7 include/cutlass/gemm/kernel/sm90_gemm_tma_warpspecialized_cooperative.hpp
+5 −7 include/cutlass/gemm/kernel/sm90_gemm_tma_warpspecialized_pingpong.hpp
+5 −7 include/cutlass/gemm/kernel/sm90_gemm_warpspecialized.hpp
+5 −7 include/cutlass/gemm/kernel/sm90_gemm_warpspecialized_cooperative.hpp
+5 −7 include/cutlass/gemm/kernel/sm90_gemm_warpspecialized_pingpong.hpp
+140 −86 include/cutlass/gemm/kernel/sm90_tile_scheduler_group.hpp
+0 −400 include/cutlass/gemm/kernel/sparse_gemm_row_broadcast.h
+14 −6 include/cutlass/gemm/kernel/tile_scheduler_params.h
+80 −0 include/cutlass/version.h
+2 −2 pyproject.toml
+3 −3 python/cutlass/__init__.py
+6 −2 python/cutlass/backend/c_types.py
+23 −1 python/cutlass/backend/epilogue.py
+2 −2 python/cutlass/backend/evt/frontend/frontend_base.py
+0 −16 python/cutlass/backend/evt/passes/graph_drawer.py
+28 −18 python/cutlass/backend/gemm_operation.py
+1 −1 python/setup_library.py
+1 −1 python/setup_pycute.py
+1 −0 test/unit/gemm/device/CMakeLists.txt
+0 −19 test/unit/gemm/device/gemm_f16n_f16n_f16t_tensor_op_f32_sparse_sm80.cu
+685 −0 test/unit/gemm/device/sm90_gemm_f16_f16_f16_tensor_op_f32_cluster_warpspecialized_cooperative_aux_store.cu
+7 −20 test/unit/gemm/device/testbed_sparse.h
+1 −1 tools/util/include/cutlass/util/packed_stride.hpp
