[BugFix] Fix serialization of stacks of Tensorclasses #1236

Merged (3 commits) on Feb 26, 2025

Conversation

vmoens (Contributor) commented Feb 25, 2025

[ghstack-poisoned]
vmoens added a commit that referenced this pull request Feb 25, 2025
ghstack-source-id: 0f479c80655d1e663ce67a16031556dbe70937f9
Pull Request resolved: #1236
facebook-github-bot added the CLA Signed label Feb 25, 2025

github-actions bot commented Feb 25, 2025

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 217. Improved: $\large\color{#35bf28}7$. Worsened: $\large\color{#d91a1a}20$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 68.6270μs 21.1877μs 47.1972 KOps/s 47.4125 KOps/s $\color{#d91a1a}-0.45\%$
test_plain_set_stack_nested 66.6740μs 21.3683μs 46.7982 KOps/s 47.9190 KOps/s $\color{#d91a1a}-2.34\%$
test_plain_set_nested_inplace 65.3320μs 23.0791μs 43.3293 KOps/s 44.8219 KOps/s $\color{#d91a1a}-3.33\%$
test_plain_set_stack_nested_inplace 63.5280μs 22.7969μs 43.8656 KOps/s 44.9595 KOps/s $\color{#d91a1a}-2.43\%$
test_items 42.8000μs 4.2595μs 234.7670 KOps/s 232.0791 KOps/s $\color{#35bf28}+1.16\%$
test_items_nested 0.7136ms 0.4069ms 2.4577 KOps/s 2.5198 KOps/s $\color{#d91a1a}-2.46\%$
test_items_nested_locked 0.7271ms 0.4101ms 2.4385 KOps/s 2.4881 KOps/s $\color{#d91a1a}-2.00\%$
test_items_nested_leaf 0.1947ms 77.0058μs 12.9860 KOps/s 13.1824 KOps/s $\color{#d91a1a}-1.49\%$
test_items_stack_nested 0.7172ms 0.4048ms 2.4701 KOps/s 2.4753 KOps/s $\color{#d91a1a}-0.21\%$
test_items_stack_nested_leaf 0.1581ms 78.0396μs 12.8140 KOps/s 13.1799 KOps/s $\color{#d91a1a}-2.78\%$
test_items_stack_nested_locked 0.8176ms 0.4054ms 2.4667 KOps/s 2.4856 KOps/s $\color{#d91a1a}-0.76\%$
test_keys 18.1540μs 3.4630μs 288.7648 KOps/s 285.4286 KOps/s $\color{#35bf28}+1.17\%$
test_keys_nested 0.2689ms 0.1631ms 6.1312 KOps/s 6.0615 KOps/s $\color{#35bf28}+1.15\%$
test_keys_nested_locked 0.6743ms 0.1678ms 5.9583 KOps/s 5.8341 KOps/s $\color{#35bf28}+2.13\%$
test_keys_nested_leaf 0.2444ms 0.1422ms 7.0346 KOps/s 6.9259 KOps/s $\color{#35bf28}+1.57\%$
test_keys_stack_nested 0.2838ms 0.1632ms 6.1271 KOps/s 6.0592 KOps/s $\color{#35bf28}+1.12\%$
test_keys_stack_nested_leaf 0.2303ms 0.1417ms 7.0579 KOps/s 6.9596 KOps/s $\color{#35bf28}+1.41\%$
test_keys_stack_nested_locked 0.2980ms 0.1694ms 5.9048 KOps/s 5.8540 KOps/s $\color{#35bf28}+0.87\%$
test_values 10.4132μs 1.0525μs 950.1025 KOps/s 945.3966 KOps/s $\color{#35bf28}+0.50\%$
test_values_nested 0.1123ms 62.4829μs 16.0044 KOps/s 16.1875 KOps/s $\color{#d91a1a}-1.13\%$
test_values_nested_locked 0.1176ms 62.7452μs 15.9375 KOps/s 16.1262 KOps/s $\color{#d91a1a}-1.17\%$
test_values_nested_leaf 0.1366ms 71.8473μs 13.9184 KOps/s 14.0695 KOps/s $\color{#d91a1a}-1.07\%$
test_values_stack_nested 0.1170ms 63.2148μs 15.8191 KOps/s 16.1530 KOps/s $\color{#d91a1a}-2.07\%$
test_values_stack_nested_leaf 0.1280ms 71.4293μs 13.9998 KOps/s 14.0453 KOps/s $\color{#d91a1a}-0.32\%$
test_values_stack_nested_locked 0.1155ms 62.6686μs 15.9570 KOps/s 16.2534 KOps/s $\color{#d91a1a}-1.82\%$
test_membership 16.0700μs 0.8583μs 1.1651 MOps/s 1.1494 MOps/s $\color{#35bf28}+1.37\%$
test_membership_nested 43.3510μs 2.8789μs 347.3493 KOps/s 345.9851 KOps/s $\color{#35bf28}+0.39\%$
test_membership_nested_leaf 45.4440μs 2.9265μs 341.7081 KOps/s 343.8202 KOps/s $\color{#d91a1a}-0.61\%$
test_membership_stacked_nested 26.9300μs 2.8712μs 348.2806 KOps/s 346.5005 KOps/s $\color{#35bf28}+0.51\%$
test_membership_stacked_nested_leaf 18.2440μs 2.8896μs 346.0641 KOps/s 350.8263 KOps/s $\color{#d91a1a}-1.36\%$
test_membership_nested_last 46.4070μs 4.2842μs 233.4166 KOps/s 234.2487 KOps/s $\color{#d91a1a}-0.36\%$
test_membership_nested_leaf_last 27.2610μs 4.2949μs 232.8370 KOps/s 232.3410 KOps/s $\color{#35bf28}+0.21\%$
test_membership_stacked_nested_last 51.0650μs 4.2841μs 233.4221 KOps/s 234.6809 KOps/s $\color{#d91a1a}-0.54\%$
test_membership_stacked_nested_leaf_last 20.6880μs 4.2940μs 232.8839 KOps/s 230.2948 KOps/s $\color{#35bf28}+1.12\%$
test_nested_getleaf 54.5210μs 10.5303μs 94.9644 KOps/s 94.8979 KOps/s $\color{#35bf28}+0.07\%$
test_nested_get 57.4370μs 9.9926μs 100.0745 KOps/s 98.4952 KOps/s $\color{#35bf28}+1.60\%$
test_stacked_getleaf 31.8190μs 10.6265μs 94.1039 KOps/s 96.3279 KOps/s $\color{#d91a1a}-2.31\%$
test_stacked_get 52.7790μs 10.0184μs 99.8165 KOps/s 100.4058 KOps/s $\color{#d91a1a}-0.59\%$
test_nested_getitemleaf 32.6210μs 11.1541μs 89.6531 KOps/s 89.4379 KOps/s $\color{#35bf28}+0.24\%$
test_nested_getitem 55.7940μs 10.6071μs 94.2762 KOps/s 94.8007 KOps/s $\color{#d91a1a}-0.55\%$
test_stacked_getitemleaf 52.8780μs 11.1544μs 89.6511 KOps/s 90.1325 KOps/s $\color{#d91a1a}-0.53\%$
test_stacked_getitem 35.0650μs 10.6137μs 94.2181 KOps/s 95.2833 KOps/s $\color{#d91a1a}-1.12\%$
test_lock_nested 0.6004ms 0.4101ms 2.4384 KOps/s 2.4428 KOps/s $\color{#d91a1a}-0.18\%$
test_lock_stack_nested 0.5271ms 0.4217ms 2.3712 KOps/s 2.3532 KOps/s $\color{#35bf28}+0.76\%$
test_unlock_nested 0.4221ms 0.3313ms 3.0181 KOps/s 2.9852 KOps/s $\color{#35bf28}+1.10\%$
test_unlock_stack_nested 0.7135ms 0.3393ms 2.9469 KOps/s 2.9249 KOps/s $\color{#35bf28}+0.75\%$
test_flatten_speed 0.1899ms 0.1009ms 9.9077 KOps/s 9.9739 KOps/s $\color{#d91a1a}-0.66\%$
test_unflatten_speed 1.2497ms 0.5147ms 1.9429 KOps/s 1.9319 KOps/s $\color{#35bf28}+0.57\%$
test_common_ops 4.3503ms 0.8467ms 1.1810 KOps/s 1.2574 KOps/s $\textbf{\color{#d91a1a}-6.07\%}$
test_creation 48.2100μs 2.5248μs 396.0717 KOps/s 402.1641 KOps/s $\color{#d91a1a}-1.51\%$
test_creation_empty 72.4150μs 13.0993μs 76.3397 KOps/s 86.2605 KOps/s $\textbf{\color{#d91a1a}-11.50\%}$
test_creation_nested_1 50.2740μs 15.8611μs 63.0475 KOps/s 68.6295 KOps/s $\textbf{\color{#d91a1a}-8.13\%}$
test_creation_nested_2 43.9210μs 20.5275μs 48.7153 KOps/s 53.1394 KOps/s $\textbf{\color{#d91a1a}-8.33\%}$
test_clone 85.5790μs 13.3859μs 74.7055 KOps/s 71.6282 KOps/s $\color{#35bf28}+4.30\%$
test_getitem[int] 0.8662ms 12.9973μs 76.9391 KOps/s 79.4520 KOps/s $\color{#d91a1a}-3.16\%$
test_getitem[slice_int] 0.1276ms 24.8290μs 40.2754 KOps/s 41.8876 KOps/s $\color{#d91a1a}-3.85\%$
test_getitem[range] 0.1583ms 50.3413μs 19.8644 KOps/s 18.2768 KOps/s $\textbf{\color{#35bf28}+8.69\%}$
test_getitem[tuple] 0.1485ms 20.5637μs 48.6294 KOps/s 47.0832 KOps/s $\color{#35bf28}+3.28\%$
test_getitem[list] 0.1594ms 45.9831μs 21.7471 KOps/s 21.8449 KOps/s $\color{#d91a1a}-0.45\%$
test_setitem_dim[int] 51.0550μs 26.4152μs 37.8570 KOps/s 39.4709 KOps/s $\color{#d91a1a}-4.09\%$
test_setitem_dim[slice_int] 0.1003ms 51.5858μs 19.3852 KOps/s 19.7043 KOps/s $\color{#d91a1a}-1.62\%$
test_setitem_dim[range] 0.1287ms 76.6056μs 13.0539 KOps/s 13.1431 KOps/s $\color{#d91a1a}-0.68\%$
test_setitem_dim[tuple] 81.8520μs 41.9900μs 23.8152 KOps/s 24.9752 KOps/s $\color{#d91a1a}-4.64\%$
test_setitem 0.1034ms 21.5737μs 46.3527 KOps/s 49.4738 KOps/s $\textbf{\color{#d91a1a}-6.31\%}$
test_set 0.1013ms 21.0418μs 47.5245 KOps/s 48.9950 KOps/s $\color{#d91a1a}-3.00\%$
test_set_shared 4.2452ms 0.1868ms 5.3536 KOps/s 5.4822 KOps/s $\color{#d91a1a}-2.35\%$
test_update 0.2872ms 25.9915μs 38.4742 KOps/s 45.1076 KOps/s $\textbf{\color{#d91a1a}-14.71\%}$
test_update_nested 75.4400μs 36.1942μs 27.6288 KOps/s 30.2695 KOps/s $\textbf{\color{#d91a1a}-8.72\%}$
test_update__nested 0.4213ms 33.9868μs 29.4232 KOps/s 30.2636 KOps/s $\color{#d91a1a}-2.78\%$
test_set_nested 0.1216ms 23.3195μs 42.8826 KOps/s 45.8023 KOps/s $\textbf{\color{#d91a1a}-6.37\%}$
test_set_nested_new 71.3920μs 27.3353μs 36.5828 KOps/s 37.9313 KOps/s $\color{#d91a1a}-3.56\%$
test_select 0.1100ms 43.3238μs 23.0820 KOps/s 23.4942 KOps/s $\color{#d91a1a}-1.75\%$
test_select_nested 0.1175ms 61.8263μs 16.1743 KOps/s 16.0707 KOps/s $\color{#35bf28}+0.65\%$
test_exclude_nested 0.1671ms 79.4677μs 12.5837 KOps/s 12.3459 KOps/s $\color{#35bf28}+1.93\%$
test_empty[True] 0.5682ms 0.4023ms 2.4859 KOps/s 2.4468 KOps/s $\color{#35bf28}+1.60\%$
test_empty[False] 35.6437μs 1.3771μs 726.1894 KOps/s 728.3850 KOps/s $\color{#d91a1a}-0.30\%$
test_unbind_speed 0.5399ms 0.2673ms 3.7406 KOps/s 3.6634 KOps/s $\color{#35bf28}+2.11\%$
test_unbind_speed_stack0 0.4083ms 0.2658ms 3.7623 KOps/s 3.7130 KOps/s $\color{#35bf28}+1.33\%$
test_unbind_speed_stack1 0.1001s 0.7214ms 1.3861 KOps/s 1.1924 KOps/s $\textbf{\color{#35bf28}+16.25\%}$
test_split 0.1055s 1.7776ms 562.5594 Ops/s 559.5183 Ops/s $\color{#35bf28}+0.54\%$
test_chunk 0.1047s 1.7724ms 564.2086 Ops/s 629.9324 Ops/s $\textbf{\color{#d91a1a}-10.43\%}$
test_consolidate_njt[False-None] 8.8017ms 8.4427ms 118.4449 Ops/s 108.8788 Ops/s $\textbf{\color{#35bf28}+8.79\%}$
test_creation[device0] 0.2216ms 91.6641μs 10.9094 KOps/s 10.7933 KOps/s $\color{#35bf28}+1.08\%$
test_creation_from_tensor 4.2526ms 95.4257μs 10.4794 KOps/s 10.7543 KOps/s $\color{#d91a1a}-2.56\%$
test_add_one[memmap_tensor0] 0.1097ms 4.8984μs 204.1478 KOps/s 203.9570 KOps/s $\color{#35bf28}+0.09\%$
test_contiguous[memmap_tensor0] 15.1680μs 0.5147μs 1.9429 MOps/s 1.9556 MOps/s $\color{#d91a1a}-0.65\%$
test_stack[memmap_tensor0] 26.9500μs 3.3687μs 296.8499 KOps/s 289.6538 KOps/s $\color{#35bf28}+2.48\%$
test_memmaptd_index 1.2587ms 0.2280ms 4.3861 KOps/s 4.3516 KOps/s $\color{#35bf28}+0.79\%$
test_memmaptd_index_astensor 0.4926ms 0.3140ms 3.1851 KOps/s 3.1626 KOps/s $\color{#35bf28}+0.71\%$
test_memmaptd_index_op 0.8673ms 0.6060ms 1.6502 KOps/s 1.7293 KOps/s $\color{#d91a1a}-4.57\%$
test_serialize_model 0.2180s 0.1332s 7.5060 Ops/s 8.7201 Ops/s $\textbf{\color{#d91a1a}-13.92\%}$
test_serialize_model_pickle 0.4917s 0.4002s 2.4988 Ops/s 2.5143 Ops/s $\color{#d91a1a}-0.62\%$
test_serialize_weights 0.1284s 0.1175s 8.5088 Ops/s 8.6989 Ops/s $\color{#d91a1a}-2.19\%$
test_serialize_weights_returnearly 0.1980s 0.1637s 6.1084 Ops/s 5.5944 Ops/s $\textbf{\color{#35bf28}+9.19\%}$
test_serialize_weights_pickle 0.5401s 0.4429s 2.2577 Ops/s 2.4510 Ops/s $\textbf{\color{#d91a1a}-7.89\%}$
test_serialize_weights_filesystem 0.1635s 0.1458s 6.8577 Ops/s 6.7876 Ops/s $\color{#35bf28}+1.03\%$
test_serialize_model_filesystem 0.1561s 0.1494s 6.6932 Ops/s 6.5300 Ops/s $\color{#35bf28}+2.50\%$
test_reshape_pytree 61.5240μs 26.0554μs 38.3798 KOps/s 37.9137 KOps/s $\color{#35bf28}+1.23\%$
test_reshape_td 73.3570μs 33.2443μs 30.0803 KOps/s 30.3319 KOps/s $\color{#d91a1a}-0.83\%$
test_view_pytree 81.6720μs 26.2261μs 38.1300 KOps/s 36.5874 KOps/s $\color{#35bf28}+4.22\%$
test_view_td 0.1047ms 41.2923μs 24.2176 KOps/s 22.2917 KOps/s $\textbf{\color{#35bf28}+8.64\%}$
test_unbind_pytree 68.1660μs 29.3181μs 34.1086 KOps/s 33.8922 KOps/s $\color{#35bf28}+0.64\%$
test_unbind_td 0.2990ms 39.5918μs 25.2578 KOps/s 25.3619 KOps/s $\color{#d91a1a}-0.41\%$
test_split_pytree 0.1033ms 29.3451μs 34.0772 KOps/s 34.3473 KOps/s $\color{#d91a1a}-0.79\%$
test_split_td 0.5066ms 45.1749μs 22.1362 KOps/s 21.9190 KOps/s $\color{#35bf28}+0.99\%$
test_add_pytree 0.2763ms 35.3067μs 28.3233 KOps/s 28.0579 KOps/s $\color{#35bf28}+0.95\%$
test_add_td 0.1692ms 59.2697μs 16.8720 KOps/s 18.0738 KOps/s $\textbf{\color{#d91a1a}-6.65\%}$
test_compile_add_one_nested[tensordict-compile] 0.1939ms 67.6286μs 14.7866 KOps/s 15.2231 KOps/s $\color{#d91a1a}-2.87\%$
test_compile_add_one_nested[tensordict-eager] 1.2952ms 0.1723ms 5.8028 KOps/s 5.9028 KOps/s $\color{#d91a1a}-1.69\%$
test_compile_add_one_nested[pytree-compile] 0.2030ms 47.0969μs 21.2328 KOps/s 22.1178 KOps/s $\color{#d91a1a}-4.00\%$
test_compile_add_one_nested[pytree-eager] 0.2433ms 0.1169ms 8.5562 KOps/s 8.3979 KOps/s $\color{#35bf28}+1.89\%$
test_compile_copy_nested[tensordict-compile] 0.1042ms 28.8206μs 34.6974 KOps/s 36.3074 KOps/s $\color{#d91a1a}-4.43\%$
test_compile_copy_nested[tensordict-eager] 0.1217ms 58.2831μs 17.1576 KOps/s 16.5841 KOps/s $\color{#35bf28}+3.46\%$
test_compile_copy_nested[pytree-compile] 0.1652ms 77.5121μs 12.9012 KOps/s 12.5391 KOps/s $\color{#35bf28}+2.89\%$
test_compile_copy_nested[pytree-eager] 0.1226ms 65.3761μs 15.2961 KOps/s 15.0415 KOps/s $\color{#35bf28}+1.69\%$
test_compile_add_one_flat[tensordict-compile] 0.2032ms 0.1069ms 9.3565 KOps/s 9.5089 KOps/s $\color{#d91a1a}-1.60\%$
test_compile_add_one_flat[tensordict-eager] 0.4030ms 0.2160ms 4.6296 KOps/s 4.6363 KOps/s $\color{#d91a1a}-0.14\%$
test_compile_add_one_flat[tensorclass-compile] 0.1963ms 48.0967μs 20.7915 KOps/s 21.5145 KOps/s $\color{#d91a1a}-3.36\%$
test_compile_add_one_flat[tensorclass-eager] 0.1424ms 66.4191μs 15.0559 KOps/s 14.9921 KOps/s $\color{#35bf28}+0.43\%$
test_compile_add_one_flat[pytree-compile] 0.2164ms 99.7040μs 10.0297 KOps/s 10.0162 KOps/s $\color{#35bf28}+0.14\%$
test_compile_add_one_flat[pytree-eager] 0.4672ms 0.2024ms 4.9397 KOps/s 4.9553 KOps/s $\color{#d91a1a}-0.31\%$
test_compile_add_self_flat[tensordict-eager] 0.4952ms 0.2314ms 4.3210 KOps/s 4.3425 KOps/s $\color{#d91a1a}-0.50\%$
test_compile_add_self_flat[tensordict-compile] 0.2334ms 0.1112ms 8.9902 KOps/s 9.5904 KOps/s $\textbf{\color{#d91a1a}-6.26\%}$
test_compile_add_self_flat[tensorclass-eager] 0.2609ms 63.9969μs 15.6257 KOps/s 16.2594 KOps/s $\color{#d91a1a}-3.90\%$
test_compile_add_self_flat[tensorclass-compile] 0.3170ms 48.7509μs 20.5125 KOps/s 21.3754 KOps/s $\color{#d91a1a}-4.04\%$
test_compile_add_self_flat[pytree-eager] 0.2507ms 0.1573ms 6.3566 KOps/s 6.3883 KOps/s $\color{#d91a1a}-0.50\%$
test_compile_add_self_flat[pytree-compile] 0.2461ms 0.1014ms 9.8609 KOps/s 10.1082 KOps/s $\color{#d91a1a}-2.45\%$
test_compile_copy_flat[tensordict-compile] 0.1021ms 21.0833μs 47.4309 KOps/s 48.0320 KOps/s $\color{#d91a1a}-1.25\%$
test_compile_copy_flat[tensordict-eager] 0.1577ms 66.8219μs 14.9652 KOps/s 15.2502 KOps/s $\color{#d91a1a}-1.87\%$
test_compile_copy_flat[pytree-compile] 0.1593ms 81.2883μs 12.3019 KOps/s 12.4960 KOps/s $\color{#d91a1a}-1.55\%$
test_compile_copy_flat[pytree-eager] 0.1540ms 67.1775μs 14.8859 KOps/s 15.0698 KOps/s $\color{#d91a1a}-1.22\%$
test_compile_assign_and_add[tensordict-compile] 0.2867ms 0.2151ms 4.6491 KOps/s 4.7377 KOps/s $\color{#d91a1a}-1.87\%$
test_compile_assign_and_add[tensordict-eager] 1.7632ms 1.3818ms 723.6811 Ops/s 722.0411 Ops/s $\color{#35bf28}+0.23\%$
test_compile_assign_and_add[pytree-compile] 0.3277ms 0.2108ms 4.7445 KOps/s 4.9025 KOps/s $\color{#d91a1a}-3.22\%$
test_compile_assign_and_add[pytree-eager] 0.9170ms 0.8261ms 1.2104 KOps/s 1.2112 KOps/s $\color{#d91a1a}-0.06\%$
test_compile_assign_and_add_stack[compile] 0.5842ms 0.4586ms 2.1805 KOps/s 2.2314 KOps/s $\color{#d91a1a}-2.28\%$
test_compile_assign_and_add_stack[eager] 3.5769ms 2.8235ms 354.1700 Ops/s 370.8073 Ops/s $\color{#d91a1a}-4.49\%$
test_compile_indexing[tensor-tensordict-compile] 0.1152ms 40.3059μs 24.8103 KOps/s 26.3024 KOps/s $\textbf{\color{#d91a1a}-5.67\%}$
test_compile_indexing[tensor-tensordict-eager] 0.5659ms 33.7511μs 29.6286 KOps/s 30.1424 KOps/s $\color{#d91a1a}-1.70\%$
test_compile_indexing[tensor-tensorclass-compile] 76.8130μs 31.5524μs 31.6933 KOps/s 32.7168 KOps/s $\color{#d91a1a}-3.13\%$
test_compile_indexing[tensor-tensorclass-eager] 0.1181ms 22.5860μs 44.2752 KOps/s 43.2656 KOps/s $\color{#35bf28}+2.33\%$
test_compile_indexing[tensor-pytree-compile] 0.1154ms 32.6349μs 30.6421 KOps/s 31.7593 KOps/s $\color{#d91a1a}-3.52\%$
test_compile_indexing[tensor-pytree-eager] 63.2280μs 22.1554μs 45.1356 KOps/s 43.1624 KOps/s $\color{#35bf28}+4.57\%$
test_compile_indexing[slice-tensordict-compile] 0.1114ms 53.0138μs 18.8630 KOps/s 18.6741 KOps/s $\color{#35bf28}+1.01\%$
test_compile_indexing[slice-tensordict-eager] 0.4613ms 20.0634μs 49.8420 KOps/s 49.2065 KOps/s $\color{#35bf28}+1.29\%$
test_compile_indexing[slice-tensorclass-compile] 95.1270μs 45.7950μs 21.8365 KOps/s 22.5420 KOps/s $\color{#d91a1a}-3.13\%$
test_compile_indexing[slice-tensorclass-eager] 94.5490μs 18.9913μs 52.6558 KOps/s 53.4527 KOps/s $\color{#d91a1a}-1.49\%$
test_compile_indexing[slice-pytree-compile] 0.1076ms 46.7625μs 21.3847 KOps/s 22.0683 KOps/s $\color{#d91a1a}-3.10\%$
test_compile_indexing[slice-pytree-eager] 83.8960μs 18.2671μs 54.7433 KOps/s 53.6742 KOps/s $\color{#35bf28}+1.99\%$
test_compile_indexing[int-tensordict-compile] 0.1324ms 54.6673μs 18.2925 KOps/s 18.8128 KOps/s $\color{#d91a1a}-2.77\%$
test_compile_indexing[int-tensordict-eager] 0.8534ms 19.8560μs 50.3626 KOps/s 50.6036 KOps/s $\color{#d91a1a}-0.48\%$
test_compile_indexing[int-tensorclass-compile] 0.1192ms 47.2124μs 21.1809 KOps/s 22.0102 KOps/s $\color{#d91a1a}-3.77\%$
test_compile_indexing[int-tensorclass-eager] 73.8170μs 18.5568μs 53.8886 KOps/s 53.4697 KOps/s $\color{#35bf28}+0.78\%$
test_compile_indexing[int-pytree-compile] 0.1169ms 47.2682μs 21.1559 KOps/s 21.9257 KOps/s $\color{#d91a1a}-3.51\%$
test_compile_indexing[int-pytree-eager] 79.0770μs 18.1244μs 55.1743 KOps/s 53.7966 KOps/s $\color{#35bf28}+2.56\%$
test_mod_add[eager] 90.1880μs 37.4838μs 26.6782 KOps/s 28.2482 KOps/s $\textbf{\color{#d91a1a}-5.56\%}$
test_mod_add[compile] 0.1343ms 67.2020μs 14.8805 KOps/s 15.6907 KOps/s $\textbf{\color{#d91a1a}-5.16\%}$
test_mod_add[compile-overhead] 0.1225ms 63.2403μs 15.8127 KOps/s 15.8378 KOps/s $\color{#d91a1a}-0.16\%$
test_mod_wrap[eager] 0.4618ms 0.2248ms 4.4481 KOps/s 4.5261 KOps/s $\color{#d91a1a}-1.72\%$
test_mod_wrap[compile] 2.0559ms 0.2319ms 4.3124 KOps/s 4.4440 KOps/s $\color{#d91a1a}-2.96\%$
test_mod_wrap[compile-overhead] 0.3531ms 0.2264ms 4.4174 KOps/s 4.5451 KOps/s $\color{#d91a1a}-2.81\%$
test_mod_wrap_and_backward[eager] 16.7989ms 13.1256ms 76.1872 Ops/s 77.5695 Ops/s $\color{#d91a1a}-1.78\%$
test_mod_wrap_and_backward[compile] 13.4150ms 11.3727ms 87.9298 Ops/s 88.5616 Ops/s $\color{#d91a1a}-0.71\%$
test_mod_wrap_and_backward[compile-overhead] 23.5749ms 11.5792ms 86.3618 Ops/s 87.2851 Ops/s $\color{#d91a1a}-1.06\%$
test_seq_add[eager] 0.2161ms 0.1220ms 8.1979 KOps/s 8.2828 KOps/s $\color{#d91a1a}-1.02\%$
test_seq_add[compile] 0.1578ms 79.5699μs 12.5676 KOps/s 13.2035 KOps/s $\color{#d91a1a}-4.82\%$
test_seq_add[compile-overhead] 0.1375ms 78.0119μs 12.8186 KOps/s 13.4170 KOps/s $\color{#d91a1a}-4.46\%$
test_seq_wrap[eager] 0.7208ms 0.4484ms 2.2301 KOps/s 2.2284 KOps/s $\color{#35bf28}+0.08\%$
test_seq_wrap[compile] 0.8672ms 0.2471ms 4.0476 KOps/s 4.2309 KOps/s $\color{#d91a1a}-4.33\%$
test_seq_wrap[compile-overhead] 0.3958ms 0.2437ms 4.1038 KOps/s 4.2260 KOps/s $\color{#d91a1a}-2.89\%$
test_func_call_runtime[False-eager] 0.9624ms 0.5333ms 1.8750 KOps/s 1.8190 KOps/s $\color{#35bf28}+3.07\%$
test_func_call_runtime[False-compile] 0.8576ms 0.4461ms 2.2415 KOps/s 2.2973 KOps/s $\color{#d91a1a}-2.43\%$
test_func_call_runtime[False-compile-overhead] 0.8431ms 0.4454ms 2.2451 KOps/s 2.3153 KOps/s $\color{#d91a1a}-3.03\%$
test_func_call_runtime[True-eager] 0.8583ms 0.7457ms 1.3410 KOps/s 1.3364 KOps/s $\color{#35bf28}+0.35\%$
test_func_call_runtime[True-compile] 0.7884ms 0.4666ms 2.1433 KOps/s 2.2272 KOps/s $\color{#d91a1a}-3.77\%$
test_func_call_runtime[True-compile-overhead] 0.7212ms 0.4666ms 2.1432 KOps/s 2.2284 KOps/s $\color{#d91a1a}-3.83\%$
test_func_call_cm_runtime[False-eager] 0.8586ms 0.5275ms 1.8958 KOps/s 1.8600 KOps/s $\color{#35bf28}+1.93\%$
test_func_call_cm_runtime[False-compile] 0.5655ms 0.4417ms 2.2639 KOps/s 2.3147 KOps/s $\color{#d91a1a}-2.19\%$
test_func_call_cm_runtime[False-compile-overhead] 0.7470ms 0.4448ms 2.2482 KOps/s 2.3240 KOps/s $\color{#d91a1a}-3.26\%$
test_func_call_cm_runtime[True-eager] 1.4369ms 0.8966ms 1.1154 KOps/s 1.1146 KOps/s $\color{#35bf28}+0.07\%$
test_func_call_cm_runtime[True-compile] 1.2765ms 0.7944ms 1.2587 KOps/s 1.2538 KOps/s $\color{#35bf28}+0.40\%$
test_func_call_cm_runtime[True-compile-overhead] 1.1562ms 0.7937ms 1.2600 KOps/s 1.2425 KOps/s $\color{#35bf28}+1.41\%$
test_vmap_func_call_cm_runtime[eager] 2.6785ms 1.8930ms 528.2569 Ops/s 525.0149 Ops/s $\color{#35bf28}+0.62\%$
test_vmap_func_call_cm_runtime[compile] 0.6714ms 0.5446ms 1.8363 KOps/s 1.8703 KOps/s $\color{#d91a1a}-1.82\%$
test_vmap_func_call_cm_runtime[compile-overhead] 1.0250ms 0.5424ms 1.8437 KOps/s 1.8497 KOps/s $\color{#d91a1a}-0.33\%$
test_distributed 0.2519ms 0.1237ms 8.0861 KOps/s 7.8273 KOps/s $\color{#35bf28}+3.31\%$
test_tdmodule 65.8830μs 27.9351μs 35.7973 KOps/s 36.5724 KOps/s $\color{#d91a1a}-2.12\%$
test_tdmodule_dispatch 98.2130μs 51.0001μs 19.6078 KOps/s 20.3048 KOps/s $\color{#d91a1a}-3.43\%$
test_tdseq 46.8980μs 28.8126μs 34.7071 KOps/s 34.9633 KOps/s $\color{#d91a1a}-0.73\%$
test_tdseq_dispatch 90.4090μs 55.3411μs 18.0697 KOps/s 18.3816 KOps/s $\color{#d91a1a}-1.70\%$
test_instantiation_functorch 1.7533ms 1.5169ms 659.2431 Ops/s 660.2769 Ops/s $\color{#d91a1a}-0.16\%$
test_exec_functorch 0.3873ms 0.1775ms 5.6337 KOps/s 5.6425 KOps/s $\color{#d91a1a}-0.16\%$
test_exec_functional_call 0.4193ms 0.1701ms 5.8774 KOps/s 5.9593 KOps/s $\color{#d91a1a}-1.38\%$
test_exec_td_decorator 0.5216ms 0.2319ms 4.3117 KOps/s 4.1910 KOps/s $\color{#35bf28}+2.88\%$
test_vmap_mlp_speed_decorator[True-True] 0.9217ms 0.6575ms 1.5209 KOps/s 1.5271 KOps/s $\color{#d91a1a}-0.40\%$
test_vmap_mlp_speed_decorator[True-False] 1.0037ms 0.6666ms 1.5001 KOps/s 1.5321 KOps/s $\color{#d91a1a}-2.09\%$
test_vmap_mlp_speed_decorator[False-True] 0.7525ms 0.5265ms 1.8992 KOps/s 1.8471 KOps/s $\color{#35bf28}+2.82\%$
test_vmap_mlp_speed_decorator[False-False] 0.7769ms 0.5293ms 1.8892 KOps/s 1.9017 KOps/s $\color{#d91a1a}-0.65\%$
test_to_module_speed[True] 2.1472ms 1.3232ms 755.7208 Ops/s 757.5219 Ops/s $\color{#d91a1a}-0.24\%$
test_to_module_speed[False] 1.4540ms 1.2847ms 778.4065 Ops/s 769.2371 Ops/s $\color{#35bf28}+1.19\%$
test_tc_init 0.1013ms 48.4820μs 20.6262 KOps/s 22.1269 KOps/s $\textbf{\color{#d91a1a}-6.78\%}$
test_tc_init_nested 0.2215ms 96.3698μs 10.3767 KOps/s 11.0897 KOps/s $\textbf{\color{#d91a1a}-6.43\%}$
test_tc_first_layer_tensor 18.8250μs 1.5457μs 646.9390 KOps/s 632.3299 KOps/s $\color{#35bf28}+2.31\%$
test_tc_first_layer_nontensor 26.2980μs 4.6956μs 212.9650 KOps/s 215.6281 KOps/s $\color{#d91a1a}-1.24\%$
test_tc_second_layer_tensor 28.1920μs 2.9301μs 341.2892 KOps/s 343.7539 KOps/s $\color{#d91a1a}-0.72\%$
test_tc_second_layer_nontensor 27.1900μs 6.0048μs 166.5328 KOps/s 166.2796 KOps/s $\color{#35bf28}+0.15\%$
test_unbind 0.2415s 13.3193ms 75.0792 Ops/s 69.6191 Ops/s $\textbf{\color{#35bf28}+7.84\%}$
test_full_like 10.0623ms 9.1869ms 108.8508 Ops/s 130.0277 Ops/s $\textbf{\color{#d91a1a}-16.29\%}$
test_zeros_like 5.4620ms 2.8719ms 348.1979 Ops/s 328.6395 Ops/s $\textbf{\color{#35bf28}+5.95\%}$
test_ones_like 6.3716ms 3.5266ms 283.5588 Ops/s 277.0852 Ops/s $\color{#35bf28}+2.34\%$
test_clone 8.7912ms 7.0195ms 142.4597 Ops/s 178.4710 Ops/s $\textbf{\color{#d91a1a}-20.18\%}$
test_squeeze 60.3630μs 12.7242μs 78.5902 KOps/s 77.1991 KOps/s $\color{#35bf28}+1.80\%$
test_unsqueeze 0.3020ms 96.2662μs 10.3879 KOps/s 10.6901 KOps/s $\color{#d91a1a}-2.83\%$
test_split 0.3536ms 0.1984ms 5.0412 KOps/s 5.1385 KOps/s $\color{#d91a1a}-1.89\%$
test_permute 0.3256ms 0.2030ms 4.9271 KOps/s 4.9457 KOps/s $\color{#d91a1a}-0.37\%$
test_stack 33.5030ms 25.1353ms 39.7847 Ops/s 38.3561 Ops/s $\color{#35bf28}+3.72\%$
test_cat 25.6903ms 25.1244ms 39.8019 Ops/s 38.7463 Ops/s $\color{#35bf28}+2.72\%$
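
The Change column and the Improved/Worsened totals above appear to follow a simple rule, which can be sketched as follows. This is a minimal illustration, not the benchmark workflow's actual code: it assumes Change is the relative difference between the Ops column and the Ops on Repo HEAD column, and that a row counts as improved or worsened only when that difference exceeds 5% in magnitude. The 5% threshold is inferred from which rows are bolded in this table, and the function names are hypothetical.

```python
def relative_change(ops: float, ops_head: float) -> float:
    """Percent change of this PR's throughput relative to repo HEAD.

    Both values must use the same unit (e.g. both in KOps/s).
    """
    return (ops - ops_head) / ops_head * 100.0


def classify(change_pct: float, threshold: float = 5.0) -> str:
    """Label a row; the 5% threshold is an assumption inferred from the table."""
    if change_pct > threshold:
        return "improved"
    if change_pct < -threshold:
        return "worsened"
    return "within noise"


# (Ops, Ops on Repo HEAD) pairs copied from three rows of the CPU table.
rows = {
    "test_plain_set_nested": (47.1972, 47.4125),
    "test_common_ops": (1.1810, 1.2574),
    "test_unbind_speed_stack1": (1.3861, 1.1924),
}

for name, (ops, ops_head) in rows.items():
    change = relative_change(ops, ops_head)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Values recomputed this way can differ from the posted column in the last digit, because the table shows rounded throughput figures.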

[ghstack-poisoned]
vmoens added a commit that referenced this pull request Feb 25, 2025
ghstack-source-id: ea71af4f2eb5813bc5b25ed595edda0cf4fa1438
Pull Request resolved: #1236

github-actions bot commented Feb 25, 2025

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 229. Improved: $\large\color{#35bf28}13$. Worsened: $\large\color{#d91a1a}42$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 34.3310μs 13.0562μs 76.5917 KOps/s 82.2422 KOps/s $\textbf{\color{#d91a1a}-6.87\%}$
test_plain_set_stack_nested 46.9210μs 13.2046μs 75.7311 KOps/s 81.8239 KOps/s $\textbf{\color{#d91a1a}-7.45\%}$
test_plain_set_nested_inplace 45.5710μs 14.2935μs 69.9618 KOps/s 75.1244 KOps/s $\textbf{\color{#d91a1a}-6.87\%}$
test_plain_set_stack_nested_inplace 43.0810μs 14.1508μs 70.6675 KOps/s 76.2230 KOps/s $\textbf{\color{#d91a1a}-7.29\%}$
test_items 29.7310μs 2.8711μs 348.3036 KOps/s 341.3312 KOps/s $\color{#35bf28}+2.04\%$
test_items_nested 0.4136ms 0.3640ms 2.7471 KOps/s 2.7415 KOps/s $\color{#35bf28}+0.20\%$
test_items_nested_locked 0.4106ms 0.3641ms 2.7465 KOps/s 2.7342 KOps/s $\color{#35bf28}+0.45\%$
test_items_nested_leaf 82.0420μs 60.1865μs 16.6150 KOps/s 16.5860 KOps/s $\color{#35bf28}+0.18\%$
test_items_stack_nested 0.4175ms 0.3608ms 2.7713 KOps/s 2.8062 KOps/s $\color{#d91a1a}-1.24\%$
test_items_stack_nested_leaf 83.7420μs 59.9731μs 16.6742 KOps/s 16.5464 KOps/s $\color{#35bf28}+0.77\%$
test_items_stack_nested_locked 0.4009ms 0.3630ms 2.7550 KOps/s 2.7567 KOps/s $\color{#d91a1a}-0.06\%$
test_keys 28.1500μs 3.3980μs 294.2924 KOps/s 292.9632 KOps/s $\color{#35bf28}+0.45\%$
test_keys_nested 0.1145ms 88.5165μs 11.2973 KOps/s 11.2489 KOps/s $\color{#35bf28}+0.43\%$
test_keys_nested_locked 0.7802ms 93.9334μs 10.6458 KOps/s 10.6213 KOps/s $\color{#35bf28}+0.23\%$
test_keys_nested_leaf 0.1114ms 79.5337μs 12.5733 KOps/s 12.5669 KOps/s $\color{#35bf28}+0.05\%$
test_keys_stack_nested 0.1222ms 87.9784μs 11.3664 KOps/s 11.3551 KOps/s $\color{#35bf28}+0.10\%$
test_keys_stack_nested_leaf 0.1664ms 78.9582μs 12.6649 KOps/s 12.7020 KOps/s $\color{#d91a1a}-0.29\%$
test_keys_stack_nested_locked 0.1201ms 93.3872μs 10.7081 KOps/s 10.5933 KOps/s $\color{#35bf28}+1.08\%$
test_values 5.3550μs 0.8519μs 1.1738 MOps/s 1.1696 MOps/s $\color{#35bf28}+0.36\%$
test_values_nested 62.8810μs 37.0518μs 26.9892 KOps/s 27.0121 KOps/s $\color{#d91a1a}-0.08\%$
test_values_nested_locked 66.3120μs 39.0041μs 25.6383 KOps/s 25.4905 KOps/s $\color{#35bf28}+0.58\%$
test_values_nested_leaf 66.8210μs 42.1176μs 23.7430 KOps/s 23.4027 KOps/s $\color{#35bf28}+1.45\%$
test_values_stack_nested 0.1312ms 37.0032μs 27.0247 KOps/s 26.7640 KOps/s $\color{#35bf28}+0.97\%$
test_values_stack_nested_leaf 86.0720μs 41.9533μs 23.8360 KOps/s 23.4783 KOps/s $\color{#35bf28}+1.52\%$
test_values_stack_nested_locked 63.8810μs 38.7440μs 25.8105 KOps/s 25.6085 KOps/s $\color{#35bf28}+0.79\%$
test_membership 4.3111μs 0.4986μs 2.0058 MOps/s 1.9871 MOps/s $\color{#35bf28}+0.94\%$
test_membership_nested 15.8405μs 1.9572μs 510.9226 KOps/s 515.5777 KOps/s $\color{#d91a1a}-0.90\%$
test_membership_nested_leaf 19.9405μs 1.9644μs 509.0641 KOps/s 514.0159 KOps/s $\color{#d91a1a}-0.96\%$
test_membership_stacked_nested 28.2510μs 2.0366μs 491.0203 KOps/s 489.1062 KOps/s $\color{#35bf28}+0.39\%$
test_membership_stacked_nested_leaf 26.6510μs 2.0277μs 493.1680 KOps/s 498.3859 KOps/s $\color{#d91a1a}-1.05\%$
test_membership_nested_last 30.9210μs 3.0908μs 323.5423 KOps/s 334.1193 KOps/s $\color{#d91a1a}-3.17\%$
test_membership_nested_leaf_last 29.4810μs 3.0513μs 327.7325 KOps/s 336.2933 KOps/s $\color{#d91a1a}-2.55\%$
test_membership_stacked_nested_last 35.2410μs 2.9951μs 333.8763 KOps/s 330.6845 KOps/s $\color{#35bf28}+0.97\%$
test_membership_stacked_nested_leaf_last 24.0400μs 2.9711μs 336.5728 KOps/s 338.8160 KOps/s $\color{#d91a1a}-0.66\%$
test_nested_getleaf 30.8500μs 6.2098μs 161.0347 KOps/s 160.8394 KOps/s $\color{#35bf28}+0.12\%$
test_nested_get 26.4700μs 5.9407μs 168.3293 KOps/s 169.3171 KOps/s $\color{#d91a1a}-0.58\%$
test_stacked_getleaf 34.9110μs 6.1109μs 163.6422 KOps/s 163.7771 KOps/s $\color{#d91a1a}-0.08\%$
test_stacked_get 31.9110μs 5.7964μs 172.5195 KOps/s 173.7930 KOps/s $\color{#d91a1a}-0.73\%$
test_nested_getitemleaf 43.9010μs 6.4195μs 155.7764 KOps/s 154.3585 KOps/s $\color{#35bf28}+0.92\%$
test_nested_getitem 34.9900μs 6.0807μs 164.4549 KOps/s 163.6712 KOps/s $\color{#35bf28}+0.48\%$
test_stacked_getitemleaf 55.0010μs 6.2938μs 158.8874 KOps/s 156.7052 KOps/s $\color{#35bf28}+1.39\%$
test_stacked_getitem 35.2610μs 5.9920μs 166.8903 KOps/s 166.4132 KOps/s $\color{#35bf28}+0.29\%$
test_lock_nested 2.0289ms 0.3381ms 2.9576 KOps/s 3.0154 KOps/s $\color{#d91a1a}-1.92\%$
test_lock_stack_nested 0.3820ms 0.3407ms 2.9352 KOps/s 2.9437 KOps/s $\color{#d91a1a}-0.29\%$
test_unlock_nested 0.3499ms 0.2814ms 3.5535 KOps/s 3.6552 KOps/s $\color{#d91a1a}-2.78\%$
test_unlock_stack_nested 0.3415ms 0.2795ms 3.5782 KOps/s 3.6326 KOps/s $\color{#d91a1a}-1.50\%$
test_flatten_speed 0.1160ms 76.9065μs 13.0028 KOps/s 13.0102 KOps/s $\color{#d91a1a}-0.06\%$
test_unflatten_speed 0.3847ms 0.3191ms 3.1335 KOps/s 3.1521 KOps/s $\color{#d91a1a}-0.59\%$
test_common_ops 0.8062ms 0.6349ms 1.5751 KOps/s 1.7143 KOps/s $\textbf{\color{#d91a1a}-8.12\%}$
test_creation 0.1391ms 1.6975μs 589.0920 KOps/s 586.7819 KOps/s $\color{#35bf28}+0.39\%$
test_creation_empty 32.1200μs 9.6446μs 103.6852 KOps/s 132.8862 KOps/s $\textbf{\color{#d91a1a}-21.97\%}$
test_creation_nested_1 41.9610μs 11.3829μs 87.8513 KOps/s 110.5945 KOps/s $\textbf{\color{#d91a1a}-20.56\%}$
test_creation_nested_2 40.2010μs 13.9880μs 71.4896 KOps/s 84.2459 KOps/s $\textbf{\color{#d91a1a}-15.14\%}$
test_clone 56.7310μs 10.9061μs 91.6918 KOps/s 93.3755 KOps/s $\color{#d91a1a}-1.80\%$
test_getitem[int] 1.2683ms 10.5391μs 94.8848 KOps/s 94.9325 KOps/s $\color{#d91a1a}-0.05\%$
test_getitem[slice_int] 0.1194ms 20.6078μs 48.5253 KOps/s 49.6242 KOps/s $\color{#d91a1a}-2.21\%$
test_getitem[range] 0.1281ms 35.9271μs 27.8341 KOps/s 28.1202 KOps/s $\color{#d91a1a}-1.02\%$
test_getitem[tuple] 0.1113ms 17.7557μs 56.3199 KOps/s 56.3065 KOps/s $\color{#35bf28}+0.02\%$
test_getitem[list] 0.1256ms 32.0520μs 31.1993 KOps/s 31.5324 KOps/s $\color{#d91a1a}-1.06\%$
test_setitem_dim[int] 38.7900μs 19.1385μs 52.2506 KOps/s 53.0483 KOps/s $\color{#d91a1a}-1.50\%$
test_setitem_dim[slice_int] 57.8910μs 36.6035μs 27.3198 KOps/s 26.8280 KOps/s $\color{#35bf28}+1.83\%$
test_setitem_dim[range] 73.6110μs 50.7867μs 19.6902 KOps/s 19.4834 KOps/s $\color{#35bf28}+1.06\%$
test_setitem_dim[tuple] 53.9710μs 31.9656μs 31.2837 KOps/s 31.7877 KOps/s $\color{#d91a1a}-1.59\%$
test_setitem 69.8120μs 16.1835μs 61.7915 KOps/s 68.5281 KOps/s $\textbf{\color{#d91a1a}-9.83\%}$
test_set 70.6110μs 15.4804μs 64.5976 KOps/s 71.5481 KOps/s $\textbf{\color{#d91a1a}-9.71\%}$
test_set_shared 0.5099ms 0.1570ms 6.3695 KOps/s 6.4408 KOps/s $\color{#d91a1a}-1.11\%$
test_update 0.2337ms 19.5079μs 51.2612 KOps/s 59.5093 KOps/s $\textbf{\color{#d91a1a}-13.86\%}$
test_update_nested 79.9120μs 25.6475μs 38.9901 KOps/s 45.2673 KOps/s $\textbf{\color{#d91a1a}-13.87\%}$
test_update__nested 0.4973ms 24.9689μs 40.0498 KOps/s 39.6712 KOps/s $\color{#35bf28}+0.95\%$
test_set_nested 71.6810μs 17.1776μs 58.2154 KOps/s 64.2209 KOps/s $\textbf{\color{#d91a1a}-9.35\%}$
test_set_nested_new 78.8920μs 19.3755μs 51.6117 KOps/s 55.7376 KOps/s $\textbf{\color{#d91a1a}-7.40\%}$
test_select 90.7120μs 31.5851μs 31.6605 KOps/s 34.7791 KOps/s $\textbf{\color{#d91a1a}-8.97\%}$
test_select_nested 72.1510μs 43.4839μs 22.9970 KOps/s 22.8528 KOps/s $\color{#35bf28}+0.63\%$
test_exclude_nested 92.0320μs 61.5167μs 16.2558 KOps/s 16.4224 KOps/s $\color{#d91a1a}-1.01\%$
test_empty[True] 0.4003ms 0.2915ms 3.4309 KOps/s 3.4357 KOps/s $\color{#d91a1a}-0.14\%$
test_empty[False] 4.1220μs 0.8192μs 1.2207 MOps/s 1.2053 MOps/s $\color{#35bf28}+1.27\%$
test_to 88.0510μs 55.7634μs 17.9329 KOps/s 17.7252 KOps/s $\color{#35bf28}+1.17\%$
test_to_nonblocking 95.1720μs 47.0498μs 21.2541 KOps/s 21.1172 KOps/s $\color{#35bf28}+0.65\%$
test_unbind_speed 0.2745ms 0.2394ms 4.1770 KOps/s 4.2636 KOps/s $\color{#d91a1a}-2.03\%$
test_unbind_speed_stack0 0.2832ms 0.2341ms 4.2711 KOps/s 4.3028 KOps/s $\color{#d91a1a}-0.74\%$
test_unbind_speed_stack1 92.3942ms 0.7341ms 1.3622 KOps/s 1.3652 KOps/s $\color{#d91a1a}-0.22\%$
test_split 94.8370ms 1.5754ms 634.7607 Ops/s 642.3276 Ops/s $\color{#d91a1a}-1.18\%$
test_chunk 94.2265ms 1.5734ms 635.5625 Ops/s 638.8252 Ops/s $\color{#d91a1a}-0.51\%$
test_consolidate[False-None] 96.3946ms 2.9234ms 342.0616 Ops/s 344.7759 Ops/s $\color{#d91a1a}-0.79\%$
test_consolidate[default-None] 1.7658ms 1.6923ms 590.9120 Ops/s 607.8961 Ops/s $\color{#d91a1a}-2.79\%$
test_consolidate[reduce-overhead-None] 1.7539ms 1.6963ms 589.5296 Ops/s 589.6191 Ops/s $\color{#d91a1a}-0.02\%$
test_consolidate_njt[False-None] 6.7002ms 6.3927ms 156.4295 Ops/s 156.2226 Ops/s $\color{#35bf28}+0.13\%$
test_to[False-False-None] 1.8348ms 1.7459ms 572.7753 Ops/s 582.5018 Ops/s $\color{#d91a1a}-1.67\%$
test_to[True-False-None] 1.5569ms 1.3294ms 752.1968 Ops/s 774.7796 Ops/s $\color{#d91a1a}-2.91\%$
test_to[within-False-None] 4.3408ms 4.1261ms 242.3574 Ops/s 247.5707 Ops/s $\color{#d91a1a}-2.11\%$
test_to[True-default-None] 5.7207ms 5.2966ms 188.7991 Ops/s 193.2427 Ops/s $\color{#d91a1a}-2.30\%$
test_to_njt[False-False-None] 6.9073ms 6.7709ms 147.6906 Ops/s 142.1001 Ops/s $\color{#35bf28}+3.93\%$
test_to_njt[True-False-None] 5.5416ms 5.3551ms 186.7367 Ops/s 177.3835 Ops/s $\textbf{\color{#35bf28}+5.27\%}$
test_to_njt[within-False-None] 12.0294ms 11.5749ms 86.3936 Ops/s 83.9690 Ops/s $\color{#35bf28}+2.89\%$
test_creation[device0] 0.5483ms 82.6365μs 12.1012 KOps/s 12.1267 KOps/s $\color{#d91a1a}-0.21\%$
test_creation_from_tensor 0.5075ms 84.7327μs 11.8018 KOps/s 11.5977 KOps/s $\color{#35bf28}+1.76\%$
test_add_one[memmap_tensor0] 0.4423ms 6.7959μs 147.1475 KOps/s 150.1190 KOps/s $\color{#d91a1a}-1.98\%$
test_contiguous[memmap_tensor0] 2.2291μs 0.4634μs 2.1578 MOps/s 2.4155 MOps/s $\textbf{\color{#d91a1a}-10.67\%}$
test_stack[memmap_tensor0] 43.3610μs 4.2608μs 234.6952 KOps/s 236.5954 KOps/s $\color{#d91a1a}-0.80\%$
test_memmaptd_index 1.5224ms 0.2402ms 4.1636 KOps/s 4.2226 KOps/s $\color{#d91a1a}-1.40\%$
test_memmaptd_index_astensor 0.4317ms 0.3027ms 3.3032 KOps/s 3.3801 KOps/s $\color{#d91a1a}-2.27\%$
test_memmaptd_index_op 0.7168ms 0.5865ms 1.7050 KOps/s 1.8139 KOps/s $\textbf{\color{#d91a1a}-6.00\%}$
test_serialize_model 0.1315s 0.1307s 7.6489 Ops/s 7.6291 Ops/s $\color{#35bf28}+0.26\%$
test_serialize_model_pickle 1.3508s 1.2113s 0.8256 Ops/s 0.8441 Ops/s $\color{#d91a1a}-2.19\%$
test_serialize_weights 0.2818s 0.1516s 6.5953 Ops/s 7.6659 Ops/s $\textbf{\color{#d91a1a}-13.97\%}$
test_serialize_weights_returnearly 0.3313s 53.4598ms 18.7056 Ops/s 11.7420 Ops/s $\textbf{\color{#35bf28}+59.30\%}$
test_serialize_weights_pickle 1.3834s 1.1916s 0.8392 Ops/s 0.8219 Ops/s $\color{#35bf28}+2.11\%$
test_reshape_pytree 68.9110μs 23.3210μs 42.8797 KOps/s 45.8770 KOps/s $\textbf{\color{#d91a1a}-6.53\%}$
test_reshape_td 62.3310μs 29.5097μs 33.8872 KOps/s 38.1605 KOps/s $\textbf{\color{#d91a1a}-11.20\%}$
test_view_pytree 57.2510μs 22.9984μs 43.4813 KOps/s 46.9702 KOps/s $\textbf{\color{#d91a1a}-7.43\%}$
test_view_td 69.9210μs 34.8409μs 28.7019 KOps/s 30.5299 KOps/s $\textbf{\color{#d91a1a}-5.99\%}$
test_unbind_pytree 60.4510μs 29.3571μs 34.0634 KOps/s 36.1095 KOps/s $\textbf{\color{#d91a1a}-5.67\%}$
test_unbind_td 0.6892ms 37.2687μs 26.8321 KOps/s 27.1219 KOps/s $\color{#d91a1a}-1.07\%$
test_split_pytree 64.9910μs 32.1113μs 31.1416 KOps/s 33.6919 KOps/s $\textbf{\color{#d91a1a}-7.57\%}$
test_split_td 0.8228ms 37.2050μs 26.8781 KOps/s 25.7887 KOps/s $\color{#35bf28}+4.22\%$
test_add_pytree 73.0420μs 34.7075μs 28.8122 KOps/s 29.6139 KOps/s $\color{#d91a1a}-2.71\%$
test_add_td 0.1918ms 53.6110μs 18.6529 KOps/s 21.3055 KOps/s $\textbf{\color{#d91a1a}-12.45\%}$
test_compile_add_one_nested[tensordict-compile] 0.1703ms 0.1187ms 8.4254 KOps/s 8.5080 KOps/s $\color{#d91a1a}-0.97\%$
test_compile_add_one_nested[tensordict-eager] 0.2225ms 0.1320ms 7.5754 KOps/s 7.5613 KOps/s $\color{#35bf28}+0.19\%$
test_compile_add_one_nested[pytree-compile] 0.1520ms 92.2934μs 10.8350 KOps/s 10.8608 KOps/s $\color{#d91a1a}-0.24\%$
test_compile_add_one_nested[pytree-eager] 0.3263ms 0.1460ms 6.8486 KOps/s 6.7671 KOps/s $\color{#35bf28}+1.20\%$
test_compile_copy_nested[tensordict-compile] 69.3610μs 31.9321μs 31.3165 KOps/s 34.7168 KOps/s $\textbf{\color{#d91a1a}-9.79\%}$
test_compile_copy_nested[tensordict-eager] 69.8320μs 29.2330μs 34.2079 KOps/s 34.1776 KOps/s $\color{#35bf28}+0.09\%$
test_compile_copy_nested[pytree-compile] 0.4484ms 62.6460μs 15.9627 KOps/s 15.7978 KOps/s $\color{#35bf28}+1.04\%$
test_compile_copy_nested[pytree-eager] 94.5720μs 48.2943μs 20.7064 KOps/s 20.4458 KOps/s $\color{#35bf28}+1.27\%$
test_compile_add_one_flat[tensordict-compile] 0.1757ms 0.1369ms 7.3069 KOps/s 7.2967 KOps/s $\color{#35bf28}+0.14\%$
test_compile_add_one_flat[tensordict-eager] 0.3071ms 0.2108ms 4.7427 KOps/s 4.6746 KOps/s $\color{#35bf28}+1.46\%$
test_compile_add_one_flat[tensorclass-compile] 0.1442ms 98.2315μs 10.1800 KOps/s 10.6656 KOps/s $\color{#d91a1a}-4.55\%$
test_compile_add_one_flat[tensorclass-eager] 0.1185ms 55.5520μs 18.0011 KOps/s 18.5298 KOps/s $\color{#d91a1a}-2.85\%$
test_compile_add_one_flat[pytree-compile] 0.1737ms 0.1313ms 7.6171 KOps/s 7.5662 KOps/s $\color{#35bf28}+0.67\%$
test_compile_add_one_flat[pytree-eager] 0.5273ms 0.4745ms 2.1077 KOps/s 2.1031 KOps/s $\color{#35bf28}+0.22\%$
test_compile_add_self_flat[tensordict-eager] 0.3987ms 0.2582ms 3.8732 KOps/s 3.8748 KOps/s $\color{#d91a1a}-0.04\%$
test_compile_add_self_flat[tensordict-compile] 0.1855ms 0.1419ms 7.0494 KOps/s 7.2448 KOps/s $\color{#d91a1a}-2.70\%$
test_compile_add_self_flat[tensorclass-eager] 0.1652ms 68.9438μs 14.5046 KOps/s 15.1002 KOps/s $\color{#d91a1a}-3.94\%$
test_compile_add_self_flat[tensorclass-compile] 0.1402ms 99.7239μs 10.0277 KOps/s 10.5395 KOps/s $\color{#d91a1a}-4.86\%$
test_compile_add_self_flat[pytree-eager] 0.4503ms 0.4014ms 2.4912 KOps/s 2.4656 KOps/s $\color{#35bf28}+1.04\%$
test_compile_add_self_flat[pytree-compile] 0.1683ms 0.1313ms 7.6133 KOps/s 7.5945 KOps/s $\color{#35bf28}+0.25\%$
test_compile_copy_flat[tensordict-compile] 47.2510μs 18.5092μs 54.0273 KOps/s 39.4582 KOps/s $\textbf{\color{#35bf28}+36.92\%}$
test_compile_copy_flat[tensordict-eager] 59.5110μs 30.9793μs 32.2796 KOps/s 31.4080 KOps/s $\color{#35bf28}+2.78\%$
test_compile_copy_flat[pytree-compile] 0.1038ms 68.5864μs 14.5801 KOps/s 14.7650 KOps/s $\color{#d91a1a}-1.25\%$
test_compile_copy_flat[pytree-eager] 85.8710μs 51.7739μs 19.3148 KOps/s 19.4673 KOps/s $\color{#d91a1a}-0.78\%$
test_compile_assign_and_add[tensordict-compile] 1.5674ms 0.3816ms 2.6202 KOps/s 2.2026 KOps/s $\textbf{\color{#35bf28}+18.96\%}$
test_compile_assign_and_add[tensordict-eager] 2.7875ms 2.6284ms 380.4652 Ops/s 369.4972 Ops/s $\color{#35bf28}+2.97\%$
test_compile_assign_and_add[pytree-compile] 1.5609ms 0.3753ms 2.6646 KOps/s 2.3458 KOps/s $\textbf{\color{#35bf28}+13.59\%}$
test_compile_assign_and_add[pytree-eager] 2.9763ms 2.7269ms 366.7214 Ops/s 383.4566 Ops/s $\color{#d91a1a}-4.36\%$
test_compile_indexing[tensor-tensordict-compile] 0.1895ms 0.1183ms 8.4547 KOps/s 9.0064 KOps/s $\textbf{\color{#d91a1a}-6.13\%}$
test_compile_indexing[tensor-tensordict-eager] 0.5606ms 83.8706μs 11.9231 KOps/s 12.0134 KOps/s $\color{#d91a1a}-0.75\%$
test_compile_indexing[tensor-tensorclass-compile] 0.1814ms 0.1156ms 8.6529 KOps/s 9.6852 KOps/s $\textbf{\color{#d91a1a}-10.66\%}$
test_compile_indexing[tensor-tensorclass-eager] 0.1247ms 72.0712μs 13.8752 KOps/s 14.8284 KOps/s $\textbf{\color{#d91a1a}-6.43\%}$
test_compile_indexing[tensor-pytree-compile] 0.1632ms 0.1141ms 8.7671 KOps/s 9.6096 KOps/s $\textbf{\color{#d91a1a}-8.77\%}$
test_compile_indexing[tensor-pytree-eager] 0.1160ms 71.6943μs 13.9481 KOps/s 14.9899 KOps/s $\textbf{\color{#d91a1a}-6.95\%}$
test_compile_indexing[slice-tensordict-compile] 0.1427ms 97.7247μs 10.2328 KOps/s 10.2430 KOps/s $\color{#d91a1a}-0.10\%$
test_compile_indexing[slice-tensordict-eager] 0.1581ms 17.0170μs 58.7647 KOps/s 58.5538 KOps/s $\color{#35bf28}+0.36\%$
test_compile_indexing[slice-tensorclass-compile] 0.1397ms 93.9723μs 10.6414 KOps/s 10.6980 KOps/s $\color{#d91a1a}-0.53\%$
test_compile_indexing[slice-tensorclass-eager] 0.1174ms 19.7042μs 50.7507 KOps/s 65.2124 KOps/s $\textbf{\color{#d91a1a}-22.18\%}$
test_compile_indexing[slice-pytree-compile] 0.1559ms 97.4108μs 10.2658 KOps/s 10.6556 KOps/s $\color{#d91a1a}-3.66\%$
test_compile_indexing[slice-pytree-eager] 53.2710μs 16.3605μs 61.1230 KOps/s 65.0199 KOps/s $\textbf{\color{#d91a1a}-5.99\%}$
test_compile_indexing[int-tensordict-compile] 0.1533ms 0.1052ms 9.5043 KOps/s 9.8452 KOps/s $\color{#d91a1a}-3.46\%$
test_compile_indexing[int-tensordict-eager] 0.5854ms 17.9651μs 55.6635 KOps/s 60.1399 KOps/s $\textbf{\color{#d91a1a}-7.44\%}$
test_compile_indexing[int-tensorclass-compile] 0.1519ms 97.8329μs 10.2215 KOps/s 10.6591 KOps/s $\color{#d91a1a}-4.11\%$
test_compile_indexing[int-tensorclass-eager] 52.4810μs 16.4781μs 60.6868 KOps/s 64.7656 KOps/s $\textbf{\color{#d91a1a}-6.30\%}$
test_compile_indexing[int-pytree-compile] 0.1574ms 99.1330μs 10.0875 KOps/s 10.6717 KOps/s $\textbf{\color{#d91a1a}-5.47\%}$
test_compile_indexing[int-pytree-eager] 50.7510μs 16.0335μs 62.3696 KOps/s 64.9530 KOps/s $\color{#d91a1a}-3.98\%$
test_mod_add[eager] 80.3410μs 38.8662μs 25.7293 KOps/s 26.6322 KOps/s $\color{#d91a1a}-3.39\%$
test_mod_add[compile] 0.1240ms 78.8060μs 12.6894 KOps/s 12.5547 KOps/s $\color{#35bf28}+1.07\%$
test_mod_add[compile-overhead] 0.3139ms 0.1635ms 6.1168 KOps/s 5.7059 KOps/s $\textbf{\color{#35bf28}+7.20\%}$
test_mod_wrap[eager] 0.3205ms 0.2430ms 4.1158 KOps/s 3.8509 KOps/s $\textbf{\color{#35bf28}+6.88\%}$
test_mod_wrap[compile] 0.3263ms 0.2785ms 3.5908 KOps/s 3.3647 KOps/s $\textbf{\color{#35bf28}+6.72\%}$
test_mod_wrap[compile-overhead] 7.4311ms 3.8903ms 257.0501 Ops/s 262.1870 Ops/s $\color{#d91a1a}-1.96\%$
test_mod_wrap_and_backward[eager] 1.5473ms 1.4289ms 699.8575 Ops/s 684.8577 Ops/s $\color{#35bf28}+2.19\%$
test_mod_wrap_and_backward[compile] 1.4425ms 1.3386ms 747.0351 Ops/s 737.9138 Ops/s $\color{#35bf28}+1.24\%$
test_mod_wrap_and_backward[compile-overhead] 1.4893ms 1.0128ms 987.3971 Ops/s 964.9581 Ops/s $\color{#35bf28}+2.33\%$
test_seq_add[eager] 0.1752ms 0.1165ms 8.5872 KOps/s 8.4315 KOps/s $\color{#35bf28}+1.85\%$
test_seq_add[compile] 0.1460ms 87.5439μs 11.4228 KOps/s 10.7457 KOps/s $\textbf{\color{#35bf28}+6.30\%}$
test_seq_add[compile-overhead] 0.1838ms 0.1278ms 7.8274 KOps/s 7.6610 KOps/s $\color{#35bf28}+2.17\%$
test_seq_wrap[eager] 0.4839ms 0.4203ms 2.3794 KOps/s 2.2776 KOps/s $\color{#35bf28}+4.47\%$
test_seq_wrap[compile] 0.3611ms 0.2967ms 3.3701 KOps/s 3.1571 KOps/s $\textbf{\color{#35bf28}+6.75\%}$
test_seq_wrap[compile-overhead] 0.2807ms 0.2281ms 4.3846 KOps/s 4.3722 KOps/s $\color{#35bf28}+0.28\%$
test_func_call_runtime[False-eager] 0.8448ms 0.7550ms 1.3245 KOps/s 1.2584 KOps/s $\textbf{\color{#35bf28}+5.25\%}$
test_func_call_runtime[False-compile] 0.9176ms 0.7278ms 1.3740 KOps/s 1.3593 KOps/s $\color{#35bf28}+1.08\%$
test_func_call_runtime[False-compile-overhead] 0.4090ms 0.3560ms 2.8093 KOps/s 2.8177 KOps/s $\color{#d91a1a}-0.30\%$
test_func_call_runtime[True-eager] 0.9931ms 0.8850ms 1.1300 KOps/s 1.1005 KOps/s $\color{#35bf28}+2.68\%$
test_func_call_runtime[True-compile] 0.8713ms 0.7868ms 1.2710 KOps/s 1.3216 KOps/s $\color{#d91a1a}-3.82\%$
test_func_call_runtime[True-compile-overhead] 0.4424ms 0.3745ms 2.6703 KOps/s 2.6910 KOps/s $\color{#d91a1a}-0.77\%$
test_func_call_cm_runtime[False-eager] 0.7654ms 0.7193ms 1.3902 KOps/s 1.3650 KOps/s $\color{#35bf28}+1.85\%$
test_func_call_cm_runtime[False-compile] 0.8468ms 0.7411ms 1.3493 KOps/s 1.3580 KOps/s $\color{#d91a1a}-0.64\%$
test_func_call_cm_runtime[False-compile-overhead] 0.4407ms 0.3562ms 2.8072 KOps/s 2.8273 KOps/s $\color{#d91a1a}-0.71\%$
test_func_call_cm_runtime[True-eager] 1.0604ms 0.9852ms 1.0150 KOps/s 993.9234 Ops/s $\color{#35bf28}+2.12\%$
test_func_call_cm_runtime[True-compile] 1.0960ms 0.9755ms 1.0251 KOps/s 1.0114 KOps/s $\color{#35bf28}+1.36\%$
test_func_call_cm_runtime[True-compile-overhead] 1.0956ms 0.9716ms 1.0292 KOps/s 1.0045 KOps/s $\color{#35bf28}+2.46\%$
test_vmap_func_call_cm_runtime[eager] 2.4944ms 2.0726ms 482.4938 Ops/s 470.8302 Ops/s $\color{#35bf28}+2.48\%$
test_vmap_func_call_cm_runtime[compile] 0.8980ms 0.7983ms 1.2527 KOps/s 1.2624 KOps/s $\color{#d91a1a}-0.77\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.5548ms 0.4085ms 2.4480 KOps/s 2.4542 KOps/s $\color{#d91a1a}-0.26\%$
test_distributed 2.8786ms 0.1872ms 5.3427 KOps/s 8.2695 KOps/s $\textbf{\color{#d91a1a}-35.39\%}$
test_tdmodule 30.9610μs 21.3521μs 46.8338 KOps/s 49.5152 KOps/s $\textbf{\color{#d91a1a}-5.42\%}$
test_tdmodule_dispatch 59.9610μs 37.5030μs 26.6645 KOps/s 28.1643 KOps/s $\textbf{\color{#d91a1a}-5.33\%}$
test_tdseq 41.0510μs 21.1444μs 47.2939 KOps/s 49.2647 KOps/s $\color{#d91a1a}-4.00\%$
test_tdseq_dispatch 70.9610μs 40.2655μs 24.8352 KOps/s 26.3837 KOps/s $\textbf{\color{#d91a1a}-5.87\%}$
test_instantiation_functorch 1.6541ms 1.5304ms 653.4300 Ops/s 658.6646 Ops/s $\color{#d91a1a}-0.79\%$
test_exec_functorch 0.1923ms 0.1426ms 7.0103 KOps/s 7.0900 KOps/s $\color{#d91a1a}-1.12\%$
test_exec_functional_call 0.2137ms 0.1358ms 7.3625 KOps/s 7.4880 KOps/s $\color{#d91a1a}-1.68\%$
test_exec_td_decorator 0.3752ms 0.1867ms 5.3553 KOps/s 5.4679 KOps/s $\color{#d91a1a}-2.06\%$
test_vmap_mlp_speed_decorator[True-True] 0.7476ms 0.6818ms 1.4667 KOps/s 1.4677 KOps/s $\color{#d91a1a}-0.07\%$
test_vmap_mlp_speed_decorator[True-False] 0.8162ms 0.6811ms 1.4683 KOps/s 1.4726 KOps/s $\color{#d91a1a}-0.29\%$
test_vmap_mlp_speed_decorator[False-True] 0.7322ms 0.5883ms 1.6998 KOps/s 1.6920 KOps/s $\color{#35bf28}+0.46\%$
test_vmap_mlp_speed_decorator[False-False] 0.7053ms 0.5906ms 1.6932 KOps/s 1.6894 KOps/s $\color{#35bf28}+0.23\%$
test_vmap_transformer_speed_decorator[True-True] 19.1390ms 19.0740ms 52.4273 Ops/s 52.1690 Ops/s $\color{#35bf28}+0.50\%$
test_vmap_transformer_speed_decorator[True-False] 19.9766ms 19.1738ms 52.1544 Ops/s 52.1292 Ops/s $\color{#35bf28}+0.05\%$
test_vmap_transformer_speed_decorator[False-True] 19.0351ms 18.9221ms 52.8483 Ops/s 52.4749 Ops/s $\color{#35bf28}+0.71\%$
test_vmap_transformer_speed_decorator[False-False] 19.7953ms 18.9652ms 52.7281 Ops/s 52.5432 Ops/s $\color{#35bf28}+0.35\%$
test_to_module_speed[True] 1.0605ms 0.9646ms 1.0367 KOps/s 1.0457 KOps/s $\color{#d91a1a}-0.86\%$
test_to_module_speed[False] 1.2834ms 0.9551ms 1.0470 KOps/s 1.0591 KOps/s $\color{#d91a1a}-1.15\%$
test_tc_init 62.1910μs 35.7775μs 27.9505 KOps/s 28.1726 KOps/s $\color{#d91a1a}-0.79\%$
test_tc_init_nested 0.1563ms 72.2190μs 13.8468 KOps/s 14.3370 KOps/s $\color{#d91a1a}-3.42\%$
test_tc_first_layer_tensor 3.8643μs 0.7071μs 1.4142 MOps/s 1.2606 MOps/s $\textbf{\color{#35bf28}+12.19\%}$
test_tc_first_layer_nontensor 39.5700μs 2.2217μs 450.1067 KOps/s 450.6666 KOps/s $\color{#d91a1a}-0.12\%$
test_tc_second_layer_tensor 18.8753μs 1.4151μs 706.6870 KOps/s 706.4613 KOps/s $\color{#35bf28}+0.03\%$
test_tc_second_layer_nontensor 0.2034ms 2.9569μs 338.1888 KOps/s 339.4571 KOps/s $\color{#d91a1a}-0.37\%$
test_unbind 0.2180s 10.0211ms 99.7894 Ops/s 145.6810 Ops/s $\textbf{\color{#d91a1a}-31.50\%}$
test_full_like 10.1893ms 9.0884ms 110.0308 Ops/s 108.7642 Ops/s $\color{#35bf28}+1.16\%$
test_zeros_like 11.5338ms 8.5979ms 116.3079 Ops/s 234.6017 Ops/s $\textbf{\color{#d91a1a}-50.42\%}$
test_ones_like 5.1121ms 4.3269ms 231.1121 Ops/s 235.2688 Ops/s $\color{#d91a1a}-1.77\%$
test_clone 6.6580ms 6.3468ms 157.5607 Ops/s 110.0591 Ops/s $\textbf{\color{#35bf28}+43.16\%}$
test_squeeze 58.0920μs 9.9025μs 100.9844 KOps/s 105.6169 KOps/s $\color{#d91a1a}-4.39\%$
test_unsqueeze 0.1691ms 72.2314μs 13.8444 KOps/s 13.5176 KOps/s $\color{#35bf28}+2.42\%$
test_split 0.3738ms 0.1573ms 6.3584 KOps/s 6.2375 KOps/s $\color{#35bf28}+1.94\%$
test_permute 0.2324ms 0.1832ms 5.4596 KOps/s 5.4333 KOps/s $\color{#35bf28}+0.48\%$
test_stack 50.2610ms 50.0133ms 19.9947 Ops/s 19.9679 Ops/s $\color{#35bf28}+0.13\%$
test_cat 50.4678ms 49.9448ms 20.0221 Ops/s 20.0147 Ops/s $\color{#35bf28}+0.04\%$
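
The Change column appears to be the relative difference between the Ops column (this PR) and the Ops on Repo HEAD column. The following is a minimal sketch of that arithmetic, not the benchmark bot's actual code: the helper names `to_ops_per_second` and `relative_change` are hypothetical, and the unit scaling is an assumption based on the `Ops/s`, `KOps/s`, and `MOps/s` suffixes in the table.

```python
# Assumed scale factors for the unit suffixes that appear in the table.
UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def to_ops_per_second(value: float, unit: str) -> float:
    """Convert a table entry such as (1.0150, "KOps/s") to plain Ops/s."""
    return value * UNIT_SCALE[unit]


def relative_change(pr_ops: float, head_ops: float) -> float:
    """Percentage change of the PR throughput relative to repo HEAD."""
    return (pr_ops - head_ops) / head_ops * 100.0


# test_zeros_like: 116.3079 Ops/s on the PR vs 234.6017 Ops/s on HEAD.
zeros_like = relative_change(
    to_ops_per_second(116.3079, "Ops/s"),
    to_ops_per_second(234.6017, "Ops/s"),
)

# test_func_call_cm_runtime[True-eager] mixes units: 1.0150 KOps/s vs 993.9234 Ops/s.
cm_runtime = relative_change(
    to_ops_per_second(1.0150, "KOps/s"),
    to_ops_per_second(993.9234, "Ops/s"),
)

print(f"{zeros_like:+.2f}%  {cm_runtime:+.2f}%")
```

Because the table reports rounded throughput, recomputed values can differ from the listed ones in the last digit for some rows.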

[ghstack-poisoned]
@vmoens vmoens merged commit 5091e16 into gh/vmoens/48/base Feb 26, 2025
44 of 53 checks passed
vmoens added a commit that referenced this pull request Feb 26, 2025
ghstack-source-id: 8e47f46e83982d554237604f6ef7c845eeed1b50
Pull Request resolved: #1236
@vmoens vmoens deleted the gh/vmoens/48/head branch February 26, 2025 11:03
vmoens added a commit that referenced this pull request Feb 26, 2025
ghstack-source-id: 8e47f46e83982d554237604f6ef7c845eeed1b50
Pull Request resolved: #1236

(cherry picked from commit 635c9c0)