[BugFix] Fix serialization of stacks of Tensorclasses #1236
Merged
vmoens added a commit that referenced this pull request on Feb 25, 2025
ghstack-source-id: 0f479c80655d1e663ce67a16031556dbe70937f9
Pull Request resolved: #1236
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 68.6270μs | 21.1877μs | 47.1972 KOps/s | 47.4125 KOps/s | |
test_plain_set_stack_nested | 66.6740μs | 21.3683μs | 46.7982 KOps/s | 47.9190 KOps/s | |
test_plain_set_nested_inplace | 65.3320μs | 23.0791μs | 43.3293 KOps/s | 44.8219 KOps/s | |
test_plain_set_stack_nested_inplace | 63.5280μs | 22.7969μs | 43.8656 KOps/s | 44.9595 KOps/s | |
test_items | 42.8000μs | 4.2595μs | 234.7670 KOps/s | 232.0791 KOps/s | |
test_items_nested | 0.7136ms | 0.4069ms | 2.4577 KOps/s | 2.5198 KOps/s | |
test_items_nested_locked | 0.7271ms | 0.4101ms | 2.4385 KOps/s | 2.4881 KOps/s | |
test_items_nested_leaf | 0.1947ms | 77.0058μs | 12.9860 KOps/s | 13.1824 KOps/s | |
test_items_stack_nested | 0.7172ms | 0.4048ms | 2.4701 KOps/s | 2.4753 KOps/s | |
test_items_stack_nested_leaf | 0.1581ms | 78.0396μs | 12.8140 KOps/s | 13.1799 KOps/s | |
test_items_stack_nested_locked | 0.8176ms | 0.4054ms | 2.4667 KOps/s | 2.4856 KOps/s | |
test_keys | 18.1540μs | 3.4630μs | 288.7648 KOps/s | 285.4286 KOps/s | |
test_keys_nested | 0.2689ms | 0.1631ms | 6.1312 KOps/s | 6.0615 KOps/s | |
test_keys_nested_locked | 0.6743ms | 0.1678ms | 5.9583 KOps/s | 5.8341 KOps/s | |
test_keys_nested_leaf | 0.2444ms | 0.1422ms | 7.0346 KOps/s | 6.9259 KOps/s | |
test_keys_stack_nested | 0.2838ms | 0.1632ms | 6.1271 KOps/s | 6.0592 KOps/s | |
test_keys_stack_nested_leaf | 0.2303ms | 0.1417ms | 7.0579 KOps/s | 6.9596 KOps/s | |
test_keys_stack_nested_locked | 0.2980ms | 0.1694ms | 5.9048 KOps/s | 5.8540 KOps/s | |
test_values | 10.4132μs | 1.0525μs | 950.1025 KOps/s | 945.3966 KOps/s | |
test_values_nested | 0.1123ms | 62.4829μs | 16.0044 KOps/s | 16.1875 KOps/s | |
test_values_nested_locked | 0.1176ms | 62.7452μs | 15.9375 KOps/s | 16.1262 KOps/s | |
test_values_nested_leaf | 0.1366ms | 71.8473μs | 13.9184 KOps/s | 14.0695 KOps/s | |
test_values_stack_nested | 0.1170ms | 63.2148μs | 15.8191 KOps/s | 16.1530 KOps/s | |
test_values_stack_nested_leaf | 0.1280ms | 71.4293μs | 13.9998 KOps/s | 14.0453 KOps/s | |
test_values_stack_nested_locked | 0.1155ms | 62.6686μs | 15.9570 KOps/s | 16.2534 KOps/s | |
test_membership | 16.0700μs | 0.8583μs | 1.1651 MOps/s | 1.1494 MOps/s | |
test_membership_nested | 43.3510μs | 2.8789μs | 347.3493 KOps/s | 345.9851 KOps/s | |
test_membership_nested_leaf | 45.4440μs | 2.9265μs | 341.7081 KOps/s | 343.8202 KOps/s | |
test_membership_stacked_nested | 26.9300μs | 2.8712μs | 348.2806 KOps/s | 346.5005 KOps/s | |
test_membership_stacked_nested_leaf | 18.2440μs | 2.8896μs | 346.0641 KOps/s | 350.8263 KOps/s | |
test_membership_nested_last | 46.4070μs | 4.2842μs | 233.4166 KOps/s | 234.2487 KOps/s | |
test_membership_nested_leaf_last | 27.2610μs | 4.2949μs | 232.8370 KOps/s | 232.3410 KOps/s | |
test_membership_stacked_nested_last | 51.0650μs | 4.2841μs | 233.4221 KOps/s | 234.6809 KOps/s | |
test_membership_stacked_nested_leaf_last | 20.6880μs | 4.2940μs | 232.8839 KOps/s | 230.2948 KOps/s | |
test_nested_getleaf | 54.5210μs | 10.5303μs | 94.9644 KOps/s | 94.8979 KOps/s | |
test_nested_get | 57.4370μs | 9.9926μs | 100.0745 KOps/s | 98.4952 KOps/s | |
test_stacked_getleaf | 31.8190μs | 10.6265μs | 94.1039 KOps/s | 96.3279 KOps/s | |
test_stacked_get | 52.7790μs | 10.0184μs | 99.8165 KOps/s | 100.4058 KOps/s | |
test_nested_getitemleaf | 32.6210μs | 11.1541μs | 89.6531 KOps/s | 89.4379 KOps/s | |
test_nested_getitem | 55.7940μs | 10.6071μs | 94.2762 KOps/s | 94.8007 KOps/s | |
test_stacked_getitemleaf | 52.8780μs | 11.1544μs | 89.6511 KOps/s | 90.1325 KOps/s | |
test_stacked_getitem | 35.0650μs | 10.6137μs | 94.2181 KOps/s | 95.2833 KOps/s | |
test_lock_nested | 0.6004ms | 0.4101ms | 2.4384 KOps/s | 2.4428 KOps/s | |
test_lock_stack_nested | 0.5271ms | 0.4217ms | 2.3712 KOps/s | 2.3532 KOps/s | |
test_unlock_nested | 0.4221ms | 0.3313ms | 3.0181 KOps/s | 2.9852 KOps/s | |
test_unlock_stack_nested | 0.7135ms | 0.3393ms | 2.9469 KOps/s | 2.9249 KOps/s | |
test_flatten_speed | 0.1899ms | 0.1009ms | 9.9077 KOps/s | 9.9739 KOps/s | |
test_unflatten_speed | 1.2497ms | 0.5147ms | 1.9429 KOps/s | 1.9319 KOps/s | |
test_common_ops | 4.3503ms | 0.8467ms | 1.1810 KOps/s | 1.2574 KOps/s | |
test_creation | 48.2100μs | 2.5248μs | 396.0717 KOps/s | 402.1641 KOps/s | |
test_creation_empty | 72.4150μs | 13.0993μs | 76.3397 KOps/s | 86.2605 KOps/s | |
test_creation_nested_1 | 50.2740μs | 15.8611μs | 63.0475 KOps/s | 68.6295 KOps/s | |
test_creation_nested_2 | 43.9210μs | 20.5275μs | 48.7153 KOps/s | 53.1394 KOps/s | |
test_clone | 85.5790μs | 13.3859μs | 74.7055 KOps/s | 71.6282 KOps/s | |
test_getitem[int] | 0.8662ms | 12.9973μs | 76.9391 KOps/s | 79.4520 KOps/s | |
test_getitem[slice_int] | 0.1276ms | 24.8290μs | 40.2754 KOps/s | 41.8876 KOps/s | |
test_getitem[range] | 0.1583ms | 50.3413μs | 19.8644 KOps/s | 18.2768 KOps/s | |
test_getitem[tuple] | 0.1485ms | 20.5637μs | 48.6294 KOps/s | 47.0832 KOps/s | |
test_getitem[list] | 0.1594ms | 45.9831μs | 21.7471 KOps/s | 21.8449 KOps/s | |
test_setitem_dim[int] | 51.0550μs | 26.4152μs | 37.8570 KOps/s | 39.4709 KOps/s | |
test_setitem_dim[slice_int] | 0.1003ms | 51.5858μs | 19.3852 KOps/s | 19.7043 KOps/s | |
test_setitem_dim[range] | 0.1287ms | 76.6056μs | 13.0539 KOps/s | 13.1431 KOps/s | |
test_setitem_dim[tuple] | 81.8520μs | 41.9900μs | 23.8152 KOps/s | 24.9752 KOps/s | |
test_setitem | 0.1034ms | 21.5737μs | 46.3527 KOps/s | 49.4738 KOps/s | |
test_set | 0.1013ms | 21.0418μs | 47.5245 KOps/s | 48.9950 KOps/s | |
test_set_shared | 4.2452ms | 0.1868ms | 5.3536 KOps/s | 5.4822 KOps/s | |
test_update | 0.2872ms | 25.9915μs | 38.4742 KOps/s | 45.1076 KOps/s | |
test_update_nested | 75.4400μs | 36.1942μs | 27.6288 KOps/s | 30.2695 KOps/s | |
test_update__nested | 0.4213ms | 33.9868μs | 29.4232 KOps/s | 30.2636 KOps/s | |
test_set_nested | 0.1216ms | 23.3195μs | 42.8826 KOps/s | 45.8023 KOps/s | |
test_set_nested_new | 71.3920μs | 27.3353μs | 36.5828 KOps/s | 37.9313 KOps/s | |
test_select | 0.1100ms | 43.3238μs | 23.0820 KOps/s | 23.4942 KOps/s | |
test_select_nested | 0.1175ms | 61.8263μs | 16.1743 KOps/s | 16.0707 KOps/s | |
test_exclude_nested | 0.1671ms | 79.4677μs | 12.5837 KOps/s | 12.3459 KOps/s | |
test_empty[True] | 0.5682ms | 0.4023ms | 2.4859 KOps/s | 2.4468 KOps/s | |
test_empty[False] | 35.6437μs | 1.3771μs | 726.1894 KOps/s | 728.3850 KOps/s | |
test_unbind_speed | 0.5399ms | 0.2673ms | 3.7406 KOps/s | 3.6634 KOps/s | |
test_unbind_speed_stack0 | 0.4083ms | 0.2658ms | 3.7623 KOps/s | 3.7130 KOps/s | |
test_unbind_speed_stack1 | 0.1001s | 0.7214ms | 1.3861 KOps/s | 1.1924 KOps/s | |
test_split | 0.1055s | 1.7776ms | 562.5594 Ops/s | 559.5183 Ops/s | |
test_chunk | 0.1047s | 1.7724ms | 564.2086 Ops/s | 629.9324 Ops/s | |
test_consolidate_njt[False-None] | 8.8017ms | 8.4427ms | 118.4449 Ops/s | 108.8788 Ops/s | |
test_creation[device0] | 0.2216ms | 91.6641μs | 10.9094 KOps/s | 10.7933 KOps/s | |
test_creation_from_tensor | 4.2526ms | 95.4257μs | 10.4794 KOps/s | 10.7543 KOps/s | |
test_add_one[memmap_tensor0] | 0.1097ms | 4.8984μs | 204.1478 KOps/s | 203.9570 KOps/s | |
test_contiguous[memmap_tensor0] | 15.1680μs | 0.5147μs | 1.9429 MOps/s | 1.9556 MOps/s | |
test_stack[memmap_tensor0] | 26.9500μs | 3.3687μs | 296.8499 KOps/s | 289.6538 KOps/s | |
test_memmaptd_index | 1.2587ms | 0.2280ms | 4.3861 KOps/s | 4.3516 KOps/s | |
test_memmaptd_index_astensor | 0.4926ms | 0.3140ms | 3.1851 KOps/s | 3.1626 KOps/s | |
test_memmaptd_index_op | 0.8673ms | 0.6060ms | 1.6502 KOps/s | 1.7293 KOps/s | |
test_serialize_model | 0.2180s | 0.1332s | 7.5060 Ops/s | 8.7201 Ops/s | |
test_serialize_model_pickle | 0.4917s | 0.4002s | 2.4988 Ops/s | 2.5143 Ops/s | |
test_serialize_weights | 0.1284s | 0.1175s | 8.5088 Ops/s | 8.6989 Ops/s | |
test_serialize_weights_returnearly | 0.1980s | 0.1637s | 6.1084 Ops/s | 5.5944 Ops/s | |
test_serialize_weights_pickle | 0.5401s | 0.4429s | 2.2577 Ops/s | 2.4510 Ops/s | |
test_serialize_weights_filesystem | 0.1635s | 0.1458s | 6.8577 Ops/s | 6.7876 Ops/s | |
test_serialize_model_filesystem | 0.1561s | 0.1494s | 6.6932 Ops/s | 6.5300 Ops/s | |
test_reshape_pytree | 61.5240μs | 26.0554μs | 38.3798 KOps/s | 37.9137 KOps/s | |
test_reshape_td | 73.3570μs | 33.2443μs | 30.0803 KOps/s | 30.3319 KOps/s | |
test_view_pytree | 81.6720μs | 26.2261μs | 38.1300 KOps/s | 36.5874 KOps/s | |
test_view_td | 0.1047ms | 41.2923μs | 24.2176 KOps/s | 22.2917 KOps/s | |
test_unbind_pytree | 68.1660μs | 29.3181μs | 34.1086 KOps/s | 33.8922 KOps/s | |
test_unbind_td | 0.2990ms | 39.5918μs | 25.2578 KOps/s | 25.3619 KOps/s | |
test_split_pytree | 0.1033ms | 29.3451μs | 34.0772 KOps/s | 34.3473 KOps/s | |
test_split_td | 0.5066ms | 45.1749μs | 22.1362 KOps/s | 21.9190 KOps/s | |
test_add_pytree | 0.2763ms | 35.3067μs | 28.3233 KOps/s | 28.0579 KOps/s | |
test_add_td | 0.1692ms | 59.2697μs | 16.8720 KOps/s | 18.0738 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.1939ms | 67.6286μs | 14.7866 KOps/s | 15.2231 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 1.2952ms | 0.1723ms | 5.8028 KOps/s | 5.9028 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.2030ms | 47.0969μs | 21.2328 KOps/s | 22.1178 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.2433ms | 0.1169ms | 8.5562 KOps/s | 8.3979 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 0.1042ms | 28.8206μs | 34.6974 KOps/s | 36.3074 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.1217ms | 58.2831μs | 17.1576 KOps/s | 16.5841 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.1652ms | 77.5121μs | 12.9012 KOps/s | 12.5391 KOps/s | |
test_compile_copy_nested[pytree-eager] | 0.1226ms | 65.3761μs | 15.2961 KOps/s | 15.0415 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.2032ms | 0.1069ms | 9.3565 KOps/s | 9.5089 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.4030ms | 0.2160ms | 4.6296 KOps/s | 4.6363 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1963ms | 48.0967μs | 20.7915 KOps/s | 21.5145 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.1424ms | 66.4191μs | 15.0559 KOps/s | 14.9921 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.2164ms | 99.7040μs | 10.0297 KOps/s | 10.0162 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.4672ms | 0.2024ms | 4.9397 KOps/s | 4.9553 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.4952ms | 0.2314ms | 4.3210 KOps/s | 4.3425 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.2334ms | 0.1112ms | 8.9902 KOps/s | 9.5904 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.2609ms | 63.9969μs | 15.6257 KOps/s | 16.2594 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.3170ms | 48.7509μs | 20.5125 KOps/s | 21.3754 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.2507ms | 0.1573ms | 6.3566 KOps/s | 6.3883 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.2461ms | 0.1014ms | 9.8609 KOps/s | 10.1082 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 0.1021ms | 21.0833μs | 47.4309 KOps/s | 48.0320 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 0.1577ms | 66.8219μs | 14.9652 KOps/s | 15.2502 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1593ms | 81.2883μs | 12.3019 KOps/s | 12.4960 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.1540ms | 67.1775μs | 14.8859 KOps/s | 15.0698 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 0.2867ms | 0.2151ms | 4.6491 KOps/s | 4.7377 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 1.7632ms | 1.3818ms | 723.6811 Ops/s | 722.0411 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 0.3277ms | 0.2108ms | 4.7445 KOps/s | 4.9025 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 0.9170ms | 0.8261ms | 1.2104 KOps/s | 1.2112 KOps/s | |
test_compile_assign_and_add_stack[compile] | 0.5842ms | 0.4586ms | 2.1805 KOps/s | 2.2314 KOps/s | |
test_compile_assign_and_add_stack[eager] | 3.5769ms | 2.8235ms | 354.1700 Ops/s | 370.8073 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.1152ms | 40.3059μs | 24.8103 KOps/s | 26.3024 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.5659ms | 33.7511μs | 29.6286 KOps/s | 30.1424 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 76.8130μs | 31.5524μs | 31.6933 KOps/s | 32.7168 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 0.1181ms | 22.5860μs | 44.2752 KOps/s | 43.2656 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 0.1154ms | 32.6349μs | 30.6421 KOps/s | 31.7593 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 63.2280μs | 22.1554μs | 45.1356 KOps/s | 43.1624 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1114ms | 53.0138μs | 18.8630 KOps/s | 18.6741 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.4613ms | 20.0634μs | 49.8420 KOps/s | 49.2065 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 95.1270μs | 45.7950μs | 21.8365 KOps/s | 22.5420 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 94.5490μs | 18.9913μs | 52.6558 KOps/s | 53.4527 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.1076ms | 46.7625μs | 21.3847 KOps/s | 22.0683 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 83.8960μs | 18.2671μs | 54.7433 KOps/s | 53.6742 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1324ms | 54.6673μs | 18.2925 KOps/s | 18.8128 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.8534ms | 19.8560μs | 50.3626 KOps/s | 50.6036 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.1192ms | 47.2124μs | 21.1809 KOps/s | 22.0102 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 73.8170μs | 18.5568μs | 53.8886 KOps/s | 53.4697 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1169ms | 47.2682μs | 21.1559 KOps/s | 21.9257 KOps/s | |
test_compile_indexing[int-pytree-eager] | 79.0770μs | 18.1244μs | 55.1743 KOps/s | 53.7966 KOps/s | |
test_mod_add[eager] | 90.1880μs | 37.4838μs | 26.6782 KOps/s | 28.2482 KOps/s | |
test_mod_add[compile] | 0.1343ms | 67.2020μs | 14.8805 KOps/s | 15.6907 KOps/s | |
test_mod_add[compile-overhead] | 0.1225ms | 63.2403μs | 15.8127 KOps/s | 15.8378 KOps/s | |
test_mod_wrap[eager] | 0.4618ms | 0.2248ms | 4.4481 KOps/s | 4.5261 KOps/s | |
test_mod_wrap[compile] | 2.0559ms | 0.2319ms | 4.3124 KOps/s | 4.4440 KOps/s | |
test_mod_wrap[compile-overhead] | 0.3531ms | 0.2264ms | 4.4174 KOps/s | 4.5451 KOps/s | |
test_mod_wrap_and_backward[eager] | 16.7989ms | 13.1256ms | 76.1872 Ops/s | 77.5695 Ops/s | |
test_mod_wrap_and_backward[compile] | 13.4150ms | 11.3727ms | 87.9298 Ops/s | 88.5616 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 23.5749ms | 11.5792ms | 86.3618 Ops/s | 87.2851 Ops/s | |
test_seq_add[eager] | 0.2161ms | 0.1220ms | 8.1979 KOps/s | 8.2828 KOps/s | |
test_seq_add[compile] | 0.1578ms | 79.5699μs | 12.5676 KOps/s | 13.2035 KOps/s | |
test_seq_add[compile-overhead] | 0.1375ms | 78.0119μs | 12.8186 KOps/s | 13.4170 KOps/s | |
test_seq_wrap[eager] | 0.7208ms | 0.4484ms | 2.2301 KOps/s | 2.2284 KOps/s | |
test_seq_wrap[compile] | 0.8672ms | 0.2471ms | 4.0476 KOps/s | 4.2309 KOps/s | |
test_seq_wrap[compile-overhead] | 0.3958ms | 0.2437ms | 4.1038 KOps/s | 4.2260 KOps/s | |
test_func_call_runtime[False-eager] | 0.9624ms | 0.5333ms | 1.8750 KOps/s | 1.8190 KOps/s | |
test_func_call_runtime[False-compile] | 0.8576ms | 0.4461ms | 2.2415 KOps/s | 2.2973 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.8431ms | 0.4454ms | 2.2451 KOps/s | 2.3153 KOps/s | |
test_func_call_runtime[True-eager] | 0.8583ms | 0.7457ms | 1.3410 KOps/s | 1.3364 KOps/s | |
test_func_call_runtime[True-compile] | 0.7884ms | 0.4666ms | 2.1433 KOps/s | 2.2272 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.7212ms | 0.4666ms | 2.1432 KOps/s | 2.2284 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.8586ms | 0.5275ms | 1.8958 KOps/s | 1.8600 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.5655ms | 0.4417ms | 2.2639 KOps/s | 2.3147 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.7470ms | 0.4448ms | 2.2482 KOps/s | 2.3240 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.4369ms | 0.8966ms | 1.1154 KOps/s | 1.1146 KOps/s | |
test_func_call_cm_runtime[True-compile] | 1.2765ms | 0.7944ms | 1.2587 KOps/s | 1.2538 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 1.1562ms | 0.7937ms | 1.2600 KOps/s | 1.2425 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.6785ms | 1.8930ms | 528.2569 Ops/s | 525.0149 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 0.6714ms | 0.5446ms | 1.8363 KOps/s | 1.8703 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 1.0250ms | 0.5424ms | 1.8437 KOps/s | 1.8497 KOps/s | |
test_distributed | 0.2519ms | 0.1237ms | 8.0861 KOps/s | 7.8273 KOps/s | |
test_tdmodule | 65.8830μs | 27.9351μs | 35.7973 KOps/s | 36.5724 KOps/s | |
test_tdmodule_dispatch | 98.2130μs | 51.0001μs | 19.6078 KOps/s | 20.3048 KOps/s | |
test_tdseq | 46.8980μs | 28.8126μs | 34.7071 KOps/s | 34.9633 KOps/s | |
test_tdseq_dispatch | 90.4090μs | 55.3411μs | 18.0697 KOps/s | 18.3816 KOps/s | |
test_instantiation_functorch | 1.7533ms | 1.5169ms | 659.2431 Ops/s | 660.2769 Ops/s | |
test_exec_functorch | 0.3873ms | 0.1775ms | 5.6337 KOps/s | 5.6425 KOps/s | |
test_exec_functional_call | 0.4193ms | 0.1701ms | 5.8774 KOps/s | 5.9593 KOps/s | |
test_exec_td_decorator | 0.5216ms | 0.2319ms | 4.3117 KOps/s | 4.1910 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 0.9217ms | 0.6575ms | 1.5209 KOps/s | 1.5271 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 1.0037ms | 0.6666ms | 1.5001 KOps/s | 1.5321 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.7525ms | 0.5265ms | 1.8992 KOps/s | 1.8471 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.7769ms | 0.5293ms | 1.8892 KOps/s | 1.9017 KOps/s | |
test_to_module_speed[True] | 2.1472ms | 1.3232ms | 755.7208 Ops/s | 757.5219 Ops/s | |
test_to_module_speed[False] | 1.4540ms | 1.2847ms | 778.4065 Ops/s | 769.2371 Ops/s | |
test_tc_init | 0.1013ms | 48.4820μs | 20.6262 KOps/s | 22.1269 KOps/s | |
test_tc_init_nested | 0.2215ms | 96.3698μs | 10.3767 KOps/s | 11.0897 KOps/s | |
test_tc_first_layer_tensor | 18.8250μs | 1.5457μs | 646.9390 KOps/s | 632.3299 KOps/s | |
test_tc_first_layer_nontensor | 26.2980μs | 4.6956μs | 212.9650 KOps/s | 215.6281 KOps/s | |
test_tc_second_layer_tensor | 28.1920μs | 2.9301μs | 341.2892 KOps/s | 343.7539 KOps/s | |
test_tc_second_layer_nontensor | 27.1900μs | 6.0048μs | 166.5328 KOps/s | 166.2796 KOps/s | |
test_unbind | 0.2415s | 13.3193ms | 75.0792 Ops/s | 69.6191 Ops/s | |
test_full_like | 10.0623ms | 9.1869ms | 108.8508 Ops/s | 130.0277 Ops/s | |
test_zeros_like | 5.4620ms | 2.8719ms | 348.1979 Ops/s | 328.6395 Ops/s | |
test_ones_like | 6.3716ms | 3.5266ms | 283.5588 Ops/s | 277.0852 Ops/s | |
test_clone | 8.7912ms | 7.0195ms | 142.4597 Ops/s | 178.4710 Ops/s | |
test_squeeze | 60.3630μs | 12.7242μs | 78.5902 KOps/s | 77.1991 KOps/s | |
test_unsqueeze | 0.3020ms | 96.2662μs | 10.3879 KOps/s | 10.6901 KOps/s | |
test_split | 0.3536ms | 0.1984ms | 5.0412 KOps/s | 5.1385 KOps/s | |
test_permute | 0.3256ms | 0.2030ms | 4.9271 KOps/s | 4.9457 KOps/s | |
test_stack | 33.5030ms | 25.1353ms | 39.7847 Ops/s | 38.3561 Ops/s | |
test_cat | 25.6903ms | 25.1244ms | 39.8019 Ops/s | 38.7463 Ops/s |
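
The Change column in the table above is empty, so the difference between the PR and the repository HEAD has to be read off the two Ops columns by hand. The sketch below shows one way to compute it for a single row. It is not the benchmark bot's own logic: the helper names `parse_ops` and `relative_change` are hypothetical, and the sign convention (positive means the PR is faster than HEAD) is an assumption.

```python
from typing import Tuple

# Multipliers for the throughput units that appear in the table.
UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(cell: str) -> float:
    """Convert a cell such as '47.1972 KOps/s' to operations per second."""
    value, unit = cell.split()
    return float(value) * UNIT_SCALE[unit]


def relative_change(row: str) -> Tuple[str, float]:
    """Return (test name, relative change of Ops versus Ops on Repo HEAD).

    Assumed column order: Name | Max | Mean | Ops | Ops on Repo HEAD | Change.
    A positive result means the PR measured more operations per second.
    """
    cells = [c.strip() for c in row.split("|")]
    name, ops, ops_head = cells[0], cells[3], cells[4]
    new, base = parse_ops(ops), parse_ops(ops_head)
    return name, (new - base) / base


if __name__ == "__main__":
    # Row copied from the table above.
    row = "test_chunk | 0.1047s | 1.7724ms | 564.2086 Ops/s | 629.9324 Ops/s | |"
    name, change = relative_change(row)
    print(f"{name}: {change:+.1%}")
```

Single-run differences of a few percent are common in shared CI runners, so a result like this is better treated as a hint to re-run the benchmark than as evidence of a regression.
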
vmoens added a commit that referenced this pull request on Feb 25, 2025
ghstack-source-id: ea71af4f2eb5813bc5b25ed595edda0cf4fa1438
Pull Request resolved: #1236
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 34.3310μs | 13.0562μs | 76.5917 KOps/s | 82.2422 KOps/s | |
test_plain_set_stack_nested | 46.9210μs | 13.2046μs | 75.7311 KOps/s | 81.8239 KOps/s | |
test_plain_set_nested_inplace | 45.5710μs | 14.2935μs | 69.9618 KOps/s | 75.1244 KOps/s | |
test_plain_set_stack_nested_inplace | 43.0810μs | 14.1508μs | 70.6675 KOps/s | 76.2230 KOps/s | |
test_items | 29.7310μs | 2.8711μs | 348.3036 KOps/s | 341.3312 KOps/s | |
test_items_nested | 0.4136ms | 0.3640ms | 2.7471 KOps/s | 2.7415 KOps/s | |
test_items_nested_locked | 0.4106ms | 0.3641ms | 2.7465 KOps/s | 2.7342 KOps/s | |
test_items_nested_leaf | 82.0420μs | 60.1865μs | 16.6150 KOps/s | 16.5860 KOps/s | |
test_items_stack_nested | 0.4175ms | 0.3608ms | 2.7713 KOps/s | 2.8062 KOps/s | |
test_items_stack_nested_leaf | 83.7420μs | 59.9731μs | 16.6742 KOps/s | 16.5464 KOps/s | |
test_items_stack_nested_locked | 0.4009ms | 0.3630ms | 2.7550 KOps/s | 2.7567 KOps/s | |
test_keys | 28.1500μs | 3.3980μs | 294.2924 KOps/s | 292.9632 KOps/s | |
test_keys_nested | 0.1145ms | 88.5165μs | 11.2973 KOps/s | 11.2489 KOps/s | |
test_keys_nested_locked | 0.7802ms | 93.9334μs | 10.6458 KOps/s | 10.6213 KOps/s | |
test_keys_nested_leaf | 0.1114ms | 79.5337μs | 12.5733 KOps/s | 12.5669 KOps/s | |
test_keys_stack_nested | 0.1222ms | 87.9784μs | 11.3664 KOps/s | 11.3551 KOps/s | |
test_keys_stack_nested_leaf | 0.1664ms | 78.9582μs | 12.6649 KOps/s | 12.7020 KOps/s | |
test_keys_stack_nested_locked | 0.1201ms | 93.3872μs | 10.7081 KOps/s | 10.5933 KOps/s | |
test_values | 5.3550μs | 0.8519μs | 1.1738 MOps/s | 1.1696 MOps/s | |
test_values_nested | 62.8810μs | 37.0518μs | 26.9892 KOps/s | 27.0121 KOps/s | |
test_values_nested_locked | 66.3120μs | 39.0041μs | 25.6383 KOps/s | 25.4905 KOps/s | |
test_values_nested_leaf | 66.8210μs | 42.1176μs | 23.7430 KOps/s | 23.4027 KOps/s | |
test_values_stack_nested | 0.1312ms | 37.0032μs | 27.0247 KOps/s | 26.7640 KOps/s | |
test_values_stack_nested_leaf | 86.0720μs | 41.9533μs | 23.8360 KOps/s | 23.4783 KOps/s | |
test_values_stack_nested_locked | 63.8810μs | 38.7440μs | 25.8105 KOps/s | 25.6085 KOps/s | |
test_membership | 4.3111μs | 0.4986μs | 2.0058 MOps/s | 1.9871 MOps/s | |
test_membership_nested | 15.8405μs | 1.9572μs | 510.9226 KOps/s | 515.5777 KOps/s | |
test_membership_nested_leaf | 19.9405μs | 1.9644μs | 509.0641 KOps/s | 514.0159 KOps/s | |
test_membership_stacked_nested | 28.2510μs | 2.0366μs | 491.0203 KOps/s | 489.1062 KOps/s | |
test_membership_stacked_nested_leaf | 26.6510μs | 2.0277μs | 493.1680 KOps/s | 498.3859 KOps/s | |
test_membership_nested_last | 30.9210μs | 3.0908μs | 323.5423 KOps/s | 334.1193 KOps/s | |
test_membership_nested_leaf_last | 29.4810μs | 3.0513μs | 327.7325 KOps/s | 336.2933 KOps/s | |
test_membership_stacked_nested_last | 35.2410μs | 2.9951μs | 333.8763 KOps/s | 330.6845 KOps/s | |
test_membership_stacked_nested_leaf_last | 24.0400μs | 2.9711μs | 336.5728 KOps/s | 338.8160 KOps/s | |
test_nested_getleaf | 30.8500μs | 6.2098μs | 161.0347 KOps/s | 160.8394 KOps/s | |
test_nested_get | 26.4700μs | 5.9407μs | 168.3293 KOps/s | 169.3171 KOps/s | |
test_stacked_getleaf | 34.9110μs | 6.1109μs | 163.6422 KOps/s | 163.7771 KOps/s | |
test_stacked_get | 31.9110μs | 5.7964μs | 172.5195 KOps/s | 173.7930 KOps/s | |
test_nested_getitemleaf | 43.9010μs | 6.4195μs | 155.7764 KOps/s | 154.3585 KOps/s | |
test_nested_getitem | 34.9900μs | 6.0807μs | 164.4549 KOps/s | 163.6712 KOps/s | |
test_stacked_getitemleaf | 55.0010μs | 6.2938μs | 158.8874 KOps/s | 156.7052 KOps/s | |
test_stacked_getitem | 35.2610μs | 5.9920μs | 166.8903 KOps/s | 166.4132 KOps/s | |
test_lock_nested | 2.0289ms | 0.3381ms | 2.9576 KOps/s | 3.0154 KOps/s | |
test_lock_stack_nested | 0.3820ms | 0.3407ms | 2.9352 KOps/s | 2.9437 KOps/s | |
test_unlock_nested | 0.3499ms | 0.2814ms | 3.5535 KOps/s | 3.6552 KOps/s | |
test_unlock_stack_nested | 0.3415ms | 0.2795ms | 3.5782 KOps/s | 3.6326 KOps/s | |
test_flatten_speed | 0.1160ms | 76.9065μs | 13.0028 KOps/s | 13.0102 KOps/s | |
test_unflatten_speed | 0.3847ms | 0.3191ms | 3.1335 KOps/s | 3.1521 KOps/s | |
test_common_ops | 0.8062ms | 0.6349ms | 1.5751 KOps/s | 1.7143 KOps/s | |
test_creation | 0.1391ms | 1.6975μs | 589.0920 KOps/s | 586.7819 KOps/s | |
test_creation_empty | 32.1200μs | 9.6446μs | 103.6852 KOps/s | 132.8862 KOps/s | |
test_creation_nested_1 | 41.9610μs | 11.3829μs | 87.8513 KOps/s | 110.5945 KOps/s | |
test_creation_nested_2 | 40.2010μs | 13.9880μs | 71.4896 KOps/s | 84.2459 KOps/s | |
test_clone | 56.7310μs | 10.9061μs | 91.6918 KOps/s | 93.3755 KOps/s | |
test_getitem[int] | 1.2683ms | 10.5391μs | 94.8848 KOps/s | 94.9325 KOps/s | |
test_getitem[slice_int] | 0.1194ms | 20.6078μs | 48.5253 KOps/s | 49.6242 KOps/s | |
test_getitem[range] | 0.1281ms | 35.9271μs | 27.8341 KOps/s | 28.1202 KOps/s | |
test_getitem[tuple] | 0.1113ms | 17.7557μs | 56.3199 KOps/s | 56.3065 KOps/s | |
test_getitem[list] | 0.1256ms | 32.0520μs | 31.1993 KOps/s | 31.5324 KOps/s | |
test_setitem_dim[int] | 38.7900μs | 19.1385μs | 52.2506 KOps/s | 53.0483 KOps/s | |
test_setitem_dim[slice_int] | 57.8910μs | 36.6035μs | 27.3198 KOps/s | 26.8280 KOps/s | |
test_setitem_dim[range] | 73.6110μs | 50.7867μs | 19.6902 KOps/s | 19.4834 KOps/s | |
test_setitem_dim[tuple] | 53.9710μs | 31.9656μs | 31.2837 KOps/s | 31.7877 KOps/s | |
test_setitem | 69.8120μs | 16.1835μs | 61.7915 KOps/s | 68.5281 KOps/s | |
test_set | 70.6110μs | 15.4804μs | 64.5976 KOps/s | 71.5481 KOps/s | |
test_set_shared | 0.5099ms | 0.1570ms | 6.3695 KOps/s | 6.4408 KOps/s | |
test_update | 0.2337ms | 19.5079μs | 51.2612 KOps/s | 59.5093 KOps/s | |
test_update_nested | 79.9120μs | 25.6475μs | 38.9901 KOps/s | 45.2673 KOps/s | |
test_update__nested | 0.4973ms | 24.9689μs | 40.0498 KOps/s | 39.6712 KOps/s | |
test_set_nested | 71.6810μs | 17.1776μs | 58.2154 KOps/s | 64.2209 KOps/s | |
test_set_nested_new | 78.8920μs | 19.3755μs | 51.6117 KOps/s | 55.7376 KOps/s | |
test_select | 90.7120μs | 31.5851μs | 31.6605 KOps/s | 34.7791 KOps/s | |
test_select_nested | 72.1510μs | 43.4839μs | 22.9970 KOps/s | 22.8528 KOps/s | |
test_exclude_nested | 92.0320μs | 61.5167μs | 16.2558 KOps/s | 16.4224 KOps/s | |
test_empty[True] | 0.4003ms | 0.2915ms | 3.4309 KOps/s | 3.4357 KOps/s | |
test_empty[False] | 4.1220μs | 0.8192μs | 1.2207 MOps/s | 1.2053 MOps/s | |
test_to | 88.0510μs | 55.7634μs | 17.9329 KOps/s | 17.7252 KOps/s | |
test_to_nonblocking | 95.1720μs | 47.0498μs | 21.2541 KOps/s | 21.1172 KOps/s | |
test_unbind_speed | 0.2745ms | 0.2394ms | 4.1770 KOps/s | 4.2636 KOps/s | |
test_unbind_speed_stack0 | 0.2832ms | 0.2341ms | 4.2711 KOps/s | 4.3028 KOps/s | |
test_unbind_speed_stack1 | 92.3942ms | 0.7341ms | 1.3622 KOps/s | 1.3652 KOps/s | |
test_split | 94.8370ms | 1.5754ms | 634.7607 Ops/s | 642.3276 Ops/s | |
test_chunk | 94.2265ms | 1.5734ms | 635.5625 Ops/s | 638.8252 Ops/s | |
test_consolidate[False-None] | 96.3946ms | 2.9234ms | 342.0616 Ops/s | 344.7759 Ops/s | |
test_consolidate[default-None] | 1.7658ms | 1.6923ms | 590.9120 Ops/s | 607.8961 Ops/s | |
test_consolidate[reduce-overhead-None] | 1.7539ms | 1.6963ms | 589.5296 Ops/s | 589.6191 Ops/s | |
test_consolidate_njt[False-None] | 6.7002ms | 6.3927ms | 156.4295 Ops/s | 156.2226 Ops/s | |
test_to[False-False-None] | 1.8348ms | 1.7459ms | 572.7753 Ops/s | 582.5018 Ops/s | |
test_to[True-False-None] | 1.5569ms | 1.3294ms | 752.1968 Ops/s | 774.7796 Ops/s | |
test_to[within-False-None] | 4.3408ms | 4.1261ms | 242.3574 Ops/s | 247.5707 Ops/s | |
test_to[True-default-None] | 5.7207ms | 5.2966ms | 188.7991 Ops/s | 193.2427 Ops/s | |
test_to_njt[False-False-None] | 6.9073ms | 6.7709ms | 147.6906 Ops/s | 142.1001 Ops/s | |
test_to_njt[True-False-None] | 5.5416ms | 5.3551ms | 186.7367 Ops/s | 177.3835 Ops/s | |
test_to_njt[within-False-None] | 12.0294ms | 11.5749ms | 86.3936 Ops/s | 83.9690 Ops/s | |
test_creation[device0] | 0.5483ms | 82.6365μs | 12.1012 KOps/s | 12.1267 KOps/s | |
test_creation_from_tensor | 0.5075ms | 84.7327μs | 11.8018 KOps/s | 11.5977 KOps/s | |
test_add_one[memmap_tensor0] | 0.4423ms | 6.7959μs | 147.1475 KOps/s | 150.1190 KOps/s | |
test_contiguous[memmap_tensor0] | 2.2291μs | 0.4634μs | 2.1578 MOps/s | 2.4155 MOps/s | |
test_stack[memmap_tensor0] | 43.3610μs | 4.2608μs | 234.6952 KOps/s | 236.5954 KOps/s | |
test_memmaptd_index | 1.5224ms | 0.2402ms | 4.1636 KOps/s | 4.2226 KOps/s | |
test_memmaptd_index_astensor | 0.4317ms | 0.3027ms | 3.3032 KOps/s | 3.3801 KOps/s | |
test_memmaptd_index_op | 0.7168ms | 0.5865ms | 1.7050 KOps/s | 1.8139 KOps/s | |
test_serialize_model | 0.1315s | 0.1307s | 7.6489 Ops/s | 7.6291 Ops/s | |
test_serialize_model_pickle | 1.3508s | 1.2113s | 0.8256 Ops/s | 0.8441 Ops/s | |
test_serialize_weights | 0.2818s | 0.1516s | 6.5953 Ops/s | 7.6659 Ops/s | |
test_serialize_weights_returnearly | 0.3313s | 53.4598ms | 18.7056 Ops/s | 11.7420 Ops/s | |
test_serialize_weights_pickle | 1.3834s | 1.1916s | 0.8392 Ops/s | 0.8219 Ops/s | |
test_reshape_pytree | 68.9110μs | 23.3210μs | 42.8797 KOps/s | 45.8770 KOps/s | |
test_reshape_td | 62.3310μs | 29.5097μs | 33.8872 KOps/s | 38.1605 KOps/s | |
test_view_pytree | 57.2510μs | 22.9984μs | 43.4813 KOps/s | 46.9702 KOps/s | |
test_view_td | 69.9210μs | 34.8409μs | 28.7019 KOps/s | 30.5299 KOps/s | |
test_unbind_pytree | 60.4510μs | 29.3571μs | 34.0634 KOps/s | 36.1095 KOps/s | |
test_unbind_td | 0.6892ms | 37.2687μs | 26.8321 KOps/s | 27.1219 KOps/s | |
test_split_pytree | 64.9910μs | 32.1113μs | 31.1416 KOps/s | 33.6919 KOps/s | |
test_split_td | 0.8228ms | 37.2050μs | 26.8781 KOps/s | 25.7887 KOps/s | |
test_add_pytree | 73.0420μs | 34.7075μs | 28.8122 KOps/s | 29.6139 KOps/s | |
test_add_td | 0.1918ms | 53.6110μs | 18.6529 KOps/s | 21.3055 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.1703ms | 0.1187ms | 8.4254 KOps/s | 8.5080 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.2225ms | 0.1320ms | 7.5754 KOps/s | 7.5613 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1520ms | 92.2934μs | 10.8350 KOps/s | 10.8608 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.3263ms | 0.1460ms | 6.8486 KOps/s | 6.7671 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 69.3610μs | 31.9321μs | 31.3165 KOps/s | 34.7168 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 69.8320μs | 29.2330μs | 34.2079 KOps/s | 34.1776 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.4484ms | 62.6460μs | 15.9627 KOps/s | 15.7978 KOps/s | |
test_compile_copy_nested[pytree-eager] | 94.5720μs | 48.2943μs | 20.7064 KOps/s | 20.4458 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.1757ms | 0.1369ms | 7.3069 KOps/s | 7.2967 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.3071ms | 0.2108ms | 4.7427 KOps/s | 4.6746 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1442ms | 98.2315μs | 10.1800 KOps/s | 10.6656 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.1185ms | 55.5520μs | 18.0011 KOps/s | 18.5298 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.1737ms | 0.1313ms | 7.6171 KOps/s | 7.5662 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.5273ms | 0.4745ms | 2.1077 KOps/s | 2.1031 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.3987ms | 0.2582ms | 3.8732 KOps/s | 3.8748 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.1855ms | 0.1419ms | 7.0494 KOps/s | 7.2448 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1652ms | 68.9438μs | 14.5046 KOps/s | 15.1002 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.1402ms | 99.7239μs | 10.0277 KOps/s | 10.5395 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.4503ms | 0.4014ms | 2.4912 KOps/s | 2.4656 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.1683ms | 0.1313ms | 7.6133 KOps/s | 7.5945 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 47.2510μs | 18.5092μs | 54.0273 KOps/s | 39.4582 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 59.5110μs | 30.9793μs | 32.2796 KOps/s | 31.4080 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1038ms | 68.5864μs | 14.5801 KOps/s | 14.7650 KOps/s | |
test_compile_copy_flat[pytree-eager] | 85.8710μs | 51.7739μs | 19.3148 KOps/s | 19.4673 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 1.5674ms | 0.3816ms | 2.6202 KOps/s | 2.2026 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 2.7875ms | 2.6284ms | 380.4652 Ops/s | 369.4972 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 1.5609ms | 0.3753ms | 2.6646 KOps/s | 2.3458 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 2.9763ms | 2.7269ms | 366.7214 Ops/s | 383.4566 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.1895ms | 0.1183ms | 8.4547 KOps/s | 9.0064 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.5606ms | 83.8706μs | 11.9231 KOps/s | 12.0134 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 0.1814ms | 0.1156ms | 8.6529 KOps/s | 9.6852 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 0.1247ms | 72.0712μs | 13.8752 KOps/s | 14.8284 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 0.1632ms | 0.1141ms | 8.7671 KOps/s | 9.6096 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 0.1160ms | 71.6943μs | 13.9481 KOps/s | 14.9899 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1427ms | 97.7247μs | 10.2328 KOps/s | 10.2430 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.1581ms | 17.0170μs | 58.7647 KOps/s | 58.5538 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.1397ms | 93.9723μs | 10.6414 KOps/s | 10.6980 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 0.1174ms | 19.7042μs | 50.7507 KOps/s | 65.2124 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.1559ms | 97.4108μs | 10.2658 KOps/s | 10.6556 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 53.2710μs | 16.3605μs | 61.1230 KOps/s | 65.0199 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1533ms | 0.1052ms | 9.5043 KOps/s | 9.8452 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.5854ms | 17.9651μs | 55.6635 KOps/s | 60.1399 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.1519ms | 97.8329μs | 10.2215 KOps/s | 10.6591 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 52.4810μs | 16.4781μs | 60.6868 KOps/s | 64.7656 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1574ms | 99.1330μs | 10.0875 KOps/s | 10.6717 KOps/s | |
test_compile_indexing[int-pytree-eager] | 50.7510μs | 16.0335μs | 62.3696 KOps/s | 64.9530 KOps/s | |
test_mod_add[eager] | 80.3410μs | 38.8662μs | 25.7293 KOps/s | 26.6322 KOps/s | |
test_mod_add[compile] | 0.1240ms | 78.8060μs | 12.6894 KOps/s | 12.5547 KOps/s | |
test_mod_add[compile-overhead] | 0.3139ms | 0.1635ms | 6.1168 KOps/s | 5.7059 KOps/s | |
test_mod_wrap[eager] | 0.3205ms | 0.2430ms | 4.1158 KOps/s | 3.8509 KOps/s | |
test_mod_wrap[compile] | 0.3263ms | 0.2785ms | 3.5908 KOps/s | 3.3647 KOps/s | |
test_mod_wrap[compile-overhead] | 7.4311ms | 3.8903ms | 257.0501 Ops/s | 262.1870 Ops/s | |
test_mod_wrap_and_backward[eager] | 1.5473ms | 1.4289ms | 699.8575 Ops/s | 684.8577 Ops/s | |
test_mod_wrap_and_backward[compile] | 1.4425ms | 1.3386ms | 747.0351 Ops/s | 737.9138 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 1.4893ms | 1.0128ms | 987.3971 Ops/s | 964.9581 Ops/s | |
test_seq_add[eager] | 0.1752ms | 0.1165ms | 8.5872 KOps/s | 8.4315 KOps/s | |
test_seq_add[compile] | 0.1460ms | 87.5439μs | 11.4228 KOps/s | 10.7457 KOps/s | |
test_seq_add[compile-overhead] | 0.1838ms | 0.1278ms | 7.8274 KOps/s | 7.6610 KOps/s | |
test_seq_wrap[eager] | 0.4839ms | 0.4203ms | 2.3794 KOps/s | 2.2776 KOps/s | |
test_seq_wrap[compile] | 0.3611ms | 0.2967ms | 3.3701 KOps/s | 3.1571 KOps/s | |
test_seq_wrap[compile-overhead] | 0.2807ms | 0.2281ms | 4.3846 KOps/s | 4.3722 KOps/s | |
test_func_call_runtime[False-eager] | 0.8448ms | 0.7550ms | 1.3245 KOps/s | 1.2584 KOps/s | |
test_func_call_runtime[False-compile] | 0.9176ms | 0.7278ms | 1.3740 KOps/s | 1.3593 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.4090ms | 0.3560ms | 2.8093 KOps/s | 2.8177 KOps/s | |
test_func_call_runtime[True-eager] | 0.9931ms | 0.8850ms | 1.1300 KOps/s | 1.1005 KOps/s | |
test_func_call_runtime[True-compile] | 0.8713ms | 0.7868ms | 1.2710 KOps/s | 1.3216 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.4424ms | 0.3745ms | 2.6703 KOps/s | 2.6910 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.7654ms | 0.7193ms | 1.3902 KOps/s | 1.3650 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.8468ms | 0.7411ms | 1.3493 KOps/s | 1.3580 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.4407ms | 0.3562ms | 2.8072 KOps/s | 2.8273 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.0604ms | 0.9852ms | 1.0150 KOps/s | 993.9234 Ops/s | |
test_func_call_cm_runtime[True-compile] | 1.0960ms | 0.9755ms | 1.0251 KOps/s | 1.0114 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 1.0956ms | 0.9716ms | 1.0292 KOps/s | 1.0045 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.4944ms | 2.0726ms | 482.4938 Ops/s | 470.8302 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 0.8980ms | 0.7983ms | 1.2527 KOps/s | 1.2624 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.5548ms | 0.4085ms | 2.4480 KOps/s | 2.4542 KOps/s | |
test_distributed | 2.8786ms | 0.1872ms | 5.3427 KOps/s | 8.2695 KOps/s | |
test_tdmodule | 30.9610μs | 21.3521μs | 46.8338 KOps/s | 49.5152 KOps/s | |
test_tdmodule_dispatch | 59.9610μs | 37.5030μs | 26.6645 KOps/s | 28.1643 KOps/s | |
test_tdseq | 41.0510μs | 21.1444μs | 47.2939 KOps/s | 49.2647 KOps/s | |
test_tdseq_dispatch | 70.9610μs | 40.2655μs | 24.8352 KOps/s | 26.3837 KOps/s | |
test_instantiation_functorch | 1.6541ms | 1.5304ms | 653.4300 Ops/s | 658.6646 Ops/s | |
test_exec_functorch | 0.1923ms | 0.1426ms | 7.0103 KOps/s | 7.0900 KOps/s | |
test_exec_functional_call | 0.2137ms | 0.1358ms | 7.3625 KOps/s | 7.4880 KOps/s | |
test_exec_td_decorator | 0.3752ms | 0.1867ms | 5.3553 KOps/s | 5.4679 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 0.7476ms | 0.6818ms | 1.4667 KOps/s | 1.4677 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.8162ms | 0.6811ms | 1.4683 KOps/s | 1.4726 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.7322ms | 0.5883ms | 1.6998 KOps/s | 1.6920 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.7053ms | 0.5906ms | 1.6932 KOps/s | 1.6894 KOps/s | |
test_vmap_transformer_speed_decorator[True-True] | 19.1390ms | 19.0740ms | 52.4273 Ops/s | 52.1690 Ops/s | |
test_vmap_transformer_speed_decorator[True-False] | 19.9766ms | 19.1738ms | 52.1544 Ops/s | 52.1292 Ops/s | |
test_vmap_transformer_speed_decorator[False-True] | 19.0351ms | 18.9221ms | 52.8483 Ops/s | 52.4749 Ops/s | |
test_vmap_transformer_speed_decorator[False-False] | 19.7953ms | 18.9652ms | 52.7281 Ops/s | 52.5432 Ops/s | |
test_to_module_speed[True] | 1.0605ms | 0.9646ms | 1.0367 KOps/s | 1.0457 KOps/s | |
test_to_module_speed[False] | 1.2834ms | 0.9551ms | 1.0470 KOps/s | 1.0591 KOps/s | |
test_tc_init | 62.1910μs | 35.7775μs | 27.9505 KOps/s | 28.1726 KOps/s | |
test_tc_init_nested | 0.1563ms | 72.2190μs | 13.8468 KOps/s | 14.3370 KOps/s | |
test_tc_first_layer_tensor | 3.8643μs | 0.7071μs | 1.4142 MOps/s | 1.2606 MOps/s | |
test_tc_first_layer_nontensor | 39.5700μs | 2.2217μs | 450.1067 KOps/s | 450.6666 KOps/s | |
test_tc_second_layer_tensor | 18.8753μs | 1.4151μs | 706.6870 KOps/s | 706.4613 KOps/s | |
test_tc_second_layer_nontensor | 0.2034ms | 2.9569μs | 338.1888 KOps/s | 339.4571 KOps/s | |
test_unbind | 0.2180s | 10.0211ms | 99.7894 Ops/s | 145.6810 Ops/s | |
test_full_like | 10.1893ms | 9.0884ms | 110.0308 Ops/s | 108.7642 Ops/s | |
test_zeros_like | 11.5338ms | 8.5979ms | 116.3079 Ops/s | 234.6017 Ops/s | |
test_ones_like | 5.1121ms | 4.3269ms | 231.1121 Ops/s | 235.2688 Ops/s | |
test_clone | 6.6580ms | 6.3468ms | 157.5607 Ops/s | 110.0591 Ops/s | |
test_squeeze | 58.0920μs | 9.9025μs | 100.9844 KOps/s | 105.6169 KOps/s | |
test_unsqueeze | 0.1691ms | 72.2314μs | 13.8444 KOps/s | 13.5176 KOps/s | |
test_split | 0.3738ms | 0.1573ms | 6.3584 KOps/s | 6.2375 KOps/s | |
test_permute | 0.2324ms | 0.1832ms | 5.4596 KOps/s | 5.4333 KOps/s | |
test_stack | 50.2610ms | 50.0133ms | 19.9947 Ops/s | 19.9679 Ops/s | |
test_cat | 50.4678ms | 49.9448ms | 20.0221 Ops/s | 20.0147 Ops/s |
vmoens added a commit that referenced this pull request Feb 26, 2025
ghstack-source-id: 8e47f46e83982d554237604f6ef7c845eeed1b50 Pull Request resolved: #1236
Labels: CLA Signed
Stack from ghstack (oldest at bottom):