[Feature] Update batch-size #1237
Merged
vmoens added a commit that referenced this pull request on Feb 25, 2025
ghstack-source-id: 220610f880d3b6198ac8620f62b6c1881f83aeef
Pull Request resolved: #1237
vmoens added a commit that referenced this pull request on Feb 25, 2025
ghstack-source-id: 4a0f8b1256739eb0a475b45214f94365453ece7e
Pull Request resolved: #1237
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 49.7820μs | 21.1024μs | 47.3880 KOps/s | 47.9420 KOps/s | |
test_plain_set_stack_nested | 43.2810μs | 20.9606μs | 47.7085 KOps/s | 47.7913 KOps/s | |
test_plain_set_nested_inplace | 69.8700μs | 22.3650μs | 44.7127 KOps/s | 44.1670 KOps/s | |
test_plain_set_stack_nested_inplace | 72.2340μs | 22.3666μs | 44.7095 KOps/s | 43.8952 KOps/s | |
test_items | 18.9950μs | 4.2012μs | 238.0286 KOps/s | 239.3096 KOps/s | |
test_items_nested | 0.6712ms | 0.4075ms | 2.4541 KOps/s | 2.4631 KOps/s | |
test_items_nested_locked | 0.5519ms | 0.4108ms | 2.4343 KOps/s | 2.4379 KOps/s | |
test_items_nested_leaf | 0.1446ms | 77.3147μs | 12.9342 KOps/s | 12.7832 KOps/s | |
test_items_stack_nested | 0.6195ms | 0.4099ms | 2.4398 KOps/s | 2.4182 KOps/s | |
test_items_stack_nested_leaf | 0.1516ms | 77.8964μs | 12.8376 KOps/s | 12.8970 KOps/s | |
test_items_stack_nested_locked | 0.8394ms | 0.4055ms | 2.4663 KOps/s | 2.4404 KOps/s | |
test_keys | 29.1140μs | 3.4323μs | 291.3457 KOps/s | 284.1525 KOps/s | |
test_keys_nested | 0.2736ms | 0.1630ms | 6.1367 KOps/s | 6.0211 KOps/s | |
test_keys_nested_locked | 1.7240ms | 0.1687ms | 5.9273 KOps/s | 5.8130 KOps/s | |
test_keys_nested_leaf | 0.2584ms | 0.1422ms | 7.0317 KOps/s | 6.9378 KOps/s | |
test_keys_stack_nested | 0.3953ms | 0.1632ms | 6.1260 KOps/s | 5.9100 KOps/s | |
test_keys_stack_nested_leaf | 0.2493ms | 0.1429ms | 6.9971 KOps/s | 6.9314 KOps/s | |
test_keys_stack_nested_locked | 0.3398ms | 0.1682ms | 5.9439 KOps/s | 5.7796 KOps/s | |
test_values | 6.1354μs | 1.0425μs | 959.2782 KOps/s | 974.2341 KOps/s | |
test_values_nested | 0.1230ms | 62.2033μs | 16.0763 KOps/s | 16.0168 KOps/s | |
test_values_nested_locked | 0.1159ms | 62.4589μs | 16.0105 KOps/s | 15.9607 KOps/s | |
test_values_nested_leaf | 0.1548ms | 74.3969μs | 13.4414 KOps/s | 13.9714 KOps/s | |
test_values_stack_nested | 0.1227ms | 61.9749μs | 16.1356 KOps/s | 15.8839 KOps/s | |
test_values_stack_nested_leaf | 0.1640ms | 71.2866μs | 14.0279 KOps/s | 13.9039 KOps/s | |
test_values_stack_nested_locked | 0.1431ms | 63.0913μs | 15.8500 KOps/s | 15.9068 KOps/s | |
test_membership | 2.6820μs | 0.6875μs | 1.4546 MOps/s | 1.1431 MOps/s | |
test_membership_nested | 30.1760μs | 2.9042μs | 344.3339 KOps/s | 345.6056 KOps/s | |
test_membership_nested_leaf | 43.3300μs | 2.9331μs | 340.9419 KOps/s | 344.0575 KOps/s | |
test_membership_stacked_nested | 26.1790μs | 2.9007μs | 344.7423 KOps/s | 347.4212 KOps/s | |
test_membership_stacked_nested_leaf | 26.5190μs | 2.8828μs | 346.8815 KOps/s | 346.0047 KOps/s | |
test_membership_nested_last | 25.9380μs | 4.2975μs | 232.6910 KOps/s | 227.8457 KOps/s | |
test_membership_nested_leaf_last | 0.1344ms | 4.4066μs | 226.9319 KOps/s | 228.6534 KOps/s | |
test_membership_stacked_nested_last | 0.1825ms | 4.5022μs | 222.1122 KOps/s | 229.9165 KOps/s | |
test_membership_stacked_nested_leaf_last | 44.4820μs | 4.3353μs | 230.6663 KOps/s | 231.6057 KOps/s | |
test_nested_getleaf | 46.2360μs | 10.5248μs | 95.0134 KOps/s | 95.7564 KOps/s | |
test_nested_get | 50.3830μs | 10.2043μs | 97.9982 KOps/s | 100.1947 KOps/s | |
test_stacked_getleaf | 50.3540μs | 10.5817μs | 94.5032 KOps/s | 94.6996 KOps/s | |
test_stacked_get | 35.7260μs | 10.3299μs | 96.8066 KOps/s | 99.5334 KOps/s | |
test_nested_getitemleaf | 54.6320μs | 11.3064μs | 88.4453 KOps/s | 89.6599 KOps/s | |
test_nested_getitem | 0.3387ms | 10.7079μs | 93.3891 KOps/s | 92.5268 KOps/s | |
test_stacked_getitemleaf | 29.7950μs | 11.1584μs | 89.6188 KOps/s | 90.4044 KOps/s | |
test_stacked_getitem | 55.0220μs | 10.7850μs | 92.7215 KOps/s | 93.6894 KOps/s | |
test_lock_nested | 0.7488ms | 0.4067ms | 2.4586 KOps/s | 2.4518 KOps/s | |
test_lock_stack_nested | 0.5619ms | 0.4211ms | 2.3747 KOps/s | 2.3504 KOps/s | |
test_unlock_nested | 0.7339ms | 0.3325ms | 3.0079 KOps/s | 2.9764 KOps/s | |
test_unlock_stack_nested | 0.6184ms | 0.3393ms | 2.9468 KOps/s | 2.9292 KOps/s | |
test_flatten_speed | 0.1916ms | 99.6250μs | 10.0376 KOps/s | 9.9905 KOps/s | |
test_unflatten_speed | 1.0653ms | 0.5218ms | 1.9166 KOps/s | 1.9298 KOps/s | |
test_common_ops | 1.0136ms | 0.8113ms | 1.2326 KOps/s | 1.2670 KOps/s | |
test_creation | 40.9570μs | 2.5014μs | 399.7742 KOps/s | 407.4700 KOps/s | |
test_creation_empty | 36.7490μs | 12.3906μs | 80.7063 KOps/s | 81.7519 KOps/s | |
test_creation_nested_1 | 66.6150μs | 15.1277μs | 66.1037 KOps/s | 65.7514 KOps/s | |
test_creation_nested_2 | 55.1230μs | 19.5597μs | 51.1255 KOps/s | 51.0131 KOps/s | |
test_clone | 66.6240μs | 13.7512μs | 72.7209 KOps/s | 76.8391 KOps/s | |
test_getitem[int] | 0.7811ms | 12.7013μs | 78.7320 KOps/s | 79.9705 KOps/s | |
test_getitem[slice_int] | 0.1274ms | 24.5975μs | 40.6545 KOps/s | 41.4608 KOps/s | |
test_getitem[range] | 0.1652ms | 51.5781μs | 19.3881 KOps/s | 19.6270 KOps/s | |
test_getitem[tuple] | 0.1227ms | 20.4825μs | 48.8223 KOps/s | 50.3137 KOps/s | |
test_getitem[list] | 0.1531ms | 45.9530μs | 21.7614 KOps/s | 21.9084 KOps/s | |
test_setitem_dim[int] | 44.9030μs | 25.1302μs | 39.7928 KOps/s | 38.1304 KOps/s | |
test_setitem_dim[slice_int] | 94.3460μs | 51.7275μs | 19.3321 KOps/s | 19.1565 KOps/s | |
test_setitem_dim[range] | 0.1303ms | 77.7197μs | 12.8667 KOps/s | 12.7761 KOps/s | |
test_setitem_dim[tuple] | 72.2750μs | 41.0122μs | 24.3830 KOps/s | 23.8304 KOps/s | |
test_setitem | 86.2700μs | 21.5248μs | 46.4581 KOps/s | 48.2153 KOps/s | |
test_set | 83.5160μs | 21.1119μs | 47.3667 KOps/s | 49.3455 KOps/s | |
test_set_shared | 4.1846ms | 0.1820ms | 5.4954 KOps/s | 5.4243 KOps/s | |
test_update | 0.1192ms | 26.6314μs | 37.5497 KOps/s | 42.6315 KOps/s | |
test_update_nested | 0.1198ms | 42.4217μs | 23.5728 KOps/s | 29.5986 KOps/s | |
test_update__nested | 0.4427ms | 34.1743μs | 29.2618 KOps/s | 29.8542 KOps/s | |
test_set_nested | 80.8000μs | 23.1744μs | 43.1511 KOps/s | 45.1416 KOps/s | |
test_set_nested_new | 84.2860μs | 27.1999μs | 36.7648 KOps/s | 37.0853 KOps/s | |
test_select | 97.6010μs | 44.2706μs | 22.5884 KOps/s | 22.9895 KOps/s | |
test_select_nested | 0.1231ms | 63.0972μs | 15.8486 KOps/s | 15.8852 KOps/s | |
test_exclude_nested | 0.1483ms | 80.5668μs | 12.4121 KOps/s | 12.3482 KOps/s | |
test_empty[True] | 0.5177ms | 0.4065ms | 2.4602 KOps/s | 2.4321 KOps/s | |
test_empty[False] | 10.6873μs | 1.3743μs | 727.6631 KOps/s | 721.7556 KOps/s | |
test_unbind_speed | 0.6401ms | 0.2663ms | 3.7546 KOps/s | 3.7141 KOps/s | |
test_unbind_speed_stack0 | 0.4145ms | 0.2652ms | 3.7700 KOps/s | 3.7193 KOps/s | |
test_unbind_speed_stack1 | 0.1098s | 0.7391ms | 1.3530 KOps/s | 1.2113 KOps/s | |
test_split | 0.1022s | 1.7610ms | 567.8632 Ops/s | 573.4723 Ops/s | |
test_chunk | 97.8720ms | 1.7370ms | 575.6976 Ops/s | 630.4284 Ops/s | |
test_consolidate_njt[False-None] | 11.5692ms | 8.2663ms | 120.9735 Ops/s | 105.6233 Ops/s | |
test_creation[device0] | 0.2161ms | 91.4440μs | 10.9357 KOps/s | 10.8518 KOps/s | |
test_creation_from_tensor | 4.1434ms | 96.1458μs | 10.4009 KOps/s | 10.3885 KOps/s | |
test_add_one[memmap_tensor0] | 78.8670μs | 5.1592μs | 193.8274 KOps/s | 207.9121 KOps/s | |
test_contiguous[memmap_tensor0] | 24.8360μs | 0.5044μs | 1.9826 MOps/s | 1.9485 MOps/s | |
test_stack[memmap_tensor0] | 29.0640μs | 3.4435μs | 290.4032 KOps/s | 301.9203 KOps/s | |
test_memmaptd_index | 1.1775ms | 0.2290ms | 4.3677 KOps/s | 4.2535 KOps/s | |
test_memmaptd_index_astensor | 0.6761ms | 0.3149ms | 3.1757 KOps/s | 3.1276 KOps/s | |
test_memmaptd_index_op | 0.8074ms | 0.5905ms | 1.6935 KOps/s | 1.6712 KOps/s | |
test_serialize_model | 0.2248s | 0.1309s | 7.6385 Ops/s | 8.6029 Ops/s | |
test_serialize_model_pickle | 0.4592s | 0.3947s | 2.5335 Ops/s | 2.4571 Ops/s | |
test_serialize_weights | 0.1273s | 0.1182s | 8.4602 Ops/s | 8.8244 Ops/s | |
test_serialize_weights_returnearly | 0.1708s | 0.1582s | 6.3201 Ops/s | 5.8005 Ops/s | |
test_serialize_weights_pickle | 0.5983s | 0.4303s | 2.3239 Ops/s | 2.4799 Ops/s | |
test_serialize_weights_filesystem | 0.2413s | 0.1549s | 6.4543 Ops/s | 7.0182 Ops/s | |
test_serialize_model_filesystem | 0.1537s | 0.1434s | 6.9749 Ops/s | 6.5852 Ops/s | |
test_reshape_pytree | 74.4990μs | 25.9619μs | 38.5179 KOps/s | 38.0025 KOps/s | |
test_reshape_td | 76.4720μs | 32.1307μs | 31.1229 KOps/s | 30.8585 KOps/s | |
test_view_pytree | 73.1060μs | 25.8564μs | 38.6751 KOps/s | 38.3810 KOps/s | |
test_view_td | 0.1104ms | 39.1544μs | 25.5399 KOps/s | 24.8938 KOps/s | |
test_unbind_pytree | 78.0850μs | 29.2066μs | 34.2389 KOps/s | 33.8340 KOps/s | |
test_unbind_td | 0.3124ms | 40.1071μs | 24.9332 KOps/s | 25.3711 KOps/s | |
test_split_pytree | 67.8070μs | 28.5693μs | 35.0026 KOps/s | 34.5958 KOps/s | |
test_split_td | 0.4875ms | 45.3923μs | 22.0302 KOps/s | 22.3296 KOps/s | |
test_add_pytree | 91.4510μs | 36.1981μs | 27.6258 KOps/s | 28.4486 KOps/s | |
test_add_td | 0.1328ms | 55.7950μs | 17.9228 KOps/s | 17.6344 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.1446ms | 68.7787μs | 14.5394 KOps/s | 14.9655 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.3470ms | 0.1710ms | 5.8489 KOps/s | 5.7159 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.2715ms | 46.7841μs | 21.3748 KOps/s | 22.1832 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.2197ms | 0.1194ms | 8.3778 KOps/s | 8.5217 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 88.2240μs | 28.7654μs | 34.7640 KOps/s | 35.8270 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.1414ms | 58.8423μs | 16.9946 KOps/s | 17.1753 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.1722ms | 78.6327μs | 12.7174 KOps/s | 12.4111 KOps/s | |
test_compile_copy_nested[pytree-eager] | 0.1369ms | 66.1390μs | 15.1197 KOps/s | 14.7569 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.2294ms | 0.1093ms | 9.1502 KOps/s | 9.4173 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.4204ms | 0.2162ms | 4.6254 KOps/s | 4.6533 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.3343ms | 47.8475μs | 20.8997 KOps/s | 21.7682 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.1662ms | 66.8820μs | 14.9517 KOps/s | 14.9889 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.1857ms | 0.1028ms | 9.7294 KOps/s | 10.0633 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.3912ms | 0.2068ms | 4.8346 KOps/s | 5.0068 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.4257ms | 0.2373ms | 4.2143 KOps/s | 4.3302 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.2252ms | 0.1082ms | 9.2448 KOps/s | 9.1936 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1459ms | 62.3616μs | 16.0355 KOps/s | 15.9978 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.1011ms | 48.4373μs | 20.6452 KOps/s | 20.4983 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.2891ms | 0.1604ms | 6.2351 KOps/s | 6.3918 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.1959ms | 0.1026ms | 9.7435 KOps/s | 9.9457 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 55.1030μs | 22.3457μs | 44.7513 KOps/s | 46.5112 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 0.1404ms | 66.5311μs | 15.0306 KOps/s | 15.0057 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1878ms | 82.9481μs | 12.0557 KOps/s | 12.2811 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.1393ms | 65.9737μs | 15.1576 KOps/s | 14.6207 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 0.3187ms | 0.2182ms | 4.5830 KOps/s | 4.6459 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 1.6873ms | 1.3774ms | 726.0146 Ops/s | 728.4557 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 0.3320ms | 0.2142ms | 4.6696 KOps/s | 4.7609 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 1.3211ms | 0.8366ms | 1.1953 KOps/s | 1.2317 KOps/s | |
test_compile_assign_and_add_stack[compile] | 0.8714ms | 0.4690ms | 2.1324 KOps/s | 2.1796 KOps/s | |
test_compile_assign_and_add_stack[eager] | 4.0772ms | 2.7074ms | 369.3591 Ops/s | 359.5748 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.1087ms | 38.6174μs | 25.8951 KOps/s | 25.9378 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.5530ms | 34.4664μs | 29.0138 KOps/s | 31.0271 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 0.1392ms | 31.7279μs | 31.5180 KOps/s | 32.9233 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 70.7420μs | 23.2479μs | 43.0146 KOps/s | 43.4463 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 0.1043ms | 31.9132μs | 31.3350 KOps/s | 32.0554 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 0.3292ms | 23.3388μs | 42.8472 KOps/s | 44.2338 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 99.3240μs | 52.7821μs | 18.9458 KOps/s | 18.9132 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.3862ms | 20.0291μs | 49.9274 KOps/s | 50.3249 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 98.6830μs | 45.4663μs | 21.9943 KOps/s | 22.0454 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 90.6590μs | 18.4634μs | 54.1613 KOps/s | 54.2161 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 93.4740μs | 46.2581μs | 21.6178 KOps/s | 21.5428 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 60.6320μs | 18.1772μs | 55.0138 KOps/s | 53.8413 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1055ms | 54.9532μs | 18.1973 KOps/s | 18.4699 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 1.1555ms | 19.8432μs | 50.3952 KOps/s | 51.4346 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.1137ms | 46.8930μs | 21.3252 KOps/s | 21.6231 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 51.0950μs | 18.3217μs | 54.5799 KOps/s | 54.2102 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1159ms | 46.0445μs | 21.7181 KOps/s | 21.5868 KOps/s | |
test_compile_indexing[int-pytree-eager] | 57.5270μs | 18.4521μs | 54.1944 KOps/s | 54.5200 KOps/s | |
test_mod_add[eager] | 84.7770μs | 36.6059μs | 27.3180 KOps/s | 27.6588 KOps/s | |
test_mod_add[compile] | 0.1190ms | 64.2407μs | 15.5665 KOps/s | 15.1748 KOps/s | |
test_mod_add[compile-overhead] | 0.1162ms | 62.8950μs | 15.8995 KOps/s | 15.1266 KOps/s | |
test_mod_wrap[eager] | 0.4528ms | 0.2265ms | 4.4159 KOps/s | 4.5585 KOps/s | |
test_mod_wrap[compile] | 1.6325ms | 0.2265ms | 4.4145 KOps/s | 4.4063 KOps/s | |
test_mod_wrap[compile-overhead] | 0.3426ms | 0.2236ms | 4.4732 KOps/s | 4.4887 KOps/s | |
test_mod_wrap_and_backward[eager] | 15.6512ms | 12.6512ms | 79.0437 Ops/s | 77.0078 Ops/s | |
test_mod_wrap_and_backward[compile] | 13.6367ms | 11.2872ms | 88.5957 Ops/s | 87.4800 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 13.3952ms | 11.1602ms | 89.6044 Ops/s | 89.1992 Ops/s | |
test_seq_add[eager] | 0.2115ms | 0.1197ms | 8.3516 KOps/s | 8.2842 KOps/s | |
test_seq_add[compile] | 0.1568ms | 75.9560μs | 13.1655 KOps/s | 13.1172 KOps/s | |
test_seq_add[compile-overhead] | 0.1401ms | 74.0360μs | 13.5069 KOps/s | 13.4115 KOps/s | |
test_seq_wrap[eager] | 0.6542ms | 0.4504ms | 2.2201 KOps/s | 2.2194 KOps/s | |
test_seq_wrap[compile] | 0.3543ms | 0.2413ms | 4.1443 KOps/s | 4.1693 KOps/s | |
test_seq_wrap[compile-overhead] | 0.4408ms | 0.2382ms | 4.1983 KOps/s | 4.1917 KOps/s | |
test_func_call_runtime[False-eager] | 0.6751ms | 0.5468ms | 1.8289 KOps/s | 1.9087 KOps/s | |
test_func_call_runtime[False-compile] | 0.7304ms | 0.4429ms | 2.2578 KOps/s | 2.2841 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.5625ms | 0.4415ms | 2.2648 KOps/s | 2.2840 KOps/s | |
test_func_call_runtime[True-eager] | 0.9485ms | 0.7588ms | 1.3179 KOps/s | 1.3655 KOps/s | |
test_func_call_runtime[True-compile] | 0.5730ms | 0.4655ms | 2.1484 KOps/s | 2.1784 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.8364ms | 0.4697ms | 2.1291 KOps/s | 2.1780 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.9251ms | 0.5386ms | 1.8566 KOps/s | 1.8877 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.5464ms | 0.4442ms | 2.2512 KOps/s | 2.2684 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.9295ms | 0.4438ms | 2.2531 KOps/s | 2.2873 KOps/s | |
test_func_call_cm_runtime[True-eager] | 2.0291ms | 0.9052ms | 1.1048 KOps/s | 1.1424 KOps/s | |
test_func_call_cm_runtime[True-compile] | 1.2075ms | 0.8109ms | 1.2332 KOps/s | 1.2624 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 1.1359ms | 0.8110ms | 1.2331 KOps/s | 1.2571 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.4980ms | 1.9365ms | 516.3995 Ops/s | 521.5083 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 0.6639ms | 0.5416ms | 1.8463 KOps/s | 1.8740 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.9595ms | 0.5396ms | 1.8531 KOps/s | 1.8736 KOps/s | |
test_distributed | 0.2927ms | 0.1232ms | 8.1139 KOps/s | 7.8184 KOps/s | |
test_tdmodule | 85.1680μs | 28.4335μs | 35.1698 KOps/s | 35.0720 KOps/s | |
test_tdmodule_dispatch | 70.0800μs | 49.6598μs | 20.1370 KOps/s | 19.4398 KOps/s | |
test_tdseq | 48.6410μs | 29.4879μs | 33.9122 KOps/s | 33.3305 KOps/s | |
test_tdseq_dispatch | 0.1066ms | 56.3472μs | 17.7471 KOps/s | 17.7019 KOps/s | |
test_instantiation_functorch | 1.7848ms | 1.5173ms | 659.0657 Ops/s | 653.2233 Ops/s | |
test_exec_functorch | 0.4152ms | 0.1804ms | 5.5422 KOps/s | 5.5548 KOps/s | |
test_exec_functional_call | 0.3183ms | 0.1705ms | 5.8663 KOps/s | 5.8720 KOps/s | |
test_exec_td_decorator | 0.4792ms | 0.2320ms | 4.3109 KOps/s | 4.3650 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 1.0249ms | 0.6833ms | 1.4634 KOps/s | 1.5074 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 1.1199ms | 0.6704ms | 1.4915 KOps/s | 1.5061 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.8593ms | 0.5431ms | 1.8414 KOps/s | 1.8698 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.6841ms | 0.5440ms | 1.8382 KOps/s | 1.8657 KOps/s | |
test_to_module_speed[True] | 2.1777ms | 1.3357ms | 748.6658 Ops/s | 745.9697 Ops/s | |
test_to_module_speed[False] | 1.8069ms | 1.2983ms | 770.2349 Ops/s | 770.1279 Ops/s | |
test_tc_init | 79.4180μs | 46.4744μs | 21.5172 KOps/s | 21.5043 KOps/s | |
test_tc_init_nested | 0.2094ms | 93.6418μs | 10.6790 KOps/s | 10.8170 KOps/s | |
test_tc_first_layer_tensor | 13.8260μs | 1.5405μs | 649.1237 KOps/s | 632.9214 KOps/s | |
test_tc_first_layer_nontensor | 45.8150μs | 4.7423μs | 210.8672 KOps/s | 209.3304 KOps/s | |
test_tc_second_layer_tensor | 38.0500μs | 2.8730μs | 348.0667 KOps/s | 346.3609 KOps/s | |
test_tc_second_layer_nontensor | 0.1256ms | 6.0567μs | 165.1075 KOps/s | 162.7102 KOps/s | |
test_unbind | 0.2573s | 14.0226ms | 71.3137 Ops/s | 70.6906 Ops/s | |
test_full_like | 9.8143ms | 7.9821ms | 125.2809 Ops/s | 130.6380 Ops/s | |
test_zeros_like | 11.5573ms | 3.0500ms | 327.8708 Ops/s | 355.1023 Ops/s | |
test_ones_like | 4.7205ms | 3.3597ms | 297.6457 Ops/s | 316.3742 Ops/s | |
test_clone | 6.8610ms | 5.1970ms | 192.4205 Ops/s | 136.8937 Ops/s | |
test_squeeze | 62.5770μs | 12.5020μs | 79.9872 KOps/s | 79.8474 KOps/s | |
test_unsqueeze | 0.3066ms | 92.3953μs | 10.8231 KOps/s | 10.4063 KOps/s | |
test_split | 0.3605ms | 0.1949ms | 5.1297 KOps/s | 5.0326 KOps/s | |
test_permute | 0.4727ms | 0.2057ms | 4.8604 KOps/s | 4.8004 KOps/s | |
test_stack | 31.3461ms | 25.3638ms | 39.4262 Ops/s | 39.2075 Ops/s | |
test_cat | 30.7351ms | 25.1481ms | 39.7645 Ops/s | 38.7323 Ops/s |

Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 33.2500μs | 12.4023μs | 80.6302 KOps/s | 78.3379 KOps/s | |
test_plain_set_stack_nested | 55.9100μs | 12.5742μs | 79.5277 KOps/s | 76.9144 KOps/s | |
test_plain_set_nested_inplace | 39.4410μs | 13.5418μs | 73.8454 KOps/s | 71.5265 KOps/s | |
test_plain_set_stack_nested_inplace | 90.9220μs | 13.4016μs | 74.6181 KOps/s | 71.9709 KOps/s | |
test_items | 23.8300μs | 2.8889μs | 346.1490 KOps/s | 349.8119 KOps/s | |
test_items_nested | 0.4235ms | 0.3640ms | 2.7471 KOps/s | 2.7531 KOps/s | |
test_items_nested_locked | 0.4199ms | 0.3639ms | 2.7478 KOps/s | 2.7184 KOps/s | |
test_items_nested_leaf | 91.3420μs | 60.7460μs | 16.4620 KOps/s | 16.6554 KOps/s | |
test_items_stack_nested | 0.4116ms | 0.3657ms | 2.7347 KOps/s | 2.7498 KOps/s | |
test_items_stack_nested_leaf | 84.9310μs | 60.2190μs | 16.6061 KOps/s | 16.6181 KOps/s | |
test_items_stack_nested_locked | 0.4418ms | 0.3649ms | 2.7404 KOps/s | 2.7290 KOps/s | |
test_keys | 26.4000μs | 3.4448μs | 290.2888 KOps/s | 292.9828 KOps/s | |
test_keys_nested | 0.1231ms | 88.2558μs | 11.3307 KOps/s | 11.4207 KOps/s | |
test_keys_nested_locked | 0.7358ms | 93.8198μs | 10.6587 KOps/s | 10.8188 KOps/s | |
test_keys_nested_leaf | 96.7810μs | 79.8286μs | 12.5268 KOps/s | 12.7161 KOps/s | |
test_keys_stack_nested | 0.1204ms | 88.8353μs | 11.2568 KOps/s | 11.5023 KOps/s | |
test_keys_stack_nested_leaf | 0.1132ms | 79.4414μs | 12.5879 KOps/s | 12.7056 KOps/s | |
test_keys_stack_nested_locked | 0.1421ms | 93.3923μs | 10.7075 KOps/s | 10.7602 KOps/s | |
test_values | 5.3018μs | 0.8510μs | 1.1751 MOps/s | 1.1778 MOps/s | |
test_values_nested | 66.0810μs | 37.1940μs | 26.8861 KOps/s | 27.0521 KOps/s | |
test_values_nested_locked | 85.8510μs | 39.1211μs | 25.5616 KOps/s | 25.7913 KOps/s | |
test_values_nested_leaf | 0.1357ms | 42.0859μs | 23.7609 KOps/s | 23.7861 KOps/s | |
test_values_stack_nested | 65.3410μs | 37.3113μs | 26.8016 KOps/s | 26.9897 KOps/s | |
test_values_stack_nested_leaf | 73.0410μs | 42.1483μs | 23.7257 KOps/s | 23.6335 KOps/s | |
test_values_stack_nested_locked | 68.8910μs | 39.2510μs | 25.4770 KOps/s | 25.6439 KOps/s | |
test_membership | 9.1361μs | 0.4984μs | 2.0063 MOps/s | 1.9965 MOps/s | |
test_membership_nested | 31.7910μs | 2.1086μs | 474.2570 KOps/s | 487.4818 KOps/s | |
test_membership_nested_leaf | 18.4105μs | 2.0332μs | 491.8270 KOps/s | 501.7132 KOps/s | |
test_membership_stacked_nested | 38.6100μs | 2.0935μs | 477.6626 KOps/s | 475.3054 KOps/s | |
test_membership_stacked_nested_leaf | 32.7100μs | 2.1018μs | 475.7861 KOps/s | 487.9396 KOps/s | |
test_membership_nested_last | 34.4500μs | 3.1471μs | 317.7557 KOps/s | 322.0314 KOps/s | |
test_membership_nested_leaf_last | 34.3400μs | 3.1245μs | 320.0540 KOps/s | 327.6313 KOps/s | |
test_membership_stacked_nested_last | 41.9200μs | 3.0888μs | 323.7521 KOps/s | 328.0513 KOps/s | |
test_membership_stacked_nested_leaf_last | 44.9610μs | 3.0775μs | 324.9374 KOps/s | 327.7344 KOps/s | |
test_nested_getleaf | 45.3010μs | 6.2185μs | 160.8107 KOps/s | 159.0348 KOps/s | |
test_nested_get | 48.8610μs | 5.8441μs | 171.1113 KOps/s | 167.3499 KOps/s | |
test_stacked_getleaf | 38.1810μs | 6.1248μs | 163.2717 KOps/s | 162.2130 KOps/s | |
test_stacked_get | 31.3100μs | 5.7519μs | 173.8551 KOps/s | 172.0435 KOps/s | |
test_nested_getitemleaf | 34.6500μs | 6.4314μs | 155.4862 KOps/s | 156.3746 KOps/s | |
test_nested_getitem | 29.9210μs | 6.1716μs | 162.0337 KOps/s | 164.0984 KOps/s | |
test_stacked_getitemleaf | 39.7200μs | 6.4120μs | 155.9573 KOps/s | 158.2337 KOps/s | |
test_stacked_getitem | 27.7300μs | 6.0238μs | 166.0069 KOps/s | 168.6239 KOps/s | |
test_lock_nested | 0.4050ms | 0.3381ms | 2.9575 KOps/s | 2.8525 KOps/s | |
test_lock_stack_nested | 0.4774ms | 0.3488ms | 2.8669 KOps/s | 2.8387 KOps/s | |
test_unlock_nested | 0.3519ms | 0.2882ms | 3.4700 KOps/s | 3.5032 KOps/s | |
test_unlock_stack_nested | 0.3613ms | 0.2885ms | 3.4662 KOps/s | 3.4810 KOps/s | |
test_flatten_speed | 0.1099ms | 76.7665μs | 13.0265 KOps/s | 12.8635 KOps/s | |
test_unflatten_speed | 0.3648ms | 0.3228ms | 3.0983 KOps/s | 3.1292 KOps/s | |
test_common_ops | 0.8801ms | 0.6644ms | 1.5050 KOps/s | 1.5392 KOps/s | |
test_creation | 72.9510μs | 1.7399μs | 574.7474 KOps/s | 576.4006 KOps/s | |
test_creation_empty | 51.3110μs | 8.3363μs | 119.9568 KOps/s | 106.5890 KOps/s | |
test_creation_nested_1 | 0.1289ms | 10.1274μs | 98.7417 KOps/s | 90.2036 KOps/s | |
test_creation_nested_2 | 48.8810μs | 12.6715μs | 78.9170 KOps/s | 71.6079 KOps/s | |
test_clone | 62.2510μs | 11.1298μs | 89.8490 KOps/s | 89.5468 KOps/s | |
test_getitem[int] | 1.2670ms | 10.8578μs | 92.0993 KOps/s | 91.2568 KOps/s | |
test_getitem[slice_int] | 0.1145ms | 21.0541μs | 47.4967 KOps/s | 46.3641 KOps/s | |
test_getitem[range] | 0.1335ms | 40.5742μs | 24.6462 KOps/s | 24.4327 KOps/s | |
test_getitem[tuple] | 0.1170ms | 19.6629μs | 50.8572 KOps/s | 53.7756 KOps/s | |
test_getitem[list] | 0.1239ms | 34.6278μs | 28.8785 KOps/s | 28.6402 KOps/s | |
test_setitem_dim[int] | 51.9500μs | 21.2131μs | 47.1407 KOps/s | 49.7673 KOps/s | |
test_setitem_dim[slice_int] | 85.2310μs | 42.6234μs | 23.4613 KOps/s | 25.4875 KOps/s | |
test_setitem_dim[range] | 0.1786ms | 58.5054μs | 17.0924 KOps/s | 17.8966 KOps/s | |
test_setitem_dim[tuple] | 61.8510μs | 34.4499μs | 29.0276 KOps/s | 28.2211 KOps/s | |
test_setitem | 0.2022ms | 17.1486μs | 58.3137 KOps/s | 60.7098 KOps/s | |
test_set | 91.8110μs | 15.3611μs | 65.0997 KOps/s | 62.8221 KOps/s | |
test_set_shared | 0.5111ms | 0.1597ms | 6.2613 KOps/s | 6.0965 KOps/s | |
test_update | 0.4248ms | 20.6213μs | 48.4935 KOps/s | 50.8150 KOps/s | |
test_update_nested | 77.3910μs | 28.7442μs | 34.7896 KOps/s | 38.9185 KOps/s | |
test_update__nested | 0.4961ms | 26.8828μs | 37.1985 KOps/s | 38.2232 KOps/s | |
test_set_nested | 0.1488ms | 16.5700μs | 60.3502 KOps/s | 57.4696 KOps/s | |
test_set_nested_new | 52.1910μs | 20.0300μs | 49.9250 KOps/s | 50.6702 KOps/s | |
test_select | 0.1831ms | 32.8621μs | 30.4302 KOps/s | 30.9802 KOps/s | |
test_select_nested | 89.1510μs | 43.8908μs | 22.7838 KOps/s | 22.6131 KOps/s | |
test_exclude_nested | 0.1733ms | 63.2527μs | 15.8096 KOps/s | 15.7826 KOps/s | |
test_empty[True] | 0.3967ms | 0.2913ms | 3.4332 KOps/s | 3.4361 KOps/s | |
test_empty[False] | 3.6820μs | 0.8187μs | 1.2214 MOps/s | 1.2274 MOps/s | |
test_to | 90.4110μs | 56.8464μs | 17.5913 KOps/s | 17.2573 KOps/s | |
test_to_nonblocking | 0.1007ms | 50.1998μs | 19.9204 KOps/s | 21.5083 KOps/s | |
test_unbind_speed | 0.4026ms | 0.2640ms | 3.7872 KOps/s | 4.0282 KOps/s | |
test_unbind_speed_stack0 | 0.3295ms | 0.2516ms | 3.9741 KOps/s | 4.0739 KOps/s | |
test_unbind_speed_stack1 | 95.0645ms | 0.7444ms | 1.3433 KOps/s | 1.3282 KOps/s | |
test_split | 97.2283ms | 1.6051ms | 623.0291 Ops/s | 616.1161 Ops/s | |
test_chunk | 98.7518ms | 1.6207ms | 617.0078 Ops/s | 614.8887 Ops/s | |
test_consolidate[False-None] | 97.4986ms | 3.0435ms | 328.5662 Ops/s | 333.1507 Ops/s | |
test_consolidate[default-None] | 1.8522ms | 1.7249ms | 579.7357 Ops/s | 572.9070 Ops/s | |
test_consolidate[reduce-overhead-None] | 1.8917ms | 1.7614ms | 567.7272 Ops/s | 555.3391 Ops/s | |
test_consolidate_njt[False-None] | 6.8002ms | 6.4500ms | 155.0384 Ops/s | 152.9350 Ops/s | |
test_to[False-False-None] | 1.8819ms | 1.7211ms | 581.0095 Ops/s | 581.4957 Ops/s | |
test_to[True-False-None] | 1.5658ms | 1.3548ms | 738.0941 Ops/s | 724.1887 Ops/s | |
test_to[within-False-None] | 4.7387ms | 4.2163ms | 237.1757 Ops/s | 232.7361 Ops/s | |
test_to[True-default-None] | 5.5401ms | 5.3448ms | 187.0987 Ops/s | 192.8248 Ops/s | |
test_to_njt[False-False-None] | 7.4225ms | 7.0994ms | 140.8562 Ops/s | 144.7297 Ops/s | |
test_to_njt[True-False-None] | 5.8523ms | 5.5449ms | 180.3469 Ops/s | 180.6868 Ops/s | |
test_to_njt[within-False-None] | 12.4536ms | 12.2490ms | 81.6395 Ops/s | 82.5738 Ops/s | |
test_creation[device0] | 0.4451ms | 83.2468μs | 12.0125 KOps/s | 11.7182 KOps/s | |
test_creation_from_tensor | 0.5476ms | 86.5993μs | 11.5474 KOps/s | 11.3382 KOps/s | |
test_add_one[memmap_tensor0] | 86.5410μs | 7.0275μs | 142.2971 KOps/s | 140.9419 KOps/s | |
test_contiguous[memmap_tensor0] | 5.3841μs | 0.4270μs | 2.3421 MOps/s | 2.3815 MOps/s | |
test_stack[memmap_tensor0] | 32.6600μs | 4.6573μs | 214.7173 KOps/s | 207.1545 KOps/s | |
test_memmaptd_index | 1.4941ms | 0.2447ms | 4.0862 KOps/s | 3.9969 KOps/s | |
test_memmaptd_index_astensor | 0.4385ms | 0.3044ms | 3.2852 KOps/s | 3.1838 KOps/s | |
test_memmaptd_index_op | 0.7299ms | 0.5820ms | 1.7183 KOps/s | 1.6508 KOps/s | |
test_serialize_model | 0.1316s | 0.1311s | 7.6273 Ops/s | 7.6209 Ops/s | |
test_serialize_model_pickle | 1.3509s | 1.1868s | 0.8426 Ops/s | 0.8400 Ops/s | |
test_serialize_weights | 0.1319s | 0.1305s | 7.6652 Ops/s | 7.6026 Ops/s | |
test_serialize_weights_returnearly | 0.3485s | 55.2986ms | 18.0836 Ops/s | 23.3958 Ops/s | |
test_serialize_weights_pickle | 1.3750s | 1.2167s | 0.8219 Ops/s | 0.8242 Ops/s | |
test_reshape_pytree | 54.0810μs | 22.3285μs | 44.7857 KOps/s | 45.8557 KOps/s | |
test_reshape_td | 0.1541ms | 27.3629μs | 36.5459 KOps/s | 37.9590 KOps/s | |
test_view_pytree | 67.1410μs | 22.0967μs | 45.2555 KOps/s | 46.2264 KOps/s | |
test_view_td | 76.3710μs | 31.2411μs | 32.0091 KOps/s | 30.8025 KOps/s | |
test_unbind_pytree | 85.9310μs | 28.4654μs | 35.1304 KOps/s | 35.8248 KOps/s | |
test_unbind_td | 0.8470ms | 38.2284μs | 26.1585 KOps/s | 26.9433 KOps/s | |
test_split_pytree | 0.1590ms | 30.1338μs | 33.1853 KOps/s | 34.1242 KOps/s | |
test_split_td | 1.0059ms | 38.6332μs | 25.8845 KOps/s | 25.5686 KOps/s | |
test_add_pytree | 69.6600μs | 35.1398μs | 28.4578 KOps/s | 28.6154 KOps/s | |
test_add_td | 0.1670ms | 49.8865μs | 20.0455 KOps/s | 19.9528 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.2709ms | 0.1237ms | 8.0859 KOps/s | 8.0150 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.2906ms | 0.1361ms | 7.3477 KOps/s | 7.6766 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.2462ms | 97.7001μs | 10.2354 KOps/s | 10.4834 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 1.4293ms | 0.1523ms | 6.5658 KOps/s | 6.6712 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 0.1873ms | 34.4003μs | 29.0695 KOps/s | 43.4093 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.1170ms | 28.9765μs | 34.5107 KOps/s | 33.8366 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.4761ms | 64.0470μs | 15.6135 KOps/s | 15.8186 KOps/s | |
test_compile_copy_nested[pytree-eager] | 99.2020μs | 48.5864μs | 20.5819 KOps/s | 20.5932 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.2357ms | 0.1430ms | 6.9927 KOps/s | 7.0769 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.3529ms | 0.2183ms | 4.5800 KOps/s | 4.6600 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.2417ms | 98.7179μs | 10.1299 KOps/s | 10.4078 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.2180ms | 58.3525μs | 17.1372 KOps/s | 18.4333 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.2769ms | 0.1445ms | 6.9189 KOps/s | 7.3398 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.6981ms | 0.4947ms | 2.0214 KOps/s | 2.0622 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.3954ms | 0.2613ms | 3.8276 KOps/s | 3.8388 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.2950ms | 0.1461ms | 6.8442 KOps/s | 7.0500 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.2587ms | 70.3457μs | 14.2155 KOps/s | 15.1430 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.1472ms | 0.1011ms | 9.8950 KOps/s | 10.0379 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.6196ms | 0.4221ms | 2.3689 KOps/s | 2.4368 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.1836ms | 0.1365ms | 7.3247 KOps/s | 7.4907 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 0.1961ms | 18.9809μs | 52.6844 KOps/s | 51.1662 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 0.2054ms | 31.0005μs | 32.2575 KOps/s | 31.7948 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1131ms | 70.0451μs | 14.2765 KOps/s | 14.4954 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.1542ms | 51.4831μs | 19.4238 KOps/s | 19.1192 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 1.6777ms | 0.4005ms | 2.4970 KOps/s | 2.2324 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 2.8988ms | 2.6617ms | 375.6933 Ops/s | 370.5610 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 1.6403ms | 0.4447ms | 2.2489 KOps/s | 2.2838 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 2.8702ms | 2.6948ms | 371.0815 Ops/s | 373.1878 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.2792ms | 0.1168ms | 8.5641 KOps/s | 8.8786 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.5912ms | 80.8391μs | 12.3703 KOps/s | 12.4941 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 0.2630ms | 0.1124ms | 8.8983 KOps/s | 9.4679 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 0.2223ms | 72.2211μs | 13.8464 KOps/s | 14.2659 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 0.2966ms | 0.1157ms | 8.6459 KOps/s | 9.3858 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 0.2229ms | 72.7442μs | 13.7468 KOps/s | 14.7394 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.2406ms | 0.1034ms | 9.6688 KOps/s | 9.9543 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.2374ms | 17.3312μs | 57.6993 KOps/s | 58.7673 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.2442ms | 97.0515μs | 10.3038 KOps/s | 10.2938 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 0.1006ms | 15.7928μs | 63.3199 KOps/s | 63.5440 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.2470ms | 0.1006ms | 9.9433 KOps/s | 10.2285 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 0.1491ms | 15.6216μs | 64.0140 KOps/s | 57.7881 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.2335ms | 0.1065ms | 9.3908 KOps/s | 9.8717 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.6010ms | 17.2393μs | 58.0069 KOps/s | 56.9002 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.2476ms | 99.9180μs | 10.0082 KOps/s | 10.2800 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 49.4100μs | 15.7176μs | 63.6230 KOps/s | 64.0819 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.2467ms | 99.7056μs | 10.0295 KOps/s | 10.2547 KOps/s | |
test_compile_indexing[int-pytree-eager] | 0.1603ms | 15.7275μs | 63.5829 KOps/s | 63.7773 KOps/s | |
test_mod_add[eager] | 0.1959ms | 39.6936μs | 25.1930 KOps/s | 24.8848 KOps/s | |
test_mod_add[compile] | 0.2274ms | 82.6139μs | 12.1045 KOps/s | 12.1249 KOps/s | |
test_mod_add[compile-overhead] | 0.3357ms | 0.1691ms | 5.9128 KOps/s | 5.6948 KOps/s | |
test_mod_wrap[eager] | 0.4317ms | 0.2565ms | 3.8981 KOps/s | 3.7958 KOps/s | |
test_mod_wrap[compile] | 0.4845ms | 0.2853ms | 3.5048 KOps/s | 3.4545 KOps/s | |
test_mod_wrap[compile-overhead] | 7.0646ms | 3.7569ms | 266.1744 Ops/s | 261.4097 Ops/s | |
test_mod_wrap_and_backward[eager] | 1.5930ms | 1.3911ms | 718.8323 Ops/s | 690.3346 Ops/s | |
test_mod_wrap_and_backward[compile] | 1.3996ms | 1.2914ms | 774.3684 Ops/s | 726.2191 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 1.3933ms | 0.9305ms | 1.0747 KOps/s | 929.7138 Ops/s | |
test_seq_add[eager] | 0.2939ms | 0.1175ms | 8.5096 KOps/s | 8.3172 KOps/s | |
test_seq_add[compile] | 0.2065ms | 91.6315μs | 10.9133 KOps/s | 11.1644 KOps/s | |
test_seq_add[compile-overhead] | 0.2554ms | 0.1307ms | 7.6523 KOps/s | 7.6539 KOps/s | |
test_seq_wrap[eager] | 0.5729ms | 0.4320ms | 2.3148 KOps/s | 2.2365 KOps/s | |
test_seq_wrap[compile] | 0.4488ms | 0.3040ms | 3.2894 KOps/s | 3.1596 KOps/s | |
test_seq_wrap[compile-overhead] | 0.3767ms | 0.2274ms | 4.3972 KOps/s | 4.3434 KOps/s | |
test_func_call_runtime[False-eager] | 0.9572ms | 0.7667ms | 1.3043 KOps/s | 1.3467 KOps/s | |
test_func_call_runtime[False-compile] | 1.0240ms | 0.7544ms | 1.3256 KOps/s | 1.3320 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.5064ms | 0.3678ms | 2.7189 KOps/s | 2.7055 KOps/s | |
test_func_call_runtime[True-eager] | 1.0327ms | 0.9069ms | 1.1027 KOps/s | 1.0993 KOps/s | |
test_func_call_runtime[True-compile] | 0.9820ms | 0.7772ms | 1.2867 KOps/s | 1.2767 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.5480ms | 0.3910ms | 2.5574 KOps/s | 2.5669 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.8801ms | 0.7384ms | 1.3543 KOps/s | 1.3375 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.9075ms | 0.7575ms | 1.3201 KOps/s | 1.3122 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.4826ms | 0.3671ms | 2.7240 KOps/s | 2.6921 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.2287ms | 1.0094ms | 990.7113 Ops/s | 976.3917 Ops/s | |
test_func_call_cm_runtime[True-compile] | 1.1415ms | 0.9991ms | 1.0009 KOps/s | 994.7598 Ops/s | |
test_func_call_cm_runtime[True-compile-overhead] | 1.2260ms | 1.0305ms | 970.3893 Ops/s | 991.0205 Ops/s | |
test_vmap_func_call_cm_runtime[eager] | 2.5463ms | 2.1137ms | 473.1031 Ops/s | 470.4754 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 0.9844ms | 0.8250ms | 1.2121 KOps/s | 1.1469 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.5643ms | 0.4207ms | 2.3767 KOps/s | 2.3825 KOps/s | |
test_distributed | 2.6389ms | 0.2528ms | 3.9565 KOps/s | 7.8485 KOps/s | |
test_tdmodule | 39.7210μs | 20.6520μs | 48.4215 KOps/s | 47.2521 KOps/s | |
test_tdmodule_dispatch | 0.3133ms | 36.8986μs | 27.1013 KOps/s | 26.2086 KOps/s | |
test_tdseq | 40.9500μs | 21.1988μs | 47.1725 KOps/s | 46.2214 KOps/s | |
test_tdseq_dispatch | 60.4110μs | 39.1511μs | 25.5421 KOps/s | 24.6052 KOps/s | |
test_instantiation_functorch | 1.9200ms | 1.5453ms | 647.1348 Ops/s | 632.2418 Ops/s | |
test_exec_functorch | 0.2173ms | 0.1466ms | 6.8197 KOps/s | 6.7155 KOps/s | |
test_exec_functional_call | 0.5410ms | 0.1431ms | 6.9873 KOps/s | 6.9949 KOps/s | |
test_exec_td_decorator | 0.4098ms | 0.2005ms | 4.9871 KOps/s | 5.1264 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 1.1076ms | 0.7114ms | 1.4056 KOps/s | 1.4481 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 1.1144ms | 0.6950ms | 1.4389 KOps/s | 1.4102 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 1.0271ms | 0.6175ms | 1.6194 KOps/s | 1.6010 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 1.0189ms | 0.6026ms | 1.6594 KOps/s | 1.6066 KOps/s | |
test_vmap_transformer_speed_decorator[True-True] | 20.0732ms | 19.5188ms | 51.2326 Ops/s | 51.2764 Ops/s | |
test_vmap_transformer_speed_decorator[True-False] | 19.9327ms | 19.4875ms | 51.3149 Ops/s | 51.7396 Ops/s | |
test_vmap_transformer_speed_decorator[False-True] | 20.5949ms | 19.4997ms | 51.2827 Ops/s | 50.5930 Ops/s | |
test_vmap_transformer_speed_decorator[False-False] | 20.2915ms | 19.6066ms | 51.0033 Ops/s | 51.3773 Ops/s | |
test_to_module_speed[True] | 1.5180ms | 0.9624ms | 1.0391 KOps/s | 1.0380 KOps/s | |
test_to_module_speed[False] | 1.4261ms | 0.9453ms | 1.0579 KOps/s | 1.0637 KOps/s | |
test_tc_init | 0.1772ms | 34.0621μs | 29.3582 KOps/s | 28.1250 KOps/s | |
test_tc_init_nested | 0.4685ms | 67.8163μs | 14.7457 KOps/s | 13.7500 KOps/s | |
test_tc_first_layer_tensor | 24.5500μs | 0.7925μs | 1.2618 MOps/s | 1.2432 MOps/s | |
test_tc_first_layer_nontensor | 25.8510μs | 2.2340μs | 447.6303 KOps/s | 459.0809 KOps/s | |
test_tc_second_layer_tensor | 9.2378μs | 1.3967μs | 715.9950 KOps/s | 708.7822 KOps/s | |
test_tc_second_layer_nontensor | 0.3875ms | 2.9389μs | 340.2643 KOps/s | 343.6123 KOps/s | |
test_unbind | 0.2354s | 10.0940ms | 99.0688 Ops/s | 145.4760 Ops/s | |
test_full_like | 10.3245ms | 9.7083ms | 103.0044 Ops/s | 103.5905 Ops/s | |
test_zeros_like | 5.4436ms | 4.4110ms | 226.7079 Ops/s | 114.5448 Ops/s | |
test_ones_like | 9.3673ms | 7.2734ms | 137.4874 Ops/s | 228.3569 Ops/s | |
test_clone | 7.5651ms | 6.7695ms | 147.7211 Ops/s | 147.1171 Ops/s | |
test_squeeze | 98.9710μs | 9.8520μs | 101.5018 KOps/s | 102.0188 KOps/s | |
test_unsqueeze | 0.2065ms | 77.2383μs | 12.9469 KOps/s | 12.6867 KOps/s | |
test_split | 0.6664ms | 0.1635ms | 6.1145 KOps/s | 6.1189 KOps/s | |
test_permute | 0.3862ms | 0.1930ms | 5.1817 KOps/s | 5.1854 KOps/s | |
test_stack | 52.6359ms | 51.5933ms | 19.3824 Ops/s | 19.5552 Ops/s | |
test_cat | 52.0645ms | 51.5279ms | 19.4069 Ops/s | 23.2624 Ops/s |
vmoens added a commit that referenced this pull request on Feb 26, 2025
ghstack-source-id: e26f02bf3d8487332caa0066024fc40093de23b1 Pull Request resolved: #1237
Labels: CLA Signed, enhancement
Stack from ghstack (oldest at bottom):