
[Feature] Add VLA token policy and tracking primitives #3874

Merged
vmoens merged 5 commits into gh/vmoens/286/base from gh/vmoens/286/head on Jun 18, 2026

Conversation

[ghstack-poisoned]

pytorch-bot Bot commented Jun 17, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/3874

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 7fadab0 with merge base 6364a19:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

vmoens added 2 commits June 17, 2026 17:20
[ghstack-poisoned]
[ghstack-poisoned]

github-actions Bot commented Jun 17, 2026

Contributor

Benchmark Results: PR 7fadab0c vs main 8a0e35a1

Benchmark run: https://github.com/pytorch/rl/actions/runs/27750668303

Higher ops/sec is better. Tables are sorted by largest absolute change.
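
The Change column, the ordering, and the regression/improvement counts can be approximated from the two ops/sec columns. The following is a minimal Python sketch, assuming the change is computed as (PR - main) / main and that the 5% thresholds are strict. The `Row` type, the `summarize` helper, and the sample rows are hypothetical and are not taken from the benchmark workflow. Because the tables show rounded ops/sec values, percentages recomputed from them may differ from the Change column in the last digits.

```python
from typing import NamedTuple


class Row(NamedTuple):
    name: str
    main_ops: float
    pr_ops: float

    @property
    def change_pct(self) -> float:
        # Relative change of the PR against main, in percent (assumed formula).
        return (self.pr_ops - self.main_ops) / self.main_ops * 100.0


def summarize(rows, threshold_pct=5.0):
    """Sort rows by largest absolute change and count threshold crossings."""
    ordered = sorted(rows, key=lambda r: abs(r.change_pct), reverse=True)
    # Higher ops/sec is better, so a negative change is a regression.
    regressions = sum(r.change_pct < -threshold_pct for r in rows)
    improvements = sum(r.change_pct > threshold_pct for r in rows)
    return ordered, regressions, improvements


# Hypothetical sample rows (not copied from the tables).
rows = [
    Row("bench_a", 100.0, 120.0),  # +20%
    Row("bench_b", 200.0, 150.0),  # -25%
    Row("bench_c", 50.0, 51.0),    # +2%
]
ordered, regressions, improvements = summarize(rows)
print([r.name for r in ordered], regressions, improvements)
```

Sorting by absolute change puts large regressions and large improvements together at the top, which is why rows with opposite signs are interleaved in the tables.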

CPU

Compared 192 benchmarks. Regressions over 5%: 11. Improvements over 5%: 17.

Benchmark    main (ops/sec)    PR (ops/sec)    Change
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 3,038 3,717 +22.34%
benchmarks/test_objectives_benchmarks.py::test_dqn_speed[True-backward] 828.73 1,000 +20.71%
benchmarks/test_objectives_benchmarks.py::test_redq_deprec_speed[True-backward] 121.42 140.30 +15.55%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 47.81 55.12 +15.29%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 3,037 3,491 +14.95%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 480.46 544.31 +13.29%
benchmarks/test_envs_benchmark.py::test_cat_frames_functional[16-same] 23.63 20.54 -13.09%
benchmarks/test_objectives_benchmarks.py::test_ddpg_speed[True-backward] 372.76 420.70 +12.86%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 3,370 2,952 -12.40%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1,967 2,204 +12.08%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 2,071 2,290 +10.60%
benchmarks/test_objectives_benchmarks.py::test_redq_deprec_speed[reduce-overhead-None] 260.37 284.88 +9.41%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 3,414 3,720 +8.98%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 841.88 768.83 -8.68%
benchmarks/test_objectives_benchmarks.py::test_values[td0_return_estimate-False-False] 8,006 7,341 -8.31%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 3,449 3,176 -7.92%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 3,094 3,310 +6.98%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_lazystack[100-img_shape2-large_img] 392.87 419.30 +6.73%
benchmarks/test_objectives_benchmarks.py::test_redq_speed[reduce-overhead-None] 213.02 226.71 +6.43%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_lazystack_then_write[100-img_shape2-large_img] 375.03 398.96 +6.38%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 536.28 503.63 -6.09%
benchmarks/test_compressed_storage_benchmark.py::TestCompressedStorageBenchmark::test_tensor_to_bytestream_speed[pickle] 12,539 11,848 -5.51%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_lazystack_then_write[100-img_shape1-atari] 617.89 651.75 +5.48%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 776.62 735.10 -5.35%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_contiguous[100-img_shape1-atari] 5,083 4,812 -5.33%
benchmarks/test_rnn_reset_backends_benchmark.py::test_rnn_rollout_with_intermediate_resets[b256-t128-i32-h512-cudnn-False-0-gru] 1.3485 1.2781 -5.22%
benchmarks/test_objectives_benchmarks.py::test_reinforce_speed[True-backward] 119.70 125.89 +5.17%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 480.47 455.87 -5.12%
benchmarks/test_rnn_reset_backends_benchmark.py::test_rnn_rollout_with_intermediate_resets[b256-t128-i32-h512-cudnn-True-0-gru] 1.4491 1.3771 -4.97%
benchmarks/test_non_tensor_env_benchmark.py::test_non_tensor_env_rollout_speed[1000-single-True] 1.4136 1.3441 -4.92%
benchmarks/test_objectives_benchmarks.py::test_sac_speed[True-backward] 237.79 248.31 +4.42%
benchmarks/test_compressed_storage_benchmark.py::TestCompressedStorageBenchmark::test_tensor_to_bytestream_speed[torch.save] 7,306 6,986 -4.38%
benchmarks/test_compressed_storage_benchmark.py::TestCompressedStorageBenchmark::test_tensor_to_bytestream_speed[safetensors] 24,331 23,293 -4.27%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-True-True-True] 24,744 23,748 -4.02%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-False-True-True] 23,464 22,524 -4.00%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_lazystack_then_write[50-img_shape0-small] 3,432 3,568 +3.98%
benchmarks/test_objectives_benchmarks.py::test_redq_speed[True-None] 217.65 226.29 +3.97%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_lazystack[100-img_shape1-atari] 679.85 705.80 +3.82%
benchmarks/test_envs_benchmark.py::test_transformed 0.9116 0.9451 +3.67%
benchmarks/test_envs_benchmark.py::test_cat_frames_functional[4-constant] 4,361 4,202 -3.64%
benchmarks/test_replaybuffer_benchmark.py::test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 55.76 53.78 -3.56%
benchmarks/test_rnn_reset_backends_benchmark.py::test_rnn_rollout_with_intermediate_resets[b256-t128-i32-h512-cudnn-False-0-lstm] 0.8513 0.8219 -3.45%
benchmarks/test_storage_write_benchmark.py::TestCollectorIntegrationBenchmark::test_collector_without_rb[100-img_shape0-atari] 29.57 30.59 +3.43%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_stack_then_write[200-img_shape3-large_batch] 141.41 136.60 -3.40%
benchmarks/test_rnn_reset_backends_benchmark.py::test_rnn_rollout_with_intermediate_resets[b256-t128-i32-h512-scan-False-0-lstm] 1.9307 1.9950 +3.33%
benchmarks/test_objectives_benchmarks.py::test_dqn_speed[reduce-overhead-None] 1,793 1,853 +3.32%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-False-False-True] 30,886 31,890 +3.25%
benchmarks/test_objectives_benchmarks.py::test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 671.69 649.99 -3.23%
benchmarks/test_objectives_benchmarks.py::test_td3_speed[True-None] 535.45 552.71 +3.22%
benchmarks/test_objectives_benchmarks.py::test_ddpg_speed[False-None] 341.13 352.08 +3.21%
benchmarks/test_objectives_benchmarks.py::test_cql_speed[True-backward] 59.21 57.35 -3.14%
benchmarks/test_objectives_benchmarks.py::test_redq_speed[False-backward] 55.67 53.94 -3.11%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-True-False-True] 43,791 42,442 -3.08%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-False-False-True] 39,445 38,246 -3.04%
benchmarks/test_objectives_benchmarks.py::test_values[generalized_advantage_estimate-True-True] 100.62 97.61 -3.00%
benchmarks/test_replaybuffer_benchmark.py::test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 24.57 25.30 +2.98%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_lazystack[50-img_shape0-small] 4,225 4,350 +2.95%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_stack_then_write[100-img_shape1-atari] 275.12 283.17 +2.93%
benchmarks/test_replaybuffer_benchmark.py::test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 23.46 24.14 +2.92%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_stack_then_write[100-img_shape2-large_img] 176.97 171.89 -2.87%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-True-False-True] 39,736 38,596 -2.87%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-True-True-True] 21,806 21,184 -2.85%
benchmarks/test_objectives_benchmarks.py::test_cql_speed[reduce-overhead-None] 82.26 84.57 +2.81%
benchmarks/test_objectives_benchmarks.py::test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 29.19 28.39 -2.74%
benchmarks/test_envs_benchmark.py::test_parallel 0.9985 0.9730 -2.55%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_lazystack[200-img_shape3-large_batch] 322.61 330.46 +2.43%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-False-False-True] 28,735 29,429 +2.42%
benchmarks/test_objectives_benchmarks.py::test_iql_speed[True-None] 113.04 115.77 +2.42%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-True-True-True] 20,934 21,438 +2.41%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 3,326 3,406 +2.40%
benchmarks/test_collectors_benchmark.py::test_single 9.1193 9.3326 +2.34%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_lazystack_then_write[200-img_shape3-large_batch] 304.04 311.14 +2.33%
benchmarks/test_non_tensor_env_benchmark.py::test_non_tensor_env_rollout_speed[1000-serial-buffers-True] 0.5181 0.5300 +2.30%
benchmarks/test_objectives_benchmarks.py::test_sac_speed[reduce-overhead-None] 468.78 479.47 +2.28%
benchmarks/test_objectives_benchmarks.py::test_td3_speed[False-None] 122.46 125.26 +2.28%
benchmarks/test_envs_benchmark.py::test_serial 0.5865 0.5997 +2.26%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-False-True-True] 20,495 20,035 -2.24%
benchmarks/test_storage_write_benchmark.py::TestCollectorIntegrationBenchmark::test_collector_with_rb[100-img_shape0-atari] 26.36 26.94 +2.23%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_contiguous[200-img_shape3-large_batch] 753.88 737.14 -2.22%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-False-True-True] 20,444 19,993 -2.21%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 168.05 171.69 +2.16%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 172.41 175.93 +2.04%
benchmarks/test_rnn_reset_backends_benchmark.py::test_rnn_rollout_with_intermediate_resets[b256-t128-i32-h512-scan-True-0-lstm] 3.0091 3.0697 +2.01%
benchmarks/test_objectives_benchmarks.py::test_a2c_speed[False-backward] 81.81 83.45 +2.01%
benchmarks/test_objectives_benchmarks.py::test_values[td1_return_estimate-False-False] 38.56 37.79 -2.00%
benchmarks/test_collectors_benchmark.py::test_single_with_rb 8.8145 8.9880 +1.97%
benchmarks/test_objectives_benchmarks.py::test_ppo_speed[False-backward] 76.55 78.05 +1.96%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 173.90 177.14 +1.86%
benchmarks/test_objectives_benchmarks.py::test_redq_deprec_speed[True-None] 274.04 279.03 +1.82%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 170.54 173.60 +1.79%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-True-True-False] 36,367 35,719 -1.78%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 196.80 200.27 +1.76%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-True-True-False] 43,867 43,097 -1.76%
benchmarks/test_objectives_benchmarks.py::test_iql_speed[False-None] 49.12 49.98 +1.75%
benchmarks/test_objectives_benchmarks.py::test_cql_speed[False-None] 37.49 38.13 +1.72%
benchmarks/test_objectives_benchmarks.py::test_ddpg_speed[True-None] 697.70 709.69 +1.72%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_contiguous[100-img_shape2-large_img] 562.52 552.88 -1.71%
benchmarks/test_objectives_benchmarks.py::test_ppo_speed[False-None] 161.30 164.06 +1.71%
benchmarks/test_compressed_storage_benchmark.py::TestCompressedStorageBenchmark::test_tensor_to_bytestream_speed[numpy] 375,070 381,394 +1.69%
benchmarks/test_non_tensor_env_benchmark.py::test_non_tensor_env_rollout_speed[1000-serial-no-buffers-False] 0.6920 0.7032 +1.63%
benchmarks/test_objectives_benchmarks.py::test_reinforce_speed[False-None] 209.07 212.46 +1.62%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 188.30 191.28 +1.58%
benchmarks/test_objectives_benchmarks.py::test_td3_speed[False-backward] 90.84 92.28 +1.58%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-True-False-True] 33,313 33,831 +1.55%
benchmarks/test_objectives_benchmarks.py::test_a2c_speed[False-None] 176.03 178.73 +1.54%
benchmarks/test_objectives_benchmarks.py::test_values[td_lambda_return_estimate-True-False] 25.92 25.53 -1.53%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-True-False-False] 66,154 65,147 -1.52%
benchmarks/test_rnn_reset_backends_benchmark.py::test_rnn_rollout_with_intermediate_resets[b256-t128-i32-h512-scan-True-0-gru] 4.1149 4.1771 +1.51%
benchmarks/test_objectives_benchmarks.py::test_ppo_speed[True-None] 258.67 262.57 +1.51%
benchmarks/test_objectives_benchmarks.py::test_dqn_speed[True-None] 1,758 1,785 +1.49%
benchmarks/test_objectives_benchmarks.py::test_cql_speed[False-backward] 27.31 27.72 +1.49%
benchmarks/test_objectives_benchmarks.py::test_td3_speed[reduce-overhead-None] 557.25 565.51 +1.48%
benchmarks/test_objectives_benchmarks.py::test_td3_speed[True-backward] 272.22 276.13 +1.44%
benchmarks/test_objectives_benchmarks.py::test_ppo_speed[True-backward] 112.27 113.86 +1.41%
benchmarks/test_non_tensor_env_benchmark.py::test_non_tensor_env_rollout_speed[1000-serial-buffers-False] 0.6033 0.6118 +1.40%
benchmarks/test_non_tensor_env_benchmark.py::test_non_tensor_env_rollout_speed[1000-parallel-buffers-False] 0.6070 0.6154 +1.38%
benchmarks/test_objectives_benchmarks.py::test_iql_speed[True-backward] 59.51 58.69 -1.37%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-False-False-False] 44,848 45,461 +1.37%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-False-False-True] 35,542 35,061 -1.35%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-True-False-False] 79,281 80,352 +1.35%
... ... ... Showing 120 of 192 comparisons, sorted by absolute change.

GPU

Compared 202 benchmarks. Regressions over 5%: 10. Improvements over 5%: 11.

Benchmark    main (ops/sec)    PR (ops/sec)    Change
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 44.53 187.36 +320.71%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 196.50 46.75 -76.21%
benchmarks/test_objectives_benchmarks.py::test_iql_speed[reduce-overhead-None] 78.11 102.50 +31.22%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 2,948 3,784 +28.35%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 2,472 3,070 +24.19%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_contiguous[100-img_shape1-atari] 3,993 4,585 +14.84%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 717.26 800.30 +11.58%
benchmarks/test_collectors_benchmark.py::test_single_with_rb_pixels 5.3832 4.7612 -11.56%
benchmarks/test_objectives_benchmarks.py::test_reinforce_speed[True-backward] 347.96 381.07 +9.51%
benchmarks/test_objectives_benchmarks.py::test_values[vec_generalized_advantage_estimate-True-True] 303.04 276.42 -8.78%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 472.54 512.44 +8.44%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 2,263 2,082 -8.02%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 523.56 482.67 -7.81%
benchmarks/test_objectives_benchmarks.py::test_td3_speed[True-backward] 377.24 406.39 +7.73%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 806.47 749.58 -7.05%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-False-False-False] 53,418 50,010 -6.38%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 2,323 2,185 -5.92%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-True-False-False] 60,730 57,141 -5.91%
benchmarks/test_objectives_benchmarks.py::test_iql_speed[True-backward] 241.58 255.34 +5.70%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_stack_then_write[200-img_shape3-large_batch] 135.07 142.25 +5.32%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_contiguous[100-img_shape2-large_img] 580.71 551.38 -5.05%
benchmarks/test_rnn_reset_backends_benchmark.py::test_rnn_rollout_with_intermediate_resets[b256-t128-i32-h512-scan-True-0-gru] 47.47 49.79 +4.89%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 3,104 3,255 +4.86%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-True-True-False] 35,823 34,101 -4.81%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 2,888 3,025 +4.77%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1,942 2,029 +4.49%
benchmarks/test_envs_benchmark.py::test_simple 1.2603 1.2055 -4.35%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 3,435 3,577 +4.15%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-False-False-False] 57,078 54,741 -4.10%
benchmarks/test_compressed_storage_benchmark.py::TestCompressedStorageBenchmark::test_tensor_to_bytestream_speed[safetensors] 24,106 23,129 -4.05%
benchmarks/test_objectives_benchmarks.py::test_reinforce_speed[False-backward] 274.12 284.87 +3.92%
benchmarks/test_objectives_benchmarks.py::test_a2c_speed[True-None] 755.55 726.84 -3.80%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-True-True-True] 21,327 20,549 -3.65%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_lazystack[100-img_shape1-atari] 696.55 720.90 +3.50%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-True-True-False] 35,574 34,350 -3.44%
benchmarks/test_rnn_reset_backends_benchmark.py::test_rnn_rollout_with_intermediate_resets[b256-t128-i32-h512-scan-False-0-gru] 22.86 23.65 +3.42%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-True-False-True] 33,478 32,379 -3.28%
benchmarks/test_envs_benchmark.py::test_cat_frames_functional[4-constant] 4,956 4,794 -3.28%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_lazystack_then_write[50-img_shape0-small] 3,512 3,625 +3.23%
benchmarks/test_objectives_benchmarks.py::test_values[generalized_advantage_estimate-True-True] 48.56 50.02 +3.01%
benchmarks/test_rnn_reset_backends_benchmark.py::test_rnn_rollout_with_intermediate_resets[b256-t128-i32-h512-scan-True-0-lstm] 75.81 78.09 +3.01%
benchmarks/test_replaybuffer_benchmark.py::test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 49.34 50.80 +2.96%
benchmarks/test_objectives_benchmarks.py::test_ppo_speed[True-backward] 347.24 357.37 +2.92%
benchmarks/test_envs_benchmark.py::test_cat_frames_functional[16-constant] 4,856 4,994 +2.84%
benchmarks/test_objectives_benchmarks.py::test_dqn_speed[True-backward] 897.62 922.97 +2.82%
benchmarks/test_replaybuffer_benchmark.py::test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 21.83 22.44 +2.81%
benchmarks/test_replaybuffer_benchmark.py::TestPrioritizedReplayBufferBenchmark::test_sampler_sample_scale[1000000-cuda] 2,299 2,236 -2.76%
benchmarks/test_objectives_benchmarks.py::test_values[td1_return_estimate-False-False] 20.36 20.92 +2.76%
benchmarks/test_objectives_benchmarks.py::test_gae_speed[generalized_advantage_estimate-False-1-512] 48.58 49.86 +2.63%
benchmarks/test_replaybuffer_benchmark.py::test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 53.25 54.58 +2.50%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-False-False-True] 38,643 37,678 -2.50%
benchmarks/test_replaybuffer_benchmark.py::test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 23.59 24.17 +2.44%
benchmarks/test_envs_benchmark.py::test_transformed 0.7210 0.7035 -2.43%
benchmarks/test_objectives_benchmarks.py::test_dqn_speed[False-None] 639.54 654.88 +2.40%
benchmarks/test_objectives_benchmarks.py::test_iql_speed[True-None] 509.60 521.50 +2.33%
benchmarks/test_envs_benchmark.py::test_cat_frames_functional[4-same] 6.5619 6.7119 +2.29%
benchmarks/test_replaybuffer_benchmark.py::test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 22.82 23.34 +2.26%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2,871 2,936 +2.24%
benchmarks/test_objectives_benchmarks.py::test_cql_speed[True-backward] 222.95 227.88 +2.21%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_lazystack[50-img_shape0-small] 4,377 4,473 +2.20%
benchmarks/test_compressed_storage_benchmark.py::TestCompressedStorageBenchmark::test_tensor_to_bytestream_speed[pickle] 12,290 12,020 -2.19%
benchmarks/test_non_tensor_env_benchmark.py::test_non_tensor_env_rollout_speed[1000-single-True] 1.3307 1.3595 +2.16%
benchmarks/test_objectives_benchmarks.py::test_ppo_speed[False-backward] 134.00 136.89 +2.16%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_lazystack_then_write[100-img_shape1-atari] 657.33 670.97 +2.08%
benchmarks/test_rnn_reset_backends_benchmark.py::test_rnn_rollout_with_intermediate_resets[b256-t128-i32-h512-scan-False-0-lstm] 21.88 22.32 +2.01%
benchmarks/test_storage_write_benchmark.py::TestCollectorIntegrationBenchmark::test_collector_without_rb[100-img_shape0-atari] 31.04 30.42 -1.98%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-False-False-False] 64,716 63,448 -1.96%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_contiguous[200-img_shape3-large_batch] 758.43 743.95 -1.91%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-True-False-True] 31,044 30,458 -1.89%
benchmarks/test_envs_benchmark.py::test_cat_frames_functional[16-same] 5.4972 5.5993 +1.86%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 3,637 3,705 +1.86%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-False-True-True] 22,503 22,093 -1.82%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-False-True-False] 39,040 38,332 -1.81%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 470.69 479.06 +1.78%
benchmarks/test_compressed_storage_benchmark.py::TestCompressedStorageBenchmark::test_tensor_to_bytestream_speed[untyped_storage] 8.8843 9.0418 +1.77%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-True-True-False] 29,705 29,198 -1.70%
benchmarks/test_non_tensor_env_benchmark.py::test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-False] 0.2280 0.2241 -1.69%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-False-True-True] 20,138 19,801 -1.67%
benchmarks/test_collectors_benchmark.py::test_async_pixels 10.80 10.97 +1.66%
benchmarks/test_objectives_benchmarks.py::test_reinforce_speed[False-None] 401.90 395.28 -1.65%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-False-True-True] 18,379 18,078 -1.64%
benchmarks/test_objectives_benchmarks.py::test_a2c_speed[True-backward] 368.81 374.85 +1.64%
benchmarks/test_objectives_benchmarks.py::test_redq_deprec_speed[True-backward] 263.62 267.87 +1.61%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-True-False-False] 64,714 63,669 -1.61%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 1,000 1,016 +1.60%
benchmarks/test_objectives_benchmarks.py::test_dqn_speed[reduce-overhead-None] 1,927 1,897 -1.58%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_lazystack_then_write[100-img_shape2-large_img] 389.80 395.95 +1.58%
benchmarks/test_objectives_benchmarks.py::test_sac_speed[True-backward] 319.16 324.17 +1.57%
benchmarks/test_objectives_benchmarks.py::test_ppo_speed[reduce-overhead-None] 808.39 820.99 +1.56%
benchmarks/test_non_tensor_env_benchmark.py::test_non_tensor_env_rollout_speed[1000-parallel-buffers-True] 0.5360 0.5280 -1.50%
benchmarks/test_storage_write_benchmark.py::TestCollectorIntegrationBenchmark::test_collector_without_rb_cuda[200-img_shape1-large_batch] 9.0032 8.8730 -1.45%
benchmarks/test_compressed_storage_benchmark.py::TestCompressedStorageBenchmark::test_tensor_to_bytestream_speed[numpy] 370,860 376,158 +1.43%
benchmarks/test_replaybuffer_benchmark.py::test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 55.02 55.80 +1.42%
benchmarks/test_objectives_benchmarks.py::test_a2c_speed[reduce-overhead-None] 871.13 883.36 +1.40%
benchmarks/test_objectives_benchmarks.py::test_ddpg_speed[True-None] 830.25 841.89 +1.40%
benchmarks/test_objectives_benchmarks.py::test_ddpg_speed[True-backward] 453.38 459.70 +1.39%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-True-True-True] 23,178 23,499 +1.39%
benchmarks/test_storage_write_benchmark.py::TestCollectorIntegrationBenchmark::test_collector_with_rb[200-img_shape1-large_batch] 13.72 13.53 -1.37%
benchmarks/test_objectives_benchmarks.py::test_values[td_lambda_return_estimate-True-False] 12.29 12.46 +1.36%
benchmarks/test_objectives_benchmarks.py::test_iql_speed[False-backward] 71.04 71.99 +1.35%
benchmarks/test_objectives_benchmarks.py::test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 1,287 1,304 +1.35%
benchmarks/test_storage_write_benchmark.py::TestCollectorIntegrationBenchmark::test_collector_without_rb[200-img_shape1-large_batch] 15.56 15.35 -1.31%
benchmarks/test_objectives_benchmarks.py::test_td3_speed[False-backward] 85.00 86.09 +1.29%
benchmarks/test_objectives_benchmarks.py::test_cql_speed[reduce-overhead-None] 89.33 88.21 -1.26%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-False-False-True] 35,302 34,868 -1.23%
benchmarks/test_envs_benchmark.py::test_serial 0.4255 0.4307 +1.22%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_lazystack_then_write[200-img_shape3-large_batch] 309.96 313.70 +1.21%
benchmarks/test_objectives_benchmarks.py::test_a2c_speed[False-backward] 155.02 156.84 +1.18%
benchmarks/test_non_tensor_env_benchmark.py::test_non_tensor_env_rollout_speed[1000-serial-no-buffers-False] 0.6909 0.6830 -1.14%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-True-True-True] 21,014 20,777 -1.13%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-True-True-True] 19,054 18,840 -1.12%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-False-True-False] 32,670 32,315 -1.09%
benchmarks/test_objectives_benchmarks.py::test_sac_speed[reduce-overhead-None] 100.76 99.71 -1.05%
benchmarks/test_objectives_benchmarks.py::test_redq_deprec_speed[False-backward] 72.55 73.31 +1.05%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 167.52 165.76 -1.05%
benchmarks/test_non_tensor_env_benchmark.py::test_non_tensor_env_rollout_speed[1000-parallel-buffers-False] 0.6064 0.6126 +1.03%
benchmarks/test_non_tensor_env_benchmark.py::test_non_tensor_env_rollout_speed[1000-single-False] 1.6204 1.6040 -1.01%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 183.95 182.09 -1.01%
benchmarks/test_storage_write_benchmark.py::TestCollectorIntegrationBenchmark::test_collector_with_rb[100-img_shape0-atari] 26.77 26.51 -0.97%
benchmarks/test_collectors_benchmark.py::test_sync 10.54 10.45 -0.87%
... ... ... Showing 120 of 202 comparisons, sorted by absolute change.

vmoens added 2 commits June 17, 2026 18:55
[ghstack-poisoned]
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Jun 18, 2026
vmoens merged commit 7fadab0 into gh/vmoens/286/base on Jun 18, 2026
110 checks passed
vmoens deleted the gh/vmoens/286/head branch on June 18, 2026 at 13:21

Labels

CLA Signed, Documentation, Feature, Integrations/torch_geometric, Integrations, Modules, Transforms, tutorials/
