
AMD ROCm™ Software

AMD ROCm software is AMD's open-source stack for GPU computation.

To learn more about ROCm, check out our Documentation, Examples, and Developer Hub.

If you have questions or need help, reach out to us on GitHub.

Popular repositories

  1. ROCm Public

    AMD ROCm™ Software - GitHub Home

    Shell · 4.9k stars · 403 forks

  2. HIP Public

    HIP: C++ Heterogeneous-Compute Interface for Portability

    C++ · 3.9k stars · 549 forks

  3. MIOpen Public

    AMD's Machine Intelligence Library

    Assembly · 1.1k stars · 241 forks

  4. tensorflow-upstream Public

    Forked from tensorflow/tensorflow

    TensorFlow ROCm port

    C++ · 690 stars · 97 forks

  5. HIPIFY Public

    HIPIFY: Convert CUDA to Portable C++ Code

    C++ · 545 stars · 79 forks

  6. ROCm-docker Public

    Dockerfiles for the various software layers defined in the ROCm software platform

    Shell · 446 stars · 70 forks

Repositories

Showing 10 of 301 repositories
  • rocprofiler-compute Public

    Advanced Profiling and Analytics for AMD Hardware

    Python · 139 stars · MIT · 51 forks · 49 issues · 11 pull requests · Updated Feb 6, 2025
  • rocm-core Public
    CMake · 6 stars · MIT · 7 forks · 0 issues · 1 pull request · Updated Feb 6, 2025
  • jax Public Forked from jax-ml/jax

    Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more

    Python · 19 stars · Apache-2.0 · 2,990 forks · 0 issues · 16 pull requests · Updated Feb 6, 2025
  • rocprofiler Public

    ROC profiler library. Profiling with perf-counters and derived metrics.

    C · 133 stars · MIT · 49 forks · 6 issues · 21 pull requests · Updated Feb 6, 2025
  • amdsmi Public

    AMD SMI

    C++ · 52 stars · MIT · 31 forks · 7 issues · 8 pull requests · Updated Feb 6, 2025
  • composable_kernel Public

    Composable Kernel: Performance Portable Programming Model for Machine Learning Tensor Operators

    C++ · 338 stars · 145 forks · 22 issues (1 issue needs help) · 53 pull requests · Updated Feb 6, 2025
  • pytorch Public Forked from pytorch/pytorch

    Tensors and Dynamic neural networks in Python with strong GPU acceleration

    Python · 220 stars · 23,867 forks · 61 issues · 44 pull requests · Updated Feb 6, 2025
  • flash-attention Public Forked from Dao-AILab/flash-attention

    Fast and memory-efficient exact attention

    Python · 153 stars · BSD-3-Clause · 1,452 forks · 23 issues · 6 pull requests · Updated Feb 6, 2025
  • vllm Public Forked from vllm-project/vllm

    A high-throughput and memory-efficient inference and serving engine for LLMs

    Python · 63 stars · Apache-2.0 · 5,587 forks · 5 issues · 25 pull requests · Updated Feb 6, 2025
  • rocSHMEM Public

    rocSHMEM intra-kernel networking runtime for AMD dGPUs on the ROCm platform.

    C++ · 48 stars · MIT · 12 forks · 8 issues · 6 pull requests · Updated Feb 5, 2025