Preview release of Intel® Extension for MLIR (IMEX)
Fixes / Improvements
Highlights
- XeTile Dialect: The XeTile dialect supports the tile-based programming model and decomposes the GEMM kernel into tiles of large pre-defined sizes at the subgroup and workgroup levels.
- XeGPU Dialect: The XeGPU dialect models Xe instructions like DPAS and 2D block load/store.
- Lowering from subgroup-level XeTile to XeGPU VC Mode.
- Lowering from XeGPU to SPIR-V.
  - XeGPU VC Mode to VC Intrinsics (covers all XeGPU ops).
  - XeGPU to GenISA Intrinsics & Joint Matrix (ops supported: create_nd_descriptor, update_nd_offset, load_nd, store_nd, dpas).
- Dialect/Op, conversion & integration test cases for XeTile & XeGPU.
- High performance end-to-end GEMM code example based on SPIR-V dialect.
- RFC summarizing XeTile & XeGPU design
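
To illustrate the tile-based decomposition that the XeTile highlight describes, here is a minimal sketch in plain Python. It is not XeTile or IMEX code: the function name `tiled_gemm`, the parameters `wg_tile`, `sg_tile`, and `k_tile`, and the tiny tile sizes are all assumptions chosen for readability. It only shows how a GEMM can be split into workgroup-level tiles, then subgroup-level tiles, with a small multiply-accumulate per K step.

```python
def tiled_gemm(a, b, m, n, k, wg_tile=(4, 4), sg_tile=(2, 2), k_tile=2):
    """Compute C = A x B on nested lists by walking a two-level tile hierarchy.

    All names and tile sizes are illustrative assumptions, not IMEX APIs.
    """
    c = [[0.0] * n for _ in range(m)]
    wg_m, wg_n = wg_tile
    sg_m, sg_n = sg_tile
    # Outer level: each (wi, wj) is the origin of one workgroup-sized tile of C.
    for wi in range(0, m, wg_m):
        for wj in range(0, n, wg_n):
            # Inner level: each (si, sj) is one subgroup-sized tile inside it.
            for si in range(wi, min(wi + wg_m, m), sg_m):
                for sj in range(wj, min(wj + wg_n, n), sg_n):
                    # Step through K in chunks; each chunk plays the role of
                    # one small matrix multiply-accumulate on the subgroup tile.
                    for k0 in range(0, k, k_tile):
                        for i in range(si, min(si + sg_m, wi + wg_m, m)):
                            for j in range(sj, min(sj + sg_n, wj + wg_n, n)):
                                for kk in range(k0, min(k0 + k_tile, k)):
                                    c[i][j] += a[i][kk] * b[kk][j]
    return c
```

The `min(...)` bounds handle matrix sizes that are not multiples of the tile sizes, so the result matches a plain triple-loop GEMM for any shape.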
List of all changes
Dependencies revisions
| Project | Revision |
|---|---|
| LLVM Project | 49af650 |
Supported System configurations
- Ubuntu 22.04 LTS
- x86 CPU
- Intel® Data Center GPU Max Series
Supported data types for GPU
- FP32
- FP16
- BF16
- I32
- I16
- I8
Limitations
- End-to-end GEMM test cases support only the FP16 and BF16 data types.
- IMEX v0.3 does not support TF32 & FP64.
- When the input program uses multiple tile_mma ops, a tensor used as the A matrix in one tile_mma cannot be used as the B matrix in another.
- The XeTile fusion use case is not fully supported.
- XeGPU does not support the SLM (shared local memory) use case.
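
The tile_mma operand restriction above can be checked mechanically. The sketch below is a hypothetical helper, not part of IMEX: it assumes each tile_mma is summarized as an `(a_name, b_name)` pair of tensor names, and the function name `find_operand_role_conflicts` is invented for illustration.

```python
def find_operand_role_conflicts(mma_ops):
    """Return names of tensors used as the A operand and also as the B operand.

    mma_ops is a list of (a_name, b_name) pairs, one per tile_mma
    (a hypothetical summary format, not an IMEX data structure).
    This sketch is conservative: it also flags a tensor that is both
    A and B of the same op.
    """
    used_as_a = {a for a, _ in mma_ops}
    used_as_b = {b for _, b in mma_ops}
    return sorted(used_as_a & used_as_b)
```

For example, `[("x", "y"), ("z", "x")]` reports `x`, because `x` is the A matrix of the first op and the B matrix of the second, while `[("x", "y"), ("x", "z")]` reports nothing.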
Dependencies for GPU execution
GPU execution supports two wrapper libraries for interacting with the GPU: the Level Zero wrapper and the SYCL wrapper.
- oneAPI Level Zero: https://github.com/oneapi-src/level-zero (required for both the Level Zero and SYCL wrappers)
- Intel® oneAPI DPC++/C++ Compiler: https://www.intel.com/content/www/us/en/developer/articles/tool/oneapi-standalone-components.html#dpcpp-cpp (required for the SYCL wrapper)