GitHub - aschrein/vulkenstein: Toy software vulkan ICD implementation

Toy software Vulkan driver

Features

Naive software rasterization
- Tiled
- Triangles
SPIR-V compilation on CPU
- Packetization (multiple instances compiled into one kernel)
  - SIMD1/SIMD4/SIMD64
  - Control flow vectorization
- Vertex shaders
  - Minimal viable
- Pixel shaders
  - Minimal viable
2D Texture sampling
- Only nearest neighbor
Read/write 1D buffers
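
As an illustration of the nearest-neighbor sampling listed above, here is a minimal sketch. The `Texture2D` struct, the `sample_nearest` name, the packed 32-bit texel format, and the clamp-to-edge addressing are all assumptions made for this example. They are not taken from this repository's code.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

// Hypothetical texture container: row-major, one packed 32-bit texel per pixel.
struct Texture2D {
  uint32_t width;
  uint32_t height;
  std::vector<uint32_t> texels;
};

// Nearest-neighbor lookup with normalized coordinates (u, v).
// Out-of-range coordinates are clamped to the edge texel (an assumption).
inline uint32_t sample_nearest(const Texture2D &tex, float u, float v) {
  assert(tex.width > 0 && tex.height > 0);
  assert(tex.texels.size() == static_cast<size_t>(tex.width) * tex.height);
  // Scale to texel space and take the texel whose cell contains the point.
  int x = static_cast<int>(std::floor(u * static_cast<float>(tex.width)));
  int y = static_cast<int>(std::floor(v * static_cast<float>(tex.height)));
  x = std::clamp(x, 0, static_cast<int>(tex.width) - 1);
  y = std::clamp(y, 0, static_cast<int>(tex.height) - 1);
  return tex.texels[static_cast<size_t>(y) * tex.width + static_cast<size_t>(x)];
}
```

No filtering or mip selection happens here: each lookup reads exactly one texel.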

TODO

Software rasterization
- Lines
- AVX2
  - Experiment with int16 fixed point
- Multi-threading
SPIR-V compilation on CPU
- Optimizations
  - Tune for AVX2
  - Data transformation AOS->AOSOA
- Vertex shaders
  - Clipping distance
- Pixel shaders
  - Discards
  - Write depth
2D Texture sampling
- Implicit mip selection with derivatives
Read/write 2D Textures
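
For the AOS->AOSOA item above, a rough sketch of what such a transformation could look like. The `Vec3` and `Vec3x4` types, the block width of 4, and the zero padding of the last block are assumptions for illustration. The planned implementation may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Array-of-structures element (hypothetical vertex attribute).
struct Vec3 {
  float x, y, z;
};

constexpr std::size_t kLanes = 4; // assumed SIMD width

// One AoSoA block: kLanes elements stored component-wise.
struct Vec3x4 {
  float x[kLanes];
  float y[kLanes];
  float z[kLanes];
};

// Repack AoS data into AoSoA blocks; unused lanes of the last block stay zero.
std::vector<Vec3x4> aos_to_aosoa(const std::vector<Vec3> &in) {
  std::vector<Vec3x4> out((in.size() + kLanes - 1) / kLanes); // zero-initialized
  assert(out.size() * kLanes >= in.size());
  for (std::size_t i = 0; i < in.size(); ++i) {
    Vec3x4 &block = out[i / kLanes];
    const std::size_t lane = i % kLanes;
    block.x[lane] = in[i].x;
    block.y[lane] = in[i].y;
    block.z[lane] = in[i].z;
  }
  return out;
}
```

With this layout each component of a block is contiguous, so a SIMD4 kernel can load `x`, `y`, and `z` as whole vectors instead of gathering them.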

Control flow vectorization

The following algorithm is a starting point. It is not very efficient, but it is easy to implement.

Calculate the dominator tree.
Split edges into forward and backward edges.
Sort the CFG using the topological partial ordering defined by the forward edges. Prioritize children within the same strongly connected component.
Allocate a mask register (uint64_t) per basic block.
Each basic block clears its mask register on exit.
Jumping to a basic block sets the mask bits for the active lanes.
For each basic block, create a dispatch node that jumps to the basic block if any mask bit is set, or to the next dispatch node otherwise. This creates a 'dispatch chain' that skips basic blocks whose mask bits are all clear.
Conditional jumps are replaced with jumps to dispatch chains.
Back edges are unconditional jumps to the loop header.
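
To make the steps above concrete, here is a small hand-written emulation of a linearized loop for SIMD4. It is only a sketch. The block names, the `run_linearized` function, and the toy loop `while (x < 100) x = x * 2 + 1; x += 1;` are made up for this example, and the real driver emits LLVM IR rather than running an interpreter loop like this.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

constexpr int kLanes = 4; // SIMD4
using Vec = std::array<uint32_t, kLanes>;

// Basic blocks in topological order of the forward edges (hypothetical names).
enum Block { HEADER, BODY, EXIT, NUM_BLOCKS };

struct Result {
  Vec values;
  int exit_runs; // how many times the EXIT block was executed
};

// Emulates: while (x < 100) x = x * 2 + 1;  x += 1;
Result run_linearized(Vec x) {
  uint64_t mask[NUM_BLOCKS] = {};      // one mask register per basic block
  mask[HEADER] = (1ull << kLanes) - 1; // all lanes enter at the loop header
  int exit_runs = 0;

  int b = HEADER;
  while (b < NUM_BLOCKS) {
    const uint64_t active = mask[b];
    if (active == 0) { // dispatch node: no lanes, go to the next dispatch node
      ++b;
      continue;
    }
    int next = b + 1; // default: fall into the next dispatch node
    if (b == EXIT) ++exit_runs;
    for (int lane = 0; lane < kLanes; ++lane) {
      const uint64_t bit = 1ull << lane;
      if ((active & bit) == 0) continue;
      switch (b) {
      case HEADER: // conditional jump becomes "set a bit in the target's mask"
        mask[x[lane] < 100 ? BODY : EXIT] |= bit;
        break;
      case BODY:
        x[lane] = x[lane] * 2 + 1;
        mask[HEADER] |= bit;
        next = HEADER; // back edge: unconditional jump to the loop header
        break;
      case EXIT:
        x[lane] += 1;
        break;
      default:
        assert(false);
        break;
      }
    }
    mask[b] = 0; // the block clears its mask register on exit
    b = next;
  }
  return {x, exit_runs};
}
```

Lanes that leave the loop early park their bit in the `EXIT` mask and wait. `EXIT` is only reached once the `BODY` mask is empty, so it runs once for all lanes together.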

Example

HLSL Source:

[[vk::binding(0, 0)]] RWBuffer <uint> g_buf_0;
[[vk::binding(1, 0)]] RWBuffer <uint> g_buf_1;

uint get_num(uint t) {
  if (t < 888) {
    while (true) {
      t = (t ^ (t << 1)) + 1;
      if ((t & 7) == 7)
        continue;
      t = t * (t - 1) + 1;
      if (t > 200)
        return t;
      if ((t & 8) != 0)
        break;
    }
    t = (t << 2) + 1;
    return t;
  } else {
    return t + t * t * t;
  }
}

[numthreads(4, 1, 1)]
void main(uint3 tid : SV_DispatchThreadID)
{
  g_buf_0[tid.x] = get_num(g_buf_1[tid.x]);
}
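
When checking vectorized output, a scalar reference can be handy. Below is a direct C++ port of the shader above. The `dispatch_scalar` helper and its plain-pointer buffers are assumptions for this sketch and are not part of the test suite. HLSL `uint` is mapped to `uint32_t`, which has the same wrapping arithmetic.

```cpp
#include <cassert>
#include <cstdint>

// Scalar port of the HLSL get_num function.
uint32_t get_num(uint32_t t) {
  if (t < 888) {
    while (true) {
      t = (t ^ (t << 1)) + 1;
      if ((t & 7) == 7)
        continue;
      t = t * (t - 1) + 1;
      if (t > 200)
        return t;
      if ((t & 8) != 0)
        break;
    }
    t = (t << 2) + 1;
    return t;
  } else {
    return t + t * t * t;
  }
}

// Hypothetical stand-in for the compute dispatch: one iteration per thread id.
void dispatch_scalar(const uint32_t *g_buf_1, uint32_t *g_buf_0, uint32_t count) {
  assert(g_buf_0 != nullptr && g_buf_1 != nullptr);
  for (uint32_t tid = 0; tid < count; ++tid)
    g_buf_0[tid] = get_num(g_buf_1[tid]);
}
```

Running `dispatch_scalar` over the same input buffer and comparing against the SIMD1/SIMD4/SIMD64 results is one way to catch divergence bugs in the linearized control flow.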

Initial SPIR-V CFG:

Linearized CFG:

LLVM IR for SIMD4 mode:

Run tests

cd vulkenstein
python3 tests/run_all_tests.py

Build

LLVM Version: 11.* commit 0d3149f43173967d6f4e4c5c904a05e1022071d4

Used for JIT code generation

Vulkan SDK: 1.2.135.0

Used for headers and SPIR-V disassembly

LibPFC

Used for microbenchmarking on Linux

cd 3rdparty/libpfc
make
su
echo 0 > /proc/sys/kernel/nmi_watchdog
echo 2 > /sys/bus/event_source/devices/cpu/rdpmc
insmod pfc.ko

Vulkenstein

mkdir build && cd build
cmake -DCMAKE_BUILD_TYPE=Debug -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ -DCMAKE_EXPORT_COMPILE_COMMANDS=1 .. && make

Reference

Whole-Function Vectorization
Introducing Control Flow into Vectorized Code
Aart J. C. Bik. The Software Vectorization Handbook. Intel Press, 2004.
Automatic SIMD Vectorization of SSA-based Control Flow Graphs
Loops: Presentation
CS447:CodeOptimization
EECS 583 – Class 2 Control Flow Analysis LLVM Introduction
Solving the structured control flow problem once and for all
NIR Docs
Formalizing Structured Control Flow Graphs
