This example covers the following features on top of what was shown in the basic example:
- defining `__device__` functions
    - `ptx_add()`
    - `ptx_lop3()`
- using C++ templates with `__device__` and `__global__` functions
    - `ptx_lop3()`
    - `kernelLop3()`
- using inline PTX Assembly `asm(...);` blocks
    - `ptx_add()`
    - `ptx_lop3()`
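
As a rough illustration of how the three features listed above fit together, here is a minimal sketch. It is not the example's actual source (that is included further down). The function names `ptx_add()`, `ptx_lop3()` and `kernelLop3()` come from the list above, but their signatures, the `Lut` template parameter, the `kernelAdd()` kernel and the `runAdd()`/`runLop3()` host helpers are assumptions made for this sketch. It also assumes a GPU that supports `lop3.b32` (compute capability 5.0 or newer).

```cuda
#include <cstdint>
#include <cuda_runtime.h>

// Assumed signature: adds two 32-bit unsigned integers with the PTX
// add.u32 instruction inside an inline asm block.
__device__ uint32_t ptx_add(uint32_t a, uint32_t b) {
    uint32_t result;
    asm("add.u32 %0, %1, %2;" : "=r"(result) : "r"(a), "r"(b));
    return result;
}

// Assumed signature: applies a three-input bitwise operation with lop3.b32.
// Lut is the 8-bit lookup table selecting the operation. It must be a
// compile-time constant, so it is a template parameter passed to the asm
// block through the "n" (immediate integer) constraint.
template <int Lut>
__device__ uint32_t ptx_lop3(uint32_t a, uint32_t b, uint32_t c) {
    uint32_t result;
    asm("lop3.b32 %0, %1, %2, %3, %4;"
        : "=r"(result)
        : "r"(a), "r"(b), "r"(c), "n"(Lut));
    return result;
}

// Hypothetical kernel (not named in the list above) that wraps ptx_add().
__global__ void kernelAdd(uint32_t a, uint32_t b, uint32_t* out) {
    *out = ptx_add(a, b);
}

// Templated __global__ function forwarding its template argument to the
// templated __device__ function.
template <int Lut>
__global__ void kernelLop3(uint32_t a, uint32_t b, uint32_t c, uint32_t* out) {
    *out = ptx_lop3<Lut>(a, b, c);
}

// Hypothetical host helper: launches kernelAdd on one thread and copies
// the result back. Error checking is omitted for brevity.
uint32_t runAdd(uint32_t a, uint32_t b) {
    uint32_t* d_out = nullptr;
    uint32_t h_out = 0;
    cudaMalloc(&d_out, sizeof(uint32_t));
    kernelAdd<<<1, 1>>>(a, b, d_out);
    // cudaMemcpy on the default stream waits for the kernel to finish.
    cudaMemcpy(&h_out, d_out, sizeof(uint32_t), cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    return h_out;
}

// Hypothetical host helper: same pattern for the templated kernel.
template <int Lut>
uint32_t runLop3(uint32_t a, uint32_t b, uint32_t c) {
    uint32_t* d_out = nullptr;
    uint32_t h_out = 0;
    cudaMalloc(&d_out, sizeof(uint32_t));
    kernelLop3<Lut><<<1, 1>>>(a, b, c, d_out);
    cudaMemcpy(&h_out, d_out, sizeof(uint32_t), cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    return h_out;
}
```

The lookup table for `lop3` is obtained by applying the desired operation to the constants `0xF0`, `0xCC` and `0xAA`. For example, `0xF0 ^ 0xCC ^ 0xAA` is `0x96`, so calling `runLop3<0x96>(a, b, c)` from a host `main()` should yield `a ^ b ^ c`, and `runAdd(2, 3)` should yield `5`.
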
Build and run the example by following the general instructions.
PTX instructions used:
---8<--- "public/examples/src/ptx/ptx.cu"
---8<--- "public/examples/src/ptx/CMakeLists.txt"