It's a tensor library in C++.
- Somewhat fast, on Macs at least.
- Somewhat educational, with clean code (a gripe I have with ggml).
- Capable of loading an LLM.
- Trainable, with 0 dependencies (mostly for learning).
- Make sure the Metal compiler is installed.

```shell
make DEBUG=1 RUN_METAL=1
```

or

```shell
make DEBUG=1 RUN_METAL=1 rebuild
```

for a fresh build. Then run the tests:

```shell
./run_tests
```
- Tensors are lazy by default when not resident on the CPU; they can be realized and printed by moving them to the CPU.
- Example of tensor addition:
```cpp
#include <iostream>
#include <vector>
#include "tensor.hpp"
using namespace tensorlib;

// Inside a function body:
// --- contents ---                            --- shape ---
Tensor t1(std::vector<int>{1, 2, 3, 4, 5, 6}, {2, 3});
t1.to("gpu");
Tensor t2(std::vector<int>{4, 5, 6, 7, 8, 9}, {2, 3});
t2.to("gpu");
// If both tensors are on the GPU, the result will be on the GPU.
Tensor result = t1 + t2;
result.to("cpu");
std::cout << result << std::endl;
// Expected output: "Tensor([[5,7,9],[11,13,15]], dtype=i32, device=cpu)"
```
- Shape tracking and inference.
- More ops and efficient kernels.
- Backprop.
- Better kernels for Metal.
- Treat the CPU as an accelerator (AVX and the like) [WIP].
- Graph creation/optimization: Ast2IR, DCE, CSE [WIP].