v1.14
Backend updates
- llama-cpp-python: bump to 0.2.89.
- Transformers: bump to 4.44.
Other changes
- Model downloader: use a single session for all downloaded files to reduce the time to start each download.
- Add a --tokenizer-dir flag to be used with llamacpp_HF.
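
The model downloader change above relies on connection reuse: one session is created once and shared across every file, so later requests can skip part of the connection setup. The following is a minimal sketch of that pattern, not the project's actual downloader code. The function name download_all and its parameters are hypothetical, and the session argument is assumed to behave like a requests.Session object.

```python
import os
from urllib.parse import urlparse


def download_all(session, urls, dest_dir, chunk_size=1024 * 1024):
    """Download every URL in `urls` through one shared session.

    Assumption: `session` behaves like requests.Session, i.e.
    get(url, stream=True, timeout=...) returns a context-manager response
    exposing raise_for_status() and iter_content(chunk_size=...).
    """
    os.makedirs(dest_dir, exist_ok=True)
    saved = []
    for url in urls:
        # Name the local file after the last path component of the URL.
        filename = os.path.basename(urlparse(url).path)
        path = os.path.join(dest_dir, filename)
        # The same session object is used for every file, so its pooled
        # connections can be reused instead of being set up per download.
        with session.get(url, stream=True, timeout=10) as response:
            response.raise_for_status()
            with open(path, "wb") as f:
                for chunk in response.iter_content(chunk_size=chunk_size):
                    f.write(chunk)
        saved.append(path)
    return saved


# Usage with the third-party requests library (illustrative, not run here):
#   import requests
#   with requests.Session() as session:
#       download_all(session, ["https://example.com/a.bin"], "downloads")
```

The session is passed in rather than created inside the function, which keeps the caller in control of its lifetime and makes the sharing explicit.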