Run llama.cpp distributed over MPI
-
CFP open for 2023 Linux Plumbers Conference (November 13-15, Richmond VA, USA)
-
StableLM released today.
-
Rust foundation meltdown - Primeagen stream - Oxide Computer discussion
-
What is a large language model (LLM)?
-
What can I do with an LLM?
-
An LLM is, at heart, a data structure: given the preceding k tokens, it outputs a probability distribution over the next token.
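This "data structure" view can be sketched with a toy bigram model (k = 1): a table mapping each token to a probability distribution over the next token. Real LLMs condition on many tokens via a neural network; this counting version is purely illustrative.

```python
from collections import Counter, defaultdict

# Toy k=1 "LLM": count which token follows which, then normalize
# the counts into per-token next-token probability distributions.
def train_bigram(tokens):
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    probs = {}
    for prev, ctr in counts.items():
        total = sum(ctr.values())
        probs[prev] = {tok: n / total for tok, n in ctr.items()}
    return probs

model = train_bigram(["the", "cat", "sat", "on", "the", "mat"])
# After "the", the model has seen "cat" and "mat" once each,
# so each gets probability 0.5.
```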
-
ChatGPT-4's training data is approaching the size of the Library of Congress. We are running out of data.
-
New data will probably come from reflection - prompting an LLM with its own output for deeper insight.
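A reflection loop can be sketched as follows. `ask` is a hypothetical stand-in for a real LLM call (e.g. an HTTP API); the point is only the shape of the loop: the model's answer is fed back in for critique.

```python
# Hypothetical LLM call - a real version would hit a model API.
def ask(prompt):
    return f"[model response to: {prompt}]"

# Reflection: ask the question, then repeatedly prompt the model
# with its own previous answer, asking it to critique and improve.
def reflect(question, rounds=2):
    answer = ask(question)
    for _ in range(rounds):
        answer = ask(f"Critique and improve this answer: {answer}")
    return answer
```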
-
BabyGPT - a three-bit LLM
-
Tokenize - Train - Infer trained model - Quantize model to shrink size
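The final "quantize to shrink size" step can be sketched as symmetric int8 quantization of a weight vector: store one float scale plus small integers instead of full floats. llama.cpp's real formats are block-wise (e.g. Q4_0); this is a simplified stand-in.

```python
# Symmetric int8 quantization: map floats in [-max, max] to [-127, 127].
def quantize_int8(weights):
    scale = max(abs(w) for w in weights) / 127 or 1.0
    return [round(w / scale) for w in weights], scale

def dequantize(q, scale):
    return [v * scale for v in q]

w = [0.1, -0.5, 0.25, 1.0]
q, s = quantize_int8(w)
w2 = dequantize(q, s)
# Each dequantized weight is within one quantization step of the original.
```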
-
OpenAI tiktoken - high-performance tokenizer.
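The algorithm behind tokenizers like tiktoken is byte-pair encoding (BPE): repeatedly merge the most frequent adjacent pair of tokens into a new token. tiktoken itself is a fast Rust implementation; this toy Python version shows only one merge step.

```python
from collections import Counter

# One BPE merge step: find the most frequent adjacent token pair
# and replace every occurrence of that pair with a merged token.
def bpe_merge_once(tokens):
    pairs = Counter(zip(tokens, tokens[1:]))
    if not pairs:
        return tokens
    (a, b), _ = pairs.most_common(1)[0]
    out, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and tokens[i] == a and tokens[i + 1] == b:
            out.append(a + b)  # merge the pair into one token
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return out

toks = bpe_merge_once(list("banana"))  # "a","n" is the first most-common pair
```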
-
LLM training can cost millions - OpenAI burned spare GPUs at Azure WDM after the Bitcoin crash as a tax write-off.
-
LLM training and inference involve mostly tensor (matrix) operations.
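The core tensor operation is matrix multiplication - attention and MLP layers are built from it, which is why GPUs dominate. A minimal pure-Python version, just to show the shape of the computation:

```python
# Naive matrix multiply: C[i][j] = sum over k of A[i][k] * B[k][j].
# Real inference runs this on GPU/SIMD over huge matrices.
def matmul(A, B):
    rows, inner, cols = len(A), len(B), len(B[0])
    return [[sum(A[i][k] * B[k][j] for k in range(inner))
             for j in range(cols)] for i in range(rows)]

C = matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])  # → [[19, 22], [43, 50]]
```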
-
Once you have an LLM, fine-tuning can cost as little as $3.
-
ChatGPT4 Demo - GitHub Copilot demo if you want
-
whisper.cpp - uses OpenAI's Whisper to transcribe audio to text - WANTED: live transcription for meetings.
-
AutoGPT - BabyAGI - use GPT and scripts to drive other GPTs and scripts.
-
Reddit GPT has good weekly briefings.
-
llama.cpp - a fork of whisper.cpp - the most widely used C++ code for hosting your own LLM.
-
Hugging Face - stores open models in Git LFS.
-
web-llm - uses WebGPU to run the LLM in your browser.
-
Linux 7 will ship LLMs of various sizes and an SMT solver to prove responses correct.
-
CGROUPS3 - closer to AWS Zelkova and AWS IAM
-
The kernel LLM will be used as a dictionary for data compression.
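The idea of a model as a compression dictionary can be sketched with predictive rank coding: if a predictor ranks likely next tokens well, you only need to store each token's rank, and good predictions produce mostly small numbers that compress well. This toy version uses a bigram predictor, not a real LLM.

```python
from collections import Counter, defaultdict

# Build a predictor: for each token, the list of possible next
# tokens sorted from most to least frequent.
def build_predictor(tokens):
    counts = defaultdict(Counter)
    for p, n in zip(tokens, tokens[1:]):
        counts[p][n] += 1
    return {p: [t for t, _ in c.most_common()] for p, c in counts.items()}

# Compress: store the first token plus the rank of each actual next
# token in the predictor's sorted candidate list.
def compress(tokens, pred):
    ranks = [pred[p].index(n) for p, n in zip(tokens, tokens[1:])]
    return tokens[0], ranks

# Decompress: replay the ranks through the same predictor.
def decompress(first, ranks, pred):
    out = [first]
    for r in ranks:
        out.append(pred[out[-1]][r])
    return out

data = list("banana")
pred = build_predictor(data)
first, ranks = compress(data, pred)  # a perfect predictor yields all-zero ranks
```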
-
Oxide Computer-sized racks will have distributed Linux schedulers. Kubernetes goes extinct.
-
More systems code like compilers will run on GPU.