llama : correctly report GGUFv3 format #3818

Merged: 1 commit merged into ggml-org:master on Oct 27, 2023

Conversation

cebtenzzre (Collaborator) commented on Oct 27, 2023

Follow-up to #3552.

Before:

llm_load_print_meta: format           = unknown

After:

llm_load_print_meta: format           = GGUFv3 (latest)
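For context, here is a minimal sketch of the kind of mapping involved, not the actual llama.cpp patch; the helper name `gguf_version_name` and the hard-coded version are illustrative assumptions. Before this change, version-3 files fell through to the "unknown" default:

```cpp
// Hypothetical sketch: map the GGUF version number from the file header
// to the string printed by llm_load_print_meta.
#include <cstdint>
#include <cstdio>

static const char * gguf_version_name(uint32_t version) {
    switch (version) {
        case 1:  return "GGUFv1";
        case 2:  return "GGUFv2";
        case 3:  return "GGUFv3 (latest)"; // the case this PR adds
        default: return "unknown";         // what v3 files hit before
    }
}

int main() {
    printf("llm_load_print_meta: format           = %s\n", gguf_version_name(3));
    return 0;
}
```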

Will GGUFv2 be deprecated like GGUFv1 was?

edit: I guess it doesn't matter, since for little-endian files GGUFv3 is just a version bump over GGUFv2, AFAIK.
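The version-bump remark follows from the GGUF header layout: a 4-byte magic "GGUF" followed by a little-endian uint32 version. Below is a minimal standalone sketch (not llama.cpp code) that reads just that header; it assumes a little-endian host, where the raw bytes can be read directly into the integer:

```cpp
// Minimal sketch: read the GGUF magic and version from a model file.
// Assumes a little-endian host, so the file's little-endian uint32
// version field can be read without byte swapping.
#include <cstdint>
#include <cstdio>
#include <cstring>

int main(int argc, char ** argv) {
    if (argc < 2) {
        fprintf(stderr, "usage: %s model.gguf\n", argv[0]);
        return 1;
    }
    FILE * f = fopen(argv[1], "rb");
    if (!f) {
        perror("fopen");
        return 1;
    }

    char magic[4];
    uint32_t version = 0;
    if (fread(magic, 1, 4, f) != 4 || fread(&version, 4, 1, f) != 1) {
        fprintf(stderr, "failed to read GGUF header\n");
        fclose(f);
        return 1;
    }
    fclose(f);

    if (memcmp(magic, "GGUF", 4) != 0) {
        fprintf(stderr, "not a GGUF file\n");
        return 1;
    }
    printf("GGUF version: %u\n", version);
    return 0;
}
```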

@cebtenzzre cebtenzzre requested a review from ggerganov October 27, 2023 17:12
@cebtenzzre cebtenzzre merged commit 6d459cb into ggml-org:master Oct 27, 2023
mattgauf added a commit to mattgauf/llama.cpp that referenced this pull request Oct 27, 2023
* master: (350 commits)
  speculative : ensure draft and target model vocab matches (ggml-org#3812)
  llama : correctly report GGUFv3 format (ggml-org#3818)
  simple : fix batch handling (ggml-org#3803)
  cuda : improve text-generation and batched decoding performance (ggml-org#3776)
  server : do not release slot on image input (ggml-org#3798)
  batched-bench : print params at start
  log : disable pid in log filenames
  server : add parameter -tb N, --threads-batch N (ggml-org#3584) (ggml-org#3768)
  server : do not block system prompt update (ggml-org#3767)
  sync : ggml (conv ops + cuda MSVC fixes) (ggml-org#3765)
  cmake : add missed dependencies (ggml-org#3763)
  cuda : add batched cuBLAS GEMM for faster attention (ggml-org#3749)
  Add more tokenizer tests (ggml-org#3742)
  metal : handle ggml_scale for n%4 != 0 (close ggml-org#3754)
  Revert "make : add optional CUDA_NATIVE_ARCH (ggml-org#2482)"
  issues : separate bug and enhancement template + no default title (ggml-org#3748)
  Update special token handling in conversion scripts for gpt2 derived tokenizers (ggml-org#3746)
  llama : remove token functions with `context` args in favor of `model` (ggml-org#3720)
  Fix baichuan convert script not detecing model (ggml-org#3739)
  make : add optional CUDA_NATIVE_ARCH (ggml-org#2482)
  ...
brittlewis12 added a commit to brittlewis12/llmfarm_core.swift that referenced this pull request Nov 17, 2023
olexiyb pushed a commit to Sanctum-AI/llama.cpp that referenced this pull request Nov 23, 2023
brittlewis12 added a commit to brittlewis12/llmfarm_core.swift that referenced this pull request Nov 30, 2023