Add quantize script for batch quantization #92


Merged
merged 5 commits into from
Mar 13, 2023

Conversation

prusnak
Copy link
Collaborator

@prusnak prusnak commented Mar 13, 2023

Alternative to #17 suggested in #17 (comment)

@ggerganov ggerganov merged commit d1f2247 into master Mar 13, 2023
@ggerganov ggerganov deleted the quantize-sh branch March 13, 2023 16:15
@Jettford
Copy link

I wrote up a basic Python implementation of the same script for Windows users. Would it be worth making a pull request to replace the batch script?

import os
import re
import subprocess
import sys

def print_usage():
    print("Usage: llama-quantize.py 7B|13B|30B|65B [--remove-f16]")
    sys.exit(0)

if len(sys.argv) < 2:
    print_usage()

# The model-size argument must look like "7B", "13B", "30B", or "65B"
regex_test = re.compile(r"^[0-9]{1,2}B$")

if not regex_test.match(sys.argv[1]):
    print_usage()

model_directory = f"./models/{sys.argv[1]}/"

if not os.path.exists(model_directory):
    print("Failed to find model directory")
    print_usage()

for file in os.listdir(model_directory):
    if "ggml-model-f16.bin" not in file:
        continue

    file = os.path.join(model_directory, file)

    # Derive the output name by swapping the precision tag in the filename
    new_name = file.replace("f16", "q4_0")

    binary_name = "./quantize"
    if sys.platform == "win32":
        # On Windows the binary is quantize.exe, without the "./" prefix
        binary_name = "quantize.exe"

    # Pass arguments as a list so paths containing spaces are handled safely
    subprocess.run([binary_name, file, new_name, "2"], check=True)

    if len(sys.argv) > 2 and sys.argv[2] == "--remove-f16":
        os.remove(file)
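The argument validation and output-naming logic in the script above can be exercised in isolation (a minimal sketch; the example strings below are illustrative, not files the script requires):

```python
import re

# Same pattern the script uses to validate the model-size argument
regex_test = re.compile(r"^[0-9]{1,2}B$")

print(bool(regex_test.match("7B")))     # → True: sizes like 7B/13B/30B/65B match
print(bool(regex_test.match("7000B")))  # → False: more than two digits is rejected

# Output name is derived by swapping the precision tag in the filename
name = "ggml-model-f16.bin"
print(name.replace("f16", "q4_0"))      # → ggml-model-q4_0.bin
```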


@tmzncty
Copy link

tmzncty commented Mar 16, 2023

I think you can just compile quantize.exe directly and then run it in CMD:

quantize.exe ggml-model-f16.bin ggml-model-q4.bin 2

[screenshots of the build and quantization run]
