Quantize Llama models with GGUF and llama.cpp: GGML vs. GPTQ vs. NF4

bryan9 2024. 2. 13. 12:00

https://mlabonne.github.io/blog/posts/Quantize_Llama_2_models_using_ggml.html

Quantize Llama models with GGUF and llama.cpp
GGML vs. GPTQ vs. NF4