Promising stuff from their repo, claiming “exceptional performance, achieving a [HumanEval] pass@1 score of 57.3, surpassing the open-source SOTA by approximately 20 points.”
On TheBloke’s Hugging Face repo, it says the GGML quants are not compatible with llama.cpp. Anyone know why?
It’s a different type of model. llama.cpp only supports LLaMA models, while GGML (the machine learning library llama.cpp is based on) has examples for various models with different architectures: WizardCoder, MPT, BLOOM, and probably Falcon very soon. Some separate projects also use GGML to support other models (including some of the ones I listed). For example, the Rust “llm” project supports LLaMA, MPT, and BLOOM models.
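To make the compatibility point concrete, here is a rough sketch of it as a lookup table. The table is hand-written from what's said in this thread, not queried from any of these tools, and the `SUPPORT` dict and `runtimes_for` helper are made-up names for illustration only.

```python
# Hypothetical support table, transcribed by hand from this thread.
# It is not read from llama.cpp, GGML, or the Rust "llm" project,
# and it will go stale as those projects add architectures.
SUPPORT = {
    "llama.cpp": {"llama"},
    "ggml examples": {"wizardcoder", "mpt", "bloom"},
    "llm (Rust)": {"llama", "mpt", "bloom"},
}


def runtimes_for(arch):
    """Return the runtimes in SUPPORT that list the given architecture."""
    arch = arch.lower()
    return sorted(name for name, archs in SUPPORT.items() if arch in archs)


if __name__ == "__main__":
    for arch in ("wizardcoder", "llama", "mpt"):
        print(arch, "->", runtimes_for(arch))
```

So a WizardCoder GGML file needs a loader that knows that architecture, and sharing the GGML file format with llama.cpp is not enough on its own.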