Testing the Limits: My GTX 1070 Rig vs Mistral Small 22B

Smokeydope@lemmy.world · 2 months ago

brucethemoose@lemmy.world · edit-2 2 months ago

Good! Try the IQ quantizations as well (the M, XS, and XXS variants), especially if you try a 14B, as they “squeeze” the model into less space better than the Q3_K quantizations.

Yeah, I’m liking the 32B as well. If you are looking for speed just for utilitarian Q/A, you might want to keep a DeepSeek Coder V2 Lite GGUF on hand, as it’s uber fast even when only partially offloaded.