LunarWing - self-hostable AI agent framework built in Rust, focused on privacy and real secret management
lunarwingorg@lemmy.world · English · 2 days ago · image post · 3 comments · 11 up, 4 down
Qwen 3.6 27B running at 46 tok/s on an RX 9070 XT (llama.cpp + MTP Speculative Decoding is basically magic)
cicadagen@ani.social · English · 7 days ago (edited) · text post · 19 comments · 63 up, 2 down
Opencode llama-server prefill/generation stats plugin
troed@fedia.io · 7 days ago · link: codeberg.org · 3 comments · 17 up, 0 down
Oops
ikt@aussie.zone · English · 8 days ago · image post · 12 comments · 94 up, 3 down
North Mini Code v1.0 - a Qwen 3.6 35B MoE alternative
troed@fedia.io · 10 days ago · link: huggingface.co · 5 comments · 14 up, 1 down
My models don't have reasoning ability in llama-b9543 server but have in llama-cli
Schilling2304 · English · 13 days ago · text post · 2 comments · 8 up, 1 down
I Put a Datacenter GPU in My Gaming PC for £200
HelloRoot@lemy.lol · English · 14 days ago · link: blog.tymscar.com · 15 comments · 114 up, 5 down
Gemma 4 QAT models: Optimizing model compression for mobile and laptop efficiency
potatoguy@mbin.potato-guy.space · 15 days ago · link: blog.google · 0 comments · 20 up, 1 down
Gemma4 12b released with "unified" approach to multi-modality
robber@lemmy.ml · English · 16 days ago · link: huggingface.co · 14 comments · 23 up, 2 down
I Tried This Open Source ChatGPT Alternative [Jan AI] on Linux, But Went Back to Ollama
cm0002@lemy.lol · English · 18 days ago · link: itsfoss.com · 11 comments · 18 up, 3 down
Don't skimp on the quant when using MoE
troed@fedia.io · 22 days ago · link: unsloth.ai · 3 comments · 32 up, 2 down
Infinity-Parser2 - Multimodal Document Parser
pepperfree@sh.itjust.works · English · 23 days ago · link: huggingface.co · 0 comments · 8 up, 1 down
Your best local LLM for low-VRAM (6GB)?
sp3ctre@feddit.org · English · 29 days ago · text post · 14 comments · 33 up, 3 down
DystopiaBench - AI Ethics Stress Test
ikt@aussie.zone · English · 1 month ago · link: dystopiabench.com · 15 comments · 7 up, 1 down
Claude? No. Cucumbers? Yes!
SuspiciousCarrot78@aussie.zone · English · 1 month ago (edited) · text post · 3 comments · 16 up, 1 down
Llama.cpp MTP Support merged - up to 2.5x speed increase
TheCornCollector@piefed.zip · English · 1 month ago (edited) · link: github.com · 3 comments · 45 up, 1 down
Orthrus-Qwen3: up to 7.8× tokens/forward on Qwen3, identical output distribution
BB84@mander.xyz · English · 1 month ago (edited) · link: github.com · 4 comments · 10 up, 0 down
"The cost of running LLMs is just too damn high"
SuspiciousCarrot78@aussie.zone · English · 1 month ago (edited) · text post · 11 comments · 39 up, 1 down
Token Speed visualiser
SuspiciousCarrot78@aussie.zone · English · 1 month ago · link: mikeveerman.github.io · 0 comments · 10 up, 1 down
<8B multilingual models for language learning chatbots
XiELEd@piefed.social · English · 1 month ago (edited) · text post · 5 comments · 11 up, 2 down