Thanks capitalism for doing the stupidest implementation of this technology possible

  • poster596 [kit/kit's]@hexbear.net · 71 · 8 days ago

    a lotta yall still dont get it

    ape holders can use multiple slurp juices on a single ape

    so if you have 1 astro ape and 3 slurp juices you can create 3 new apes

  • KuroXppi [they/them]@hexbear.net · 53 · 8 days ago (edited)

    Where I work just had a consultant present to the board, and the senior exec, in two different sessions, on how to integrate AI across the workplace. I saw the slides. It was literally ‘here’s how to prompt AI, and here’s some freeware on creating your own agent’.

    Now, we’re likely to go on the hook for $45 $30 AUD* per user per month for Copilot M365 (not Copilot free, that’s entirely different; 365 is the ‘enterprise’ version which does the same thing while promising catgirl-sorry it won’t use company data for training)

    The threat is that ‘if your workers don’t use the copilot from your tenant, they will be putting company data into public LLMs, so you have to cough up’. It’s a direct threat, as Microsoft integrates the free version of Copilot across all its apps. Insidious, planet-destroying, criminal stuff: a big stick being presented as a productivity-enhancing carrot

    Edit: *(I had assumed it was USD and did a rough conversion into my currency, it’s actually 30 AUD, ~22usd. It’s still more than we can afford and infinitely more than it’s worth)

    • BelieveRevolt [he/him]@hexbear.net · 27 · 8 days ago

      That’s something I’ve been thinking about ever since ChatGPT went public.

      Years ago, it was revealed that some online translation site had a bunch of documents from various companies stored because people kept pasting them in; there’s no way company documents aren’t being put into the slop machines all the time.

    • ClimateStalin [they/them, he/him]@hexbear.net · 19 · 8 days ago

      I’m honestly shocked my employer (and basically every university system) hasn’t considered switching because of the huge liability this exposes them to

      Copilot is 100% going to get someone in huge trouble for exposing protected health information, and it should be considered malware on any computer in a health system

  • LeeeroooyJeeenkiiins [none/use name]@hexbear.net · 37 · 8 days ago

    realized last night in my drug fueled haze that they want AI to be a digital personal Epstein in the pocket of every american, by which I mean it’ll make CSAM and also track everything everyone says to it specifically to use to blackmail people, it’s the digital widening of the “big club”

  • LaughingLion [any, any]@hexbear.net · +36/−1 · 8 days ago

    the main problem is if you have a good computer (i.e. the average, run-of-the-mill gaming rig these days) you can run a model that will fill the “ai assistant” role about 90% as well as the best paid saas models, for free, on the computer you already have, with the addition that your local model is abliterated (jailbroken) to talk about restricted shit. all you need is like 32gb of ram and 8gb of vram minimum, which the average pc gamer is running these days.

    if you are a developer and have 64gb of ram and 16+gb of vram then you can run a claude-level local ai as well. i’ve not fucked with image or video generation but those models are available too

    if anyone needs a tutorial for the technically uninclined i can write one up, because the only barrier is that it’s all in techbro and dev-speak
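The RAM/VRAM figures being thrown around here can be sanity-checked with napkin math. A minimal sketch, assuming a typical ~4.5 bits-per-weight 4-bit GGUF quant and a loose ~2 GB runtime overhead (both numbers are ballpark assumptions, not measurements):

```python
# Napkin math for "will this model fit on my machine at all?"
# bits_per_weight ~4.5 approximates a 4-bit GGUF quant (Q4_K_M-ish);
# overhead_gb is a loose guess at runtime buffers, not a measurement.

def quant_size_gb(params_b, bits_per_weight=4.5):
    """Approximate in-memory size of a quantized model in GB."""
    return params_b * 1e9 * bits_per_weight / 8 / 1e9

def fits(params_b, vram_gb, ram_gb, overhead_gb=2.0):
    """Can the quant be split across VRAM plus system RAM at all?"""
    return quant_size_gb(params_b) + overhead_gb <= vram_gb + ram_gb

# A 24B model at ~4.5 bits/weight is ~13.5 GB: too big for 8 GB of
# VRAM alone, which is why layers/tensors get offloaded to system RAM.
print(quant_size_gb(24))                # 13.5
print(fits(24, vram_gb=8, ram_gb=32))   # True
print(fits(100, vram_gb=8, ram_gb=16))  # False
```

The takeaway matches the claim above: a 32gb RAM / 8gb VRAM gaming rig can hold a 24B-class quant, just not entirely on the GPU.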

      • LaughingLion [any, any]@hexbear.net · 1 · 7 days ago

        Also, 16gb VRAM? You’ll be able to load a better model like https://huggingface.co/mradermacher/Skyfall-31B-v4-i1-GGUF which is a little stronger than the ones I linked in the guide. If the “i1-Q4_K_S” is too large then try the “i1-IQ4_XS” quant.

        Probably try it offloading just the down tensors (top option) in the guide. Make sure your KV Batch size is 1024 (or better) so the context gets offloaded on the GPU faster to cut down on response times. Otherwise everything else in the guide is good for you. If you find you have a little bit of VRAM space at 16k context and 1024 batch size try upping the context a little until your VRAM is like 15GB utilized or better.

      • LaughingLion [any, any]@hexbear.net · 1 · 7 days ago

        The guide is the same, you’ll just need to follow the install instructions for Linux for Koboldcpp and SillyTavern documented on the github pages linked.

        The performance guide is the same as well as the link on how to set up the “character” to act as a GM. If you run into any hitches maybe I can help but to be clear I haven’t touched Linux in 10 years.

    • MeetMeAtTheMovies [they/them]@hexbear.net · 9 · 8 days ago

      Do you need open weight models for this? Any recommendations for which one(s)? Do you need to download a huggingface client or something? I’m familiar with AI stuff, but not running locally.

      • You’d download ollama, and then get the model via ollama pull, I believe. Although the 90% as good mark is pushing it, as open weight models below 32b parameters (what you could reasonably run on those machines) benchmark around 40% less than Opus 4.6 for software, and the difference is night and day for general reasoning.

      • LaughingLion [any, any]@hexbear.net · 4 · 8 days ago

        If you are running at home as a hobby then just use Koboldcpp, and maybe SillyTavern if you want extra functionality. In the former you can offload down and potentially up tensors to save VRAM space if needed. For models, it depends on need.

        A 24-31B model is generally more than fine for most @home use cases, and they are quite “smart”, though that doesn’t mean anything in regards to AI. It’s a vibe, basically. A 32GB RAM / 8GB VRAM machine can use a 24B model to generate about 5 tokens per second, which is fine for an agent that is designed to give you short replies to answer questions.

        You’ll most likely want to grab a GGUF quantization from Hugging Face, yes. Any 4-bit quant is fine, really. The merges are all quasi-abliterated models for people who want slutty AI girlfriends/boyfriends. The models directly from companies like GLM or Kimi or whatever are more standard and generally run more efficiently.

        People in development are likely going to want a 70-100B model. Claude, I think, is a 100B model. You can run those on about 64gb of ram and 32gb of VRAM.

        If you want settings for Koboldcpp I can give you the rundown on how to optimize.
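To put the “about 5 tokens per second” figure in perspective, response time is roughly prompt processing plus generation. A small sketch; the 200 tps prompt-processing speed below is an illustrative assumption, not a benchmark:

```python
# Seconds to get a reply: chew through the prompt, then generate.
# pp_tps = prompt-processing speed, tg_tps = text-generation speed.

def response_time(prompt_tokens, reply_tokens, pp_tps, tg_tps):
    return prompt_tokens / pp_tps + reply_tokens / tg_tps

# A 200-token answer at 5 tps is ~40 s of generation alone; with a
# full 8k-token context to process first, the wait roughly doubles.
print(response_time(0, 200, pp_tps=200, tg_tps=5))     # 40.0
print(response_time(8000, 200, pp_tps=200, tg_tps=5))  # 80.0
```

Which is why 5 tps is tolerable for short Q&A replies and painful for long-form roleplay.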

      • BountifulEggnog [she/her]@hexbear.net · 3 · 8 days ago

        It depends massively on what hardware you have. I’ve heard good things about glm 4.7 flash and it’s easy enough to run. Also depends on what you want to use it for.

    • RamenJunkie@midwest.social · 5 · 8 days ago

      I would love to know the best way to get jailbroken ChatGPT functionality/experience locally. Including the random image generation and adjustments.

      I have tried a few things and images always seem to be separate. And when I tried a chatbot with stories it was terrible at keeping track of characters and what was going on.

      • LaughingLion [any, any]@hexbear.net · 3 · 8 days ago (edited)

        If you are talking about multi-modal stuff then locally the best way to do it is still separately.

        If you just want to run a local adventure bot “game master” then Koboldcpp and SillyTavern is the way to go. I’m a tabletop gaming nerd so this is what I use it for when I’m sitting on the couch sometimes.

        Simple guide for Windows:

        1. Go grab Koboldcpp Normal exe for NVidia or the NOCUDA if you have AMD and want a smaller download.
        2. Go download Nodejs and install that.
        3. Download SillyTavern {Go to the green “Code” box in the top-right and click it then select download.}
        4. Extract SillyTavern into a folder that has no spaces (Spaces in the folder name will break their install script)
        5. Browse to the /SillyTavern-Launcher/ folder and run the Installer.bat file and run through the simple install.
        6. Go download a jailbroken model. If your RAM situation looks desperate like 16/6 then get something like Rocinante or if you have a little more RAM like 32/12 then go with something like Goetia
        7. Run Koboldcpp.exe. Load the model you downloaded. Press Launch. Wait for it to say it’s waiting for connection at the endpoint. By default it’ll give you the local web launch page and you can actually test things there.
        8. If everything looks good and SillyTavern isn’t already loaded then launch that from the Launcher.bat.
        9. In SillyTavern you click the plug button at the top and make sure you can connect to the endpoint. Might have to change the port from 5000 to 5001.
        10. Technically, you are done here and it’s just a matter of finetuning your settings and setting up a character card to be a game master.

        Finetuning:

        1. First rule of performance is the more of the model you can fit into VRAM the better. ALL is best.
        2. The second rule is to offload ffn_ tensors. Down, then up, then if you are desperate, gate.
        3. Third rule, if you are REALLY desperate, is to offload KV cache. At this point your model is running SLOW.
        4. What I would do is download the model. Load it in Kobold. Set your GPU layers to 99 and your context to 16384 (Minimum for text roleplayers)
        5. Click the left Hardware tab. Check Use MMAP, Use mlock, High Priority. Up the threads to one lower than your logical CPU core count. Turn off Launch Browser if you want.
        6. Click the left Context tab. Check No BOS Token (SillyTavern already does this).
        7. Save that config and let the model load. Then use Task Manager to look at your GPU VRAM usage in the performance tab. Are you using all of it?
        8. If “no” then close it and give yourself more context. The ultimate goal for text roleplay geeks like us is 32k. You can also up the Batch Size in the Hardware tab to 1024 or 2048, but realize this inflates the KV cache buffers and hogs VRAM. A bigger batch helps your prompt process quicker, which starts to matter at larger context sizes. Your speeds on a fully VRAM-loaded model should be like 50tps or better (tokens per second).
        9. If “yes” then run some prompts through the assistant or Seraphina default character cards and check your speeds. They are probably slow.
        10. Close the Koboldcpp window and relaunch it. Load your previous config (you saved it didn’t you?) and plug in a tensor offload command (bottom of my comment) into the Override Tensors field in the Hardware Tab. Then return to step 7. Keep doing this until you get speeds of at least 5tps and can have a decent context size (16k+).
        11. If you still can’t get enough speed because of VRAM constraints try a 3bit quant of the model you got or go to a less complex, small model. Sorry bubs, welcome to the club. We VRAM poor in this house.
        12. If you are REALLY desperate you can try offloading the KV cache in the Hardware tab but the slowdown is MASSIVE.
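To see why context size trades off so directly against VRAM in the steps above, here’s a rough KV-cache estimate. The layer/head/dim numbers are illustrative for a ~24B-class model, not pulled from any specific checkpoint:

```python
# Rough KV-cache size: K and V tensors per layer, one slot per token.
# n_layers / n_kv_heads / head_dim are illustrative assumptions, not
# from any specific model; bytes_per_elem=2 assumes an fp16 cache.

def kv_cache_gb(context, n_layers=40, n_kv_heads=8, head_dim=128,
                bytes_per_elem=2):
    return 2 * n_layers * context * n_kv_heads * head_dim * bytes_per_elem / 1e9

print(round(kv_cache_gb(16384), 2))  # 2.68
print(round(kv_cache_gb(32768), 2))  # 5.37
```

The cache grows linearly with context, so going from 16k to the 32k goal doubles it: that is the VRAM you are hunting for with the tensor offloads.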

        Setting up a “character” card and “persona” in SillyTavern and some other settings

        1. Use this guide to start
        2. Don’t use his character template, it’s kind of bad. Instead consider using a guide like the character foundry template. I’ll share a character I made to format characters this way below. Works pretty well. Feed it as much information as you want about the character and it makes a nice entry for the Lorebook.
        3. Adjustments I made are to reduce the context to anywhere from 160-220 depending on the model. I don’t restrict TopK (leave at zero for off). I use the XTC settings over the repetition setting because it works better on longer adventures (like 500+ responses). If you are running a long adventure and the model starts omitting pronouns and words like “the” then you probably have an issue with your repetition settings being too strict or incorrect.
        4. Learn to love the Lorebook. I wrap my supporting “characters” in the lorebook with <character>{}</character> and that seems to help models keep track of them. There are also extensions to help with other lorebook entries, like WREC, though I use an inline summarizer which I like.

        Character Creator:

        You will create a character card considering the context provided by {{user}}. Adhere to the following format strictly, restructuring the context to fit this format:
        
        # {The Character name}
        - **Age:** {age in years}
        - **Gender:** {male, female, nonbinary, or transgendered}
        - **Height:** {height in feet and inches}
        - **Race/Species:** {the race or species of a character such as human or elf or other fantasy species}
        
        ### Appearance
        {Describe the person's basic physical appearance.} {Describe briefly how their body has shaped their life, if it gives them the attention they like or dislike and if they have insecurities about their body.} {Write a sentence or two about their clothing preferences and what kind of image they are trying to project about themselves.}
        
        ### Background
        {Write a sentence describing where this character grew up and how that culture shaped them.} {Write a few sentences about experiences they had growing up that may have shaped who they are.}
        
        ### Personality
        {A sentence or two detailing where this character feels they are in their life and if they are satisfied or not with their current situation.} {Describe a few things the character enjoys such as music, movies, games, or other activities.} {Describe briefly the character's innermost desire.} {Describe briefly the character's innermost fear.} {Describe a lie the character believes about themselves.} {Describe one irrational thing the character does or believes.}
        
        ### Speech
        [{Write about how this person talks considering their accent and the slang and idioms they use. Give examples.}]
        
        ### Relationships
        {Write a sentence or two about how this character expresses attraction or fondness for others.} {Write a sentence or two about boundaries this character has in relationships.} {Create one issue that makes relationships or intimacy difficult for this character.}
        

        Some basic Tensor offloads, try the top one first, second one if you still need VRAM space, last is a big slowdown:

        blk\.\d+\.ffn_(down*)=CPU
        blk\.\d+\.((ffn_down*)|(ffn_up*))=CPU
        blk\.\d+\.((ffn_down*)|(ffn_up*)|(ffn_gate*))=CPU
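As far as I understand it, those overrides are ordinary regexes matched against GGUF tensor names, so you can preview what a pattern would send to the CPU with a few lines of Python. The tensor names below are typical llama.cpp-style names written from memory, so treat them as an assumption:

```python
import re

# The part before "=CPU" in the first override, as a plain regex.
pattern = re.compile(r"blk\.\d+\.ffn_(down*)")

tensors = [
    "blk.0.ffn_down.weight",  # matched -> offloaded to CPU
    "blk.0.ffn_up.weight",    # not matched -> stays on GPU
    "blk.0.attn_q.weight",    # not matched -> stays on GPU
]
offloaded = [t for t in tensors if pattern.search(t)]
print(offloaded)  # ['blk.0.ffn_down.weight']
```

The second and third overrides just widen the alternation to also catch the ffn_up and ffn_gate tensors, freeing more VRAM at the cost of more slowdown.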
        
    • plinky [he/him]@hexbear.net · 3 · 8 days ago

      local image generation is more involved as well, there is some fun in contorting models with controlling networks which don’t make sense for a prompt, and it producing something bizarre in response, but i got bored in a month.

      but it’s kinda mechanical fun, like playing with frequencies in some audio software.

    • NominatedNemesis@reddthat.com · 3 · 8 days ago

      I would like to agree with you, but in my experience I cannot. I usually use local models on my work computer, and have access to pro models paid for by the company. There is a great difference.

      I have to use AI. It is in the KPI and my salary raise depends on it… so stupid. I just got a mail that we cannot replace our computers for the foreseeable future because of the RAM and SSD shortage… Meanwhile I fight with “developers” that are generating code… which does not even work!

      So I use local AI because I am forced to use AI and I am in the terminal anyway. Also f×ck the great companies pushing their bullsh×t, they already demonstrated that they will use the data they get from paying companies as well.

      AI has a use case, but LLMs are not the sentient sh×t they want us to believe in. I want to go back to before the hype…

      I actually programmed an AI and trained it to do repetitive but not well-definable tasks for me in C++ with OpenCV and some tensor library; after a week it worked better than any human. I also helped a research group optimize an image-recognition AI to help doctors identify cancerous cells.

      • LaughingLion [any, any]@hexbear.net · 3 · 8 days ago

        You’re right. Getting a 24B model locally isn’t going to be as powerful as a 600B model, for sure. You’re also right that they don’t think. They absolutely don’t.

        But the local ones are pretty powerful and can do a lot more than most think. Even some simple vibe coding can be done with local AI. I think for the average gamer type if they wanted to mess with it local is more than enough tbh.

        • NominatedNemesis@reddthat.com · 3 · 8 days ago

          To be fair, local is better in many ways than cloud solutions: it keeps the data private, and does not lock you into a vendor (which they desperately want). Also LoRA is an option for fine-tuning, but that’s way advanced for an average user.

  • MrPiss [he/him]@hexbear.net · 32 · 8 days ago

    I’m honestly glad that 84% of people haven’t used “AI”. Despite all the hype and propaganda, they haven’t been able to get a lot of people to really engage with it. Most of those people probably did one or two prompts and realized it was dogshit and stopped.

    • Most of those people probably did one or two prompts and realized it was dogshit and stopped.

      Me. I only got deepseek when it came out to have a laugh. Now I’m lumped in with the percentage that have “used ai” even tho I dont use it and actively make fun of it lmao

  • RamenJunkie@midwest.social · 28 · 8 days ago

    I saw a comment from one of these jokers about how “people use AI and don’t even realize it,” which then cited, like, phone autocorrect.

    Like, OK buddy.

  • MarmiteLover123 [comrade/them, any]@hexbear.net · +21/−1 · 8 days ago (edited)

    This tweet appears to be a gross misrepresentation of the data in this October 2025 report, which was compiled into this graphic

    In other words, the tweet is a straight up lie.

    To count as “using AI”, the over one billion people who “use AI” had to have used a standalone AI chatbot application within the past month. So this is not a measurement of everyone who has ever used AI; it’s a measurement of active monthly users of standalone AI chat interfaces/tools. In other words, it’s not “84% of people have never used AI”, it’s “84% of people have not used a standalone AI chat interface or tool within the past month”. According to the criteria established at the beginning of the report, none of the following counted toward the “use AI” stat:

    AI Overviews in Google search results

    The use of AI capabilities within software and tools such as Gmail, Microsoft Office, Canva, Adobe Photoshop, or Grammarly

    The use of “AI companion” chatbots such as Character.AI

    Use of Meta AI within apps such as Facebook, WhatsApp, and Instagram, because such use is indistinguishable from these platforms’ generic search functions.

    For instance, Gemini is now included in Google search, so anyone who has used Google has used AI. But this is not counted in the “use AI” stat, which only measures use of standalone AI chat interfaces within the past month.

    Over a billion people using standalone AI tools within the past month is an absolutely massive market, and the opposite of what the tweet is implying. If you were to add in all the “passive AI users” that were not included in the stats, billions of people would have used AI in the past month; I’d guess a vast majority of PC and smartphone users, because it’s integrated by default into everything, and only a small minority are going to manually turn it all off. AI has already achieved massive market penetration, with over a billion active monthly users and billions of passive users. Whether that means AI is a bubble or not, I don’t know. But the implications that there is a huge amount of untapped growth, or that the number of AI users is small and no one uses the technology, are both incorrect, based on misinterpretations of the statistics.

    • RamenJunkie@midwest.social · 25 · 8 days ago

      Being forcibly shown a useless AI box on your search isn’t exactly “using AI” either, and if it vanished tomorrow, I would bet more people would be happy than sad about losing the useless clutter that just slows everything down when loading the page.

      • KuroXppi [they/them]@hexbear.net · 13 · 8 days ago (edited)

        I’ve observed the opposite: most people I see googling things take the Gemini response as-is and don’t even click into the first result

        Edit: that is, if they don’t already actively use an LLM to seek out the answer in the first instance

    • Sodium_nitride@lemmygrad.ml · 12 · 8 days ago

      All of this extra use only means that AI companies are burning absurd amounts of cash propping up even more AI services that people aren’t paying for. But in a world with insane investor cash flows and huge support from the government, the AI industry can hobble along for a good amount of time.

  • Thordros [he/him, comrade/them]@hexbear.net · 18 · 8 days ago

    If only 14% of people use a product, and only 0.3% of those people think it’s worth paying for, that means there’s enormous growth potential. stonks-up

    If everybody is a paid user of a product, and it’s incredibly popular, that means revenue is through the roof. stonks-up

    • BlueOctopus@feddit.uk · 7 · 8 days ago

      This was what was good about old old Reddit comments, probably 15+ years ago before the enshittification began: there would be a source link up dooted. Duck Duck Go found this which sort of matches up https://biggo.com/news/202507011842_AI_Usage_Boom_Faces_Payment_Problem dunno who any of these ppls are or if it is to be trusted, I am drunk, this is my yearly post, go fuck your selves and do more of these comments pls lemmy community. I do use ai for powershell scripts, it has made that easier but you still have to argue with the bugger to get it to do what you want, I pay for nothing. I am the problem not the solution. I am crab.