• redtea@lemmygrad.ml · 2 days ago

    Thanks. That helps me understand things better. So I’m guessing you need all the data initially to build the graph (the model), and after that you only need the model itself?

        • KnilAdlez [none/use name]@hexbear.net · 2 days ago (edited)

          That’s a great question! The models come in different sizes: one large ‘foundational’ model is trained first, and that is then used to train smaller models. US companies generally do not release their foundational models (I think), but Meta, Microsoft, DeepSeek, and a few others release smaller ones, available on ollama.com. A rule of thumb is that 1 billion parameters takes about 1 gigabyte of memory. The foundational models are hundreds of billions, if not trillions, of parameters, but you can get a good model at 7–8 billion parameters, small enough to run on a gaming GPU.
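
          To make that rule of thumb concrete, here is a minimal sketch in Python. The byte-per-parameter figures are my assumptions: roughly 1 byte per parameter for the 8-bit quantized builds commonly distributed, 2 bytes for half-precision (fp16) weights.

          ```python
          def approx_weight_gb(params_b: float, bytes_per_param: float = 1.0) -> float:
              """Rough memory footprint of a model's weights, in gigabytes.

              params_b billion parameters * bytes per parameter = gigabytes,
              since 1 billion bytes ~= 1 GB. Ignores activation and KV-cache
              memory, so real usage at inference time is somewhat higher.
              """
              return params_b * bytes_per_param

          print(approx_weight_gb(7))         # ~7 GB: a 7B model at 8-bit, fits a gaming GPU
          print(approx_weight_gb(7, 2.0))    # ~14 GB: the same model in fp16
          print(approx_weight_gb(400, 2.0))  # ~800 GB: an illustrative foundational-scale model
          ```

          The 400-billion-parameter figure is just an illustrative stand-in for “hundreds of billions”; the point is that weights alone at that scale are far beyond any consumer GPU.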