• Jhex@lemmy.world
    2 days ago

    Not necessarily… if I gave you my “faster car” to run on your private 7-lane highway, you could definitely squeeze out every last bit of speed the car offers, but no more.

    DeepSeek works as intended on 1% of the hardware the others allegedly “require” (allegedly, because remember, this is all a super hype bubble). If you run it on super powerful machines, it will perform better, but only to a certain extent… it will not suddenly develop more or better qualities just because the hardware it runs on is better.

    • merari42@lemmy.world
      1 day ago

      Didn’t DeepSeek solve some of the data-wall problems by creating good chain-of-thought data with an intermediate RL model? That approach should still follow the tried-and-tested scaling laws, just using much more compute.
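
      The “tried-and-tested scaling laws” mentioned above can be sketched with the parametric loss fit from the Chinchilla paper (Hoffmann et al., 2022). The coefficients below are that paper’s published fits, not anything claimed in this thread, so treat this as an illustrative sketch:

      ```python
      # Chinchilla parametric loss fit: L(N, D) = E + A/N^alpha + B/D^beta
      # Coefficients are the published Hoffmann et al. (2022) estimates;
      # they are assumptions for illustration, not values from this thread.
      E, A, B, alpha, beta = 1.69, 406.4, 410.7, 0.34, 0.28

      def predicted_loss(n_params: float, n_tokens: float) -> float:
          """Predicted pre-training loss for n_params parameters
          trained on n_tokens tokens."""
          return E + A / n_params**alpha + B / n_tokens**beta

      # More compute (bigger model, more data) keeps lowering predicted loss...
      small = predicted_loss(1e9, 20e9)     # ~1B params, 20B tokens
      large = predicted_loss(70e9, 1.4e12)  # ~70B params, 1.4T tokens
      assert large < small
      # ...but with diminishing returns: the loss can never go below E.
      ```

      The point being: throwing more compute at training data generated this way is still expected to pay off smoothly, just with diminishing returns toward the irreducible term E.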

    • PlutoniumAcid@lemmy.world
      2 days ago

      This makes sense, but it would still let a hundred times more people use the model before running into limits, no?
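
      The arithmetic behind this comment is straightforward: if each request costs 1% of the compute, a fixed cluster serves ~100× the users at the same speed, even though no individual answer gets better. All numbers below are made up purely for illustration:

      ```python
      # Back-of-the-envelope serving capacity: cheaper requests mean more
      # concurrent users on the same hardware. Numbers are hypothetical.
      cluster_flops = 1e18                 # assumed cluster throughput (FLOP/s)
      flops_per_request_big = 1e14         # assumed cost of one request, big model
      flops_per_request_small = flops_per_request_big / 100  # "1% of the hardware"

      users_big = cluster_flops / flops_per_request_big
      users_small = cluster_flops / flops_per_request_small

      # Capacity scales 100x; output quality does not change at all.
      assert users_small == 100 * users_big
      ```

      That is the distinction the thread is circling: better or more hardware buys throughput (more users, lower latency), not new model capabilities.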