• tellmeaboutit@lemmygrad.ml
    2 days ago

    That might change now that companies are building “reasoning” models like DeepSeek R1. Architecturally they aren’t all that different from earlier models, but they produce much longer outputs, and every extra generated token takes more compute.
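
    To put rough numbers on that, here is a back-of-the-envelope sketch. It assumes the common rule of thumb of about 2 FLOPs per active parameter per generated token and ignores the extra attention cost of longer contexts. The parameter count and token counts are made up for illustration, not measurements of any real model.

```python
def decode_flops(active_params: float, output_tokens: int) -> float:
    """Rough generation cost: ~2 FLOPs per active parameter per output token.

    This is an approximation; it leaves out attention over the growing context.
    """
    return 2.0 * active_params * output_tokens


# Hypothetical numbers, for illustration only.
ACTIVE_PARAMS = 40e9        # assumed active parameters per token
SHORT_ANSWER_TOKENS = 300   # assumed length of a direct answer
REASONING_TOKENS = 6000     # assumed length of an answer with a long reasoning trace

short_cost = decode_flops(ACTIVE_PARAMS, SHORT_ANSWER_TOKENS)
long_cost = decode_flops(ACTIVE_PARAMS, REASONING_TOKENS)
ratio = long_cost / short_cost

print(f"short answer:   {short_cost:.2e} FLOPs")
print(f"with reasoning: {long_cost:.2e} FLOPs")
print(f"ratio: {ratio:.0f}x")
```

    Under these assumptions the same model costs about as many times more per query as the output is longer, which is the sense in which longer outputs alone drive up compute.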