it will lose its ability to differentiate between there and their and its and it’s.

    • driving_crooner@lemmy.eco.br
      9 months ago

      It’s about the counting subreddit. Its posts were in the data used to build the tokenizer’s vocabulary, but were then removed from the training data. This user posted so much on that subreddit that their username got its own token, but nothing was associated with that token during training, so the model doesn’t know how to act when the token is present.
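
A rough sketch of that mismatch, in case it helps. Everything here is an assumption for illustration: the username `example_counter`, the tiny corpora, and the word-level "tokenizer" (real tokenizers use subword schemes such as BPE, not whole-word counts). It only shows how a token can end up in the vocabulary and then never appear in training.

```python
from collections import Counter


def build_vocab(corpus, min_count=3):
    """Toy stand-in for tokenizer training: every whitespace-separated
    word seen at least `min_count` times gets its own token."""
    counts = Counter(word for doc in corpus for word in doc.split())
    return {word for word, n in counts.items() if n >= min_count}


def untrained_tokens(vocab, training_corpus):
    """Return vocabulary tokens that never occur in the training data."""
    seen = {word for doc in training_corpus for word in doc.split()}
    return sorted(vocab - seen)


# Hypothetical data used to build the vocabulary, counting posts included.
tokenizer_corpus = [
    "the cat sat",
    "the dog sat",
    "the bird sat",
    "example_counter 1001",
    "example_counter 1002",
    "example_counter 1003",
]

# Hypothetical training data: the same posts with the counting ones filtered out.
training_corpus = [doc for doc in tokenizer_corpus if "example_counter" not in doc]

vocab = build_vocab(tokenizer_corpus)
orphans = untrained_tokens(vocab, training_corpus)
print(orphans)  # ['example_counter']
```

The username is frequent enough to earn a token, but the model never sees it during training, so whatever is stored for that token is never updated.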