So this has been going around in my head for a while now: what if they don't care about their users per se, but want to use the few users they get to exploit federation and shamelessly crawl the fediverse?

I mean… they'd get enough users subscribing to enough of the fediverse that instances of every shape and size would proactively deliver our post and interaction data to them, with free shipping, right?

So, in the end, isn't defederating not only a safeguard against company-controlled content that might flood the fediverse, but also a measure to protect the people on the fediverse right now from ending up in Meta’s databases, just as they would have if they had used Facebook in the first place?

  • Dr. Moose@lemmy.world
    1 year ago

    I don’t think anyone needs consent to do research using your public posts, though. You can literally scrape the whole of Twitter and run sentiment analysis on it, for example, and nobody can do anything about it.

    • Norgur@discuss.tchncs.deOP
      1 year ago

      Yes, you can. But that won't give you the interaction history (who liked what and such), and it's way less convenient than “set up ActivityPub in own app real quick and have the whole fediverse send shit to me nicely formatted with interaction data ready to be used”. Web scraping also raises legal issues in some spots, like when you copy and use copyrighted imagery or happen to scrape stuff you weren’t allowed to see for some reason.

      All of those hurdles are out of the way automatically when you literally just use the inner workings of the service the data comes from. No user can complain when Meta collects data sent to them via ActivityPub. That is literally what this protocol is for, and it's the core of any application running it. If you don’t want your data to be sent to other instances around the world: don’t use the protocol, right?

      They can get the data in many different ways; this is just the most convenient one.
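
To make the "nicely formatted with interaction data" point concrete, here is a rough sketch of what reading a federated "like" could look like on the receiving end. The payload follows the general shape of an ActivityStreams 2.0 `Like` activity, but the URLs, the `RAW_ACTIVITY` constant, and the `summarize_interaction` helper are all made up for illustration; real servers send more fields and sign their requests, which is skipped here.

```python
import json

# Hypothetical payload in the shape of an ActivityStreams 2.0 "Like" activity.
# The actor and object URLs are invented for this example.
RAW_ACTIVITY = """
{
  "@context": "https://www.w3.org/ns/activitystreams",
  "type": "Like",
  "actor": "https://example.social/users/alice",
  "object": "https://example.town/posts/123"
}
"""


def summarize_interaction(raw: str) -> dict:
    """Pull out who did what to which post from one activity (illustrative helper)."""
    activity = json.loads(raw)
    target = activity.get("object")
    # "object" may be a plain URL or an embedded object that carries an "id".
    if isinstance(target, dict):
        target = target.get("id")
    return {
        "who": activity.get("actor"),
        "what": activity.get("type"),
        "target": target,
    }


summary = summarize_interaction(RAW_ACTIVITY)
print(summary)
```

No HTML parsing or guesswork is involved: the who-liked-what relationship is already a structured record by the time it arrives, which is the convenience the comment above is describing.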