With all the talk about content being lost as people delete their posts and comments and subs go private or otherwise protest, I wondered if people have an appetite for doing some work to move key bits of reddit history over to kbin/Lemmy for posterity?
A dedicated instance sounds better than a magazine, though I'm not sure who’d be willing to take on the expense of hosting that volume of data.
The easiest grab method would be the API, but that leaves us about a week to get dev approval and copy all the Reddit data we want without getting banned.
Actually, even subreddits are affected by the 1000-item indexing limit, IIUC, so we would be limited in what content we could discover without an external source.
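For anyone who hasn't run into the cap, here is a rough sketch of what paging through a listing looks like. `fetch_page` is a hypothetical callable standing in for a real, authenticated API request, not an actual client. The response shape (`data.children`, `data.after`) is Reddit's listing JSON as I understand it, and the ceiling of roughly 1000 items is from memory, so treat both as assumptions.

```python
from typing import Callable, Optional


def collect_listing(
    fetch_page: Callable[[Optional[str]], dict],
    max_items: int = 1000,  # assumed listing ceiling, not a documented guarantee
) -> list:
    """Follow `after` cursors until the listing runs dry or max_items is hit."""
    items = []
    after = None  # None requests the first page
    while len(items) < max_items:
        page = fetch_page(after)
        children = page["data"]["children"]
        if not children:
            break  # nothing more to discover through this listing
        items.extend(child["data"] for child in children)
        after = page["data"]["after"]
        if after is None:
            break  # the API reports no further pages
    return items[:max_items]
```

Once the cursor comes back empty, anything older is simply not reachable through that listing, which is why an external source would be needed for discovery.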
I guess we could grab from the Pushshift torrents and use API access to grab as much as we can of the last couple of months? (Pushshift lost access at the start of May, iirc, so that’s where the gap would start.) Also, getting stuff from subs that are still private in protest would be a problem.
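Roughly what I have in mind for stitching the two sources together, as a sketch only. It assumes the dumps are newline-delimited JSON once decompressed, with `id`, `subreddit`, and `created_utc` fields, and that the gap starts on 1 May 2023 (my guess at the cutoff, so adjust it). The function names are made up for illustration.

```python
import json
from datetime import datetime, timezone
from typing import Iterable, Iterator

# Assumed start of the gap: 1 May 2023 UTC (approximate, see note above).
GAP_START = datetime(2023, 5, 1, tzinfo=timezone.utc).timestamp()


def dump_posts(lines: Iterable[str], subreddit: str) -> Iterator[dict]:
    """Yield one subreddit's posts from decompressed NDJSON dump lines."""
    wanted = subreddit.lower()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        post = json.loads(line)
        if str(post.get("subreddit", "")).lower() == wanted:
            yield post


def merge_sources(dump: Iterable[dict], api: Iterable[dict],
                  gap_start: float = GAP_START) -> list:
    """Keep every dump post, add API posts from the gap onward, dedupe by id."""
    merged = {post["id"]: post for post in dump}
    for post in api:
        # Only take from the API what the dump cannot cover.
        if float(post["created_utc"]) >= gap_start:
            merged[post["id"]] = post  # the API copy wins on a collision
    return sorted(merged.values(), key=lambda p: float(p["created_utc"]))
```

This does nothing for subs that are private, since neither source can see new posts there.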
Basically not a fan of the API approach.