• otacon239@lemmy.world · 38 points · 3 days ago

    I’ve said it before and I’ll say it again.

    Self-censorship is the worst kind because you’re not even trying to get the original message out in the first place. You’re doing the advertisers’ work for them when you make your posts algorithm-friendly.

  • chicken@lemmy.dbzer0.com · 5 points · 3 days ago

    Ok so how do you actually evade algorithmic censorship then? I assume the embedding is still based on transcribed text, so word choice should matter on some level, even if it isn’t the be-all and end-all, right? The other metric, the content preferences of people who engage with your content, does seem harder to get around though.

    Anyway here is the link to the paper the video mentions: https://concetticontrastivi.org/wp-content/uploads/2023/01/1369118x.2016.1154086.pdf

    • regdog@lemmy.world · 1 point · 11 hours ago

      Ok so how do you actually evade algorithmic censorship then?

      You don’t. Express yourself however you want to and ignore that shit.

      • chicken@lemmy.dbzer0.com · 1 point · 3 hours ago

        And just accept that the company and your government get to decide which expressions go down the memory hole? Even with totally ineffective cargo-cult methods, at least people are trying.

    • lepinkainen@lemmy.world · 5 points · 3 days ago

      Nope. There are studies with vector databases showing that even the language doesn’t matter: the words start grouping together automatically based on relevance just by the way the math works.

      In theory you could try inventing a fake language so alien that it doesn’t match anything existing, but at that point you might as well just start encrypting your stuff.
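The grouping-by-relevance idea can be sketched with toy vectors. This is not a real embedding model — the 3-D numbers below are made up for illustration — but real multilingual models place translations of the same concept near each other in exactly this cosine-similarity sense:

```python
import math

def cosine(a, b):
    """Cosine similarity: 1.0 means same direction, 0.0 means unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# Pretend these came from a multilingual embedding model (they didn't;
# they're hand-made to show the geometry).
vec = {
    "dog":        (0.90, 0.10, 0.00),  # English
    "perro":      (0.88, 0.12, 0.05),  # Spanish for "dog"
    "carburetor": (0.00, 0.20, 0.95),  # unrelated concept
}

print(cosine(vec["dog"], vec["perro"]))       # high: same concept, different language
print(cosine(vec["dog"], vec["carburetor"]))  # low: unrelated concepts
```

So a filter operating in this space keys on the concept region, not on which surface-level word (or language) you picked.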

      • chicken@lemmy.dbzer0.com · 2 points · 3 days ago

        the words start grouping together automatically based on relevance just by the way the math works

        Sure, but isn’t it still the words that are grouping together? The guy in the OP video seems to be claiming that the specific words he used don’t matter, which doesn’t make sense to me, since these algorithms’ understanding of what is being said is still fairly shallow.

        I would guess that it should be possible to engineer a sentence that communicates a particular message, but is phrased in such a way that it targets a location in vector space that is not associated with that message (until the other parts of their system make that association).
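That engineering idea can be sketched with a crude bag-of-words "embedding" (a count over a tiny made-up vocabulary) standing in for a learned model — the vocabulary, sentences, and topic are all hypothetical:

```python
import math
from collections import Counter

VOCAB = ["strike", "union", "boycott", "weather", "garden", "tea"]

def embed(text):
    """Toy embedding: count of each vocabulary word in the text."""
    counts = Counter(text.lower().split())
    return [counts[w] for w in VOCAB]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a)) or 1.0
    nb = math.sqrt(sum(y * y for y in b)) or 1.0
    return dot / (na * nb)

topic   = embed("union strike boycott")                    # region the filter watches
literal = embed("join the union strike")                   # lands near the topic
coded   = embed("nice weather for gardening, bring tea")   # same intent, coded phrasing

print(cosine(topic, literal))  # close to the watched region
print(cosine(topic, coded))    # far from it, despite the shared intent
```

A real learned embedding is much harder to dodge than this keyword toy, because it also clusters paraphrases and euphemisms — which is exactly the open question in the comment above.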

        • AdrianTheFrog@lemmy.world · 1 point · 2 days ago

          If you can give ChatGPT the transcript and it can say “yes that’s about ____”, then that means it’s certainly possible for them to do the same. I would expect that anything trained specifically for that should only get better from there, although obviously they’re not going to throw ChatGPT-sized compute at it.

          • chicken@lemmy.dbzer0.com · 1 point · 2 days ago

            although obviously they’re not going to throw ChatGPT-sized compute at it.

            I’m not entirely sure whether there are more fundamental distinctions between embeddings and LLMs, but smaller LLMs really struggle with comprehension when things are phrased in an unexpected way, and embeddings use comparatively very few resources. Maybe a circumvention-training tool could work like this: a writing game where the goal is to produce text about a topic such that the embedding fails to associate it with that topic, but a more powerful LLM succeeds (the idea being that a human would probably be able to tell as well). The biggest advantage these systems have is probably just that people never get direct feedback about how their work is being interpreted.
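A minimal sketch of that game's win condition, with both models replaced by stand-ins (a keyword match for the weak embedding filter, a synonym-aware lookup for the stronger LLM judge — the keywords are invented for illustration):

```python
TOPIC_KEYWORDS = {"strike", "union", "boycott"}          # all the weak filter knows
JUDGE_SYNONYMS = TOPIC_KEYWORDS | {"walkout", "picket"}  # the judge knows more

def weak_filter_flags(text):
    """Stand-in for the embedding-based filter: exact keyword match."""
    return any(w in TOPIC_KEYWORDS for w in text.lower().split())

def judge_flags(text):
    """Stand-in for the stronger LLM judge: also knows synonyms."""
    return any(w in JUDGE_SYNONYMS for w in text.lower().split())

def player_wins(text):
    # Win condition: the message still reads as on-topic to the stronger
    # judge, but slips past the weak filter.
    return judge_flags(text) and not weak_filter_flags(text)

print(player_wins("join the union strike"))          # False: weak filter catches it
print(player_wins("walkout at noon, picket after"))  # True: only the judge sees it
```

The game would give writers exactly the feedback loop the comment says real platforms withhold: an immediate signal of how each phrasing is being classified.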

  • Soupbreaker@lemmy.world · +4 / −1 · 3 days ago

    It’d be great if people just stopped submitting to these algorithms. Every time I hear someone on a podcast talking about “their” algorithm, as if it were some benign, cutesy thing, I want to puke. Just quit the corporate bullshit and come absorb depressing content on Lemmy like the rest of us, jeez.