• 4 Posts
  • 218 Comments
Joined 2 years ago
Cake day: August 29th, 2023


  • So this blog post is framed positively towards LLMs and is too generous in accepting many of the claims around them, but even so, its end conclusions are pretty harsh on practical LLM agents: https://utkarshkanwat.com/writing/betting-against-agents/

    Basically, the author has tried extensively, in multiple projects, to make LLM agents work in various useful ways, but in practice:

    The dirty secret of every production agent system is that the AI is doing maybe 30% of the work. The other 70% is tool engineering: designing feedback interfaces, managing context efficiently, handling partial failures, and building recovery mechanisms that the AI can actually understand and use.

    The author strips down and simplifies and sanitizes everything going into the LLMs and then implements both automated checks and human confirmation on everything they put out. At that point it makes you question what value you are even getting out of the LLM. (The real answer, which the author only indirectly acknowledges, is attracting idiotic VC funding and upper management approval).

    Even as critical as they are, the author doesn’t acknowledge a lot of the bigger problems. The API cost is a major expense and design constraint on the LLM agents they have made, but the author doesn’t acknowledge that prices are likely to rise dramatically once VC subsidization runs out.
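
    The structure described above (sanitize what goes in, then gate what comes out behind an automated check and a human confirmation) can be sketched roughly as follows. This is a minimal illustration, not the blog author's code: `sanitize`, `run_guarded_step`, `StepResult`, and the callables passed in are all hypothetical names, and the model is a stand-in function rather than a real API call.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class StepResult:
    output: Optional[str]
    status: str  # "accepted", "rejected_by_check", or "rejected_by_human"


def sanitize(raw: str, max_chars: int = 2000) -> str:
    """Hypothetical input cleanup: collapse whitespace and cap the length."""
    collapsed = " ".join(raw.split())
    return collapsed[:max_chars]


def run_guarded_step(
    raw_input: str,
    model: Callable[[str], str],     # stand-in for the LLM call (assumption)
    check: Callable[[str], bool],    # automated validation of the output
    confirm: Callable[[str], bool],  # human sign-off on the output
) -> StepResult:
    """Run one agent step; the model output only survives if both gates pass."""
    prompt = sanitize(raw_input)
    candidate = model(prompt)
    if not check(candidate):
        return StepResult(None, "rejected_by_check")
    if not confirm(candidate):
        return StepResult(None, "rejected_by_human")
    return StepResult(candidate, "accepted")


if __name__ == "__main__":
    # A fake "model" that just upper-cases its prompt, for demonstration only.
    fake_model = lambda prompt: prompt.upper()
    result = run_guarded_step(
        "  summarize   this  ",
        fake_model,
        check=lambda out: len(out) < 100,
        confirm=lambda out: True,
    )
    print(result.status, result.output)
```

    Note how little of the sketch is the model call itself: one line, with everything else being the scaffolding around it.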


  • Is this “narrative” in the room with us right now?

    I actually recall someone pro-LLM recently trying to push that sort of narrative (that it’s only already mentally ill people being pushed over the edge by ChatGPT)…

    Where did I see it… oh yes, lesswrong! https://www.lesswrong.com/posts/f86hgR5ShiEj4beyZ/on-chatgpt-psychosis-and-llm-sycophancy

    This has all the hallmarks of a moral panic. ChatGPT has 122 million daily active users according to Demand Sage, that is something like a third the population of the United States. At that scale it’s pretty much inevitable that you’re going to get some real loonies on the platform. In fact at that scale it’s pretty much inevitable you’re going to get people whose first psychotic break lines up with when they started using ChatGPT. But even just stylistically it’s fairly obvious that journalists love this narrative. There’s nothing Western readers love more than a spooky story about technology gone awry or corrupting people, it reliably rakes in the clicks.

    The narrative is coming from inside the forum. Actually, this is even more of a deflection: they aren’t even trying to claim these people were already on the edge, just that the number of delusional people is at the base rate (with no actual stats on rates of psychotic breaks, because on lesswrong vibes are good enough).


  • Some of the comments are, uh, really telling:

    The main effects of the sort of “AI Safety/Alignment” movement Eliezer was crucial in popularizing have been OpenAI, which Eliezer says was catastrophic, and funding for “AI Safety/Alignment” professionals, whom Eliezer believes to predominantly be dishonest grifters. This doesn’t seem at all like what he or his sincere supporters thought they were trying to do.

    The irony is completely lost on them.

    I wasn’t sure what you meant here, where two guesses are “the models/appeal in Death with Dignity are basically accurate, but, should prompt a deeper 'what went wrong with LW or MIRI’s collective past thinking and decisionmaking?, '” and “the models/appeals in Death with Dignity are suspicious or wrong, and we should be halt-melting-catching-fire about the fact that Eliezer is saying them?”

    The OP replies that they meant the former… the latter is a better answer. Death with Dignity is kind of a big reveal of a lot of flaws with Eliezer and MIRI. To recap, Eliezer basically concluded that since he couldn’t solve AI alignment, no one could, and everyone is going to die. It is like a microcosm of Eliezer’s ego and approach to problem solving.

    “Trigger the audience into figuring out what went wrong with MIRI’s collective past thinking and decision-making” would be a strange purpose from a post written by the founder of MIRI, its key decision-maker, and a long-time proponent of secrecy in how the organization should relate to outsiders (or even how members inside the organization should relate to other members of MIRI).

    Yeah, no shit secrecy is bad for scientific inquiry and open and honest reflections on failings.

    …You know, if I actually believed in the whole AGI doom scenario (and bought into Eliezer’s self-hype) I would be even more pissed at him and sneer even harder at him. He basically set himself up as a critical savior to mankind, one of the only people clear-sighted enough to see the real dangers and the most important questions… and then he totally failed to deliver. Not only that, he created the very hype that would trigger the creation of the unaligned AGI he promised to prevent!


  • Here’s a LW site dev whining about the study; he was in it, and I think he thinks it was unfair to AI

    There’s a complete lack of introspection. You would think a study showing that people’s subjective estimates of their productivity with LLMs were the exact opposite of right would inspire him to question his own subjectively felt intuitions and experience. Instead he doubles down and insists the study must be wrong, and that surely with the latest model and the best use of it there would be a big improvement.


  • I think we mocked this one back when it came out on /r/sneerclub, but I can’t find the thread. In general, I recall Yudkowsky went on a mini podcast tour a few years back. I think the general trend was that he didn’t interview that well, even by lesswrong’s own standards. He tended to assume so much background familiarity with his writing that anyone not already familiar with it would be lost, while at the same time failing to add anything actually new for anyone who was already familiar with it. There were also lots of circular arguments and repetitious discussion with the hosts. I guess that’s the downside of hanging around within your own echo chamber blog for decades instead of engaging with wider academia.


  • For purposes of something easily definable and legally valid that makes sense, but it is still so worthy of mockery and sneering. Also, even if they needed a benchmark like that for their bizarre legal arrangements, there was no reason besides marketing hype to call that threshold “AGI”.

    In general the definitional games around AGI are so transparent and stupid, yet people still fall for them. AGI means performing at least at human level across all cognitive tasks. Not across all benchmarks of cognitive tasks, but across the tasks themselves. Not superhuman in some narrow domains and blatantly stupid in most others. To be fair, the definition might not be that useful, but it’s not really in question.