• 6 Posts
  • 498 Comments
Joined 3 years ago
Cake day: August 29th, 2023

  • Not when what they want contradicts the basic limits of reality and logistics!

    Ed Zitron has done a breakdown on building normal sized data centers vs. the current target size of AI data centers, and on the bigger end normal data centers are 10s of MWs up to 100s of MWs. After 2 years Stargate Abilene has only turned on its first 200-300 MW. So I think even if regulators roll over on using twice the power of the entire state, this project would take 2-3 years just to turn on the first few hundred megawatts and then stall out.


  • Every author named as writing a paper bears full responsibility for the paper.

    This has the nice added bonus that it will likely catch PIs that put their name on their grad students’ papers without actually doing the mentoring they were supposed to. It will also catch professors that coast (or at least inflate their citation index) by getting their name on papers they barely contributed to.

    I am quite convinced that, under these arXiv guidelines, every single major PI in the field will be banned within a few years.

    Catching a lot of PIs that have allowed and even encouraged slop submission is a good thing in my book.



  • he just posted an entirely unnecessary amount of words

    taking a quick look at it… it’s actually short by Scott’s standards, but still overly long, given that the only point he makes is claiming Lindy’s Law is applicable to predicting AI progress in absence of other information. Edit: glancing at it again… it’s not that short, I kinda skimmed until I got to Scott’s actual point my first time glancing at it. You can’t blame me for not reading it.

    you-can’t-really-knows

    Yeah, he straw-mans AI critics/skeptics as trying to make an argument from ignorance, then tries to argue against that strawman using Lindy’s Law (which assumes ignorance and a Pareto distribution). He completely ignores that AI critics are actually making detailed arguments about LLM companies consuming all the good and novel training data, hitting the limits on what compute costs they can afford, running into problems of the long lead time for building datacenters, etc. Which is pretty ironic given his AI 2027 makes a nominal claim to accounting for all that stuff (in actuality it basically all rests on METR’s task horizons, and distorts even that already questionable dataset).




  • after scrolling through a couple days and accounts on the extended db0verse

    I too was drawn with a morbid fascination to see what they were saying… db0 tried labeling all of awful.systems as dgerard’s collection of sockpuppets acting as his PR machine with maybe a few other humans. …I feel like I’m not living up to the hype, I’m not generating nearly enough PR for dgerard! Also, labeling us as a bunch of trolls. Which yes, one of our communities is literally called sneerclub, but since we don’t actually go onto lesswrong (or other rat communities) to stir up shit (we can find plenty pre-stirred) I don’t think trolls actually fits as a label.





  • libertarian socialist ideals

    wtf are these ideals supposed to be?

    If it’s “libertarian, but doesn’t want people dying in the streets or dying of preventable diseases”, I will give them the tiniest modicum of credit for being better than standard libertarians, but being libertarian at all still leaves them deep in the negative on credibility in my books.

    pro-non-corporate GenAI technology

    By which I assume they mean corporate models that got released as open weight models (and at-best/at-most got a touch of fine-tuning from a community effort), but still ultimately originated from mass plagiarism and are still not useful beyond generating slop…




  • Even Scott’s fantasy dream scenario for what prediction markets could be like and what questions they could answer feels… …deliberately naive? …like libertarian brainrot? …disconnected from reality?

    Ask yourself: what are the big future-prediction questions that important disagreements pivot around? When I try this exercise, I get things like:

    Will the AI bubble pop? Will scaling get us all the way to AGI? Will AI be misaligned?

    Huge amounts of money are being dumped into a bubble based on hype, so to hope a prediction market would or could make better predictions than the existing business-idiot VCs funding this bubble feels hopelessly naive in a libertarian kind of way. There is already a method of aggregating the wisdom of the crowd and it is losing to incredibly lazy hype and PR.

    Will Trump turn America into a dictatorship? Make it great again? Somewhere in between?

    Again, there is already a mechanism for aggregating wisdom of the crowds, it’s called an election, and it’s also failed to get an answer predicated on reality or truth, so again, it seems incredibly naive to expect prediction markets to do better!

    Will YIMBY policies lower rents? How much?

    I mean, the councils and communities making these decisions already ignore or overlook longer-term broader predictions of economic impact in favor of immediate home-owner value, I don’t see why Scott would expect prediction markets to help decision making go better here.

    Overall, it feels like Scott is overlooking the way decision making often already ignores science and experts. Society doesn’t have a problem making decent predictions compared to the problems it has communicating expert opinions to the public effectively and crafting policy aligned with the public interest.






    You’ve described the problem with generalization, yes. Well, you could maybe sort of train it not to generate “all men are cats”, but then that might also prevent it from making the more correct generalization “all cats are mortal” or even completely valid generalizations like combining “all men are mortal” and “Socrates is a man” to get “Socrates is mortal”.

    The problem with monofacts is a bit more subtle. Let’s say the fact that “John Smith was born in Seattle in 1982, earned his PhD from Stanford in 2008, and now leads AI research at Tech Corp,” appears only once in the training data set. Some of the other words the model will have seen multiple times and be able to generate tokens in the right way for. Like Seattle as a location in the US, Stanford as a college, 2008 as a date, etc. But the combination describing a fact about John Smith appearing uniquely trains the model to try to generate facts that are unique combinations of data. So the model might try to make up a fact like “Jane Doe was born in Omaha in 1984, earned her master’s from Caltech in 2006, and is now CEO of Tech Corp” because it fits the pattern of a unique fact that was in its training data set.
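
To make “appears only once” a bit more concrete, here is a rough sketch of counting monofacts in a toy corpus. Everything in it is made up for illustration: the `monofact_rate` name, the tuple representation of a fact, and the tiny corpus are all assumptions, and real training data is nowhere near this tidy. The ratio it computes (facts seen exactly once, divided by total fact occurrences) is the same singleton fraction the Good-Turing estimator uses for unseen mass.

```python
from collections import Counter


def monofact_rate(facts):
    """Share of fact occurrences that are singletons (seen exactly once).

    `facts` is a list of hashable items, one entry per occurrence in the
    (hypothetical) training set. Returns 0.0 for an empty list.
    """
    if not facts:
        return 0.0
    counts = Counter(facts)
    singletons = sum(1 for c in counts.values() if c == 1)
    return singletons / len(facts)


# Toy corpus: common facts repeat, the John Smith fact shows up once.
corpus = (
    [("Seattle", "is a city in", "the US")] * 3
    + [("Stanford", "is a", "university")] * 2
    + [("John Smith", "was born in", "Seattle in 1982")]
)

rate = monofact_rate(corpus)  # 1 singleton out of 6 occurrences
```

In this toy case only the John Smith fact is a monofact. The more of the corpus looks like that, the more the model is being shown “unique combination of person, place, and date” as a pattern worth producing.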