☆ Yσɠƚԋσʂ ☆

  • 16.4K Posts
  • 12.1K Comments
Joined 6 years ago
Cake day: March 30th, 2020

  • That’s what I’m thinking too. There’s no reason why you couldn’t make a chip like this for a full-blown DeepSeek model, and then just print new chips when new models come out. The really nice part is that their approach doesn’t need DRAM either, because the state of each transistor acts as memory; it just needs a bit of SRAM, which we don’t have a shortage of.

    I’m fully convinced that the whole AI-as-a-service business model is going to be very short-lived. Ultimately, nobody really likes sending their data out to some company, or having to pay subscription fees to use the models. If we start getting these kinds of specialized chips, they’re going to be a game changer.


  • Right, languages can provide a lot of guard rails, and Go is a pretty good candidate: it’s a fairly simple language, and its type system keeps the code on track. I’ve played a bit with LLMs writing it, and the results seem pretty decent overall. But then there’s the whole architecture layer on top of that, and that seems to be an area that’s largely unexplored right now.

    I think the key is focusing on the contract. The human has to be able to tell that the code is doing what’s intended, and the agent needs clear requirements and a fixed context to work in. Breaking the program up into small, isolated steps seems like a good way to get both of these things. You can review the overall logic of the application by examining the graph visually, and then check the logic of each step independently, without needing much context for what’s happening around it.

    I’ve actually been playing with the idea a bit. Here’s an example of what this looks like in practice. The graph is just a data structure showing how different steps connect to each other:
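    As a rough, hypothetical sketch of the idea (not the actual code from the post; the `Step`, `Graph`, and `Chain` names are my own), such a graph could be plain Go data, with each step naming the steps it feeds into:

```go
package main

import (
	"fmt"
	"strings"
)

// Step is one unit of work; Next lists the names of downstream steps.
type Step struct {
	Name string
	Next []string
}

// Graph is just data: a map from step name to step.
type Graph struct {
	Steps map[string]Step
}

// Chain follows the first Next edge from start and returns the
// ordered step names — a simple linear walk for illustration.
func Chain(g Graph, start string) []string {
	var order []string
	for name := start; name != ""; {
		step := g.Steps[name]
		order = append(order, step.Name)
		if len(step.Next) > 0 {
			name = step.Next[0]
		} else {
			name = ""
		}
	}
	return order
}

func exampleGraph() Graph {
	return Graph{Steps: map[string]Step{
		"fetch":    {Name: "fetch", Next: []string{"parse"}},
		"parse":    {Name: "parse", Next: []string{"validate"}},
		"validate": {Name: "validate", Next: []string{"store"}},
		"store":    {Name: "store"},
	}}
}

func main() {
	fmt.Println(strings.Join(Chain(exampleGraph(), "fetch"), " -> "))
}
```

    Because the graph is just data, it can be rendered visually or walked programmatically without touching any node’s implementation.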

    and each node is a small bit of code with a spec around its input/output that the LLM has to follow:
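    A hypothetical Go example of such a node (the `Parse` step and its spec are my own illustration, not the original code): the input/output structs are the contract, and the function body is the small piece the LLM has to get right.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// ParseInput and ParseOutput define the contract for this node.
type ParseInput struct {
	Raw string
}

type ParseOutput struct {
	Fields []string
}

// Parse implements the spec: split non-empty comma-separated
// input into trimmed fields; empty input is an error.
func Parse(in ParseInput) (ParseOutput, error) {
	if strings.TrimSpace(in.Raw) == "" {
		return ParseOutput{}, errors.New("empty input")
	}
	var fields []string
	for _, f := range strings.Split(in.Raw, ",") {
		fields = append(fields, strings.TrimSpace(f))
	}
	return ParseOutput{Fields: fields}, nil
}

func main() {
	out, err := Parse(ParseInput{Raw: "a, b, c"})
	fmt.Println(out.Fields, err)
}
```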

    It’s been a fun experiment to play with so far.




  • And another aspect is that, at least in the realm of coding, we’re trying to get these models to write code the way humans do. I’d argue that’s not really an optimal approach, because models have different strengths. Their biggest limitation is that they struggle with large contexts, but given a small, focused task, even small models handle it well. So, we could move to structuring programs out of small isolated components that can be reasoned about independently. There are already tools, like workflow engines, that do this sort of thing; they just never caught on with human coders because they require more ceremony. But I think that viewing a program as a state graph would be a really nice way for humans to tell whether the semantics are correct, and then the LLM could implement each node in the graph as a small isolated task that can be verified fairly easily.
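    To make that last point concrete, here’s a toy Go sketch (my own hypothetical example, not from the discussion) of verifying one node against its spec in isolation, with no knowledge of the surrounding program:

```go
package main

import "fmt"

// Double stands in for one LLM-written node; its spec is simply
// "output is twice the input".
func Double(n int) int { return n * 2 }

// CheckDouble verifies the node against a table of expected
// input/output pairs — no wider program context is needed.
func CheckDouble(cases map[int]int) bool {
	for in, want := range cases {
		if Double(in) != want {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(CheckDouble(map[int]int{0: 0, 2: 4, -3: -6}))
}
```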



  • Incidentally, I run local models on a MacBook Pro. I find a 32B-parameter model can do a lot of useful stuff. Progress on making the models smaller and faster has been very rapid, and I fully expect that within a few years we’ll be able to run the equivalent of current frontier models on a local machine. On top of that, we’re seeing things like ASIC chips being developed that implement the model in hardware. These could become similar to GPUs: cards you just plug into your computer.

    The tech industry has gone through many mainframe-to-personal-computer cycles over the years. When new tech appears, it initially requires a huge amount of computing power to run. But over time people figure out how to optimize it, the hardware matures, and it becomes possible to run this stuff locally. I don’t see why this tech should be any different.