ByteDance has officially launched its latest Doubao large model, 1.5 Pro (Doubao-1.5-pro), which demonstrates strong comprehensive capabilities across a range of fields, surpassing well-known industry models such as GPT-4o and Claude 3.5 Sonnet. The release of this model marks an important step forward for ByteDance in the field of artificial intelligence. Doubao 1.5 Pro adopts a sparse MoE (Mixture of Experts) architecture, activating only a small subset of its parameters during pre-training. This design's innovation...
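For anyone unfamiliar with the idea: in a sparse MoE layer, a learned router sends each token to only a few "expert" sub-networks, so just a fraction of the total parameters is active on any forward pass. Here's a minimal toy sketch in PyTorch of top-k routing in general; the layer sizes, expert count, and k value are invented for illustration, and this is not a description of Doubao's actual implementation:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoE(nn.Module):
    """Toy sparse MoE layer: route each token to its top-k experts."""

    def __init__(self, dim=64, num_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.gate = nn.Linear(dim, num_experts)  # the router
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(),
                          nn.Linear(4 * dim, dim))
            for _ in range(num_experts)
        )

    def forward(self, x):                 # x: (tokens, dim)
        scores = self.gate(x)              # (tokens, num_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        # Only the k selected experts run per token, so most of the
        # model's parameters stay inactive on any given forward pass.
        for k in range(self.top_k):
            for e in range(len(self.experts)):
                mask = idx[:, k] == e
                if mask.any():
                    out[mask] += weights[mask, k:k + 1] * self.experts[e](x[mask])
        return out

x = torch.randn(10, 64)
print(SparseMoE()(x).shape)  # torch.Size([10, 64])
```

The point of the design is that total parameter count (and thus capacity) can grow with the number of experts while per-token compute stays roughly constant, since only k experts fire per token.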
I think power usage is still a legitimate concern, but on top of that the unreliability is a huge factor. LLMs hallucinate all the time, by design, so if you're using them for anything where correctness matters, you're bound for failure. There will always be hallucination unless we overfit the model, but an overfit model with no hallucination just reproduces its training data, so it has no more functionality than a search engine with vastly higher energy requirements. These things have applications, but really only for approximating stuff where no other approach could do it well. IMO any other use case is a mistake.
People are actively working on different approaches to address reliability. One that I like in particular is the neurosymbolic type of model, where deep neural networks are used to classify data and find patterns, and a symbolic logic engine is used to actually reason about them. This basically gives you the best of both worlds. https://arxiv.org/abs/2305.00813
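To make the split concrete, here's a tiny self-contained sketch of that division of labor: a neural model handles perception (stubbed out below with canned outputs, since a real one would be a trained network) and a symbolic forward-chaining engine does the reasoning. The predicates and rules are invented for illustration; see the linked paper for actual systems:

```python
def neural_perception(image_id):
    # Stand-in for a neural classifier: a real system would run a
    # trained network that maps raw input to symbolic facts.
    fake_outputs = {
        "img1": [("is_bird", "tweety"), ("has_wings", "tweety")],
    }
    return fake_outputs[image_id]

RULES = [
    # (premises, conclusion): if all premises hold for X, conclude the head.
    ([("is_bird", "X")], ("can_fly", "X")),
    ([("can_fly", "X"), ("has_wings", "X")], ("is_flying_animal", "X")),
]

def forward_chain(facts):
    """Apply rules repeatedly until no new facts are derived (a fixpoint)."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, (head_pred, _) in RULES:
            # Find entities that satisfy every premise predicate.
            candidates = None
            for pred, _ in premises:
                ents = {e for p, e in facts if p == pred}
                candidates = ents if candidates is None else candidates & ents
            for e in candidates or set():
                if (head_pred, e) not in facts:
                    facts.add((head_pred, e))
                    changed = True
    return facts

facts = forward_chain(neural_perception("img1"))
print(facts)  # includes ('can_fly', 'tweety') and ('is_flying_animal', 'tweety')
```

The appeal is that the reasoning step is deterministic and auditable: given the same facts, the logic engine always derives the same conclusions, so the "hallucination surface" is confined to the perception layer.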