ByteDance has officially launched its latest Doubao large model, 1.5 Pro (Doubao-1.5-pro), which demonstrates outstanding comprehensive capabilities across a range of fields and surpasses the well-known GPT-4o and Claude 3.5 Sonnet. The release marks an important step forward for ByteDance in artificial intelligence. Doubao 1.5 Pro adopts a novel sparse MoE (Mixture of Experts) architecture, using a smaller set of activated parameters during pre-training. This design's innovation...
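For context on what a sparse MoE layer actually does, here is a minimal sketch of top-k expert routing. All sizes and names are hypothetical; Doubao-1.5-pro's real configuration has not been published.

```python
# Toy sparse MoE layer: route each token to the top-k of E experts,
# so only a fraction of the total parameters are active per token.
# Hypothetical sizes; not Doubao's actual architecture.
import numpy as np

rng = np.random.default_rng(0)
d_model, n_experts, top_k = 64, 8, 2

# Each "expert" here is just a dense weight matrix.
experts = [rng.normal(0, 0.02, (d_model, d_model)) for _ in range(n_experts)]
gate_w = rng.normal(0, 0.02, (d_model, n_experts))

def moe_layer(x):                      # x: (d_model,) one token
    logits = x @ gate_w                # gating score per expert
    top = np.argsort(logits)[-top_k:]  # indices of the top-k experts
    weights = np.exp(logits[top])
    weights /= weights.sum()           # softmax over the selected experts
    # Only top_k of n_experts matrices are ever multiplied: this is the
    # "smaller set of activated parameters" the article refers to.
    return sum(w * (x @ experts[i]) for i, w in zip(top, weights))

out = moe_layer(rng.normal(size=d_model))
print(out.shape)  # (64,)
```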
I think this kind of statement needs to be more elaborate to have proper discussions about it.
LLMs can really be summarized as “squeezing the entire internet into a black box that can be queried at will”. They have many use cases but even more potential for misuse.
All forms of AI (artificial intelligence in the literal sense) as we know it (i.e., not artificial general intelligence, or AGI) are just statistical models: they do not have the capacity to think, have no ability to reason, and cannot critically evaluate or verify a piece of information, which can equally come from a legitimate source or some random Reddit post (the infamous case of Google’s AI telling you to put glue on your pizza can be traced back to a Reddit joke post).
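To make the “just statistical models” point concrete, here is a toy bigram model. It learns nothing but co-occurrence counts, so a joke and a fact in the training text end up as equally probable continuations. The corpus is made up for illustration.

```python
# A bigram "language model": pure co-occurrence statistics, with no
# notion of whether the training text is true. The toy corpus mixes a
# real statement with the Reddit joke; both continuations score 0.5.
from collections import Counter, defaultdict

corpus = ("cheese sticks to pizza with glue . "
          "cheese sticks to pizza with sauce . ").split()

counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def p_next(prev):
    total = sum(counts[prev].values())
    return {w: c / total for w, c in counts[prev].items()}

print(p_next("with"))  # {'glue': 0.5, 'sauce': 0.5} -- equally "true"
```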
These LLMs are built by training on datasets scraped from the entire internet, using a transformer architecture with very good memory retention and, more recently, reinforcement learning from human feedback to reduce their tendency to produce incorrect output (i.e., hallucinations). Even then, these datasets require extensive tweaking and curation; OpenAI famously employed Kenyan workers at less than $2 per hour to do the tedious annotation work used for training.
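For reference, the “memory retention” of a transformer comes from self-attention, where every token can weigh every other token in the context window. A minimal sketch (illustrative only, not any production model's code):

```python
# Scaled dot-product self-attention: each output row is a weighted
# mix of all value rows, so information anywhere in the context can
# influence any position.
import numpy as np

def attention(Q, K, V):
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                 # pairwise similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V                              # weighted mix of values

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 16))     # 5 tokens, 16-dim embeddings
print(attention(x, x, x).shape)  # (5, 16)
```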
Are they useful if you just need to pull up a piece of information that is not critical in the real world? Yes. Are they useful if you don’t want to do your homework and just want the algorithm to solve everything for you? Yes (there is, of course, an entire discussion to be had about future engineers/doctors who are “trained” by relying on these AI models and then go on to do real things in the real world without developing the capacity to think and evaluate for themselves). Would you ever trust one if your life depended on it (e.g., building a car, a plane or a house, or treating an illness)? Hell no.
A simple test case: ask yourself whether you would ever trust an AI model over a trained physician to treat your illness. A human physician has access to real-world experience that an AI will never have (no matter how much medical literature it devours on the internet), has the capacity to think and reason, and thus has the ability to respond to anomalies that have never been seen before.
An AI model needs thousands of images to learn the difference between a cat and a dog; a human child can learn it from just a few examples. Without a huge input dataset (annotated with the help of an army of underpaid Kenyan workers), the accuracy is simply crap. The fundamental process of learning is very different between the two, and until we make real advances toward AGI (which is about as far as you can get from the current iterations of AI), we’ll always have to deal with the potential misuses of AI in our lives.
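To illustrate the data-hunger point, here is a rough sketch using scikit-learn's small digits dataset: the accuracy of a simple classifier as the training set grows. This is a toy stand-in; modern vision models need vastly more data than this.

```python
# Accuracy of a plain logistic-regression classifier as a function of
# training-set size. Tiny training sets give poor accuracy; humans do
# not need this many examples to separate two visual categories.
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)

for n in (10, 50, 200, len(X_tr)):
    clf = LogisticRegression(max_iter=2000).fit(X_tr[:n], y_tr[:n])
    print(n, round(clf.score(X_te, y_te), 3))
```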
“are just statistical models that do not have the capacity to think, have no ability to reason and cannot critically evaluate or verify a certain piece of information, which can equally come from a legitimate source or some random Reddit post”
I really hate how techbros have convinced people that it’s something magical. But all they’ve done is convince themselves and everyone else that every tool is a hammer.