Apple engineers have shared new details on a collaboration with NVIDIA to speed up text generation with large language models.
Because of the diminishing returns in using larger and larger models, I’m hopeful that this could lead to more efficient LLM implementations that aren’t so harmful to the environment. Just maybe.
“We made Siri faster but each request now drains a lake.”
“Hey, Siri, give me a real-time count of all existing lakes and update it every second.”
Siri: “Let me search it on the web.”