Bend the Rules with Blended Models 🌪️
So essentially,
A Blended ensemble of chat models totaling roughly 25B parameters can outperform the 175B-parameter ChatGPT.
Paper: Blending Is All You Need: Cheaper, Better Alternative to Trillion-Parameters LLM (9 pages)
Researchers from the University of Cambridge, University College London, and Chai Research are interested in improving the quality of LLM responses. Their research tests the hypothesis that smaller chat AI models, when combined, can deliver comparable or even superior performance to their larger counterparts, without the computational burden.
Hmm... What’s the background?
Previous research on ensembling approaches for combining models has struggled to carry over to generative language tasks: outputs are produced sequentially, and increasingly common large language models (LLMs) offer only limited black-box access. However, Minimum Bayes Risk (MBR) decoding has been applied effectively in Automatic Speech Recognition (ASR) and Natural Language Processing (NLP) tasks, and it enables the selection of the "best" system output from a pool of candidates.
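To make MBR concrete: each system proposes an output, and the output that agrees most with the others is selected. Here is a minimal sketch, using difflib's SequenceMatcher as a stand-in for a task-specific utility such as BLEU (the candidate strings are invented for illustration):

```python
from difflib import SequenceMatcher

def mbr_select(candidates):
    """Return the candidate with the highest average agreement with
    the other candidates, i.e. the lowest expected (Bayes) risk."""
    def utility(a, b):
        # Stand-in similarity; swap in BLEU or another task metric.
        return SequenceMatcher(None, a, b).ratio()

    best, best_score = None, float("-inf")
    for i, cand in enumerate(candidates):
        # Expected utility of this candidate against all the others.
        score = sum(utility(cand, other)
                    for j, other in enumerate(candidates) if j != i)
        if score > best_score:
            best, best_score = cand, score
    return best

# Three hypothetical system outputs for the same prompt:
outputs = [
    "The cat sat on the mat.",
    "A cat sat on the mat.",
    "Dogs enjoy long walks.",
]
print(mbr_select(outputs))  # picks one of the two mutually similar outputs
```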
Ok, so what is proposed in the research paper?
The main proposal of the research is Blended, a straightforward yet effective method of integrating multiple chat AIs. Blended randomly selects which chat AI generates each response in a conversation, enabling a group of smaller models to collaboratively achieve superior performance compared to a single large model (a minimal sketch follows the findings below).
Blended, with three 6-13B parameter LLMs, outperforms OpenAI's 175B+ parameter ChatGPT in terms of user retention and engagement.
Blended offers significant performance gains in engagement and user retention while maintaining inference speeds similar to small chat AIs.
By blending multiple smaller open-source systems, it is possible to drastically improve user conversational experiences without increasing inference costs.
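For intuition, here is a minimal sketch of that turn-level random selection; the `chat_ais` callables are hypothetical wrappers around whatever 6-13B chat models are being served, not an API from the paper:

```python
import random

def blended_reply(history, chat_ais):
    """One Blended turn: pick a component chat AI uniformly at random
    and let it respond conditioned on the *full* conversation history,
    including turns produced by the other models."""
    model = random.choice(chat_ais)  # uniform selection over components
    return model(history)

def chat(chat_ais, user_turns):
    """Run a conversation through the blended ensemble."""
    history = []
    for user_msg in user_turns:
        history.append({"role": "user", "content": user_msg})
        reply = blended_reply(history, chat_ais)
        history.append({"role": "assistant", "content": reply})
    return history

# Toy stand-ins for real chat models (purely illustrative):
ais = [lambda h, name=n: f"[{name}] reply to: {h[-1]['content']}"
       for n in ("model-a", "model-b", "model-c")]
for turn in chat(ais, ["Hi there!", "Tell me a joke."]):
    print(turn["role"], "->", turn["content"])
```

Because every model conditions on the whole history, the components implicitly influence one another, which is why the blend can behave like a single, more capable chat AI while each individual reply costs only one small-model inference.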
And what’s next?
The next steps are to expand the selection set of component chat AIs and to explore how a larger selection set affects overall conversation quality. Future work will also explore methodologies for designing and training a classifier that yields a better-than-uniform distribution over the component chat AIs.
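One plausible shape for that classifier, sketched under the assumption that it maps the conversation history to selection probabilities (the `router` below is hypothetical, not something described in the paper):

```python
import random

def routed_reply(history, chat_ais, router):
    """Like Blended's turn-level selection, but with a learned,
    possibly context-dependent distribution instead of a uniform one.
    `router` is a hypothetical classifier mapping the history to one
    selection probability per component chat AI (summing to 1)."""
    probs = router(history)
    model = random.choices(chat_ais, weights=probs, k=1)[0]
    return model(history)

# A placeholder router that reduces to vanilla Blended:
def uniform_router(history):
    return [1.0 / 3.0] * 3
```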
So essentially,
A blend of small chat models totaling roughly 25B parameters beats a 175B-parameter GPT.