DBRX: A New State-of-the-Art Open LLM for Generative AI

Dreamypujara
3 min read · Apr 18, 2024


Source: https://www.databricks.com/blog/introducing-dbrx-new-state-art-open-llm

Introduction

In this article, we will introduce DBRX, a new state-of-the-art open LLM created by Databricks. We will explore the features and capabilities of DBRX, comparing it to established models such as GPT-3.5 and Gemini 1.0 Pro. We will also delve into the technical aspects of DBRX, including its fine-grained mixture-of-experts (MoE) architecture, which allows for marked improvements in training and inference performance.

DBRX: A New State-of-the-Art Open LLM

DBRX is an open, general-purpose LLM created by Databricks that sets a new state-of-the-art for established open LLMs across a range of standard benchmarks. It provides the open community and enterprises building their own LLMs with capabilities that were previously limited to closed model APIs. According to Databricks’ measurements, DBRX surpasses GPT-3.5 and is competitive with Gemini 1.0 Pro. It is an especially capable code model, surpassing specialized models like CodeLLaMA-70B on programming, in addition to its strength as a general-purpose LLM.

DBRX’s state-of-the-art quality comes with marked improvements in training and inference performance. DBRX advances the state of the art in efficiency among open models thanks to its fine-grained mixture-of-experts (MoE) architecture. Inference is up to 2x faster than LLaMA2-70B, and DBRX is about 40% of the size of Grok-1 in terms of both total and active parameter counts.
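To make the efficiency claim concrete, here is a minimal sketch of top-k expert routing in PyTorch. It is an illustrative toy, not DBRX’s actual implementation: the class name, hidden sizes, and routing loop are invented for clarity, though the 16-experts/4-active configuration matches what Databricks reports for DBRX. The point is that only the selected experts run for each token, so active parameters (and inference cost) are a fraction of the total.

```python
# Toy fine-grained MoE layer (illustrative only; not DBRX's implementation).
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyFineGrainedMoE(nn.Module):
    def __init__(self, d_model=512, d_ff=1024, n_experts=16, top_k=4):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts)  # scores every expert per token
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):  # x: (num_tokens, d_model)
        scores = self.router(x)                           # (num_tokens, n_experts)
        weights, picks = scores.topk(self.top_k, dim=-1)  # k experts per token
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        # Only the k chosen experts execute for each token, so per-token
        # compute scales with active parameters, not total parameters.
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = picks[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(1) * expert(x[mask])
        return out
```

“Fine-grained” here means many small experts with several active per token (DBRX uses 16 experts and activates 4), rather than a few large ones (Mixtral and Grok-1 use 8 and activate 2), which gives the router far more possible combinations of experts.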

DBRX in Practice

DBRX has been evaluated on two composite benchmarks: the Hugging Face Open LLM Leaderboard and the Databricks Model Gauntlet. On the Hugging Face Open LLM Leaderboard, DBRX Instruct and its peers were scored on the average of ARC-Challenge, HellaSwag, MMLU, TruthfulQA, WinoGrande, and GSM8k. The Databricks Model Gauntlet is a suite of over 30 tasks spanning six categories: world knowledge, commonsense reasoning, language understanding, reading comprehension, symbolic problem solving, and programming.
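For readers curious how the leaderboard composite is formed, it is simply the unweighted mean of the six per-task scores. The sketch below uses placeholder values, not DBRX’s actual results.

```python
# Open LLM Leaderboard-style composite: a plain mean over six task scores.
# All values are placeholders, not real DBRX numbers.
scores = {
    "arc_challenge": 0.0,  # placeholder
    "hellaswag": 0.0,      # placeholder
    "mmlu": 0.0,           # placeholder
    "truthfulqa": 0.0,     # placeholder
    "winogrande": 0.0,     # placeholder
    "gsm8k": 0.0,          # placeholder
}
composite = sum(scores.values()) / len(scores)
print(f"Composite score: {composite:.1%}")
```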

DBRX is only one example of the powerful and efficient models being built at Databricks for a wide range of applications, from internal features to ambitious use cases for its customers. As with any new model, the journey with DBRX is just beginning, and the best work will be done by those who build on it: enterprises and the open community.
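For those who want to start building, a minimal sketch of loading DBRX Instruct from its public Hugging Face repo with transformers looks roughly like this. It assumes you have accepted the model license on the Hub, a recent transformers release, and enough GPU memory for a model of this size; treat it as a starting point, not an official recipe.

```python
# Rough quick-start for DBRX Instruct via Hugging Face transformers.
# Assumes the model license has been accepted and sufficient GPU memory.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "databricks/dbrx-instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,  # needed on transformers versions without native DBRX support
)

messages = [{"role": "user", "content": "What is a mixture-of-experts model?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```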

DBRX’s Contributions

Databricks measured DBRX with the EleutherAI evaluation harness, pinned to the same older commit that the EleutherAI team uses for its public benchmarks, which ensures a fair comparison between DBRX and other models. The results show that DBRX outperforms GPT-3.5 and is competitive with Gemini 1.0 Pro, making it a valuable contribution to the field of LLMs.
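As a rough illustration, running a harness-style evaluation with a recent lm-evaluation-harness (v0.4+) looks something like the sketch below. This modern API differs from the older pinned commit Databricks used, so task names and scores will not exactly reproduce their setup.

```python
# Sketch of a harness-style evaluation with lm-evaluation-harness v0.4+.
# Not the pinned commit Databricks used; results will differ from theirs.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=databricks/dbrx-instruct",
    tasks=["arc_challenge", "hellaswag", "winogrande"],
    num_fewshot=5,  # few-shot count; the leaderboard uses task-specific values
)
print(results["results"])  # per-task metrics keyed by task name
```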

Conclusion

DBRX is a significant contribution to the field of LLMs, setting a new state of the art for established open LLMs and giving the open community and enterprises capabilities that were previously limited to closed model APIs. Its fine-grained mixture-of-experts (MoE) architecture delivers marked improvements in training and inference performance, making it a powerful and efficient model for a wide range of applications. This is also just the beginning of Databricks’ work on DBRX, and you should expect much more to come.
