MosaicML, a San Francisco-based AI startup, today announced the release of its latest language model, MPT-30B. The company says the new model, trained at a fraction of the cost of its competitors, could change how artificial intelligence is used in business applications.
Naveen Rao, CEO and co-founder of MosaicML, said in an interview with VentureBeat that MPT-30B was trained at a cost of $700,000, far less than the tens of millions of dollars required to train GPT-3. MPT-30B’s lower cost and smaller size could make it more attractive to companies looking to implement natural language processing (NLP) models in applications such as dialog systems, code completion, and text summarization.
“MPT-30B adds better capabilities for summarizing and entering more data into the prompt and (model) reason on that data,” Rao said. “So if that’s a requirement for you, who cares less about service economics, then perhaps the 30B is a better fit (versus our 7B model).”
Rao said that MosaicML used various techniques to optimize the model, for example the ALiBi and FlashAttention mechanisms, which allow for long context lengths and high GPU compute utilization. He also said that MosaicML was one of the few labs to have access to NVIDIA H100 GPUs, which increased throughput per GPU by more than 2.4X and translated to faster finish times.
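To illustrate how ALiBi (Attention with Linear Biases) supports long contexts: rather than learning positional embeddings, it adds a penalty to each attention score that grows linearly with the distance between query and key, with a different slope per attention head. The following is a minimal pure-Python sketch of that idea, assuming the slope schedule from the published ALiBi formulation for power-of-two head counts. The function names are hypothetical and are not taken from MosaicML's code.

```python
import math


def alibi_slopes(num_heads):
    """Per-head slopes: a geometric sequence starting at 2^(-8/num_heads).

    Assumption: num_heads is a power of two (the simple case in the ALiBi paper).
    """
    if num_heads <= 0 or num_heads & (num_heads - 1):
        raise ValueError("this sketch only handles power-of-two head counts")
    start = 2.0 ** (-8.0 / num_heads)
    return [start ** (h + 1) for h in range(num_heads)]


def alibi_bias(num_heads, seq_len):
    """Build bias[h][i][j], to be added to attention scores before softmax.

    For a key at position j <= query position i, the bias is slope_h * (j - i),
    i.e. zero on the diagonal and increasingly negative for distant keys.
    Future positions (j > i) get -inf, which acts as the causal mask.
    """
    slopes = alibi_slopes(num_heads)
    return [
        [
            [m * (j - i) if j <= i else -math.inf for j in range(seq_len)]
            for i in range(seq_len)
        ]
        for m in slopes
    ]


if __name__ == "__main__":
    bias = alibi_bias(num_heads=8, seq_len=4)
    # Head 0 has the steepest slope, so it focuses on nearby tokens.
    for row in bias[0]:
        print(row)
```

Because the penalty depends only on relative distance, the same bias rule applies to sequences longer than those seen in training, which is what lets a model using it extend to longer contexts.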
“We want to get as many people involved in the technology as possible,” Rao said. “This is our goal. It’s not to be exclusive. It’s not being elitist. It’s to get more people to use it.”
Enabling companies to build custom models for less
MosaicML allows companies to train models on their data using the company’s model architectures and then deploy models via its inference API. Rao said that while he couldn’t disclose many customer examples due to confidentiality, startups have used MosaicML’s models and tools to build natural language frontends and search systems.
According to Rao, MosaicML’s release of MPT-30B and its model deployment tools highlight the company’s goal of making advanced AI more accessible. “I think the big problem is really just empowering more people with technology. And this has been our goal from the beginning: to be truly transparent about costs, times and difficulties.”
The availability of MPT-30B as an open source model, together with MosaicML’s fine-tuning and deployment services, positions the startup to challenge OpenAI for dominance in the market for large language model technologies. With more advanced models and tools slated for release in the coming months, according to Rao, the race is on for leadership in the next generation of AI.
The future of AI involves many custom LLMs
The company’s vision for the future of generative AI is to create a tool that can assist experts in various fields, accelerating their work without replacing them. “I think the future, at least for the next five years, is about adopting these techniques and improving anyone who is already an expert,” Rao explained.
In addition to making AI technology more accessible, MosaicML is focusing on improving data quality for better model performance. The company is developing tools to help users layer in domain-specific data during the pre-training process. The aim is a diverse, high-quality mix of data, which is essential for building effective AI models.
With the release of MPT-30B, MosaicML is poised to make significant advances in the AI space, offering a more affordable and powerful option for enterprises. Its dedication to open source technology and to empowering more people with AI tools has the potential to unlock a wealth of untapped innovation, making AI a valuable resource for businesses around the world.
As businesses continue to adopt and invest in AI technology, MosaicML’s MPT-30B could very well be the catalyst driving a new era of more accessible and impactful AI solutions in business.