Databricks (USA), a provider of big data analytics and machine learning solutions, has announced the release of Dolly 2.0, an open-source, next-generation generative artificial intelligence (AI) model with capabilities similar to those of OpenAI's ChatGPT.
Like its predecessor Dolly, released a couple of weeks earlier, Dolly 2.0 is far smaller than most large language models (LLMs). Dolly had 6 billion parameters, while Dolly 2.0 has twice as many, 12 billion. For comparison, GPT-3 has 175 billion parameters. Dolly 2.0 was reportedly trained on a high-quality dataset.
A key feature of the new generative AI models is that they can be trained on your own dataset to produce coherent text and answer user questions. Dolly 2.0 can do this with far less training data than OpenAI's models require. This, in turn, lets you run the model on your own servers without sharing data with third parties.
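To make the point about self-hosting more concrete, here is a rough back-of-the-envelope sketch of how much memory the model weights alone would need. It assumes 16-bit weights (2 bytes per parameter), which is an assumption rather than something Databricks states, and it ignores activations and other runtime overhead. The parameter counts are the ones cited in this article; the function name is purely illustrative.

```python
def weight_memory_gb(num_params: int, bytes_per_param: int = 2) -> float:
    """Approximate memory for the weights alone, in gigabytes (1 GB = 1e9 bytes).

    bytes_per_param=2 assumes 16-bit weights; this is an assumption for illustration.
    """
    return num_params * bytes_per_param / 1e9


# Parameter counts as cited in the article.
models = {
    "Dolly": 6_000_000_000,
    "Dolly 2.0": 12_000_000_000,
    "GPT-3": 175_000_000_000,
}

for name, params in models.items():
    print(f"{name}: ~{weight_memory_gb(params):.0f} GB of weights")
```

Under these assumptions, a 12-billion-parameter model needs roughly 24 GB for its weights, compared with roughly 350 GB for a 175-billion-parameter model.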
“We believe that models like Dolly will help democratize LLMs, turning them from something that very few companies can afford into a commodity that every company can own and customize to improve their products,” Databricks said. A Databricks executive told SiliconANGLE that businesses “can monetize Dolly 2.0.”
Databricks is releasing Dolly 2.0 as fully open source, together with databricks-dolly-15k, a training dataset offered under a Creative Commons license that contains 15,000 high-quality, human-written query/response pairs. All of this can be freely used, modified and extended, including in commercial projects, at no charge. Researchers and developers can access Dolly 2.0 on Hugging Face and GitHub.
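As an illustration of what such a query/response pair might look like and how it could be turned into training text, here is a minimal sketch. The field names (`instruction`, `context`, `response`) are an assumption about the layout of the published dataset. The sample record and the prompt template are made up for illustration and are not Databricks' actual training format.

```python
def format_example(record: dict) -> str:
    """Turn one query/response record into a single block of training text.

    The "Instruction / Context / Response" template is hypothetical.
    """
    parts = [f"Instruction: {record['instruction'].strip()}"]
    # The context field is optional; skip it when empty.
    context = record.get("context", "").strip()
    if context:
        parts.append(f"Context: {context}")
    parts.append(f"Response: {record['response'].strip()}")
    return "\n\n".join(parts)


# Made-up record for illustration only; not taken from the real dataset.
sample = {
    "instruction": "Why do people like comedies?",
    "context": "",
    "response": "Comedies offer relief from everyday stress and are easy to enjoy with others.",
}

print(format_example(sample))
```

Records with a non-empty `context` (for example, a Wikipedia passage to summarize) would simply get an extra "Context:" section between the instruction and the response.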
According to Databricks, Dolly 2.0 is currently the only model of its kind that carries no license restrictions. Other models, including Alpaca, Koala, GPT4All and Vicuna, cannot be used for commercial purposes because they were trained on data whose terms of use restrict such use.
The original version of Dolly was trained on the Stanford Alpaca dataset, which was generated using the OpenAI API, so it could not be used for commercial purposes: the applicable terms prohibit the creation of competing models. Databricks therefore decided to create its own dataset, using only answers written by its employees. Their tasks included, for example, responding to open-ended questions such as “Why do people like comedies?”, summarizing information from Wikipedia, and writing love letters, poems, and even songs.