LoRA-Land and efficient small language models
News
The initial launch of Google Gemini's image generation did not go well. After the tool produced images containing severe historical inaccuracies, Google took the feature offline.
Mistral has announced it is producing a chatbot to compete with OpenAI's ChatGPT and Anthropic's Claude.
Stability AI has announced the release of Stable Diffusion 3. In a statement, the company said it is committed to preventing misuse of its tools.
DatologyAI, a platform for curating AI training datasets, has emerged from stealth mode with $11.4 million in new funding.
AI Research
Google Research has been working on making LLMs more responsive to individual users' interactions.
Researchers at the University of Washington and Carnegie Mellon University have been working on Pre-Instruction Tuning (PIT), which instruction-tunes LLMs on QA pairs before a continued pretraining run, improving the model's ability to absorb new knowledge.
Predibase released LoRA-Land, a set of 25 fine-tuned Mistral models. More on this below.
AI Tools
Syllaby uses AI to generate ideas, write blog posts, clone voices for videos, and provide social media analytics. It is an all-in-one marketing platform for social media and SEO content, especially useful for creating videos for a faceless YouTube channel.
HeyGen creates professional AI videos in minutes. These videos look and feel more realistic than those from most AI video tools out there. Useful for explainer videos or video tutorials.
Book Recommendation
The Black Box Society was published in 2015, yet it is still relevant today. The book is especially sharp in its examination of how online search algorithms can be manipulated. It sounds like a tell-all tale, but it is not that. I found it helped me understand how decisions are made in financial matters, and in all matters online.
Midjourney prompt » blustery day outside, while two people sit in their kitchen discussing poetry.
Open source vs closed source in LLMs
When comparing proprietary and open-source solutions, it's evident that ChatGPT, Gemini, and Claude lead the pack in popularity and performance according to widely recognized benchmarks.
These proprietary models are backed by substantial investments, enabling them to offer highly competitive pricing. However, there's a catch that extends beyond financial considerations.
With proprietary models, you relinquish control over the technology. This could result in unexpected updates that necessitate changes to your implementations. Moreover, you must rely on the assurances of these companies that they won't misuse the sensitive data that's transmitted to these models.
It's worth noting that these companies have significant motivation to leverage your data to enhance their models further.
Conversely, open-source models such as Llama or Mixtral 8x7B, despite typically producing lower-quality results, grant you complete control over the technology.
This ensures that your confidential data, a critical asset in today's business environment, remains secure. Furthermore, the quality disparity between proprietary and open-source models can be narrowed through fine-tuning.
Even though open-source models do lag behind today, with enough fine-tuning on a particular task you can push an open-source model's performance on that task well beyond what ChatGPT offers.
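The task-specific fine-tuning described here is commonly done with LoRA (Low-Rank Adaptation), the technique behind Predibase's LoRA-Land: the pretrained weights stay frozen, and only a small low-rank correction is trained. A minimal numpy sketch of the idea, with illustrative dimensions and names (this is not Predibase's actual implementation):

```python
import numpy as np

d_out, d_in, r, alpha = 64, 64, 4, 8  # r << d is the low-rank bottleneck

rng = np.random.default_rng(0)
W = rng.standard_normal((d_out, d_in))           # frozen pretrained weight
lora_A = rng.standard_normal((r, d_in)) * 0.01   # trainable down-projection
lora_B = np.zeros((d_out, r))                    # trainable up-projection, zero-init

def forward(x):
    # Base output plus the scaled low-rank correction (alpha / r) * B @ A @ x.
    return W @ x + (alpha / r) * (lora_B @ (lora_A @ x))

x = rng.standard_normal(d_in)
# Because lora_B is zero-initialized, the adapted model starts out
# identical to the frozen base model.
assert np.allclose(forward(x), W @ x)

# Only A and B are trained: r * (d_in + d_out) parameters instead of
# the full d_in * d_out, which is why fine-tuning 25 task-specific
# adapters over one shared base model stays cheap.
lora_params = r * (d_in + d_out)
full_params = d_in * d_out
print(lora_params, full_params)  # 512 4096
```

At realistic model sizes the savings are far larger: the adapter is typically well under 1% of the base model's parameters, so many task-specialized variants can share a single set of frozen weights.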
It is possible to build custom GPTs for specific use cases, one at a time, and achieve results better than generalized models such as OpenAI's ChatGPT or Anthropic's Claude.
Predibase is one organization that has shown what smaller, specialized language models can achieve compared with general-purpose GPTs. Their fine-tuned models are more efficient and effective than OpenAI's general-purpose tools on the tasks they are tuned for.
In other words, to get the most from GPTs we only need to fine-tune and optimize the portion of the language model that matters for the task. When Orca-2 was tested, it solved math problems nearly as well as OpenAI's much larger models.
I think the future use of AI is not in large general-purpose AI tools. It will be in smaller-footprint, specialized AI tools. Imagine an AI chatbot trained on your company's voice and tone.