Thursday, July 20, 2023

Llama 2: A New Open Source Large Language Model from Meta AI

Introduction

In this video, I will be discussing the recent release of Llama 2, a new open source large language model from Meta AI. I will cover the key features of Llama 2, its performance, and how you can use it today.

Key Features of Llama 2

Llama 2 is a significant improvement over previous open source large language models. It has a larger parameter count, which allows it to generate more complex and detailed text. It also has a number of new features, such as the ability to generate different creative text formats, translate languages, and write different kinds of creative content.

Performance of Llama 2

Llama 2 has been shown to outperform previous open source large language models on a number of benchmarks. For example, it has a lower violation rate than other models, which means that it is less likely to generate text that is harmful or offensive.

How to Use Llama 2 Today

Llama 2 is available to use today. You can download the models, weights, and code from Meta's hugging face repository. There are also fully hosted versions of the 7B and 13B models available.

Conclusion

Llama 2 is a powerful new open source large language model. It has a number of advantages over previous models, including its larger parameter count, its new features, and its lower violation rate. If you are interested in using large language models, I recommend checking out Llama 2.

Additional Information

In the video, I also discuss the following topics:

  • The partnership between Meta AI and Microsoft
  • The difference between open source and frontier models
  • The future of large language models

I hope you found this video helpful. If you have any questions, please leave a comment below.

llama 2 was just released this morning and it represents a massive Leap Forward in open source large language models and brings open source models that much closer to gpt4 Performance llama 2 is completely open source for both research and Commercial purposes well almost completely open source I'll talk about that in a minute I read over the entire 76 page white paper I read all the news and in today's video I'm going to share all of the most interesting things that I learned and at the end of the video I'll show you how you can start using llama 2 today let's go so as I mentioned llama 2 was just released this morning and according to meta AI it is a suitable substitute for closed Source models AKA Chachi PT and meta AI continues to contribute to the open source Community which to be honest really continues to surprise me look at this graph from the top tech companies in the world and their contributions to hugging faces open source community and this is especially true when considering the resources necessary to produce a model like this the smartest people in the world a ton of compute power and expensive data sets with some estimates putting the data sets alone at 25 million dollars

the Llama 2 white paper is huge and it spells out the entire recipe including the model details the training stages the hardware the data Pipeline and The annotation process so let's get some more specs out of the way now it comes in two flavors and three sizes they have the base llama 2 model and another llama 2 chat model specializing in dialogue both come in 7 billion 13 billion and 70 billion parameter sizes

they also created what many consider to be the sweet spot for large language model sizes which is 34 billion parameters but they didn't release it I'll talk more about that in a minute llama 2 was trained using a cluster of Nvidia a100 gpus and Nvidia continues to benefit from the AI wave going on right now meta trained llama 2 on a 40 larger data set and doubled the context size from two thousand to four thousand tokens now although four thousand still isn't that big subsequent fine-tuned models will likely greatly increase the size of the context window as it has done with the Llama one model they also use the newer technique called grouped query attention to help improve inference scalability for the larger models last something I find really interesting they actually talk about carbon emissions as part of their white paper and announcement

during the training process these models take an enormous amount of compute power and all that compute power is powered by electricity and of course there's going to be carbon emissions from production of that electricity so noting the efficiency and detriment to the environment I see as a good thing now one thing that I was surprised by and found incredibly interesting is that meta partnered with Microsoft on this and of course Microsoft made an enormous investment in open AI which is a completely closed Source large language model

so why did they do that why is Microsoft partnering on an open source model when it's clearly competitive with chat GPT well let's look at the announcement they say we offer developers choice in the types of models they build on supporting open and Frontier models and are thrilled to be meta's preferred partner as they release their new version of llama 2 to commercial customers for the first time now I want to point out a key word here the word frontier what they mean by that is the most Cutting Edge models AKA gpt4 so they're really making a clear distinction between open source models and the better model gpt4 and so this is really a fine balance between Microsoft investing and contributing to open source which because of Satya Nadella their CEO has been a core element of their culture and protecting their multi-billion dollar investment in open Ai and chat gbt now

let's talk about what I consider to be the most important aspect of llama2 going back to llama 1 it was an incredibly powerful model that was leaked from meta and spawned a wave of fine-tuned versions and lit a spark in the open source llm room but one major drawback of llama one was that it was not commercially viable you can use it for research purposes but you couldn't build products and companies on top of it but now llama 2 is commercially viable but remember

when I said llama 2 was almost completely open source well it turns out that there's one caveat to that if you have greater than 700 million users on a product built on top of llama2 you need to get meta's permission to use it now of course I can imagine that's one of those good problems to have as a company if you grow a product to have 700 million users you probably want to have that discussion or you're already investing in your own internal models so why did to do that they did that to protect their model against their biggest competitors

they don't want Google Microsoft Amazon taking llama 2 and building massive products on top of it so although it is commercially viable for 99.9 percent of cases I wouldn't say it is completely open source and commercially viable if I were building another company I'd probably risk building on top of llama 2 though and crossing the 700 million user bridge when I get to it now one thing that seems to be really missing from the research paper and the announcement is its coding ability and from what I've gathered it doesn't seem to have very strong coding ability

in fact I've seen it called out that GPT 4's coding ability far surpasses what is possible with even llama 2. now let's talk about safety which seems to be the primary focus of much of the work of llama 2. in fact almost half of the Llama 2 white paper is dedicated to talking about safety guard rails red teaming and evaluations so now let's go back to that 34 billion parameter model why didn't they release it they have the 7 billion parameter model the 13 billion and the 70 billion but they had the 34 billion and they just didn't release it it turns out that the 34 billion perimeter model was significantly less safe than the other versions of their model both larger and smaller and so what they said is they are delaying the 34 billion parameter model due to the lack of time to sufficiently red team and get the safety to a better place

let's take a look at this graph to understand how much safer llama2 is than other models on the left side in these dark blue these are llama 2 models on the right side these are both open source and closed Source models and this is violation percentage and the lower the better so basically how often did the large language model produce a result that violated its guidelines and if we look closely on the left side the 7 13 and 70 billion parameter model all perform about the same in terms of violation percent percentage but the 34 billion parameter model is double that of the other models and that is why they're delaying the release of the 34 billion parameter model but I'm personally very excited for that specific size because it's large enough to have great quality but small enough to fit on a high-end consumer grade GPU now Lama 2 is censored but if it's anything like llama one there are going to be fine-tuned versions of it that effectively remove the censorship altogether so talking about safety and helpfulness there has traditionally been a trade-off between these two things the more rewards that are given to safety

during training the less helpful a model becomes however one of the big advancements of this paper is that meta seems to have solved that problem with a two reward model approach one for helpfulness and one for saving now they haven't released these reward models but I really hope they do okay with all of that aside meta still does say that there is a significant

Prof performance gap between llama 2 and the frontier models and the frontier models are gpt4 by open Ai and palm 2 by Google okay so now the part that I know you want to hear how do I actually use this today well you can download the models the weights and the code at meta's hugging face repository and there are already fully hosted versions of the 7B and 13B models all of which I'll link to in the description below I plan on doing extensive testing on not only the base models and all the different sizes of them but all of the inevitable fine-tuned versions that come from the Llama 2 model I'll be running all of the versions through my llm rubric and I'm going to report the results to you if you like this video please consider giving me a like And subscribe and I'll see you in the next one

No comments:

Post a Comment