Meta released Llama 3, the most powerful open-source model in history

Time: 2024-04-28

On April 18, the AI community received another piece of blockbuster news: Meta released Llama 3, billed as "the most powerful open-source model ever".

This time Meta has open-sourced the 8B and 70B versions of Llama 3, which are free for external developers to use. In the coming months, Meta will launch a series of new models with multimodality, multilingual conversation, longer context windows, and more. Among them, the largest version of Llama 3 will have more than 400 billion parameters and is expected to compete with Claude 3.

At the same time, Meta CEO Mark Zuckerberg announced that the Meta AI assistant, built on the latest Llama 3 model, now covers all of Meta's apps, including Instagram, WhatsApp, and Facebook. It also has a standalone website and an image generator that creates images from natural-language prompts.

15 trillion tokens: a large volume of high-quality training data

Llama 3's superior performance is inseparable from training on a huge dataset: 15 trillion tokens, almost seven times the size of the Llama 2 dataset. Piling up data is only the first step. Meta also paid close attention to data quality during training and used many filtering methods. The use of synthetic (AI-generated) data is one example.

From the official announcement:

"We found previous generations of Llama to be very good at identifying high-quality data, so we used Llama 2 to generate the training data for the text-quality classifiers that support Llama 3." After this large-scale feeding, the new version of Llama should answer trivia questions more accurately and should also handle history, STEM, engineering, and programming questions with ease.
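The classifier-based filtering described above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not Meta's pipeline: `score_quality` is a hypothetical stand-in for a learned text-quality classifier (here a crude heuristic based on word repetition and length), and the 0.5 threshold is arbitrary.

```python
from typing import Callable, Iterable, List


def score_quality(text: str) -> float:
    """Hypothetical stand-in for a learned text-quality classifier.

    Returns a score in [0, 1]. A real classifier would be a trained model;
    this heuristic only rewards longer, less repetitive text.
    """
    words = text.lower().split()
    if not words:
        return 0.0
    distinct_ratio = len(set(words)) / len(words)  # penalizes repetition
    length_factor = min(len(words) / 10, 1.0)      # penalizes tiny fragments
    return distinct_ratio * length_factor


def filter_corpus(
    docs: Iterable[str],
    scorer: Callable[[str], float] = score_quality,
    threshold: float = 0.5,  # arbitrary cutoff chosen for illustration
) -> List[str]:
    """Keep only the documents whose quality score reaches the threshold."""
    return [doc for doc in docs if scorer(doc) >= threshold]


if __name__ == "__main__":
    corpus = [
        "The mitochondrion produces most of the cell's supply of ATP through respiration.",
        "click here click here click here click here click here",
        "ok",
    ]
    for doc in filter_corpus(corpus):
        print(doc)
```

In this sketch only the first sample document survives: the repetitive and the one-word entries score below the cutoff. Swapping `scorer` for a trained model would leave the filtering loop unchanged.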

Meta also mentions that more than 5% of Llama 3's pretraining dataset consists of high-quality non-English data. This portion was added to better serve users around the world who have different language backgrounds.

A comprehensively optimized training process, about 3 times more efficient than Llama 2's

Meta also shared that, compared with the previous two generations of models, it made many optimizations to the training process, combining data parallelization, model parallelization, and pipeline parallelization. When training on a cluster of 16,000 GPUs, it achieved a compute utilization of more than 400 TFLOPS per GPU.

To maximize GPU uptime, Meta has developed an advanced training stack that automates error detection, handling, and maintenance.

Meta has also dramatically improved hardware reliability and its detection mechanisms for silent data corruption, and has developed new scalable storage systems that reduce the overhead of checkpointing and rollback. These improvements result in an overall effective training time of more than 95%. Taken together, they make training Llama 3 about three times more efficient than training Llama 2.
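To put these figures in perspective, the short Python calculation below combines them. It is a back-of-the-envelope sketch, not Meta's own accounting: it treats the quoted 400 TFLOPS per GPU and 95% effective training time as exact values (the article gives both as lower bounds), and the constant and function names are purely illustrative.

```python
# Back-of-the-envelope arithmetic using the figures quoted in this article.
# Assumption: "more than 400 TFLOPS" and "more than 95%" are treated as exact,
# so the results are lower bounds rather than measured values.

NUM_GPUS = 16_000               # cluster size quoted in the article
TFLOPS_PER_GPU = 400            # per-GPU compute utilization (lower bound)
EFFECTIVE_TIME_FRACTION = 0.95  # share of wall-clock time spent training


def aggregate_exaflops(num_gpus: int, tflops_per_gpu: float) -> float:
    """Cluster-wide throughput in exaFLOPS (1 exaFLOPS = 1,000,000 TFLOPS)."""
    return num_gpus * tflops_per_gpu / 1_000_000


def effective_exaflops(
    num_gpus: int, tflops_per_gpu: float, effective_fraction: float
) -> float:
    """Throughput averaged over wall-clock time, counting downtime as zero."""
    return aggregate_exaflops(num_gpus, tflops_per_gpu) * effective_fraction


if __name__ == "__main__":
    peak = aggregate_exaflops(NUM_GPUS, TFLOPS_PER_GPU)
    sustained = effective_exaflops(
        NUM_GPUS, TFLOPS_PER_GPU, EFFECTIVE_TIME_FRACTION
    )
    print(f"Aggregate throughput: at least {peak:.2f} exaFLOPS")
    print(f"Averaged over wall-clock time: at least {sustained:.2f} exaFLOPS")
```

With the quoted numbers this works out to roughly 6.4 exaFLOPS across the cluster, or about 6.08 exaFLOPS once the 95% effective training time is factored in.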

The most secure open source model in history

In response to the security concerns that outside observers raise most often about open-source large models, Meta appears to have prepared thoroughly this time.

Meta has adopted a new system-level approach to the responsible development and deployment of Llama 3. It treats Llama 3 as part of a broader system that leaves developers in full control of the model.

Instruction fine-tuning also plays an important role in ensuring the security of the model.

Meta's instruction-fine-tuned models have been red-teamed both internally and externally. Meta's red team uses human experts and automated methods to generate adversarial prompts in an attempt to elicit problematic responses.

They conducted comprehensive testing to assess the risk of misuse of the model in chemical, biological, cybersecurity, and other risk areas. In addition, Meta uses the industry's most advanced safety technology for large models: new trust and safety tools released alongside Llama 3, namely Llama Guard 2, Code Shield, and CyberSec Eval 2, help ensure that the model cannot easily be jailbroken into producing harmful content.

What should investors watch in Llama 3?

Technological innovation and application potential: As a powerful open-source model, Llama 3 has advanced natural language processing capabilities, strong generalization, and an efficient training mechanism. Investors can pay attention to its breakthroughs in core AI technologies and its broad potential in application scenarios such as intelligent customer service, automated writing, intelligent recommendation, data analysis, and code generation.

Ecosystem building: Because it is open source, Llama 3 may attract a large number of developers, enterprises, and research teams to take part in optimizing the model and building applications on it, forming an ecosystem around the model. Investors can focus on the network effects that this ecosystem brings, as well as the new business models and partnerships that may emerge.

Market competitive advantage: Compared with competing products such as OpenAI's GPT series, Llama 3's advantages on certain performance indicators may increase its market share and brand influence, which could lead to a higher return on investment.

Commercialization path: Although Llama 3 is open source, value-added services built around the model, such as customization, consulting, education and training, and API licensing, may become its profit model. Investors can keep an eye on how the team behind Llama 3 turns its technical advantages into economic benefits.
