
|OpenAI's commercialization journey
OpenAI's commercialization journey is divided into four stages:

Once OpenAI becomes profitable, it will return profits to investors in stages:
1. Priority is given to ensuring that OpenAI's first batch of investors recover their initial capital;
2. Once Microsoft's investment is complete and OpenAI LP's first investors have recovered their initial investment, Microsoft is entitled to 75% of OpenAI LP's profits;
3. After Microsoft has recovered its $13 billion investment and OpenAI LP has made a profit of $92 billion, Microsoft's share of profits falls from 75% to 49%;
4. After the profits generated by OpenAI LP reach $150 billion, the shares held by Microsoft and other venture investors will be transferred at no cost to OpenAI LP's General Partner, the non-profit organization OpenAI, Inc.
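The staged rule above can be read as a simple lookup from "where are we in the payback sequence" to "what share of profit goes to Microsoft". The sketch below is only an illustration of that reading. The function name, its parameters, and the choice to measure both thresholds in billions of dollars of cumulative OpenAI LP profit are assumptions made for the example, not terms of the actual agreement.

```python
def microsoft_profit_share(first_investors_repaid: bool,
                           microsoft_recovered_bn: float,
                           lp_profit_bn: float) -> float:
    """Return Microsoft's share of OpenAI LP profit under a simplified reading.

    All three parameters are hypothetical names introduced for this sketch.
    Amounts are in billions of US dollars.
    """
    # Stage 1: the first investors recover their capital before anyone else.
    if not first_investors_repaid:
        return 0.0
    # Stage 4: at $150 billion of profit, investor shares revert to OpenAI, Inc.
    if lp_profit_bn >= 150.0:
        return 0.0
    # Stage 3: Microsoft has its $13 billion back and LP profit has hit $92 billion.
    if microsoft_recovered_bn >= 13.0 and lp_profit_bn >= 92.0:
        return 0.49
    # Stage 2: Microsoft is entitled to 75% of profit.
    return 0.75


stage_two = microsoft_profit_share(True, 5.0, 10.0)
stage_three = microsoft_profit_share(True, 13.0, 92.0)
```

Keeping the stages as ordered early returns makes it easy to see which condition ends each stage.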
|OpenAI's technology development history

|GPT Iteration Process: GPT has gone through five generations of models so far
GPT-1: In June 2018, OpenAI unveiled the first GPT model that combines transformers with unsupervised pre-training techniques. The GPT-1 model architecture is based on the Transformer model, which can make predictions on large-scale tasks by learning large amounts of unlabeled text data. The number of model parameters is 117 million.
GPT-2: In February 2019, OpenAI published a blog post titled "Better Language Models and Their Implications", officially announcing the GPT-2 model. GPT-2 is a natural language processing model based on the Transformer architecture. It uses unsupervised pre-training, which allows it to learn a language model from unlabeled text data. The number of model parameters is 1.5 billion.
GPT-3: In May 2020, researchers at OpenAI submitted a paper titled "Language Models are Few-Shot Learners", announcing the birth of GPT-3. GPT-3 is capable of few-shot and zero-shot learning, that is, it can produce meaningful outputs without being trained on a specific task or domain. The number of model parameters is 175 billion.
ChatGPT: In November 2022, OpenAI officially launched ChatGPT, which is conversational and interactive. Compared with GPT-3, ChatGPT introduces reinforcement learning from human feedback (RLHF) and a reward mechanism to improve the accuracy of the model.
GPT-4: In March 2023, OpenAI officially launched GPT-4, which was the most advanced multimodal large model at the time of its release. GPT-4 made progress mainly in recognition and comprehension, creative writing, the amount of text it can process, and customizable identity attributes.

|GPT-1 of GPT Iterative Process: GPT-1 adopts a multi-layer Transformer architecture
GPT-1 adopts a multi-layer Transformer architecture. The overall structure is: input layer -> n Transformer blocks -> output layer. The input layer uses byte-pair encoding (BPE) to split the original text into tokens, and each token is mapped to a fixed-length vector that serves as input to the model. The model then passes these vectors through n Transformer blocks. Each block contains several sub-layers, chiefly a multi-head self-attention sub-layer and a fully connected feed-forward network sub-layer. The multi-head self-attention sub-layer computes how important each word is in its context, and the feed-forward sub-layer extracts features and generates new representations. Finally, the output vector of the last Transformer block is fed into the output layer to generate a prediction for the next word. The whole process is called generative pre-training.
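The flow described above (token vectors -> n Transformer blocks -> next-word prediction) can be sketched in a few dozen lines of plain Python. This is a toy illustration under several simplifying assumptions, not GPT-1's actual implementation: it uses a single attention head instead of multiple heads, ReLU instead of GELU, random untrained weights, and it omits positional embeddings and layer normalization. All function and variable names are made up for the example.

```python
import math
import random


def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(W, x):
    # W is a list of rows; returns the matrix-vector product W @ x.
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def rand_matrix(rows, cols, rng):
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


def causal_self_attention(xs, Wq, Wk, Wv):
    # Single-head attention: position t may only look at positions <= t.
    d = len(xs[0])
    qs = [matvec(Wq, x) for x in xs]
    ks = [matvec(Wk, x) for x in xs]
    vs = [matvec(Wv, x) for x in xs]
    out = []
    for t, q in enumerate(qs):
        scores = [sum(a * b for a, b in zip(q, ks[j])) / math.sqrt(d)
                  for j in range(t + 1)]
        weights = softmax(scores)  # importance of each earlier word
        out.append([sum(w * vs[j][i] for j, w in enumerate(weights))
                    for i in range(d)])
    return out


def feed_forward(x, W1, W2):
    hidden = [max(0.0, h) for h in matvec(W1, x)]  # ReLU for brevity
    return matvec(W2, hidden)


def transformer_block(xs, p):
    attn = causal_self_attention(xs, p["Wq"], p["Wk"], p["Wv"])
    xs = [[a + b for a, b in zip(x, y)] for x, y in zip(xs, attn)]  # residual
    ff = [feed_forward(x, p["W1"], p["W2"]) for x in xs]
    return [[a + b for a, b in zip(x, y)] for x, y in zip(xs, ff)]  # residual


def next_token_probs(token_ids, embed, blocks):
    xs = [embed[t] for t in token_ids]      # input layer: token id -> vector
    for p in blocks:                        # n Transformer blocks
        xs = transformer_block(xs, p)
    logits = matvec(embed, xs[-1])          # output layer reuses the embeddings
    return softmax(logits)


rng = random.Random(0)
vocab, d, n_blocks = 10, 8, 2
embed = rand_matrix(vocab, d, rng)
blocks = [{"Wq": rand_matrix(d, d, rng), "Wk": rand_matrix(d, d, rng),
           "Wv": rand_matrix(d, d, rng), "W1": rand_matrix(4 * d, d, rng),
           "W2": rand_matrix(d, 4 * d, rng)} for _ in range(n_blocks)]
probs = next_token_probs([1, 4, 2], embed, blocks)
```

`probs` is a probability distribution over the toy vocabulary for the word that follows the three input tokens. Pre-training would adjust the weights so that the true next word receives high probability.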
GPT-1 uses the BooksCorpus dataset, which contains 7,000 unpublished books. The authors chose this dataset for two reasons: 1) its texts have long contextual dependencies, which lets the model learn longer-range dependencies, and 2) because the books have not been published, they are unlikely to appear in downstream datasets, which makes it a better test of the model's generalization ability.
|GPT-2 in the GPT Iterative Process: Adopt a larger training set and try unsupervised training
The goal of GPT-2 is to train a model with stronger generalization ability. It does not introduce much structural innovation over GPT-1's network; it simply uses more network parameters and a larger dataset. In the GPT-2 phase, OpenAI removed the supervised fine-tuning used in the GPT-1 phase, making GPT-2 an unsupervised model. In the related paper, it achieved state-of-the-art results on 7 of the 8 language modeling datasets tested.
GPT-2 has roughly ten times as many parameters and as much training data as GPT-1, with 1.5 billion parameters. Its dataset is WebText: a corpus of eight million documents with a total size of 40 GB. The texts were collected from 45 million highly rated web pages linked on Reddit and cover a variety of topics and sources, such as news, forums, blogs, Wikipedia, and social media.

|GPT-3 in the GPT Iterative Process: A breakthrough in the AI revolution to further improve generalization capabilities
OpenAI published a paper on GPT-3 in May 2020. The number of parameters increased by more than two orders of magnitude compared to GPT-2, reaching 175 billion, and the model was trained on a massive 570 GB text corpus containing about 400 billion tokens. This data mainly comes from CommonCrawl, WebText, English Wikipedia, and two book corpora (Books1 and Books2). Improved algorithms, powerful computing power, and increased data have promoted the AI revolution, making GPT-3 the most advanced language model at the time.
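As a quick check on the scale claims, the growth between generations can be computed from the parameter counts quoted in this document (117 million, 1.5 billion, and 175 billion). The dictionary and helper below are just illustrative names for this arithmetic.

```python
import math

# Parameter counts as quoted in this document.
PARAMS = {"GPT-1": 117e6, "GPT-2": 1.5e9, "GPT-3": 175e9}


def growth(old: str, new: str):
    """Return (ratio, orders of magnitude) between two model generations."""
    ratio = PARAMS[new] / PARAMS[old]
    return ratio, math.log10(ratio)


gpt2_ratio, gpt2_orders = growth("GPT-1", "GPT-2")
gpt3_ratio, gpt3_orders = growth("GPT-2", "GPT-3")
print(f"GPT-1 -> GPT-2: x{gpt2_ratio:.1f} ({gpt2_orders:.2f} orders of magnitude)")
print(f"GPT-2 -> GPT-3: x{gpt3_ratio:.1f} ({gpt3_orders:.2f} orders of magnitude)")
```

The GPT-2 to GPT-3 ratio is a little over 100, which is where "more than two orders of magnitude" comes from, while the GPT-1 to GPT-2 step is roughly one order of magnitude.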
GPT-3 reduces the need for large amounts of expensively labeled data when adapting a language model to a task. Starting from the pre-trained model, GPT-3 can generate adequate responses "using only a few labeled samples", which makes development more cost- and time-efficient.
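In practice, "using only a few labeled samples" usually means placing those samples directly in the model's input rather than retraining it. The sketch below shows one way such a few-shot prompt could be assembled. The `Input:` / `Output:` layout and the function name are assumptions chosen for illustration, not a format prescribed by OpenAI.

```python
def build_few_shot_prompt(task: str, examples, query: str) -> str:
    """Assemble a prompt from a task description, labeled examples, and a query."""
    lines = [task, ""]
    for text, label in examples:
        lines.append(f"Input: {text}")
        lines.append(f"Output: {label}")
        lines.append("")
    # The final entry is left open so the model completes the label.
    lines.append(f"Input: {query}")
    lines.append("Output:")
    return "\n".join(lines)


prompt = build_few_shot_prompt(
    "Classify the sentiment of each review as positive or negative.",
    [("The plot was gripping.", "positive"),
     ("I want my money back.", "negative")],
    "A delightful surprise from start to finish.",
)
print(prompt)
```

The two labeled reviews are the only task-specific data involved; the model's weights are not updated.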
|ChatGPT in GPT Iterative Process: Reinforcement Learning Added to Release the Fourth-Generation Model
The training process of ChatGPT is divided into three steps: supervised fine-tuning of the GPT-3.5 model, training a reward model, and using reinforcement learning to further optimize the fine-tuned model.
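The second step, training the reward model, is the least intuitive of the three, so a small sketch may help. The text above does not specify the loss function. The example assumes the pairwise ranking loss commonly used for reward models in RLHF, where human labelers compare two responses and the model is penalized when it scores the preferred one lower. The function name and the scores are hypothetical.

```python
import math


def pairwise_reward_loss(r_chosen: float, r_rejected: float) -> float:
    """Compute -log(sigmoid(r_chosen - r_rejected)) in a numerically stable way.

    r_chosen is the reward model's score for the response the human preferred,
    r_rejected the score for the other response. Assumed loss, see lead-in.
    """
    z = r_rejected - r_chosen
    # softplus(z) = log(1 + exp(z)), written to avoid overflow for large |z|.
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


agrees = pairwise_reward_loss(2.0, 0.0)     # model ranks the pair like the human
disagrees = pairwise_reward_loss(0.0, 2.0)  # model ranks the pair the wrong way
undecided = pairwise_reward_loss(0.0, 0.0)  # model cannot tell them apart
```

The loss is small when the reward model agrees with the human ranking and grows when it disagrees. In the third step, the fine-tuned model is optimized with reinforcement learning to produce responses that this reward model scores highly.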

|GPT-4 of the GPT Iterative Process: More creative and able to accept longer text inputs
On March 14, 2023, OpenAI released GPT-4, which caused a sensation across the tech world. According to OpenAI itself, GPT-4 is a milestone model for the company. In its concept video, OpenAI explains how GPT-4 can solve more complex problems, write larger programs, and generate text from images. In addition, OpenAI promised that GPT-4 would be safer and better aligned than previous models, including GPT-3.5, which previously powered ChatGPT.
GPT-4 not only greatly improves the accuracy of its answers, but also has stronger image recognition ability, and it can write lyrics and creative texts and vary its writing style. In addition, GPT-4's text input limit has been increased to 25,000 words, and support for languages other than English has been further optimized.
|GPT-4o of GPT Iterative Process: Real-time inference on audio, visual, and text
On May 13, 2024, OpenAI launched its new flagship model, GPT-4o, which can perform inference on audio, visual, and text in real time. The "o" in GPT-4o stands for "omni", which derives from the Latin word "omnis". In English, "omni" is often used as a prefix to denote the concept of "all" or "every".
The new model can engage with the user emotionally and can sound excited, friendly, or even sarcastic. Its minimum response time is 232 milliseconds, similar to the response time of a human in a conversation.

|Sora: The latest capability of text-to-video models
According to OpenAI's Sora technical report, the report has a total of 13 authors. Among them, Aditya Ramesh, Tim Brooks, and Bill Peebles are core members of the team. Aditya Ramesh, who proposed the image generation model DALL-E, led three iterations of the DALL-E model from 2021 to 2023.

The development history of Google's large model

|Meta: Released LLaMA, a large language model
On February 25, 2023, Meta announced a new large language model, LLaMA (Large Language Model Meta AI), on its official website. Meta provides LLaMA at four parameter scales: 7 billion, 13 billion, 33 billion, and 65 billion. The models are trained on text in 20 languages and have the following characteristics:
The parameter scale is small and the computing power requirements are low. Compared with OpenAI's GPT-3, the base model underlying ChatGPT, which has 175 billion (175B) parameters, the LLaMA models have far fewer parameters.
The training data is extensive. The training dataset for LLaMA includes the open data platform Common Crawl, the English document dataset C4, the code platform GitHub, Wikipedia, and the paper preprint platform arXiv.
Outstanding AI capabilities. It is better than GPT-3 in logical reasoning and other aspects, and better than LaMDA and PaLM in code generation.

|Character.AI: A personalized character generator
Character.AI is an AI-based chatbot app founded by former members of Google's LaMDA team. Users can create virtual characters through Character.AI, shape their personalities, set specific parameters, and then publish them to the community for others to use and chat with, which gives the product a social aspect. The beta version of Character.AI opened to the public in September 2022 and reached 1.7 million downloads in its first week of launch.
Character.AI is built on deep learning and a scalable language model, and the content of responses is continuously improved through user ratings. In March 2023, Character.AI closed a $150 million funding round at a valuation of $1 billion. At present, the team has about 30 people. According to the latest media reports, Character.AI is currently in talks with venture capitalists for equity financing, which will value the company at more than $5 billion.
|Jasper: A company that defines itself as the content brain of the future
Jasper AI is a SaaS company serving advertising marketers, independent content creators, and similar groups, mainly providing copywriting generation services. Its core product uses the GPT-3 API to generate a variety of marketing copy with AI.
Founded in 2021, the company peaked at its debut: revenue exceeded $40 million in its first year and doubled to $80 million in 2022 (ARR, annual recurring revenue). During those two boom years, VCs flocked to the company and the media rushed to cover it. In October 2022, Jasper raised $125 million in Series A financing at a valuation of $1.5 billion.
But on November 30, 2022, when ChatGPT appeared, the situation suddenly took a turn for the worse. ChatGPT's powerful copywriting ability made people realize that the "official" product could easily crush the derivative products built on top of it. Jasper's product resembles a reskinned GPT-3 and has almost no moat against ChatGPT, which is built by OpenAI itself. Recently, Jasper founder Dave Rogenmoser announced on LinkedIn that the company would lay off staff and restructure its team.

|Replit: No threshold + collaboration + interaction, AI drives the coding revolution
Founded in 2016, Replit offers as its core product a browser-based integrated development environment in which users can develop, compile, run, and host applications in more than 50 languages. In 2022 it launched the Ghostwriter AI programming assistant.
On April 25, 2023, Replit announced that it had raised $97.4 million in Series B+ financing, with a post-money valuation of approximately $1.2 billion. Replit's high valuation rests on its strong usage metrics and relatively stable fee model. At the beginning of 2023, Replit had 22.5 million developers from more than 200 countries and 235 million projects created. Applications and websites hosted on Replit receive 25 billion external visits per month, and the platform handles 1 million concurrently running containers and 3 million sequential read/write disk operations. Replit also has 200,000 generative AI applications that use the major AI APIs. Replit offers monthly plans ranging from free to $20/month, as well as app-based plans, with a variety of pricing models to suit all types of users.

|Cohere: An NLP service provider focusing on the ToB market
Cohere is a Canadian natural language processing platform that provides large-model solutions for the ToB (business) market. Cohere currently pursues a differentiated competitive strategy: rather than taking the popular "ChatGPT-like" route, it focuses on using the capabilities of large models to serve enterprise customers, and it has designed a sound pricing model. At present, Cohere uses a pay-as-you-go model and sets different prices for different model capabilities, including text generation, text summarization, re-ranking, and text classification.
Cohere is valued at $2 billion and has received investment from a number of industry giants and leading academics. In May 2023, Cohere raised $250 million from Salesforce and other institutions at a valuation of $2 billion, making it a unicorn. In addition to Salesforce, Cohere's investors include well-known investment institutions such as Tiger Global and Index Ventures, giants in the AI ecosystem such as NVIDIA, and academic experts in the field of AI such as Turing Award winner Geoffrey Hinton and well-known AI researchers Fei-Fei Li and Pieter Abbeel.





