With the surge of interest in large models this year, vector databases have drawn industry attention because of what they are built to do. On July 4th, Tencent Cloud officially released its AI Native vector database, Tencent Cloud VectorDB. The database can be used widely in scenarios such as large model training, inference, and knowledge base supplementation. According to Tencent Cloud, it is the first vector database in China to provide AI capabilities across the full lifecycle, from the access layer through the computing layer to the storage layer. This means that Tencent Cloud, which has already launched an industry large model platform, is now also launching a standalone cloud service aimed specifically at the huge data demands of large models. It is the first such example in the domestic cloud market.
01 | What is a vector database?
First, a brief introduction. A vector database works by vectorizing data, then storing and querying the resulting vectors. It can process large-scale, complex, high-dimensional data (such as images, audio, and video) at high speed, supports complex query operations, and scales easily across multiple nodes to handle larger data volumes. For large models specifically, vector databases can effectively reduce training costs, supplement the model's "long-term memory", update the knowledge base faster, and ease problems such as the complexity of prompt engineering.
Tencent Cloud Vector Database, for example, supports vector retrieval at a scale of up to 1 billion vectors with latency kept at the millisecond level, which is 10 times the retrieval scale of traditional plug-in-based database retrieval, and it offers peak throughput in the millions of queries per second (QPS). What does that mean in practice? Luo Yun, Deputy General Manager of Tencent Cloud Database, explained that if you want to find an image containing a dog among 1 billion images, Tencent Cloud Vector Database can handle 1 million such requests simultaneously, search across the 1 billion images, and keep the average latency within 100 milliseconds.
To better meet the needs of the large model field, Tencent Cloud has defined an AI Native development paradigm in this release, providing AI solutions across the access layer, computing layer, and storage layer. The result is that users can apply AI capabilities throughout the entire lifecycle of using a vector database. The direct benefit: connecting an enterprise to a large model used to take about a month, while with Tencent Cloud Vector Database it can take as little as 3 days, lowering the barrier to adoption for enterprises.
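To make the "vectorize, store, query" idea concrete, here is a minimal sketch of similarity search in plain Python. It is an illustration only, not Tencent Cloud's implementation or API: the names `cosine_similarity`, `top_k`, and the toy three-dimensional "embeddings" are all assumptions made for this example. Real systems use embeddings with hundreds or thousands of dimensions and approximate indexes rather than the exhaustive scan shown here.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def top_k(query, vectors, k=2):
    """Return the keys of the k stored vectors most similar to the query.

    This is a brute-force scan over every stored vector; it shows the
    query semantics, not how a production index achieves low latency.
    """
    scored = [(cosine_similarity(query, vec), key) for key, vec in vectors.items()]
    scored.sort(reverse=True)
    return [key for _, key in scored[:k]]


# Hypothetical toy embeddings: the first axis loosely means "dog-like".
images = {
    "dog_photo": [0.9, 0.1, 0.0],
    "puppy_photo": [0.8, 0.2, 0.1],
    "car_photo": [0.0, 0.1, 0.9],
}

result = top_k([1.0, 0.0, 0.0], images, k=2)
print(result)
```

The query vector stands in for the embedding of "an image with a dog"; the two dog-like entries rank ahead of the car image because their vectors point in nearly the same direction as the query.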
02 | Embracing Large Models Is a Hard Requirement for Enterprises
According to Tencent Cloud, vector databases can greatly improve efficiency and reduce costs by vectorizing data and then storing and querying it. They can address problems such as the high pre-training cost of large models, the lack of "long-term memory", slow knowledge updates, and complex prompt engineering, helping large models break through their time and space constraints and accelerating their adoption in industry scenarios. According to the reported figures, using Tencent Cloud Vector Database to classify, deduplicate, and clean large model pre-training data can improve efficiency 10-fold compared with traditional methods. If the vector database is used as an external knowledge base for model inference, the cost can be reduced by 2 to 4 orders of magnitude.
It is reported that at the access layer, Tencent Cloud Vector Database supports natural language text input and adopts a "scalar + vector" query method, supporting fully in-memory indexing and up to one million queries per second (QPS). At the computing layer, the AI Native development paradigm enables AI computing over the full data set, providing a one-stop solution to the challenges of text segmentation, vectorization, and embedding that enterprises face when building a private domain knowledge base. At the storage layer, Tencent Cloud Vector Database supports intelligent data storage and distribution, helping enterprises reduce storage costs by 50%.
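The article does not describe how the "scalar + vector" query method works internally. One common reading is that ordinary (scalar) fields narrow the candidate set and vector similarity then ranks what remains. The sketch below illustrates that reading only; the `Record` class, the `hybrid_search` function, and the sample knowledge base entries are hypothetical and are not part of any Tencent Cloud API.

```python
import math
from dataclasses import dataclass


@dataclass
class Record:
    doc_id: str
    department: str   # scalar field, used for exact-match filtering
    embedding: list   # vector field, used for similarity ranking


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hybrid_search(records, query_vec, department, k=1):
    """Filter on the scalar field first, then rank survivors by similarity."""
    candidates = [r for r in records if r.department == department]
    candidates.sort(key=lambda r: cosine(query_vec, r.embedding), reverse=True)
    return [r.doc_id for r in candidates[:k]]


# Hypothetical private-domain knowledge base entries with toy 2-D embeddings.
records = [
    Record("hr-leave-policy", "hr", [0.9, 0.1]),
    Record("hr-payroll-faq", "hr", [0.2, 0.8]),
    Record("it-vpn-guide", "it", [0.95, 0.05]),
]

hits = hybrid_search(records, [1.0, 0.0], department="hr", k=1)
print(hits)
```

In this toy data the IT document is the closest vector to the query, yet it is excluded because the scalar filter restricts results to the HR department. That is the practical point of combining the two kinds of condition in one query.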
03 | Jingtai's viewpoint
In terms of external factors, market demand is already very clear. Beyond recognizing the development trend in vector databases, Tencent Cloud believes that cloud vendors also have certain advantages in this area. Luo Yun stated that because domestic enterprises attach great importance to their data, they want stability and longevity when choosing data products and services. In China's to-B decision-making chain, therefore, public cloud vendors that provide their own corresponding technical services will be very competitive. It is predicted that by 2030 the global vector database market will reach 50 billion US dollars and the domestic vector database market will exceed 60 billion RMB.
Tencent Cloud's move is representative of cloud vendors' efforts in vector databases. In addition, vector database vendors, including Zilliz, are gradually updating and upgrading their products for large models, and some established database vendors (such as Oracle) are also launching AI-related offerings. The industry is still at a relatively early stage, and how it develops will depend on the specific moves of the various vendors. In short, driven by the large model trend, the vector database field is still heating up.