The GPU market is a busy one. Playing around with GPUs and graphics cards has become a favorite pastime for digital enthusiasts: "ultra-low power consumption, extreme color, tessellation", "mining cards being sold off", "one card powers a building, two cards destroy the earth, three cards shatter the galaxy, four cards remake Genesis". The GPU has at times challenged and even surpassed the CPU of the same era, driven countless gamers crazy, and extended its tentacles into ever deeper and wider fields. And because foreign manufacturers have monopolized the market for so long, domestic expectations for independently developed GPUs keep growing stronger.
Concepts that are easily confused
GPU (Graphics Processing Unit, graphics processor), also known as display core, visual processor, or display chip, is a microprocessor designed for parallel processing and is very good at handling a large number of simple tasks, including graphics and video rendering. GPUs can be used in desktops, laptops, workstations, game consoles, embedded devices, data centers, and other scenarios that require rendering graphics or high-performance computing.
In everyday speech we usually call the GPU a graphics card, but strictly speaking the two terms differ slightly: the GPU is the chip that does the processing, while the graphics card is the board that integrates the GPU chip, video memory, interfaces, and other components.
Based on how it connects to the system, a GPU is one of two types: the integrated GPU (iGPU) or the discrete GPU (dGPU). Each type has its own characteristics and usage scenarios.
Two classifications of GPU (table: Nutshell Technology)
In an integrated GPU, the GPU sits alongside the CPU on the same chip, has no dedicated memory for graphics or video, and shares system memory with the CPU. Because it is built into the processor, an integrated GPU typically consumes less power, generates less heat, and helps extend battery life.
A discrete GPU comes as a separate board, usually plugged into a PCI Express slot, much as the CPU is plugged into the motherboard. Besides the GPU chip itself, the board carries the components the GPU needs to operate and to connect to the rest of the system. Discrete GPUs have their own dedicated memory and their own power delivery, so they perform better than integrated GPUs; but because they are separate from the processor, they consume more power and generate a lot more heat.
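To see the distinction in practice, here is a minimal sketch (assuming an NVIDIA GPU and an installed CUDA toolkit; the file name query_gpus.cu is only illustrative) that asks the CUDA runtime about every visible device and reports whether it is flagged as integrated or discrete, along with its memory size and streaming-multiprocessor count.

```cuda
// Minimal device-query sketch: list each visible GPU and whether the runtime
// marks it as integrated (shares system memory) or discrete (dedicated memory).
// Build (assumption): nvcc query_gpus.cu -o query_gpus
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int count = 0;
    if (cudaGetDeviceCount(&count) != cudaSuccess || count == 0) {
        printf("No CUDA-capable GPU found.\n");
        return 0;
    }
    for (int i = 0; i < count; ++i) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, i);
        printf("GPU %d: %s | %s | %.1f GB memory | %d SMs\n",
               i, prop.name,
               prop.integrated ? "integrated (shares system memory)"
                               : "discrete (dedicated memory)",
               prop.totalGlobalMem / (1024.0 * 1024.0 * 1024.0),
               prop.multiProcessorCount);
    }
    return 0;
}
```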
From dedicated to general to fusion
Modern GPUs have two main functions, one as a powerful graphics engine, and the other as a highly parallel programmable processor for various neural network or machine learning tasks.
Graphics computing is what GPUs are best at. When we drag the mouse, the GPU computes the content that needs to be displayed and presents it on the screen; when we open a player to watch a movie, the GPU decodes compressed video into raw frames; when we play a game, the GPU computes and generates every frame of the picture. Behind a single mouse click lies a complex pipeline: vertex fetching, vertex shading, primitive assembly, rasterization, pixel shading, and so on.
Graphics GPUs are widely used in gaming, image processing, cryptocurrency mining, and other scenarios, and are judged on metrics such as frame rate, rendering fidelity, and realistic scene reproduction.
Hardware acceleration of the different stages of the pipeline defined by the graphics API (table: Nutshell Technology; reference: "Computer Architecture Fundamentals")
General-purpose computing is where the advantage of GPU parallelism shows best. Scientists and engineers found that as long as data can be expressed in a graphics-like form, adding some general-purpose computing capability to the GPU makes it competent for all kinds of high-performance computing tasks; this is what the industry calls the general-purpose GPU (GPGPU, General-Purpose Graphics Processing Unit). A general-purpose GPU is essentially still a GPU, but one tailored and tuned for high-performance computing, AI development, and many other breakthroughs, enabling larger training sets, shorter training times, and less power and infrastructure for classification, prediction, and inference.
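To make "a large number of simple tasks in parallel" concrete, here is a minimal, hedged CUDA sketch (assuming an NVIDIA GPU and the CUDA toolkit; the kernel and file names are made up for illustration): a million element-wise additions are split across thousands of lightweight threads, each doing one trivial piece of work.

```cuda
// Minimal GPGPU sketch: each GPU thread adds one pair of numbers.
// Build (assumption): nvcc vector_add.cu -o vector_add
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

__global__ void vectorAdd(const float* a, const float* b, float* c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // global thread index
    if (i < n) c[i] = a[i] + b[i];                  // one element per thread
}

int main() {
    const int n = 1 << 20;                          // about one million elements
    const size_t bytes = n * sizeof(float);

    float* ha = (float*)malloc(bytes);
    float* hb = (float*)malloc(bytes);
    float* hc = (float*)malloc(bytes);
    for (int i = 0; i < n; ++i) { ha[i] = 1.0f; hb[i] = 2.0f; }

    float *da, *db, *dc;
    cudaMalloc(&da, bytes); cudaMalloc(&db, bytes); cudaMalloc(&dc, bytes);
    cudaMemcpy(da, ha, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(db, hb, bytes, cudaMemcpyHostToDevice);

    const int threads = 256;
    const int blocks = (n + threads - 1) / threads; // enough blocks to cover all elements
    vectorAdd<<<blocks, threads>>>(da, db, dc, n);
    cudaMemcpy(hc, dc, bytes, cudaMemcpyDeviceToHost);

    printf("c[0] = %.1f (expected 3.0)\n", hc[0]);
    cudaFree(da); cudaFree(db); cudaFree(dc);
    free(ha); free(hb); free(hc);
    return 0;
}
```

The same map-one-element-to-one-thread pattern underlies most GPGPU workloads, from physics simulation to the deep-learning operators discussed later.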
General-purpose GPUs are mainly used in large-scale artificial intelligence computing, data centers, and supercomputing scenarios to support larger data volumes and concurrent throughput. Behind the two functions is a long history of development.
In 1962, Ivan Sutherland's paper "Sketchpad: A Man-Machine Graphical Communication System" and his Sketchpad demo video laid the foundation of modern computer graphics. For the next twenty years, limited by precision and processing power, the graphics card merely translated graphics computed by the CPU into display signals, so it could only be called a graphics adapter. It was not until IBM launched two 2D graphics cards, the MDA and the CGA, in 1981 that the industry began to take shape. Although the two products were mere ugly ducklings, they marked the start of the GPU's journey to rival the CPU.
The 1990s saw the rise of 3D graphics acceleration. After the advent of Voodoo, the first true 3D graphics accelerator card, S3 launched the S3 ViRGE, the first graphics card to combine 2D and 3D processing. The industry then began to blossom, and NVIDIA gradually emerged: the NV1, Matrox's Millennium and Mystique, PowerVR's PCX1, and other excellent products briefly presented a scene of a hundred schools of thought contending. After the prosperity came brutal mergers, acquisitions, and consolidation, leaving a landscape dominated by NVIDIA and AMD. Since then the GPU has been on a path of leapfrog iteration.
Development history of the discrete graphics card (table: Nutshell Technology; references: IEEE Computer Society, NVIDIA official website, public information)
The versatility of GPUs revealed itself gradually through these iterations. From the 1990s to the early 21st century, in order to handle more complex and larger-scale graphics workloads, the GPU moved away from the fixed-function graphics pipeline: the vertex, geometry, pixel, and sub-pixel processors in the pipeline became increasingly programmable, exhibiting general-purpose computing capability. Later, to solve on-chip load-balancing problems, unified shader processors replaced the various dedicated programmable units, and the adoption of stream processors (a computing model that fully accounts for concurrency and communication over streams of data) laid the groundwork for GPU general-purpose computing.
The rapid growth of GPU programmability and computing power attracted a large number of research groups, which raced to map computationally complex problems onto the GPU and positioned it as the alternative to traditional microprocessors in future high-performance computer systems. The Tesla architecture developed by NVIDIA officially marked the GPU's turn toward general-purpose computing, laying the foundation for its later widespread use in deep learning.
The GPU's road from graphics display to general-purpose computing
Two major functions and applications of the GPU
Useful as the GPU is, it remains inseparable from the CPU. On the one hand, the GPU cannot work alone and relies on the CPU to control and schedule it; on the other hand, the two have very different architectures, because they were built for different purposes.
A CPU contains 4, 8, 16, or even more than 32 powerful cores, and each core packs in an arithmetic logic unit (ALU), floating-point unit (FPU), address generation unit (AGU), memory management unit (MMU), and nearly every other function. Roughly speaking, compute units (ALUs) make up about 25% of a CPU, control logic another 25%, and cache about 50%; in a GPU, by contrast, compute units typically reach 95% and cache about 5%.
Originally, the GPU was specialized hardware designed to help the CPU accelerate graphics processing. Graphics rendering is extremely parallel, demanding very intensive computation and enormous data-transfer bandwidth, so GPUs were designed with thousands of smaller cores. Each core can only perform simple calculations and is not very intelligent on its own, but unlike the CPU, where "one core struggles while the others look on", the GPU can put all of its cores to work at once on deep-learning computations such as convolution, ReLU, and pooling. In addition, the GPU adopts a flexible memory-hierarchy design and a two-level programming and compilation model.
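To illustrate "many simple cores, one trivial task each", the sketch below (again illustrative, assuming the CUDA toolkit; the kernel name is invented) applies ReLU to every element of an array in parallel, one thread per element.

```cuda
// Illustrative sketch: element-wise ReLU, the kind of trivially parallel work
// at which thousands of simple GPU cores excel (one thread per element).
#include <cstdio>
#include <cuda_runtime.h>

__global__ void relu(float* x, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] = x[i] > 0.0f ? x[i] : 0.0f;   // each thread clamps one value
}

int main() {
    const int n = 8;
    float* x;
    cudaMallocManaged(&x, n * sizeof(float));      // unified memory keeps the example short
    for (int i = 0; i < n; ++i) x[i] = i - 4.0f;   // {-4, -3, ..., 3}

    relu<<<1, 256>>>(x, n);                        // more threads than elements; extras exit early
    cudaDeviceSynchronize();

    for (int i = 0; i < n; ++i) printf("%.0f ", x[i]);   // expected: 0 0 0 0 0 1 2 3
    printf("\n");
    cudaFree(x);
    return 0;
}
```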
Differences between GPU and CPU
Different structural designs give each its own specialty. A GPU's clock frequency is only about a third of a CPU's, but in each clock cycle it can perform nearly a hundred times more computations in parallel. On massively parallel tasks the GPU is far faster than the CPU; on tasks with little parallelism it is far slower. In addition, GPUs usually have 5 to 10 times the memory bandwidth of CPUs but higher latency when accessing data, which makes GPUs better at predictable computation and worse at unpredictable computation.
It is clear, then, that the CPU and GPU are complementary rather than conflicting: the former focuses on serial operations, the latter on parallel ones. Think of the CPU as a PhD, knowledgeable and able to dig deep into problems no one else can solve, and the GPU as tens of thousands of junior-high students who can only do simple arithmetic; yet however brilliant the PhD, he cannot finish tens of thousands of simple arithmetic problems in an instant.
Looking back over the brief history of computing, a variety of digital chips have been born, each with its own long path of development. Behind every computer lies a computational problem, and these boil down to scalar, vector, matrix, and spatial data types, so the GPU inevitably intersects and overlaps with other digital chips. Today the CPU is still the CPU, but the GPU may no longer be just a GPU.
For a long time, GPU, FPGA, and ASIC have been locked in controversy. Each can form a heterogeneous computing system with the CPU: "CPU+GPU", "CPU+FPGA", or "CPU+ASIC". FPGA and ASIC vendors often benchmark their products head-to-head against GPU computing power; the NVIDIA A100, for example, has become a common "unit of combat power", and everyone competing for the CPU's workload is touting its own advantages.
Rationally speaking, GPU, FPGA, and ASIC are all capable computing partners for the CPU. For manufacturers and downstream users, the three have completely different characteristics: although one may show stronger computing power or lower power consumption in certain application scenarios, deployment inevitably requires weighing TCO (total cost of ownership), deployment difficulty, system compatibility, and more. It is hard to judge which is stronger or weaker.
Comparison of different computing devices (table: Nutshell Technology)
GPUs, however, are relatively mature products with excellent peak computing power and an unshakable position in graphics display, so it was only natural for them to ride the semiconductor boom and become the market's darling.
The data shows that in the AI training stage, GPU accounts for about 64% of the market, while FPGA and ASIC account for 22% and 14% respectively; in the inference stage, GPU accounts for about 42% of the market, while FPGA and ASIC account for 34% and 24% respectively.
Performance requirements and specific indicators of AI chips in different application scenarios
A pattern of foreign monopoly
The GPU is not only a huge business today but also holds unlimited potential. According to Verified Market Research, the GPU market will grow from $33 billion in 2021 to $477.3 billion in 2030, a compound annual growth rate of 33.3%.
GPUs are built to various specifications according to each platform's power budget: typical GPU power consumption is about 5 W in a mobile phone and 150 W in a notebook, a desktop GPU can reach 400 W, and data-center parts push for maximum performance. By power consumption, the market divides mainly into desktop and mobile applications.
Both markets are three-way oligopolies: the desktop GPU market is dominated by NVIDIA, AMD, and Intel, and the mobile GPU market by Arm, Imagination, and Qualcomm. At the software level, these foreign companies also provide the support behind heterogeneous-computing standards such as CUDA and OpenCL.
Among desktop-class products, graphics cards for PCs and gaming account for the majority of the market, with a share of more than 50%; data centers make up much of the rest.
According to Jon Peddie Research (JPR), GPU shipments for PCs (integrated plus discrete) reached 84 million units in Q2 2022. Intel's GPU market share was as high as 68%, mainly because its desktop and notebook CPUs integrate huge numbers of iGPUs. AMD ranked second with 17%; it sells both integrated and discrete graphics, but integrated clearly dominates, with its discrete cards taking only about 3% of the overall PC market. NVIDIA focuses mainly on discrete graphics, so although it appears to hold only a 15% share, it essentially dominates the discrete market.
GPU shipments in the PC market, Q2 2022
NVIDIA is the world's absolute leader in discrete GPUs. In its early days it focused on PC graphics processing, then rode the general-purpose GPU wave to expand into smart terminals, autonomous driving, AI algorithms, and other areas. Judging from its Q2 2022 financial report, NVIDIA's main businesses are gaming GPUs, data-center GPUs, professional-visualization GPUs, intelligent-driving GPUs, and OEM and other business, accounting for roughly 30.5%, 56.8%, 7.4%, 3.3%, and 2% of revenue respectively.
To stay ahead of the competition, the architecture of each generation of NVIDIA graphics cards has changed considerably. Statistics across NVIDIA's successive architectures show that the two components at the core of its performance gains, the streaming multiprocessor (SM) and the cache, have undergone major redesigns: within the chip's limited area and power budget, NVIDIA keeps adjusting the ratio of the various components and seeks the optimal configuration through process iteration.
NVIDIA architecture changes
NVIDIA coined the GPU concept, and almost every one of its products sparks large-scale discussion among gamers and designers. The 40 series in particular uses the new Ada Lovelace architecture on TSMC's custom 4N process, delivering shader performance of up to 83 TFLOPS, effective ray-tracing performance of 191 TFLOPS (2.8 times the previous generation), and fourth-generation Tensor Cores whose FP8 performance reaches 1.32 PFLOPS, 5 times the previous generation.
Summary of NVIDIA 30-series and 40-series graphics cards (table: Nutshell Hard Technology)
At the same time, NVIDIA is the chief advocate of GPUs in the data center: in 2006 it released not only the industry's first general-purpose GPU product but also the CUDA parallel programming model. The hardware-plus-software base of general-purpose GPU and CUDA is the foundation of NVIDIA's leadership in AI computing.
The past few months, however, have not been easy for NVIDIA. Hit by continuously falling demand across the semiconductor industry, its financial results collapsed and its share price plunged, and the newly released 40-series cards proved so controversial that Jensen Huang cancelled the RTX 4080 12GB version.
AMD's GPUs compete primarily on cost-effectiveness. In discrete GPUs, its products are generally priced about 30% below comparable NVIDIA cards; in integrated GPUs, its APUs with built-in graphics are cheaper than Intel CPUs with built-in graphics.
On the integrated-graphics side, Tom's Hardware test data show that the integrated GPUs in AMD's Ryzen series excel in many games.
Performance comparison of integrated graphics
On the discrete side, AMD has always been NVIDIA's chaser. In floating-point computing power there is a certain gap with NVIDIA; in real-world performance the two are roughly on equal footing. As to whether the N card (NVIDIA) or the A card (AMD) is stronger, no one can yet give a verdict.
Performance comparison of discrete graphics cards
In most people's minds, Intel and GPUs seem to have nothing to do with each other, yet Intel is in fact the real leader in GPU shipments: its CPUs account for nearly 70% of the global PC market (notebooks, desktops, and servers), so its integrated graphics have reached thousands of industries.
Global PC graphics processing unit (GPU) shipment share by vendor, Q2 2009 to Q1 2022
Yet even a company as strong as Intel has repeatedly stumbled on discrete GPUs.
Intel is by no means a novice or amateur in GPUs. The company has some of the best GPU engineers in the industry, the best fabs, a bank account others can only dream of, and a brand known around the world. It even holds the title of the world's largest GPU seller, shipping more units than all of its competitors combined. For any other company such achievements would be more than satisfying, but Intel's repeated disappointments with discrete GPUs over the past 20 years have left it uncomfortable.
In 1998, Intel released the i740. Its 3D performance was decent but merely passable next to products from ATI, NVIDIA, S3 Graphics, and others, and Intel had no choice but to temporarily abandon the discrete-graphics road.
Intel did not give up its discrete-GPU dream: in 2009 it planned to build the Larrabee graphics processor. The GPUs of that era were essentially combinations of simple small compute cores, and Intel happened to hold the P54C, the original Pentium-generation core. Integrating this old Pentium core into a graphics card sounded easy, but the Larrabee project evidently caused Intel plenty of trouble: after countless delays and reports of insufficient research funding, the plan was finally declared a failure. Even so, Intel built its many-integrated-core (MIC) Xeon Phi coprocessor on Larrabee's research, and it was adopted by the Tianhe-2 supercomputer, so the effort was not entirely wasted.
In 2020, Intel rose from the ashes again, betting its discrete-graphics hopes on the new Xe architecture. In 2022 it launched the Intel Arc series of graphics cards, covering mobile, desktop, workstation, and data-center segments. Whether Intel succeeds this time will depend on market feedback.
The story of mobile-grade products is not as colorful as that of desktop GPUs. On mobile phones, tablets, and wearables in particular, the GPU is tightly bound to the SoC architecture; IP suppliers such as Arm, Imagination, and Qualcomm (Adreno) each have their own loyal followings, and the pattern may be hard to change.
From a product perspective, most of the GPU IP used in MediaTek's and Samsung's mobile SoCs comes from Arm; Apple's and Qualcomm's GPU IP is developed in-house (Apple's GPUs owe much of their lineage to Imagination); UNISOC's mobile SoCs use Imagination's GPU IP.
Smartphone and tablet GPU benchmark ranking
What are the opportunities for domestic GPUs?
"The price of NVIDIA's data center GPU is astonishingly expensive, and it cannot be replaced by domestic products." Economic Observer previously quoted practitioners as saying that the price of NVIDIA A100 GPU is about $3,000, and there is no replacement, and in June this year, Nvidia announced a 20% price increase for the A100 80G GPU chip.
The industry has long suffered under this monopoly. In the past two years a wave of GPU financing has swept China, with project after project getting funded.
Since 2020, total financing in the GPU industry has exceeded 20 billion yuan; from 2020 to 2021 alone there were nearly 20 financing events in the general-purpose GPU field, and these companies are mainly chasing the desktop discrete-graphics market. According to Verified Market Research, mainland China's discrete GPU market was worth $4.739 billion in 2020 and is expected to exceed $34.557 billion by 2027.
Why do domestic startups love discrete graphics cards so much? On the one hand, integrated GPUs are tightly bound to the CPU and are basically designed and produced by the CPU makers themselves, such as Intel's and AMD's integrated graphics or the self-developed GPU integrated in domestic CPU maker Loongson's 7A2000, and integrated graphics mostly serve as display-only or light-load daily-use parts. On the other hand, the discrete graphics card is a high-performance track, not only technologically ahead of integrated graphics but also far wider in application.
At present, funded startups such as Xintong Semiconductor, Innosilicon, Moore Threads, Tianshu Zhixin, and Biren Technology have all launched products one after another, and some have even made it into complete systems. Listed companies including Loongson Zhongke, Haiguang Information, Cambricon, and VeriSilicon have also kept working on GPU businesses (covering both integrated and discrete graphics).
Overall, though, domestic GPU products are still in their infancy: application scenarios are lacking, product performance still trails NVIDIA and AMD, and their software and ecosystems can hardly compete. Even with few obvious advantages, international force-majeure factors mean that domestic substitution has to be considered.
Domestic GPU financing and listing situation (table: Nutshell Technology; references: "Science and Technology Innovation Board Daily", Capital Stock)