NVIDIA HAS REVEALED details of its next-generation GPU architecture, codenamed Pascal.
The upcoming GPU technology will offer roughly 10 times the performance of Maxwell-based GPUs in "deep learning" applications.
The chip is aimed at next-generation supercomputers, workstations, gaming PCs and cloud supercomputers. It is built around an interconnect technology called NVLink, and uses 3D memory to increase the bandwidth between the GPU and the memory subsystem. Pascal fits in a module that's a third the size of a PCIe card.
NVLink is a chip-to-chip communications interface. Its programming model is basically PCI Express with enhanced DMA capability, so software can adopt the interface very easily. It lets programmers bind memory between the CPU and GPU and between one GPU and another, and in its second generation it adds cache coherency between the GPU and CPU caches.
One of the benefits of parallel computing is being able to take many GPUs, run them in parallel and treat them as one massive GPU, provided there is enough bandwidth to communicate from GPU to GPU.
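To see why GPU-to-GPU bandwidth decides whether several GPUs really behave like one big one, here is a minimal sketch of a toy scaling model. It assumes compute time divides evenly across GPUs and that each run pays a fixed data-exchange cost over the link. The function names and every number in it are hypothetical illustrations, not Nvidia figures or NVLink specifications.

```python
def multi_gpu_time(compute_s, exchange_bytes, link_bytes_per_s, n_gpus):
    """Estimated wall time in seconds when work is split across n_gpus.

    Toy model (an assumption, not a measurement): compute divides evenly,
    and any multi-GPU run pays exchange_bytes / link_bytes_per_s to
    move data between GPUs.
    """
    if n_gpus < 1:
        raise ValueError("n_gpus must be at least 1")
    compute = compute_s / n_gpus
    # A single GPU has nothing to exchange with, so no link cost.
    comm = 0.0 if n_gpus == 1 else exchange_bytes / link_bytes_per_s
    return compute + comm


def speedup(compute_s, exchange_bytes, link_bytes_per_s, n_gpus):
    """Speedup over one GPU under the toy model above."""
    return compute_s / multi_gpu_time(
        compute_s, exchange_bytes, link_bytes_per_s, n_gpus
    )


if __name__ == "__main__":
    # Hypothetical workload: 8 s of compute, 16 GB exchanged per run.
    for link in (16e9, 80e9):  # made-up link speeds in bytes per second
        s = speedup(8.0, 16e9, link, n_gpus=4)
        print(f"link {link / 1e9:.0f} GB/s: speedup on 4 GPUs = {s:.2f}x")
```

In this model the slower link holds four GPUs well short of a fourfold gain, while the faster link moves them closer to it, which is the point of adding a dedicated GPU-to-GPU interconnect.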
For the first time, Nvidia is building heterogeneous chips on top of other chips on a single wafer. The approach starts with a base wafer that carries the interconnections among the chips. Instead of routing signals out to a PC board, Nvidia does the inter-chip routing on the wafer itself.
"Thousands of little bumps characterized by that vertical signal on these chips are flipped and bumped onto the base wafer," Huang explained. "The interface went from hundreds of bits to thousands of bits. You can just imagine a GPU with thousands and thousands of memory interfaced bits."
Nvidia stacked the memory dies one on top of another and punched holes through them to carry the vertical connections. The whole DRAM stack sits on a base wafer whose incredibly small wires connect it to the GPU.
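The move from hundreds to thousands of interface bits matters because peak memory bandwidth scales with bus width. A rough sketch of the arithmetic follows, assuming the usual relation of bus width times per-pin data rate divided by eight bits per byte. The function name and the example widths and pin rates are illustrative assumptions, not Pascal specifications.

```python
def peak_bandwidth_gb_s(bus_width_bits, gbit_per_s_per_pin):
    """Theoretical peak memory bandwidth in gigabytes per second.

    bus_width_bits: number of data bits transferred in parallel.
    gbit_per_s_per_pin: data rate of each bit line in Gbit/s.
    """
    if bus_width_bits <= 0 or gbit_per_s_per_pin <= 0:
        raise ValueError("width and pin rate must be positive")
    # Total Gbit/s across the bus, converted to bytes.
    return bus_width_bits * gbit_per_s_per_pin / 8


if __name__ == "__main__":
    # Hypothetical comparison: a narrow, fast bus against a wide, slow one.
    narrow = peak_bandwidth_gb_s(384, 7.0)
    wide = peak_bandwidth_gb_s(4096, 1.0)
    print(f"384-bit bus at 7 Gbit/s per pin:  {narrow} GB/s")
    print(f"4096-bit bus at 1 Gbit/s per pin: {wide} GB/s")
```

With these made-up inputs the narrow bus works out to 336 GB/s and the wide one to 512 GB/s, so a much wider interface can deliver more bandwidth even when each individual wire runs far slower.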
Nvidia's roadmap schedules Pascal for release some time in 2016.