Nvidia has announced the Blackwell Ultra GB300, a new professional AI accelerator that surpasses the GB200 with higher core counts, greater memory capacity, and faster input/output. The platform is now in mass production.
The Blackwell Ultra GPU within the GB300 features 160 streaming multiprocessors (SMs), up from the GB200's 144, bringing the total CUDA core count to 20,480. Built on the Blackwell architecture and fabricated on TSMC's 4NP node, an enhanced iteration of the 4N node, the GB300 promises substantial performance gains.
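The quoted core count follows directly from the SM count. A quick sanity check, assuming the standard Blackwell configuration of 128 CUDA cores per SM (an assumption consistent with the figures in the announcement, not stated explicitly there):

```python
# Back-of-the-envelope check: total CUDA cores = SMs x cores per SM.
# 128 cores/SM is assumed from the standard Blackwell SM layout.
sms = 160
cores_per_sm = 128
total_cores = sms * cores_per_sm
print(total_cores)  # 20480, matching the quoted figure
```

The same arithmetic applied to the GB200's 144 SMs yields 18,432 cores, which frames the GB300's uplift at roughly 11% on raw core count alone.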
The GB300's SMs are equipped with fifth-generation Tensor Cores supporting the FP8, FP6, and NVFP4 formats. These enhancements, combined with increased Tensor memory, are expected to at least double performance in relevant calculations. Memory capacity also grows: eight 12-Hi HBM3E stacks provide 288 GB of high-bandwidth memory, a 50% increase over the GB200's 192 GB, enabling the platform to handle larger AI models and accelerate their deployment.
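The 288 GB figure follows from the stack configuration. A minimal sketch, assuming 3 GB (24 Gbit) HBM3E dies, which is the die density implied by the quoted totals rather than a detail stated in the announcement:

```python
# Capacity sketch: total HBM = stacks x dies per stack x GB per die.
# The 3 GB-per-die figure is an assumption inferred from the totals.
stacks = 8
dies_per_stack = 12  # "12-Hi" stack height
gb_per_die = 3
capacity_gb = stacks * dies_per_stack * gb_per_die
print(capacity_gb)  # 288
```

Under the same assumption, the GB200's 192 GB corresponds to 8-Hi stacks of the same dies (8 x 8 x 3), which is why the move to 12-Hi yields exactly a 1.5x capacity increase.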
The GB300 adopts the PCIe 6 interface, doubling the bandwidth of the previous generation's PCIe 5 design and reaching up to 256 GB/s. This performance comes at the cost of higher power consumption, however, with the GB300 platform potentially drawing up to 1,400 W at peak.
While the GB300 is entering mass production for global distribution, Nvidia has confirmed that it will not be shipped to China due to export restrictions. A less powerful GB30 variant may be made available to China in the future, although this remains uncertain due to ongoing concerns about Nvidia hardware tracking.