
AMD Shares New Second-Gen 3D V-Cache Chiplet Details, Up to 2.5 TB/s
AMD's Ryzen 9 7950X3D is the fastest gaming CPU on the planet thanks to AMD's decision to bring its disruptive 3D chip-stacking technology to Zen 4, but curiously, the company didn't share any details about the new second-gen 3D V-Cache in its Ryzen 7000X3D briefing materials. We initially found some details at a recent tech conference and included them in our review, and now AMD has finally answered several of our follow-up questions. The chiplet remains on the 7nm process and now has a peak bandwidth of up to 2.5 TB/s, whereas first-gen 3D V-Cache peaked at 2 TB/s (among plenty of other new information). We also have new pictures and diagrams of the new 6nm I/O Die that AMD uses for its Ryzen 7000 processors.
AMD has moved on to the second generation of 3D V-Cache, and Intel has no competing technology. That gives AMD a win in both the best CPUs for gaming and certain datacenter applications. Overall, AMD's second-gen 3D V-Cache technology is an impressive step forward over the first generation because it lets the company use its now-mature and less expensive 7nm process node to boost the performance of its cutting-edge 5nm compute die. The new design takes the key advantage of chiplet-based design methodologies, pairing an older and cheaper process node with expensive new process technology, into the third dimension. Now for the finer details.
First, a quick high-level refresher. As you can see above, AMD's 3D V-Cache technology stacks an additional L3 SRAM chiplet directly in the center of the compute die (CCD) to isolate it from the heat-generating cores. This cache increases capacity to 96MB for the 3D V-Cache-equipped chiplet, improving the performance of latency-sensitive applications such as gaming.
We received new information about the second-gen implementation both directly from AMD and from the 2023 International Solid-State Circuits Conference (ISSCC), where AMD gave a presentation on the Zen 4 architecture.
AMD's previous-gen 3D V-Cache used a 7nm L3 SRAM chiplet stacked on a 7nm Zen 3 CCD. AMD has stuck with the 7nm process for its new L3 SRAM chiplet, but now places it on top of a smaller 5nm Zen 4 CCD (see the table below). However, this creates a size mismatch that required several changes.
Specification | 2nd-Gen 7nm 3D V-Cache Die | 1st-Gen 7nm 3D V-Cache Die | 5nm Zen 4 Core Complex Die (CCD) | 7nm Zen 3 Core Complex Die (CCD) |
Size | 36mm^2 | 41mm^2 | 66.3mm^2 | 80.7mm^2 |
Transistor Count | ~4.7 Billion | 4.7 Billion | 6.57 Billion | 4.15 Billion |
MTr/mm^2 (Transistor Density) | ~130.6 Million | ~114.6 Million | ~99 Million | ~51.4 Million |
First, AMD has shrunk the 7nm SRAM die, so it now measures 36mm^2 compared to the previous generation's 41mm^2. However, the total transistor count remains the same at ~4.7 billion, so the new die is significantly denser than the first-gen chiplet.
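The density figures in the table follow directly from die area and transistor count. As a minimal sketch, the arithmetic can be reproduced as below; the dictionary and function names are illustrative, and the inputs are simply the rounded values quoted in the table, so results may differ from the table in the last digit.

```python
# Die figures as quoted in the article's table:
# (area in mm^2, transistor count in billions).
DIES = {
    "2nd-gen 7nm 3D V-Cache": (36.0, 4.7),
    "1st-gen 7nm 3D V-Cache": (41.0, 4.7),
    "5nm Zen 4 CCD": (66.3, 6.57),
    "7nm Zen 3 CCD": (80.7, 4.15),
}


def density_mtr_per_mm2(area_mm2: float, transistors_billion: float) -> float:
    """Return transistor density in millions of transistors per mm^2."""
    return transistors_billion * 1000.0 / area_mm2


if __name__ == "__main__":
    for name, (area, count) in DIES.items():
        print(f"{name}: {density_mtr_per_mm2(area, count):.1f} MTr/mm^2")
```

Because the transistor count is unchanged while the area drops from 41mm^2 to 36mm^2, the density gain of the new cache die comes entirely from the smaller footprint.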
As we saw with the first-gen SRAM chiplet, this is incredible transistor density for a 7nm chip. We're looking at more than 2.5 times the density of the first-gen 7nm compute chiplet (~130.6 versus ~51.4 MTr/mm^2), and surprisingly, the 7nm SRAM chiplet is significantly denser than the 5nm compute chiplet. That's because, as before, the chiplet uses a density-optimized version of 7nm that is specialized for SRAM. It also lacks the typical control circuitry found in a cache; that circuitry resides on the base die, which also helps reduce latency overhead. In contrast, the 5nm die contains several types of transistors along with data paths and other structures not found in the simplified L3 SRAM chiplet.
As before, the extra latency from the additional L3 SRAM cache weighs in at 4 clocks, but the bandwidth between the L3 chiplet and the base die has increased to 2.5 TB/s, a 25% improvement over the previous 2 TB/s peak.
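The bandwidth and latency figures can be sanity-checked with a few lines of arithmetic. The sketch below confirms the 25% uplift and converts the 4-clock penalty to nanoseconds; the 5.0 GHz clock is a hypothetical value chosen only for illustration, not a figure AMD has given for the cache interface.

```python
FIRST_GEN_TBPS = 2.0          # first-gen peak bandwidth, from the article
SECOND_GEN_TBPS = 2.5         # second-gen peak bandwidth, from the article
EXTRA_L3_LATENCY_CYCLES = 4   # added latency in clocks, from the article
ASSUMED_CLOCK_GHZ = 5.0       # hypothetical clock, for illustration only


def percent_increase(old: float, new: float) -> float:
    """Relative increase of new over old, in percent."""
    return (new - old) / old * 100.0


def cycles_to_ns(cycles: int, clock_ghz: float) -> float:
    """Convert a cycle count to nanoseconds at the given clock frequency."""
    return cycles / clock_ghz


if __name__ == "__main__":
    uplift = percent_increase(FIRST_GEN_TBPS, SECOND_GEN_TBPS)
    penalty = cycles_to_ns(EXTRA_L3_LATENCY_CYCLES, ASSUMED_CLOCK_GHZ)
    print(f"Bandwidth uplift: {uplift:.0f}%")
    print(f"Latency penalty at {ASSUMED_CLOCK_GHZ} GHz: {penalty:.2f} ns")
```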
The stacked L3 SRAM chiplet is connected to the base die with two types of through-silicon vias (TSVs, which are vertical electrical connections). Power TSVs carry power between the chiplets, while signal TSVs carry data between the units.
In the first-gen design, both types of TSVs resided in the L3 region of the base chiplet. However, the L3 cache on the base die is now smaller due to the increased density of the 5nm process, and even though the 7nm L3 SRAM chiplet is smaller, it now overlaps the L2 cache (the previous generation only overlapped the L3 on the base die). Because of this, AMD had to change the TSV connections on both the base die and the L3 SRAM chiplet.
Because of the increased density of the 5nm L3 cache on the base die, AMD had to move the power TSVs from the L3 to the L2 region. For the base die, AMD achieved 0.68x effective area scaling across the L3 cache, data paths, and control logic compared to the old 7nm base chiplet, so there is physically less room for TSVs in the L3 cache.
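A quick comparison of the shrink factors shows why the stacked die now spills past the L3 region. This is a rough sketch that uses only the areas and the 0.68x scaling figure quoted in the article, and it assumes that figure can be applied to the L3 region's footprint as a whole; the variable names are illustrative.

```python
GEN1_CACHE_MM2 = 41.0     # first-gen stacked SRAM die, from the article
GEN2_CACHE_MM2 = 36.0     # second-gen stacked SRAM die, from the article
ZEN3_CCD_MM2 = 80.7       # 7nm base die, from the article
ZEN4_CCD_MM2 = 66.3       # 5nm base die, from the article
L3_REGION_SCALING = 0.68  # effective L3-region area scaling, from the article


def area_ratio(new_mm2: float, old_mm2: float) -> float:
    """Return how large the new area is relative to the old one."""
    return new_mm2 / old_mm2


# The stacked die shrinks far less than the L3 region it sits on.
cache_shrink = area_ratio(GEN2_CACHE_MM2, GEN1_CACHE_MM2)
ccd_shrink = area_ratio(ZEN4_CCD_MM2, ZEN3_CCD_MM2)

# Change in (stacked die area / base-die L3 region area) between generations.
relative_coverage = cache_shrink / L3_REGION_SCALING

if __name__ == "__main__":
    print(f"Stacked SRAM die shrink: {cache_shrink:.2f}x")
    print(f"Whole CCD shrink:        {ccd_shrink:.2f}x")
    print(f"L3 region shrink:        {L3_REGION_SCALING:.2f}x")
    print(f"Stacked die vs. L3 region, relative to gen 1: {relative_coverage:.2f}x")
```

Under that assumption, the stacked die covers roughly 1.29 times as much area relative to the base-die L3 region as it did in the first generation, which is consistent with it now overlapping the L2 cache.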
The signal TSVs remain within the L3 cache area on the base die, but moving the power TSVs to the L2 region helped cut the TSV area in the L3 cache by 50%. It's unclear how much of that L3 TSV density improvement comes from removing the power TSVs, but mixing power and signal TSVs can create signal integrity issues that are typically countered by increasing the spacing between TSVs. Splitting the two types of TSVs into separate regions may therefore provide an additional benefit by allowing AMD to pack the signal TSVs more closely together.
AMD's 3D chip-stacking technology is based on TSMC's SoIC technology. TSMC's SoIC is bump-less, meaning it doesn't use microbumps or solder to connect the two dies. We went into in-depth details of this technology in our Ryzen 7 5800X3D review, and you can learn much more about the hybrid bonding and manufacturing process here.
AMD says it uses the same bonding process for the new chiplet, though paired with process and design-technology co-optimization (DTCO) improvements, and the minimum TSV pitch hasn't changed. AMD also applied what it learned from the first-gen design to help reduce control circuit overhead in the new design.
Tom's Hardware Measurements | Single-Threaded Peak | Multi-Threaded Sustained | Voltage (Peak) | nT Power |
CCD 0 (3D V-Cache) | 5.25GHz | 4.85GHz | 1.152V | 86W |
CCD 1 (No extra cache) | 5.75GHz | 5.3GHz | 1.384V | 140W |
The L3 SRAM chiplet also remains on the same power domain as the CPU cores, so the two can't be tuned independently. This contributes to the lower frequency on the cache-equipped chiplet because the voltage can't exceed ~1.15V. You can see our in-depth testing of the two different chiplet types here.
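To put the measurements above in relative terms, the sketch below computes how far the cache-equipped CCD trails the standard CCD on each metric. The values are taken from the measurement table, with the second peak voltage read as 1.384 V; the dictionary keys and helper name are illustrative.

```python
# Measurements from the table above.
CCD0_VCACHE = {"st_peak_ghz": 5.25, "mt_sustained_ghz": 4.85,
               "peak_v": 1.152, "nt_power_w": 86.0}
CCD1_PLAIN = {"st_peak_ghz": 5.75, "mt_sustained_ghz": 5.3,
              "peak_v": 1.384, "nt_power_w": 140.0}


def percent_lower(value: float, baseline: float) -> float:
    """Return how much lower value is than baseline, in percent."""
    return (baseline - value) / baseline * 100.0


# Deficit of the 3D V-Cache CCD relative to the standard CCD, per metric.
DEFICITS = {
    key: percent_lower(CCD0_VCACHE[key], CCD1_PLAIN[key])
    for key in CCD0_VCACHE
}

if __name__ == "__main__":
    for key, deficit in DEFICITS.items():
        print(f"{key}: {deficit:.1f}% lower on the 3D V-Cache CCD")
```

The frequency gap works out to under 9% in both the single-threaded and multi-threaded cases, while peak voltage is roughly 17% lower and multi-threaded power roughly 39% lower.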
Specification | 6nm I/O Die (IOD) – Ryzen 7000 | 12nm I/O Die (IOD) – Ryzen 5000 | 6nm I/O Die (IOD) – EPYC Genoa |
Size | 117.8mm^2 | 125mm^2 | 386.88mm^2 |
Transistor Count | 3.37 Billion | 2.09 Billion | 11 Billion |
MTr/mm^2 (Transistor Density) | ~28.6 Million | ~16.7 Million | ~29.8 Million |
AMD's ISSCC presentation also included many new details about the 6nm I/O Dies (IOD) used in the Ryzen 7000 and EPYC Genoa processors. In the album above, you can see the zoomed-in images and an annotated die shot from chip detective @Locuza_. You can also read Locuza's excellent analysis of the Ryzen 7000 IOD in the tweet below.
We've tabulated the specs for easy comparison, and as you can see, the EPYC Genoa I/O Die is huge compared to the Ryzen 7000 variant. That's because AMD can connect up to 12 compute chiplets (CCDs) to the I/O Die for EPYC Genoa processors.
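The size gap between the two 6nm I/O dies can be put in numbers using the table's figures. This is a small sketch with illustrative names; it uses the rounded area and transistor counts as quoted, so the ratios are approximate.

```python
# (area in mm^2, transistor count in billions), as quoted in the table.
RYZEN_7000_IOD = (117.8, 3.37)
EPYC_GENOA_IOD = (386.88, 11.0)


def ratio(larger: float, smaller: float) -> float:
    """Return how many times larger the first value is than the second."""
    return larger / smaller


area_ratio = ratio(EPYC_GENOA_IOD[0], RYZEN_7000_IOD[0])
transistor_ratio = ratio(EPYC_GENOA_IOD[1], RYZEN_7000_IOD[1])

if __name__ == "__main__":
    print(f"EPYC Genoa IOD area:        {area_ratio:.2f}x the Ryzen 7000 IOD")
    print(f"EPYC Genoa IOD transistors: {transistor_ratio:.2f}x the Ryzen 7000 IOD")
```

Both ratios land near 3.3x, so the server die is larger by about the same factor in area and in transistor count.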
In contrast, the consumer chips are limited to two chiplets, which is a fixed limitation because, as you can see in Locuza's diagram, the Ryzen 7000 I/O Die has only two Global Memory Interconnect 3 (GMI3) links connecting the compute chiplets to the IOD. That's a bummer: lower core-count Genoa models with four CCDs can have dual-GMI3 links (wide mode), a new capability that can offer advantages in some memory-intensive tasks. It would be interesting to see it added to consumer chips.
We've also included the full ISSCC 2023 deck below for your perusal, along with a few other interesting facts.
Locuza (@Locuza_), March 4, 2023:
Zen 4 Raphael 6nm client I/O die:
- 128b DDR5 PHY + 32b (8b per 32b channel) for ECC
- 2x GMI3 ports, 3x CCDs not possible. :p
- 28x PCIe 5; Zen 1/2/3 cIOD had 32x PCIe lanes. Thus, AMD reduced waste for the client market.
- Really just one RDNA2 WGP, 128 shader "cores"
https://t.co/bkqdVvhgrn