
Wafer-level fan-out vs flip-chip BGA for AI chips


Wafer-level fan-out packaging is displacing flip-chip BGA as the preferred architecture for AI accelerators—eliminating the conventional substrate, achieving sub-2 µm interconnect pitch, and enabling 3D memory stacking that conventional solder-bump approaches cannot match. Patent data from JCET, MediaTek, Kepler Computing, IBM, and others maps exactly where the technology stands and where it is heading.

PatSnap Insights Team, Innovation Intelligence Analysts · 11 min read
Reviewed by the PatSnap Insights editorial team

How wafer-level fan-out packaging works: structure and RDL mechanics

Wafer-level fan-out (WLFO) packaging completes the packaging process at the wafer level—either without separating dies from the wafer, or by rearranging individual dies into a reconstituted wafer form—before final singulation. A redistribution layer (RDL) formed on a passivation layer electrically connects each chip’s I/O pads to external contacts, fanning signals out beyond the die’s physical footprint without a conventional organic or ceramic substrate. That substrate-free architecture is the defining departure from every legacy approach.

Key figures:
<2 µm: RDL interconnect pitch in WLFO
100–150 µm: C4 solder bump pitch in FC-BGA
(0.5–2):1: JCET aperture-to-solid ratio for warpage control
300–700 W: typical TDP range for modern AI processors

The RDL is the technological heart of WLFO. By routing signals laterally through patterned metal layers deposited directly on the chip surface and encapsulant, the package can present a much larger ball pitch to the PCB while maintaining extremely dense die-level interconnect—without the signal path ever traversing a laminate substrate. This is why WLFO can achieve interconnect pitches well below 2 µm, compared to the 100–150 µm solder bump pitch typical of flip-chip ball grid array (FC-BGA) packages.
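To put the pitch figures in perspective, the following minimal sketch converts a pitch into a linear density (traces or bumps per millimetre). It assumes uniformly spaced features and treats 2 µm as an upper bound for the RDL pitch; the function names are illustrative and do not come from any cited patent. The per-area figure assumes a full square grid, which is only meaningful for area-array bumps, not for routed traces.

```python
def lines_per_mm(pitch_um: float) -> float:
    """Features (traces or a single row of bumps) that fit in 1 mm at this pitch."""
    return 1000.0 / pitch_um


def sites_per_mm2(pitch_um: float) -> float:
    """Connection sites per mm^2, ASSUMING a full square grid at this pitch."""
    return lines_per_mm(pitch_um) ** 2


if __name__ == "__main__":
    cases = [
        ("WLFO RDL (2 um, upper bound)", 2.0),
        ("FC-BGA C4 bump (100 um)", 100.0),
        ("FC-BGA C4 bump (150 um)", 150.0),
    ]
    for label, pitch_um in cases:
        print(f"{label}: {lines_per_mm(pitch_um):.1f} per mm")
    # Area density is shown for the bump case only (square-grid assumption).
    print(f"C4 grid at 100 um: {sites_per_mm2(100.0):.0f} sites per mm^2")
```

Under these assumptions, a 2 µm pitch yields 50 to 75 times the linear density of a 100–150 µm bump pitch, which is the gap the comparison in the text rests on.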


Warpage: the primary yield risk in large-format WLFO

The most significant manufacturing challenge in WLFO is package warpage. When a reconstituted wafer contains chips surrounded by molding compound, the mismatch in thermal expansion coefficients and Young’s moduli between the molding compound, embedded dies, and any reinforcement layers causes the wafer to bow during thermal cycling—directly reducing yield. For large-format packages used in AI accelerators, this is not a minor process nuisance; it is the primary limiter on production economics.
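The dependence of bow on CTE and modulus mismatch can be illustrated with the classical Timoshenko bi-material strip formula. This is a first-order sketch only: it models the reconstituted wafer as two uniform layers (for example, molding compound on silicon), ignores reinforcement layers and viscoelastic effects, and is not the analysis used in any cited patent. All material values in the example are placeholders chosen for illustration.

```python
def bilayer_curvature(alpha1, alpha2, delta_t, t1, t2, e1, e2):
    """Timoshenko bi-material strip curvature (1/m).

    alpha1, alpha2: thermal expansion coefficients (1/K)
    delta_t:        temperature change (K)
    t1, t2:         layer thicknesses (m)
    e1, e2:         Young's moduli (Pa); only their ratio matters
    """
    m = t1 / t2          # thickness ratio
    n = e1 / e2          # modulus ratio
    h = t1 + t2          # total thickness
    numerator = 6.0 * (alpha1 - alpha2) * delta_t * (1.0 + m) ** 2
    denominator = h * (3.0 * (1.0 + m) ** 2
                       + (1.0 + m * n) * (m ** 2 + 1.0 / (m * n)))
    return numerator / denominator


def bow_height(curvature, span):
    """Centre deflection of a shallow arc over a chord: kappa * L^2 / 8 (m)."""
    return curvature * span ** 2 / 8.0


if __name__ == "__main__":
    # Placeholder values, NOT taken from the source: compound on silicon.
    kappa = bilayer_curvature(alpha1=10e-6, alpha2=2.6e-6, delta_t=100.0,
                              t1=0.3e-3, t2=0.5e-3, e1=20e9, e2=130e9)
    print(f"curvature: {kappa:.3f} 1/m")
    print(f"bow over a 300 mm span: {bow_height(kappa, 0.3) * 1e3:.2f} mm")
```

Because bow grows with the square of the span, the same mismatch that is tolerable on a small package becomes a yield problem on a large reconstituted wafer, which is consistent with the emphasis the text places on large-format packages.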

What is package warpage in WLFO?

Warpage occurs when mismatches in the thermal and mechanical properties of the molding compound, embedded chips, and structural reinforcement layers cause the reconstituted wafer to bow during processing. JCET Advanced Packaging addresses this with a sheet-like structural layer containing defined apertures for each chip, controlling the aperture-to-solid ratio within (0.5–2):1. Because the Young’s modulus of the plastic packaging layer is lower than that of the structural layer, the composite system achieves “remarkably reduced package warpage and better package strength” (JCET, 2023).
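As a design-rule illustration, the (0.5–2):1 window can be expressed as a simple check. The source does not state whether the ratio is measured by area or by width, so this sketch assumes an area ratio (total aperture area divided by remaining solid area of the structural layer); both function names are hypothetical.

```python
def aperture_to_solid_ratio(aperture_area_mm2: float, solid_area_mm2: float) -> float:
    """Ratio of open (aperture) area to solid area. ASSUMPTION: area-based."""
    if solid_area_mm2 <= 0:
        raise ValueError("solid area must be positive")
    return aperture_area_mm2 / solid_area_mm2


def within_window(ratio: float, low: float = 0.5, high: float = 2.0) -> bool:
    """True if the ratio falls inside the (0.5-2):1 window quoted in the text."""
    return low <= ratio <= high


if __name__ == "__main__":
    # Hypothetical layer: 100 mm^2 total, of which 60 mm^2 is cut out for chips.
    ratio = aperture_to_solid_ratio(60.0, 40.0)
    print(f"ratio = {ratio:.2f}:1, within window: {within_window(ratio)}")
```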

Thermal management embedded in the fan-out stack

Modern AI processors carry thermal design power (TDP) values in the 300–700 W range, making thermal management a co-equal design constraint alongside signal integrity. WLFO addresses this by integrating heat spreaders directly into the package stack. In the architecture described by the Guangdong Academy of Sciences Semiconductor Institute (2021), the chip’s back face bonds via adhesive to the heat spreader’s die-attach region, encapsulant fills the gaps between the heat spreader and chip protection material, and package wiring grows directly on the chip’s front face, the encapsulant surface, and the heat spreader’s protruding structures. The result is a fan-out package explicitly designed for “high power density chips” with improved operating power and reduced power consumption—properties that substrate-bound packaging cannot replicate without adding separate thermal interface layers and heat slugs that increase package height and thermal resistance.
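The effect of the 300–700 W range on the thermal budget can be sketched with a steady-state, one-dimensional series resistance model, T_j = T_ambient + P × ΣR. The junction limit, ambient temperature, and resistance values below are hypothetical inputs chosen for illustration; none of them come from the cited patents.

```python
def max_thermal_resistance(tdp_w: float, t_junction_max_c: float,
                           t_ambient_c: float) -> float:
    """Largest junction-to-ambient resistance (K/W) that keeps T_j within limit."""
    return (t_junction_max_c - t_ambient_c) / tdp_w


def junction_temperature(tdp_w: float, t_ambient_c: float,
                         resistances_k_per_w) -> float:
    """Steady-state junction temperature for resistances in series (degrees C)."""
    return t_ambient_c + tdp_w * sum(resistances_k_per_w)


if __name__ == "__main__":
    # Hypothetical limits: 95 C junction maximum, 35 C coolant/ambient.
    for tdp in (300.0, 700.0):
        budget = max_thermal_resistance(tdp, 95.0, 35.0)
        print(f"{tdp:.0f} W -> total budget {budget:.3f} K/W")
```

Every interface layer adds a term to the sum, so removing a separate thermal interface layer or heat slug from the path directly relaxes the budget. This is the mechanism behind the package-height and thermal-resistance argument in the paragraph above.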

Figure 1 — Interconnect pitch comparison: WLFO RDL vs. flip-chip BGA solder bumps
[Figure: bar chart of interconnect pitch on a 0–150 µm axis. WLFO RDL: <2 µm; FC-BGA C4 bump: 100 µm (min) to 150 µm (max).]
WLFO RDL achieves sub-2 µm interconnect pitch versus the 100–150 µm solder bump pitch of flip-chip BGA, enabling dramatically higher I/O density per unit die area.

Multi-die WLFO is enabled by arranging dies side-by-side within the reconstituted wafer and routing die-to-die signals through the RDL rather than through a package substrate or PCB. MediaTek’s 2016 EP patent demonstrates this configuration, with connection paths linking I/O pads on adjacent die sides—a foundational technique that directly enables chiplet-based AI processors where logic, memory, and I/O tiles must communicate at high bandwidth and low latency within a single package boundary.
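A rough feel for die-to-die bandwidth through the RDL comes from counting the traces that can cross the shared edge between two adjacent dies. The sketch below is a back-of-the-envelope estimate; the signal fraction (the share of traces not used for power, ground, or shielding), layer count, and per-trace data rate are all hypothetical parameters, not figures from the MediaTek patent.

```python
import math


def die_to_die_bandwidth_gbps(shared_edge_mm: float, trace_pitch_um: float,
                              rdl_layers: int, signal_fraction: float,
                              rate_gbps_per_trace: float) -> float:
    """Aggregate bandwidth (Gb/s) across the edge shared by two adjacent dies."""
    traces_per_layer = math.floor(shared_edge_mm * 1000.0 / trace_pitch_um)
    signal_traces = math.floor(traces_per_layer * rdl_layers * signal_fraction)
    return signal_traces * rate_gbps_per_trace


if __name__ == "__main__":
    # Hypothetical: 10 mm shared edge, 2 um pitch, 2 RDL layers,
    # half the traces carrying signal, 2 Gb/s per trace.
    bw = die_to_die_bandwidth_gbps(10.0, 2.0, 2, 0.5, 2.0)
    print(f"estimated die-to-die bandwidth: {bw / 1000.0:.1f} Tb/s")
```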

3D die stacking and heterogeneous integration for AI workloads

Three-dimensional die stacking combines the substrate-free advantages of WLFO with vertical integration of compute and memory dies, addressing the memory bandwidth bottleneck that defines AI training and inference workloads. The stacking order is not architecturally neutral—it has direct thermal consequences that patent filers are now explicitly resolving.

In 3D die-stacked AI packages, placing the compute die below the DRAM die creates thermal problems because any heat sink is closer to the DRAM and farther from the compute die. Kepler Computing’s 2025 patent argues that positioning the memory die below the compute die, with TSV density aligned to die-to-die I/O count, resolves both bandwidth and thermal constraints simultaneously for ultra-high bandwidth AI processing systems.

Kepler Computing’s 2025 CN patent describes an IC package in which a first die containing DRAM is placed below a second die containing compute logic, coupled via micro-bumps in a wafer-to-wafer bonding configuration. The thermal engineering rationale is explicit: “placing the compute die below the DRAM die leads to thermal problems for the compute die, because any heat sink is closer to the DRAM and farther from the compute die.” By inverting the stack—memory below, compute above—and aligning TSV density directly with die-to-die I/O count, the architecture achieves area-array bandwidth rather than the perimeter-limited bandwidth of conventional flip-chip arrangements.
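The thermal argument for stack order can be made concrete with a one-dimensional sketch in which all heat leaves through a heat sink on top of the stack. The model and every number in it are illustrative assumptions, not values from the Kepler Computing filing: the upper die sees only the sink resistance, while the lower die's heat must additionally cross the inter-die interface.

```python
from dataclasses import dataclass


@dataclass
class StackTemps:
    top_die_c: float
    bottom_die_c: float


def stack_temperatures(p_top_w: float, p_bottom_w: float, r_sink_k_per_w: float,
                       r_interdie_k_per_w: float, t_ambient_c: float) -> StackTemps:
    """1D steady state with the heat sink on top; all heat flows upward."""
    # The sink carries the combined power of both dies.
    top = t_ambient_c + (p_top_w + p_bottom_w) * r_sink_k_per_w
    # Only the bottom die's own power crosses the inter-die interface.
    bottom = top + p_bottom_w * r_interdie_k_per_w
    return StackTemps(top_die_c=top, bottom_die_c=bottom)


if __name__ == "__main__":
    # Hypothetical values: 300 W compute, 20 W DRAM, 35 C ambient.
    p_compute, p_dram = 300.0, 20.0
    compute_on_top = stack_temperatures(p_compute, p_dram, 0.1, 0.05, 35.0)
    compute_below = stack_temperatures(p_dram, p_compute, 0.1, 0.05, 35.0)
    print(f"compute die, compute on top: {compute_on_top.top_die_c:.1f} C")
    print(f"compute die, compute below:  {compute_below.bottom_die_c:.1f} C")
```

In this simplified model the penalty for burying the compute die equals its power multiplied by the inter-die resistance, so the penalty grows with compute power. That matches the direction of the patent's reasoning, although a real design would need a full three-dimensional thermal simulation.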

“Placing HBM on the sides of a compute die in a conventional flip-chip arrangement does not solve the bandwidth problem because bandwidth is constrained by peripheral I/O from each side of the HBM and compute die.” — Kepler Computing, 2025

Panasonic IP Management’s 2022 CN patent takes a related multi-die approach, stacking multiple memory dies and multiple compute dies on a system chip, with each die type maintaining its own independently optimized layout. An interposer can optionally be incorporated, which allows memory and compute dies to be redesigned independently without redesigning the entire system chip and so increases configuration flexibility. When the interposer is omitted, cost falls without sacrificing the core stacking architecture.


System-on-wafer: eliminating the PCB entirely

The logical frontier of WLFO philosophy is the system-on-wafer (SoW), in which multiple chips are mounted directly on semiconductor wafers, interconnected via wiring layers and power modules, with inter-wafer connectors linking SoWs together. A 2025 WO patent filing explicitly critiques the conventional PCB-integrated server format, noting it “imposes limitations on thermal efficiency, power efficiency, and bandwidth availability, and may result in a relatively large physical footprint.” By eliminating the PCB substrate entirely in favor of wafer-native interconnect at the system level, SoW architectures represent the maximum expression of the trends that began with single-chip WLFO packages in the 2010s.

Figure 2 — AI chip packaging evolution: from FC-BGA to system-on-wafer
[Figure: packaging evolution sequence. FC-BGA (conventional substrate) → WLFO (RDL-based fan-out) → 3D stack (TSV + HBM, vertical integration) → chiplets (multi-die RDL, heterogeneous) → SoW (system-on-wafer, no PCB). Axis: increasing bandwidth density, thermal efficiency, and integration complexity.]
AI chip packaging has evolved from substrate-bound FC-BGA through WLFO, 3D stacking, and chiplet integration toward system-on-wafer architectures that eliminate the PCB entirely.

Heterogeneous integration at the wafer scale also enables independent optimisation of each die type. Logic dies can be fabricated at the leading process node for maximum compute density, while memory dies use nodes optimised for density and cost, and I/O dies use nodes optimised for high-voltage tolerance—all integrated within a single WLFO or SoW boundary. According to WIPO patent trend data, heterogeneous integration filings have accelerated significantly across US, CN, EP, WO, and KR jurisdictions since 2020, reflecting the industry’s recognition that no single process node can simultaneously optimise all functions of a modern AI chip.

Where flip-chip BGA hits its ceiling in AI systems

Flip-chip BGA remains the dominant volume packaging technology for CPUs and discrete GPUs, and it retains cost advantages for single-die, high-volume designs with well-understood I/O requirements. However, the patent record is explicit about where FC-BGA’s architecture becomes structurally inadequate for AI-scale workloads.

PCIe-based chip-to-chip interconnect—the external interface most commonly associated with flip-chip BGA-packaged AI accelerators—is bandwidth-limited and non-scalable for ultra-large-scale AI computation. Taichu (Wuxi) Electronic Technology’s 2025 patent states that “AI chip ultra-large-scale parallel computing capability is growing increasingly strong, and the traditional method of inter-chip data transfer via PCIe can no longer meet the bandwidth demands of AI computation.”
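The scale of the gap can be illustrated by comparing a serial PCIe link with a wide parallel memory interface. The interface parameters below are not from the source: they are the nominal published figures for PCIe 5.0 (32 GT/s per lane, 128b/130b encoding) and HBM3 (1024-bit interface at 6.4 Gb/s per pin). Real-world throughput is lower once protocol overhead is included.

```python
def pcie_gbytes_per_s(gt_per_s: float, lanes: int,
                      encoding_efficiency: float = 128.0 / 130.0) -> float:
    """Raw one-direction PCIe bandwidth in GB/s after line encoding."""
    return gt_per_s * lanes * encoding_efficiency / 8.0


def parallel_bus_gbytes_per_s(width_bits: int, gbps_per_pin: float) -> float:
    """Peak bandwidth in GB/s of a parallel interface such as one HBM stack."""
    return width_bits * gbps_per_pin / 8.0


if __name__ == "__main__":
    pcie5_x16 = pcie_gbytes_per_s(32.0, 16)
    hbm3_stack = parallel_bus_gbytes_per_s(1024, 6.4)
    print(f"PCIe 5.0 x16 (per direction): {pcie5_x16:.1f} GB/s")
    print(f"One HBM3 stack (peak):        {hbm3_stack:.1f} GB/s")
    print(f"ratio: {hbm3_stack / pcie5_x16:.1f}x")
```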

The bandwidth ceiling of FC-BGA manifests at three levels. First, at the die level: neural network inference and training are memory-bandwidth-bound, and in a side-by-side flip-chip layout the path from compute die to memory is a perimeter-limited interface. Placing HBM beside a compute die in a conventional flip-chip arrangement limits bandwidth to the peripheral I/Os along each facing die edge, a count that grows linearly with edge length, whereas an area-array interface grows with die area, that is, quadratically with edge length. Second, at the package level: the organic or ceramic substrate that FC-BGA requires is itself a routing bottleneck, with trace widths and via densities that cannot match the sub-2 µm RDL achievable in WLFO. Third, at the system level: IBM’s 2015 CN patent on packaging for eight-socket, one-hop SMP topologies documents that “it is impossible to use packaging technology described in the prior art to achieve high bandwidth, eight-node, one-hop systems” using standard socket-based packaging with land grid array (LGA) connectors. The patent resorts to multi-plane vertical stacking as a workaround, which in effect demonstrates the density ceiling of conventional substrate-based approaches.

Key finding: FC-BGA’s perimeter I/O constraint

In a conventional flip-chip configuration, memory bandwidth scales with the number of peripheral I/Os on each die edge—a perimeter-limited quantity. A vertically stacked WLFO configuration aligns TSV density directly with die-to-die I/O count, achieving area-array rather than perimeter-limited bandwidth. For large-scale AI training accelerators, this distinction is the deciding architectural factor.
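The perimeter-versus-area distinction can be checked with a short count of connection sites on a square die. This sketch assumes a single peripheral row, a uniform pitch, and the same pitch for both cases so that only the geometry differs; the 100 µm pitch and 10 mm edge are hypothetical and are not TSV figures from any cited patent.

```python
import math


def perimeter_io(edge_mm: float, pitch_um: float, rows: int = 1) -> int:
    """I/O sites along all four edges of a square die (corner overlap ignored)."""
    per_edge = math.floor(edge_mm * 1000.0 / pitch_um)
    return 4 * per_edge * rows


def area_array_io(edge_mm: float, pitch_um: float) -> int:
    """I/O sites on a full square grid covering the die face."""
    per_edge = math.floor(edge_mm * 1000.0 / pitch_um)
    return per_edge ** 2


if __name__ == "__main__":
    for edge_mm in (10.0, 20.0):
        print(f"{edge_mm:.0f} mm edge: perimeter {perimeter_io(edge_mm, 100.0)}, "
              f"area array {area_array_io(edge_mm, 100.0)}")
```

Doubling the die edge doubles the perimeter count but quadruples the area-array count, which is why vertically aligned connections keep pace with die size while edge-limited ones fall behind.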

For smaller AI inference chips, such as edge AI in wearables and audio applications, FC-BGA and even simpler wire-bonding approaches on PCB substrates remain viable. Shenzhen Jiutian Ruixin Technology’s 2023 patent demonstrates that system-in-package integration of AI, ASIC, and MEMS chips can be accomplished via wire bonding, which suits edge deployments with constrained I/O and thermal requirements. The critical distinction is scale: as AI workloads grow and memory bandwidth requirements intensify, the architectural limitations of FC-BGA stop being engineering trade-offs and become hard constraints. JEDEC and the IEEE have published specifications for HBM and advanced packaging interconnect that reflect this bandwidth-driven transition away from conventional solder-bump architectures.

Head-to-head: WLFO vs. flip-chip BGA across eight attributes

Across the eight attributes most relevant to AI chip packaging decisions, WLFO and FC-BGA present a clear trade-off profile: WLFO wins on interconnect density, die-to-die bandwidth, package thickness, and chiplet scalability, and can integrate a heat spreader directly into the stack; FC-BGA retains advantages in manufacturing maturity, warpage resistance, and unit cost for single-die designs.

Attribute | Wafer-Level Fan-Out (WLFO) | Flip-Chip BGA (FC-BGA)
Interconnect density | RDL-based; sub-2 µm pitch achievable | Solder bump (C4) pitch typically 100–150 µm
Die-to-die bandwidth | High: direct RDL routing between dies | Low–medium: routed through package substrate
Package thickness | Thin: no interposer or substrate required | Thicker: organic/ceramic substrate required
Warpage risk | High without structural control layers | Lower: substrate provides mechanical rigidity
Thermal performance | Integrated heat spreader possible in stack | Heat slug bonded above chip; substrate is thermal bottleneck
Chiplet scalability | Native multi-die support via side-by-side RDL | Limited by substrate routing density
Manufacturing complexity | High: reconstituted wafer, RDL, molding | Moderate: mature infrastructure
Cost profile | Higher NRE; competitive at scale | Lower for single-die, high-volume designs

The warpage disadvantage of WLFO is now being addressed through composite structural engineering. JCET Advanced Packaging’s 2023 patent introduces a sheet-like structural layer with defined apertures housing each chip, with the aperture-to-solid ratio controlled within (0.5–2):1. Because the Young’s modulus of the plastic packaging layer is lower than that of the structural layer, the composite system achieves reduced package warpage and improved package strength, directly improving yield for large-format AI accelerator packages. Heat removal at high power density is addressed by the Guangdong Academy of Sciences’ heat-spreader-embedded fan-out structure, which routes thermal energy through the package stack rather than relying on a separate lid or heat slug bonded above the substrate.


Key patent assignees and the innovation clusters shaping AI packaging

The patent dataset spanning US, CN, EP, WO, and KR jurisdictions reveals three distinct innovation clusters, each addressing a different layer of the WLFO and advanced AI packaging stack.

Packaging foundries and OSATs

JCET Advanced Packaging—China’s largest outsourced semiconductor assembly and test (OSAT) provider—is active in WLFO structural innovations specifically targeting warpage control, a critical production yield issue for large-format packages. STS Semiconductor & Telecommunications (Korea) holds foundational WLFO process patents covering RDL formation, passivation layering, and wafer-scale singulation, establishing the baseline manufacturing framework from which subsequent innovations depart. The Guangdong Academy of Sciences Semiconductor Institute is pursuing thermally enhanced fan-out structures with integrated heat spreaders for high-power-density applications—directly relevant to AI processors with TDP values in the 300–700 W range.

Fabless AI chip and system designers

Panasonic IP Management is patenting scalable AI chip architectures using die-stacking with optional interposers, enabling independent redesign of memory and compute dies. Kepler Computing is specifically addressing the compute-over-memory 3D die-stacking configuration for ultra-high bandwidth AI inference and training, with explicit thermal and bandwidth co-optimisation. Taichu (Wuxi) Electronic Technology is integrating network switching and routing logic directly onto AI chips to eliminate the PCIe external interconnect bottleneck—a recognition that the bandwidth problem extends beyond the package boundary to the system interconnect fabric.

Multi-die and chiplet system architects

MediaTek holds key patents on side-by-side multi-die wafer-level packages with die-to-die RDL interconnect, a foundational technique for chiplet-based AI processors. A 2025 WO filing extends the WLFO concept to full system-on-wafer architectures for AI server infrastructure, proposing inter-wafer connectors that link SoWs together to form compute clusters without PCB intermediaries. IBM’s documented architectural ceiling for conventional substrate-based packaging in multi-processor HPC topologies motivates the entire vertical stacking and wafer-scale integration trajectory now visible across 2021–2025 filings.

Figure 3 — Patent assignee clusters in AI chip packaging: WLFO, 3D stacking, and system integration
[Figure: patent assignee clusters. OSATs and foundries: JCET Advanced (warpage control, 2023); STS Semiconductor (foundational WLFO, 2017); Guangdong Acad. Sci. (thermal fan-out, 2021). Fabless and AI designers: Kepler Computing (3D HBM stacking, 2025); Panasonic IP Mgmt (multi-die AI chip, 2022); Taichu (Wuxi) (on-chip routing, 2025). System architects: MediaTek (multi-die RDL, 2016); KIM, BRIAN H. (system-on-wafer, 2025); IBM (HPC topology limits, 2015). Patent dataset: US, CN, EP, WO, KR jurisdictions, 2015–2025.]
Three distinct innovation clusters drive advanced AI chip packaging: OSATs solving manufacturing yield, fabless designers addressing bandwidth and thermal co-optimisation, and system architects extending WLFO to wafer-scale infrastructure.

A clear convergence is visible across the 2021–2025 filings: thermal management, ultra-high-bandwidth memory integration, and wafer-scale interconnect are now treated as co-equal design constraints—none of which can be adequately addressed by conventional FC-BGA alone. The Semiconductor Industry Association has similarly identified advanced packaging as a critical technology pillar in its roadmap for sustaining semiconductor performance scaling beyond traditional transistor density improvements. This institutional recognition reinforces the patent-level evidence that WLFO and 3D stacking are no longer niche techniques—they are the primary engineering response to the bandwidth and thermal demands of AI compute at scale.

