Wafer-Level Fan-Out Packaging vs Flip-Chip BGA for AI Chips — PatSnap Insights
Semiconductor Packaging

Wafer-level fan-out packaging has emerged as the preferred interconnect architecture for AI accelerators, replacing flip-chip BGA’s substrate-bound approach with RDL-based die-to-die routing that eliminates perimeter I/O bandwidth limits. Patent data from JCET, MediaTek, Kepler Computing, and IBM maps the structural mechanisms, thermal strategies, and heterogeneous integration trends reshaping next-generation AI chip design.

PatSnap Insights Team, Innovation Intelligence Analysts · 11 min read
Reviewed by the PatSnap Insights editorial team

How WLFO packaging works: structure and RDL mechanics

Wafer-level fan-out (WLFO) packaging encapsulates and redistributes chips at the wafer scale—either without separating dies from the wafer, or by rearranging individual dies into a reconstituted wafer form—before final packaging and singulation. As established by STS Semiconductor & Telecommunications in their 2017 foundational patent, “packaging proceeds on the whole chips without separating the chips from the wafer or the individual chips are rearranged in a wafer form and then packaging proceeds.” A redistribution layer (RDL) is formed on a passivation layer, electrically connecting the chip I/O pattern to external contacts and enabling signal fan-out beyond the chip’s physical footprint without a conventional organic or ceramic substrate.

WLFO RDL interconnect pitch: <2 µm
Flip-chip BGA C4 bump pitch: 100–150 µm
TDP range of modern AI processors: 300–700 W
JCET aperture-to-solid ratio for warpage control: (0.5–2):1

The fundamental distinction from traditional packages is that I/O can be extended laterally beyond the die edge without a conventional substrate—a property that directly enables the chiplet-based architectures now prevalent in AI accelerators. This is why, according to WIPO patent filings spanning the US, CN, EP, WO, and KR jurisdictions, WLFO has become the dominant subject of advanced packaging innovation since 2021.

Redistribution Layer (RDL)

An RDL is a metal wiring layer formed on the chip surface and encapsulant, routing electrical signals from the die’s native I/O pad locations to new, spatially expanded contact positions. In WLFO, the RDL replaces the conventional package substrate, enabling sub-2 µm pitch interconnect versus the 100–150 µm pitch of flip-chip BGA solder bumps.

Package warpage is the primary manufacturing challenge in large-format WLFO and one of the main yield limiters for AI accelerator packages. It arises from mismatches in the thermal and mechanical properties of the molding compound, embedded chips, and structural reinforcement layers. JCET Advanced Packaging’s 2023 patent addresses this directly: a sheet-like structural layer with defined apertures houses each chip, and the ratio of the aperture region to the solid structural region is controlled within (0.5–2):1. Because the Young’s modulus of the plastic packaging layer is lower than that of the sheet-like structure, the composite system achieves “remarkably reduced package warpage and better package strength.”

In wafer-level fan-out packaging, JCET Advanced Packaging controls the ratio of the aperture region to the solid structural region within (0.5–2):1 in its sheet-like structural layer to achieve reduced package warpage and improved package strength, addressing the primary yield risk in large-format WLFO production.
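
To make the (0.5–2):1 window concrete, the following is a minimal sketch of how a layout could be checked against it. It assumes the ratio is taken as total aperture area over the remaining solid area of the sheet, which is one reading of the patent language rather than a confirmed definition. The function names and the 40 mm sheet with four 15 mm apertures are hypothetical and are not taken from the JCET patent.

```python
def aperture_to_solid_ratio(sheet_area_mm2: float, aperture_areas_mm2: list[float]) -> float:
    """Return total aperture area divided by the remaining solid area of the sheet."""
    aperture_total = sum(aperture_areas_mm2)
    solid = sheet_area_mm2 - aperture_total
    if solid <= 0:
        raise ValueError("apertures must leave some solid structure")
    return aperture_total / solid


def within_window(ratio: float, low: float = 0.5, high: float = 2.0) -> bool:
    """True when the ratio falls inside the (0.5-2):1 window cited in the text."""
    return low <= ratio <= high


# Hypothetical layout: a 40 mm x 40 mm sheet with four 15 mm x 15 mm chip apertures.
ratio = aperture_to_solid_ratio(40 * 40, [15 * 15] * 4)
print(round(ratio, 2), within_window(ratio))
```

In this hypothetical layout the apertures take 900 mm² of a 1,600 mm² sheet, leaving 700 mm² of solid structure, so the ratio is about 1.29:1 and falls inside the window.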

Thermal management is embedded directly into the fan-out stack in designs targeting high-power AI chips. The Guangdong Academy of Sciences Semiconductor Institute’s 2021 patent integrates a heat spreader into the fan-out architecture: the chip’s back face is bonded via adhesive to the heat spreader’s die-attach region, the encapsulant fills gaps between the heat spreader and the temporary chip protection material, and the package wiring is grown directly on the chip’s front face, the encapsulant surface, and the heat spreader’s protruding structures. The result is a fan-out architecture explicitly engineered for “high power density chips” with improved operating power and reduced power consumption, properties directly relevant to AI processors with thermal design power (TDP) in the 300–700 W range.

Figure 1 — Interconnect pitch comparison: WLFO RDL vs. flip-chip BGA C4 bumps
[Figure 1: bar chart of interconnect pitch in µm. FC-BGA C4 bump pitch, maximum: 150 µm; FC-BGA C4 bump pitch, minimum: 100 µm; WLFO RDL: <2 µm.]
WLFO RDL interconnect pitch is below 2 µm — more than 50× finer than the 100–150 µm C4 bump pitch of flip-chip BGA — enabling orders-of-magnitude higher interconnect density for AI chiplet integration.
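
The arithmetic behind the "more than 50×" figure can be sketched as follows. This is an idealized comparison: it takes the 2 µm bound as the WLFO pitch, assumes connections sit on a uniform square grid, and treats RDL routing pitch and C4 bump pitch as directly comparable, which is a simplification. The helper names are illustrative.

```python
WLFO_RDL_PITCH_UM = 2.0           # upper bound quoted for WLFO RDL pitch
FCBGA_PITCH_UM = (100.0, 150.0)   # C4 bump pitch range quoted for FC-BGA


def linear_ratio(coarse_pitch_um: float, fine_pitch_um: float) -> float:
    """How many fine-pitch connections fit along the span of one coarse pitch."""
    return coarse_pitch_um / fine_pitch_um


def areal_ratio(coarse_pitch_um: float, fine_pitch_um: float) -> float:
    """Density gain per unit area, assuming a uniform square grid."""
    return linear_ratio(coarse_pitch_um, fine_pitch_um) ** 2


for bump_pitch in FCBGA_PITCH_UM:
    print(
        f"{bump_pitch:.0f} um vs {WLFO_RDL_PITCH_UM:.0f} um: "
        f"{linear_ratio(bump_pitch, WLFO_RDL_PITCH_UM):.0f}x linear, "
        f"{areal_ratio(bump_pitch, WLFO_RDL_PITCH_UM):.0f}x areal"
    )
```

Under these assumptions the gain is 50× to 75× along one axis and 2,500× to 5,625× per unit area, which is where the orders-of-magnitude density claim comes from.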

Multi-die WLFO is enabled by side-by-side die arrangement and fine-pitch redistribution. MediaTek’s 2016 EP patent demonstrates a wafer-level package in which multiple dies are arranged adjacently, with connection paths linking I/O pads on adjacent die sides—enabling die-to-die communication through the RDL rather than through the package substrate or PCB. This is foundational to chiplet-based AI processors, where logic, memory, and I/O tiles must communicate at high bandwidth and low latency within a single package boundary.

3D stacking and heterogeneous integration for AI workloads

Three-dimensional die stacking compounds the advantages of WLFO with vertical integration of compute and memory dies, directly addressing the memory-bandwidth constraints that define AI training and inference performance. Kepler Computing’s 2025 CN patent describes an IC package in which a first die containing DRAM is placed below a second die containing compute logic, coupled via micro-bumps in a wafer-to-wafer bonding configuration.

“Placing the compute die below the DRAM die leads to thermal problems for the compute die, because any heat sink is closer to the DRAM and farther from the compute die.”

Kepler’s patent argues that positioning the memory die below the compute die—with TSV density aligned to die-to-die I/O count—resolves both bandwidth and thermal constraints simultaneously, achieving “ultra-high bandwidth AI processing systems.” This stacking orientation places the heat sink in proximity to the compute die, which generates the majority of thermal load, while TSV density is matched to the full area-array I/O count rather than perimeter-limited contacts.

Kepler Computing’s 2025 patent on 3D integrated ultra-high bandwidth memory positions the DRAM die below the compute die in a wafer-to-wafer bonding configuration, with TSV density aligned to die-to-die I/O count, to achieve area-array rather than perimeter-limited memory bandwidth for AI processing systems.

Panasonic IP Management’s 2022 CN patent further refines this architecture by stacking multiple memory dies and multiple compute dies on a system chip, with each die type maintaining its own layout pattern optimized independently. An interposer can be optionally incorporated—enabling memory and compute dies to be redesigned independently without redesigning the entire system chip, thereby increasing configuration flexibility while enabling cost reduction when the interposer is omitted.

At the system level, a 2025 WO patent filing proposes systems-on-a-wafer (SoWs) in which multiple chips are mounted directly on semiconductor wafers, interconnected via wiring layers and power modules, with inter-wafer connectors linking SoWs together. This represents the logical extension of WLFO philosophy—eliminating the PCB substrate entirely in favor of wafer-native interconnect at the system level. The patent explicitly critiques the conventional PCB-integrated server format, noting it “imposes limitations on thermal efficiency, power efficiency, and bandwidth availability, and may result in a relatively large physical footprint.” Standards bodies including IEEE have documented similar interconnect scaling challenges in heterogeneous integration roadmaps.

Figure 2 — WLFO packaging integration evolution: from planar RDL to system-on-wafer
[Figure 2: four-stage progression. Single-die WLFO with RDL (STS Semiconductor, 2017); multi-die chiplet RDL (MediaTek, 2016); 3D stack with TSV and HBM (Kepler Computing, 2025); system-on-wafer (KIM, B.H., 2025).]
The WLFO integration trajectory progresses from single-die RDL packages (STS Semiconductor, 2017) through multi-die chiplet RDL (MediaTek, 2016) and 3D TSV stacking with HBM (Kepler Computing, 2025) to full system-on-wafer architectures (KIM, BRIAN H., 2025) that eliminate PCB substrates entirely.

Where flip-chip BGA hits its ceiling for AI chips

Flip-chip BGA connects die I/O pads via solder bumps (C4 bumps) directly to an organic or ceramic package substrate at pitches of 100–150 µm, which then fans out to a ball grid array for PCB attachment. For AI workloads, the architecture’s perimeter-limited I/O is its fundamental constraint: placing HBM peripherally in a flip-chip configuration “does not solve the bandwidth problem because bandwidth is constrained by peripheral I/O from each side of the HBM and compute die,” as Kepler Computing’s patent explicitly states.

In flip-chip BGA configurations, placing High Bandwidth Memory (HBM) on the sides of a compute die does not resolve the memory bandwidth problem for AI chips, because bandwidth remains constrained by the peripheral I/O count from each side of the HBM and compute die—a structural limitation that wafer-level fan-out packaging with area-array TSV interconnect overcomes.
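
The difference between perimeter-limited and area-array I/O can be illustrated with a simple geometric count. This is a sketch under stated assumptions, not a model from the Kepler patent: the die is square, contacts sit on a uniform grid, every position carries signal (no power or ground pads), and the 10 mm and 20 mm die sides are hypothetical.

```python
def grid_side(die_side_um: int, pitch_um: int) -> int:
    """Number of contact positions that fit along one die edge."""
    return die_side_um // pitch_um


def area_array_contacts(die_side_um: int, pitch_um: int) -> int:
    """Contacts available when the whole die face is populated."""
    n = grid_side(die_side_um, pitch_um)
    return n * n


def perimeter_contacts(die_side_um: int, pitch_um: int, rows: int = 1) -> int:
    """Contacts available when only the outer `rows` rings are populated."""
    n = grid_side(die_side_um, pitch_um)
    inner = max(n - 2 * rows, 0)
    return n * n - inner * inner


# Hypothetical square dies at a 100 um contact pitch.
for side_um in (10_000, 20_000):
    area = area_array_contacts(side_um, 100)
    edge = perimeter_contacts(side_um, 100)
    print(side_um, area, edge)
```

Doubling the die side from 10 mm to 20 mm roughly doubles the single-row perimeter count (396 to 796) but quadruples the area-array count (10,000 to 40,000). Perimeter I/O scales with edge length while area-array I/O scales with die area, so the gap widens as dies grow.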

IBM’s 2015 CN patent on eight-socket one-hop SMP topology illustrates a second ceiling: standard socket-based packaging using land grid array (LGA) connectors cannot scale to eight-processor, single-hop topologies within a single plane. The document states that “it is impossible to use packaging technology described in the prior art to achieve high bandwidth, eight-node, one-hop systems,” requiring multi-plane vertical stacking via LGA connectors and rewiring cards. This multi-plane workaround demonstrates the bandwidth and density ceiling that conventional substrate-based packaging—including FC-BGA—imposes on AI compute systems requiring multi-chip coherency.

The interconnect bottleneck extends beyond the package itself. Taichu (Wuxi) Electronic Technology’s 2025 CN patent identifies PCIe-based chip-to-chip interconnect—the external interface most commonly associated with FC-BGA-packaged AI accelerators—as “fundamentally bandwidth-limited and non-scalable,” noting that “AI chip ultra-large-scale parallel computing capability is growing increasingly strong, and the traditional method of inter-chip data transfer via PCIe can no longer meet the bandwidth demands of AI computation.” This motivates the integration of routing and network interface logic directly onto the AI chip, effectively bypassing the FC-BGA package boundary as a bandwidth constraint. Research published by Nature on photonic and electronic co-integration similarly identifies off-chip bandwidth as a primary bottleneck for AI accelerator scaling.

Key finding

For small to medium AI inference chips with well-understood I/O requirements and mature thermal envelopes, FC-BGA remains cost-competitive. Even SiP integration of AI, ASIC, and MEMS chips can be accomplished via wire bonding on a PCB substrate—suitable for edge AI in wearables and audio applications. For large-scale training accelerators, however, the bandwidth, latency, and density requirements drive mandatory adoption of advanced packaging including WLFO and 3D stacking.

The Semiconductor Industry Association has documented the growing gap between transistor scaling and interconnect bandwidth scaling—a gap that advanced packaging formats like WLFO are specifically designed to close. FC-BGA’s manufacturing infrastructure is mature and cost-effective for single-die, high-volume designs, but its organic substrate becomes a thermal bottleneck and routing density limiter as die sizes and I/O counts grow toward the requirements of frontier AI models.

Head-to-head: WLFO vs. flip-chip BGA across key attributes

The following comparison draws directly from the patent data analyzed, mapping structural, thermal, bandwidth, and manufacturing attributes across both packaging approaches.

Attribute | Wafer-Level Fan-Out (WLFO) | Flip-Chip BGA (FC-BGA)
Interconnect density | RDL-based, sub-2 µm pitch achievable | Solder bump (C4) pitch typically 100–150 µm
Die-to-die bandwidth | High: direct RDL routing between dies | Low-medium: routed through package substrate
Package thickness | Thin: no interposer or substrate required | Thicker: organic/ceramic substrate required
Warpage risk | High without structural control layers | Lower: substrate provides mechanical rigidity
Thermal performance | Integrated heat spreader possible | Heat slug bonded above chip; substrate is thermal bottleneck
Scalability for chiplets | Native multi-die support via side-by-side RDL | Limited by substrate routing density
Manufacturing complexity | High: reconstituted wafer, RDL, molding | Moderate: mature infrastructure
Cost profile | Higher NRE; competitive at scale | Lower for single-die, high-volume designs

The warpage limitation of WLFO is addressed by JCET’s sheet-like structural layer with controlled aperture-to-solid ratios of (0.5–2):1. The thermal limitation is addressed by Guangdong Academy of Sciences’ heat-spreader-embedded fan-out structure. FC-BGA’s bandwidth ceiling manifests most acutely in AI workloads because neural network inference and training are memory-bandwidth-bound: a vertically stacked WLFO configuration aligns TSV density directly with die-to-die I/O count, achieving area-array rather than perimeter-limited bandwidth.

Key patent assignees and innovation clusters

The patent data identifies four distinct clusters of assignees driving WLFO and advanced AI chip packaging innovation, each addressing a different layer of the technology stack.

Packaging foundries and OSATs

JCET Advanced Packaging—China’s largest OSAT—is active in WLFO structural innovations specifically targeting warpage control, a critical production yield issue. STS Semiconductor & Telecommunications (Korea) holds foundational WLFO process patents covering RDL formation, passivation layering, and wafer-scale singulation. Guangdong Academy of Sciences Semiconductor Institute is pursuing thermally enhanced fan-out structures with integrated heat spreaders for high-power-density applications. Patent activity from these foundry-side assignees reflects the manufacturing challenge of transitioning WLFO from a niche advanced packaging format to a volume production process suitable for AI accelerator demand. EPO filings in this cluster have grown substantially since 2021, reflecting the technology’s maturation across European jurisdictions.

Fabless AI chip and system designers

Panasonic IP Management is patenting scalable AI chip architectures using die-stacking with optional interposers—enabling memory and compute dies to be redesigned independently without redesigning the entire system chip. Kepler Computing is specifically addressing the compute-over-memory 3D die-stacking configuration for ultra-high bandwidth AI inference and training. Taichu (Wuxi) Electronic Technology is integrating network switching logic directly onto AI chips to eliminate external interconnect bottlenecks associated with PCIe-based FC-BGA systems.

Multi-die and chiplet system architects

MediaTek holds key patents on side-by-side multi-die wafer-level packages with die-to-die RDL interconnect—a foundational technique now widely adopted for AI accelerator chiplet integration. The 2025 WO filing on wafer-level heterogeneous integration for AI accelerators extends the WLFO concept to full system-on-wafer architectures for AI server infrastructure, eliminating PCBs and conventional packages entirely in favor of wafer-native interconnect at the system level.

Legacy HPC packaging specialists

IBM’s 2015 CN patent documents the architectural ceiling of conventional substrate-based packaging for multi-processor AI/HPC topologies—demonstrating that achieving high bandwidth, eight-node, one-hop systems is impossible with prior-art packaging technology, motivating vertical LGA stacking as a bridging technology toward the wafer-scale integration approaches now emerging.

A clear trend across 2021–2025 patent filings from JCET Advanced Packaging, Kepler Computing, Panasonic IP Management, and Taichu (Wuxi) Electronic Technology, building on MediaTek’s 2016 multi-die work, is the convergence of thermal management, ultra-high-bandwidth memory integration, and wafer-scale interconnect as co-equal design constraints for AI chip packaging. None of these can be adequately addressed by conventional flip-chip BGA alone.

References

  1. Wafer Level Fan-Out Package and Method for Manufacturing the Same — STS Semiconductor & Telecommunications Co., Ltd., 2017
  2. Fan-Out Wafer-Level Packaging Structure and Manufacturing Method Thereof — JCET Advanced Packaging Co., Ltd., 2023
  3. Fan-Out Packaging and Preparation Method — Guangdong Academy of Sciences Semiconductor Institute, 2021
  4. Wafer-Level Package Having Multiple Dies Arranged in Side-by-Side Fashion and Associated Yield Improvement Method — MediaTek, Inc., 2016
  5. 3D Integrated Ultra-High Bandwidth Memory — Kepler Computing Co. (Kaipule Computing), 2025
  6. AI Chip — Panasonic IP Management Co., Ltd., 2022
  7. Wafer-Level Heterogeneous Integration for Artificial Intelligence Accelerators and Systems — KIM, BRIAN H., 2025
  8. AI Chip Package — Shenzhen Jiutian Ruixin Technology Co., Ltd., 2023
  9. Packaging for Eight-Socket One-Hop SMP Topology — IBM (International Business Machines Corporation), 2015
  10. AI Chip Computation and Communication Fusion Method, Device, and AI Chip — Taichu (Wuxi) Electronic Technology Co., Ltd., 2025
  11. WIPO — World Intellectual Property Organization (patent jurisdiction data)
  12. IEEE — Heterogeneous Integration Roadmap
  13. EPO — European Patent Office (advanced packaging filings)
  14. Semiconductor Industry Association — Interconnect bandwidth scaling research
  15. Nature — Photonic and electronic co-integration for AI accelerator bandwidth

All data and statistics in this article are sourced from the references above and from PatSnap‘s proprietary innovation intelligence platform.
