How WLFO packaging works: structure and RDL mechanics
Wafer-level fan-out (WLFO) packaging encapsulates and redistributes chips at the wafer scale—either without separating dies from the wafer, or by rearranging individual dies into a reconstituted wafer form—before final packaging and singulation. As established by STS Semiconductor & Telecommunications in their 2017 foundational patent, “packaging proceeds on the whole chips without separating the chips from the wafer or the individual chips are rearranged in a wafer form and then packaging proceeds.” A redistribution layer (RDL) is formed on a passivation layer, electrically connecting the chip I/O pattern to external contacts and enabling signal fan-out beyond the chip’s physical footprint without a conventional organic or ceramic substrate.
The fundamental distinction from traditional packages is that I/O can be extended laterally beyond the die edge without a conventional substrate—a property that directly enables the chiplet-based architectures now prevalent in AI accelerators. This is why, according to WIPO patent filings spanning the US, CN, EP, WO, and KR jurisdictions, WLFO has become the dominant subject of advanced packaging innovation since 2021.
An RDL is a metal wiring layer formed on the chip surface and encapsulant, routing electrical signals from the die’s native I/O pad locations to new, spatially expanded contact positions. In WLFO, the RDL replaces the conventional package substrate, enabling sub-2 µm pitch interconnect versus the 100–150 µm pitch of flip-chip BGA solder bumps.
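To give a feel for the scale of that pitch gap, the arithmetic can be sketched as a count of parallel interconnects per millimetre. This is a rough illustration, not a design rule: it treats both figures as simple one-dimensional pitches, even though RDL trace pitch and solder-bump pitch are not the same kind of feature, and the helper name `tracks_per_mm` is hypothetical.

```python
def tracks_per_mm(pitch_um: float) -> float:
    """Parallel interconnects that fit across 1 mm of edge at a given pitch."""
    if pitch_um <= 0:
        raise ValueError("pitch must be positive")
    return 1000.0 / pitch_um  # 1 mm = 1000 µm


# Figures quoted in the text; 2.0 µm is the upper bound of "sub-2 µm".
RDL_PITCH_UM = 2.0
C4_PITCH_RANGE_UM = (100.0, 150.0)

rdl = tracks_per_mm(RDL_PITCH_UM)
c4_fine = tracks_per_mm(C4_PITCH_RANGE_UM[0])    # 100 µm bumps
c4_coarse = tracks_per_mm(C4_PITCH_RANGE_UM[1])  # 150 µm bumps

print(f"RDL: {rdl:.0f} per mm")
print(f"C4 bumps: {c4_coarse:.1f} to {c4_fine:.1f} per mm")
print(f"Linear density gain: {rdl / c4_fine:.0f}x to {rdl / c4_coarse:.0f}x")
```

Under these assumptions the linear density advantage is roughly 50x to 75x; real layouts will differ once line/space rules, via landing pads, and keep-out zones are accounted for.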
Package warpage is the primary manufacturing challenge in large-format WLFO. It arises from mismatches in the thermal and mechanical properties of the molding compound, embedded chips, and structural reinforcement layers—and is one of the primary yield limiters for AI accelerator packages. JCET Advanced Packaging’s 2023 patent addresses this directly: a sheet-like structural layer with defined apertures houses each chip, and the ratio of the aperture region to the solid structural region is controlled within (0.5–2):1. Because the Young’s modulus of the plastic packaging layer is lower than that of the sheet-like structure, the composite system achieves “remarkably reduced package warpage and better package strength.”
In wafer-level fan-out packaging, JCET Advanced Packaging controls the ratio of the aperture region to the solid structural region within (0.5–2):1 in its sheet-like structural layer to achieve reduced package warpage and improved package strength, addressing the primary yield risk in large-format WLFO production.
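The (0.5–2):1 window lends itself to a simple layout check. The sketch below assumes the ratio is taken between total aperture area and total solid area of the sheet-like layer; the patent summary above does not spell out how the regions are measured, so that interpretation, the function name, and the example areas are all illustrative.

```python
def aperture_ratio_check(aperture_area_mm2: float,
                         solid_area_mm2: float,
                         low: float = 0.5,
                         high: float = 2.0) -> tuple[float, bool]:
    """Return (ratio, in_window) for an aperture-to-solid area ratio.

    The default window is the (0.5-2):1 range quoted in the text.
    Interpreting the ratio as an area ratio is an assumption.
    """
    if aperture_area_mm2 < 0 or solid_area_mm2 <= 0:
        raise ValueError("areas must be non-negative, solid area non-zero")
    ratio = aperture_area_mm2 / solid_area_mm2
    return ratio, low <= ratio <= high


# Hypothetical layouts, not taken from the patent.
for name, aperture, solid in [("balanced", 1200.0, 900.0),
                              ("too open", 3000.0, 1000.0),
                              ("too solid", 200.0, 1000.0)]:
    ratio, ok = aperture_ratio_check(aperture, solid)
    print(f"{name}: ratio {ratio:.2f}:1 -> {'within' if ok else 'outside'} window")
```

A check like this would only screen geometry; whether a given ratio actually controls warpage also depends on the modulus and thermal-expansion mismatch described above.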
Thermal management is embedded directly into the fan-out stack in designs targeting high-power AI chips. The Guangdong Academy of Sciences Semiconductor Institute’s 2021 patent integrates a heat spreader into the fan-out architecture: the chip’s back face is bonded via adhesive to the heat spreader’s die-attach region, the encapsulant fills gaps between the heat spreader and the temporary chip protection material, and the package wiring is grown directly on the chip’s front face, the encapsulant surface, and the heat spreader’s protruding structures. The result is a fan-out architecture explicitly engineered for “high power density chips,” with improved operating power and reduced power consumption. Both properties are directly relevant to AI processors with thermal design power (TDP) values of 300–700 W and above.
Multi-die WLFO is enabled by side-by-side die arrangement and fine-pitch redistribution. MediaTek’s 2016 EP patent demonstrates a wafer-level package in which multiple dies are arranged adjacently, with connection paths linking I/O pads on adjacent die sides—enabling die-to-die communication through the RDL rather than through the package substrate or PCB. This is foundational to chiplet-based AI processors, where logic, memory, and I/O tiles must communicate at high bandwidth and low latency within a single package boundary.
3D stacking and heterogeneous integration for AI workloads
Three-dimensional die stacking compounds the advantages of WLFO with vertical integration of compute and memory dies, directly addressing the memory-bandwidth constraints that define AI training and inference performance. Kepler Computing’s 2025 CN patent describes an IC package in which a first die containing DRAM is placed below a second die containing compute logic, coupled via micro-bumps in a wafer-to-wafer bonding configuration.
“Placing the compute die below the DRAM die leads to thermal problems for the compute die, because any heat sink is closer to the DRAM and farther from the compute die.”
Kepler’s patent argues that positioning the memory die below the compute die—with TSV density aligned to die-to-die I/O count—resolves both bandwidth and thermal constraints simultaneously, achieving “ultra-high bandwidth AI processing systems.” This stacking orientation places the heat sink in proximity to the compute die, which generates the majority of thermal load, while TSV density is matched to the full area-array I/O count rather than perimeter-limited contacts.
Kepler Computing’s 2025 patent on 3D integrated ultra-high bandwidth memory positions the DRAM die below the compute die in a wafer-to-wafer bonding configuration, with TSV density aligned to die-to-die I/O count, to achieve area-array rather than perimeter-limited memory bandwidth for AI processing systems.
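The gap between perimeter-limited and area-array connectivity can be sketched with a simple pad count on a square die. This is an idealised model: it assumes a uniform pitch, a single ring of perimeter pads, and a full grid for the area array. The die size and pitch below are hypothetical and do not come from the Kepler patent.

```python
def pads_per_side(die_side_um: int, pitch_um: int) -> int:
    """Whole number of pads that fit along one die edge at the given pitch."""
    if die_side_um <= 0 or pitch_um <= 0:
        raise ValueError("dimensions must be positive")
    return die_side_um // pitch_um


def perimeter_io(n: int) -> int:
    """Pads in a single ring around the die edge (corner pads counted once)."""
    return n if n <= 1 else 4 * n - 4


def area_array_io(n: int) -> int:
    """Pads in a full n x n grid across the die face."""
    return n * n


# Hypothetical example: 10 mm square die, 50 µm connection pitch.
n = pads_per_side(10_000, 50)
print(f"pads per side:  {n}")
print(f"perimeter ring: {perimeter_io(n)}")
print(f"area array:     {area_array_io(n)}")
```

The perimeter count grows linearly with die edge length while the area-array count grows with its square, which is the scaling argument behind matching TSV density to the full die-to-die I/O count.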
Panasonic IP Management’s 2022 CN patent further refines this architecture by stacking multiple memory dies and multiple compute dies on a system chip, with each die type maintaining its own layout pattern optimized independently. An interposer can be optionally incorporated—enabling memory and compute dies to be redesigned independently without redesigning the entire system chip, thereby increasing configuration flexibility while enabling cost reduction when the interposer is omitted.
At the system level, a 2025 WO patent filing proposes systems-on-a-wafer (SoWs) in which multiple chips are mounted directly on semiconductor wafers, interconnected via wiring layers and power modules, with inter-wafer connectors linking SoWs together. This represents the logical extension of WLFO philosophy: eliminating the PCB substrate entirely in favor of wafer-native interconnect at the system level. The patent explicitly critiques the conventional PCB-integrated server format, noting it “imposes limitations on thermal efficiency, power efficiency, and bandwidth availability, and may result in a relatively large physical footprint.” Standards bodies including IEEE have documented similar interconnect scaling challenges in heterogeneous integration roadmaps.
Where flip-chip BGA hits its ceiling for AI chips
Flip-chip BGA connects die I/O pads via solder bumps (C4 bumps) directly to an organic or ceramic package substrate at pitches of 100–150 µm, which then fans out to a ball grid array for PCB attachment. For AI workloads, the architecture’s perimeter-limited I/O is its fundamental constraint: placing HBM peripherally in a flip-chip configuration “does not solve the bandwidth problem because bandwidth is constrained by peripheral I/O from each side of the HBM and compute die,” as Kepler Computing’s patent explicitly states.
In flip-chip BGA configurations, placing High Bandwidth Memory (HBM) on the sides of a compute die does not resolve the memory bandwidth problem for AI chips, because bandwidth remains constrained by the peripheral I/O count from each side of the HBM and compute die—a structural limitation that wafer-level fan-out packaging with area-array TSV interconnect overcomes.
IBM’s 2015 CN patent on eight-socket one-hop SMP topology illustrates a second ceiling: standard socket-based packaging using land grid array (LGA) connectors cannot scale to eight-processor, single-hop topologies within a single plane. The document states that “it is impossible to use packaging technology described in the prior art to achieve high bandwidth, eight-node, one-hop systems,” and instead resorts to multi-plane vertical stacking via LGA connectors and rewiring cards. This multi-plane workaround demonstrates the bandwidth and density ceiling that conventional substrate-based packaging, FC-BGA included, imposes on AI compute systems requiring multi-chip coherency.
The interconnect bottleneck extends beyond the package itself. Taichu (Wuxi) Electronic Technology’s 2025 CN patent identifies PCIe-based chip-to-chip interconnect—the external interface most commonly associated with FC-BGA-packaged AI accelerators—as “fundamentally bandwidth-limited and non-scalable,” noting that “AI chip ultra-large-scale parallel computing capability is growing increasingly strong, and the traditional method of inter-chip data transfer via PCIe can no longer meet the bandwidth demands of AI computation.” This motivates the integration of routing and network interface logic directly onto the AI chip, effectively bypassing the FC-BGA package boundary as a bandwidth constraint. Research published by Nature on photonic and electronic co-integration similarly identifies off-chip bandwidth as a primary bottleneck for AI accelerator scaling.
For small to medium AI inference chips with well-understood I/O requirements and mature thermal envelopes, FC-BGA remains cost-competitive. At the low end, SiP integration of AI, ASIC, and MEMS chips can even be accomplished via wire bonding on a PCB substrate, an approach suited to edge AI in wearables and audio applications. For large-scale training accelerators, however, the bandwidth, latency, and density requirements effectively mandate advanced packaging, including WLFO and 3D stacking.
The Semiconductor Industry Association has documented the growing gap between transistor scaling and interconnect bandwidth scaling—a gap that advanced packaging formats like WLFO are specifically designed to close. FC-BGA’s manufacturing infrastructure is mature and cost-effective for single-die, high-volume designs, but its organic substrate becomes a thermal bottleneck and routing density limiter as die sizes and I/O counts grow toward the requirements of frontier AI models.
Head-to-head: WLFO vs. flip-chip BGA across key attributes
The following comparison draws directly from the patent data analyzed, mapping structural, thermal, bandwidth, and manufacturing attributes across both packaging approaches.
| Attribute | Wafer-Level Fan-Out (WLFO) | Flip-Chip BGA (FC-BGA) |
|---|---|---|
| Interconnect density | RDL-based, sub-2 µm pitch achievable | Solder bump (C4) pitch typically 100–150 µm |
| Die-to-die bandwidth | High: direct RDL routing between dies | Low-medium: routed through package substrate |
| Package thickness | Thin: no interposer or substrate required | Thicker: organic/ceramic substrate required |
| Warpage risk | High without structural control layers | Lower: substrate provides mechanical rigidity |
| Thermal performance | Integrated heat spreader possible | Heat slug bonded above chip; substrate is thermal bottleneck |
| Scalability for chiplets | Native multi-die support via side-by-side RDL | Limited by substrate routing density |
| Manufacturing complexity | High: reconstituted wafer, RDL, molding | Moderate: mature infrastructure |
| Cost profile | Higher NRE; competitive at scale | Lower for single-die, high-volume designs |
The warpage limitation of WLFO is addressed by JCET’s sheet-like structural layer with controlled aperture-to-solid ratios of (0.5–2):1. The thermal limitation is addressed by Guangdong Academy of Sciences’ heat-spreader-embedded fan-out structure. FC-BGA’s bandwidth ceiling manifests most acutely in AI workloads because neural network inference and training are memory-bandwidth-bound: a vertically stacked WLFO configuration aligns TSV density directly with die-to-die I/O count, achieving area-array rather than perimeter-limited bandwidth.
Key patent assignees and innovation clusters
The patent data identifies four distinct clusters of assignees driving WLFO and advanced AI chip packaging innovation, each addressing a different layer of the technology stack.
Packaging foundries and OSATs
JCET Advanced Packaging—China’s largest OSAT—is active in WLFO structural innovations specifically targeting warpage control, a critical production yield issue. STS Semiconductor & Telecommunications (Korea) holds foundational WLFO process patents covering RDL formation, passivation layering, and wafer-scale singulation. Guangdong Academy of Sciences Semiconductor Institute is pursuing thermally enhanced fan-out structures with integrated heat spreaders for high-power-density applications. Patent activity from these foundry-side assignees reflects the manufacturing challenge of transitioning WLFO from a niche advanced packaging format to a volume production process suitable for AI accelerator demand. EPO filings in this cluster have grown substantially since 2021, reflecting the technology’s maturation across European jurisdictions.
Fabless AI chip and system designers
Panasonic IP Management is patenting scalable AI chip architectures using die-stacking with optional interposers—enabling memory and compute dies to be redesigned independently without redesigning the entire system chip. Kepler Computing is specifically addressing the compute-over-memory 3D die-stacking configuration for ultra-high bandwidth AI inference and training. Taichu (Wuxi) Electronic Technology is integrating network switching logic directly onto AI chips to eliminate external interconnect bottlenecks associated with PCIe-based FC-BGA systems.
Multi-die and chiplet system architects
MediaTek holds key patents on side-by-side multi-die wafer-level packages with die-to-die RDL interconnect—a foundational technique now widely adopted for AI accelerator chiplet integration. The 2025 WO filing on wafer-level heterogeneous integration for AI accelerators extends the WLFO concept to full system-on-wafer architectures for AI server infrastructure, eliminating PCBs and conventional packages entirely in favor of wafer-native interconnect at the system level.
Legacy HPC packaging specialists
IBM’s 2015 CN patent documents the architectural ceiling of conventional substrate-based packaging for multi-processor AI/HPC topologies. It states that high-bandwidth, eight-node, one-hop systems are impossible to achieve with prior-art packaging technology, which motivates vertical LGA stacking as a bridging technology toward the wafer-scale integration approaches now emerging.
A clear trend across 2021–2025 patent filings from JCET Advanced Packaging, Kepler Computing, Panasonic IP Management, MediaTek, and Taichu (Wuxi) Electronic Technology is the convergence of thermal management, ultra-high-bandwidth memory integration, and wafer-scale interconnect as co-equal design constraints for AI chip packaging—none of which can be adequately addressed by conventional flip-chip BGA alone.