The mathematical core: how WBC structures the control problem
Whole-body control treats the humanoid robot as a single kinematically and dynamically coupled system — not a collection of independent subsystems. It defines a compound task of operational-space objectives (center-of-mass trajectory, foot placement, end-effector pose) and resolves all tasks simultaneously against a constraint set encoding contact forces, joint torque bounds, and friction cone membership. This architecture was formally specified in a foundational patent from the Board of Regents, University of Texas System (2016), which defines a robot model that computes kinematic and dynamic properties, binds controller parameters to transport layers for external process access, and instantiates the WBC using the compound task and constraint definitions.
The mathematical substrate of WBC is inverse dynamics in the operational space. Cartesian-space acceleration commands are converted to joint accelerations by inverting the robot’s differential forward kinematics (the Jacobian relation between joint-space and task-space motion), and ground reaction forces at multiple contact points are optimised to satisfy the Newton-Euler equations of motion. A patent from Shenzhen Weiteshi Technology (2018) describes this precisely: the whole-body dynamic controller computes acceleration commands in the operational space, converts them to joint accelerations via differential forward kinematics, and optimises contact reaction forces for the under-actuated robot dynamics. Critically, the system supports simultaneous computation of position- and time-parameterised outputs, enabling generation of multiple walking patterns at speeds compatible with real-time control.
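As a minimal sketch of the acceleration mapping described above, the relation x_ddot = J(q) q_ddot + J_dot(q, q_dot) q_dot can be solved for the joint accelerations. The example assumes a fixed-base planar two-link arm as a stand-in for the humanoid; the function names and link lengths are illustrative and are not taken from the patent. A real whole-body controller would use a floating-base model with a redundant Jacobian and a pseudo-inverse or optimisation-based solve.

```python
import math

def jacobian(q, l1=1.0, l2=1.0):
    """End-effector Jacobian of a planar two-link arm (hypothetical stand-in model)."""
    s1, c1 = math.sin(q[0]), math.cos(q[0])
    s12, c12 = math.sin(q[0] + q[1]), math.cos(q[0] + q[1])
    return [[-l1 * s1 - l2 * s12, -l2 * s12],
            [l1 * c1 + l2 * c12, l2 * c12]]

def jdot_qdot(q, qd, l1=1.0, l2=1.0):
    """Velocity-dependent term J_dot * q_dot of the task-space acceleration."""
    s1, c1 = math.sin(q[0]), math.cos(q[0])
    s12, c12 = math.sin(q[0] + q[1]), math.cos(q[0] + q[1])
    w12 = qd[0] + qd[1]
    return [-l1 * c1 * qd[0] ** 2 - l2 * c12 * w12 ** 2,
            -l1 * s1 * qd[0] ** 2 - l2 * s12 * w12 ** 2]

def joint_accel(q, qd, xdd_cmd):
    """Solve J * qdd = xdd_cmd - J_dot * qd for qdd (2x2 case, Cramer's rule)."""
    J = jacobian(q)
    b = jdot_qdot(q, qd)
    r0, r1 = xdd_cmd[0] - b[0], xdd_cmd[1] - b[1]
    det = J[0][0] * J[1][1] - J[0][1] * J[1][0]
    if abs(det) < 1e-9:
        raise ValueError("Jacobian is singular at this configuration")
    return [(r0 * J[1][1] - J[0][1] * r1) / det,
            (J[0][0] * r1 - r0 * J[1][0]) / det]
```

The explicit singularity check matters in practice: near a straight-arm configuration the plain inverse blows up, which is one reason real controllers prefer damped or constrained solves.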
The single rigid body dynamics (SRBD) model simplifies the humanoid by treating leg links as massless connectors and concentrating total body mass at the floating base. Translational dynamics follow Newton’s second law across a finite set of contact points; rotational dynamics follow the Euler equation with the Coriolis/centrifugal term neglected at low speeds. This yields a compact state-space representation amenable to both MPC and learning-based optimisation, and it is the dominant simplification enabling real-time WBC execution, as detailed using the Unitree H1 humanoid by Electronic Science and Technology University Zhongshan Campus (2025).
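The simplified dynamics described above can be sketched in a few lines. This is an illustrative implementation under stated assumptions, not the patent's code: the inertia is taken as diagonal and expressed in the world frame, contacts are point contacts, and the velocity-dependent rotational term is dropped as the text describes. All names are hypothetical.

```python
def cross(a, b):
    """3D cross product a x b."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def srbd_accelerations(mass, inertia_diag, com, contacts, forces,
                       gravity=(0.0, 0.0, -9.81)):
    """Return (linear_acc, angular_acc) of a single rigid body.

    Linear:  m * a       = sum_i f_i + m * g
    Angular: I * alpha  ~= sum_i (p_i - com) x f_i   (velocity term neglected)
    """
    net_f = [mass * g for g in gravity]
    net_m = [0.0, 0.0, 0.0]
    for p, f in zip(contacts, forces):
        arm = tuple(pi - ci for pi, ci in zip(p, com))
        moment = cross(arm, f)
        for k in range(3):
            net_f[k] += f[k]
            net_m[k] += moment[k]
    lin = tuple(fk / mass for fk in net_f)
    ang = tuple(mk / ik for mk, ik in zip(net_m, inertia_diag))
    return lin, ang
```

With two feet placed symmetrically under the body and each carrying half the weight, both accelerations come out as zero, which is a quick sanity check for any SRBD implementation.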
Priority-ordered task hierarchies are central to WBC’s ability to satisfy multiple conflicting objectives simultaneously. A 2025 patent from Beijing Institute of Technology describes a tracking control module that employs a full rigid-body dynamics model, resolves joint torques through inverse dynamics, implements motion tracking via WBC, and uses hierarchical optimisation to prioritise motion tasks over contact force tasks — a canonical formulation in the field. The state estimation module supplies all current state quantities required by both the upstream MPC and downstream WBC layers, illustrating the standard two-level architecture.
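One classical way to realise strict task priority is null-space projection, sketched below for two scalar tasks. This is a swapped-in textbook technique used for illustration; the patent describes hierarchical optimisation and may well use a different formulation, such as a cascade of quadratic programs. The function and variable names are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def prioritized_two_tasks(j1, x1, j2, x2):
    """Solve two scalar tasks j1.dq = x1 (high priority) and j2.dq = x2 (low priority).

    j1 and j2 are row Jacobians (lists of floats), and j1 must be non-zero.
    The secondary task acts only in the null space of the primary one,
    so it can never disturb the primary task.
    """
    s1 = dot(j1, j1)
    # Minimum-norm solution of the primary task.
    dq = [x1 * a / s1 for a in j1]
    # Remove from j2 its component along j1 (projection into the null space of j1).
    k = dot(j2, j1) / s1
    j2n = [b - k * a for a, b in zip(j1, j2)]
    s2 = dot(j2n, j2n)
    if s2 < 1e-12:
        # The tasks conflict completely; the lower-priority task is dropped.
        return dq
    residual = x2 - dot(j2, dq)
    return [d + residual * b / s2 for d, b in zip(dq, j2n)]
```

When the two tasks are compatible both are met exactly; when they conflict, only the lower-priority one is sacrificed.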
Whole-body control (WBC) in humanoid robots formulates locomotion as a prioritised constrained optimisation problem over joint torques and contact forces, treating the entire robot as a single kinematically and dynamically coupled system rather than decomposing it into independent subsystems.
Hybrid architectures: pairing WBC with MPC and trajectory optimisation
The most widely deployed state-of-practice for model-based humanoid locomotion pairs WBC with model predictive control in a two-layer architecture: MPC plans contact sequences and center-of-mass trajectories over a receding horizon, while WBC resolves instantaneous joint torques accounting for the robot’s full dynamic model at the current time step. This separation of concerns is documented across multiple 2024–2025 filings from Chinese universities and technology companies.
Huazhong University of Science and Technology (2025) introduces a full-body kinematics–centroidal dynamics model that generates complete walking trajectories including upper-body joint motion in real time. The MPC model unifies trajectory generation and online tracking control, with optimal state and control input passed as inputs to the WBC layer. The centroidal dynamics formulation explicitly includes linear momentum and angular momentum at the center of mass, contact forces at each contact point, and the cross-product moment arm from the CoM to each contact site — enabling arm and waist dynamics to contribute to whole-body balance rather than being treated as passive appendages.
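The centroidal dynamics described above can be written compactly. The notation here is an assumption made for illustration and is not taken from the filing: l and k are the linear and angular momentum at the CoM, r is the CoM position, p_i and f_i are the position and force at contact i, n_c is the number of contacts, m is total mass, and g is gravity. Point contacts without contact torques are assumed.

```latex
\begin{aligned}
\dot{\mathbf{l}} &= \sum_{i=1}^{n_c} \mathbf{f}_i + m\,\mathbf{g}, \\
\dot{\mathbf{k}} &= \sum_{i=1}^{n_c} (\mathbf{p}_i - \mathbf{r}) \times \mathbf{f}_i .
\end{aligned}
```

Because k aggregates the angular momentum of every link, arm and waist motion enters the balance equations directly instead of appearing as an unmodelled disturbance.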
“Prior WBC approaches fail to include upper-body joint motion in the planning stage, leaving WBC to compensate inadequately after the fact — yielding locomotion with appended arm motion rather than genuinely whole-body coordinated walking.”
Zhejiang University (2025) constructs a nonlinear model predictive control (NMPC) problem that simultaneously plans CoM state, arm end-effector state, foot state, and control inputs. Equality and inequality constraints governing whole-body coordinated motion are designed into the optimisation, including task trajectory constraints on arm end-effectors and momentum conservation constraints that leverage arm motion to assist locomotion balance. An inverse dynamics layer then computes feed-forward joint torques, combined with joint-level PD controllers for closed-loop tracking.
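The final step, combining feed-forward torques with joint-level PD feedback, follows a standard control law that can be sketched as below. The gains and names are illustrative assumptions; the filing does not specify them.

```python
def joint_torque_command(tau_ff, q_des, q, qd_des, qd, kp, kd):
    """Per-joint command: tau = tau_ff + kp * (q_des - q) + kd * (qd_des - qd).

    tau_ff comes from the inverse dynamics layer; the PD terms close the loop
    on joint position and velocity tracking error.
    """
    return [ff + p * (a - b) + d * (c - e)
            for ff, a, b, c, e, p, d in zip(tau_ff, q_des, q, qd_des, qd, kp, kd)]
```

The feed-forward term carries most of the load when the model is accurate, so the PD gains can stay modest and only correct residual error.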
In MPC+WBC two-layer humanoid locomotion architectures, model predictive control plans contact sequences and center-of-mass trajectories over a receding horizon, while whole-body control resolves instantaneous joint torques from those reference trajectories using the robot’s full dynamic model at the current time step.
Tsinghua University (2024) introduces collision-aware WBC (CA-WBC) for legged robots, implemented as a weighted quadratic program (WQP). When a stance leg unexpectedly fails to contact the ground (a “late collision” scenario), the WBC switches objectives and constraints according to a decision tree, adding discrete collision model terms to the optimisation. The authors explicitly identify that computational resources on real robots are constrained, motivating the WQP implementation over more expensive second-order cone program (SOCP) formulations.
Beijing Xiaomi Robot Technology (2024) addresses the real-time execution challenge through offline precomputation: full dynamic model parameters are discretised and indexed by key joint angles, and grid interpolation at runtime recovers per-joint dynamics parameters to generate WBC joint torque commands. This strategy explicitly reduces computational load while maintaining full-dynamics accuracy — a practical engineering trade-off that recurs across several recent filings. Yibajwu Robotics (Suzhou) (2026) extends this further with a hierarchical dynamics optimisation model that fuses real-time joint signals with historical optimisation data and updates model weights continuously based on deviation tracing information.
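The precompute-then-interpolate idea can be sketched in one dimension. This example makes simplifying assumptions that are not in the filing: a single key joint angle indexes the table, the stored quantity is the gravity torque of a hypothetical one-link pendulum, and interpolation is linear. The real method indexes full dynamic model parameters over several key joint angles.

```python
import bisect
import math

def build_table(fn, lo, hi, n):
    """Offline step: evaluate an expensive dynamics term on a uniform grid."""
    grid = [lo + (hi - lo) * i / (n - 1) for i in range(n)]
    return grid, [fn(q) for q in grid]

def lookup(grid, values, q):
    """Runtime step: linear interpolation between the two nearest grid points."""
    q = min(max(q, grid[0]), grid[-1])          # clamp to the table range
    i = min(bisect.bisect_right(grid, q) - 1, len(grid) - 2)
    t = (q - grid[i]) / (grid[i + 1] - grid[i])
    return values[i] + t * (values[i + 1] - values[i])

def gravity_torque(q, mass=1.0, length=0.5, g=9.81):
    """Hypothetical 'expensive' term: gravity torque of a one-link pendulum."""
    return mass * g * length * math.sin(q)

GRID, VALUES = build_table(gravity_torque, -math.pi, math.pi, 101)
```

A call such as `lookup(GRID, VALUES, 0.7)` then replaces the model evaluation at runtime. Grid density sets the trade-off: a finer grid costs memory but shrinks the interpolation error.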
Reinforcement learning integration: learning WBC-compatible locomotion policies
Classical WBC, while theoretically rigorous, requires accurate dynamic models and struggles with robustness on unstructured terrain. The patent corpus documents a strong trend toward incorporating reinforcement learning to address these limitations — either by learning policies that output targets consumed by downstream WBC layers, or by replacing portions of the WBC stack with learned neural networks. iFLYTEK (2025) explicitly catalogs classical locomotion algorithms — Zero Moment Point (ZMP), Model Predictive Control (MPC), and Whole Body Control (WBC) — and identifies their shared limitation: mathematical model simplifications that yield stable behaviour only in specific environments and fail on unstructured terrain.
Centroidal momentum, a compact whole-body dynamic descriptor, is traditionally computed on CPU using libraries such as Pinocchio, creating a data transfer bottleneck when RL training runs on GPU. Hangzhou Dianzi University (2026) resolves this by constructing CasADi symbolic expressions for joint-space inertia tensors and centroidal momentum, compiling these to CUDA kernels, and executing them in parallel across thousands of simulation environments on GPU — enabling reward functions that supervise whole-body coordination rather than just joint positions or CoM velocity.
Electronic Science and Technology University Zhongshan Campus (2025) trains a fully-connected residual network offline to approximate MPC-optimal ground reaction forces from CoM state inputs, with a priority-based loss function that enforces friction cone and reaction force bound constraints. During online deployment, the network replaces the MPC solver, providing efficient inference. The residual network outputs are fused with joint angles, angular velocities, and prior actions, then processed through a reinforcement learning policy trained with biomechanically-referenced trajectories from public human motion datasets. A symmetric loss constraint on lower-limb motion improves convergence speed. The final policy constitutes a whole-body control strategy incorporating both model-based prior knowledge and human motion references.
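A constraint-aware loss term of the kind described above might look like the following sketch. The filing does not give the loss in this form: the quadratic hinge penalties, the weights, and the reading of "priority" as a larger weight on the friction cone term are all assumptions made for illustration.

```python
import math

def constraint_penalty(force, mu, fz_max, w_cone=10.0, w_bound=1.0):
    """Penalty on one predicted ground reaction force (fx, fy, fz).

    Friction cone:  sqrt(fx^2 + fy^2) <= mu * fz
    Force bounds:   0 <= fz <= fz_max
    Each violation is penalised quadratically; a feasible force costs zero.
    """
    fx, fy, fz = force
    cone_violation = max(0.0, math.hypot(fx, fy) - mu * fz)
    below = max(0.0, -fz)
    above = max(0.0, fz - fz_max)
    return w_cone * cone_violation ** 2 + w_bound * (below ** 2 + above ** 2)
```

In training, a term like this would be added to the regression error against the MPC-optimal forces, so the network is pushed toward physically admissible outputs as well as accurate ones.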
iFLYTEK (2025) identifies WBC, ZMP, and MPC as classical locomotion algorithms that share a common limitation: mathematical model simplifications that yield stable behaviour only in specific environments and fail on unstructured terrain, motivating reinforcement learning integration to achieve terrain robustness.
Shenzhen Weiteshi Technology (2018) proposes an integrated framework combining a phase-space planning (PSP) framework, a robust reinforcement learning process exploiting PSP’s inherent directional walking constraints, and a whole-body dynamic controller. The RL process operates over simplified models derived from PSP, while the WBC layer handles full-body dynamics in the operational space. The framework generates multiple walking patterns and achieves real-time computational speed, addressing the longstanding gap in 3D full-body humanoid dynamic walking under RL guidance — as recognised by standards bodies including IEEE in robotics control research.
For biped robots specifically, Zhejiang University of Technology (2026) demonstrates that running and jumping can be learned without complex demonstration data by designing reference trajectories for leg joint pitch angles and body pose, then using RL with a two-stage training protocol (base policy first, then strict optimisation) with Proximal Policy Optimisation (PPO) and asymmetric Actor-Critic networks. Domain randomisation enhances sim-to-real transfer stability. This approach aligns with broader trends in robot learning documented by Nature and corroborates methodological standards from WIPO’s global patent data on autonomous systems.
Full-body coordination: arms, waist, and upper-body dynamics
A distinguishing challenge of humanoid WBC versus quadruped WBC is the need to coordinate upper-body degrees of freedom — arms, waist, and torso — with lower-body locomotion. Several 2025 patents explicitly address this problem, and the consensus finding is clear: arm and upper-body dynamics must be incorporated at the planning stage, not left to WBC post-hoc correction.
Huazhong University of Science and Technology (2025) argues that prior WBC approaches fail to include upper-body joint motion in the planning stage, leaving WBC to compensate inadequately after the fact. By incorporating arm and waist dynamics directly into the centroidal dynamics planning model — including nc contact points with three-dimensional contact forces and full centroidal momentum (linear plus angular) as state variables — the method achieves genuinely whole-body coordinated walking. Zhejiang University (2025) reinforces this with momentum conservation constraints in the NMPC problem that leverage arm motion to assist locomotion balance.
Zhiyuan Innovation (Shanghai) Technology (2025) addresses the specific challenge that vigorous upper-limb motions destabilise the lower body. A prediction model trained on dynamically-augmented upper-limb trajectory samples learns to map upper-limb motion information and robot state to lower-limb joint angles, ensuring that even in cases of violent arm motion the lower body can maintain balance. Dynamic augmentation during training varies joint masses, control coefficients, angles, and velocities, creating a robust learned correspondence between upper and lower body dynamics.
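The dynamic augmentation step can be sketched as a function that perturbs one training sample. The field names, perturbation ranges, and uniform sampling are hypothetical; the filing states only that joint masses, control coefficients, angles, and velocities are varied.

```python
import random

def augment_sample(sample, rng, mass_scale=(0.8, 1.2), gain_scale=(0.9, 1.1),
                   angle_noise=0.02, velocity_noise=0.1):
    """Return a perturbed copy of one upper-limb trajectory sample.

    Masses and control gains are scaled multiplicatively; angles and
    velocities receive additive uniform noise. The input is left unchanged.
    """
    return {
        "joint_mass": [m * rng.uniform(*mass_scale) for m in sample["joint_mass"]],
        "kp": [k * rng.uniform(*gain_scale) for k in sample["kp"]],
        "angles": [a + rng.uniform(-angle_noise, angle_noise)
                   for a in sample["angles"]],
        "velocities": [v + rng.uniform(-velocity_noise, velocity_noise)
                       for v in sample["velocities"]],
    }
```

Passing a seeded `random.Random` instance keeps the augmented dataset reproducible across training runs.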
Humanoid whole-body control requires arm and upper-body dynamics to be incorporated at the planning stage — not compensated post-hoc in the WBC layer — to achieve genuinely coordinated locomotion, as demonstrated by Huazhong University of Science and Technology and Zhejiang University in 2025 patents using centroidal dynamics planning models with momentum conservation constraints.
Xiaopeng Motors (2025) introduces a hybrid full-body control approach that separately governs the lower limbs and waist via torque commands from an RL model masked to the relevant joints, and the upper limbs via joint position commands derived from reference motion data, then concatenates both command sets into a unified whole-body control instruction. This hybrid strategy explicitly separates the locomotion-stability-critical lower body from the manipulation-oriented upper body while maintaining a coherent whole-body command interface — a pragmatic decomposition that avoids the full computational cost of jointly optimising all degrees of freedom at every timestep.
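The masking and concatenation step can be sketched as follows. The joint names, the dictionary-based command format, and the mode labels are hypothetical; the filing specifies only that lower-limb and waist joints receive RL torque commands, upper-limb joints receive reference position commands, and the two sets are merged into one instruction.

```python
def compose_whole_body_command(joint_names, torque_joints, rl_torques, ref_positions):
    """Merge two command sources into one per-joint instruction.

    Joints listed in torque_joints (lower limbs and waist) take the RL torque;
    every other joint (upper limbs) takes the reference-motion position.
    """
    command = {}
    for name in joint_names:
        if name in torque_joints:
            command[name] = ("torque", rl_torques[name])
        else:
            command[name] = ("position", ref_positions[name])
    return command
```

Keeping a single ordered joint list as the interface means downstream drivers see one coherent command, regardless of which source produced each entry.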
Tencent Technology (2025) extends WBC beyond pure locomotion to non-prehensile manipulation on a wheel-legged platform, incorporating passivity-based control (IDA-PBC) for a nonlinear sphere-balancing task into the whole-body control architecture. The base control information for maintaining sphere balance is computed via passive control of the nonlinear coupled robot-sphere system, then fed into the WBC layer that determines per-joint torques — demonstrating WBC’s extensibility to loco-manipulation tasks requiring simultaneous balance and object control. This generalisation aligns with research directions tracked by IEEE in robotics and automation.
Key assignees and the patent landscape
Based on frequency and technical depth of relevant filings across the 50+ patent corpus, the following assignees constitute the dominant innovation clusters in whole-body control for humanoid locomotion. The geographic distribution spans China, the United States, Europe, and Japan, with Chinese universities and technology companies accounting for the majority of 2024–2026 filings.
The University of Texas System (Board of Regents) holds a foundational WO patent (2016) that defines the software architecture for binding WBC to transport layers and external processes — a framework-level contribution underpinning many subsequent implementations. SoftBank Robotics Europe applies linear predictive position and velocity controllers with WBC-compatible multi-state supervisors to humanoid robots with omnidirectional wheel bases (2017, 2019), targeting commercial deployment of whole-body balance in humanoid service robots. Beijing Institute of Technology contributes slip-aware WBC for wheel-legged robots (2025), extending WBC to hybrid wheel-leg platforms operating on slippery or irregular terrain — a platform class also addressed by several Tencent filings.
The geographic concentration of recent filings in China reflects broader trends in humanoid robotics investment documented by international bodies including WIPO in its Global Innovation Index. The technical depth of university filings, from Zhejiang University’s NMPC formulations to Hangzhou Dianzi University’s CUDA-accelerated centroidal dynamics, indicates that academic institutions remain primary contributors to foundational WBC methodology, while technology companies such as Tencent and Beijing Xiaomi Robot Technology focus on deployable implementations with explicit computational constraints.