PyTorch Day China 2025
In-person | 2025 June 7

The Sched app allows you to build your schedule but is not a substitute for your event registration. You must be registered to participate in the sessions. If you have not registered but would like to join us, please visit the BAAI Conference webpage.

Please note: This schedule is automatically displayed in China Standard Time (UTC+08:00). To see the schedule in your preferred timezone, please select from the drop-down located at the bottom of the menu to the right.

IMPORTANT NOTE: Timing of sessions and room locations are subject to change.
Saturday June 7, 2025 16:20 - 16:40 CST
Due to their sparse nature, Mixture-of-Experts (MoE) models are particularly well-suited for hybrid CPU/GPU inference, especially in low-concurrency scenarios. This hybrid approach leverages the large, cost-effective memory capacity of CPU/DRAM and the high bandwidth of GPU/VRAM.
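To make the sparsity argument concrete, here is a minimal toy sketch of top-k expert routing. It is not KTransformers code: the expert count, the top-k value, and the helper names `top_k_route` and `active_expert_fraction` are hypothetical and chosen only for illustration.

```python
import math
import random


def top_k_route(gate_logits, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    top = sorted(range(len(gate_logits)),
                 key=lambda i: gate_logits[i], reverse=True)[:k]
    # Softmax over the selected logits only, so the k weights sum to 1.
    m = max(gate_logits[i] for i in top)
    exps = [math.exp(gate_logits[i] - m) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


def active_expert_fraction(num_experts, k):
    """Fraction of expert weights that one token actually reads."""
    return k / num_experts


# Hypothetical configuration (assumption, not taken from the talk).
NUM_EXPERTS = 64
TOP_K = 4

random.seed(0)
# Stand-in for the router's per-expert scores for a single token.
logits = [random.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
routes = top_k_route(logits, TOP_K)

print(routes)
print(active_expert_fraction(NUM_EXPERTS, TOP_K))  # 0.0625
```

Under these assumed numbers, each token touches only 1/16 of the expert weights. That is the property that makes it plausible to hold the bulk of the parameters in large, inexpensive DRAM while reserving VRAM for the parts of the model that every token uses.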

In this talk, we introduce KTransformers, a high-performance inference system specifically designed for efficient heterogeneous computing of diverse MoE models. KTransformers employs AMX-optimized kernels that fully harness the computational power of modern CPUs and integrates an asynchronous CPU–GPU task scheduling mechanism that significantly reduces overhead. As a result, it achieves 4.62–19.74× speedups in prefilling and 1.25–4.09× speedups in decoding compared to existing systems.

This greatly enhances the accessibility of large MoE models for local users who prioritize security or wish to explore model internals. Consequently, KTransformers has already seen widespread adoption in both the open-source community and industry.
Speakers
Dr. Mingxing Zhang

Assistant Professor, Tsinghua University
Dr. Mingxing Zhang, Assistant Professor at Tsinghua University, focuses on memory systems research. He is the co-founder of the open-source projects Mooncake and KTransformers. His work has been published in over thirty papers at top international conferences and journals, including...
TBA
