Ziyu Zhong
📍 Lund, Sweden
✉️ ziyu.zhong@eit.lth.se
LinkedIn
About
I am currently a doctoral student in Networked and Extended Reality (XR) at Lund University.
My research lies at the intersection of:
- Multimodal machine learning (pose, video, IMU/mocap),
- Real-time XR/AR systems (CloudXR, edge-assisted rendering),
- Latency-aware prediction and inference optimization.
Before my PhD, I completed an M.Sc. in Computer Science at The University of Hong Kong and a B.A. in Digital Media Development at York University.
I also worked as a software engineer at Tencent (QQ, PCG) and ByteDance (Video Architecture), building large-scale multimedia and graphics-related systems.
I enjoy building end-to-end systems that go from data collection and calibration, through ML modeling and optimization, to interactive prototypes in XR, AR, and mobile environments.
Research Interests
- Vision / XR foundation models and representation learning
- XR/VR/AR pose tracking and motion prediction
- Latency-aware inference and edge/cloud offloading
Education
Ph.D. in Networked & Extended Reality (XR), Lund University, Sweden
2023 – Present
M.Sc. in Computer Science (Multimedia Computing), The University of Hong Kong, Hong Kong
2020 – 2022
B.A. in Digital Media Development, York University, Canada
2016 – 2020
Projects and Publications
Conference Papers
- [VRST] Predictability-Aware Motion Prediction for Edge XR via High-Order Error-State Kalman Filtering
ACM Symposium on Virtual Reality Software and Technology (VRST), ACM, 2025.
Authors: Ziyu Zhong, …
Summary: Latency-aware motion prediction for distributed / cloud XR rendering, compensating motion-to-photon latency using SLAM-based and ML-based pose prediction.
DOI: 10.1145/3756884.3765973
Real-Time Translation System (M.Sc. Thesis)
- Designed a full real-time simultaneous translation pipeline: speech-to-text, neural translation (Transformer-based), and low-latency delivery to mobile clients.
- Built a production-style cloud service and an Android demo app to showcase the end-to-end system.
- Focused on balancing latency, QoE, and model complexity, giving me practical experience in deploying ML systems in real-world conditions.
GPU-Accelerated Max-Flow for Real-Time Workloads
- Implemented a CUDA-based max-flow solver, offloading CPU bottlenecks to the GPU.
- Achieved ≈80% reduction in runtime compared to the CPU baseline.
- Used NVIDIA Nsight tools to analyze warp divergence, memory throughput, and kernel-level bottlenecks.
- This work informs my approach to efficient inference and acceleration.
Industry Experience
Software Engineer — QQ (PCG)
Tencent, Shenzhen, China · Dec 2020 – Feb 2023
- Maintained and evolved a large-scale Android codebase for QQ (PCG), working across legacy modules and modern Kotlin/Jetpack components.
- Owned end-to-end features in high-traffic UI flows, coordinating changes across app, rendering, and service layers to ensure stability and backward compatibility.
- Refactored critical paths (Java/Kotlin/C++) to reduce ANRs and jank, profiling with Systrace/Perfetto and optimizing RecyclerView diffing, threading, and I/O.
- Integrated and hardened cross-platform UI via the PTS2 2D engine (C++), including a DSL parser and shared rendering/layout primitives for Android/iOS.
- Improved build and release reliability with Gradle/CMake optimizations, modularization, and CI pipelines; monitored performance and crashes via Grafana and internal metrics.
- Collaborated with multi-team stakeholders (graphics, backend, QA) to triage device-specific issues, validate fixes across diverse Android OEMs, and ship weekly releases at scale.
Software Engineer Intern — Video Architecture
ByteDance, Beijing, China · Jun 2020 – Aug 2020
- Gained deep familiarity with the cloud gaming/cloud phone codebase, tracing end-to-end data paths across capture, encode, transport, and render pipelines.
- Designed and validated signaling flows (session setup/teardown, device–cloud state sync, QoS/bitrate adaptation) using ByteRTC/WebRTC-like mechanisms, including NAT traversal, ICE/STUN/TURN, and custom control channels.
- Improved reliability and latency by profiling message queues and transport layers (UDP/TCP), optimizing handshake, heartbeat, and reconnect logic for heterogeneous devices.
- Implemented device–cloud state synchronization (Java/C++ with ByteRTC) for low-latency streaming.
- Worked on a cross-platform AirPlay streaming SDK (Android/Windows), focusing on multimedia performance and compatibility.
Technical Skills
- Machine Learning: PyTorch, TensorFlow; temporal models; basic GPU acceleration.
- Programming: Python, C/C++, Java, Kotlin, Rust, CUDA, Shell, CMake.
- Systems & Tools: Git, Docker, Kubernetes, Slurm, GDB, TSan, Wireshark, Grafana, NVIDIA Nsight Systems/Compute.
- Platforms: Android, Linux, macOS, Windows; Raspberry Pi, Arduino.
- Web & Data: HTML/CSS/JavaScript, SQL, DB2.