Research Overview

Pervasive Augmented Intelligence

I lead the Pervasive Intelligent Systems Lab, where we design intelligent systems that can sense, learn, and adapt to augment and support human capabilities in healthcare, learning, and work. Our research lies at the intersection of pervasive systems, artificial intelligence, and human-centered sensing. Inspired by the vision of pervasive computing, we explore how computation and intelligence can be seamlessly embedded into the ambient environment and worn on the body to enable continuous, context-aware, and personalized support in everyday life.

We pursue this vision across the full stack, from novel sensing and hardware platforms, to edge and mobile systems, to AI algorithms and trustworthy deployment. The systems we build are designed to be not only capable but also efficient, robust, and trustworthy in real-world use.

Figure: The key pillars of our research.
Figure: Core research areas.

Three Focused Areas

Holistic Sensing

Understanding people through multimodal signals that capture the user, the surrounding environment, and their interaction over time.

Efficient Intelligence

Designing sensing, learning, and system pipelines that work continuously under tight energy, latency, and resource constraints.

Trustworthy Deployment

Building AI systems that are privacy-preserving, secure, and robust enough for real-world human-centered applications.

Representative Themes

Resource-efficient AI for Pervasive Computing Systems

We develop resource-efficient machine learning methods for mobile sensing systems, with a particular focus on wearable eye-tracking platforms for cognitive context sensing. This line of research asks how intelligent sensing systems can remain effective under practical constraints on training data, computation, and deployment, especially when they must adapt to diverse users, devices, and real-world settings.

Resource-efficient Gaze Estimation via Frequency-domain Learning

Figure: Framework of the proposed EfficientGaze.

Modern gaze estimation systems have benefited greatly from deep learning, but this progress often comes at a practical cost: high computational overhead and strong dependence on large-scale labeled gaze data. These challenges make deployment difficult on resource-constrained mobile and wearable platforms, where both inference efficiency and calibration cost matter.

In our EfficientGaze work, we developed a resource-efficient framework for gaze representation learning that addresses both challenges jointly. At the core of the system is a frequency-domain gaze estimation pipeline, which exploits the spectral compaction property of the discrete cosine transform (DCT) to extract informative gaze representations at substantially lower computational cost. In parallel, we introduced a multi-task gaze-aware contrastive learning framework that learns gaze representations in an unsupervised manner, reducing the dependence on expensive manual gaze annotation while improving cross-subject generalization. Extensive evaluation shows that EfficientGaze matches the gaze estimation performance of existing supervised approaches while significantly improving efficiency, enabling up to 6.80× faster system calibration and 1.67× faster gaze estimation. This work demonstrates how principled representation design in the frequency domain can make gaze-based sensing more practical, scalable, and deployable in real-world mobile systems.
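To make the spectral-compaction idea concrete, the sketch below applies a 2-D DCT to an eye patch and keeps only the low-frequency block, where the transform concentrates most of the signal energy. The function name, patch size, and number of retained coefficients are illustrative assumptions, not EfficientGaze's actual pipeline.

```python
import numpy as np
from scipy.fft import dctn

def dct_gaze_features(eye_patch: np.ndarray, keep: int = 8) -> np.ndarray:
    """Compress an eye-image patch into a compact frequency-domain feature.

    The 2-D DCT concentrates most of the patch's energy into the
    low-frequency (top-left) coefficients, so retaining a small
    keep x keep block preserves the informative structure at a
    fraction of the original dimensionality.
    """
    coeffs = dctn(eye_patch.astype(np.float64), norm="ortho")
    return coeffs[:keep, :keep].ravel()

patch = np.random.rand(64, 64)        # stand-in for a normalized eye crop
feat = dct_gaze_features(patch, keep=8)
print(feat.shape)                     # (64,) features vs. 4096 raw pixels
```

A downstream gaze regressor then operates on these 64 coefficients instead of the full pixel grid, which is where the computational savings come from.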

Project Repository · Publication at ACM TOSN

Efficient High-frequency Eye Tracking with Event Cameras

Figure: Comparison between the conventional frame-based and the proposed hybrid gaze estimation approaches.

Conventional eye trackers are typically constrained by the fixed frame rate and bandwidth of CCD/CMOS cameras, and pushing these systems to very high sampling frequencies incurs substantial sensing and downstream computational cost. In our EV-Eye work, we rethink eye tracking from an efficiency perspective by leveraging event cameras, which emit signals only when pixel-level intensity changes occur. This event-driven sensing principle lets the system focus computation on informative eye motion rather than repeatedly processing redundant full frames, making high-frequency tracking far more practical for resource-constrained wearable platforms.

Building on this idea, we developed a hybrid frame-event pipeline that combines low-rate near-eye grayscale images for robust pupil segmentation with asynchronous event streams for lightweight pupil updates. Rather than operating at a fixed maximum rate, the system updates adaptively based on how quickly informative events accumulate: it speeds up automatically during rapid eye movements such as saccades and slows down when the eye is relatively stable. This design enables pupil tracking at up to 38.4 kHz while making better use of limited computation, energy, and bandwidth.
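The adaptive-rate behavior can be sketched as an accumulate-and-fire loop: events are buffered, and a lightweight pupil update fires only once enough of them have arrived. The class name, threshold, and update stub below are assumptions for illustration, not the EV-Eye implementation.

```python
from dataclasses import dataclass, field

@dataclass
class EventDrivenPupilTracker:
    """Toy sketch: the effective update rate adapts to eye motion,
    because updates fire only when enough events have accumulated."""
    threshold: int = 100                 # events required before an update
    updates: int = 0
    _buffer: list = field(default_factory=list)

    def feed(self, events) -> bool:
        """Accumulate asynchronous events; return True when an update ran."""
        self._buffer.extend(events)
        if len(self._buffer) >= self.threshold:
            self._refine_pupil(self._buffer)   # lightweight update step
            self._buffer.clear()
            self.updates += 1
            return True
        return False

    def _refine_pupil(self, events):
        # Placeholder for shifting the last frame-based pupil
        # segmentation using the geometry of the buffered events.
        pass

tracker = EventDrivenPupilTracker(threshold=100)
# A saccade produces a dense burst of events, so an update fires...
tracker.feed([(x, x, 0.0, 1) for x in range(250)])
# ...while a stable eye produces few events and no update.
tracker.feed([(1, 1, 0.1, 1)] * 10)
print(tracker.updates)   # → 1
```

During saccades the buffer fills in microseconds and updates fire at a high rate; during stable fixation the same threshold yields far fewer updates, which is how computation tracks informative motion.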

Project Repository · Publication at NeurIPS

Graph-based Few-shot Cognitive Context Sensing

Figure: Demonstration of GazeGraph on a wearable eye-tracking platform for cognitive context sensing.

Eye movements are tightly coupled with cognitive processes, making them a powerful sensing modality for inferring a person’s psychological and cognitive state. Yet, building robust gaze-based cognitive sensing systems is challenging because eye-movement data is highly heterogeneous across individuals, visual stimuli, and eye-tracking devices.

In our GazeGraph work, we developed a generalized deep learning framework that robustly recognizes a user’s cognitive context under such heterogeneity and rapidly adapts to unseen sensing scenarios with minimal training data. Leveraging graph-based modeling, we represent eye-movement trajectories as spatio-temporal gaze graphs, enabling deep models to jointly capture the structural and temporal dynamics of gaze behavior. On top of this representation, we introduced a few-shot graph learning module based on meta-learning, allowing fast and data-efficient adaptation to new users and new eye trackers. The system was demonstrated on the Magic Leap AR headset, showing how eye-movement-based cognitive inference can support context-aware immersive applications.
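As a minimal illustration of the graph-based representation, the sketch below turns a fixation sequence into a graph whose edges capture temporal succession and spatial proximity. The function, the proximity radius, and the edge rules are simplifying assumptions, not the exact GazeGraph construction.

```python
import numpy as np

def build_gaze_graph(fixations: np.ndarray, radius: float = 0.1):
    """Build a simple spatio-temporal graph from a gaze trajectory.

    `fixations` is an (N, 2) array of normalized fixation coordinates.
    Nodes are fixations; temporal edges link consecutive fixations,
    and spatial edges link fixations within `radius` of each other.
    """
    n = len(fixations)
    edges = set()
    for i in range(n - 1):                       # temporal succession
        edges.add((i, i + 1))
    for i in range(n):                           # spatial proximity
        for j in range(i + 1, n):
            if np.linalg.norm(fixations[i] - fixations[j]) <= radius:
                edges.add((i, j))
    return sorted(edges)

fix = np.array([[0.1, 0.1], [0.12, 0.11], [0.8, 0.8]])
print(build_gaze_graph(fix))   # → [(0, 1), (1, 2)]
```

A graph neural network trained on such graphs can then reason jointly over where the eyes dwelled and in what order, which is the structural-plus-temporal coupling the representation is meant to expose.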

Project Repository · Publication at ACM SenSys

Psychology-inspired Generative Model for Eye Movement Synthesis

Figure: EyeSyn synthesizes realistic eye-movement signals across diverse cognitive tasks and sensing setups.

Gaze-based sensing is further hindered by the lack of large, diverse eye-movement datasets for training. Conventional data augmentation approaches, such as GAN-based generation, themselves require large heterogeneous gaze datasets for training, which are rarely available in practice.

In our EyeSyn work, we addressed this bottleneck by introducing a psychology-inspired gaze synthesis framework that eliminates the need for large-scale human gaze collection. Grounded in cognitive and behavioral science, EyeSyn synthesizes the gaze signals that an eye tracker would record during specific cognitive activities. Unlike conventional data collection, which is tied to particular lab setups, stimulus designs, and subject pools, EyeSyn can systematically simulate a wide range of eye-tracking configurations, including viewing distances, stimulus sizes, and sampling frequencies. EyeSyn faithfully reproduces characteristic human gaze patterns and captures diversity arising from different sensing setups and individual variability. It provides a practical tool for scaling, stress-testing, and improving gaze-based sensing systems when real annotated data is limited.
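To give a flavor of model-based gaze synthesis, the toy sketch below generates a fixation–saccade trace by drawing fixation durations from a gamma distribution and jumping the gaze position between fixations, with micro-jitter during each dwell. All distributions, parameters, and names here are illustrative assumptions; the actual EyeSyn models are grounded in cognitive and behavioral science and are task-specific.

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed for reproducibility

def synthesize_scanpath(n_fixations=10, fs=100, mean_fix_ms=250.0):
    """Synthesize a toy fixation-saccade gaze trace in normalized
    screen coordinates, sampled at `fs` Hz."""
    samples = []
    pos = rng.uniform(0.2, 0.8, size=2)              # start near the center
    for _ in range(n_fixations):
        # Fixation duration drawn from a gamma distribution (ms -> s).
        dur = rng.gamma(shape=4.0, scale=mean_fix_ms / 4.0) / 1000.0
        n = max(1, int(dur * fs))
        jitter = rng.normal(0.0, 0.002, size=(n, 2))  # fixational micro-noise
        samples.append(pos + jitter)
        # Saccade: jump to a new landing point, clipped to the stimulus.
        pos = np.clip(pos + rng.normal(0.0, 0.2, size=2), 0.0, 1.0)
    return np.vstack(samples)                         # (total_samples, 2)

trace = synthesize_scanpath()
print(trace.shape)   # (total_samples, 2)
```

Sweeping parameters such as sampling rate, stimulus extent, or dwell-time statistics mirrors, in miniature, how a synthesis framework can cover eye-tracking configurations that would be costly to collect from human subjects.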

Project Repository · Publication at ACM/IEEE IPSN