Where are they looking in the operating room?

· Source: Computer Vision and Pattern Recognition · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Medical Computer Vision · Depth: Expert, quick

Summary

Researchers have introduced gaze-following, a computer vision task that infers where individuals are looking, into the operating room (OR). The goal is to improve surgical workflow analysis by using gaze to understand clinical roles, surgical phases, and team communication. The authors extended the 4D-OR dataset with gaze-following annotations, and the Team-OR dataset with both gaze-following annotations and new team communication activity annotations. They developed two approaches: a gaze heatmap-based method for role and phase recognition, and a self-supervised spatial-temporal model for team communication detection. The approach achieved F1 scores of 0.92 for clinical role prediction and 0.95 for surgical phase recognition, and it outperformed existing baselines on team communication detection by over 30%.
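For readers less familiar with the metric, the F1 score is the harmonic mean of precision and recall. The sketch below computes it from raw counts; the counts are hypothetical, and this summary does not say how the study averaged F1 across classes (for example macro or micro), so treat it as an illustration of the metric only.

```python
def f1_score(tp, fp, fn):
    """F1 from true positives, false positives, and false negatives."""
    if tp == 0:
        # No correct positive predictions: precision and recall are both 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for one class, not taken from the study.
score = f1_score(tp=8, fp=2, fn=2)
```

Because it is a harmonic mean, F1 is high only when both precision and recall are high.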

Key takeaway

For computer vision engineers building systems for surgical environments, gaze-following models can add useful signal about workflow, clinical roles, and team communication. In the study, the approach reached F1 scores of 0.92 for role prediction and 0.95 for phase recognition, and outperformed existing baselines on team communication detection by over 30%. Consider extending existing surgical datasets with gaze annotations to apply this approach.

Key insights

Gaze-following in the OR supports surgical workflow analysis: gaze information helps recognize clinical roles and surgical phases and detect team communication.

Method

A gaze heatmap-based approach predicts clinical roles and surgical phases. A self-supervised spatial-temporal model detects team communication using gaze-based clip features.
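To make the idea of a gaze heatmap concrete, here is a minimal sketch of one common way to represent a gaze target as a 2D Gaussian heatmap. This summary does not describe the study's architecture, so this is not the authors' implementation. The function names `gaze_heatmap` and `peak_location`, the grid size, and `sigma` are all assumptions chosen for illustration.

```python
import math


def gaze_heatmap(gaze_xy, size=64, sigma=3.0):
    """Render a Gaussian heatmap centred on a gaze target.

    gaze_xy: (x, y) in normalized image coordinates, each in [0, 1].
    size:    side length of the square heatmap (hypothetical default).
    sigma:   Gaussian standard deviation in pixels (hypothetical default).
    Returns a size x size list of lists with its maximum scaled to 1.0.
    """
    cx = gaze_xy[0] * (size - 1)
    cy = gaze_xy[1] * (size - 1)
    heat = [
        [
            math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2.0 * sigma ** 2))
            for x in range(size)
        ]
        for y in range(size)
    ]
    peak = max(max(row) for row in heat)
    return [[v / peak for v in row] for row in heat]


def peak_location(heat):
    """Return the (x, y) pixel with the highest heatmap value."""
    best = max((v, x, y) for y, row in enumerate(heat) for x, v in enumerate(row))
    return best[1], best[2]


# Hypothetical gaze target at 25% of the width and 75% of the height.
heat = gaze_heatmap((0.25, 0.75))
target_px = peak_location(heat)
```

A downstream classifier for roles or phases could consume such heatmaps as input features, though how the study combines them with other inputs is not stated here.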

Best for: AI Scientist, Computer Vision Engineer, Research Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by Computer Vision and Pattern Recognition.