CataCompDetect: Intraoperative Complication Detection in Cataract Surgery

Abstract

Cataract surgery is one of the most commonly performed procedures worldwide, with over 26 million surgeries performed annually. Despite standardized techniques, intraoperative complications may still occur due to surgeon variability and patient-related factors. Such complications pose significant risks to visual outcomes and, in severe cases, may cause permanent vision loss. We propose CataCompDetect, a complication detection framework that combines complication-specific risk scoring with vision-language reasoning for classification. The framework first identifies relevant surgical phases via expert-derived priors, then performs risk assessment to identify video segments exhibiting anatomical cues suggestive of complications, and finally applies a vision-language model for classification. To evaluate CataCompDetect, we introduce CataComp-104, the first cataract surgery video dataset annotated for intraoperative complications, comprising 104 surgeries with 44 complication cases across three clinically significant types: iris prolapse, posterior capsule rupture (PCR), and vitreous loss. CataCompDetect achieves an average F1 score of 60.05%, with per-complication F1 scores of 75.00% (iris prolapse), 57.14% (PCR), and 48.00% (vitreous loss).

Target Complications

CataCompDetect targets three clinically significant intraoperative complications in manual small-incision cataract surgery (MSICS).

Iris Prolapse

Protrusion of iris tissue through the corneal incision, appearing as dark brownish-red tissue extending beyond the incision margin. Caused by fluctuations in intraocular pressure or incision instability.

F1: 75.00%

Posterior Capsule Rupture (PCR)

A rupture in the thin posterior capsule supporting the lens, typically occurring during lens removal or cortical wash. Appears as a visible tear in the capsule. Can lead to vitreous loss if unmanaged.

F1: 57.14%

Vitreous Loss

Occurs when the gel-like vitreous prolapses into the anterior chamber, often following PCR. Appears as translucent, web-like strands through the pupil, causing characteristic tear-drop pupil distortion.

F1: 48.00%

Complication Examples

Representative frames from CataComp-104 illustrating each intraoperative complication.

Iris Prolapse

Dark brownish-red tissue protruding beyond the incision margin

Posterior Capsule Rupture

Visible tear in the posterior capsule during cortical wash

Vitreous Loss

Translucent web-like vitreous strands with tear-drop pupil distortion

Method: CataCompDetect

A three-stage pipeline integrating surgical phase-aware localization, anatomical risk scoring, and vision-language classification.

1. Phase Localization

Surgical phase predictions from MS-TCN++ restrict the analysis to the phases where each complication is most likely: cortical wash for PCR and vitreous loss, and the entire video for iris prolapse.
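The phase gating described above can be sketched as a lookup from complication to relevant phases, followed by grouping consecutive frames into ranges. This is a minimal sketch, not the authors' implementation: the phase label strings, the `RELEVANT_PHASES` mapping, and the function `candidate_ranges` are hypothetical names, and per-frame phase labels are assumed to be available already (for example from MS-TCN++).

```python
from itertools import groupby

# Phases searched per complication, following the text above.
# None means "analyse the whole video". The label "cortical_wash"
# is an assumed spelling, not necessarily the one used in the dataset.
RELEVANT_PHASES = {
    "iris_prolapse": None,
    "pcr": {"cortical_wash"},
    "vitreous_loss": {"cortical_wash"},
}


def candidate_ranges(frame_phases, complication):
    """Return inclusive (start, end) frame ranges to analyse for a complication."""
    phases = RELEVANT_PHASES[complication]
    if phases is None:
        return [(0, len(frame_phases) - 1)] if frame_phases else []

    ranges, index = [], 0
    # groupby yields runs of consecutive frames that are all inside
    # (keep=True) or all outside (keep=False) the relevant phases.
    for keep, run in groupby(frame_phases, key=lambda p: p in phases):
        length = len(list(run))
        if keep:
            ranges.append((index, index + length - 1))
        index += length
    return ranges


# Toy per-frame phase predictions (illustrative labels only).
phases = ["incision"] * 3 + ["cortical_wash"] * 4 + ["lens_insertion"] * 2 + ["cortical_wash"]
print(candidate_ranges(phases, "pcr"))
print(candidate_ranges(phases, "iris_prolapse"))
```

Returning ranges rather than individual frame indices keeps the later sliding-window step from spanning two separate occurrences of the same phase.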

2. Anatomical Risk Scoring

Per-frame risk scores are computed from iris and pupil segmentation masks produced by SAM 2 and TernausNet. A temporal sliding window then identifies high-risk segments.
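One way to turn per-frame risk scores into high-risk segments is sketched below. The source does not give the window length, the stride, the aggregation, or how overlapping windows are handled, so the mean aggregation, the default `window` and `stride`, and the greedy non-overlap rule are all assumptions; `top_risk_segments` is a hypothetical name. The default `k=5` follows the top-5 selection mentioned for the classification stage.

```python
def top_risk_segments(scores, window=30, stride=15, k=5):
    """Return up to k non-overlapping (start, end, mean_score) windows, highest first.

    scores: per-frame risk scores for one video (or one candidate phase range).
    window, stride: assumed values in frames; the source does not specify them.
    """
    if len(scores) < window:
        return []

    # Score every window by the mean of its per-frame risk scores.
    windows = []
    for start in range(0, len(scores) - window + 1, stride):
        chunk = scores[start:start + window]
        windows.append((start, start + window - 1, sum(chunk) / window))

    # Greedily keep the best windows that do not overlap an already kept one.
    windows.sort(key=lambda w: w[2], reverse=True)
    picked = []
    for start, end, mean in windows:
        if all(end < s or start > e for s, e, _ in picked):
            picked.append((start, end, mean))
        if len(picked) == k:
            break
    return picked


# Toy example: a short burst of high risk in an otherwise quiet signal.
scores = [0.0] * 10 + [1.0] * 4 + [0.0] * 10
print(top_risk_segments(scores, window=4, stride=2, k=2))
```

The greedy non-overlap rule is a design choice for this sketch: it prevents the top segments from all covering the same moment, which would waste the limited number of segments passed on for verification.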

3. VLM Classification

Top-5 high-risk segments per video are verified by GPT-5 with complication-specific few-shot prompts describing precise clinical visual indicators.

Risk Scoring Modules

Iris Prolapse

SAM 2 proposes candidate masks at the iris periphery, which are then filtered by size and color thresholds. The area of the largest mask that passes both filters serves as the risk score.
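The filtering step can be illustrated as follows. This is a sketch under stated assumptions: candidate masks are assumed to be summarized already by their pixel area and mean color, the threshold values are placeholders (the source gives none), and the "dark brownish-red" test is a simple heuristic chosen for illustration. `CandidateMask`, `is_dark_brownish_red`, and `iris_prolapse_risk` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class CandidateMask:
    area: int        # mask size in pixels
    mean_rgb: tuple  # average colour inside the mask, 0-255 per channel


def is_dark_brownish_red(rgb, max_brightness=140):
    """Heuristic colour test (assumed): red-dominant and not too bright."""
    r, g, b = rgb
    return r > g >= b and (r + g + b) / 3 <= max_brightness


def iris_prolapse_risk(masks, min_area=200, max_area=20000):
    """Area of the largest mask passing the size and colour filters, else 0.

    min_area and max_area are placeholder thresholds, not values from the source.
    """
    valid = [
        m.area
        for m in masks
        if min_area <= m.area <= max_area and is_dark_brownish_red(m.mean_rgb)
    ]
    return max(valid, default=0)


masks = [
    CandidateMask(150, (120, 60, 40)),     # right colour, too small
    CandidateMask(900, (120, 60, 40)),     # passes both filters
    CandidateMask(5000, (230, 230, 230)),  # bright specular region, wrong colour
    CandidateMask(400, (110, 70, 50)),     # passes, but smaller than 900
]
print(iris_prolapse_risk(masks))
```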

PCR

Histogram equalization followed by edge detection is applied inside the pupil mask. The risk score is the length of the longest edge, measured as its bounding-box diagonal and normalized by pupil area.
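The scoring part of this module might look like the sketch below. It assumes the binary edge map has already been produced by histogram equalization and an edge detector, and it treats each 8-connected group of edge pixels as one "edge", which is an interpretation rather than something the source states. The function name `pcr_risk` and the list-of-lists image representation are illustrative choices.

```python
import math
from collections import deque


def pcr_risk(edge_map, pupil_mask):
    """Longest edge (bounding-box diagonal of a connected edge component) / pupil area.

    edge_map, pupil_mask: equally sized 2D lists of 0/1 values.
    Only edge pixels inside the pupil mask are considered.
    """
    rows, cols = len(edge_map), len(edge_map[0])
    pupil_area = sum(1 for row in pupil_mask for v in row if v)
    if pupil_area == 0:
        return 0.0

    seen = [[False] * cols for _ in range(rows)]
    longest = 0.0
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or not (edge_map[r][c] and pupil_mask[r][c]):
                continue
            # Breadth-first flood fill over one 8-connected edge component,
            # tracking its bounding box as we go.
            queue = deque([(r, c)])
            seen[r][c] = True
            r0 = r1 = r
            c0 = c1 = c
            while queue:
                y, x = queue.popleft()
                r0, r1 = min(r0, y), max(r1, y)
                c0, c1 = min(c0, x), max(c1, x)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and not seen[ny][nx]
                                and edge_map[ny][nx] and pupil_mask[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            # Bounding-box diagonal in pixels (extents counted inclusively).
            longest = max(longest, math.hypot(r1 - r0 + 1, c1 - c0 + 1))
    return longest / pupil_area


# Toy 5x5 example: a 4-pixel diagonal edge plus one isolated edge pixel.
pupil = [[1] * 5 for _ in range(5)]
edges = [[0] * 5 for _ in range(5)]
for i in range(4):
    edges[i][i] = 1
edges[4][0] = 1
print(pcr_risk(edges, pupil))
```

Dividing a length by an area follows the description above literally; the score is therefore only comparable across frames with a similar image scale.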

Vitreous Loss

The pupil boundary is partitioned into angular sectors. The risk score is the ratio of the maximum sector radius to the global mean radius, which captures localized wedge-shaped deformation.
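A small sketch of this ratio is given below. It assumes the pupil boundary is available as a list of (x, y) points, measures radii from the centroid of those points, and takes each sector's radius to be the mean radius of the points falling in it. The number of sectors and the function name `vitreous_loss_risk` are illustrative; the source specifies neither.

```python
import math


def vitreous_loss_risk(boundary, n_sectors=12):
    """Max sector mean radius divided by the global mean radius.

    boundary: list of (x, y) pupil-boundary points.
    n_sectors: assumed number of angular sectors.
    """
    cx = sum(x for x, _ in boundary) / len(boundary)
    cy = sum(y for _, y in boundary) / len(boundary)

    sectors = [[] for _ in range(n_sectors)]
    radii = []
    for x, y in boundary:
        radius = math.hypot(x - cx, y - cy)
        angle = math.atan2(y - cy, x - cx) % (2 * math.pi)
        # Clamp guards against angle rounding up to exactly 2*pi.
        index = min(int(angle / (2 * math.pi) * n_sectors), n_sectors - 1)
        sectors[index].append(radius)
        radii.append(radius)

    global_mean = sum(radii) / len(radii)
    if global_mean == 0:
        return 0.0
    sector_means = [sum(s) / len(s) for s in sectors if s]
    return max(sector_means) / global_mean


def polar_shape(radius_at):
    """Build boundary points from a radius function of the angle in degrees."""
    return [
        (radius_at(d) * math.cos(math.radians(d)), radius_at(d) * math.sin(math.radians(d)))
        for d in range(360)
    ]


round_pupil = polar_shape(lambda d: 10)
peaked_pupil = polar_shape(lambda d: 15 if d < 30 else 10)  # one stretched wedge
print(vitreous_loss_risk(round_pupil), vitreous_loss_risk(peaked_pupil))
```

A round pupil gives a ratio close to 1, while a pupil pulled toward one side gives a clearly larger value, which is the behavior the wedge-shape description calls for.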

CataComp-104 Dataset

The first cataract surgery video dataset annotated for intraoperative complications. Collected under routine surgical conditions using a smartphone-mounted microscope (1920×1080 px, 30 fps).

MSICS Videos: 104

Total Duration: 43h 49m

Complication Cases: 44

Expert Annotators: 2

Complication	Train Videos	Train Avg. Duration	Val Videos	Val Avg. Duration
None	32	18m 10s ± 7m 51s	27	16m 41s ± 6m 10s
Iris Prolapse	11	25m 38s ± 21m 47s	14	31m 34s ± 24m 56s
PCR	13	50m 16s ± 23m 34s	11	37m 07s ± 14m 44s
Vitreous Loss	13	50m 16s ± 23m 34s	12	35m 59s ± 14m 35s
Total	53	26m 03s ± 19m 15s	51	24m 47s ± 17m 23s

Results

Per-complication and average detection performance on CataComp-104 (validation split). Best results per complication in bold.

Complication · Method	Accuracy	Sensitivity	Specificity	F1 Score
Iris Prolapse
Random	50.00%	50.00%	50.00%	35.44%
I3D Baseline*	70.20%	27.14%	86.49%	25.10%
VideoMAEv2 Baseline*	61.96%	28.57%	74.59%	29.22%
DINOv3 Baseline*	68.24%	14.29%	88.65%	18.18%
Risk-scoring only	37.25%	100.00%	13.51%	46.67%
VLM-only (GPT-5)	78.43%	28.57%	97.30%	42.10%
CataCompDetect (GPT-5)	90.20%	78.57%	94.59%	81.48%
PCR
Random	50.00%	50.00%	50.00%	30.14%
I3D Baseline*	77.65%	10.91%	96.00%	16.27%
VideoMAEv2 Baseline*	77.65%	9.09%	96.50%	14.12%
DINOv3 Baseline*	77.65%	18.18%	94.00%	23.46%
Risk-scoring only	70.59%	63.64%	72.50%	48.28%
VLM-only (GPT-5)	43.14%	81.82%	32.50%	38.30%
CataCompDetect (GPT-5)	78.43%	45.45%	87.50%	47.62%
Vitreous Loss
Random	50.00%	50.00%	50.00%	32.00%
I3D Baseline*	77.25%	18.33%	95.38%	26.97%
VideoMAEv2 Baseline*	75.69%	8.33%	96.41%	13.19%
DINOv3 Baseline*	75.69%	16.67%	93.85%	22.13%
Risk-scoring only	66.67%	75.00%	64.10%	51.43%
VLM-only (GPT-5)	29.41%	100.00%	7.69%	40.00%
CataCompDetect (GPT-5)	76.47%	58.33%	82.05%	53.85%
Average
Random	50.00%	50.00%	50.00%	32.53%
I3D Baseline*	75.03%	18.79%	92.62%	22.78%
VideoMAEv2 Baseline*	71.77%	15.33%	89.17%	18.84%
DINOv3 Baseline*	73.86%	16.38%	92.17%	21.26%
Risk-scoring only	58.17%	79.55%	50.04%	48.79%
VLM-only (GPT-5)	50.33%	70.13%	45.83%	40.13%
CataCompDetect (GPT-5)	81.70%	60.78%	88.05%	60.98%

* Evaluated as feature-probing baselines: multi-layer perceptron classifiers are trained on top of frozen backbone features across 5 random seeds.
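For readers who want to relate the four reported metrics to each other, the sketch below computes them from confusion counts. The counts used in the example (TP=11, FP=2, FN=3, TN=35) are not given in the source; they are inferred here from the CataCompDetect iris prolapse row together with the 14 iris prolapse videos among the 51 validation videos. The second function assumes that the Random rows correspond to a coin-flip predictor, whose expected precision equals the class prevalence and whose recall is 0.5; this appears to match the reported values but is an assumption. Both function names are hypothetical.

```python
def detection_metrics(tp, fp, fn, tn):
    """Return (accuracy, sensitivity, specificity, f1) from confusion counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if tp + fp + fn else 0.0
    return accuracy, sensitivity, specificity, f1


def random_guess_f1(positives, total):
    """Expected-value F1 of a coin-flip predictor (assumed baseline definition)."""
    precision = positives / total  # expected precision equals prevalence
    recall = 0.5                   # half of the positives are flagged on average
    return 2 * precision * recall / (precision + recall)


# Inferred counts for CataCompDetect on iris prolapse (validation split).
for name, value in zip(
    ("accuracy", "sensitivity", "specificity", "f1"),
    detection_metrics(tp=11, fp=2, fn=3, tn=35),
):
    print(f"{name}: {100 * value:.2f}%")

# Positives per complication in the validation split, out of 51 videos.
for name, positives in (("iris prolapse", 14), ("PCR", 11), ("vitreous loss", 12)):
    print(f"random F1, {name}: {100 * random_guess_f1(positives, 51):.2f}%")
```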
