Computer Vision — Intraoperative Detection & Labeling
GoPro + YOLO pipeline that turns FESS surgical video into an objective per-case efficiency report. Three goals: (1) the intraoperative detection & labeling system, (2) a semantic network / ontology that turns raw detections into meaningful surgical events, and (3) validation — proving the metrics mean what they claim. Part of the Pharyvac Surgical Technologies pipeline and the engine behind ARS DWK grant Aim 1.
Claude’s Role
Act as the research-methods and engineering thinking partner for this project. Concretely:
- Keep every metric honest — distinguish detection accuracy from metric validity (the swap-count artifact is the standing reminder).
- Help design and document the four validation pillars; turn loose ideas into pre-registered, defensible methods.
- Maintain the detection-ontology as the single source of truth for what each detection/event means, and keep the annotation protocol in sync with it.
- Ground all writing in the real numbers (n=16, 92.4%/85.3%, 7 classes, 25.4% unlabelled) — never invent data.
Prime directive: the immediate goal is validation. If a session is drifting into model-tinkering or new features without moving validation forward, nudge me back: “Does this get us closer to a defensible validity claim for Aim 1 — or is it scope creep before the metrics are validated?”
Process
How work flows from raw video to a validated, shareable result:
- Capture — GoPro records the case; video → frames.
- Detect — YOLO emits per-frame instrument labels (7 classes) → CSV detections/.
- Define — the detection-ontology specifies how detections compose into events (bouts, swaps, scope cleaning, pauses) and outcomes.
- Smooth & derive — apply temporal smoothing (1 s min bout), compute outcomes into the Analysis-Ready Data Format schema.
Annotation tool decided (2026-06-12): CVAT, self-hosted Community Edition (free + keeps OR video on controlled infra; annotators on a local network). Setup: cvat-self-host-runbook. CVAT Online is the fallback only if annotators end up remote. See validation-plan P1.
- Validate — run the four pillars in order (validation-plan): annotation quality → ground-truth accuracy → outcome correlation → external/prospective.
- Publish / release — IFAR submission, open-source toolkit + ontology.
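The Detect and Smooth & derive steps above can be sketched as follows. This is a minimal illustration, not the project's implementation: the function names, the instrument labels, the 10 fps frame rate, and the rule of absorbing a too-short bout into the preceding bout are all assumptions. Only the 1 s minimum bout comes from the process description, and the detection-ontology remains the authority on how bouts and swaps are actually defined.

```python
from itertools import groupby


def to_bouts(labels):
    """Run-length encode per-frame labels into [label, n_frames] bouts."""
    return [[label, sum(1 for _ in run)] for label, run in groupby(labels)]


def smooth_bouts(labels, fps, min_bout_s=1.0):
    """Absorb bouts shorter than min_bout_s into the preceding bout.

    Assumed rule (hypothetical): a too-short bout is treated as label
    flicker and its frames are credited to the bout before it. A short
    bout at the very start has no predecessor and is kept as-is.
    """
    min_frames = int(round(min_bout_s * fps))
    smoothed = []
    for label, n in to_bouts(labels):
        if n < min_frames and smoothed:
            smoothed[-1][1] += n          # flicker: extend previous bout
        elif smoothed and smoothed[-1][0] == label:
            smoothed[-1][1] += n          # same instrument resumes: merge
        else:
            smoothed.append([label, n])   # a new bout begins
    return smoothed


def count_swaps(bouts):
    """Count instrument changes between consecutive bouts."""
    return sum(1 for prev, cur in zip(bouts, bouts[1:]) if prev[0] != cur[0])


# Hypothetical 10 fps stream with a 0.3 s flicker in the middle.
labels = (["suction"] * 30 + ["microdebrider"] * 3
          + ["suction"] * 30 + ["microdebrider"] * 20)

raw_swaps = count_swaps(to_bouts(labels))                 # 3
smoothed_swaps = count_swaps(smooth_bouts(labels, fps=10))  # 1
print(raw_swaps, smoothed_swaps)
```

In this toy stream a three-frame flicker adds two swaps to the raw count, which shows how a derived metric can shift while per-frame detection accuracy barely changes. Whether this is the mechanism documented in Swap Count Artifact Analysis is not claimed here.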
Key People
- Jaymarc — PI, rhinologist (solo PI on the ARS DWK grant; takes no salary).
Folder Structure
- README.md — project index / entry point.
- specs/ — definitions & data contracts: validation-plan, detection-ontology, Analysis-Ready Data Format, FESS Cases Clean Dataset.
- research/ — analyses: YOLO Model Improvement Analysis, Swap Count Artifact Analysis, OR Efficiency and Cost Analysis.
- drafts/ — written outputs: grant narrative (+ PDF).
- notes/ — working notes / scratchpad (created as needed).
- CSV detections/ — raw per-case detection CSVs.
- Computer Vision MOC.md — legacy auto-generated folder-index (kept for the os-optimizer system).
Rules & Conventions
- Vault routing — follow Projects/CLAUDE.md: status/overview → README.md; analysis → research/; specs/definitions → specs/; written content → drafts/; working notes → notes/. Add project: computer-vision to frontmatter on every note.
- No (C) prefix — this vault doesn’t use it; mark authorship via frontmatter instead, consistent with existing files.
- Ask before editing existing (non-Claude) notes — the analyses, dataset, and grant narrative are the user’s; propose changes rather than overwriting.
- Wikilinks by basename — files can move between subdirs without breaking [[links]].
- Honesty over polish — report metrics on held-out data, pre-register expected directions, and never present a derived metric as validated until it has cleared its pillar.
Current Status
Last updated: 2026-06-12
Status: Structure built out. Working prototype (n=16); validation plan and ontology drafted; immediate focus is locking operational definitions + standing up the inter-rater annotation study.
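For the inter-rater annotation study, one common way to summarise agreement between two annotators is Cohen's kappa, which corrects raw agreement for agreement expected by chance. The sketch below is an assumption about how that could be computed on per-frame labels; the validation-plan may specify a different statistic or unit of analysis, and the labels shown are placeholders rather than the project's 7 classes.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: fraction of items given the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Placeholder labels: 3 of 4 frames agree, chance agreement is 0.5.
rater_a = ["x", "x", "y", "y"]
rater_b = ["x", "x", "y", "x"]
print(cohens_kappa(rater_a, rater_b))  # 0.5
```

Consecutive video frames are strongly correlated, so a frame-level kappa may overstate the effective sample size. Computing agreement per bout or per event is an alternative the protocol could consider.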