Computer Vision — Intraoperative Detection & Labeling

GoPro + YOLO pipeline that turns FESS surgical video into an objective per-case efficiency report. Three goals: (1) the intraoperative detection & labeling system, (2) a semantic network / ontology that turns raw detections into meaningful surgical events, and (3) validation — proving the metrics mean what they claim. Part of the Pharyvac Surgical Technologies pipeline and the engine behind ARS DWK grant Aim 1.

Claude’s Role

Act as the research-methods and engineering thinking partner for this project. Concretely:

Keep every metric honest — distinguish detection accuracy from metric validity (the swap-count artifact is the standing reminder).
Help design and document the four validation pillars; turn loose ideas into pre-registered, defensible methods.
Maintain the detection-ontology as the single source of truth for what each detection/event means, and keep the annotation protocol in sync with it.
Ground all writing in the real numbers (n=16, 92.4%/85.3%, 7 classes, 25.4% unlabelled) — never invent data.

Prime directive: the immediate goal is validation. If a session is drifting into model-tinkering or new features without moving validation forward, nudge me back: “Does this get us closer to a defensible validity claim for Aim 1 — or is it scope creep before the metrics are validated?”

Process

How work flows from raw video to a validated, shareable result:

Capture — GoPro records the case; video → frames.
Detect — YOLO emits per-frame instrument labels (7 classes) → CSV detections/.
Define — the detection-ontology specifies how detections compose into events (bouts, swaps, scope cleaning, pauses) and outcomes.
Smooth & derive — apply temporal smoothing (minimum bout duration of 1 s), then compute outcomes and write them to the Analysis-Ready Data Format schema.

Annotation tool decided (2026-06-12): CVAT, self-hosted Community Edition (free + keeps OR video on controlled infra; annotators on a local network). Setup: cvat-self-host-runbook. CVAT Online is the fallback only if annotators end up remote. See validation-plan P1.

Validate — run the four pillars in order (validation-plan): annotation quality → ground-truth accuracy → outcome correlation → external/prospective.
Publish / release — IFAR submission, open-source toolkit + ontology.
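
The Detect and Smooth & derive steps above can be sketched as follows. This is a minimal illustration, not the project's implementation. It assumes one label per frame (None when nothing is detected), a 30 fps frame rate, and that a run shorter than the 1 s minimum is absorbed into the preceding bout. The class names `suction` and `forceps` and all function names are hypothetical; the authoritative definitions of bouts and swaps live in the detection-ontology.

```python
from dataclasses import dataclass
from itertools import groupby
from typing import List, Optional, Sequence


@dataclass
class Bout:
    label: Optional[str]  # None means no instrument detected
    start: int            # first frame index, inclusive
    end: int              # last frame index, exclusive

    def duration_s(self, fps: float) -> float:
        return (self.end - self.start) / fps


def to_runs(labels: Sequence[Optional[str]]) -> List[Bout]:
    """Run-length encode per-frame labels into contiguous runs."""
    runs, pos = [], 0
    for label, group in groupby(labels):
        n = sum(1 for _ in group)
        runs.append(Bout(label, pos, pos + n))
        pos += n
    return runs


def smooth(runs: Sequence[Bout], fps: float, min_bout_s: float = 1.0) -> List[Bout]:
    """Absorb runs shorter than min_bout_s into the preceding kept bout
    (assumed rule), then merge neighbours that share a label."""
    min_frames = min_bout_s * fps
    kept: List[Bout] = []
    for run in runs:
        if run.end - run.start < min_frames:
            if kept:
                kept[-1].end = run.end  # short run: extend the previous bout
            continue                    # leading short runs are dropped
        if kept and kept[-1].label == run.label:
            kept[-1].end = run.end      # same label on both sides: merge
        else:
            kept.append(Bout(run.label, run.start, run.end))
    return kept


def count_swaps(bouts: Sequence[Bout]) -> int:
    """Count changes of instrument between consecutive non-empty bouts."""
    tools = [b.label for b in bouts if b.label is not None]
    return sum(1 for a, b in zip(tools, tools[1:]) if a != b)


FPS = 30.0  # assumed frame rate
# Hypothetical detections: a 3-frame "forceps" flicker inside a suction bout.
labels = ["suction"] * 60 + ["forceps"] * 3 + ["suction"] * 60 + ["forceps"] * 45

raw_runs = to_runs(labels)
bouts = smooth(raw_runs, FPS)

print(count_swaps(raw_runs))  # 3
print(count_swaps(bouts))     # 1
print([(b.label, b.duration_s(FPS)) for b in bouts])
```

In this toy input a three-frame flicker changes the swap count from 1 to 3, which shows why a derived metric can be wrong even when per-frame detection is mostly right. Whether this is the same mechanism documented in Swap Count Artifact Analysis is not stated here.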

Key People

Jaymarc — PI, rhinologist (solo PI on the ARS DWK grant; takes no salary).

Folder Structure

README.md — project index / entry point.
specs/ — definitions & data contracts: validation-plan, detection-ontology, Analysis-Ready Data Format, FESS Cases Clean Dataset.
research/ — analyses: YOLO Model Improvement Analysis, Swap Count Artifact Analysis, OR Efficiency and Cost Analysis.
drafts/ — written outputs: grant narrative (+ PDF).
notes/ — working notes / scratchpad (created as needed).
CSV detections/ — raw per-case detection CSVs.
Computer Vision MOC.md — legacy auto-generated folder-index (kept for the os-optimizer system).

Rules & Conventions

Vault routing — follow Projects/CLAUDE.md: status/overview → README.md; analysis → research/; specs/definitions → specs/; written content → drafts/; working notes → notes/. Add project: computer-vision to frontmatter on every note.
No (C) prefix — this vault doesn’t use it; mark authorship via frontmatter instead, consistent with existing files.
Ask before editing existing (non-Claude) notes — the analyses, dataset, and grant narrative are the user’s; propose changes rather than overwriting.
Wikilinks by basename — files can move between subdirs without breaking [[links]].
Honesty over polish — report metrics on held-out data, pre-register expected directions, and never present a derived metric as validated until it has cleared its pillar.

Current Status

Last updated: 2026-06-12
Status: Structure built out. Working prototype (n=16); validation plan and ontology drafted; immediate focus is locking operational definitions + standing up the inter-rater annotation study.
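
For the inter-rater annotation study, one common agreement statistic is Cohen's kappa, which corrects raw agreement for agreement expected by chance. The sketch below assumes two annotators labelling the same frames with one label each. The choice of kappa, the frame-level unit, and the label names are all assumptions for illustration; the validation-plan is the authority on which statistic and unit are pre-registered.

```python
import math
from collections import Counter
from typing import Hashable, Sequence


def cohens_kappa(rater_a: Sequence[Hashable], rater_b: Sequence[Hashable]) -> float:
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: fraction of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if expected == 1.0:
        return math.nan  # both raters used one identical label; kappa is undefined
    return (observed - expected) / (1.0 - expected)


# Hypothetical frame labels from two annotators.
annotator_1 = ["suction", "suction", "forceps", "forceps"]
annotator_2 = ["suction", "suction", "forceps", "suction"]

kappa = cohens_kappa(annotator_1, annotator_2)
print(kappa)  # 0.5
```

Here raw agreement is 3 of 4 frames (0.75), chance agreement is 0.5, and kappa is 0.5, so the chance-corrected figure is noticeably lower than the raw one.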
