docs: add design document for local video key-frame extractor

2026-05-14 00:21:32 -07:00
parent 2ea3fdee30
commit 2e704349f7
1 changed files with 929 additions and 0 deletions
@@ -0,0 +1,929 @@
+Recommended approach
+
+Use:
+
+<input type="file">
+→ URL.createObjectURL(file)
+→ hidden <video>
+→ sample frames periodically
+→ draw sampled frame to canvas
+→ compare against previous accepted frame
+→ keep only significant-change frames
+→ user clicks Copy on selected image
+→ navigator.clipboard.write()
+
+URL.createObjectURL(file) gives the browser a local blob URL for the selected file, and you should release it with URL.revokeObjectURL() when done.
+
+Do not analyze every frame at full resolution
+
+For change detection, use a small analysis canvas:
+
+video frame → draw to 160×90 or 320×180 canvas → compare pixels
+
+Then only render the full-resolution frame when you decide it is worth keeping.
+
+Example strategy:
+
+const ANALYSIS_W = 160;
+const ANALYSIS_H = 90;
+const SAMPLE_INTERVAL = 0.5; // seconds
+const CHANGE_THRESHOLD = 0.18; // tune this
+
+That means a 10-minute video sampled every 0.5 seconds is only:
+
+10 min × 60 / 0.5 = 1200 comparisons
+
+At 160×90, that is cheap.
+
+Simple frame-difference algorithm
+
+Use a downscaled canvas and compare luminance differences:
+
+function frameDifference(a, b) {
+  let changed = 0;
+  const pixels = a.data.length / 4;
+
+  for (let i = 0; i < a.data.length; i += 4) {
+    const lumA = 0.299 * a.data[i] + 0.587 * a.data[i + 1] + 0.114 * a.data[i + 2];
+    const lumB = 0.299 * b.data[i] + 0.587 * b.data[i + 1] + 0.114 * b.data[i + 2];
+
+    if (Math.abs(lumA - lumB) > 32) changed++;
+  }
+
+  return changed / pixels;
+}
+
+Then:
+
+if (diff > CHANGE_THRESHOLD) {
+  // save this timestamp as a candidate key frame
+}
+
+Better heuristic
+
+Instead of comparing every sampled frame to the immediately previous sample, compare it to the last accepted key frame.
+
+That avoids saving several frames during the same transition.
+
+Frame A accepted
+Frame B slightly different → reject
+Frame C slightly different → reject
+Frame D substantially different from A → accept
+
+Also add a minimum time gap:
+
+const MIN_SECONDS_BETWEEN_ACCEPTED = 2.0;
+
+Clipboard copying
+
+Copying images to the clipboard is possible with navigator.clipboard.write() and ClipboardItem. Browsers commonly support PNG image data, but clipboard access requires a secure context such as HTTPS or localhost, and browser/user activation rules apply.
+
+Example:
+
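+async function copyCanvasToClipboard(canvas) {
+  // toBlob passes null when the export fails, so reject instead of writing null.
+  const blob = await new Promise((resolve, reject) =>
+    canvas.toBlob(
+      b => (b ? resolve(b) : reject(new Error("Canvas export failed"))),
+      "image/png"
+    )
+  );
+
+  await navigator.clipboard.write([
+    new ClipboardItem({
+      "image/png": blob
+    })
+  ]);
+}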
+
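+Important: this should happen from a user gesture, like clicking a Copy button. Some browsers, Safari in particular, can reject the write when asynchronous work runs between the click and the write; passing a Promise that resolves to the blob as the ClipboardItem value is a commonly used workaround.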
+
+Memory profile
+
+This is light if implemented correctly:
+
+1 local video blob URL
+1 hidden video element
+1 small analysis canvas
+1 full-res preview canvas/image
+4–10 saved output blobs
+
+Avoid:
+
+Storing hundreds/thousands of full-res canvases
+Storing every sampled frame
+Reading the video file with FileReader.readAsDataURL()
+Exporting a huge ZIP unless needed
+
+For your target, keeping 4–10 PNG/WebP blobs in memory is fine.
+
+Practical UX flow
+
+I would build it like this:
+
+User selects video.
+App scans with adjustable settings:
+sample interval
+change threshold
+minimum seconds between captures
+max captures
+App shows candidate thumbnails.
+User deletes bad picks if needed.
+Each thumbnail has:
+Copy PNG
+Download
+timestamp
+Optional: button to rescan with stricter/looser threshold.
+
+Caveats
+Plain <video> seeking may not be perfectly frame-accurate.
+Sampling every 0.25–1.0s is much more practical than decoding every frame.
+Browser codec support matters. MP4/H.264 is safest.
+Safari/Firefox clipboard image support can be more limited than Chromium-based browsers.
+For exact frame extraction, use WebCodecs or ffmpeg.wasm, but for your “significant scene change” use case, <video> + canvas is probably enough.
+
+My recommended V1 stack
+React/Vite
+HTMLVideoElement
+Canvas API
+Blob/Object URLs
+navigator.clipboard.write()
+Optional: localForage or IndexedDB if you want session persistence
+
+No backend required. No upload required.
+
+What would be the best framework or library to handle these tasks? Come up with a detailed implementation / design document. 
+
+I’ll treat this as a local-only browser app design, optimized for your 100–1000 MB input range and 4–10 final images. I’ll ground the library/API choices against current browser support rather than assuming older constraints.
+
+Recommendation
+
+Use React + Vite + TypeScript for the app shell, but do the actual video work with native browser APIs first:
+
+Vite + React + TypeScript
+HTMLVideoElement
+Canvas / OffscreenCanvas
+Web Worker
+Blob URLs
+Clipboard API
+
+For your use case, I would not start with ffmpeg.wasm. It is powerful, but it is overkill for scene/key-frame extraction and can become memory-heavy with 100–1000 MB files. ffmpeg.wasm is a browser/WebAssembly port of FFmpeg, and its usual workflow writes the input into an in-memory virtual filesystem before processing, so a large video ends up resident in WebAssembly memory; large-file memory limits and browser stability are recurring themes in its issue tracker.
+
+For a stronger V2, consider Mediabunny + WebCodecs. Mediabunny is a TypeScript media toolkit designed for browser-side reading/writing/conversion, and WebCodecs exposes native VideoFrame, EncodedVideoChunk, VideoDecoder, and VideoEncoder primitives.
+
+Design Document: Local Video Key-Frame Extractor
+1. Goal
+
+Build a browser-only app that allows a user to:
+
+Select a local video file.
+Analyze it without uploading to a server.
+Detect frames where the image content has significantly changed.
+Present a small set of candidate key images.
+Let the user copy selected images to the clipboard.
+
+Target input:
+
+Video size: 100–1000 MB
+Expected output: 4–10 still images
+Processing model: local-only
+Backend: none required
+2. Recommended architecture
+User selects video
+    ↓
+Create local Blob URL
+    ↓
+Hidden <video> element loads local file
+    ↓
+Sampling engine seeks through video
+    ↓
+Downscaled analysis canvas compares frames
+    ↓
+Candidate detector stores timestamps
+    ↓
+Full-resolution renderer extracts selected frames
+    ↓
+Gallery shows thumbnails
+    ↓
+User copies PNG to clipboard
+
+Core browser APIs:
+
+Task	API
+Local file selection	File API
+Local video loading	URL.createObjectURL()
+Video decoding/seeking	HTMLVideoElement
+Pixel comparison	Canvas API
+Off-main-thread analysis	Worker + OffscreenCanvas where available
+Image export	canvas.toBlob()
+Clipboard copy	navigator.clipboard.write() + ClipboardItem
+
+URL.createObjectURL() is appropriate here because it creates a local blob URL for a File/Blob, avoiding the need to load the entire file into JavaScript memory. The Clipboard API supports writing non-text data through ClipboardItem, but it requires a secure context such as HTTPS or localhost.
+
+3. Framework choice
+Best V1 stack
+Vite
+React
+TypeScript
+Zustand or simple React state
+Canvas API
+Web Worker
+
+Why:
+
+Vite: fast local dev, easy static deployment.
+React: good for gallery, controls, progress UI, and stateful review flow.
+TypeScript: useful because media processing code quickly accumulates edge cases.
+No backend: app can deploy as static files.
+No heavy media dependency initially: browser decoding is enough for threshold-based change detection.
+
+Suggested install:
+
+npm create vite@latest video-keyframe-extractor -- --template react-ts
+cd video-keyframe-extractor
+npm install
+npm run dev
+
+Optional dependencies:
+
+npm install zustand
+
+I would avoid extra UI frameworks unless you want polished controls quickly.
+
+4. Library/API decision matrix
+Option	Use for V1?	Notes
+Native <video> + canvas	Yes	Best starting point. Simple, no server, no large WASM payload.
+Web Worker	Yes	Keeps UI responsive during scans.
+OffscreenCanvas	Yes, optional	Good for worker-side pixel analysis where supported.
+WebCodecs	V2	Better low-level control, but more complex. WebCodecs is browser-native and provides direct access to decoded frames.
+Mediabunny	V2 / advanced	Good candidate for more robust container/media handling. It advertises efficient browser-side reading and only loading what is needed.
+ffmpeg.wasm	Avoid for V1	Heavy and memory-risky for 100–1000 MB videos. Better for fallback/export/transcoding, not first-pass scene detection.
+OpenCV.js	Probably avoid	Useful for advanced CV, but heavy for simple scene-difference detection.
+5. App modules
+5.1 File intake module
+
+Responsibilities:
+
+Accept local video file.
+Validate file type and size.
+Create blob URL.
+Load video metadata.
+Display duration, resolution, codec/browser compatibility status if possible.
+
+Example state:
+
+type LoadedVideo = {
+  file: File;
+  objectUrl: string;
+  duration: number;
+  width: number;
+  height: number;
+};
+
+Important cleanup:
+
+URL.revokeObjectURL(objectUrl);
+
+Never use this for the video:
+
+new FileReader().readAsDataURL(file);
+
+That would load and base64-expand the whole video in memory.
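+
+The "validate file type and size" responsibility could be sketched as a small pure function. The size cap, the extension list, and the names FileLike and validateVideoFile are illustrative assumptions rather than part of the design; a real check would probably also ask the browser through video.canPlayType().
+
```typescript
// Hypothetical cap: the upper end of the 100 to 1000 MB target range.
const MAX_VIDEO_BYTES = 1000 * 1024 * 1024;

// Only the File fields that validation needs, so the check can be exercised
// without a real File object.
type FileLike = { name: string; type: string; size: number };

type ValidationResult = { ok: true } | { ok: false; reason: string };

function validateVideoFile(file: FileLike): ValidationResult {
  if (file.size === 0) {
    return { ok: false, reason: "The file is empty." };
  }
  if (file.size > MAX_VIDEO_BYTES) {
    return { ok: false, reason: "The file is larger than the supported size." };
  }
  // Browsers leave `type` empty when they cannot infer a MIME type, so fall
  // back to the file extension instead of rejecting outright.
  const looksLikeVideo =
    file.type.startsWith("video/") ||
    /\.(mp4|m4v|mov|webm|mkv)$/i.test(file.name);
  if (!looksLikeVideo) {
    return { ok: false, reason: "The file does not look like a video." };
  }
  return { ok: true };
}
```
+
+Passing this check only means the file is worth handing to the video element; whether the browser can actually decode it is known once metadata loads or an error event fires.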
+
+5.2 Scan settings module
+
+Expose these controls:
+
+type ScanSettings = {
+  sampleIntervalSeconds: number;       // default: 0.5
+  analysisWidth: number;               // default: 160
+  analysisHeight: number;              // default: 90
+  pixelDeltaThreshold: number;         // default: 32
+  changedPixelRatioThreshold: number;  // default: 0.18
+  minSecondsBetweenCaptures: number;   // default: 2.0
+  maxCandidates: number;               // default: 20
+  finalTargetCount: number;            // default: 8
+};
+
+Suggested defaults:
+
+const defaultSettings: ScanSettings = {
+  sampleIntervalSeconds: 0.5,
+  analysisWidth: 160,
+  analysisHeight: 90,
+  pixelDeltaThreshold: 32,
+  changedPixelRatioThreshold: 0.18,
+  minSecondsBetweenCaptures: 2.0,
+  maxCandidates: 20,
+  finalTargetCount: 8,
+};
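+
+Because these settings come from sliders and text inputs, it may help to clamp them before a scan starts. The sketch below is one way to do that; the bounds and the normalizeSettings name are assumptions chosen for illustration, and the function expects finite numbers.
+
```typescript
type ScanSettings = {
  sampleIntervalSeconds: number;
  analysisWidth: number;
  analysisHeight: number;
  pixelDeltaThreshold: number;
  changedPixelRatioThreshold: number;
  minSecondsBetweenCaptures: number;
  maxCandidates: number;
  finalTargetCount: number;
};

const clamp = (value: number, min: number, max: number): number =>
  Math.min(max, Math.max(min, value));

// The bounds below are illustrative assumptions, not values from the design.
function normalizeSettings(s: ScanSettings): ScanSettings {
  const maxCandidates = Math.round(clamp(s.maxCandidates, 1, 200));
  return {
    // A zero or negative interval would make the scan loop spin forever.
    sampleIntervalSeconds: clamp(s.sampleIntervalSeconds, 0.1, 10),
    analysisWidth: Math.round(clamp(s.analysisWidth, 16, 640)),
    analysisHeight: Math.round(clamp(s.analysisHeight, 9, 360)),
    // Luminance differences fall in 0..255.
    pixelDeltaThreshold: clamp(s.pixelDeltaThreshold, 0, 255),
    changedPixelRatioThreshold: clamp(s.changedPixelRatioThreshold, 0, 1),
    minSecondsBetweenCaptures: Math.max(0, s.minSecondsBetweenCaptures),
    maxCandidates,
    // The final gallery cannot hold more frames than the scan collected.
    finalTargetCount: Math.round(clamp(s.finalTargetCount, 1, maxCandidates)),
  };
}
```
+
+The interval floor matters most: it is the one setting that can turn a scan into an endless loop if it reaches zero.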
+
+5.3 Frame sampling module
+
+V1 should seek using a hidden video element:
+
+function seekVideo(video: HTMLVideoElement, time: number): Promise<void> {
+  return new Promise((resolve, reject) => {
+    // Remove both listeners once either fires so they cannot leak across seeks.
+    const cleanup = () => {
+      video.removeEventListener("seeked", onSeeked);
+      video.removeEventListener("error", onError);
+    };
+
+    const onSeeked = () => {
+      cleanup();
+      resolve();
+    };
+
+    const onError = () => {
+      cleanup();
+      // video.error can be null, so always reject with a real Error.
+      reject(new Error(video.error?.message || "Video seek failed"));
+    };
+
+    video.addEventListener("seeked", onSeeked);
+    video.addEventListener("error", onError);
+
+    // Clamp into [0, duration] so an overshooting sample still resolves.
+    video.currentTime = Math.max(0, Math.min(time, video.duration));
+  });
+}
+
+Then draw into a small analysis canvas:
+
+function captureAnalysisFrame(
+  video: HTMLVideoElement,
+  canvas: HTMLCanvasElement,
+  width: number,
+  height: number
+): ImageData {
+  canvas.width = width;
+  canvas.height = height;
+
+  const ctx = canvas.getContext("2d", { willReadFrequently: true });
+  if (!ctx) throw new Error("Could not get canvas context");
+
+  ctx.drawImage(video, 0, 0, width, height);
+  return ctx.getImageData(0, 0, width, height);
+}
+
+6. Change detection algorithm
+6.1 Simple and effective V1
+
+Compare current sampled frame against the last accepted key frame, not merely the immediately previous sample.
+
+This prevents over-detecting long transitions.
+
+function frameDifferenceRatio(
+  a: ImageData,
+  b: ImageData,
+  pixelDeltaThreshold: number
+): number {
+  const dataA = a.data;
+  const dataB = b.data;
+  let changed = 0;
+  const pixels = dataA.length / 4;
+
+  for (let i = 0; i < dataA.length; i += 4) {
+    const lumA =
+      0.299 * dataA[i] +
+      0.587 * dataA[i + 1] +
+      0.114 * dataA[i + 2];
+
+    const lumB =
+      0.299 * dataB[i] +
+      0.587 * dataB[i + 1] +
+      0.114 * dataB[i + 2];
+
+    if (Math.abs(lumA - lumB) > pixelDeltaThreshold) {
+      changed++;
+    }
+  }
+
+  return changed / pixels;
+}
+
+Acceptance logic:
+
+function shouldAcceptCandidate(params: {
+  diffRatio: number;
+  currentTime: number;
+  lastAcceptedTime: number | null;
+  changedPixelRatioThreshold: number;
+  minSecondsBetweenCaptures: number;
+}) {
+  const {
+    diffRatio,
+    currentTime,
+    lastAcceptedTime,
+    changedPixelRatioThreshold,
+    minSecondsBetweenCaptures,
+  } = params;
+
+  if (diffRatio < changedPixelRatioThreshold) return false;
+
+  if (
+    lastAcceptedTime !== null &&
+    currentTime - lastAcceptedTime < minSecondsBetweenCaptures
+  ) {
+    return false;
+  }
+
+  return true;
+}
+
+6.2 Better V1.5: histogram difference
+
+Raw pixel comparison can be sensitive to motion, noise, flashes, compression artifacts, and camera pans.
+
+A better approach is to compare low-resolution color/luma histograms.
+
+Recommended hybrid score:
+
+score = 0.7 × luminance histogram difference
+      + 0.3 × pixel difference ratio
+
+This tends to better detect scene/content changes instead of tiny motion.
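+
+A minimal sketch of the hybrid score follows. The 0.7 / 0.3 weights come from the text above; the 32-bin histogram, the choice of half the L1 distance as the "histogram difference", and all function names are assumptions made for illustration. The input is RGBA pixel data in the layout of ImageData.data and is assumed to be non-empty.
+
```typescript
const BINS = 32; // assumption: 32 luminance bins, 8 grey levels per bin

// Build a normalized luminance histogram from RGBA pixel data.
function lumaHistogram(rgba: Uint8ClampedArray): Float64Array {
  const hist = new Float64Array(BINS);
  const pixels = rgba.length / 4;
  for (let i = 0; i < rgba.length; i += 4) {
    const luma = 0.299 * rgba[i] + 0.587 * rgba[i + 1] + 0.114 * rgba[i + 2];
    const bin = Math.min(BINS - 1, Math.floor((luma * BINS) / 256));
    hist[bin] += 1;
  }
  // Normalize so frames of different sizes remain comparable.
  for (let b = 0; b < BINS; b++) hist[b] /= pixels;
  return hist;
}

// Half the L1 distance between two normalized histograms:
// 0 for identical distributions, 1 for completely disjoint ones.
function histogramDifference(a: Float64Array, b: Float64Array): number {
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += Math.abs(a[i] - b[i]);
  return sum / 2;
}

// Weighted blend of the histogram term and the pixel-ratio term.
function hybridScore(histDiff: number, pixelDiffRatio: number): number {
  return 0.7 * histDiff + 0.3 * pixelDiffRatio;
}
```
+
+Both inputs to hybridScore lie in 0..1, so the existing change threshold can be reused as a starting point, though it will likely need retuning because a camera pan moves many pixels while barely shifting the histogram.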
+
+6.3 Optional V2: perceptual hash
+
+For more stable detection:
+
+Generate a small grayscale frame, e.g. 32×32.
+Compute average hash or difference hash.
+Compare Hamming distance.
+Accept when distance exceeds threshold.
+
+This is good when videos have subtle compression differences but meaningful visual changes.
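+
+As a rough illustration of the difference-hash variant, the sketch below assumes the frame has already been drawn onto a 9×8 canvas (the common layout for a 64-bit hash, rather than the 32×32 example above) and works on RGBA data in the layout of ImageData.data. The names and the bit-array representation are assumptions; a production version might pack the bits into integers.
+
```typescript
// Assumption: the frame was downscaled to HASH_W x HASH_H beforehand.
const HASH_W = 9;
const HASH_H = 8;

function differenceHash(rgba: Uint8ClampedArray): Uint8Array {
  if (rgba.length !== HASH_W * HASH_H * 4) {
    throw new Error(`Expected ${HASH_W}x${HASH_H} RGBA data`);
  }
  // Luminance of pixel index p, using the same weights as the frame diff.
  const luma = (p: number): number =>
    0.299 * rgba[p * 4] + 0.587 * rgba[p * 4 + 1] + 0.114 * rgba[p * 4 + 2];

  const bits = new Uint8Array((HASH_W - 1) * HASH_H);
  let k = 0;
  for (let y = 0; y < HASH_H; y++) {
    for (let x = 0; x < HASH_W - 1; x++) {
      const p = y * HASH_W + x;
      // One bit per horizontal neighbour pair: 1 if brightness drops left to right.
      bits[k++] = luma(p) > luma(p + 1) ? 1 : 0;
    }
  }
  return bits;
}

// Number of positions at which two equal-length hashes differ.
function hammingDistance(a: Uint8Array, b: Uint8Array): number {
  let distance = 0;
  for (let i = 0; i < a.length; i++) {
    if (a[i] !== b[i]) distance++;
  }
  return distance;
}
```
+
+A frame would then be accepted when hammingDistance against the last accepted hash exceeds a tuned threshold out of 64 bits.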
+
+7. Scan workflow
+
+Pseudo-code:
+
+async function scanVideo(
+  video: HTMLVideoElement,
+  settings: ScanSettings,
+  onProgress: (progress: number) => void
+): Promise<CandidateFrame[]> {
+  const analysisCanvas = document.createElement("canvas");
+
+  const candidates: CandidateFrame[] = [];
+  let lastAcceptedFrame: ImageData | null = null;
+  let lastAcceptedTime: number | null = null;
+
+  for (
+    let time = 0;
+    time < video.duration;
+    time += settings.sampleIntervalSeconds
+  ) {
+    await seekVideo(video, time);
+
+    const imageData = captureAnalysisFrame(
+      video,
+      analysisCanvas,
+      settings.analysisWidth,
+      settings.analysisHeight
+    );
+
+    if (!lastAcceptedFrame) {
+      candidates.push({
+        time,
+        score: 1,
+        reason: "initial-frame",
+      });
+
+      lastAcceptedFrame = imageData;
+      lastAcceptedTime = time;
+      continue;
+    }
+
+    const diffRatio = frameDifferenceRatio(
+      lastAcceptedFrame,
+      imageData,
+      settings.pixelDeltaThreshold
+    );
+
+    if (
+      shouldAcceptCandidate({
+        diffRatio,
+        currentTime: time,
+        lastAcceptedTime,
+        changedPixelRatioThreshold: settings.changedPixelRatioThreshold,
+        minSecondsBetweenCaptures: settings.minSecondsBetweenCaptures,
+      })
+    ) {
+      candidates.push({
+        time,
+        score: diffRatio,
+        reason: "visual-change",
+      });
+
+      lastAcceptedFrame = imageData;
+      lastAcceptedTime = time;
+    }
+
+    onProgress(time / video.duration);
+
+    if (candidates.length >= settings.maxCandidates) {
+      // optional: stop early or continue and later rank candidates
+    }
+  }
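+
+  // The loop reports time / duration, which stops short of 1, so finish explicitly.
+  onProgress(1);
+
+  return rankAndTrimCandidates(candidates, settings.finalTargetCount);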
+}
+
+Candidate type:
+
+type CandidateFrame = {
+  time: number;
+  score: number;
+  reason: "initial-frame" | "visual-change" | "manual";
+  thumbnailUrl?: string;
+  fullBlob?: Blob;
+};
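+
+Two details of the scan loop can be isolated in a small sketch: deriving each timestamp from an integer index (so repeated additions of a fractional interval cannot drift) and stopping early when the user cancels. The names sampleTimes, forEachSample, and AbortLike are hypothetical; AbortLike is a structural stand-in that a real AbortSignal satisfies, and the visit callback stands for the seek, capture, and score steps.
+
```typescript
// Structural stand-in for AbortSignal so the sketch also runs outside a browser.
type AbortLike = { readonly aborted: boolean };

// Timestamps are computed as index * interval rather than by accumulation.
function sampleTimes(duration: number, interval: number): number[] {
  if (!Number.isFinite(duration) || duration <= 0 || !(interval > 0)) return [];
  const count = Math.ceil(duration / interval);
  return Array.from({ length: count }, (_, i) => i * interval);
}

// Visits every sample time in order, checking for cancellation between samples.
async function forEachSample(
  duration: number,
  interval: number,
  visit: (time: number) => Promise<void>,
  signal?: AbortLike
): Promise<"done" | "cancelled"> {
  for (const time of sampleTimes(duration, interval)) {
    if (signal?.aborted) return "cancelled";
    await visit(time);
  }
  return "done";
}
```
+
+In the app, a Stop button would call abort() on an AbortController whose signal is passed here; candidates collected before cancellation can still be shown.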
+
+8. Candidate ranking
+
+Since the desired final count is only 4–10, do not simply take every threshold crossing.
+
+Use ranking:
+
+Always include first frame if useful.
+Sort detected changes by score.
+Enforce time diversity.
+Keep the top N.
+
+Example:
+
+function rankAndTrimCandidates(
+  candidates: CandidateFrame[],
+  targetCount: number
+): CandidateFrame[] {
+  return [...candidates]
+    .sort((a, b) => b.score - a.score)
+    .slice(0, targetCount)
+    .sort((a, b) => a.time - b.time);
+}
+
+Better version:
+
+Divide video into N temporal buckets
+Pick the strongest candidate from each bucket
+Then fill remaining slots by highest score
+
+This avoids ending up with all images from one active segment.
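+
+The bucketed version could look roughly like the sketch below. It assumes the video duration is known, uses one bucket per requested image, and works on any object with time and score fields; the pickDiverseCandidates name and the tie-breaking behaviour are assumptions.
+
```typescript
type Candidate = { time: number; score: number };

function pickDiverseCandidates<T extends Candidate>(
  candidates: T[],
  duration: number,
  targetCount: number
): T[] {
  const slots = Math.floor(targetCount);
  if (slots <= 0 || candidates.length === 0 || !(duration > 0)) return [];

  // 1. Keep the strongest candidate in each temporal bucket.
  const best: (T | undefined)[] = new Array<T | undefined>(slots).fill(undefined);
  for (const c of candidates) {
    const raw = Math.floor((c.time / duration) * slots);
    const bucket = Math.min(slots - 1, Math.max(0, raw));
    const current = best[bucket];
    if (current === undefined || c.score > current.score) best[bucket] = c;
  }
  const picked = best.filter((c): c is T => c !== undefined);

  // 2. Fill any remaining slots with the highest-scoring leftovers.
  const leftovers = candidates
    .filter(c => picked.indexOf(c) === -1)
    .sort((a, b) => b.score - a.score);
  picked.push(...leftovers.slice(0, slots - picked.length));

  // 3. Return in chronological order for the gallery.
  return picked.sort((a, b) => a.time - b.time);
}
```
+
+Compared with the plain top-N sort, a quiet stretch of the video still contributes its best frame even when a busy stretch produced several higher scores.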
+
+9. Full-resolution extraction
+
+Only render full-resolution images for selected candidates.
+
+async function extractFullResolutionFrame(
+  video: HTMLVideoElement,
+  time: number,
+  type: "image/png" | "image/jpeg" | "image/webp" = "image/png",
+  quality?: number
+): Promise<Blob> {
+  await seekVideo(video, time);
+
+  const canvas = document.createElement("canvas");
+  canvas.width = video.videoWidth;
+  canvas.height = video.videoHeight;
+
+  const ctx = canvas.getContext("2d");
+  if (!ctx) throw new Error("Could not get canvas context");
+
+  ctx.drawImage(video, 0, 0);
+
+  return new Promise((resolve, reject) => {
+    canvas.toBlob(
+      blob => {
+        if (!blob) reject(new Error("Canvas export failed"));
+        else resolve(blob);
+      },
+      type,
+      quality
+    );
+  });
+}
+
+For clipboard, prefer PNG:
+
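+async function copyImageBlobToClipboard(blob: Blob) {
+  // PNG is the image type browsers most consistently accept for clipboard writes.
+  if (blob.type !== "image/png") {
+    throw new Error(`Expected an image/png blob, got "${blob.type}"`);
+  }
+
+  await navigator.clipboard.write([
+    new ClipboardItem({
+      "image/png": blob,
+    }),
+  ]);
+}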
+
+Use:
+
+const blob = await extractFullResolutionFrame(video, candidate.time, "image/png");
+await copyImageBlobToClipboard(blob);
+
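+Clipboard image writing should be initiated by a user action, such as clicking a Copy button, and the app should be served from HTTPS or localhost because the Clipboard API is gated behind secure contexts. Because the usage above seeks and encodes between the click and the write, a slow extraction can outlive the browser's user-activation window; extracting the blob ahead of time (for example when the thumbnail is created) keeps the click handler down to the clipboard call.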
+
+10. UI design
+Main layout
+┌──────────────────────────────────────────────┐
+│ Video Key Frame Extractor                    │
+├──────────────────────────────────────────────┤
+│ [Select video]  filename.mp4                 │
+│ Duration: 04:32 | Size: 612 MB | 1920×1080   │
+├──────────────────────────────────────────────┤
+│ Scan Settings                                │
+│ Sample interval: [0.5s]                      │
+│ Change threshold: [18%]                      │
+│ Min gap: [2.0s]                              │
+│ Target images: [8]                           │
+│ [Scan Video]                                 │
+├──────────────────────────────────────────────┤
+│ Progress bar                                 │
+├──────────────────────────────────────────────┤
+│ Candidate Gallery                            │
+│ [thumb] 00:04.5 score .24 [Copy] [Download]  │
+│ [thumb] 00:18.0 score .41 [Copy] [Download]  │
+│ [thumb] 01:22.5 score .37 [Copy] [Download]  │
+└──────────────────────────────────────────────┘
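+
+The gallery labels in the mock-up (for example 01:22.5) need a formatter. A minimal sketch, assuming a minutes:seconds.tenths format and a hypothetical formatTimestamp name:
+
```typescript
// Formats seconds as MM:SS.t (tenths of a second).
function formatTimestamp(seconds: number): string {
  // Round to tenths first so 59.96 becomes 01:00.0 rather than 00:60.0.
  const tenths = Math.round(Math.max(0, seconds) * 10);
  const minutes = Math.floor(tenths / 600);
  const secs = Math.floor((tenths % 600) / 10);
  const tenth = tenths % 10;
  const pad = (n: number): string => (n < 10 ? `0${n}` : String(n));
  return `${pad(minutes)}:${pad(secs)}.${tenth}`;
}
```
+
+Videos longer than 99 minutes simply get a wider minutes field; an hours component would be a straightforward extension.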
+Controls
+
+Minimum controls:
+
+Select video
+Scan
+Stop scan
+Reset
+Threshold slider
+Sample interval dropdown
+Target count
+Copy image
+Download image
+
+Advanced controls:
+
+“Prefer fewer images”
+“Prefer more images”
+“Ignore tiny motion”
+“Include first frame”
+“Manually add current preview frame”
+11. State model
+
+Use Zustand or a plain React reducer.
+
+type AppState = {
+  video: LoadedVideo | null;
+  scanStatus: "idle" | "loading" | "scanning" | "done" | "error";
+  settings: ScanSettings;
+  candidates: CandidateFrame[];
+  selectedCandidateId: string | null; // requires a stable id on CandidateFrame, which the type above does not define yet
+  progress: number;
+  error: string | null;
+};
+
+Actions:
+
+type AppActions = {
+  loadVideo(file: File): Promise<void>;
+  scanVideo(): Promise<void>;
+  cancelScan(): void;
+  updateSettings(settings: Partial<ScanSettings>): void;
+  copyCandidate(candidateId: string): Promise<void>;
+  downloadCandidate(candidateId: string): Promise<void>;
+  removeCandidate(candidateId: string): void;
+};
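+
+If the plain-reducer route is taken, the scan-related part of the state could be driven by something like the sketch below. The action names and the simplified ScanState (candidate times instead of full CandidateFrame objects) are assumptions for illustration.
+
```typescript
type ScanStatus = "idle" | "loading" | "scanning" | "done" | "error";

type ScanState = {
  scanStatus: ScanStatus;
  progress: number; // 0..1
  candidateTimes: number[]; // simplified stand-in for CandidateFrame[]
  error: string | null;
};

type ScanAction =
  | { type: "scan-started" }
  | { type: "scan-progress"; progress: number }
  | { type: "candidate-found"; time: number }
  | { type: "scan-finished" }
  | { type: "scan-cancelled" }
  | { type: "scan-failed"; message: string };

const initialScanState: ScanState = {
  scanStatus: "idle",
  progress: 0,
  candidateTimes: [],
  error: null,
};

function scanReducer(state: ScanState, action: ScanAction): ScanState {
  switch (action.type) {
    case "scan-started":
      return { scanStatus: "scanning", progress: 0, candidateTimes: [], error: null };
    case "scan-progress":
      // Ignore late events that arrive after a cancel or failure.
      if (state.scanStatus !== "scanning") return state;
      return { ...state, progress: Math.min(1, Math.max(0, action.progress)) };
    case "candidate-found":
      if (state.scanStatus !== "scanning") return state;
      return { ...state, candidateTimes: [...state.candidateTimes, action.time] };
    case "scan-finished":
      return { ...state, scanStatus: "done", progress: 1 };
    case "scan-cancelled":
      // Keep whatever was found so far; the user may still want those frames.
      return { ...state, scanStatus: "idle" };
    case "scan-failed":
      return { ...state, scanStatus: "error", error: action.message };
  }
}
```
+
+Guarding progress and candidate events on the scanning status is what keeps a cancelled scan from repainting the UI when its last in-flight seek resolves.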
+
+12. Worker design
+
+For V1, the main thread can own the hidden video element because HTMLVideoElement is DOM-bound. But pixel comparison can be pushed into a worker.
+
+Main thread:
+
+seek video
+draw frame to analysis canvas
+get ImageData
+send ImageData buffer to worker
+worker returns diff score
+
+Worker message:
+
+type AnalyzeFrameMessage = {
+  type: "analyze-frame";
+  current: ImageData;
+  previous: ImageData;
+  pixelDeltaThreshold: number;
+};
+
+Worker response:
+
+type AnalyzeFrameResult = {
+  type: "frame-score";
+  diffRatio: number;
+};
+
+For your target size, this may not be necessary at first. But it is worth structuring the code so the algorithm can move into a worker later.
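+
+One way to keep that move cheap is to write the scoring step as a pure message handler from the start. The sketch below assumes the messages carry raw RGBA buffers (ImageData.data) instead of ImageData objects, which is a deviation from the message types above; the handler name is hypothetical.
+
```typescript
type AnalyzeFrameRequest = {
  type: "analyze-frame";
  current: Uint8ClampedArray;
  previous: Uint8ClampedArray;
  pixelDeltaThreshold: number;
};

type AnalyzeFrameResponse = { type: "frame-score"; diffRatio: number };

// Pure function: no DOM or worker globals, so it runs on either thread.
function handleAnalyzeFrame(msg: AnalyzeFrameRequest): AnalyzeFrameResponse {
  const { current, previous, pixelDeltaThreshold } = msg;
  if (current.length === 0 || current.length !== previous.length) {
    throw new Error("Frames must be non-empty and the same size");
  }

  let changed = 0;
  for (let i = 0; i < current.length; i += 4) {
    const lumPrev =
      0.299 * previous[i] + 0.587 * previous[i + 1] + 0.114 * previous[i + 2];
    const lumCurr =
      0.299 * current[i] + 0.587 * current[i + 1] + 0.114 * current[i + 2];
    if (Math.abs(lumPrev - lumCurr) > pixelDeltaThreshold) changed++;
  }

  return { type: "frame-score", diffRatio: changed / (current.length / 4) };
}

// Inside frameAnalysis.worker.ts the wiring would be roughly:
//   self.onmessage = (e) => self.postMessage(handleAnalyzeFrame(e.data));
```
+
+On the main thread the same function can be called directly until profiling shows the comparison is worth moving. When it does move, note that a buffer placed in postMessage's transfer list becomes unusable on the sending side, so the last accepted frame should be copied or kept inside the worker.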
+
+13. Performance expectations
+
+For a 10-minute video:
+
+Sample interval: 0.5s
+Comparisons: ~1200
+Analysis resolution: 160×90
+Pixels per comparison: 14,400
+Total pixel comparisons: ~17.3 million
+
+That is very reasonable in browser JavaScript.
+
+For a 30-minute video:
+
+Sample interval: 0.5s
+Comparisons: ~3600
+Analysis resolution: 160×90
+Total pixel comparisons: ~51.8 million
+
+Still reasonable. The bottleneck will usually be video seeking/decoding, not the pixel math.
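+
+The figures above follow from a short calculation, sketched here with a hypothetical estimateScanCost helper so other durations and settings can be checked the same way.
+
```typescript
type ScanCost = {
  samples: number;
  pixelsPerFrame: number;
  totalPixelComparisons: number;
};

function estimateScanCost(
  durationSeconds: number,
  sampleIntervalSeconds: number,
  analysisWidth: number,
  analysisHeight: number
): ScanCost {
  // One sample per interval, starting at time 0.
  const samples = Math.ceil(durationSeconds / sampleIntervalSeconds);
  const pixelsPerFrame = analysisWidth * analysisHeight;
  return {
    samples,
    pixelsPerFrame,
    totalPixelComparisons: samples * pixelsPerFrame,
  };
}

// 10 minutes at 0.5 s and 160x90: 1200 samples, 17,280,000 pixel comparisons.
const tenMinutes = estimateScanCost(600, 0.5, 160, 90);
// 30 minutes at the same settings: 3600 samples, 51,840,000 pixel comparisons.
const thirtyMinutes = estimateScanCost(1800, 0.5, 160, 90);
```
+
+The estimate covers only the pixel math; seek and decode time per sample depends on the codec, keyframe spacing, and browser, and has to be measured.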
+
+14. File-size considerations
+
+Your 100–1000 MB range is practical because:
+
+The video is referenced via local file blob URL.
+You only decode sampled frames.
+You only store 4–10 output blobs.
+Analysis frames are tiny.
+
+Memory use should look roughly like:
+
+Video decoder buffers: browser-managed
+Analysis frame A: ~160 × 90 × 4 = 57.6 KB
+Analysis frame B: ~57.6 KB
+Full-res output image: only when accepted/copied
+Candidate blobs: 4–10 images
+
+At 1920×1080, one raw RGBA canvas frame is about:
+
+1920 × 1080 × 4 = ~8.3 MB
+
+At 3840×2160:
+
+3840 × 2160 × 4 = ~33.2 MB
+
+Still fine if you only create one full-res canvas at a time.
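+
+The same arithmetic can drive a guard for very large sources. The sketch below computes the raw RGBA size of a frame and, under an assumed byte budget, shrinks the export dimensions while keeping the aspect ratio. The fitWithinBudget name and the idea of a fixed budget are illustrative assumptions.
+
```typescript
const BYTES_PER_PIXEL = 4; // RGBA, one byte per channel

function rawFrameBytes(width: number, height: number): number {
  return width * height * BYTES_PER_PIXEL;
}

// Returns the original size if it fits, otherwise a uniformly scaled-down size.
function fitWithinBudget(
  width: number,
  height: number,
  maxBytes: number
): { width: number; height: number } {
  const bytes = rawFrameBytes(width, height);
  if (bytes <= maxBytes) return { width, height };
  // Area scales with the square of the linear factor, hence the square root.
  const scale = Math.sqrt(maxBytes / bytes);
  return {
    width: Math.max(1, Math.floor(width * scale)),
    height: Math.max(1, Math.floor(height * scale)),
  };
}
```
+
+For example, a 3840×2160 source under a budget equal to one 1920×1080 frame (8,294,400 bytes) would be exported at half the width and height.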
+
+15. Browser compatibility
+Best target
+
+Start with Chromium-based browsers:
+
+Chrome
+Edge
+Brave
+Arc
+Chromium
+
+Reasons:
+
+Strong media support.
+Good Clipboard API support.
+Good WebCodecs path for later.
+
+Firefox should work for the V1 <video> + canvas path. WebCodecs VideoDecoder is listed by MDN as supported in Firefox 130+, while Chromium browsers have supported it since version 94.
+
+Safari may work for simple video/canvas extraction, but clipboard image writing and codec behavior are areas to test carefully.
+
+16. Security and privacy model
+
+The app should explicitly state:
+
+Your video is processed locally in your browser.
+The file is not uploaded.
+No server receives the video or extracted images.
+
+Implementation requirements:
+
+Do not send file contents to analytics.
+Do not log object URLs remotely.
+Avoid third-party scripts if privacy is part of the value prop.
+Serve over HTTPS.
+Make the app installable as a PWA if desired.
+17. Error handling
+
+Handle:
+
+Case	Response
+Unsupported video codec	Show “This browser cannot decode this video.”
+Huge resolution	Offer lower-quality extraction or warn about memory.
+Clipboard blocked	Offer download fallback.
+Scan too slow	Let user increase sample interval.
+Too many candidates	Raise threshold or bucket results.
+No candidates found	Lower threshold or include first/middle/end frames.
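+
+For the first row, the video element's error code can be mapped to a user-facing message. The numeric codes are the ones the HTML standard defines for MediaError.code; the describeMediaError name and the message wording other than the first row are assumptions.
+
```typescript
// Numeric values defined by the HTML standard for MediaError.code.
const MEDIA_ERR_ABORTED = 1;
const MEDIA_ERR_NETWORK = 2;
const MEDIA_ERR_DECODE = 3;
const MEDIA_ERR_SRC_NOT_SUPPORTED = 4;

function describeMediaError(code: number | undefined): string {
  switch (code) {
    case MEDIA_ERR_DECODE:
    case MEDIA_ERR_SRC_NOT_SUPPORTED:
      // Both usually mean the container or codec is not playable here.
      return "This browser cannot decode this video.";
    case MEDIA_ERR_ABORTED:
      return "Loading the video was cancelled.";
    case MEDIA_ERR_NETWORK:
      return "The video could not be read.";
    default:
      return "The video could not be loaded.";
  }
}
```
+
+In an error handler this would be called as describeMediaError(video.error?.code), which also covers the case where video.error is null.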
+
+Clipboard fallback:
+
+try {
+  await copyImageBlobToClipboard(blob);
+} catch {
+  downloadBlob(blob, "frame.png");
+}
+
+18. Suggested file structure
+src/
+  app/
+    App.tsx
+    store.ts
+  components/
+    FilePicker.tsx
+    VideoMetadataPanel.tsx
+    ScanSettingsPanel.tsx
+    ProgressPanel.tsx
+    CandidateGallery.tsx
+    CandidateCard.tsx
+  media/
+    loadVideo.ts
+    seekVideo.ts
+    captureFrame.ts
+    extractFullFrame.ts
+    clipboard.ts
+  analysis/
+    frameDifference.ts
+    histogramDifference.ts
+    candidateRanking.ts
+    scanVideo.ts
+  workers/
+    frameAnalysis.worker.ts
+  types/
+    media.ts
+    scan.ts
+19. MVP implementation plan
+Phase 1: Core prototype
+
+Build:
+
+File picker
+Hidden video loader
+Metadata display
+Scan button
+Basic frame-difference detection
+Candidate list with timestamps
+Thumbnail generation
+
+No worker yet.
+
+Phase 2: Copy/export
+
+Add:
+
+Full-res frame extraction
+Copy PNG to clipboard
+Download PNG fallback
+Remove candidate
+Manual add candidate
+Phase 3: Better detection
+
+Add:
+
+Histogram comparison
+Temporal bucketing
+Threshold presets
+“Strict / Balanced / Sensitive” modes
+
+Example presets:
+
+const presets = {
+  strict: {
+    sampleIntervalSeconds: 1.0,
+    changedPixelRatioThreshold: 0.28,
+    minSecondsBetweenCaptures: 4,
+  },
+  balanced: {
+    sampleIntervalSeconds: 0.5,
+    changedPixelRatioThreshold: 0.18,
+    minSecondsBetweenCaptures: 2,
+  },
+  sensitive: {
+    sampleIntervalSeconds: 0.25,
+    changedPixelRatioThreshold: 0.1,
+    minSecondsBetweenCaptures: 1,
+  },
+};
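+
+Since each preset overrides only three fields, applying one can be a shallow merge over the current settings. A sketch, with the ScanPreset type, the typedPresets constant, and the applyPreset name as assumptions:
+
```typescript
type ScanSettings = {
  sampleIntervalSeconds: number;
  analysisWidth: number;
  analysisHeight: number;
  pixelDeltaThreshold: number;
  changedPixelRatioThreshold: number;
  minSecondsBetweenCaptures: number;
  maxCandidates: number;
  finalTargetCount: number;
};

// A preset touches only the three detection-sensitivity fields.
type ScanPreset = Pick<
  ScanSettings,
  "sampleIntervalSeconds" | "changedPixelRatioThreshold" | "minSecondsBetweenCaptures"
>;

type PresetName = "strict" | "balanced" | "sensitive";

const typedPresets: Record<PresetName, ScanPreset> = {
  strict: { sampleIntervalSeconds: 1.0, changedPixelRatioThreshold: 0.28, minSecondsBetweenCaptures: 4 },
  balanced: { sampleIntervalSeconds: 0.5, changedPixelRatioThreshold: 0.18, minSecondsBetweenCaptures: 2 },
  sensitive: { sampleIntervalSeconds: 0.25, changedPixelRatioThreshold: 0.1, minSecondsBetweenCaptures: 1 },
};

// Returns a new settings object; the input is left untouched.
function applyPreset(current: ScanSettings, preset: PresetName): ScanSettings {
  return { ...current, ...typedPresets[preset] };
}
```
+
+Fields such as the analysis size and the target count survive a preset switch, so a user who changed them does not lose those choices.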
+
+Phase 4: Worker optimization
+
+Move frame comparison into a worker.
+
+Phase 5: Advanced media backend
+
+Evaluate:
+
+Mediabunny + WebCodecs
+
+This is the path if you want better control over frame extraction, codecs, and container-level handling. Mediabunny is actively positioned as a modern browser media toolkit, and WebCodecs is the underlying native browser primitive for direct encoded/decoded media handling.
+
+20. Final recommendation
+
+For your app, build V1 with:
+
+Vite + React + TypeScript
+HTMLVideoElement
+Canvas
+Blob URLs
+Clipboard API
+Optional Zustand
+
+Do not start with:
+
+ffmpeg.wasm
+OpenCV.js
+Server-side processing
+
+Best V1 strategy:
+
+Sample every 0.5s
+Analyze at 160×90
+Compare against last accepted key frame
+Use threshold + minimum time gap
+Rank candidates down to 4–10
+Extract full-res only when displaying/copying