AI Voice Notes

Talk into your phone; get back a clean transcript, a short summary, and auto-extracted to-dos. A voice memo that files itself.

Parked App (embedded static folder)

Next to-dos

  • Decide on-device vs cloud transcription default — privacy vs accuracy on long notes
  • Solve the iOS background-audio permission flow
  • Multi-speaker memos — diarization or punt for v1?
  • Pick the default destination for extracted to-dos (Reminders vs in-app)
  • Answer the existential question: how is this better than Otter?

Recent activity

  • To-do added — Whisper.cpp transcription working on-device · 5 hours ago
  • To-do added — Extraction prompt returning clean JSON (summary + action items) · 5 hours ago
  • To-do added — Decide on-device vs cloud transcription default — privacy vs accuracy... · 5 hours ago
  • To-do added — Solve the iOS background-audio permission flow · 5 hours ago
  • To-do added — Multi-speaker memos — diarization or punt for v1? · 5 hours ago
  • To-do added — Pick the default destination for extracted to-dos (Reminders vs in-app... · 5 hours ago
  • To-do added — Answer the existential question: how is this better than Otter? · 5 hours ago
  • Created project · 5 hours ago

Design doc

AI Voice Notes — design doc

What it is: A voice-memo app that turns rambling out-loud thinking into something useful — a clean transcript, a 3-line summary, and a list of action items — without you touching the keyboard.

The problem it solves

Voice memos pile up and never get listened to again. The value is trapped in audio. This extracts it: record → transcribe → summarize → pull out the to-dos → drop them somewhere you'll actually see them.

User flow

  1. Tap record, talk, tap stop.
  2. Transcript appears in seconds (on-device first, cloud fallback for notes longer than 2 minutes).
  3. A summary + extracted action items render below.
  4. One tap sends the action items to your task app (or this Latrop project, naturally).
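
The routing in step 2 could be sketched in Swift roughly as follows. This is a minimal sketch, not the app's actual code: the names `TranscriptionBackend`, `TranscriptionRouter`, and `onDeviceAvailable` are hypothetical, and the only value taken from this doc is the 2-minute cutoff listed under Approach. Whether on-device stays the default is still an open question.

```swift
import Foundation

/// Where a recording gets transcribed. Hypothetical type, for illustration only.
enum TranscriptionBackend: Equatable {
    case onDevice   // e.g. whisper.cpp running locally
    case cloud      // a remote transcription service
}

/// Picks a backend from the note length. Hypothetical type, for illustration only.
struct TranscriptionRouter {
    /// Notes longer than this many seconds go to the cloud (">2 min" under Approach).
    var cloudThreshold: TimeInterval = 120
    /// Assumed flag: whether a local Whisper model is loaded and usable.
    var onDeviceAvailable: Bool = true

    func backend(forDuration seconds: TimeInterval) -> TranscriptionBackend {
        // Without a usable local model, the cloud is the only option.
        guard onDeviceAvailable else { return .cloud }
        // On-device first; fall back to the cloud only for long notes.
        return seconds > cloudThreshold ? .cloud : .onDevice
    }
}

let router = TranscriptionRouter()
print(router.backend(forDuration: 45))    // short memo stays on-device
print(router.backend(forDuration: 300))   // long memo goes to the cloud
```

Keeping the rule in one small value type would make the default easy to flip once the on-device vs cloud question is settled.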

Approach

  • Transcription: Whisper (small or distil variant) on-device where possible; cloud transcription for notes longer than 2 minutes.
  • Summary + to-dos: one LLM pass with a strict JSON schema (summary, bullets, action_items[]).
  • Storage: audio, transcript, and extraction stored together for each note in local SQLite; searchable.
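
To illustrate the strict-schema idea, here is a hedged Swift sketch of decoding the extraction payload with `Codable`. The key names come from the schema above; everything else is an assumption, including the `Extraction` type, the `parseExtraction` helper, and the choice to model `bullets` and `action_items` as arrays of plain strings (this doc does not specify the element type).

```swift
import Foundation

/// Assumed shape of the LLM's JSON reply. Only the three key names
/// (summary, bullets, action_items) come from the design doc.
struct Extraction: Codable, Equatable {
    let summary: String
    let bullets: [String]
    let actionItems: [String]

    enum CodingKeys: String, CodingKey {
        case summary
        case bullets
        case actionItems = "action_items"   // map snake_case key to Swift naming
    }
}

/// Hypothetical error type wrapping any decoding failure.
enum ExtractionError: Error {
    case schemaMismatch(String)
}

/// Decodes the raw model output, failing if a required field is
/// missing or has the wrong type.
func parseExtraction(_ raw: String) throws -> Extraction {
    let data = Data(raw.utf8)
    do {
        return try JSONDecoder().decode(Extraction.self, from: data)
    } catch {
        throw ExtractionError.schemaMismatch(String(describing: error))
    }
}

// Made-up sample payload, for illustration only.
let sample = """
{"summary": "Plan the launch.", "bullets": ["Pricing is undecided"], "action_items": ["Email Sam about pricing"]}
"""

if let extraction = try? parseExtraction(sample) {
    print(extraction.summary)
    print(extraction.actionItems)
}
```

Note that `JSONDecoder` rejects missing or wrongly typed fields but ignores unknown keys, so a truly strict check would need extra validation on top of this.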

Stack

Swift / SwiftUI (iOS first) · whisper.cpp · an LLM endpoint for extraction · local SQLite.

Open questions

  1. Should transcription default to on-device or cloud? The trade-off is privacy vs accuracy on long notes.
  2. How to handle multi-speaker memos — diarization, or ignore for v1?
  3. Where do extracted to-dos go by default — Reminders, or stay in-app?

Why it's parked

Whisper on-device works; the extraction prompt is solid. Stalled on the iOS background-audio permissions dance and a lingering "is this just a worse Otter?" existential question. Shelved, not dead.

Decision log

  • 2026-04-12 — Extraction must return strict JSON (summary / bullets / action_items) so the UI is dumb.