Two-Layer Design
Steno follows a strict two-layer structure:
StenoKit/ - pure Swift package
  - protocols, models, services, actors
  - no UI dependencies
Steno/ - SwiftUI app target
  - DictationController, views, onboarding, settings UI, menu bar wiring
This separation keeps business logic testable and reusable outside UI concerns.
Steno (SwiftUI app)
  -> DictationController
    -> SessionCoordinator (actor)
      -> AudioCaptureService
      -> TranscriptionEngine
      -> CleanupEngine
      -> InsertionService
      -> HistoryStore
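The dependency chain above can be sketched in a few lines. The protocol shapes and method names here (transcribe(audioAt:), insert(_:), finishSession(audioAt:)) are assumptions made for illustration, not StenoKit's actual signatures. The sketch only shows that the core layer can import Foundation alone and receive its collaborators as protocols, so a test can drive it without any UI.

```swift
import Foundation

// Hypothetical protocol shapes; the real StenoKit signatures may differ.
protocol TranscriptionEngine: Sendable {
    func transcribe(audioAt url: URL) async throws -> String
}

protocol InsertionService: Sendable {
    func insert(_ text: String) async throws
}

// Core-layer coordinator: depends only on protocols, never on SwiftUI or AppKit.
actor SessionCoordinator {
    private let transcriber: any TranscriptionEngine
    private let inserter: any InsertionService

    init(transcriber: any TranscriptionEngine, inserter: any InsertionService) {
        self.transcriber = transcriber
        self.inserter = inserter
    }

    // Transcribes the recorded audio, hands the text to the inserter, and returns it.
    func finishSession(audioAt url: URL) async throws -> String {
        let text = try await transcriber.transcribe(audioAt: url)
        try await inserter.insert(text)
        return text
    }
}

// Test doubles that stand in for the real services.
struct StubTranscriber: TranscriptionEngine {
    func transcribe(audioAt url: URL) async throws -> String { "hello world" }
}

actor RecordingInserter: InsertionService {
    private(set) var inserted: [String] = []
    func insert(_ text: String) async throws { inserted.append(text) }
}
```

Because the coordinator never names a concrete service, the app target can pass in real implementations while tests pass in stubs like the two above.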
Pipeline
High-level runtime flow:
- Hotkey event triggers start/stop transitions.
- Audio capture service records session audio.
- Whisper CLI engine transcribes to raw text.
- Snippets and style/lexicon context are applied.
- Cleanup engine runs (cloud or local fallback).
- Insertion service writes text into target app.
- Transcript entry is persisted for history.
Fallback behavior is explicit at multiple stages (cleanup and insertion), which is why Steno remains usable even during partial failures.
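One way to make a fallback explicit at the cleanup stage is a small wrapper engine that tries a primary engine and falls back to a second one on any error. This is a minimal sketch, not Steno's implementation: the cleanup(_:) signature and the FallbackCleanup, FailingCloudCleanup, and LocalCleanup types are all hypothetical.

```swift
import Foundation

// Assumed protocol shape; the real CleanupEngine may take more context.
protocol CleanupEngine: Sendable {
    func cleanup(_ rawText: String) async throws -> String
}

struct CleanupUnavailable: Error {}

// Stand-in for a cloud engine that is currently unreachable.
struct FailingCloudCleanup: CleanupEngine {
    func cleanup(_ rawText: String) async throws -> String {
        throw CleanupUnavailable()
    }
}

// Stand-in for a local engine: trims the ends and collapses runs of spaces.
struct LocalCleanup: CleanupEngine {
    func cleanup(_ rawText: String) async throws -> String {
        rawText.split(separator: " ").joined(separator: " ")
    }
}

// Tries the primary engine first; any thrown error routes to the fallback.
struct FallbackCleanup: CleanupEngine {
    let primary: any CleanupEngine
    let fallback: any CleanupEngine

    func cleanup(_ rawText: String) async throws -> String {
        do {
            return try await primary.cleanup(rawText)
        } catch {
            return try await fallback.cleanup(rawText)
        }
    }
}
```

Since the wrapper itself conforms to the same protocol, the rest of the pipeline does not need to know a fallback exists.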
Concurrency
Steno follows Swift 6 strict concurrency principles.
- DictationController is @MainActor for UI orchestration
- SessionCoordinator is an actor for session isolation
- domain models are value types and Sendable
- no singleton-heavy global mutable state in core paths
This reduces race conditions around recording state, cleanup decisions, and history persistence.
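A reduced sketch of that isolation follows. The SessionState enum, the start() and stop() methods, and the statusText property are assumptions for illustration; only the actor and @MainActor roles come from the description above. Because recording state lives inside the actor, two overlapping start requests cannot both succeed.

```swift
// Hypothetical state model: a value type that is Sendable.
enum SessionState: Sendable, Equatable {
    case idle
    case recording
}

actor SessionCoordinator {
    private(set) var state: SessionState = .idle

    // Returns true if this call started recording, false if one was already running.
    func start() -> Bool {
        guard state == .idle else { return false }
        state = .recording
        return true
    }

    func stop() {
        state = .idle
    }
}

// UI-facing controller: its mutable state is confined to the main actor.
@MainActor
final class DictationController {
    private let coordinator: SessionCoordinator
    private(set) var statusText = "Idle"

    init(coordinator: SessionCoordinator) {
        self.coordinator = coordinator
    }

    // Hops to the coordinator actor, then updates UI state back on the main actor.
    func hotkeyPressed() async {
        let started = await coordinator.start()
        statusText = started ? "Recording" : "Already recording"
    }
}
```

The check and the state change in start() run without suspension inside the actor, so no other caller can observe the state in between.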
Extension Points
Common extension areas for contributors:
- new cleanup engines conforming to CleanupEngine
- new insertion transports conforming to InsertionTransport
- alternative context providers or app-specific behaviors
- benchmark and validation tooling in StenoBenchmarkCore
When adding behavior, prefer protocol-first interfaces in StenoKit/Protocols and concrete implementations in StenoKit/Services.
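As an illustration of the protocol-first approach, a new cleanup engine might look like the sketch below. The cleanup(_:) requirement is an assumed shape for CleanupEngine, and FillerWordCleanup is a hypothetical engine invented for this example. In the real package the protocol would already exist under StenoKit/Protocols and only the struct would be added under StenoKit/Services.

```swift
import Foundation

// Assumed shape of the protocol that lives in StenoKit/Protocols.
protocol CleanupEngine: Sendable {
    func cleanup(_ rawText: String) async throws -> String
}

// Hypothetical new engine that would live in StenoKit/Services.
struct FillerWordCleanup: CleanupEngine {
    let fillers: Set<String>

    init(fillers: Set<String> = ["um", "uh", "erm"]) {
        self.fillers = fillers
    }

    // Drops words that match a filler once case and edge punctuation are ignored.
    func cleanup(_ rawText: String) async throws -> String {
        rawText
            .split(separator: " ")
            .filter { word in
                let bare = word.lowercased().trimmingCharacters(in: .punctuationCharacters)
                return !fillers.contains(bare)
            }
            .joined(separator: " ")
    }
}
```

The engine holds only immutable value-type state, so it satisfies Sendable without extra annotations and can be passed across actor boundaries.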