Skip to content

How it works

  • Wake word — pyopen-wakeword detects "Hey Jarvis" instantly
  • Speech-to-text — faster-whisper transcribes commands locally
  • Text-to-speech — Piper provides voice feedback
  • Mouse control — GNOME Shell extension with Clutter virtual input
  • Browser scroll — JavaScript injection via qutebrowser IPC
  • Dictation — AT-SPI accessibility framework

All processing happens locally. No data leaves your machine.

File structure

easyspeak/
├── src/
│   ├── core/
│   │   ├── __init__.py
│   │   ├── __main__.py
│   │   ├── about.py           # Standalone libadwaita "About" window
│   │   ├── cli.py             # CLI entry point: argument parsing + logging
│   │   ├── config.py          # Tuning constants + Whisper model factory
│   │   ├── gnome_extension.py # Installs/refreshes/enables the extension
│   │   ├── log.py             # Logging setup
│   │   ├── main.py            # EasySpeak class + main loop
│   │   ├── mediakeys.py       # Replays multimedia keys via Mutter
│   │   ├── speech.py          # Persistent piper -> player pipeline
│   │   └── tray.py            # Panel indicator + asleep lifecycle
│   ├── data/                  # Bundled data files (easyspeak.data), shipped as package data
│   │   ├── __init__.py        # Marks the folder as package data
│   │   ├── easyspeak.desktop  # Desktop entry
│   │   └── easyspeak-autostart.desktop
│   ├── gnome@easyspeak.dev/   # GNOME Shell extension (UUID-named, shipped as package data)
│   │   ├── schemas/           # GSettings schema (+ compiled) for the settings
│   │   ├── __init__.py        # Marks the folder as package data (easyspeak.gnome)
│   │   ├── extension-helpers.js  # Pure JS helpers imported by extension.js
│   │   ├── extension.js       # The extension itself
│   │   ├── grid.js            # Numbered grid overlay + pointer injection primitive
│   │   ├── indicator.js       # Panel indicator + Quick Settings toggle widgets
│   │   ├── metadata.json      # Extension metadata
│   │   ├── prefs.js           # Extension Settings dialog (autostart, Quick Settings)
│   │   ├── screenshot.js      # Screen capture primitive (Wayland framebuffer grab)
│   │   └── windows.js         # Window/workspace operations on the focused window
│   └── plugins/
│       ├── __init__.py
│       ├── 00_eyetrack.py     # Head tracking (experimental)
│       ├── 00_mousegrid.py    # Grid overlay mouse control
│       ├── apps.py            # Application launcher
│       ├── browser.py         # Qutebrowser control + auto-config
│       ├── dictation.py       # Voice-to-text + AT-SPI enablement
│       ├── files.py           # Folder navigation
│       ├── media.py           # Playback controls
│       ├── sleep.py           # Voice deactivate
│       ├── system.py          # Volume, brightness, DND
│       └── zz_base.py         # Help and exit
├── tests/
│   ├── acceptance/            # Gherkin scenarios (pytest-bdd)
│   ├── benchmarks/            # pytest-benchmark suites for bencher.dev
│   ├── integration/           # tests against real binaries
│   └── unit/                  # tests for src/core/ and src/plugins/
└── pyproject.toml

GNOME Shell extension

On first start, core auto-installs the GNOME Shell extension to the user-local extensions directory (unless GNOME already sees it via a system-wide install). Files copied:

~/.local/share/gnome-shell/extensions/gnome@easyspeak.dev/
├── schemas/
│   ├── gschemas.compiled
│   └── org.gnome.shell.extensions.easyspeak.gschema.xml
├── extension-helpers.js
├── extension.js
├── grid.js
├── indicator.js
├── metadata.json
├── prefs.js
├── screenshot.js
└── windows.js

The extension exposes the compositor-only primitives — panel indicator, mouse grid, window operations, and screen capture — that the Python plugins drive. Its Settings dialog (prefs.js) offers "Show EasySpeak in Quick Settings Menu", persisted in dconf via the bundled GSettings schema: when on, the asleep state is surfaced as a Quick Settings toggle — on while listening, off while asleep — instead of the panel tray icon. On Wayland, GNOME Shell only scans for extensions at login, so a oneshot systemd user unit (installed by EasySpeak) re-copies the bundled files before the shell loads them at each login. See core.gnome_extension for the full lifecycle.