Introduction
BirdWatchAI turns any camera pointed at your bird feeder into an automatic bird-watching station: it identifies species on-device, saves photos and short video clips, and (optionally) sends you a notification when something rare shows up. This Part introduces the project, the two editions you can choose between, and the high-level picture of how a single detection happens.
What BirdWatchAI is
BirdWatchAI sits between an ordinary network camera and you. It watches the camera's video feed continuously, notices when something moves, decides whether the moving thing is actually a bird (filtering out squirrels, swaying branches, and shadows), identifies the species using a local AI model, and then saves a labeled snapshot plus a short clip. Nothing leaves your home network unless you opt in to community sharing.
Key things it does:
- Identifies species automatically: recognizes up to 965 North American birds with the default model, or 2,498 species worldwide with the optional larger model.
- Captures the moment: a high-resolution snapshot and a short video clip with the species, confidence, timestamp, temperature, and location burned in as an overlay.
- Tells you what's rare: compares each species against regional / seasonal data so a Wood Thrush stands out from a House Sparrow.
- Notifies you: email, mobile push (Pushover or ntfy), Windows toast, or even a digital photo frame.
- Builds a journal: a searchable detection history, statistics dashboards, a life list, and (in the desktop app) gamified achievement badges.
- Connects you to others: optionally publish sightings to the worldwide community feed at www.birdwatchai.com.
Desktop vs. Server: which one should I run?
BirdWatchAI ships in two flavors. They share the same AI models and the same community backend; they differ mainly in where and how they run.
| | Windows Desktop App | Server (Docker) |
|---|---|---|
| Where it runs | Windows 10 / 11 PC | Raspberry Pi 4 / 5, Linux x86_64, or Windows + Docker Desktop |
| Interface | Native Windows Forms window | Web dashboard on port 8080; open from any device on your network |
| Best for | A PC that's already on most of the day; people who want the full feature set with no command line | Always-on operation, low power draw, headless / remote access; people comfortable copy-pasting one or two commands |
| Camera types | RTSP (any IP camera) | RTSP or wired Raspberry Pi Camera Module (CSI ribbon) |
| Updates | In-app update check; installer re-runs | One-click "Apply update" from the dashboard (via Watchtower sidecar) |
| Status | Mature (v2.1) | Newer port; tracks the desktop app feature set |
You can run either edition, but never both pointing at the same camera at the same time, since most cameras allow only a handful of simultaneous RTSP streams. Pick the one that fits your situation; the rest of this manual flags instructions that apply to only one edition.
Feature matrix
What's available where:
| Feature | Desktop | Server |
|---|---|---|
| Local AI identification (965 species) | ✓ | ✓ |
| SpeciesNet model (2,498 species) | ✓ | ✓ |
| RTSP cameras | ✓ | ✓ |
| Wired Pi camera (CSI) | ✗ | ✓ |
| Snapshot + video clip per detection | ✓ | ✓ |
| Burn-in overlay on snapshots / clips | ✓ | ✓ |
| Email notifications | โ | โ |
| Pushover notifications | โ | โ |
| ntfy push | โ | โ |
| Windows toast notifications | โ | โ |
| Photo frame (FTP / email-to-frame) | โ | โ |
| Detection history (searchable) | ✓ | ✓ |
| Statistics dashboards | โ | โ |
| Species-grouped gallery + slideshow | โ | โ |
| Achievements / badges | ✓ | ✗ |
| Community sharing | ✓ | ✓ |
| Weekly / monthly summary reports | โ | โ |
| Video summaries (highlight reels) | โ | โ |
| Remote access (without VPN) | ✗ | ✓ (web dashboard) |
| One-click in-app update | โ | โ (Watchtower) |
System requirements
Windows desktop app
| Component | Requirement |
|---|---|
| Operating system | Windows 10 or 11, 64-bit only |
| Runtime | .NET 8 Desktop Runtime (x64); the installer offers to fetch this if missing |
| WebView2 Runtime | Needed for the "Pick on Map" location feature (pre-installed on Win 10/11) |
| Camera | Any RTSP-capable IP camera on your local network |
| Memory | ~200–300 MB RAM in use |
| CPU | Under 5% idle; brief 10–20% spike during a detection |
| Disk | A few GB for snapshots and clips, depending on how much you keep |
Server (Raspberry Pi or Docker)
| Component | Requirement |
|---|---|
| Hardware | Raspberry Pi 4 (4 GB+) or Pi 5; or any x86_64 Linux box; or Windows 10/11 with Docker Desktop (8 GB RAM minimum, 16 GB recommended) |
| Storage | 32 GB+ SD card (class A2) on a Pi; 20 GB free disk on Windows / Linux for the image + data folder |
| OS | Raspberry Pi OS Lite 64-bit, any modern x86_64 Linux, or Windows 10 1903+ / Windows 11 |
| Software | Docker (Engine on Linux, Docker Desktop on Windows) |
| Camera | RTSP camera or, on a Pi, a wired Camera Module over CSI |
| Network | Wired Ethernet recommended; Wi-Fi works but is the most common source of "weird intermittent issues" |
How a detection happens
Both editions run the same pipeline. When monitoring is active:
- Motion is detected, either by the camera's own hardware (via ONVIF, recommended for Tapo) or by the app comparing low-resolution frames.
- A high-resolution snapshot is captured immediately, and a short video clip starts recording (using a rolling pre-buffer so the clip includes the seconds before the trigger).
- Best-frame extraction picks the sharpest frame from the clip, which matters when the bird arrives a moment after the motion trigger.
- An object detector finds the bird, crops to it, and silently drops non-bird motion like squirrels or branches.
- The classifier runs the local AI model on the cropped bird and returns the most likely species plus a confidence percentage.
- Enrichment looks up the current outdoor temperature and the regional rarity of the species (when your location is set).
- If confidence is above the threshold, the snapshot is saved with an on-image overlay, the clip is labeled, the detection is added to history, and any enabled notifications fire. Below-threshold detections are discarded by default (or kept in a "for review" folder if you opt in).
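The first two steps above, software motion detection and the rolling pre-buffer, can be illustrated with a small Python sketch. This is not BirdWatchAI's actual code: the names `motion_score` and `PreBufferRecorder`, the parameters `pixel_delta` and `trigger_fraction`, and their default values are all hypothetical, and frames are modeled as plain nested lists of grayscale values so the example has no dependencies.

```python
from collections import deque

Frame = list[list[int]]  # grayscale frame: rows of 0-255 pixel values


def motion_score(prev: Frame, curr: Frame, pixel_delta: int = 25) -> float:
    """Fraction of pixels whose brightness changed by more than pixel_delta."""
    changed = total = 0
    for row_prev, row_curr in zip(prev, curr):
        for a, b in zip(row_prev, row_curr):
            total += 1
            if abs(a - b) > pixel_delta:
                changed += 1
    return changed / total if total else 0.0


class PreBufferRecorder:
    """Keeps the last `pre_frames` frames so a clip can start before the trigger.

    Assumes post_frames >= 1. All thresholds here are illustrative guesses.
    """

    def __init__(self, pre_frames: int, post_frames: int,
                 trigger_fraction: float = 0.05):
        self.pre = deque(maxlen=pre_frames)  # rolling pre-buffer
        self.post_frames = post_frames
        self.trigger_fraction = trigger_fraction
        self.prev = None       # previous frame, for comparison
        self.clip = None       # frames of the clip being recorded, if any
        self.remaining = 0     # post-trigger frames still to record

    def feed(self, frame: Frame):
        """Feed one frame; return a finished clip (list of frames) or None."""
        finished = None
        if self.clip is not None:
            # Already recording: append until the post-trigger quota is met.
            self.clip.append(frame)
            self.remaining -= 1
            if self.remaining <= 0:
                finished, self.clip = self.clip, None
        elif (self.prev is not None
              and motion_score(self.prev, frame) >= self.trigger_fraction):
            # Motion trigger: start the clip with the buffered earlier frames.
            self.clip = list(self.pre) + [frame]
            self.remaining = self.post_frames
        self.pre.append(frame)
        self.prev = frame
        return finished
```

Because the deque has a fixed `maxlen`, old frames fall off automatically, so memory use stays bounded no matter how long the feeder sits idle.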
That whole pipeline runs in a couple of seconds per detection. The rest of this manual is about getting the camera connected, tuning the thresholds, and reading the results.
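To make the later stages concrete, here is a minimal Python sketch of best-frame selection and the confidence-threshold decision. It rests on assumptions: the manual does not say how sharpness is measured, so the sketch uses a simple gradient-energy score, and the names `sharpness`, `best_frame`, `Classification`, and `route`, along with the 0.70 default threshold, are hypothetical rather than taken from BirdWatchAI.

```python
from dataclasses import dataclass


def sharpness(frame):
    """Gradient energy: mean squared difference between horizontal neighbours.

    An assumed stand-in for whatever sharpness measure the app really uses.
    """
    diffs = [(row[i + 1] - row[i]) ** 2
             for row in frame
             for i in range(len(row) - 1)]
    return sum(diffs) / len(diffs) if diffs else 0.0


def best_frame(clip):
    """Return the frame in the clip with the highest sharpness score."""
    return max(clip, key=sharpness)


@dataclass
class Classification:
    species: str
    confidence: float  # 0.0 to 1.0


def route(result: Classification, threshold: float = 0.70,
          keep_for_review: bool = False) -> str:
    """Decide what happens to a detection.

    Treating a confidence exactly equal to the threshold as a pass is an
    assumption; the manual only says "above the threshold".
    """
    if result.confidence >= threshold:
        return "save"  # save snapshot, label clip, add to history, notify
    return "review" if keep_for_review else "discard"
```

For example, `route(Classification("Wood Thrush", 0.91))` returns `"save"`, while a 0.40-confidence result returns `"discard"`, or `"review"` when `keep_for_review=True`. Raising the threshold trades missed detections for fewer mislabeled ones.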