Immich's machine learning features turn a photo library backup tool into a fully searchable, face-tagged photo management platform. You can search "dog on beach" and get accurate results, tag faces once and have Immich find that person across 50,000 photos, and organise memories by location, event, and subject without manual tagging. These features run entirely on your NAS hardware using open-source ML models downloaded at setup. This guide covers how each AI feature works, how to configure them, and what to expect from NAS hardware performance, including which models to disable on ARM hardware and how to tune ML processing for Intel and AMD NAS CPUs.
In short: Immich's AI runs automatically after a new photo is uploaded. Face detection, CLIP embeddings, and object detection all trigger via background jobs. Your main job is: (1) tag faces as they appear in the People view, and (2) ensure the ML container has enough RAM (disable ML on ARM NAS). On Intel NAS hardware, initial ML indexing of a 10,000-photo library takes 8-15 hours in the background.
Immich's Three AI Pipelines
Immich's machine learning is split into three separate functions, each using different models:
1. Facial Recognition: Uses a two-stage pipeline: face detection (finding faces in photos) followed by face recognition (matching faces to known people). Immich uses InsightFace models. On upload, every detected face is clustered into groups. You assign a name to a face cluster (e.g. "Jane") and Immich then finds all matching faces across your library. This is continuous: new photos with known faces are tagged automatically.
2. CLIP Semantic Search (Smart Search): Uses OpenAI's CLIP model to embed every photo as a vector representation. This enables natural language search: "birthday cake", "sunset with silhouette", "cat sleeping". Search results are semantically matched. The model understands visual concepts, not just filename keywords. CLIP processing is CPU-intensive on initial indexing.
3. Object Detection / Smart Memories (optional): Additional ML features including automatic scene classification. Less critical than face recognition and CLIP for most users.
Configuring Face Recognition
Face recognition is enabled by default in Immich when the ML container is running. Initial setup after library import:
- Go to Explore → People in Immich web interface
- Immich shows clustered face groups. Each group is a person it has detected
- Click on a group and assign a name. Immich shows you representative photos from the cluster to confirm identity
- After naming, Immich searches the rest of the library for matching faces and adds them to the person's timeline
- Review the "Unconfirmed" section under People. Immich will suggest additional face matches for your review
Improving accuracy: The more faces you tag, the better Immich's recognition becomes. Correct any misidentified faces by clicking the face in a photo → Reassign face. Over time, the recognition improves as Immich builds a larger reference set for each person.
Clustering sensitivity: Under Admin Settings → Machine Learning → Facial Recognition, you can adjust the MIN_FACES threshold (minimum faces in a cluster before it's shown) and MAX_RECOGNITION_DISTANCE (how strict the face matching is). Lower distance = stricter matching, fewer false positives but some true matches missed. Higher distance = more permissive, catches more matches but may mix up similar-looking people.
Using CLIP Smart Search
Once CLIP processing is complete, the search bar in Immich changes from filename-only to semantic search. To search semantically:
- Click the search icon (magnifying glass) in the Immich top bar
- Type a natural language description: "birthday cake with candles", "mountains with snow", "kids playing in water"
- Results appear ranked by visual relevance, with the photos that most closely match your description shown first
Combine CLIP search with other filters: person filter (find all photos of Jane at the beach), date range filter, album filter. The combination is powerful: "Jane hiking" filters for photos of Jane that visually match hiking scenes.
CLIP search quality depends on model size. Immich's default CLIP model (ViT-B/32) is fast and capable. Larger models (ViT-L/14) are available and produce better search results but require more RAM and processing time. Configure in Admin Settings → Machine Learning → Smart Search → CLIP Model.
Performance on NAS Hardware
ML processing rates on NAS hardware for initial library indexing:
- Intel Celeron N5095 (TS-464): ~500-1,000 photos/hour for CLIP + face detection combined. A 10,000-photo library: 10-20 hours initial indexing. Runs in background, does not affect NAS usability
- Intel Core i5 / Core i7: ~3,000-5,000 photos/hour. Initial indexing completes in 2-4 hours for 10,000 photos
- ARM Cortex-A55 (TS-233, DS223): ~50-100 photos/hour. A 10,000-photo library takes 4-8 days. Disable ML on ARM hardware unless you have significant patience
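The throughput figures above make it easy to estimate indexing time for your own library. A minimal sketch; the photos-per-hour rates used here are rough midpoints of the ranges quoted above, so adjust them for your CPU:

```shell
# Estimate initial ML indexing time from library size and throughput.
# Rates (photos/hour) are midpoint assumptions from the figures above.
estimate_hours() {
  photos=$1
  rate=$2
  # integer division, rounded up to whole hours
  echo $(( (photos + rate - 1) / rate ))
}

estimate_hours 10000 750    # Celeron N5095:  ~14 hours
estimate_hours 10000 4000   # Core i5/i7:     ~3 hours
estimate_hours 10000 75     # ARM Cortex-A55: ~134 hours (~6 days)
```

The ARM figure is why disabling ML there is the pragmatic choice: nearly a week of sustained CPU load for a library an Intel NAS indexes overnight.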
To disable ML on ARM NAS: set MACHINE_LEARNING_ENABLED=false in the .env file and restart the stack. You retain all core Immich features (photo backup, timeline, albums, basic search by date/album/filename); only semantic search and face recognition are disabled.
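On the command line, that toggle is a one-line change to the `.env` file that sits alongside Immich's `docker-compose.yml`. A hedged sketch: it runs against a scratch file in `/tmp` for safety, the stand-in `UPLOAD_LOCATION` contents are an assumption, and the restart commands assume Docker Compose v2. Point `ENV_FILE` at your real `.env` to use it:

```shell
# Demo against a scratch copy; set ENV_FILE to your real Immich .env.
ENV_FILE="/tmp/immich-demo.env"
printf 'UPLOAD_LOCATION=/volume1/photos\n' > "$ENV_FILE"   # stand-in contents

# Set MACHINE_LEARNING_ENABLED=false, creating the line if absent.
if grep -q '^MACHINE_LEARNING_ENABLED=' "$ENV_FILE"; then
  # update the existing line in place (GNU sed; BSD sed needs -i '')
  sed -i 's/^MACHINE_LEARNING_ENABLED=.*/MACHINE_LEARNING_ENABLED=false/' "$ENV_FILE"
else
  # append the flag
  echo 'MACHINE_LEARNING_ENABLED=false' >> "$ENV_FILE"
fi

grep '^MACHINE_LEARNING_ENABLED=' "$ENV_FILE"
# Then restart the stack from the compose directory:
#   docker compose down && docker compose up -d
```

Re-enabling later is the same edit with `true`; previously uploaded photos can then be indexed via the manual job triggers described in the Jobs section below.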
Tuning ML Performance
Several settings control how aggressively Immich runs ML processing:
Concurrent threads: Admin Settings → Machine Learning → Concurrency. Default is typically 2 threads. On a quad-core Celeron with other services running, setting this to 1 reduces impact on NAS responsiveness during initial indexing. Set back to 2-4 after initial indexing is complete for faster ongoing processing.
Processing priority: Immich ML jobs run at normal priority. If NAS performance is noticeably degraded during indexing (slow SMB transfers, sluggish web UI), reduce concurrency or schedule ML-intensive tasks during off-hours via the Admin Settings → Jobs → Smart Search / Face Detection schedules.
Model caching: Immich caches ML models in memory between processing runs. The ML container holds models in RAM. This is the primary RAM consumer. With the default models loaded, the ML container uses approximately 600MB-1.5GB RAM. If RAM is tight, the container may page models in/out on each job run, adding latency.
🇦🇺 Australian Users: Recommended NAS for Immich ML
Best NAS for Immich with AI features in Australia (March 2026):
- QNAP TS-464 (~$989): Intel Celeron N5095, ships with 8GB RAM. Strong choice for Immich ML. Initial indexing completes in a day on typical family photo libraries (5,000-20,000 photos)
- Synology DS423+ (~$980): Intel Celeron J4125, 2GB stock RAM; you must upgrade to 8GB before running Immich ML. After the RAM upgrade, performance is similar to the TS-464
- QNAP TS-473A (~$1,269): AMD Ryzen V1500B. Faster ML processing, and the PCIe slot supports future GPU passthrough for dramatically faster inference
For the best photo AI experience on a NAS today, the TS-464 with 8GB RAM (stock) provides the best cost/performance balance. See the Immich setup guide for Docker installation steps and the best NAS for AI guide for broader AI workload comparisons.
Related reading: our NAS buyer's guide, our NAS vs cloud storage comparison, and our NAS explainer.
Use our free AI Hardware Requirements Calculator to size the hardware you need to run AI locally.
Does Immich face recognition work as well as Google Photos?
Close, but not quite. Google Photos face recognition is among the best in the industry, trained on billions of images. Immich uses InsightFace models. Accuracy is good for clear, frontal faces but less reliable for partially obscured faces, extreme angles, children growing up, or faces with significant lighting variation. For most household photo libraries, Immich achieves 80-95% accuracy once you've tagged faces across a diverse set of reference photos. It improves over time as you correct misidentifications. For most users who don't need perfect Google-level accuracy, Immich is good enough to be genuinely useful.
Can I use CLIP search offline?
Yes. CLIP processing runs entirely on your NAS hardware. Once photos are indexed, CLIP search works with no internet connection. The CLIP model is downloaded when you first deploy Immich (~300MB for the default model) and then runs locally. Your search queries never leave your network, which makes Immich's smart search genuinely private, unlike Google Photos semantic search, which processes queries on Google's servers.
How do I re-run ML on photos that weren't processed?
In Immich Admin → Jobs, you can trigger specific ML jobs manually: Smart Search (CLIP), Face Detection, and Face Recognition. Click Run on any job to process items not yet indexed. If ML was disabled when photos were uploaded and you later enable it, running these jobs processes the backlog. The Jobs page shows progress and estimated completion time.
Is Immich's face recognition data stored on the NAS?
Yes. All face embeddings, person names, and ML processing results are stored in the Immich PostgreSQL database, which lives in a Docker volume on your NAS. No ML data is sent to external services. This is fully local and private. Face embeddings are numerical vectors (not raw photos) stored in the database. They are used for matching but cannot reconstruct the original photos.
Will Immich ML slow down my NAS?
During initial indexing of a large library, yes. ML processing uses significant CPU. On a TS-464 with 8GB RAM running Immich, CPU usage during indexing is typically 60-90% of one or two cores. SMB file transfers and other NAS services remain functional but may be slightly slower. The impact is most noticeable during the first 12-24 hours after a large library import. Once initial indexing is complete, ongoing ML processing (new photos only) runs quickly and has negligible impact on NAS performance.
Haven't installed Immich yet? The Immich on NAS setup guide covers the complete Docker deployment, mobile app connection, and initial configuration.
Immich Setup Guide →