Nightingale: AI Karaoke from Your Music Library

Nightingale: A Neural Karaoke Studio for Your Music Library
Introduction
Nightingale is an all-in-one application that brings karaoke to your personal music collection. Using neural networks for the heavy lifting, it scans your music folder, separates lead vocals from the instrumental, and produces synchronized lyrics with word-level timestamps. It also provides real-time pitch scoring, key and tempo adjustments, player profiles, and background visuals for both audio and video files. The application is self-contained: there is no manual installation of Python, ffmpeg, or machine learning models, because everything it needs is downloaded and bootstrapped automatically on first launch.
What Nightingale Does
- Scans your music folder and analyzes each file to prepare for karaoke-style playback.
- Separates lead vocals from instrumentation using the UVR Karaoke model by default, with an option to switch to Demucs for different vocal separation characteristics.
- Transcribes lyrics with word-level timestamps, either via automatic WhisperX transcription or by fetching synced lyrics from LRCLIB when available.
- Plays back instrumental tracks with on-screen, synchronized highlighting of lyrics, so you can sing along with confidence.
- Evaluates vocal performance in real time using pitch detection, with per-song scores and star ratings to track progress.
- Lets you shift key and tempo after analysis and generate cached playback variants for quick retries.
- Supports profiles to tailor preferences and keep scores organized per user context.
- Handles video files by separating vocals from the audio track while using the original video as a synchronized background.
- Offers a rich set of background themes and dynamic visuals to enhance the listening and singing experience.
- Provides a responsive sidebar with fast filters, metadata cleanups, and bulk analysis capabilities.
A Unified, Self-Contained Experience
Nightingale is designed to feel like a single package rather than a collection of tools. It ships as one binary, and it downloads all necessary components during setup. The application includes built-in support for:
- Multiple backends for stem separation (UVR Karaoke and Demucs)
- Accurate, word-level lyric alignment (WhisperX or LRCLIB)
- Real-time pitch scoring tied to your microphone input
- Flexible key and tempo transformation after analysis
- Profiles that enable you to quickly switch between different modes and preferences
- Video and image-friendly playback experiences with synchronized backgrounds
Feature Spotlight
Stem Separation: Natural, Flexible Vocals
- The default UVR Karaoke model isolates lead vocals from instrumental tracks while preserving backing vocals in the instrumental mix for a more authentic karaoke feel.
- Demucs is available as an alternative separation option, enabling experimentation with different vocal isolation characteristics.
- Users can adjust the guide vocal volume to fine-tune how prominently vocals appear in the mix, allowing for a customized karaoke experience.
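The guide vocal control can be pictured as a simple gain applied to the separated vocal stem before it is added back to the instrumental. The sketch below is illustrative only: the function name `mix_with_guide`, the 0.0 to 1.0 volume range, and the sample representation (floats in the range -1.0 to 1.0) are assumptions, not Nightingale's actual audio code.

```python
def mix_with_guide(instrumental, vocals, guide_volume):
    """Mix a vocal stem into an instrumental stem at the given guide volume.

    Both stems are equal-length lists of float samples in [-1.0, 1.0].
    guide_volume is clamped to [0.0, 1.0]; 0.0 mutes the guide vocals.
    """
    if len(instrumental) != len(vocals):
        raise ValueError("stems must have the same length")
    gain = min(max(guide_volume, 0.0), 1.0)
    mixed = []
    for inst, voc in zip(instrumental, vocals):
        sample = inst + gain * voc
        # Hard-clip so the sum never leaves the valid sample range.
        mixed.append(min(max(sample, -1.0), 1.0))
    return mixed
```

With `guide_volume` at 0.0 the output is the plain instrumental; raising it brings the original singer back in as a reference.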
Word-Level Lyrics: Precise, Timed Lyrics
- Lyrics can be transcribed automatically with precise word-level timestamps, providing alignment for real-time highlighting during playback.
- When available, synced lyrics are fetched from LRCLIB instead, which avoids a manual search and keeps the lyrics in sync with the music.
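Synced lyrics are commonly distributed in the LRC format, where each line carries a `[mm:ss.xx]` timestamp. As a rough illustration of what consuming such lyrics involves, the sketch below parses LRC text into `(seconds, text)` pairs. It assumes one timestamp per line and is not taken from Nightingale's code; the name `parse_lrc` is hypothetical.

```python
import re

# Matches "[mm:ss.xx] lyric text"; metadata tags like "[ar:Artist]" do not match.
LRC_LINE = re.compile(r"\[(\d+):(\d{2}(?:\.\d+)?)\]\s*(.*)")

def parse_lrc(text):
    """Return a sorted list of (start_seconds, lyric) tuples from LRC text."""
    lines = []
    for raw in text.splitlines():
        match = LRC_LINE.match(raw.strip())
        if not match:
            continue  # skip metadata tags and blank lines
        minutes, seconds, lyric = match.groups()
        lines.append((int(minutes) * 60 + float(seconds), lyric))
    return sorted(lines)
```

Lines that carry several timestamps at once are not handled by this sketch.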
Pitch Scoring: Instant Feedback
- Real-time pitch detection on your microphone input provides the basis for scoring your vocal performance as you sing.
- A per-song scoreboard and star ratings offer a motivating way to track improvement and compare performances across tracks.
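One common way to turn detected pitch into a score is to measure how far the sung frequency is from the target in cents (hundredths of a semitone). The sketch below shows that idea under stated assumptions: the tolerance thresholds, the octave folding, and the function names are illustrative choices, not Nightingale's actual scoring rules.

```python
import math

def cents_off(sung_hz, target_hz):
    """Signed pitch error in cents, folded so octave errors count as zero."""
    cents = 1200.0 * math.log2(sung_hz / target_hz)
    return (cents + 600.0) % 1200.0 - 600.0

def frame_score(sung_hz, target_hz, full_credit_cents=50.0, zero_credit_cents=200.0):
    """Score one analysis frame between 0.0 and 1.0."""
    error = abs(cents_off(sung_hz, target_hz))
    if error <= full_credit_cents:
        return 1.0
    if error >= zero_credit_cents:
        return 0.0
    # Linear falloff between the two thresholds.
    return 1.0 - (error - full_credit_cents) / (zero_credit_cents - full_credit_cents)

def song_score(frames):
    """Average frame scores; frames are (sung_hz or None, target_hz) pairs.

    A frame with no detected pitch (None) scores zero.
    """
    scores = [0.0 if sung is None else frame_score(sung, target)
              for sung, target in frames]
    return sum(scores) / len(scores) if scores else 0.0
```

Folding by octave means a singer who performs the melody an octave lower or higher is not penalized, which is a frequent design choice in karaoke scoring.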
Key & Tempo Shifts: Flexible Playback
- After analysis, you can adjust the song key and tempo to suit your singing range or practice needs.
- The app caches playback variants, so retrying different keys or paces is quick and painless.
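For intuition, a key shift of n semitones corresponds to multiplying every frequency by 2^(n/12), and a cached variant can be identified by the source file plus the chosen shift values. The sketch below illustrates both ideas; the key format and the names `pitch_ratio` and `variant_key` are hypothetical and do not describe Nightingale's cache layout.

```python
def pitch_ratio(semitones):
    """Frequency ratio for a shift of the given number of semitones."""
    return 2.0 ** (semitones / 12.0)

def variant_key(file_hash, semitones, tempo_percent):
    """Build a cache key for one key/tempo variant of an analyzed song.

    file_hash identifies the source file; semitones and tempo_percent
    are integers (e.g. -2 semitones at 90 percent speed).
    """
    return f"{file_hash}_k{semitones:+d}_t{tempo_percent}"
```

Because the key depends only on the source hash and the shift values, asking for the same variant twice can reuse the previously rendered audio.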
Profiles: Personalization and Progress
- Create and switch between different player profiles to tailor settings such as background themes, microphone routing, and lyric display preferences.
- Scores and performance data are tracked per profile, enabling separate progress tracking for different users or singing styles.
Video Files and Backgrounds: Visual Depth
- Drop video files into your music folder and Nightingale automatically separates the vocals, using the original video as a synchronized background.
- Seven background themes are available: five GPU shader backgrounds (Plasma, Aurora, Waves, Nebula, Starfield), Pixabay video backgrounds in five flavors (Nature, Underwater, Space, City, Countryside), and source video playback for video files.
- Nightingale also supports automatic source video playback for video files, creating a cinematic karaoke environment.
Sidebar, Library, and Filters: Efficient Organization
- Quick filters and metadata cleanup buckets make it easy to organize large music collections.
- Artists and albums can be grouped for efficient browsing, and a bulk analysis action helps you quickly prepare many tracks at once.
- The Analyze All action is designed for bulk processing, ensuring your entire library is karaoke-ready in one workflow.
Mic Mirroring and Gamepad Support: Immersive Interaction
- Mic mirroring lets you route your live mic into playback for low-latency practice and monitoring.
- Full gamepad support enables navigation, control, and interaction without a traditional mouse or keyboard, enhancing the experience for couch setups or lounges.
Adaptive UI Scaling: Works on Any Screen
- The user interface scales to fit any resolution, including 4K televisions, ensuring accessibility and ease of use in different environments.
Self-Contained: A True Standalone App
- All critical components (FFmpeg, uv, Python, PyTorch, and the necessary ML packages) are downloaded automatically during setup.
- Video backgrounds are pre-downloaded so the first session is ready to go, minimizing setup friction and maximizing your singing time.
Quick Start Guide
Getting Nightingale up and running is straightforward: setup is automated, so there are no dependencies to install by hand.
- Download the latest release for your platform from the Releases page and run it.
- On first launch, Nightingale guides you through setup steps and prompts you to choose a data folder.
- The Python environment and ML models are installed automatically, with everything needed for immediate use.
A note for macOS users
- Gatekeeper may block Nightingale if the app isn’t signed with an Apple Developer ID. After moving Nightingale.app to Applications, you can fix the quarantine issue with:
- xattr -cr /Applications/Nightingale.app
- This step clears the quarantine attribute and allows the app to run smoothly.
Supported File Formats
Nightingale is compatible with a wide range of audio and video formats, enabling you to work with most of your library:
- Audio: .mp3, .flac, .ogg, .wav, .m4a, .aac, .wma
- Video: .mp4, .mkv, .avi, .webm, .mov, .m4v
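A library scan typically begins by sorting files into audio, video, or unsupported based on their extension. The sketch below uses the extension lists above; the function names `classify` and `scan` are illustrative and not part of Nightingale.

```python
from pathlib import Path

AUDIO_EXTENSIONS = {".mp3", ".flac", ".ogg", ".wav", ".m4a", ".aac", ".wma"}
VIDEO_EXTENSIONS = {".mp4", ".mkv", ".avi", ".webm", ".mov", ".m4v"}

def classify(path):
    """Return "audio", "video", or None for an unsupported file."""
    suffix = Path(path).suffix.lower()
    if suffix in AUDIO_EXTENSIONS:
        return "audio"
    if suffix in VIDEO_EXTENSIONS:
        return "video"
    return None

def scan(folder):
    """Recursively list (path, kind) pairs for supported files in a folder."""
    found = []
    for path in sorted(Path(folder).rglob("*")):
        kind = classify(path) if path.is_file() else None
        if kind is not None:
            found.append((path, kind))
    return found
```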
Controls and Interaction
Navigation and playback controls are designed to be intuitive and accessible across devices.
Navigation and Panel Management
- Move: Arrow keys
- Confirm / Select: Enter
- Back / Cancel: Escape
- Switch panel: Tab
- Search songs: Type to filter
Playback Controls
- Pause / Resume: Space
- Exit to menu: Escape
- Toggle guide vocals: G
- Guide volume up/down: + / -
- Cycle background theme: T
- Cycle video flavor: F
- Toggle microphone: M
- Next microphone: N
- Toggle mic mirroring: R
- Toggle fullscreen: F11
- Skip Intro / Outro: Use on-screen buttons
How It Works: A Visual Pipeline
Nightingale's workflow can be thought of as a layered pipeline that begins with your input file and ends with a polished karaoke experience:
- Audio or video file enters Nightingale.
- UVR Karaoke (or Demucs) performs stem separation to produce a vocal-focused track and an instrumental track.
- If the file is a video, the original video track is used as the synchronized background.
- WhisperX analyzes the audio to produce word-level lyrics alignment, or LRCLIB is fetched when available.
- The Tauri-based app (Rust + React) coordinates playback, providing synchronized lyrics, pitch scoring, and parameter controls.
- The final result is a cohesive experience where the lyrics are highlighted in time with the music, backed by responsive visuals and adjustable song properties.
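The last step of the pipeline, highlighting lyrics in time with playback, reduces to finding which word has most recently started at the current playback position. A minimal sketch using binary search over word start times follows; the data layout and the name `active_word_index` are assumptions made for illustration.

```python
from bisect import bisect_right

def active_word_index(word_starts, playback_time):
    """Return the index of the word to highlight, or None before the first word.

    word_starts is a list of start times in seconds, sorted ascending.
    """
    position = bisect_right(word_starts, playback_time)
    return position - 1 if position > 0 else None
```

Binary search keeps the lookup cheap even when it runs on every rendered frame.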
Analysis Caching and Performance
- Nightingale caches analysis results using blake3 file hashes, ensuring re-analysis only happens if the source file changes or if you explicitly trigger a re-analysis.
- This design accelerates workflows when you work with large music libraries, allowing quick retries after key or tempo adjustments.
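The caching rule above can be sketched as a content hash check. Note the substitution: Nightingale uses blake3, which is not in the Python standard library, so this example uses `hashlib.blake2b` as a stand-in purely to keep the sketch dependency-free. The function names and the set-based cache are hypothetical.

```python
import hashlib

def file_digest(path, chunk_size=1 << 20):
    """Hash a file's contents in chunks (blake2b here as a stand-in for blake3)."""
    hasher = hashlib.blake2b(digest_size=32)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()

def needs_analysis(path, cache, force=False):
    """Return True if the file must be (re)analyzed.

    cache is a set of digests for which analysis results already exist.
    force models an explicit re-analysis request from the user.
    """
    digest = file_digest(path)
    return force or digest not in cache
```

Keying on file contents rather than file names means renaming or moving a song does not invalidate its analysis, while any change to the audio does.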
Hardware and Performance Details
The performance of Nightingale hinges on the hardware you use and the model backends selected.
- PyTorch-based analyzer detects the best backend automatically:
  - CUDA (NVIDIA GPU) provides the fastest analysis acceleration.
  - MPS (Apple Silicon) enables the Mac GPU-accelerated path; WhisperX alignment may fall back to CPU in some configurations.
  - CPU is always available as a fallback, though it is slower.
- UVR Karaoke model uses ONNX Runtime, enabling CUDA acceleration on NVIDIA GPUs or CoreML on Apple Silicon.
- Typical analysis time:
  - A song on GPU: roughly 2–5 minutes.
  - A song on CPU: roughly 10–20 minutes.
- The software footprint is kept reasonable by downloading and caching models as needed, rather than bundling massive assets in a single install.
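The backend detection described in this section follows a familiar PyTorch pattern: prefer CUDA, then MPS, then fall back to CPU. The sketch below shows that ordering using standard PyTorch availability checks. It is an assumption about how such detection is usually written, not Nightingale's analyzer code, and it returns "cpu" if PyTorch is not installed so the example can run anywhere.

```python
def pick_device():
    """Return "cuda", "mps", or "cpu" depending on what PyTorch can use."""
    try:
        import torch
    except ImportError:
        return "cpu"  # no PyTorch available; nothing to accelerate with
    if torch.cuda.is_available():
        return "cuda"
    # torch.backends.mps only exists in newer PyTorch releases.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"
```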
Data Storage and Folder Structure
During setup, you choose a data folder — Nightingale stores most runtime data there, while config.json and nightingale.log reside in your home directory under ~/.nightingale.
Example layout of the data folder:
- cache/: stems, transcripts, lyrics, shifted variants, covers, playable videos
- songs.db: SQLite library and analysis metadata
- profiles.json: player profiles and scores
- videos/: cached Pixabay video backgrounds
- sounds/: sound effects (e.g., celebratory cues)
- vendor/: pre-downloaded binaries
  - ffmpeg
  - uv
  - python
  - venv
  - analyzer
  - .ready
- models/: model caches
  - torch/ (Demucs)
  - huggingface/ (WhisperX)
  - audio_separator/ (UVR Karaoke)
Configuration details
- The config.json file stores app settings, including the selected data folder path, enabling you to move between machines or restore preferences with ease.
Video Backgrounds and Pixabay Integration
- Nightingale’s video backgrounds are a mix of GPU shader visuals and Pixabay-sourced video content.
- The Pixabay integration leverages the Pixabay API to fetch video backgrounds in a variety of flavors: Nature, Underwater, Space, City, and Countryside.
- In development builds, the Pixabay API key is embedded; for a custom setup, you can supply your own key via an environment file (PIXABAY_API_KEY=your_key_here).
Building from Source: For Developers and Power Users
If you’re interested in building Nightingale from source or contributing to the project, here are the essentials you’ll need:
Prerequisites
- Rust 1.85+ (edition 2024)
- Node.js 20+
- pnpm (latest)
- Linux dependencies if building on Linux: libwebkit2gtk-4.1-dev, libssl-dev, libayatana-appindicator3-dev, librsvg2-dev, libxdo-dev, libasound2-dev
Development and Release Builds
- Development: git clone https://github.com/rzru/nightingale; cd nightingale; cargo desktop dev
- Release: cargo desktop build
Supported Platforms
- Linux x86_64: x86_64-unknown-linux-gnu
- Linux aarch64: aarch64-unknown-linux-gnu
- macOS ARM: aarch64-apple-darwin
- macOS Intel: x86_64-apple-darwin
- Windows x86_64: x86_64-pc-windows-msvc
License
Nightingale is released under GPL-3.0-or-later. For full licensing details, refer to the LICENSE file included with the project.
Use Cases and Practical Tips
- Deep library karaoke: If you have a large collection, use the Analyze All action to bulk-analyze tracks and build a robust foundation of stems, lyrics, and playback variants.
- Video karaoke nights: Leverage Nightingale’s video-friendly mode to create engaging karaoke sessions with synchronized video backdrops and lyrics.
- Practice and learning: The mic mirroring feature allows you to route your live mic into playback for real-time practice, while the pitch scoring provides tangible feedback to guide improvement.
- Personalization: Create multiple profiles for different singers or moods, enabling tailored visuals and playback settings for each user.
What Makes Nightingale Stand Out
- A truly self-contained system that minimizes setup friction by automatically downloading essential dependencies.
- A dual-model approach to vocal separation (UVR Karaoke by default, with Demucs as an alternative) to suit different songs and vocal styles.
- Word-level lyric alignment that ensures precise lyric highlighting, complemented by lyrics fetched from LRCLIB when possible.
- Real-time pitch scoring, plus key and tempo changes that let singers adapt a song to their voice on the fly.
- A rich visual experience with seven background themes and the option to use source video as a backdrop for video files.
- A thoughtful, device-friendly interface with gamepad support and scalable UI for 4K displays.
Conclusion: A Seamless Karaoke Studio in Your Library
Nightingale brings together cutting-edge neural networks, intelligent workflow orchestration, and a polished user experience to deliver a complete karaoke studio built around your own music collection. Whether you’re practicing, performing, or simply enjoying a new way to listen to familiar tracks, Nightingale provides the tools you need to isolate vocals, synchronize lyrics, and interact with music in a hands-on, immersive way. The design emphasizes convenience—automatic model downloads, one binary installation, and an interface that adapts to your hardware and screen size—while preserving flexibility through multiple models, lyrics sources, and playback options. If you’ve ever wished for a high-quality, self-contained karaoke engine that lives inside your music library, Nightingale is the answer.
Repository: https://github.com/rzru/nightingale