Kingfisher: Open Source Secret Scanner with Live Validation
- Introduction
- Kingfisher is an open source secret scanner with live validation, written in Rust for speed and reliability at scale. It combines Hyperscan, Intel's SIMD-accelerated regex engine, with language-aware parsing to achieve high accuracy across large codebases and collections.
- The project ships with 942 built-in rules designed to detect, validate, and triage leaked API keys, tokens, and credentials before they ever reach production environments. This accelerates secure development and reduces the blast radius of credential exposure.
- A browser-based report viewer accompanies Kingfisher, enabling visualization and triage of findings from Kingfisher as well as JSON reports produced by Gitleaks and TruffleHog. The viewer can be run locally via the CLI or accessed as a hosted experience on the docs site.
- Kingfisher is designed for offensive security engineers and blue-team defenders alike. It can scan a wide array of targets including repositories, cloud storage, chat messages, documentation, and CI pipelines, helping teams locate and verify exposed secrets quickly and accurately.
[Image: Kingfisher logo]
- The project emphasizes practical workflows for teams that need rapid, trustworthy secret detection, remediation, and audit-ready reporting. The combination of live validation, broad platform coverage, and a unified triage UI makes Kingfisher suitable for proactive security programs and DevSecOps practices.
- What Is Kingfisher?
- Kingfisher is a high-performance secret detection tool that targets source code and developer platforms. It answers common questions around “GitHub secret scanners,” “API key scanners,” and “token leak detection” with a workflow that integrates detection, live validation, and remediation in a single tool.
- Core capabilities include scanning code, Git history, and multiple platform integrations (GitHub, GitLab, Azure Repos, Bitbucket, Gitea, Hugging Face, Jira, Confluence, Slack, Microsoft Teams, Docker, AWS S3, Google Cloud Storage, and more).
- For credentials discovered during a scan, Kingfisher can validate them against their providers’ APIs to reduce false positives. When supported, it can revoke secrets directly from the CLI, streamlining remediation.
- Outputs from Kingfisher are available in multiple formats (JSON, SARIF, TOON, and HTML) to support security teams, compliance processes, and CI pipelines. The library crates also enable embedding the engine into your own Rust applications.
- Key Features
3.1 Performance, Accuracy, and 942 Rules
- Kingfisher emphasizes performance: multithreaded scanning powered by Hyperscan ensures fast processing for large codebases.
- It ships with 942 built-in rules (484 with live validation) and supports YAML-defined custom rules for extensibility.
- A notable capability is validate-and-revoke: live validation of discovered secrets and direct revocation of supported secrets across multiple providers (GitHub, GitLab, Slack, AWS, GCP, and others) to minimize exposure time.
- With --access-map, a blast-radius mapping mode maps leaked keys to their effective cloud identities and the resources those identities can reach.
- The rule set covers a broad range of categories, including cloud providers, AI/ML tokens, CI/CD secrets, database credentials, and SaaS API keys, ensuring deep coverage across modern developer ecosystems.
- Visual inspection and triage are enhanced by a browser-based viewer that supports both Kingfisher’s native findings and external reports.
3.2 Multiple Scan Targets
- Kingfisher supports a wide spectrum of scan targets, enabling comprehensive secret detection across development workflows. This includes:
- Files and directories
- Local Git repositories
- GitHub organizations and repositories
- GitLab groups and projects
- Azure Repos
- Bitbucket workspaces
- Gitea organizations and repositories
- Hugging Face models and datasets
- Docker images
- Jira issues
- Confluence pages
- Slack messages
- Microsoft Teams messages
- AWS S3 buckets
- Google Cloud Storage buckets
- These targets demonstrate the tool’s breadth, allowing teams to scan conventional code assets and modern collaboration/storage platforms from a single interface.
- Visual icons accompany each target in the original docs, illustrating the breadth of integrations (e.g., Docker, Jira, Confluence, Slack, Teams, AWS S3, Google Cloud, and more).
3.3 Live Validation and Revocation
- Live validation verifies discovered secrets against provider APIs to confirm their status, reducing false positives and accelerating remediation decisions.
- When supported, tokens can be revoked directly from the CLI, enabling quick containment of exposed credentials without leaving the workflow.
- A revocation coverage matrix outlines which providers and rule IDs are currently supported, enabling teams to plan remediation strategies across the ecosystem.
3.4 Blast Radius Mapping
- A key capability is the blast radius mapping feature, which helps security teams understand the potential impact of a leaked credential. By mapping the credential to its cloud or service scope (e.g., which AWS resources or which GCP projects are implicated), teams can prioritize remediation according to risk.
- The --access-map option initiates this mapping, and the resulting data can be visualized in the report viewer to support triage and risk assessment.
3.5 Broad AI SaaS Coverage
- Kingfisher maintains broad coverage for AI and SaaS token ecosystems, including providers like OpenAI, Anthropic, Google Gemini, Cohere, AWS Bedrock, and many others.
- The tool can identify credentials for a wide array of AI/ML platforms and SaaS services, ensuring that modern tokenization patterns are detected and validated.
3.6 Compressed Files and Data Types
- The scanner supports extracting and scanning secrets from compressed files (tar.gz, bz2, xz) and ZIP-family containers (zip, jar, docx, xlsx, pptx, odt, epub, etc.), as well as asar, Hancom HWP (DEFLATE/zlib streams), and EGG (ALZip) scanning.
- Additional data formats include SQLite databases (to inspect contents stored in table rows) and Python bytecode (.pyc/.pyo) by extracting and scanning string constants.
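The bytecode handling described above can be illustrated with the Python standard library. This is a minimal sketch of the general idea (walking a code object's constants recursively), not Kingfisher's actual Rust implementation; the function name `iter_string_constants` and the sample source are hypothetical.

```python
import types
from typing import Iterator


def iter_string_constants(code: types.CodeType) -> Iterator[str]:
    """Yield every string constant in a code object, including nested ones."""
    for const in code.co_consts:
        if isinstance(const, str):
            yield const
        elif isinstance(const, types.CodeType):
            # Functions, classes, and comprehensions compile to nested code objects.
            yield from iter_string_constants(const)
        # Note: strings inside tuple constants are not descended into in this sketch.


# Hypothetical module source. A real .pyc file would be loaded with the
# marshal module after skipping its header; compile() keeps the sketch short.
SAMPLE_SOURCE = '''
def connect():
    api_key = "EXAMPLE-NOT-A-REAL-KEY-0001"
    return api_key
'''

code = compile(SAMPLE_SOURCE, "sample.py", "exec")
constants = list(iter_string_constants(code))
```

Each extracted string could then be passed to the same pattern matching used for plain source files.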
3.7 Baseline Management
- Baseline management enables teams to track known secrets and suppress false positives over time. This supports a stable scanning process by distinguishing new findings from previously known exposures.
- Baselines can be created and updated to reflect the evolving risk posture of a project.
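The baseline idea can be sketched as a simple partition over finding fingerprints. This is an illustrative sketch only: the `fingerprint` field name, the dictionary layout, and both helper functions are assumptions, and Kingfisher's real baseline file format is described in its BASELINE documentation.

```python
from typing import Dict, List, Set, Tuple

Finding = Dict[str, str]


def split_by_baseline(
    findings: List[Finding], baseline: Set[str]
) -> Tuple[List[Finding], List[Finding]]:
    """Partition findings into (new, known) by fingerprint membership."""
    new: List[Finding] = []
    known: List[Finding] = []
    for finding in findings:
        (known if finding["fingerprint"] in baseline else new).append(finding)
    return new, known


def update_baseline(baseline: Set[str], findings: List[Finding]) -> Set[str]:
    """Return a new baseline that also covers the given findings."""
    return baseline | {finding["fingerprint"] for finding in findings}


# Hypothetical data: one finding already accepted, one newly introduced.
baseline = {"fp-001"}
findings = [
    {"fingerprint": "fp-001", "rule": "example.rule.a"},
    {"fingerprint": "fp-002", "rule": "example.rule.b"},
]
new, known = split_by_baseline(findings, baseline)
refreshed = update_baseline(baseline, new)
```

Only the `new` list would need review in this model; `known` findings stay recorded but suppressed.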
3.8 Checksum-Aware Detection
- A standout feature is checksum-aware matching, enabling offline structural verification of tokens that include internal checksums.
- Because the check validates a credential's structure without making any API calls, it works offline, reduces false positives, and speeds up triage by discarding invalid tokens early. It fits modern token formats that embed checksums for self-verification.
- Documentation highlighting this capability and related templating is available in the RULES documentation.
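To make checksum-aware matching concrete, here is a minimal sketch using an invented token format: the `demo_` prefix, a 20-character body, and an 8-digit hexadecimal CRC32 suffix are all hypothetical and do not correspond to any real provider or to Kingfisher's rule syntax. The point is only that a candidate match can be rejected offline when its embedded checksum does not agree with its body.

```python
import re
import zlib
from typing import List

# Hypothetical format: "demo_" + 20 alphanumeric chars + 8 hex chars (CRC32 of the body).
TOKEN_RE = re.compile(r"\bdemo_([A-Za-z0-9]{20})([0-9a-f]{8})\b")


def checksum(body: str) -> str:
    """CRC32 of the token body, rendered as 8 lowercase hex digits."""
    return format(zlib.crc32(body.encode("ascii")), "08x")


def make_token(body: str) -> str:
    """Build a well-formed token for the hypothetical format."""
    return f"demo_{body}{checksum(body)}"


def find_valid_tokens(text: str) -> List[str]:
    """Return regex matches whose embedded checksum agrees with the body."""
    return [
        match.group(0)
        for match in TOKEN_RE.finditer(text)
        if checksum(match.group(1)) == match.group(2)
    ]


good = make_token("A" * 20)
# Corrupt the final checksum digit to simulate a look-alike string.
bad = good[:-1] + ("0" if good[-1] != "0" else "1")
hits = find_valid_tokens(f"key={good} other={bad}")
```

Both strings match the regular expression, but only the one with a consistent checksum survives, and no network request is involved.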
3.9 Report Viewer (Local and Hosted)
- Kingfisher ships a browser-based report viewer and triager for Kingfisher JSON/JSONL, Gitleaks JSON, and TruffleHog JSON/JSONL outputs.
- Two usage modes:
- Local viewing via the CLI (kingfisher view ./report.json), which runs entirely client-side with no data leaving the machine.
- Hosted viewing via the static viewer on the docs site (https://mongodb.github.io/kingfisher/viewer/), which enables uploading multiple reports and triaging in a centralized UI.
- The viewer supports deduplication, blast-radius linking, and export of triage decisions for ticketing or rotation playbooks. Its cross-tool enrichment correlates findings from different scanners and enriches them with Kingfisher's validation data.
- Note: scanning with --view-report starts a local web server (default port 7890) and opens the browser. The server binds to 127.0.0.1 by default for security.
3.10 Audit Reporting
- In addition to detection, Kingfisher supports generating audit-ready HTML reports that include scan metadata, timestamps, validation status, and links to findings at the file level.
- Audit metadata can be tailored to support compliance workflows, including evidence-friendly information about the scan and sanitized command arguments.
- This enables teams to demonstrate that secure development controls are operating in CI/CD and developer workflows.
3.11 Library Crates
- Kingfisher provides library crates that allow embedding the scanning engine into other Rust applications. This Beta feature enables developers to integrate Kingfisher’s capabilities directly into their own tooling and workflows, expanding the reach of the scanning technology beyond the standalone CLI.
3.12 Benchmark Results
- Benchmark results and performance comparisons are provided in the documentation, giving readers visibility into relative speed and efficiency across scenarios.
3.13 Visual and Demo Assets
- The documentation includes runtime comparisons (Kingfisher Runtime Comparison image) to illustrate performance characteristics.
- A basic usage demo showcases a command-line workflow and a slower replay for demonstration purposes (with a GIF illustrating scanning in action).
- An access-map viewer demo GIF demonstrates how the visualization can be used to interpret blast-radius data.
[Image: Kingfisher runtime comparison]
- Basic Usage and Quick Start
4.1 Quick Start: Install Kingfisher
- Install options include:
- Homebrew for Linux/macOS: brew install kingfisher
- PyPI with uv: uv tool install kingfisher-bin
- One-line installer scripts for Linux/macOS/Windows
- Docker: docker run --rm -v "$PWD":/src ghcr.io/mongodb/kingfisher:latest scan /src
- Pre-built releases from GitHub
- Pre-commit hooks integration (git hooks, pre-commit framework, Husky)
- Compile from source if you prefer building locally
- The docs contain a consolidated installation guide with complete steps.
4.2 Quick Start: Basic Scan
- To scan a directory for secrets with validation:
- kingfisher scan /path/to/code
- Without validation:
- kingfisher scan ~/src/myrepo --no-validate
- Turbo mode (max speed by disabling certain checks):
- kingfisher scan ~/src/myrepo --turbo
- Display only validated findings:
- kingfisher scan /path/to/repo --only-valid
- Output formats and redirection (e.g., JSON):
- kingfisher scan /path/to/repo --format json --output findings.json
- Direct SARIF output:
- kingfisher scan /path/to/repo --format sarif --output findings.sarif
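When these scans are driven from a script, it can help to assemble the argument list in one place. The sketch below uses only the flags shown in the commands above (--no-validate, --only-valid, --format, --output); the helper name `build_scan_command` is hypothetical, and treating --only-valid and --no-validate as mutually exclusive is an assumption of this sketch rather than documented CLI behavior.

```python
from typing import List, Optional


def build_scan_command(
    target: str,
    *,
    fmt: Optional[str] = None,
    output: Optional[str] = None,
    validate: bool = True,
    only_valid: bool = False,
) -> List[str]:
    """Assemble an argument list for `kingfisher scan` from the flags shown above."""
    if only_valid and not validate:
        # Assumption: filtering on validated findings makes no sense without validation.
        raise ValueError("only_valid requires validation to be enabled")
    cmd = ["kingfisher", "scan", target]
    if not validate:
        cmd.append("--no-validate")
    if only_valid:
        cmd.append("--only-valid")
    if fmt is not None:
        cmd += ["--format", fmt]
    if output is not None:
        cmd += ["--output", output]
    return cmd


sarif_cmd = build_scan_command("/path/to/repo", fmt="sarif", output="findings.sarif")
# To execute it: subprocess.run(sarif_cmd, check=False)
```

Passing the result to `subprocess.run` as a list avoids shell quoting problems with paths that contain spaces.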
4.3 Quick Start: Access Map and Visualization
- Generate an access map during a scan and view it locally:
- kingfisher scan /path/to/code --access-map --view-report
- kingfisher view kingfisher.json
- Import third-party reports for triage:
- kingfisher view trufflehog.json
- kingfisher view gitleaks.json
- Combine multiple reports into a single triage session:
- kingfisher view report1.json report2.jsonl
- Load all reports from a directory:
- kingfisher view ./reports/
4.4 Quick Start: Revoke and Validate
- Validate a known secret without rescanning:
- kingfisher validate --rule opsgenie "12345678-9abc-def0-1234-56789abcdef0"
- Revoke a secret (examples):
- kingfisher revoke --rule slack "xoxb-…"
- kingfisher revoke --rule github "ghp_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
4.5 Quick Start: Integration and CI
- Scan a GitHub organization with a token:
- KF_GITHUB_TOKEN="ghp_…" kingfisher scan github --organization my-org
- Scan a GitLab group:
- KF_GITLAB_TOKEN="glpat-…" kingfisher scan gitlab --group my-group
- Scan Azure Repos:
- KF_AZURE_PAT="pat" kingfisher scan azure --organization my-org
4.6 Quick Start: Report Viewer Demo
- Visualize results locally or via hosted viewer:
- kingfisher scan /path/to/code --view-report
- Open the hosted viewer to get the same experience in the browser without installing anything.
[Image: Kingfisher secret scanning demo]
- A demo GIF illustrates the scanning workflow, capturing how findings appear and how the viewer helps triage results.
- Report Viewer: Local and Hosted
- The viewer supports three formats: Kingfisher JSON/JSONL, Gitleaks JSON, and TruffleHog JSON/JSONL.
- Local viewing (CLI) keeps report data on your machine: the viewer is served from a local web server bound to 127.0.0.1, and no remote servers are involved.
- Hosted viewing provides a static page that reads uploaded reports and presents a unified triage experience.
- Benefits of the viewer:
- Skim hundreds of findings by detector, file, repository, and validation status.
- Import and compare multiple reports side-by-side, deduplicating by fingerprint.
- Prioritize live, validated secrets with a clear visual hierarchy.
- See blast radius and access-related details for each credential, enabling precise remediation planning.
- Cross-tool enrichment automatically links related findings across formats.
- Export triage decisions for ticketing systems or rotation runbooks.
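The deduplication and cross-tool enrichment listed above can be pictured as a merge keyed on fingerprint. This is a rough sketch under stated assumptions: the field names `fingerprint`, `tool`, and `validated` are hypothetical, findings are assumed to be already normalized to a shared fingerprint, and the viewer's real logic may differ.

```python
from typing import Dict, List

Finding = Dict[str, object]


def merge_reports(*reports: List[Finding]) -> List[Finding]:
    """Merge findings by fingerprint, tracking sources and filling in validation data."""
    merged: Dict[str, Finding] = {}
    for report in reports:
        for finding in report:
            key = str(finding["fingerprint"])
            current = merged.get(key)
            if current is None:
                merged[key] = {**finding, "sources": [finding["tool"]]}
                continue
            if finding["tool"] not in current["sources"]:
                current["sources"].append(finding["tool"])
            # Enrichment: adopt a validation result when the kept entry has none.
            if current.get("validated") is None and finding.get("validated") is not None:
                current["validated"] = finding["validated"]
    return list(merged.values())


# Hypothetical normalized findings from two tools.
other_tool = [{"fingerprint": "fp-1", "tool": "gitleaks", "validated": None}]
kingfisher = [
    {"fingerprint": "fp-1", "tool": "kingfisher", "validated": True},
    {"fingerprint": "fp-2", "tool": "kingfisher", "validated": False},
]
merged = merge_reports(other_tool, kingfisher)
```

In this model a finding reported by two tools appears once, lists both sources, and carries whichever validation result is available.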
- Caution: Use the access map feature only when authorized to inspect the target account, as Kingfisher may perform additional network requests to determine credential reach.
[Image: Kingfisher access map and report viewer demo]
- A sample demo GIF demonstrates the viewer in action, including the live mapping and triage capabilities.
- Another thumbnail illustrates typical findings and how they are presented in the UI.
- Detection Rules
- Kingfisher ships with 942 built-in rules, designed to catch a wide spectrum of credential types across cloud providers, services, and runtimes.
- Categories highlighted in the rules include:
- Cloud Providers (AWS, GCP, Azure, Alibaba Cloud, DigitalOcean, IBM Cloud, Cloudflare, Heroku, Fly.io, Railway, Render, Temporal Cloud, and more)
- AI & ML tokens (OpenAI, Anthropic, Google Gemini, Azure OpenAI, Cohere, Mistral, Groq, xAI, Stability AI, Replicate, and more)
- Dev & CI/CD secrets (GitHub, GitLab, Bitbucket, Buildkite, CircleCI, TravisCI, TeamCity, Jenkins, Drone CI, Harness, Docker Hub, and many package registries)
- Databases (PostgreSQL, MySQL, MongoDB, Redis, PlanetScale, Supabase, Neon, ClickHouse, DataStax Astra, Firebase, JDBC, ODBC, and more)
- Messaging & Email (Slack, Discord, Teams, Telegram, Twilio, SendGrid, Mailgun, Mailchimp, and more)
- Observability (Datadog, Grafana, New Relic, Sentry, Dynatrace, Honeycomb, and others)
- Payments & Fintech (Stripe, PayPal, Square, GoCardless, and more)
- Security & Identity (Snyk, Auth0, Okta, LaunchDarkly, 1Password, JFrog Artifactory/Xray, SonarCloud, and others)
- CRM & Business SaaS (Salesforce, HubSpot, Jira, Confluence, Asana, Linear, Monday.com, Zendesk, Intercom, Shopify, and more)
- Crypto Material (private keys, JWTs, encryption keys, and related materials)
- The rules incorporate checksum intelligence for offline verification and provide a robust framework for adding custom detection rules as needed.
- Rules can combine live validation (available for 484 of the built-in rules), offline checks, and contextual verification to minimize false positives.
6.1 Write Custom Rules
- Kingfisher supports writing custom rules to suit unique environments. Documentation on creating and templating rules is provided in RULES.md.
- The checksum intelligence model is an important enhancement for custom rules, enabling offline verification of tokens with internal checksums.
- Usage Examples
- Kingfisher supports a variety of practical workflows and commands:
- Basic scanning with validation: kingfisher scan /path/to/code
- Show only findings whose secrets were confirmed valid by live validation: kingfisher scan /path/to/code --only-valid
- Output JSON results to a file: kingfisher scan . --format json | tee kingfisher.json
- Generate SARIF directly to disk: kingfisher scan /path/to/repo --format sarif --output findings.sarif
- View and deduplicate multiple reports in a single triage: kingfisher view kingfisher.json gitleaks.json trufflehog.jsonl
- Revoke a secret using a specific rule: kingfisher revoke --rule github "ghp_xxx"
- Validate a known secret via standard input: echo "ghp_xxx" | kingfisher validate --rule github -
- Kingfisher’s usage examples also show complex CI scenarios, including scanning GitHub organizations, GitLab groups, Azure Repos, Jira issues, Slack conversations, and more.
- Environment Variables and Access
- Kingfisher uses environment variables to inject credentials for various platforms, such as:
- KF_GITHUB_TOKEN, KF_GITLAB_TOKEN, KF_GITEA_TOKEN
- KF_AWS_KEY, KF_AWS_SECRET, KF_AWS_SESSION_TOKEN
- KF_AZURE_TOKEN, KF_AZURE_PAT, KF_AZURE_USERNAME
- KF_BITBUCKET_TOKEN, KF_BITBUCKET_USERNAME, KF_BITBUCKET_APP_PASSWORD, KF_BITBUCKET_OAUTH_TOKEN
- KF_HUGGINGFACE_TOKEN, KF_HUGGINGFACE_USERNAME
- KF_JIRA_TOKEN, KF_CONFLUENCE_TOKEN, KF_SLACK_TOKEN, KF_TEAMS_TOKEN
- KF_DOCKER_TOKEN (and guidance about using Docker credentials)
- These variables facilitate authenticated scanning across hosted services and private repos, enabling richer detection and live validation across environments.
- Commands often demonstrate temporary usage for a session (exporting the variable for a single command) or per-command overrides to control access and scope.
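The per-command pattern mentioned above (setting a credential for one invocation only) has a direct equivalent when launching the CLI from a script: pass a private environment to the child process instead of exporting the variable globally. A minimal sketch follows; the helper name `child_env` and the placeholder token are hypothetical, and the variable is written as KF_GITHUB_TOKEN on the assumption that this is its exact spelling.

```python
import os
from typing import Dict, Mapping, Optional


def child_env(
    overrides: Mapping[str, str], base: Optional[Mapping[str, str]] = None
) -> Dict[str, str]:
    """Copy an environment and apply overrides without touching the original."""
    env = dict(os.environ if base is None else base)
    env.update(overrides)
    return env


base = {"PATH": "/usr/bin"}
env = child_env({"KF_GITHUB_TOKEN": "placeholder-token"}, base=base)

# The credential would be visible only to this one child process:
# subprocess.run(
#     ["kingfisher", "scan", "github", "--organization", "my-org"],
#     env=env,
# )
```

Because the override lives in a copied dictionary, the token never enters the parent process environment or later commands.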
- Advanced Features
- Baselines and suppression: suppress known secrets with a managed baseline, reducing noise while keeping new secrets visible for review.
- Filtering and suppression: skip known false positives or specific patterns using regex or keyword-based exclusions.
- CI pipeline scanning: scan changes between branches, commits, or within a CI context to focus on recent changes.
- Validation throttling: global or per-rule validation rate limits to prevent API overuse during large scans.
- Output customization: control validation response storage length, enable full validation payloads, and tailor report content for audits.
- Since Kingfisher aggregates data from multiple sources, features like deduplication and cross-tool enrichment are critical for efficient triage.
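The validation throttling item above describes rate limits without saying how they are enforced. One common way to implement such a limit is a token bucket, sketched below. This is a generic illustration, not a description of Kingfisher's internals; the class name, parameters, and injectable clock are all choices made for this example.

```python
import time
from typing import Callable


class TokenBucket:
    """Allow up to `rate` operations per second, with bursts up to `capacity`."""

    def __init__(
        self,
        rate: float,
        capacity: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = capacity  # start with a full bucket
        self.updated = clock()

    def try_acquire(self) -> bool:
        """Consume one token if available; return False when the caller should wait."""
        now = self.clock()
        elapsed = now - self.updated
        # Refill in proportion to elapsed time, never beyond capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


# A fake clock makes the behavior deterministic for demonstration.
fake_now = [0.0]
bucket = TokenBucket(rate=2.0, capacity=2.0, clock=lambda: fake_now[0])
burst = [bucket.try_acquire() for _ in range(3)]
fake_now[0] = 0.5  # half a second later, one token has been refilled
after_wait = bucket.try_acquire()
```

A per-rule limit could be modeled by keeping one bucket per rule ID in a dictionary alongside a shared global bucket.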
- Platform Integrations
- Version Control & Code Hosting: GitHub, GitLab, Azure Repos, Bitbucket, Gitea, Hugging Face
- Cloud Storage: AWS S3, Google Cloud Storage
- Containers: Docker images from registries
- Collaboration & Documentation: Jira, Confluence, Slack, Microsoft Teams
- The docs include references to an integration guide with authentication instructions and platform-specific scan commands.
- Documentation and Roadmap
- Kingfisher provides a comprehensive doc suite, including:
- INSTALLATION.md for installation and pre-commit hook setup
- INTEGRATIONS.md for platform-specific scanning guidance
- ACCESS_MAP.md for access map details and supported credential formats
- ARCHITECTURE.md for a high-level diagram of the CLI, scanner pipeline, validation, and outputs
- DEPLOYMENT.md for deployment models
- ADVANCED.md for advanced scanning features and tuning
- RULES.md for writing custom rules and checksum intelligence
- REVOCATION_PROVIDERS.md for revocation coverage by provider and rule ID
- BASELINE.md for baseline management
- LIBRARY.md for embedding Kingfisher as a Rust library
- FINGERPRINT.md for understanding finding fingerprints and deduplication
- COMPARISON.md for benchmark results
- PARSING.md for parsing details
- CONTEXT_VERIFICATION.md for the context-verification flow
- There is an active roadmap that emphasizes expanding rules, targets, and integration capabilities. Community involvement is encouraged via feature requests and pull requests.
- Licensing
- Kingfisher is released under the Apache 2.0 License. The docs also explain how to verify releases and attestations.
- Visual Aids and Demos
- A set of visual aids accompanies the documentation, including a runtime comparison graphic and usage demos. These assets illustrate how Kingfisher performs in practice and how the viewer aids triage.
- The docs also embed GIFs and thumbnails to demonstrate scanning workflows, access-map visualization, and real-world triage scenarios.
- Key Takeaways
- Kingfisher is a versatile, high-performance secret scanner with live validation, broad platform coverage, and a unified viewer to streamline triage and remediation.
- It is designed to fit into modern DevSecOps pipelines, giving security engineers and developers a practical, end-to-end workflow from detection to remediation and audit-ready reporting.
- The combination of offline checksum verification, live validation, and a comprehensive access-map visualization provides a strong defense-in-depth approach to secret management.
- Conclusion
- Kingfisher represents an integrated approach to secret detection, validation, and remediation across diverse development ecosystems. By unifying code scanning, live credential checks, and a browser-based triage interface, it helps teams identify, validate, and remediate secrets quickly and responsibly.
- The open source nature of the project invites collaboration, customization, and extension, enabling organizations to tailor rules, targets, and reporting to their unique security requirements.
- With support for hundreds of providers and a growing catalog of rules, Kingfisher aims to keep pace with evolving credential formats and deployment architectures, ensuring that the software supply chain remains resilient against credential leakage and abuse.
Repository: https://github.com/mongodb/kingfisher