Prowler scans audited resources for plaintext secrets using Kingfisher, an open-source secret-scanning engine that Prowler invokes as a subprocess. This guide explains the structure every secret-scanning check must follow to keep scanning correct and efficient on large accounts.
Since Prowler 5.32.0, the secret-scanning checks use Kingfisher. Earlier versions used the detect-secrets library.

Overview

Secret detection runs through a single helper in prowler/lib/utils/utils.py:
  • detect_secrets_scan_batch(payloads, excluded_secrets=..., validate=...) scans many payloads in chunked subprocess invocations and returns a {key: [findings]} dictionary. To scan a single payload, pass a one-entry mapping (for example, {0: data}).
Every Kingfisher invocation carries a fixed process-startup cost (around 100 ms). Scanning once per resource would spawn thousands of subprocesses on large accounts (for example, thousands of CloudWatch log groups). detect_secrets_scan_batch amortizes that cost: it writes each payload to a temporary file as it consumes the input, runs one subprocess per chunk (500 payloads by default), and maps the findings back to each payload by key.
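The chunking idea can be sketched in isolation. The snippet below is a minimal illustration, not Prowler's implementation: scan_in_chunks, chunked, and fake_scan_chunk are hypothetical names, and fake_scan_chunk stands in for a real Kingfisher subprocess run by flagging any payload that contains the string "AKIA".

```python
from itertools import islice


def chunked(pairs, size):
    """Yield lists of at most `size` (key, payload) pairs, consuming lazily."""
    iterator = iter(pairs)
    while chunk := list(islice(iterator, size)):
        yield chunk


def scan_in_chunks(pairs, scan_chunk, chunk_size=500):
    """Call `scan_chunk` once per chunk and merge the keyed findings."""
    results = {}
    for chunk in chunked(pairs, chunk_size):
        # One (simulated) scanner run per chunk instead of one per payload.
        for key, findings in scan_chunk(chunk).items():
            if findings:
                results[key] = findings
    return results


def fake_scan_chunk(chunk):
    """Hypothetical stand-in for a scanner run: flags payloads containing 'AKIA'."""
    return {
        key: [{"type": "demo", "line_number": 1}]
        for key, payload in chunk
        if "AKIA" in payload
    }
```

With 1,200 payloads and the default chunk size, this sketch calls the scanner three times rather than 1,200 times, which is where the startup cost is saved.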

The Batched Structure

Every secret-scanning check follows three phases.

Phase 1: Collect

Define a generator that yields (key, payload) for each scannable unit. The generator builds payload strings only — it does not call Kingfisher. Lazy yielding keeps memory and temporary-disk usage bounded to a single chunk, which matters when an account holds thousands of resources.

Phase 2: Batch

Call detect_secrets_scan_batch once with the generator. The helper consumes it in chunks, runs Kingfisher once per chunk, and returns a dictionary that maps each key that produced findings to its list of findings.

Phase 3: Report

Iterate the resources, look up the findings by key, and build one report per resource. Emit a finding for every iterated resource — never drop one silently. When a resource’s payload cannot be prepared for scanning (for example, user data that fails to base64-decode or decompress), report it as MANUAL with a status explaining the scan could not inspect it, rather than omitting it or claiming PASS.
from prowler.lib.check.models import Check, Check_Report_AWS
from prowler.lib.utils.utils import (
    annotate_verified_secrets,
    detect_secrets_scan_batch,
)
from prowler.providers.aws.services.example.example_client import example_client


class example_resource_no_secrets(Check):
    def execute(self):
        findings = []
        excluded = example_client.audit_config.get("secrets_ignore_patterns", [])
        validate = example_client.audit_config.get("secrets_validate", False)
        resources = list(example_client.resources)

        # Phase 1: collect — builds strings only, no scan.
        # serialize() is a placeholder for the check-specific payload builder.
        def payloads():
            for index, resource in enumerate(resources):
                if resource.scannable_data:
                    yield index, serialize(resource)

        # Phase 2: batch — one call, chunked subprocesses.
        batch_results = detect_secrets_scan_batch(
            payloads(), excluded_secrets=excluded, validate=validate
        )

        # Phase 3: report — look up findings by key.
        for index, resource in enumerate(resources):
            report = Check_Report_AWS(metadata=self.metadata(), resource=resource)
            report.status = "PASS"
            report.status_extended = f"No secrets found in {resource.name}."
            detect_secrets_output = batch_results.get(index)
            if detect_secrets_output:
                report.status = "FAIL"
                report.status_extended = (
                    f"Potential secret found in {resource.name} -> ..."
                )
                annotate_verified_secrets(report, detect_secrets_output)
            findings.append(report)

        return findings
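
The example above covers only PASS and FAIL. One way to handle the MANUAL case from Phase 3 is to record, inside the generator, the indices whose payload could not be prepared. The sketch below uses plain tuples and a made-up scan result instead of Prowler objects; resources, unscannable, and statuses are hypothetical names chosen for illustration.

```python
import base64
import binascii

# Hypothetical stand-ins for resources: (name, base64-encoded user data).
resources = [
    ("instance-a", base64.b64encode(b"echo hello").decode()),
    ("instance-b", "%%% not base64 %%%"),
]

unscannable = set()  # indices whose payload could not be prepared


def payloads():
    for index, (_name, user_data) in enumerate(resources):
        try:
            decoded = base64.b64decode(user_data, validate=True).decode("utf-8")
        except (binascii.Error, UnicodeDecodeError):
            # Remember the failure so Phase 3 can report MANUAL.
            unscannable.add(index)
            continue
        yield index, decoded


# Stand-in for the batch call: consumes the generator and finds nothing.
batch_results = {key: [] for key, _payload in payloads()}

statuses = []
for index, (name, _user_data) in enumerate(resources):
    if index in unscannable:
        statuses.append((name, "MANUAL"))
    elif batch_results.get(index):
        statuses.append((name, "FAIL"))
    else:
        statuses.append((name, "PASS"))
```

Because the generator is lazy, unscannable is populated only while the batch call consumes it, so the lookup has to happen after Phase 2 has finished.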

Choosing the Key

The key maps each finding back to its source. Two shapes cover every check:
  • One payload per resource: use the resource index. This fits checks that serialize a single payload per resource, such as launch configurations, CloudFormation outputs, SSM documents, Step Functions definitions, and OpenStack metadata.
  • Several payloads per resource: use a (resource_index, fragment) tuple, where the fragment identifies the variable, log stream, container, file, or version. Phase 3 groups the per-fragment findings to build the resource report. This fits CloudWatch log streams, ECS containers, CodeBuild variables, Glue arguments, and Lambda code files.
Derive the indices from the same list(...) of resources in both Phase 1 and Phase 3 so the order stays stable and the keys align.
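For the tuple-key shape, the grouping step in Phase 3 might look like the following sketch. The findings, fragment names, and resource names are made up for illustration; only the (resource_index, fragment) key shape comes from this guide.

```python
from collections import defaultdict

# Shape returned for several payloads per resource: keys are
# (resource_index, fragment) tuples. The values here are made-up findings.
batch_results = {
    (0, "DB_PASSWORD"): [{"type": "demo", "line_number": 1}],
    (2, "API_TOKEN"): [{"type": "demo", "line_number": 1}],
    (2, "SLACK_HOOK"): [{"type": "demo", "line_number": 1}],
}

# Group the per-fragment findings under their resource index.
findings_by_resource = defaultdict(dict)
for (resource_index, fragment), findings in batch_results.items():
    findings_by_resource[resource_index][fragment] = findings

# Walk the same ordered list that produced the indices in Phase 1.
resource_names = ["project-a", "project-b", "project-c"]
summary = {}
for index, name in enumerate(resource_names):
    fragments = sorted(findings_by_resource.get(index, {}))
    summary[name] = ("FAIL", fragments) if fragments else ("PASS", [])
```

Every resource still receives an entry in summary, including the one with no findings, which mirrors the rule that no iterated resource is dropped.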

Preserving Per-Payload Results

detect_secrets_scan_batch runs Kingfisher with --no-dedup, so a secret that appears in more than one payload is reported for each one. This reproduces the result of scanning each payload individually. Build payload strings exactly as a single scan would: serialize the same data and keep line ordering, because messages often map a finding’s line_number back to a variable name or metadata key.
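To illustrate why line ordering matters, the sketch below serializes one variable per line and maps a finding back to its variable name. It assumes that line_number is 1-based, which this guide does not state explicitly; the variables and the finding are made up.

```python
# Hypothetical environment variables for one resource.
variables = {
    "REGION": "eu-west-1",
    "DB_PASSWORD": "hunter2-example",
    "STAGE": "prod",
}

# One variable per line, in a fixed order, so line N always maps to name N.
names = list(variables)
payload = "\n".join(f"{name}={variables[name]}" for name in names)


def variable_for(finding):
    """Map a finding back to its variable, assuming 1-based line numbers."""
    return names[finding["line_number"] - 1]


finding = {"type": "demo", "line_number": 2}  # made-up finding
flagged_variable = variable_for(finding)
```

If the payload were built in a different order than the names list, the lookup would silently point at the wrong variable, so both should derive from the same sequence.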

Validation and Severity

detect_secrets_scan_batch accepts validate, read from secrets_validate in the provider configuration or the --scan-secrets-validate flag. When enabled, Kingfisher confirms whether each secret is live, and confirmed secrets carry is_verified: True. After marking a report as FAIL, pass the findings to annotate_verified_secrets(report, findings). When any secret is verified, the helper escalates the finding to critical severity and appends a note that the secret was confirmed live. Validation stays off by default because it sends the discovered secret to the provider API.
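The documented effect of verification can be pictured with a small stand-in. This is not the code of annotate_verified_secrets: DemoReport and escalate_if_verified are hypothetical, and the way a real report stores its severity may differ.

```python
from dataclasses import dataclass


@dataclass
class DemoReport:
    """Hypothetical stand-in for a Prowler check report."""

    status: str = "FAIL"
    status_extended: str = "Potential secret found in demo-resource."
    severity: str = "high"


def escalate_if_verified(report, findings):
    """Illustrative only: mimics the documented effect, not Prowler's code."""
    if any(finding.get("is_verified") for finding in findings):
        report.severity = "critical"
        report.status_extended += " At least one secret was confirmed live."
    return report


verified = escalate_if_verified(
    DemoReport(), [{"type": "demo", "line_number": 1, "is_verified": True}]
)
unverified = escalate_if_verified(
    DemoReport(), [{"type": "demo", "line_number": 1, "is_verified": False}]
)
```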

Excluded Secrets

detect_secrets_scan_batch applies secrets_ignore_patterns — regular expressions from the provider configuration — against each finding’s source line and drops the matches, mirroring single-scan behavior.
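The filtering can be approximated as follows. This is a sketch under assumptions rather than the helper's actual logic: drop_excluded is a hypothetical name, line_number is taken to be 1-based, and patterns are applied with re.search, whereas the real helper may match differently.

```python
import re


def drop_excluded(payload, findings, ignore_patterns):
    """Drop findings whose source line matches any ignore pattern."""
    lines = payload.splitlines()
    compiled = [re.compile(pattern) for pattern in ignore_patterns]
    kept = []
    for finding in findings:
        source_line = lines[finding["line_number"] - 1]  # assumes 1-based
        if not any(regex.search(source_line) for regex in compiled):
            kept.append(finding)
    return kept


payload = "TOKEN=example-placeholder\nPASSWORD=s3cr3t-value"
findings = [
    {"type": "demo", "line_number": 1},
    {"type": "demo", "line_number": 2},
]
kept = drop_excluded(payload, findings, [r"placeholder"])
```

Here the first finding is dropped because its source line matches the pattern, while the second is kept.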

Testing

To assert on the verified-secret path, mock detect_secrets_scan_batch in the check module and return the keyed dictionary. For a single resource scanned at index 0:
from unittest import mock

mock.patch(
    "prowler.providers.aws.services.example.example_resource_no_secrets.example_resource_no_secrets.detect_secrets_scan_batch",
    return_value={
        0: [{"type": "...", "line_number": 1, "is_verified": True}]
    },
)
Most tests need no mock at all: they seed resources that contain example secrets and assert on the FAIL status and message, which exercises the real batched path. Refer to the Testing documentation for the general structure.