# AI Skills System
Source: https://docs.prowler.com/developer-guide/ai-skills
This guide explains the AI Skills system that provides on-demand context and patterns to AI agents working with the Prowler codebase.
**What are AI Skills?** Skills are structured instructions that help AI agents (Claude Code, Cursor, Copilot, etc.) understand Prowler's conventions, patterns, and best practices.
Skills live in the [`skills/`](https://github.com/prowler-cloud/prowler/tree/master/skills) directory of the Prowler OSS repository. Each skill is a folder containing a `SKILL.md` file with its patterns and metadata.
## Installation
To enable skills for the supported AI coding assistants, run the setup script from the repository root:
```bash theme={null}
./skills/setup.sh
```
The script creates symlinks so each tool finds the skills in its expected location:
| Tool | Created by setup |
| -------------- | -------------------------------------------------------- |
| Claude Code | `.claude/skills/` symlink and `CLAUDE.md` |
| Gemini CLI | `.gemini/skills/` symlink and `GEMINI.md` |
| Codex (OpenAI) | `.codex/skills/` symlink (uses `AGENTS.md` natively) |
| GitHub Copilot | `.github/copilot-instructions.md` symlink to `AGENTS.md` |
After running the setup, restart the AI coding assistant to load the skills.
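To confirm that the setup produced the entries listed in the table, a small check along these lines can help. This is a minimal sketch, not part of the Prowler repository: the `missing_setup_paths` helper and the `EXPECTED_PATHS` mapping are hypothetical names, and the paths are copied from the table above.

```python
from pathlib import Path

# Paths copied from the table above. Whether each entry is a symlink or a
# regular file may vary between versions of setup.sh (assumption).
EXPECTED_PATHS = {
    "Claude Code": [".claude/skills", "CLAUDE.md"],
    "Gemini CLI": [".gemini/skills", "GEMINI.md"],
    "Codex (OpenAI)": [".codex/skills"],
    "GitHub Copilot": [".github/copilot-instructions.md"],
}


def missing_setup_paths(repo_root: str) -> dict[str, list[str]]:
    """Return, per tool, the expected paths that do not exist under repo_root."""
    root = Path(repo_root)
    missing = {}
    for tool, paths in EXPECTED_PATHS.items():
        # Path.exists() follows symlinks, so a broken symlink counts as missing.
        absent = [path for path in paths if not (root / path).exists()]
        if absent:
            missing[tool] = absent
    return missing
```

Calling `missing_setup_paths(".")` from the repository root returns an empty dictionary when every listed entry is in place.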
## Using Skills
AI agents discover skills automatically and load them when a request matches a skill trigger. To load a skill manually during a session, point the agent to the skill's `SKILL.md` file:
```text theme={null}
Read skills/{skill-name}/SKILL.md
```
For the full list of available skills, their triggers, and the Auto-invoke mappings, see the [`skills/README.md`](https://github.com/prowler-cloud/prowler/blob/master/skills/README.md) and [`AGENTS.md`](https://github.com/prowler-cloud/prowler/blob/master/AGENTS.md) in the repository.
## Available Skills
| Type | Skills |
| ----------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **Generic** | typescript, react-19, nextjs-16, tailwind-4, pytest, playwright, django-drf, zod-4, zustand-5, ai-sdk-5, vitest, tdd |
| **Prowler** | prowler, prowler-sdk-check, prowler-api, prowler-ui, prowler-mcp, prowler-provider, prowler-compliance, prowler-compliance-review, prowler-docs, prowler-pr, prowler-ci, prowler-attack-paths-query |
| **Testing** | prowler-test-sdk, prowler-test-api, prowler-test-ui |
| **Meta** | skill-creator, skill-sync |
This table is a snapshot. The repository is the source of truth: see [`skills/README.md`](https://github.com/prowler-cloud/prowler/blob/master/skills/README.md) for the current, complete list.
## Skill Structure
Each skill follows the [Agent Skills spec](https://agentskills.io):
```text theme={null}
skills/{skill-name}/
├── SKILL.md # Patterns, rules, decision trees
├── assets/ # Code templates, schemas
└── references/ # Links to local docs (single source of truth)
```
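Because every skill is a folder with a `SKILL.md` file, the skills available in a checkout can be enumerated with a directory scan. The following is a minimal sketch under that assumption; `list_skills` and `skill_layout` are hypothetical helper names, not functions shipped with Prowler.

```python
from pathlib import Path


def list_skills(skills_dir: str) -> list[str]:
    """Return the names of sub-folders that contain a SKILL.md file."""
    root = Path(skills_dir)
    return sorted(
        entry.name
        for entry in root.iterdir()
        if entry.is_dir() and (entry / "SKILL.md").is_file()
    )


def skill_layout(skills_dir: str, skill_name: str) -> dict[str, bool]:
    """Report whether a skill folder has assets/ and references/ sub-folders."""
    skill = Path(skills_dir) / skill_name
    return {name: (skill / name).is_dir() for name in ("assets", "references")}
```

Running `list_skills("skills")` from the repository root would list the skill folder names; the repository's `skills/README.md` remains the source of truth for triggers and descriptions.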
## Key Design Decisions
1. **Self-contained skills** - Critical patterns inline for fast loading
2. **Local doc references** - No web URLs, points to `docs/developer-guide/*.mdx`
3. **Single source of truth** - Skills reference docs, no duplication
4. **On-demand loading** - AI loads only what's needed for the task
## Creating New Skills
Use the `skill-creator` meta-skill to create new skills that follow the Agent Skills spec. See [`AGENTS.md`](https://github.com/prowler-cloud/prowler/blob/master/AGENTS.md) for the full list of available skills and their triggers.
## How Skills Work
The diagrams below explain the internals of the skill system. They are useful for understanding the design, but are not required to install or use skills.
### Architecture Overview
```mermaid theme={null}
graph LR
subgraph FLOW["AI Skills Architecture"]
A["AI Agent"] -->|"1. matches trigger"| B["AGENTS.md"]
B -->|"2. loads"| C["Skill"]
C -->|"3. provides"| D["Patterns
Templates
Commands"]
C -->|"4. references"| E["Local Docs"]
D --> F["Correct Output"]
E --> F
end
style A fill:#1e3a5f,stroke:#4a9eff,color:#fff
style B fill:#5c4d1a,stroke:#ffd700,color:#fff
style C fill:#1a4d1a,stroke:#4caf50,color:#fff
style E fill:#4a1a4d,stroke:#ba68c8,color:#fff
style F fill:#1a4d2e,stroke:#66bb6a,color:#fff
```
### Request Lifecycle
```mermaid theme={null}
sequenceDiagram
participant U as User
participant A as AI Agent
participant R as AGENTS.md
participant S as Skill
participant AS as assets/
participant RF as references/
participant D as Local Docs
U->>A: "Create an AWS security check"
Note over A: Analyze request context
A->>R: Find matching skill trigger
R-->>A: prowler-sdk-check matches
A->>S: Load SKILL.md
S-->>A: Patterns, rules, templates, commands
Note over A: Need code template?
A->>AS: Read assets/aws_check.py
AS-->>A: Check implementation template
Note over A: Need more details?
A->>RF: Read references/metadata-docs.md
RF-->>A: Points to local docs
A->>D: Read docs/developer-guide/checks.mdx
D-->>A: Full documentation
Note over A: Execute with full context
A->>U: Creates check with correct patterns
```
### With and Without Skills
```mermaid theme={null}
graph TD
subgraph COMPARISON["BEFORE vs AFTER"]
direction LR
subgraph BEFORE["Without Skills"]
B1["AI guesses conventions"]
B2["Wrong structure"]
B3["Multiple iterations"]
B4["Web searches for docs"]
B5["Inconsistent patterns"]
end
subgraph AFTER["With Skills"]
A1["AI loads exact patterns"]
A2["Correct structure"]
A3["First-time right"]
A4["Local docs referenced"]
A5["Consistent patterns"]
end
end
style BEFORE fill:#5c1a1a,stroke:#ef5350,color:#fff
style AFTER fill:#1a4d1a,stroke:#66bb6a,color:#fff
```
### Full Component Map
```mermaid theme={null}
flowchart TB
subgraph ENTRY["ENTRY POINT"]
AGENTS["AGENTS.md
━━━━━━━━━━━━━━━━━
• Available skills registry
• Skill → Trigger mapping
• Component navigation"]
end
subgraph SKILLS["SKILLS LIBRARY"]
direction TB
subgraph GENERIC["Generic Skills"]
G1["typescript"]
G2["react-19"]
G3["nextjs-16"]
G4["tailwind-4"]
G5["pytest"]
G6["playwright"]
G7["django-drf"]
G8["zod-4"]
G9["zustand-5"]
G10["ai-sdk-5"]
end
subgraph PROWLER["Prowler Skills"]
P1["prowler"]
P2["prowler-sdk-check"]
P3["prowler-api"]
P4["prowler-ui"]
P5["prowler-mcp"]
P6["prowler-provider"]
P7["prowler-compliance"]
P8["prowler-compliance-review"]
P9["prowler-docs"]
P10["prowler-pr"]
P11["prowler-ci"]
end
subgraph TESTING["Testing Skills"]
T1["prowler-test-sdk"]
T2["prowler-test-api"]
T3["prowler-test-ui"]
end
subgraph META["Meta Skills"]
M1["skill-creator"]
M2["skill-sync"]
end
end
subgraph STRUCTURE["SKILL STRUCTURE"]
direction LR
SKILLMD["SKILL.md
━━━━━━━━━━━━━━
• Frontmatter
• Critical patterns
• Decision trees
• Code examples
• Commands
• Keywords"]
ASSETS["assets/
━━━━━━━━━━━━━━
• Code templates
• JSON schemas
• Config examples"]
REFS["references/
━━━━━━━━━━━━━━
• Local doc paths
• No web URLs
• Single source"]
end
subgraph DOCS["DOCUMENTATION"]
direction TB
DD["docs/developer-guide/"]
D1["checks.mdx"]
D2["unit-testing.mdx"]
D3["provider.mdx"]
D4["mcp-server.mdx"]
D5["..."]
DD --> D1
DD --> D2
DD --> D3
DD --> D4
DD --> D5
end
ENTRY --> SKILLS
SKILLS --> STRUCTURE
SKILLMD --> ASSETS
SKILLMD --> REFS
REFS -.->|"points to"| DOCS
style ENTRY fill:#1e3a5f,stroke:#4a9eff,color:#fff
style GENERIC fill:#5c4d1a,stroke:#ffd700,color:#fff
style PROWLER fill:#1a4d1a,stroke:#66bb6a,color:#fff
style TESTING fill:#4d1a3d,stroke:#f06292,color:#fff
style META fill:#4a1a4d,stroke:#ba68c8,color:#fff
style STRUCTURE fill:#5c3d1a,stroke:#ffb74d,color:#fff
style DOCS fill:#1a3d4d,stroke:#4dd0e1,color:#fff
```
# Alibaba Cloud Provider
Source: https://docs.prowler.com/developer-guide/alibabacloud-details
This page details the [Alibaba Cloud](https://www.alibabacloud.com/) provider implementation in Prowler.
By default, Prowler will audit all the Alibaba Cloud regions that are available. To configure it, follow the [Alibaba Cloud getting started guide](/user-guide/providers/alibabacloud/getting-started-alibabacloud).
## Alibaba Cloud Provider Classes Architecture
The Alibaba Cloud provider implementation follows the general [Provider structure](/developer-guide/provider). This section focuses on the Alibaba Cloud-specific implementation, highlighting how the generic provider concepts are realized for Alibaba Cloud in Prowler. For a full overview of the provider pattern, base classes, and extension guidelines, see [Provider documentation](/developer-guide/provider).
### Main Class
* **Location:** [`prowler/providers/alibabacloud/alibabacloud_provider.py`](https://github.com/prowler-cloud/prowler/blob/master/prowler/providers/alibabacloud/alibabacloud_provider.py)
* **Base Class:** Inherits from `Provider` (see [base class details](https://github.com/prowler-cloud/prowler/blob/master/prowler/providers/common/provider.py)).
* **Purpose:** Central orchestrator for Alibaba Cloud-specific logic, session management, credential validation, and configuration.
* **Key Alibaba Cloud Responsibilities:**
* Initializes and manages Alibaba Cloud sessions (supports Access Keys, STS Temporary Credentials, RAM Role Assumption, ECS RAM Role, OIDC Authentication, and Credentials URI).
* Validates credentials using STS GetCallerIdentity.
* Loads and manages configuration, mutelist, and fixer settings.
* Discovers and manages Alibaba Cloud regions.
* Provides properties and methods for downstream Alibaba Cloud service classes to access session, identity, and configuration data.
### Data Models
* **Location:** [`prowler/providers/alibabacloud/models.py`](https://github.com/prowler-cloud/prowler/blob/master/prowler/providers/alibabacloud/models.py)
* **Purpose:** Define structured data for Alibaba Cloud identity, session, credentials, and region info.
* **Key Alibaba Cloud Models:**
* `AlibabaCloudCallerIdentity`: Stores caller identity information from STS GetCallerIdentity (account\_id, principal\_id, arn, identity\_type).
* `AlibabaCloudIdentityInfo`: Holds Alibaba Cloud identity metadata including account ID, user info, profile, and audited regions.
* `AlibabaCloudCredentials`: Stores credentials (access\_key\_id, access\_key\_secret, security\_token).
* `AlibabaCloudRegion`: Represents an Alibaba Cloud region with region\_id and region\_name.
* `AlibabaCloudSession`: Manages the session and provides methods to create service clients.
### `AlibabaCloudService` (Service Base Class)
* **Location:** [`prowler/providers/alibabacloud/lib/service/service.py`](https://github.com/prowler-cloud/prowler/blob/master/prowler/providers/alibabacloud/lib/service/service.py)
* **Purpose:** Abstract base class that all Alibaba Cloud service-specific classes inherit from. This implements the generic service pattern (described in [service page](/developer-guide/services#service-base-class)) specifically for Alibaba Cloud.
* **Key Alibaba Cloud Responsibilities:**
* Receives an `AlibabacloudProvider` instance to access session, identity, and configuration.
* Manages regional clients for services that are region-specific.
* Provides `__threading_call__` method to make API calls in parallel by region or resource.
* Exposes common audit context (`audited_account`, `audited_account_name`, `audit_resources`, `audit_config`) to subclasses.
### Exception Handling
* **Location:** [`prowler/providers/alibabacloud/exceptions/exceptions.py`](https://github.com/prowler-cloud/prowler/blob/master/prowler/providers/alibabacloud/exceptions/exceptions.py)
* **Purpose:** Custom exception classes for Alibaba Cloud-specific error handling.
* **Key Alibaba Cloud Exceptions:**
* `AlibabaCloudClientError`: General client errors
* `AlibabaCloudNoCredentialsError`: No credentials found
* `AlibabaCloudInvalidCredentialsError`: Invalid credentials provided
* `AlibabaCloudSetUpSessionError`: Session setup failures
* `AlibabaCloudAssumeRoleError`: RAM role assumption failures
* `AlibabaCloudInvalidRegionError`: Invalid region specified
* `AlibabaCloudHTTPError`: HTTP/API errors
### Session and Utility Helpers
* **Location:** [`prowler/providers/alibabacloud/lib/`](https://github.com/prowler-cloud/prowler/tree/master/prowler/providers/alibabacloud/lib/)
* **Purpose:** Helpers for argument parsing, mutelist management, and other cross-cutting concerns.
## Specific Patterns in Alibaba Cloud Services
The generic service pattern is described in [service page](/developer-guide/services#service-structure-and-initialisation). You can find all the currently implemented services in the following locations:
* Directly in the code, in location [`prowler/providers/alibabacloud/services/`](https://github.com/prowler-cloud/prowler/tree/master/prowler/providers/alibabacloud/services)
* In the [Prowler Hub](https://hub.prowler.com/) for a more human-readable view.
The best way to learn how to implement a new service is to follow the [service implementation documentation](/developer-guide/services#adding-a-new-service) and use services that are already implemented as a reference. The next subsection lists common patterns used across all Alibaba Cloud services.
### Alibaba Cloud Service Common Patterns
* Services communicate with Alibaba Cloud using the official Alibaba Cloud Python SDKs. Documentation for individual services can be found in the [Alibaba Cloud SDK documentation](https://www.alibabacloud.com/help/en/sdk).
* Every Alibaba Cloud service class inherits from `AlibabaCloudService`, ensuring access to session, identity, configuration, and client utilities.
* The constructor (`__init__`) always calls `super().__init__` with the service name, provider, and optionally `global_service=True` for services that are not regional (e.g., RAM).
* Resource containers **must** be initialized in the constructor. For regional services, resources are typically stored in dictionaries keyed by region and resource ID.
* All Alibaba Cloud resources are represented as Pydantic `BaseModel` classes, providing type safety and structured access to resource attributes.
* Alibaba Cloud SDK functions are wrapped in try/except blocks, with specific handling for errors, always logging errors.
* Regional services use `self.regional_clients` to maintain clients for each audited region.
* The `__threading_call__` method is used for parallel execution across regions or resources.
### Example Service Implementation
```python theme={null}
from pydantic import BaseModel

from prowler.lib.logger import logger
from prowler.providers.alibabacloud.lib.service.service import AlibabaCloudService


class MyResource(BaseModel):
    """Illustrative resource model; real services define their own fields."""

    id: str
    name: str
    region: str


class MyService(AlibabaCloudService):
    def __init__(self, provider):
        # Initialize parent class with service name
        super().__init__("myservice", provider)

        # Initialize resource containers
        self.resources = {}

        # Discover resources using threading
        self.__threading_call__(self._describe_resources)

    def _describe_resources(self, regional_client):
        try:
            region = regional_client.region
            # describe_resources() is a placeholder for the real SDK call
            response = regional_client.describe_resources()

            for resource in response.body.resources:
                self.resources[resource.id] = MyResource(
                    id=resource.id,
                    name=resource.name,
                    region=region,
                    # ... other attributes
                )
        except Exception as error:
            logger.error(
                f"{error.__class__.__name__}[{error.__traceback__.tb_lineno}]: {error}"
            )
```
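To make the parallel-execution pattern concrete, the sketch below shows one way a helper like `__threading_call__` could fan a function out over regional clients. It is an illustration built on the standard library, not Prowler's actual implementation; `threading_call`, `FakeRegionalClient`, and `describe` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def threading_call(call, iterator, max_workers=10):
    """Run call(item) for every item in parallel and wait for all to finish.

    Illustrative stand-in for __threading_call__; not Prowler's implementation.
    """
    items = list(iterator)
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = [executor.submit(call, item) for item in items]
        for future in as_completed(futures):
            future.result()  # re-raise any exception raised in a worker


class FakeRegionalClient:
    """Hypothetical client exposing only what the example needs."""

    def __init__(self, region):
        self.region = region

    def describe_resources(self):
        return [f"{self.region}-resource-1", f"{self.region}-resource-2"]


resources = {}


def describe(regional_client):
    # Each worker stores the resources of its own region.
    for resource_id in regional_client.describe_resources():
        resources[resource_id] = regional_client.region


clients = [FakeRegionalClient("cn-hangzhou"), FakeRegionalClient("cn-shanghai")]
threading_call(describe, clients)
```

After the call returns, `resources` holds the entries from both regions, which is why resource containers need to exist before discovery starts.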
## Specific Patterns in Alibaba Cloud Checks
The Alibaba Cloud checks pattern is described in [checks page](/developer-guide/checks). You can find all the currently implemented checks:
* Directly in the code, within each service folder, each check has its own folder named after the name of the check. (e.g. [`prowler/providers/alibabacloud/services/ram/ram_no_root_access_key/`](https://github.com/prowler-cloud/prowler/tree/master/prowler/providers/alibabacloud/services/ram/ram_no_root_access_key))
* In the [Prowler Hub](https://hub.prowler.com/) for a more human-readable view.
The best way to learn how to implement a new check is to follow the [check implementation documentation](/developer-guide/checks#creating-a-check) and use similar checks as a reference.
### Check Report Class
The `CheckReportAlibabaCloud` class models a single finding for an Alibaba Cloud resource in a check report. It is defined in [`prowler/lib/check/models.py`](https://github.com/prowler-cloud/prowler/blob/master/prowler/lib/check/models.py) and inherits from the generic `Check_Report` base class.
#### Purpose
`CheckReportAlibabaCloud` extends the base report structure with Alibaba Cloud-specific fields, enabling detailed tracking of the resource, resource ID, ARN, and region associated with each finding.
#### Constructor and Attribute Population
When you instantiate `CheckReportAlibabaCloud`, you must provide the check metadata and a resource object. The class will attempt to automatically populate its Alibaba Cloud-specific attributes from the resource, using the following logic:
* **`resource_id`**:
* Uses `resource.id` if present.
* Otherwise, uses `resource.name` if present.
* Defaults to an empty string if not available.
* **`resource_arn`**:
* Uses `resource.arn` if present.
* Defaults to an empty string if not available.
* **`region`**:
* Uses `resource.region` if present.
* Defaults to an empty string if not available.
If the resource object does not contain the required attributes, you must set them manually in the check logic.
Other attributes are inherited from the `Check_Report` class, from which you **always** have to set the `status` and `status_extended` attributes in the check logic.
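The fallback rules above can be sketched as a standalone function. This is only an illustration of the documented precedence, written with `hasattr`/`getattr`; it is not the code of `CheckReportAlibabaCloud`, and `derive_report_fields` is a hypothetical name.

```python
from types import SimpleNamespace


def derive_report_fields(resource) -> dict[str, str]:
    """Apply the documented fallbacks to an arbitrary resource object."""
    if hasattr(resource, "id"):
        resource_id = resource.id
    elif hasattr(resource, "name"):
        resource_id = resource.name
    else:
        resource_id = ""
    return {
        "resource_id": resource_id,
        "resource_arn": getattr(resource, "arn", ""),
        "region": getattr(resource, "region", ""),
    }


# A resource without `id` or `arn`: the name is used as the ID and the ARN
# stays empty, so the check would have to set the ARN manually.
fields = derive_report_fields(SimpleNamespace(name="bucket", region="cn-hangzhou"))
```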
#### Example Usage
```python theme={null}
from prowler.lib.check.models import Check, CheckReportAlibabaCloud
from prowler.providers.alibabacloud.services.myservice.myservice_client import myservice_client
class myservice_example_check(Check):
def execute(self) -> list[CheckReportAlibabaCloud]:
findings = []
for resource in myservice_client.resources.values():
report = CheckReportAlibabaCloud(
metadata=self.metadata(),
resource=resource
)
report.region = resource.region
report.resource_id = resource.id
report.resource_arn = f"acs:myservice::{myservice_client.audited_account}:resource/{resource.id}"
if resource.is_compliant:
report.status = "PASS"
report.status_extended = f"Resource {resource.name} is compliant."
else:
report.status = "FAIL"
report.status_extended = f"Resource {resource.name} is not compliant."
findings.append(report)
return findings
```
## Authentication Methods
The Alibaba Cloud provider supports multiple authentication methods, prioritized in the following order:
1. **Credentials URI** - Retrieve credentials from an external URI endpoint
2. **OIDC Role Authentication** - For applications running in ACK with RRSA enabled
3. **ECS RAM Role** - For ECS instances with attached RAM roles
4. **RAM Role Assumption** - Cross-account access with role assumption
5. **STS Temporary Credentials** - Pre-obtained temporary credentials
6. **Permanent Access Keys** - Static access key credentials
7. **Default Credential Chain** - Automatic credential discovery
For detailed authentication configuration, see the [Authentication documentation](/user-guide/providers/alibabacloud/authentication).
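The priority order can be read as "use the first configured method in the list". The sketch below illustrates only that selection rule. The method labels and the `select_auth_method` function are hypothetical, and how Prowler detects that a method is configured is not shown here.

```python
# Labels are hypothetical shorthand for the methods listed above,
# in the documented priority order.
AUTH_PRIORITY = [
    "credentials_uri",
    "oidc_role",
    "ecs_ram_role",
    "ram_role_assumption",
    "sts_temporary_credentials",
    "permanent_access_keys",
    "default_credential_chain",
]


def select_auth_method(configured: set[str]) -> str:
    """Return the highest-priority method among those that are configured."""
    for method in AUTH_PRIORITY:
        if method in configured:
            return method
    # Nothing explicit was configured: fall back to automatic discovery.
    return "default_credential_chain"
```

For example, when both static access keys and an ECS RAM role are available, `select_auth_method({"permanent_access_keys", "ecs_ram_role"})` picks the ECS RAM role because it appears earlier in the list.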
## Regions
Alibaba Cloud has multiple regions across the globe. By default, Prowler audits all available regions. To audit only specific regions, use the `--region` CLI argument:
```bash theme={null}
prowler alibabacloud --region cn-hangzhou cn-shanghai
```
The list of supported regions is maintained in [`prowler/providers/alibabacloud/config.py`](https://github.com/prowler-cloud/prowler/blob/master/prowler/providers/alibabacloud/config.py).
# AWS Provider
Source: https://docs.prowler.com/developer-guide/aws-details
This page details the [Amazon Web Services (AWS)](https://aws.amazon.com/) provider implementation in Prowler.
By default, Prowler audits a single account, along with its organization settings, per scan. To configure it, follow the [AWS getting started guide](/user-guide/providers/aws/getting-started-aws).
## AWS Provider Classes Architecture
The AWS provider implementation follows the general [Provider structure](/developer-guide/provider). This section focuses on the AWS-specific implementation, highlighting how the generic provider concepts are realized for AWS in Prowler. For a full overview of the provider pattern, base classes, and extension guidelines, see [Provider documentation](/developer-guide/provider). The following subsections describe the main classes of the AWS provider.
### `AwsProvider` (Main Class)
* **Location:** [`prowler/providers/aws/aws_provider.py`](https://github.com/prowler-cloud/prowler/blob/master/prowler/providers/aws/aws_provider.py)
* **Base Class:** Inherits from `Provider` (see [base class details](https://github.com/prowler-cloud/prowler/blob/master/prowler/providers/common/provider.py)).
* **Purpose:** Central orchestrator for AWS-specific logic, session management, credential validation, role assumption, region and organization discovery, and configuration.
* **Key AWS Responsibilities:**
* Initializes and manages AWS sessions (with or without role assumption, MFA, etc.).
* Validates credentials and sets up the AWS identity context.
* Loads and manages configuration, mutelist, and fixer settings.
* Discovers enabled AWS regions and organization metadata.
* Provides properties and methods for downstream AWS service classes to access session, identity, and configuration data.
### Data Models
* **Location:** [`prowler/providers/aws/models.py`](https://github.com/prowler-cloud/prowler/blob/master/prowler/providers/aws/models.py)
* **Purpose:** Define structured data for AWS identity, session, credentials, organization info, and more.
* **Key AWS Models:**
* `AWSOrganizationsInfo`: Holds AWS Organizations metadata, to be used by the checks.
* `AWSCredentials`, `AWSAssumeRoleInfo`, `AWSAssumeRoleConfiguration`: Used for role assumption and session management.
* `AWSIdentityInfo`: Stores account, user, partition, and region context for the scan.
* `AWSSession`: Wraps the current and original [boto3](https://boto3.amazonaws.com/v1/documentation/api/latest/index.html) sessions and config.
### `AWSService` (Service Base Class)
* **Location:** [`prowler/providers/aws/lib/service/service.py`](https://github.com/prowler-cloud/prowler/blob/master/prowler/providers/aws/lib/service/service.py)
* **Purpose:** Abstract base class that all AWS service-specific classes inherit from. This implements the generic service pattern (described in [service page](/developer-guide/services#service-base-class)) specifically for AWS.
* **Key AWS Responsibilities:**
* Receives an `AwsProvider` instance to access session, identity, and configuration.
* Manages clients for all services by regions.
* Provides the `__threading_call__` method to make boto3 calls in parallel. By default, these calls are made per region, but passing an iterator to the method makes them run per resource instead.
* Exposes common audit context (`audited_account`, `audited_account_arn`, `audited_partition`, `audit_resources`) to subclasses.
### Exception Handling
* **Location:** [`prowler/providers/aws/exceptions/exceptions.py`](https://github.com/prowler-cloud/prowler/blob/master/prowler/providers/aws/exceptions/exceptions.py)
* **Purpose:** Custom exception classes for AWS-specific error handling, such as credential and role errors.
### Session and Utility Helpers
* **Location:** [`prowler/providers/aws/lib/`](https://github.com/prowler-cloud/prowler/blob/master/prowler/providers/aws/lib/)
* **Purpose:** Helpers for session setup, ARN parsing, mutelist management, and other cross-cutting concerns.
## Specific Patterns in AWS Services
The generic service pattern is described in [service page](/developer-guide/services#service-structure-and-initialisation). You can find all the currently implemented services in the following locations:
* Directly in the code, in location [`prowler/providers/aws/services/`](https://github.com/prowler-cloud/prowler/tree/master/prowler/providers/aws/services)
* In the [Prowler Hub](https://hub.prowler.com/) for a more human-readable view.
The best way to learn how to implement a new service is to follow the [service implementation documentation](/developer-guide/services#adding-a-new-service) and use services that are already implemented as a reference. The next subsection lists common patterns used across all AWS services.
### AWS Service Common Patterns
* Services communicate with AWS using boto3. The documentation for all the services is available in the [boto3 service reference](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/index.html).
* Every AWS service class inherits from `AWSService`, ensuring access to session, identity, configuration, and threading utilities.
* The constructor (`__init__`) always calls `super().__init__` with the service name and provider (e.g. `super().__init__(__class__.__name__, provider)`). The service name passed to the constructor must match the service name in boto3. `__class__.__name__` is commonly used because the class name is the same as the service name.
* Resource containers **must** be initialized in the constructor. They should be dictionaries, with the key being the resource ARN or equivalent unique identifier and the value being the resource object.
* Resource discovery and attribute collection are parallelized using `self.__threading_call__`, typically by region or resource, for performance. The method accepts an optional iterator: if it is not provided, the call runs once per region; if it is provided, it must be a list of the resources to process.
* Resource filtering is consistently enforced using the `self.audit_resources` attribute and the `is_resource_filtered` function. Together they determine whether a resource falls outside the audit scope requested by the user, so the service logic can skip it. The check is normally applied before storing the resource in the service container: `if not self.audit_resources or (is_resource_filtered(resource["arn"], self.audit_resources)):`.
* All AWS resources are represented as Pydantic `BaseModel` classes, providing type safety and structured access to resource attributes.
* AWS API calls are wrapped in try/except blocks, with specific handling for `ClientError` and generic exceptions, always logging errors.
* If ARN is not present for some resource, it can be constructed using string interpolation, always including partition, service, region, account, and resource ID.
* Tags and additional attributes that cannot be retrieved from the default call should be collected and stored for each resource using dedicated methods, parallelized with threading and using the list of resource objects as the iterator.
* When accessing dictionary values from AWS API responses, always use `.get()` with a default value instead of direct dictionary access (e.g., `response.get("Policies", {})` instead of `response["Policies"]`). AWS API responses may not always include all keys, and direct access can cause `KeyError` exceptions that break the entire scan for that service.
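Two of the patterns above, ARN construction and safe dictionary access, can be sketched together. This is a minimal illustration with made-up sample data; `build_arn` and the `response` dictionary are hypothetical and do not come from the Prowler codebase.

```python
def build_arn(partition: str, service: str, region: str, account_id: str, resource: str) -> str:
    """Compose an ARN from partition, service, region, account, and resource ID."""
    return f"arn:{partition}:{service}:{region}:{account_id}:{resource}"


# Sample response that lacks the "Policies" key (made-up data).
response = {"Tags": [{"Key": "env", "Value": "prod"}]}

# .get() with a default avoids the KeyError that response["Policies"] would raise.
policies = response.get("Policies", {})
tags = response.get("Tags", [])

arn = build_arn("aws", "ses", "eu-west-1", "123456789012", "identity/example.com")
```

With direct access, the missing `Policies` key would raise `KeyError` and interrupt collection for the service; with `.get()`, `policies` is simply an empty dictionary.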
### Extending an Existing Service with New Attributes
When adding a new check that requires data not yet collected by an existing service, you need to extend the service by adding new attributes to its resource models and updating the data collection methods. This is a common contributor task that follows a consistent pattern:
1. **Identify the missing data**: Determine which AWS API call provides the data you need and whether it's already being called by the service.
2. **Add new attributes to the resource model**: Extend the Pydantic `BaseModel` class for the resource with the new fields. Use `Optional` types with `None` as the default value to maintain backward compatibility with existing checks.
3. **Update the data collection method**: Modify the existing method that fetches resource details to also extract and store the new attributes. If no existing method fetches the data, add a new method and call it in the constructor using `self.__threading_call__` if possible.
4. **Use safe dictionary access**: When extracting values from API responses, always use `.get()` with appropriate defaults to prevent `KeyError` exceptions when the API doesn't return certain fields.
#### Example: Adding DKIM Status to SES Identities
```python theme={null}
from json import loads
from typing import Optional

from pydantic import BaseModel

from prowler.lib.logger import logger


# Step 1 & 2: Add new fields to the resource model
class Identity(BaseModel):
    name: str
    arn: str
    region: str
    type: Optional[str]
    policy: Optional[dict] = None
    tags: Optional[list] = []
    # New attributes for DKIM check
    dkim_status: Optional[str] = None
    dkim_signing_attributes_origin: Optional[str] = None


# Step 3: Update the data collection method (defined on the service class)
def _get_email_identities(self, identity):
    try:
        regional_client = self.regional_clients[identity.region]
        identity_attributes = regional_client.get_email_identity(
            EmailIdentity=identity.name
        )
        # Step 4: Use .get() for safe dictionary access
        for content_value in identity_attributes.get("Policies", {}).values():
            identity.policy = loads(content_value)
        identity.tags = identity_attributes.get("Tags", [])
        # Extract new DKIM attributes
        identity.dkim_status = identity_attributes.get("DkimStatus")
        identity.dkim_signing_attributes_origin = identity_attributes.get(
            "DkimSigningAttributesOrigin"
        )
    except Exception as error:
        # Log with identity.region so the handler works even if the client lookup failed
        logger.error(
            f"{identity.region} -- {error.__class__.__name__}[{error.__traceback__.tb_lineno}]: {error}"
        )
```
5. **Update the service tests**: Add the new attributes to the test mock data and assertions to verify correct data extraction.
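A test for the new attributes could look like the sketch below. It is self-contained for illustration only: it uses a dataclass stand-in and `unittest.mock.MagicMock` rather than Prowler's real service class and test fixtures, `collect_dkim` is a hypothetical helper, and the response keys mirror the example above rather than a verified API response.

```python
from dataclasses import dataclass, field
from typing import Optional
from unittest.mock import MagicMock


@dataclass
class Identity:
    """Stand-in for the Pydantic model from the example above."""

    name: str
    region: str
    tags: list = field(default_factory=list)
    dkim_status: Optional[str] = None
    dkim_signing_attributes_origin: Optional[str] = None


def collect_dkim(identity, regional_client):
    """Hypothetical helper reproducing only the DKIM part of the collection."""
    attributes = regional_client.get_email_identity(EmailIdentity=identity.name)
    identity.dkim_status = attributes.get("DkimStatus")
    identity.dkim_signing_attributes_origin = attributes.get(
        "DkimSigningAttributesOrigin"
    )


def test_dkim_attributes_are_collected():
    client = MagicMock()
    client.get_email_identity.return_value = {
        "DkimStatus": "SUCCESS",
        "DkimSigningAttributesOrigin": "AWS_SES",
    }
    identity = Identity(name="example.com", region="eu-west-1")
    collect_dkim(identity, client)
    assert identity.dkim_status == "SUCCESS"
    assert identity.dkim_signing_attributes_origin == "AWS_SES"
    client.get_email_identity.assert_called_once_with(EmailIdentity="example.com")


def test_missing_dkim_keys_default_to_none():
    client = MagicMock()
    client.get_email_identity.return_value = {}  # API omitted the fields
    identity = Identity(name="example.com", region="eu-west-1")
    collect_dkim(identity, client)
    assert identity.dkim_status is None
    assert identity.dkim_signing_attributes_origin is None
```

The second test covers the case that motivates the `Optional` defaults and `.get()` access: a response without the new keys must leave the attributes as `None` instead of raising.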
## Specific Patterns in AWS Checks
The AWS checks pattern is described in [checks page](/developer-guide/checks). You can find all the currently implemented checks:
* Directly in the code, within each service folder, each check has its own folder named after the name of the check. (e.g. [`prowler/providers/aws/services/s3/s3_bucket_acl_prohibited/`](https://github.com/prowler-cloud/prowler/tree/master/prowler/providers/aws/services/s3/s3_bucket_acl_prohibited))
* In the [Prowler Hub](https://hub.prowler.com/) for a more human-readable view.
The best way to learn how to implement a new check is to follow the [check creation documentation](/developer-guide/checks#creating-a-check) and use similar checks as a reference.
### Check Report Class
The `Check_Report_AWS` class models a single finding for an AWS resource in a check report. It is defined in [`prowler/lib/check/models.py`](https://github.com/prowler-cloud/prowler/blob/master/prowler/lib/check/models.py) and inherits from the generic `Check_Report` base class.
#### Purpose
`Check_Report_AWS` extends the base report structure with AWS-specific fields, enabling detailed tracking of the resource, ARN, and region associated with each finding.
#### Constructor and Attribute Population
When you instantiate `Check_Report_AWS`, you must provide the check metadata and a resource object. The class will attempt to automatically populate its AWS-specific attributes from the resource, using the following logic (in order of precedence):
* **`resource_id`**:
* Uses `resource.id` if present.
* Otherwise, uses `resource.name` if present.
* Defaults to an empty string if none are available.
* **`resource_arn`**:
* Uses `resource.arn` if present.
* Defaults to an empty string if ARN is not present in the resource object.
* **`region`**:
* Uses `resource.region` if present.
* Defaults to an empty string if region is not present in the resource object.
If the resource object does not contain the required attributes, you must set them manually in the check logic.
Other attributes are inherited from the `Check_Report` class, from which you **always** have to set the `status` and `status_extended` attributes in the check logic.
#### Example Usage
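```python theme={null}
from prowler.lib.check.models import Check_Report_AWS

# check_metadata and resource_object are placeholders for the check's
# metadata and the resource being evaluated.
report = Check_Report_AWS(
    metadata=check_metadata,
    resource=resource_object,
)
report.status = "PASS"
report.status_extended = "Resource is compliant."
```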
# Azure Provider
Source: https://docs.prowler.com/developer-guide/azure-details
This page describes the [Microsoft Azure](https://azure.microsoft.com/) provider implementation in Prowler.
By default, Prowler audits all the subscriptions that it is able to list in the Microsoft Entra tenant, as well as the tenant's Entra ID service. To configure it, follow the [Azure getting started guide](/user-guide/providers/azure/getting-started-azure).
## Azure Provider Classes Architecture
The Azure provider implementation follows the general [Provider structure](/developer-guide/provider). This section focuses on the Azure-specific implementation, highlighting how the generic provider concepts are realized for Azure in Prowler. For a full overview of the provider pattern, base classes, and extension guidelines, see the [Provider documentation](/developer-guide/provider). The following subsections describe the main classes of the Azure provider.
### `AzureProvider` (Main Class)
* **Location:** [`prowler/providers/azure/azure_provider.py`](https://github.com/prowler-cloud/prowler/blob/master/prowler/providers/azure/azure_provider.py)
* **Base Class:** Inherits from `Provider` (see [base class details](https://github.com/prowler-cloud/prowler/blob/master/prowler/providers/common/provider.py)).
* **Purpose:** Central orchestrator for Azure-specific logic, session management, credential validation, and configuration.
* **Key Azure Responsibilities:**
* Initializes and manages Azure sessions (supports Service Principal, CLI, Browser, and Managed Identity authentication).
* Validates credentials and sets up the Azure identity context.
* Loads and manages configuration, mutelist, and fixer settings.
* Retrieves subscription(s) metadata.
* Provides properties and methods for downstream Azure service classes to access session, identity, and configuration data.
### Data Models
* **Location:** [`prowler/providers/azure/models.py`](https://github.com/prowler-cloud/prowler/blob/master/prowler/providers/azure/models.py)
* **Purpose:** Define structured data for Azure identity, session, region configuration, and subscription info.
* **Key Azure Models:**
* `AzureIdentityInfo`: Holds Azure identity metadata, including tenant ID, domain, subscription names and IDs, and locations.
* `AzureRegionConfig`: Stores the region that will be audited, which can be Global, US Government, or China.
* `AzureSubscription`: Represents a subscription with ID, display name, and state.
### `AzureService` (Service Base Class)
* **Location:** [`prowler/providers/azure/lib/service/service.py`](https://github.com/prowler-cloud/prowler/blob/master/prowler/providers/azure/lib/service/service.py)
* **Purpose:** Abstract base class that all Azure service-specific classes inherit from. It implements the generic service pattern (described in the [service page](/developer-guide/services#service-base-class)) specifically for Azure.
* **Key Azure Responsibilities:**
* Receives an `AzureProvider` instance to access session, identity, and configuration.
* Manages clients for all services by subscription.
* Exposes common audit context (`subscriptions`, `locations`, `audit_config`, `fixer_config`) to subclasses.
### Exception Handling
* **Location:** [`prowler/providers/azure/exceptions/exceptions.py`](https://github.com/prowler-cloud/prowler/blob/master/prowler/providers/azure/exceptions/exceptions.py)
* **Purpose:** Custom exception classes for Azure-specific error handling, such as credential, region, and session errors.
### Session and Utility Helpers
* **Location:** [`prowler/providers/azure/lib/`](https://github.com/prowler-cloud/prowler/blob/master/prowler/providers/azure/lib/)
* **Purpose:** Helpers for argument parsing, region setup, mutelist management, and other cross-cutting concerns.
## Specific Patterns in Azure Services
The generic service pattern is described in the [service page](/developer-guide/services#service-structure-and-initialisation). You can find all the currently implemented services in the following locations:
* Directly in the code, in location [`prowler/providers/azure/services/`](https://github.com/prowler-cloud/prowler/tree/master/prowler/providers/azure/services)
* In the [Prowler Hub](https://hub.prowler.com/) for a more human-readable view.
The best reference for implementing a new service is the [service implementation documentation](/developer-guide/services#adding-a-new-service), together with services that are already implemented. The following subsection lists common patterns that are used across all Azure services.
### Azure Service Common Patterns
* Services communicate with Azure through the Azure Python SDK, mainly using the Azure Management Clients. The exception is the Microsoft Entra ID service, which uses the Microsoft Graph API. The documentation for all the management services is available [here](https://learn.microsoft.com/en-us/python/api/overview/azure/?view=azure-python).
* Every Azure service class inherits from `AzureService`, ensuring access to session, identity, configuration, and client utilities.
* The constructor (`__init__`) always calls `super().__init__` with the service's Azure Management Client and the Prowler provider object (e.g., `super().__init__(WebSiteManagementClient, provider)`).
* Resource containers **must** be initialized in the constructor. They should be dictionaries keyed by subscription ID, where each value is a dictionary that maps the resource ID to the resource object.
* All Azure resources are represented as Pydantic `BaseModel` classes, providing type safety and structured access to resource attributes. Some are represented as dataclasses due to legacy reasons, but new resources should be represented as Pydantic `BaseModel` classes.
* Azure SDK calls are wrapped in try/except blocks with specific error handling, and errors are always logged. It is a best practice to create a dedicated function for every Azure SDK call so that errors can be handled more specifically.
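A minimal sketch of the container and error-handling patterns above, using only the standard library. `WebApp`, `FakeWebClient`, and `AppServiceSketch` are hypothetical stand-ins: a real service inherits from `AzureService`, receives its per-subscription clients from it, and would normally model new resources as Pydantic `BaseModel` classes rather than dataclasses.

```python
import logging
from dataclasses import dataclass

logger = logging.getLogger(__name__)


@dataclass
class WebApp:
    """Stand-in resource model (the guide recommends Pydantic BaseModel for new resources)."""

    id: str
    name: str
    location: str


class FakeWebClient:
    """Hypothetical per-subscription client that mimics an Azure SDK list call."""

    def __init__(self, apps=None, error=None):
        self._apps = apps or []
        self._error = error

    def list_apps(self):
        if self._error:
            raise self._error
        return list(self._apps)


class AppServiceSketch:
    def __init__(self, clients):
        # Assumption: `clients` maps subscription ID -> client, as the base class would provide
        self.clients = clients
        # Resource container initialized in the constructor
        self.apps = self._get_apps()

    def _get_apps(self):
        """Dedicated function for one SDK call, so its errors are handled in one place."""
        apps = {}
        for subscription_id, client in self.clients.items():
            # Outer key: subscription ID; inner key: resource ID
            apps[subscription_id] = {}
            try:
                for app in client.list_apps():
                    apps[subscription_id][app.id] = app
            except Exception as error:
                logger.error(
                    f"Subscription {subscription_id}: {error.__class__.__name__}: {error}"
                )
        return apps


service = AppServiceSketch(
    {
        "sub-1": FakeWebClient([WebApp(id="app-1", name="web", location="westeurope")]),
        "sub-2": FakeWebClient(error=RuntimeError("simulated SDK failure")),
    }
)
```

Because each subscription gets its entry before the SDK call runs, a failing subscription still appears in the container with an empty dictionary, and checks can iterate over it safely.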
## Specific Patterns in Azure Checks
The Azure checks pattern is described in the [checks page](/developer-guide/checks). You can find all the currently implemented checks:
* Directly in the code, within each service folder. Each check has its own folder named after the check (e.g., [`prowler/providers/azure/services/storage/storage_blob_public_access_level_is_disabled/`](https://github.com/prowler-cloud/prowler/tree/master/prowler/providers/azure/services/storage/storage_blob_public_access_level_is_disabled)).
* In the [Prowler Hub](https://hub.prowler.com/) for a more human-readable view.
The best reference for implementing a new check is the [Azure check implementation documentation](/developer-guide/checks#creating-a-check), together with similar existing checks.
### Check Report Class
The `Check_Report_Azure` class models a single finding for an Azure resource in a check report. It is defined in [`prowler/lib/check/models.py`](https://github.com/prowler-cloud/prowler/blob/master/prowler/lib/check/models.py) and inherits from the generic `Check_Report` base class.
#### Purpose
`Check_Report_Azure` extends the base report structure with Azure-specific fields, enabling detailed tracking of the resource, resource ID, name, subscription, and location associated with each finding.
#### Constructor and Attribute Population
When you instantiate `Check_Report_Azure`, you must provide the check metadata and a resource object. The class will attempt to automatically populate its Azure-specific attributes from the resource, using the following logic (in order of precedence):
* **`resource_id`**:
  * Uses `resource.id` if present.
  * Otherwise, uses `resource.resource_id` if present.
  * Defaults to an empty string if not available.
* **`resource_name`**:
  * Uses `resource.name` if present.
  * Otherwise, uses `resource.resource_name` if present.
  * Defaults to an empty string if not available.
* **`subscription`**:
  * Defaults to an empty string. It **must** be set in the check logic.
* **`location`**:
  * Uses `resource.location` if present.
  * Defaults to an empty string if not available.
If the resource object does not contain the required attributes, you must set them manually in the check logic.
Other attributes are inherited from the `Check_Report` class, from which you **always** have to set the `status` and `status_extended` attributes in the check logic.
#### Example Usage
```python theme={null}
report = Check_Report_Azure(
    metadata=check_metadata,
    resource=resource_object
)
report.subscription = subscription_id
report.status = "PASS"
report.status_extended = "Resource is compliant."
```
# Check Metadata Guidelines
Source: https://docs.prowler.com/developer-guide/check-metadata-guidelines
## Introduction
This guide provides comprehensive guidelines for creating check metadata in Prowler. For basic information on check metadata structure, refer to the [check metadata](/developer-guide/checks#metadata-structure-for-prowler-checks) section.
## Check Title Guidelines
### Writing Guidelines
1. **Determine Resource Finding Scope (Singular vs. Plural)**:
When determining whether to use singular or plural in the check title, examine the code for certain patterns. If the code contains a loop that generates an individual report for each resource, use the singular form. If the code produces a single report that covers all resources collectively, use the plural form. For organization- or account-wide checks, select the scope that best matches the breadth of the evaluation. Additionally, review the `status_extended` field messages in the code, as they often provide clues about whether the check is scoped to individual resources or to groups of resources.
Analyze the detection code to determine if the check reports on individual resources or aggregated resources:
* **Singular**: Use when the check creates one report per resource (e.g., "EC2 instance has IMDSv2 enforced", "S3 bucket does not allow public write access").
* **Plural**: Use when the check creates one report for all resources together (e.g., "All EC2 instances have IMDSv2 enforced", "S3 buckets do not allow public write access").
2. **Describe the Compliant (*PASS*) State**:
Always write the title to describe the **desired, compliant state** of the resources. The title should reflect what it looks like when the audited resource is following the check's requirements.
3. **Be Specific and Factual**:
Include the exact secure configuration being verified. Avoid vague or generic terms like "properly configured".
4. **Avoid Redundant or Action Words**:
Do not include verbs like "Check", "Verify", "Ensure", or "Monitor". The title is a declarative statement of the secure condition.
5. **Length Limit**:
Keep the title under 150 characters.
### Common Mistakes to Avoid
* Starting with verbs like "Check", "Verify", "Ensure", "Make sure". Always start with the affected resource instead.
* Being too vague or generic (e.g., "Ensure security groups are properly configured": "properly configured" does not describe the compliant state).
* Focusing on the non-compliant state instead of the compliant state.
* Using unclear scope and resource identification.
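Some of the title rules above are mechanical enough to check automatically. The following is a rough sketch, not part of Prowler: `lint_check_title` is a hypothetical helper that covers only the length limit, the leading verbs, and the "properly configured" phrasing. Scope and compliant-state wording still need human review.

```python
MAX_TITLE_LENGTH = 150
DISALLOWED_LEADING_WORDS = ("check", "verify", "ensure", "monitor", "make sure")
VAGUE_PHRASES = ("properly configured",)


def lint_check_title(title):
    """Return a list of guideline violations found in a check title."""
    problems = []
    lowered = title.strip().lower()

    # "Keep the title under 150 characters."
    if len(title) >= MAX_TITLE_LENGTH:
        problems.append(f"title has {len(title)} characters (limit is under {MAX_TITLE_LENGTH})")

    # Titles should start with the affected resource, not an action verb.
    for word in DISALLOWED_LEADING_WORDS:
        if lowered == word or lowered.startswith(word + " "):
            problems.append(f"title starts with '{word}'")
            break

    # Vague phrasing does not describe the compliant state.
    for phrase in VAGUE_PHRASES:
        if phrase in lowered:
            problems.append(f"title contains vague phrase '{phrase}'")

    return problems


good = lint_check_title("EC2 instance has IMDSv2 enforced")
bad = lint_check_title("Ensure security groups are properly configured")
```

Here `good` is an empty list, while `bad` reports both the leading verb and the vague phrase.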
## Check Type Guidelines (AWS Only)
### AWS Security Hub Type Format
AWS Security Hub uses a three-part type taxonomy:
* **Namespace**: The top-level security domain.
* **Category**: The security control family or area.
* **Classifier**: The specific security concern (optional).
A partial path may be defined (e.g., `TTPs` or `TTPs/Defense Evasion` are valid).
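As an illustration of the three-part format, the sketch below splits a type string on `/` and accepts partial paths. `FindingType` and `parse_finding_type` are hypothetical names, and the sketch assumes the slash-separated form shown in the examples above. It does not validate the values against the Security Hub taxonomy.

```python
from typing import NamedTuple, Optional


class FindingType(NamedTuple):
    namespace: str
    category: Optional[str] = None
    classifier: Optional[str] = None


def parse_finding_type(value):
    """Split 'Namespace/Category/Classifier' into its parts; partial paths are allowed."""
    parts = [part.strip() for part in value.split("/")]
    if not 1 <= len(parts) <= 3 or any(not part for part in parts):
        raise ValueError(f"invalid type path: {value!r}")
    return FindingType(*parts)


partial = parse_finding_type("TTPs/Defense Evasion")
namespace_only = parse_finding_type("TTPs")
```

A partial path leaves the missing levels as `None`, so `partial.classifier` and `namespace_only.category` are both unset.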
### Selection Guidelines
1. **Be Specific**: Use the most specific classifier that accurately describes the check.
2. **Standard Compliance**: Consider if the check relates to specific compliance standards.
3. **Multiple Types**: You can specify multiple types if the check addresses multiple concerns.
## Description Guidelines
### Writing Guidelines
1. **Focus on the Finding**: All fields should address how the finding affects the security posture, rather than the control itself.
2. **Use Natural Language**: Write in simple, clear paragraphs with complete, grammatically correct sentences.
3. **Use Markdown Formatting**: Enhance readability with markdown formatting:
* Use **bold** for emphasis on key security concepts.
* Use *italic* for a secondary emphasis. Use it for clarifications, conditions, or optional notes. But don't abuse it.
* Use `code` formatting for specific configuration values, or technical details. Don't use it for service names or common technical terms.
* Use one or two line breaks (`\n` or `\n\n`) to separate distinct ideas.
* Use bullet points (`-`) for listing multiple concepts or actions.
* Use numbers for listing steps or sequential actions.
4. **Be Concise**: Maximum 400 characters (spaces count). Every word should add value.
5. **Explain What the Finding Means**: Focus on what the security control evaluates and what it means when it passes or fails, but without explicitly stating the pass or fail state.
6. **Be Technical but Clear**: Use appropriate technical terminology while remaining understandable.
7. **Avoid Risk Descriptions**: Do not describe potential risks, threats, or consequences.
8. **CheckTitle and Description can be the same**: If the check is very simple and the title is already clear, you can use the same text for the description.
### Common Mistakes to Avoid
* **Technical Implementation Details**: "The control loops through all instances and calls the describe\_instances API...".
* **Vague Descriptions**: "This control verifies proper configuration of resources". What does it mean? "proper configuration" is not a clear description of the compliant state.
* **Risk Descriptions**: "This could lead to data breaches" or "This poses a security threat".
* **Starting with Verbs**: "Check if...", "Verify...", "Ensure...". Always start with the affected resource instead.
* **References to Pass/Fail States**: Avoid using words like "pass" or "fail".
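Two of the description rules lend themselves to a quick automated check: the 400-character limit and the ban on pass/fail wording. The helper below is a hypothetical sketch, not a Prowler utility, and its regular expression is only a heuristic for common forms of "pass" and "fail".

```python
import re

MAX_DESCRIPTION_LENGTH = 400
# Matches pass, passes, passed, passing, fail, fails, failed, failing as whole words.
PASS_FAIL_PATTERN = re.compile(r"\b(pass|fail)(es|ed|s|ing)?\b", re.IGNORECASE)


def lint_description(description):
    """Return a list of guideline violations found in a check description."""
    problems = []
    # Spaces count towards the limit, so the raw length is used.
    if len(description) > MAX_DESCRIPTION_LENGTH:
        problems.append(
            f"description has {len(description)} characters (maximum is {MAX_DESCRIPTION_LENGTH})"
        )
    if PASS_FAIL_PATTERN.search(description):
        problems.append("description refers to a pass/fail state")
    return problems


clean = lint_description(
    "S3 bucket blocks public write access through its bucket policy and ACLs."
)
flagged = lint_description("The check fails when the bucket is public. " + "x" * 400)
```

Words such as "password" or "failover" are not flagged, because the pattern only matches whole words.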
## Risk Guidelines
### Writing Guidelines
1. **Explain the Cybersecurity Impact**: Focus on how the finding affects confidentiality, integrity, or availability (CIA triad). If the CIA triad does not apply, explain the risk in terms of the organization's business objectives.
2. **Be Specific About Threats**: Clearly state what could happen if this security control is not in place. What attacks or incidents become possible?
3. **Focus on Risk Context**: Explain the specific security implications of the finding, not just generic security risks.
4. **Use Markdown Formatting**: Enhance readability with markdown formatting:
* Use **bold** for emphasis on key security concepts.
* Use *italic* for a secondary emphasis. Use it for clarifications, conditions, or optional notes. But don't abuse it.
* Use `code` formatting for specific configuration values, or technical details. Don't use it for service names or common technical terms.
* Use one or two line breaks (`\n` or `\n\n`) to separate distinct ideas.
* Use bullet points (`-`) for listing multiple concepts or actions.
* Use numbers for listing steps or sequential actions.
5. **Be Concise**: Maximum 400 characters. Make every word count.
### Common Mistakes to Avoid
* **Generic Risks**: "This could lead to security issues" or "Regulatory compliance violations".
* **Technical Implementation Focus**: "The API call might fail and return incorrect results...".
* **Overly Broad Statements**: "This is a serious security risk that could impact everything".
* **Vague Threats**: "This could be exploited by threat actors" without explaining how.
## Recommendation Guidelines
### Writing Guidelines
1. **Provide Actionable Best Practice Guidance**: Explain what should be done to maintain security posture. Focus on preventive measures and proactive security practices.
2. **Be Principle-Based**: Reference established security principles (least privilege, defense in depth, zero trust, separation of duties) where applicable.
3. **Focus on Prevention**: Explain best practices that prevent the security issue from occurring, not just detection or remediation.
4. **Use Markdown Formatting**: Enhance readability with markdown formatting:
* Use **bold** for emphasis on key security concepts.
* Use *italic* for a secondary emphasis. Use it for clarifications, conditions, or optional notes. But don't abuse it.
* Use `code` formatting for specific configuration values, or technical details. Don't use it for service names or common technical terms.
* Use one or two line breaks (`\n` or `\n\n`) to separate distinct ideas.
* Use bullet points (`-`) for listing multiple concepts or actions.
* Use numbers for listing steps or sequential actions.
5. **Be Concise**: Maximum 400 characters.
### Common Mistakes to Avoid
* **Specific Remediation Steps**: "1. Go to the console\n2. Click on settings..." - Focus on principles, not click-by-click instructions.
* **Implementation Details**: "Configure the JSON policy with the following IAM actions..." - Explain what to achieve, not how.
* **Vague Guidance**: "Follow security best practices..." without explaining what those practices are.
* **Resource-Specific Recommendations**: "Enable MFA on user [john.doe@example.com](mailto:john.doe@example.com)" - Keep it general.
* **Missing Context**: Not explaining why the best practice is important for security.
### Good Examples
* *"Avoid exposing sensitive resources directly to the Internet; configure access controls to limit exposure."*
* *"Apply the principle of least privilege when assigning permissions to users and services."*
* *"Regularly review and update your security configurations to align with current best practices."*
## Remediation Code Guidelines
### Critical Requirement
The **fundamental principle** is to focus on the **specific change** that converts the finding from non-compliant to compliant.
It is also important to keep all code examples as short as possible, including only the essential code to fix the issue. Remove any extra configuration, optional parameters, or nice-to-have settings, and add comments to explain the code when possible.
### Common Guidelines for All Code Fields
1. **Be Minimal**: Keep code blocks as short as possible - only include what is absolutely necessary.
2. **Focus on the Fix**: Remove any extra configuration, optional parameters, or nice-to-have settings.
3. **Be Accurate**: Ensure all commands and code are syntactically correct.
4. **Use Markdown Formatting**: Format code properly using code blocks and appropriate syntax highlighting.
5. **Follow Best Practices**: Use the most secure and recommended approaches for each platform.
### CLI Guidelines
* Only provide a single command that directly changes the finding from fail to pass.
* The command must be executable as-is and resolve the security issue completely.
* Use proper command syntax for the provider (AWS CLI, Azure CLI, gcloud, kubectl, etc.).
* Do not use markdown formatting or code blocks - just the raw command.
* Do not include multiple commands, comments, or explanations.
* If the issue cannot be resolved with a single command, leave this field empty.
### Native IaC Guidelines
* **Keep It Minimal**: Only include the specific resource/configuration that fixes the security issue.
* Format as markdown code blocks with proper syntax highlighting.
* Include only the required properties to fix the issue.
* Add comments indicating the critical line(s) that remediate the check.
* Use `example_resource` as the generic name for all resources and IDs.
### Terraform Guidelines
* **Keep It Minimal**: Only include the specific resource/configuration that fixes the security issue.
* Provide valid HCL (HashiCorp Configuration Language) code with an example of a compliant configuration.
* Use the latest Terraform syntax and provider versions.
* Include only the required arguments to fix the issue - skip optional parameters.
* Format as markdown code blocks with `hcl` syntax highlighting.
* Add comments indicating the critical line(s) that remediate the check.
* Use `example_resource` as the generic name for all resources and IDs.
* Skip provider requirements unless critical for the fix.
### Other (Manual Steps) Guidelines
* **Keep It Minimal**: Only include the exact steps needed to fix the security issue.
* Provide step-by-step instructions for manual remediation through web interfaces.
* Use numbered lists for sequential steps.
* Be specific about menu locations, button names, and settings.
* Skip optional configurations or nice-to-have settings.
* Format using markdown for better readability.
## Categories Guidelines
### Selection Guidelines
1. **Be Specific**: Only select categories that directly relate to what the automated control evaluates.
2. **Primary Focus**: Consider the primary security concern the automated control addresses.
3. **Avoid Over-Categorization**: Do not select categories just because they are tangentially related.
### Available Categories
| Category | Definition |
| ---------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| encryption | Ensures data is encrypted in transit and/or at rest, including key management practices |
| internet-exposed | Checks that limit or flag public access to services, APIs, or assets from the Internet |
| logging | Ensures appropriate logging of events, activities, and system interactions for traceability |
| secrets | Manages and protects credentials, API keys, tokens, and other sensitive information |
| resilience | Ensures systems can maintain availability and recover from disruptions, failures, or degradation. Includes redundancy, fault-tolerance, auto-scaling, backup, disaster recovery, and failover strategies |
| threat-detection | Identifies suspicious activity or behaviors using IDS, malware scanning, or anomaly detection |
| trust-boundaries | Enforces isolation or segmentation between different trust levels (e.g., VPCs, tenants, network zones) |
| vulnerabilities | Detects or remediates known software, infrastructure, or config vulnerabilities (e.g., CVEs) |
| cluster-security | Secures Kubernetes cluster components such as API server, etcd, and role-based access |
| container-security | Ensures container images and runtimes follow security best practices |
| node-security | Secures nodes running containers or services |
| gen-ai | Checks related to safe and secure use of generative AI services or models |
| ci-cd | Ensures secure configurations in CI/CD pipelines |
| identity-access | Governs user and service identities, including least privilege, MFA, and permission boundaries |
| email-security | Ensures detection and protection against phishing, spam, spoofing, etc. |
| forensics-ready | Ensures systems are instrumented to support post-incident investigations. Any digital trace or evidence (logs, volume snapshots, memory dumps, network captures, etc.) preserved immutably and accompanied by integrity guarantees, which can be used in a forensic analysis |
| software-supply-chain | Detects or prevents tampering, unauthorized packages, or third-party risks in software supply chain |
| e3 | M365 and Azure Entra checks enabled by or dependent on an E3 license (e.g., baseline security policies, conditional access) |
| e5 | M365 and Azure Entra checks enabled by or dependent on an E5 license (e.g., advanced threat protection, audit, DLP, and eDiscovery) |
| privilege-escalation | Detects IAM policies or permissions that allow identities to elevate their privileges beyond their intended scope, potentially gaining administrator or higher-level access through specific action combinations |
| ec2-imdsv1 | Identifies EC2 instances using Instance Metadata Service version 1 (IMDSv1), which is vulnerable to SSRF attacks and should be replaced with IMDSv2 for enhanced security |
| vercel-hobby-plan | Vercel checks whose audited feature is available on the Hobby plan (and therefore also on Pro and Enterprise plans) |
| vercel-pro-plan | Vercel checks whose audited feature requires a Pro plan or higher, including features also available on Enterprise or via supported paid add-ons for Pro plans |
| vercel-enterprise-plan | Vercel checks whose audited feature requires the Enterprise plan |
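A metadata file can be checked against this list with a simple set difference. The sketch below copies the category names from the table above. `unknown_categories` is a hypothetical helper, not a Prowler function.

```python
ALLOWED_CATEGORIES = frozenset(
    {
        "encryption",
        "internet-exposed",
        "logging",
        "secrets",
        "resilience",
        "threat-detection",
        "trust-boundaries",
        "vulnerabilities",
        "cluster-security",
        "container-security",
        "node-security",
        "gen-ai",
        "ci-cd",
        "identity-access",
        "email-security",
        "forensics-ready",
        "software-supply-chain",
        "e3",
        "e5",
        "privilege-escalation",
        "ec2-imdsv1",
        "vercel-hobby-plan",
        "vercel-pro-plan",
        "vercel-enterprise-plan",
    }
)


def unknown_categories(categories):
    """Return the categories that are not in the table, sorted for stable output."""
    return sorted(set(categories) - ALLOWED_CATEGORIES)


unknown = unknown_categories(["encryption", "networking", "internet-exposed"])
```

In this example only `networking` is reported, since the other two values appear in the table.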
# Prowler Checks
Source: https://docs.prowler.com/developer-guide/checks
This guide explains how to create new checks in Prowler.
## Introduction
Checks are the core component of Prowler. A check is a piece of code designed to validate whether a configuration aligns with cybersecurity best practices. Execution of a check yields a finding, which includes the result and contextual metadata (e.g., outcome, risks, remediation).
### Creating a Check
The most common high-level steps to create a new check are:
1. Prerequisites:
* Verify the check does not already exist by searching [Prowler Hub](https://hub.prowler.com) or checking `prowler/providers/<provider>/services/<service>/<check_name>/`.
* Ensure required provider and service exist. If not, follow the [Provider](/developer-guide/provider) and [Service](/developer-guide/services) documentation to create them.
* Confirm the service has implemented all required methods and attributes for the check (in most cases, you will need to add or modify some methods in the service to get the data you need for the check).
2. Navigate to the service directory. The path should be as follows: `prowler/providers/<provider>/services/<service>`.
3. Create a check-specific folder. The path should follow this pattern: `prowler/providers/<provider>/services/<service>/<check_name>`. Adhere to the [Naming Format for Checks](#naming-format-for-checks).
4. Populate the folder with files as specified in [File Creation](#file-creation).
5. Run the check locally to ensure it works as expected. You can use the CLI as follows:
* To ensure the check has been detected by Prowler: `uv run python prowler-cli.py <provider> --list-checks | grep <check_name>`.
* To run the check and find possible issues: `uv run python prowler-cli.py <provider> --log-level ERROR --verbose --check <check_name>`.
6. Create comprehensive tests for the check that cover multiple scenarios including both PASS (compliant) and FAIL (non-compliant) cases. For detailed information about test structure and implementation guidelines, refer to the [Testing](/developer-guide/unit-testing) documentation.
7. If the check and its corresponding tests are working as expected, you can submit a PR to Prowler.
### Naming Format for Checks
If you already know the check name when creating a request or implementing a check, use a descriptive identifier with lowercase letters and underscores only.
Recommended patterns:
* `<service>_<resource>_<best_practice>`
The name components are:
* `service` – The main service or product area being audited (e.g., ec2, entra, iam, bedrock).
* `resource` – The resource, feature, or configuration being evaluated. It can be a single word or a compound phrase joined with underscores (e.g., instance, policy, guardrail, sensitive\_information\_filter).
* `best_practice` – The expected secure state or best practice being checked (e.g., enabled, encrypted, restricted, configured, not\_publicly\_accessible).
Additional guidance:
* Use underscores only. Do not use hyphens.
* Keep the name specific enough to describe the behavior of the check.
* The first segment should match the service or product area whenever possible.
Examples:
* `s3_bucket_versioning_enabled`
* `bedrock_guardrail_sensitive_information_filter_enabled`
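The naming rules can be approximated with a regular expression. This sketch rests on two assumptions that go beyond the text: digits are accepted because service names such as `s3` and `ec2` contain them, and at least three underscore-separated segments are required to match the service, resource, and best-practice components. `is_valid_check_name` is a hypothetical helper.

```python
import re

# Lowercase alphanumeric segments joined by single underscores, three segments minimum.
CHECK_NAME_PATTERN = re.compile(r"[a-z0-9]+(?:_[a-z0-9]+){2,}")


def is_valid_check_name(name):
    """Return True if the name follows the service_resource_best_practice shape."""
    return CHECK_NAME_PATTERN.fullmatch(name) is not None


results = {
    name: is_valid_check_name(name)
    for name in (
        "s3_bucket_versioning_enabled",
        "bedrock_guardrail_sensitive_information_filter_enabled",
        "S3-Bucket-Versioning-Enabled",
        "s3__bucket_versioning_enabled",
    )
}
```

The first two names are accepted. The third is rejected for its hyphens and uppercase letters, and the fourth for the doubled underscore.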
### File Creation
Each check in Prowler follows a straightforward structure. Within the newly created folder, three files must be added to implement the check logic:
* `__init__.py` (empty file) – Ensures Python treats the check folder as a package.
* `<check_name>.py` (code file) – Contains the check logic, following the prescribed format. See [Prowler's check code structure](/developer-guide/checks#prowlers-check-code-structure) for more information.
* `<check_name>.metadata.json` (metadata file) – Defines the check's metadata for contextual information. See the [check metadata](/developer-guide/checks#metadata-structure-for-prowler-checks) section for more information.
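The three files can be created by hand or with a few lines of `pathlib`. The sketch below builds the folder inside a temporary directory so it is safe to run anywhere. `scaffold_check` is a hypothetical helper, and it assumes that the code and metadata files are named after the check.

```python
import tempfile
from pathlib import Path


def scaffold_check(root, provider, service, check_name):
    """Create the check folder and its three empty files under the given root."""
    check_dir = (
        Path(root) / "prowler" / "providers" / provider / "services" / service / check_name
    )
    check_dir.mkdir(parents=True, exist_ok=True)
    for filename in (
        "__init__.py",                  # marks the folder as a Python package
        f"{check_name}.py",             # check logic
        f"{check_name}.metadata.json",  # check metadata
    ):
        (check_dir / filename).touch()
    return check_dir


with tempfile.TemporaryDirectory() as tmp:
    created = scaffold_check(tmp, "aws", "s3", "s3_bucket_versioning_enabled")
    created_files = sorted(path.name for path in created.iterdir())
```

To scaffold a real check, pass the repository root instead of the temporary directory, then fill in the code and metadata files.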
## Prowler's Check Code Structure
Prowler's check structure is designed for clarity and maintainability. It follows a dynamic loading approach based on predefined paths, ensuring seamless integration of new checks into a provider's service without additional manual steps.
The code for a generic check is shown below. It is strongly recommended to consult other checks from the same provider and service to understand provider-specific details and patterns. This will help ensure consistency and proper implementation of provider-specific requirements.
Report fields depend heavily on the provider. Consult the `CheckReport` class for more information on what can be included in the report [here](https://github.com/prowler-cloud/prowler/blob/master/prowler/lib/check/models.py).
Legacy providers (AWS, Azure, GCP, Kubernetes) follow the `Check_Report_<Provider>` naming convention, which is not recommended for new providers. Newer providers adopt the `CheckReport<Provider>` naming convention. Learn more at [Prowler Code](https://github.com/prowler-cloud/prowler/tree/master/prowler/lib/check/models.py).
```python title="Generic Check Class" theme={null}
# Required Imports
# Import the base Check class and the provider-specific report class
from prowler.lib.check.models import Check, CheckReport<Provider>
# Import the provider service client
from prowler.providers.<provider>.services.<service>.<service>_client import <service>_client


# Defining the Check Class
# Each check must be implemented as a Python class with the same name as its corresponding file.
# The class must inherit from the Check base class.
class <check_name>(Check):
    """
    Ensure that <resource> meets <security_requirement>.

    This check evaluates whether <specific_condition> to ensure <security_benefit>.
    - PASS: <description_of_compliant_state>.
    - FAIL: <description_of_non_compliant_state>.
    """

    def execute(self):
        """Execute the check logic.

        Returns:
            A list of reports containing the result of the check.
        """
        findings = []
        # Iterate over the target resources using the provider service client
        for resource in <service>_client.<resources>:
            # Initialize the provider-specific report class, passing metadata and resource
            # (legacy providers name this class Check_Report_<Provider>)
            report = CheckReport<Provider>(metadata=self.metadata(), resource=resource)
            # Set required fields and implement check logic
            report.status = "PASS"
            report.status_extended = f"<description_of_why_the_resource_is_compliant>"
            # If some of the information needed for the report is not inside the resource, it can be set manually here.
            # This depends on the provider and the resource that is being audited.
            # report.region = resource.region
            # report.resource_tags = getattr(resource, "tags", [])
            # ...
            # Example check logic (replace with actual logic):
            if <non_compliant_condition>:
                report.status = "FAIL"
                report.status_extended = f"<description_of_why_the_resource_is_not_compliant>"
            findings.append(report)
        return findings
```
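To see the same flow run end to end without Prowler installed, the sketch below replaces every Prowler type with a local stand-in. `StubCheck`, `StubReport`, `StubS3Client`, and `Bucket` are all hypothetical. In a real check the base class and report class come from `prowler.lib.check.models`, and the service client is imported at module level rather than passed to the constructor.

```python
from dataclasses import dataclass


@dataclass
class Bucket:
    """Stand-in for a resource model exposed by a service client."""

    name: str
    versioning_enabled: bool


@dataclass
class StubReport:
    """Stand-in for a provider-specific report class."""

    resource_id: str
    status: str = ""
    status_extended: str = ""


class StubS3Client:
    """Stand-in for a provider service client holding already-fetched resources."""

    buckets = [
        Bucket(name="logs", versioning_enabled=True),
        Bucket(name="scratch", versioning_enabled=False),
    ]


class StubCheck:
    """Stand-in for the Check base class."""

    def execute(self):
        raise NotImplementedError


class example_bucket_versioning_enabled(StubCheck):
    def __init__(self, client):
        self.client = client

    def execute(self):
        findings = []
        # One report per resource, defaulting to the compliant state
        for bucket in self.client.buckets:
            report = StubReport(resource_id=bucket.name)
            report.status = "PASS"
            report.status_extended = f"S3 bucket {bucket.name} has versioning enabled."
            if not bucket.versioning_enabled:
                report.status = "FAIL"
                report.status_extended = (
                    f"S3 bucket {bucket.name} does not have versioning enabled."
                )
            findings.append(report)
        return findings


findings = example_bucket_versioning_enabled(StubS3Client()).execute()
statuses = [finding.status for finding in findings]
```

The check yields one finding per bucket, with the `logs` bucket reported as compliant and the `scratch` bucket as non-compliant.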
### Data Requirements for Checks in Prowler
One of the most important aspects when creating a new check is ensuring that all required data is available from the service client. Often, default API calls are insufficient. Extending the service class with new methods or resource attributes may be required to fetch and store requisite data.
### Statuses for Checks in Prowler
Required Fields: status and status\_extended
Each check **must** populate the `report.status` and `report.status_extended` fields according to the following criteria:
* Status field: `report.status`
  * `PASS` – Assigned when the check confirms compliance with the configured value.
  * `FAIL` – Assigned when the check detects non-compliance with the configured value.
  * `MANUAL` – This status must not be used unless manual verification is necessary to determine whether the status (`report.status`) passes (`PASS`) or fails (`FAIL`).
* Status extended field: `report.status_extended`
  * It **must** end with a period (`.`).
  * It **must** include the audited service, the resource, and a concise explanation of the check result, for instance: `EC2 AMI ami-0123456789 is not public.`.
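The mechanical parts of these requirements can be asserted in a unit test or a review script. The sketch below uses a hypothetical `report_problems` helper. It checks the status value and the trailing period only, since whether the message names the service and resource still needs a human reader.

```python
VALID_STATUSES = {"PASS", "FAIL", "MANUAL"}


def report_problems(status, status_extended):
    """Return a list of violations for the two required report fields."""
    problems = []
    if status not in VALID_STATUSES:
        problems.append(f"status {status!r} is not one of {sorted(VALID_STATUSES)}")
    if not status_extended.strip():
        problems.append("status_extended is empty")
    elif not status_extended.endswith("."):
        problems.append("status_extended does not end with a period")
    return problems


ok = report_problems("PASS", "EC2 AMI ami-0123456789 is not public.")
broken = report_problems("pass", "EC2 AMI ami-0123456789 is not public")
```

The first call returns no problems. The second reports the lowercase status and the missing period.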
### Prowler's Check Severity Levels
The severity of each check is defined in the metadata file using the `Severity` field. Severity values are always lowercase and must be one of the predefined categories below.
* `critical` – Highest potential impact with broad exposure that could affect core security boundaries or business operations.
* `high` – Substantial potential impact with significant exposure that could affect important security controls or resources.
* `medium` – Moderate potential impact with limited exposure that weakens defense layers but has contained scope.
* `low` – Minimal potential impact with negligible exposure that represents minor gaps in security posture.
* `informational` – Provides valuable information but does not affect the security posture.
If a check covers multiple scenarios with different severities, the severity can be adjusted dynamically within the check's logic through the `report.check_metadata.Severity` attribute:
```python theme={null}
if :
    report.status = "PASS"
    report.check_metadata.Severity = "informational"
    report.status_extended = f" is compliant with ."
elif :
    report.status = "FAIL"
    report.check_metadata.Severity = "low"
    report.status_extended = f" is not compliant with : ."
elif :
    report.status = "FAIL"
    report.check_metadata.Severity = "medium"
    report.status_extended = f" is not compliant with : ."
elif :
    report.status = "FAIL"
    report.check_metadata.Severity = "high"
    report.status_extended = f" is not compliant with : ."
else:
    report.status = "FAIL"
    report.check_metadata.Severity = "critical"
    report.status_extended = f" is not compliant with : ."
```
### Resource Identification in Prowler
Each check **must** populate the report with a unique identifier for the audited resource. Which identifier or identifiers to use depends on the provider and the resource being audited. The criteria for each provider are:
* AWS
* Amazon Resource ID — `report.resource_id`.
* The resource identifier. This is the name of the resource, the ID of the resource, or a resource path. Some resource identifiers include a parent resource (sub-resource-type/parent-resource/sub-resource) or a qualifier such as a version (resource-type:resource-name:qualifier).
* If the resource ID cannot be retrieved directly from the audited resource, it can be extracted from the ARN. It is the last part of the ARN after the last slash (`/`) or colon (`:`).
* If no actual resource to audit exists, this format can be used: `/unknown`
* Amazon Resource Name — `report.resource_arn`.
* The [Amazon Resource Name (ARN)](https://docs.aws.amazon.com/IAM/latest/UserGuide/reference-arns.html) of the audited entity.
* If the ARN cannot be retrieved directly from the audited resource, construct a valid ARN using the `resource_id` component as the audited entity. Examples:
* Bedrock — `arn::bedrock:::model-invocation-logging`.
* DirectConnect — `arn::directconnect:::dxcon`.
* If no actual resource to audit exists, this format can be used: `arn:::::/unknown`.
* Examples:
* AWS Security Hub — `arn::security-hub:::hub/unknown`.
* Access Analyzer — `arn::access-analyzer:::analyzer/unknown`.
* GuardDuty — `arn::guardduty:::detector/unknown`.
* GCP
* Resource ID — `report.resource_id`.
* Resource ID represents the full, [unambiguous path to a resource](https://google.aip.dev/122#full-resource-names), known as the full resource name. Typically, it follows the format: `//{api_service/resource_path}`.
* If the resource ID cannot be retrieved directly from the audited resource, by default the resource name is used.
* Resource Name — `report.resource_name`.
* Resource Name usually refers to the name of a resource within its service.
* Azure
* Resource ID — `report.resource_id`.
* Resource ID represents the full Azure Resource Manager path to a resource, which follows the format: `/subscriptions/{subscriptionId}/resourceGroups/{resourceGroupName}/providers/{resourceProviderNamespace}/{resourceType}/{resourceName}`.
* Resource Name — `report.resource_name`.
* Resource Name usually refers to the name of a resource within its service.
* If the [resource name](https://learn.microsoft.com/en-us/azure/azure-resource-manager/management/resource-name-rules) cannot be retrieved directly from the audited resource, the last part of the resource ID can be used.
* Kubernetes
* Resource ID — `report.resource_id`.
* The UID of the Kubernetes object. This is a system-generated string that uniquely identifies the object within the cluster for its entire lifetime. See [Kubernetes Object Names and IDs - UIDs](https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#uids).
* Resource Name — `report.resource_name`.
* The name of the Kubernetes object. This is a client-provided string that must be unique for the resource type within a namespace (for namespaced resources) or cluster (for cluster-scoped resources). Names typically follow DNS subdomain or label conventions. See [Kubernetes Object Names and IDs - Names](https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names).
* M365
* Resource ID — `report.resource_id`.
* If the audited resource has a globally unique identifier such as a `guid`, use it as the `resource_id`.
* If no `guid` exists, use another unique and relevant identifier for the resource, such as the tenant domain, the internal policy ID, or a representative string following the format `/`.
* Resource Name — `report.resource_name`.
* Use the visible or descriptive name of the audited resource. If no explicit name is available, use a clear description of the resource or configuration being evaluated.
* Examples:
* For an organization:
* `resource_id`: Organization GUID
* `resource_name`: Organization name
* For a policy:
* `resource_id`: Unique policy ID
* `resource_name`: Policy display name
* For global configurations:
* `resource_id`: Tenant domain or representative string (e.g., "userSettings")
* `resource_name`: Description of the configuration (e.g., "SharePoint Settings")
* GitHub
* Resource ID — `report.resource_id`.
* The ID of the GitHub resource. This is a system-generated integer that uniquely identifies the resource within the GitHub platform.
* Resource Name — `report.resource_name`.
* The name of the GitHub resource. In the case of a repository, this is just the repository name. For full repository names, use the resource `full_name`.
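Two of the fallback rules in the list above are simple string operations: deriving an AWS resource ID from an ARN, and deriving an Azure resource name from a resource ID. The sketch below illustrates them under the assumption that the input strings are well formed; the function names and the sample identifiers are hypothetical and not taken from the Prowler codebase.

```python
import re


def resource_id_from_arn(arn: str) -> str:
    """Return the text after the last '/' or ':' in an ARN."""
    return re.split(r"[/:]", arn)[-1]


def azure_name_from_resource_id(resource_id: str) -> str:
    """Return the last path segment of an Azure Resource Manager ID."""
    return resource_id.rstrip("/").rsplit("/", 1)[-1]


# Sample identifiers made up for illustration.
print(resource_id_from_arn("arn:aws:ec2:us-east-1:123456789012:instance/i-0123456789abcdef0"))
print(resource_id_from_arn("arn:aws:lambda:us-east-1:123456789012:function:my-function"))
print(
    azure_name_from_resource_id(
        "/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/rg1"
        "/providers/Microsoft.Storage/storageAccounts/mystorage"
    )
)
```

Note that for an ARN ending in a qualifier (`resource-type:resource-name:qualifier`), the last-part rule returns the qualifier, so such resources may need dedicated handling.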
### Configurable Checks in Prowler
See [Configurable Checks](/developer-guide/configurable-checks) for detailed information on making checks configurable using the `audit_config` object and configuration file.
## Metadata Structure for Prowler Checks
Each Prowler check must include a metadata file named `.metadata.json` that must be located in its directory. This file supplies crucial information for execution, reporting, and context.
### Example Metadata File
Below is a generic example of a check metadata file. **Do not include comments in actual JSON files.**
````json theme={null}
{
"Provider": "aws",
"CheckID": "service_resource_security_setting",
"CheckTitle": "Service resource has security setting enabled",
"CheckType": [],
"ServiceName": "service",
"SubServiceName": "",
"ResourceIdTemplate": "",
"Severity": "medium",
"ResourceType": "Other",
"ResourceGroup": "security",
"Description": "This check verifies that the service resource has the required **security setting** enabled to protect against potential vulnerabilities.\n\nIt ensures that the resource follows security best practices and maintains proper access controls. The check evaluates whether the security configuration is properly implemented and active.",
"Risk": "Without proper security settings, the resource may be vulnerable to:\n\n- **Unauthorized access** - Malicious actors could gain entry\n- **Data breaches** - Sensitive information could be compromised\n- **Security threats** - Various attack vectors could be exploited\n\nThis could result in compliance violations and potential financial or reputational damage.",
"RelatedUrl": "",
"AdditionalURLs": ["https://example.com/security-documentation", "https://example.com/best-practices"],
"Remediation": {
"Code": {
"CLI": "provider-cli service enable-security-setting --resource-id resource-123",
"NativeIaC": "```yaml\nType: Provider::Service::Resource\nProperties:\n SecuritySetting: enabled\n ResourceId: resource-123\n```",
"Other": "1. Open the provider management console\n2. Navigate to the service section\n3. Select the resource\n4. Enable the security setting\n5. Save the configuration",
"Terraform": "```hcl\nresource \"provider_service_resource\" \"example\" {\n resource_id = \"resource-123\"\n security_setting = true\n}\n```"
},
"Recommendation": {
"Text": "Enable security settings on all service resources to ensure proper protection. Regularly review and update security configurations to align with current best practices.",
"Url": "https://hub.prowler.com/check/service_resource_security_setting"
}
},
"Categories": ["internet-exposed", "secrets"],
"DependsOn": [],
"RelatedTo": ["service_resource_security_setting", "service_resource_security_setting_2"],
"Notes": "This is a generic example check that should be customized for specific provider and service requirements."
}
````
### Metadata Fields and Their Purpose
#### Provider
The Prowler provider related to the check. The name **must** be lowercase and match the provider folder name. For supported providers refer to [Prowler Hub](https://hub.prowler.com/check) or directly to [Prowler Code](https://github.com/prowler-cloud/prowler/tree/master/prowler/providers).
#### CheckID
The unique identifier for the check inside the provider. This field **must** match the check's folder, Python file, and JSON metadata file name. For more information about naming, refer to the [Naming Format for Checks](#naming-format-for-checks) section.
#### CheckTitle
The `CheckTitle` field must be plain text and must clearly and succinctly define **the best practice being evaluated and which resource(s) each finding applies to**. The title should be specific, concise (no more than 150 characters), and reference the relevant resource(s) involved.
**Always write the `CheckTitle` to describe the *PASS* case**, the desired secure or compliant state of the resource(s). This helps ensure that findings are easy to interpret and that the title always reflects the best practice being met.
For detailed guidelines on writing effective check titles, including how to determine singular vs. plural scope and common mistakes to avoid, see [Check Title Guidelines](/developer-guide/check-metadata-guidelines#check-title-guidelines).
#### CheckType
This field is only applicable to the AWS provider.
It follows the [AWS Security Hub Types](https://docs.aws.amazon.com/securityhub/latest/userguide/asff-required-attributes.html#Types) format using the pattern `namespace/category/classifier`.
For the complete AWS Security Hub selection guidelines, see [Check Type Guidelines](/developer-guide/check-metadata-guidelines#check-type-guidelines-aws-only).
#### ServiceName
The name of the provider service being audited. Must be lowercase and match the service folder name. For supported services refer to [Prowler Hub](https://hub.prowler.com/check) or the [Prowler Code](https://github.com/prowler-cloud/prowler/tree/master/prowler/providers).
#### SubServiceName
This field is in the process of being deprecated and should be **left empty**.
#### ResourceIdTemplate
This field is in the process of being deprecated and should be **left empty**.
#### Severity
Severity level if the check fails. Must be one of: `critical`, `high`, `medium`, `low`, or `informational`, and written in lowercase. See [Prowler's Check Severity Levels](#prowlers-check-severity-levels) for details.
#### ResourceType
The type of resource being audited. This field helps categorize and organize findings by resource type for better analysis and reporting. For each provider:
* **AWS**: Use [Security Hub resource types](https://docs.aws.amazon.com/securityhub/latest/userguide/asff-resources.html) or PascalCase CloudFormation types removing the `::` separator used in CloudFormation templates (e.g., in CloudFormation template the type of an EC2 instance is `AWS::EC2::Instance` but in the check it should be `AwsEc2Instance`). Use `Other` if none apply.
* **Azure**: Use types from [Azure Resource Graph](https://learn.microsoft.com/en-us/azure/governance/resource-graph/reference/supported-tables-resources), for example: `Microsoft.Storage/storageAccounts`.
* **Google Cloud**: Use [Cloud Asset Inventory asset types](https://cloud.google.com/asset-inventory/docs/asset-types), for example: `compute.googleapis.com/Instance`.
* **Kubernetes**: Use types shown under `KIND` from `kubectl api-resources`.
* **Oracle Cloud Infrastructure**: Use types from [Oracle Cloud Infrastructure documentation](https://docs.public.oneportal.content.oci.oraclecloud.com/en-us/iaas/Content/Search/Tasks/queryingresources_topic-Listing_Supported_Resource_Types.htm).
* **OpenStack**: Use types from [OpenStack Heat resource types](https://docs.openstack.org/heat/latest/template_guide/openstack.html).
* **Alibaba Cloud**: Use types from [Alibaba Cloud ROS resource types](https://www.alibabacloud.com/help/en/ros/developer-reference/list-of-resource-types-by-service).
* **Any other provider**: Use `NotDefined` due to lack of standardized resource types in their SDK or documentation.
#### ResourceGroup
A high-level classification that groups checks by the type of cloud resource they audit. This field enables filtering and organizing findings by resource category across all providers. The value must be one of the following predefined groups:
| Group | Description |
| --------------- | ----------------------------------------------------------------------------- |
| `compute` | Virtual machines, instances, auto-scaling groups, workspaces, streaming |
| `container` | Container orchestration, Kubernetes, registries, pods |
| `serverless` | Functions, step functions, event-driven compute |
| `database` | Relational, NoSQL, caches, search engines, data warehouses, graph databases |
| `storage` | Object storage, block storage, file systems, backups, archives |
| `network` | VPCs, subnets, load balancers, DNS, VPN, firewalls, CDN |
| `IAM` | IAM users, roles, policies, access keys, service accounts, directories |
| `messaging` | Queues, topics, event buses, streaming, email services |
| `security` | WAF, secrets, KMS, certificates, security tools, defenders, DDoS protection |
| `monitoring` | Logs, metrics, alerts, audit trails, observability, config tracking |
| `api_gateway` | API management, REST APIs, GraphQL endpoints |
| `ai_ml` | Machine learning, AI services, notebooks, training, LLM |
| `governance` | Accounts, organizations, projects, policies, settings, compliance tools |
| `collaboration` | Productivity SaaS apps (Exchange, Teams, SharePoint) |
| `devops` | CI/CD, infrastructure as code, automation, code repositories, version control |
| `analytics` | Data warehouses, query engines, ETL pipelines, BI tools, data lakes |
The group is determined by the resource type being audited, not the service. For example, an EC2 security group check would use `network` (not `compute`), while an EC2 instance check would use `compute`.
#### Description
A concise, natural language explanation that **clearly describes what the finding means**, focusing on clarity and context rather than technical implementation details. Use simple paragraphs with line breaks if needed, but avoid sections, code blocks, or complex formatting. This field is limited to a maximum of 400 characters.
For detailed writing guidelines and common mistakes to avoid, see [Description Guidelines](/developer-guide/check-metadata-guidelines#description-guidelines).
#### Risk
A clear, natural language explanation of **why this finding poses a cybersecurity risk**. Focus on how it may impact confidentiality, integrity, or availability. If those do not apply, describe any relevant operational or financial risks. Use simple paragraphs with line breaks if needed, but avoid sections, code blocks, or complex formatting. Limit your explanation to 400 characters.
For detailed writing guidelines and common mistakes to avoid, see [Risk Guidelines](/developer-guide/check-metadata-guidelines#risk-guidelines).
#### RelatedUrl
*Deprecated*. Use `AdditionalURLs` to add URL references.
#### AdditionalURLs
URLs must be valid and not repeated.
A list of official documentation URLs for further reading. These should be authoritative sources that provide additional context, best practices, or detailed information about the security control being checked. Prefer official provider documentation, security standards, or well-established security resources. Avoid third-party blogs or unofficial sources unless they are highly reputable and directly relevant.
#### Remediation
Provides both code examples and best practice recommendations for addressing the security issue.
* **Code**: Contains remediation examples in different formats:
* **CLI**: Command-line interface commands to make the finding compliant at runtime.
* **NativeIaC**: Native Infrastructure as Code templates with an example of a compliant configuration. For now it applies to:
* **AWS**: CloudFormation YAML formatted code (do not use JSON format).
* **Azure**: Bicep formatted code (do not use ARM templates).
* **Terraform**: HashiCorp Configuration Language (HCL) code with an example of a compliant configuration.
* **Other**: Manual steps through web interfaces or other tools to make the finding compliant.
For detailed guidelines on writing remediation code, see [Remediation Code Guidelines](/developer-guide/check-metadata-guidelines#remediation-code-guidelines).
* **Recommendation**
* **Text**: Generic best practice guidance in natural language using Markdown format (maximum 400 characters). For writing guidelines, see [Recommendation Guidelines](/developer-guide/check-metadata-guidelines#recommendation-guidelines).
* **Url**: [Prowler Hub URL](https://hub.prowler.com/) of the check. This URL is always composed by `https://hub.prowler.com/check/`.
#### Categories
One or more functional groupings used for execution filtering (e.g., `internet-exposed`). Categories must match the predefined values enforced by `CheckMetadata`; adding a new category requires updating the validator and the metadata documentation.
For the complete list of available categories, see [Categories Guidelines](/developer-guide/check-metadata-guidelines#categories-guidelines).
#### DependsOn
List of check IDs for checks that, when compliant, imply that this check is also compliant or produces no findings.
#### RelatedTo
List of check IDs of checks that are conceptually related, even if they do not share a technical dependency.
#### Notes
Any additional information not covered in the above fields.
### Python Model Reference
The metadata structure is enforced in code using a Pydantic model. For reference, see the [`CheckMetadata`](https://github.com/prowler-cloud/prowler/blob/master/prowler/lib/check/models.py) model.
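Independently of the real `CheckMetadata` model, a few of the field rules documented above can be linted with the standard library alone. This is a rough sketch, not Prowler's validator: `lint_metadata`, `SEVERITIES`, and `MAX_LENGTHS` are hypothetical names, and only the severity values, the length limits, and the no-repeated-URLs rule stated in this guide are covered.

```python
import json

# Values and limits as documented in this guide; the names are hypothetical.
SEVERITIES = {"critical", "high", "medium", "low", "informational"}
MAX_LENGTHS = {"CheckTitle": 150, "Description": 400, "Risk": 400}


def lint_metadata(raw_json: str) -> list[str]:
    """Return human-readable violations of the documented field rules."""
    metadata = json.loads(raw_json)
    problems = []
    if metadata.get("Severity") not in SEVERITIES:
        problems.append("Severity must be one of the lowercase predefined values")
    for field, limit in MAX_LENGTHS.items():
        if len(metadata.get(field, "")) > limit:
            problems.append(f"{field} exceeds {limit} characters")
    urls = metadata.get("AdditionalURLs", [])
    if len(urls) != len(set(urls)):
        problems.append("AdditionalURLs contains repeated URLs")
    return problems


example = json.dumps(
    {
        "CheckTitle": "Service resource has security setting enabled",
        "Severity": "Medium",  # wrong case on purpose
        "Description": "Short description.",
        "Risk": "Short risk.",
        "AdditionalURLs": ["https://example.com/a", "https://example.com/a"],
    }
)
for problem in lint_metadata(example):
    print(problem)
```

The sample input triggers two messages: one for the capitalized severity and one for the duplicated URL.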
## Generic Check Patterns and Best Practices
### Common Patterns
* Every check is implemented as a class inheriting from `Check` (from `prowler.lib.check.models`).
* The main logic is implemented in the `execute()` method (**the only method that must be implemented**), which always returns a list of provider-specific report objects (e.g., `CheckReport`), one per finding/resource. If there are no findings/resources, return an empty list.
* **Never** use the provider's client directly; instead, use the service client (e.g., `_client`) and iterate over its resources.
* For each resource, create a provider-specific report object, populate it with metadata, resource details, status (`PASS`, `FAIL`, etc.), and a human-readable `status_extended` message.
* Use the `metadata()` method to attach check metadata to each report.
* Checks are designed to be idempotent and stateless: they do not modify resources, only report on their state.
### Best Practices
* Use clear, actionable, and user-friendly language in `status_extended` to explain the result. Always provide information to identify the resource.
* Use helper functions/utilities for repeated logic to avoid code duplication. Save them in the `lib` folder of the service.
* Handle exceptions gracefully: catch errors per resource, log them, and continue processing other resources.
* Document the check with class-level and function-level docstrings explaining what it does, what it checks, and any caveats or provider-specific behaviors.
* Use type hints for the `execute()` method (e.g., `-> list[CheckReport]`) for clarity and static analysis.
* Ensure checks are efficient; avoid excessive nested loops. If the complexity is high, consider refactoring the check.
* Keep the check logic focused: one check = one control/requirement. Avoid combining unrelated logic in a single check.
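Several of these practices (docstrings, a typed `execute()`, identifying the resource in `status_extended`, and per-resource error handling) can be seen together in the sketch below. It is self-contained and therefore does not use Prowler's real classes: `FakeBucket`, `FakeReport`, `FakeStorageClient`, and the check name are hypothetical stand-ins, and an actual check would inherit from `Check` and use the provider's report class.

```python
import logging
from dataclasses import dataclass
from typing import Optional

logger = logging.getLogger(__name__)


@dataclass
class FakeBucket:
    """Hypothetical resource, as a service client might expose it."""

    name: str
    encrypted: Optional[bool]  # None simulates data that could not be fetched


@dataclass
class FakeReport:
    """Hypothetical stand-in for a provider-specific report class."""

    resource_name: str
    status: str = ""
    status_extended: str = ""


class FakeStorageClient:
    """Hypothetical stand-in for a service client."""

    def __init__(self) -> None:
        self.buckets = [
            FakeBucket("logs", True),
            FakeBucket("tmp", None),
            FakeBucket("public", False),
        ]


storage_client = FakeStorageClient()


class storage_bucket_encryption_enabled:
    """Report whether each storage bucket has encryption enabled."""

    def execute(self) -> list[FakeReport]:
        """Return one report per bucket whose state could be evaluated."""
        findings = []
        for bucket in storage_client.buckets:
            try:
                if bucket.encrypted is None:
                    raise ValueError("encryption state is unavailable")
                report = FakeReport(resource_name=bucket.name)
                if bucket.encrypted:
                    report.status = "PASS"
                    report.status_extended = (
                        f"Storage bucket {bucket.name} has encryption enabled."
                    )
                else:
                    report.status = "FAIL"
                    report.status_extended = (
                        f"Storage bucket {bucket.name} does not have encryption enabled."
                    )
                findings.append(report)
            except Exception as error:
                # Log the failure and keep processing the remaining resources.
                logger.error("Skipping bucket %s: %s", bucket.name, error)
        return findings


for finding in storage_bucket_encryption_enabled().execute():
    print(finding.status, finding.status_extended)
```

The bucket whose state is unavailable is logged and skipped, so one failing resource does not prevent the other two from being reported.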
## Specific Check Patterns
Details for specific providers can be found in documentation pages named using the pattern `-details`.
Checks that scan resources for plaintext secrets follow a dedicated batched structure. Refer to [Secret-Scanning Checks](/developer-guide/secret-scanning-checks) before creating or updating one.
# Configurable Checks in Prowler
Source: https://docs.prowler.com/developer-guide/configurable-checks
Prowler empowers users to extend and adapt cloud security coverage by making checks configurable through the use of the `audit_config` object. This approach enables customization of checks to meet specific requirements through a configuration file.
## Understanding the `audit_config` Object
The `audit_config` object is a dictionary attached to each provider's service client (for example, `_client.audit_config`). This object loads configuration values from the main configuration file (`prowler/config/config.yaml`). Use `audit_config` to make checks flexible and user-configurable.
## Using `audit_config` to Configure Checks
Retrieve configuration values in a check by using the `.get()` method on the `audit_config` object. For example, to get the minimum number of Availability Zones for Lambda from the configuration file, use the following code. If the value is not set in the configuration, the check defaults to 2:
```python theme={null}
LAMBDA_MIN_AZS = awslambda_client.audit_config.get("lambda_min_azs", 2)
```
Always provide a default value in `.get()` to ensure the check works even if the configuration is missing the variable.
### Example: Security Group Rule Limit
```python title="ec2_securitygroup_with_many_ingress_egress_rules.py" theme={null}
class ec2_securitygroup_with_many_ingress_egress_rules(Check):
    def execute(self):
        findings = []
        max_security_group_rules = ec2_client.audit_config.get(
            "max_security_group_rules", 50
        )
        for security_group_arn, security_group in ec2_client.security_groups.items():
            # ... check logic ...
```
## Required File Updates for Configurable Variables
When adding a new configurable check to Prowler, update the following files:
* **Configuration File:** Add the new variable under the relevant provider or service section in `prowler/config/config.yaml`.
```yaml theme={null}
# aws.awslambda_function_vpc_multi_az
lambda_min_azs: 2
```
* **Provider Schema:** Add the typed field to the provider's Pydantic schema in `prowler/config/schema/.py`. This is required: the loader validates user configs against these schemas and the shipped `config.yaml` must round-trip with zero warnings. See [Adding a Parameter to the Provider Schema](#adding-a-parameter-to-the-provider-schema) below.
* **Test Fixtures:** If tests depend on this configuration, add the variable to `tests/config/fixtures/config.yaml`.
* **Documentation:** Document the new variable in the list of configurable checks in [Configuration File](/user-guide/cli/tutorials/configuration_file) (`docs/user-guide/cli/tutorials/configuration_file.mdx`).
For a complete list of checks that already support configuration, see the [Configuration File Tutorial](/user-guide/cli/tutorials/configuration_file).
Because a configurable check's verdict depends on the `audit_config` value it reads, a compliance requirement can lose meaning if the scan ran with a looser threshold than the control demands. Compliance frameworks can guard against this with **configuration guardrails**: a requirement declares the strictest configuration it tolerates and is forced to FAIL when the scan's config falls short. See [Configuration Guardrails for Requirements](/developer-guide/security-compliance-framework#configuration-guardrails-for-requirements).
## Adding a Parameter to the Provider Schema
Most providers have a typed Pydantic schema in `prowler/config/schema/`, registered in `prowler/config/schema/registry.py`. When a config is loaded and the provider has a registered schema, `validate_provider_config` checks each user-supplied key against it, logs a warning, and drops any field that fails validation. The consumer's `.get(key, default)` then falls back to the built-in default. Providers without a registered schema are passed through unchanged.
This catches typos in a value (for example, `0.2` typed as `20`, or `"medium"` for an enum that expects `"MEDIUM"`). It does NOT catch typos in a key name: `disalowed_regions` (one `l` missing) is treated as an unknown key and passes through untouched, because third-party check plugins legitimately rely on unknown keys being preserved. Reviewers should still check that any new key the YAML adds is named exactly the same as the field on the schema.
### Where to Add the Field
1. Open `prowler/config/schema/.py` (for example, `aws.py`).
2. Add a field on the provider's schema class. Always make it `Optional[...] = None` so the absence of the key is valid.
3. Apply the tightest type the value allows. Examples below.
If you are introducing an entirely new provider rather than a new parameter, also add an entry mapping the provider name to its schema class in `prowler/config/schema/registry.py`. The loader uses that registry to find the schema for the provider it is loading.
### Choosing the Right Type
| Value kind | Field declaration |
| ---------------------------------------- | ------------------------------------------------------- |
| Boolean toggle | `Optional[bool] = None` |
| Strictly positive integer (days, counts) | `Optional[int] = Field(default=None, gt=0)` |
| Fraction in 0..1 (threshold) | `Optional[float] = Field(default=None, ge=0.0, le=1.0)` |
| Closed set of strings | `Optional[Literal["A", "B", "C"]] = None` |
| Free-form string | `Optional[str] = None` |
| List of strings or ints | `Optional[list[str]] = None` |
Prefer `Literal[...]` over `str` whenever the value is one of a known set. Prefer `Field(gt=0)` over `int` whenever zero or negative would be nonsensical. The point of the schema is to catch real-world mistakes that previously passed silently.
### Custom Validators (Only When Needed)
If the value has structural rules beyond type and range, add a `field_validator`. Examples already in `aws.py`:
* `_validate_port_range` rejects ports outside `0..65535`.
* `_validate_account_ids` rejects anything that isn't a 12-digit AWS account ID.
* `_validate_trusted_ips` rejects entries that aren't a valid IP or CIDR.
Raise `ValueError` from the validator. The framework converts the error into a warning and drops the offending key.
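The core of such a validator is an ordinary function that returns the value or raises `ValueError`. The sketch below shows that shape for 12-digit account IDs using only the standard library; `check_account_ids` is a hypothetical name and is not the actual `_validate_account_ids` from `aws.py`, which is attached to the schema through Pydantic.

```python
import re

# Exactly twelve ASCII digits, as required for an AWS account ID.
ACCOUNT_ID_PATTERN = re.compile(r"[0-9]{12}")


def check_account_ids(values: list[str]) -> list[str]:
    """Return the list unchanged, or raise ValueError on the first bad entry."""
    for value in values:
        if not ACCOUNT_ID_PATTERN.fullmatch(value):
            raise ValueError(f"{value!r} is not a 12-digit AWS account ID")
    return values


print(check_account_ids(["123456789012", "000000000000"]))
try:
    check_account_ids(["123456789012", "12345"])
except ValueError as error:
    print(f"rejected: {error}")
```

Using `fullmatch` rather than `match` ensures that longer strings that merely start with twelve digits are rejected as well.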
### Example: Adding a New Parameter
Say a new check needs `max_iam_role_session_hours`, a strictly positive integer that defaults to 12 in code.
1. **Schema** (`prowler/config/schema/aws.py`):
```python theme={null}
# IAM
max_iam_role_session_hours: Optional[int] = Field(default=None, gt=0)
```
2. **Shipped config** (`prowler/config/config.yaml`):
```yaml theme={null}
# aws.iam_role_session_duration_within_limit
max_iam_role_session_hours: 12
```
3. **Consumer** (the check):
```python theme={null}
max_hours = iam_client.audit_config.get("max_iam_role_session_hours", 12)
```
4. **Tests** in `tests/config/schema/aws_schema_test.py`:
* one test for a valid value that round-trips,
* one test for an invalid value (zero, negative, wrong type) that is dropped.
### What the Loader Guarantees
* **Unknown keys pass through.** Third-party check plugins can introduce arbitrary keys without schema edits; they will not be filtered.
* **Invalid values never crash the run.** They produce a single warning per field and the key is dropped.
* **Coerced values are normalized.** A YAML-quoted `"180"` for an `int` field arrives downstream as the integer `180`.
* **The shipped `config.yaml` must round-trip cleanly.** The integration test `test_shipped_default_config_loads_without_warnings` will fail if a key is added to the YAML without a matching schema field, so the two stay in sync.
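The first three guarantees can be illustrated with a small standard-library sketch. It is not the real loader, which validates against Pydantic schemas: `validate_config`, `SCHEMA`, and `positive_int` are hypothetical names, and the coercion here is looser than a typed schema would be.

```python
import logging

logger = logging.getLogger(__name__)


def positive_int(value):
    """Coerce to int and require a strictly positive result."""
    number = int(value)
    if number <= 0:
        raise ValueError("must be greater than 0")
    return number


# Hypothetical stand-in for a provider schema: key -> validating converter.
SCHEMA = {"max_iam_role_session_hours": positive_int}


def validate_config(user_config: dict) -> dict:
    """Return a cleaned copy of the config, dropping values that fail validation."""
    cleaned = {}
    for key, value in user_config.items():
        validator = SCHEMA.get(key)
        if validator is None:
            cleaned[key] = value  # unknown keys pass through untouched
            continue
        try:
            cleaned[key] = validator(value)  # coerced values are normalized
        except (TypeError, ValueError) as error:
            # Invalid values produce a warning and the key is dropped.
            logger.warning("Dropping %s=%r: %s", key, value, error)
    return cleaned


good = validate_config({"max_iam_role_session_hours": "180", "plugin_key": "x"})
bad = validate_config({"max_iam_role_session_hours": 0})
print(good)
print(bad.get("max_iam_role_session_hours", 12))
```

Because the invalid key is removed rather than kept, the consumer's `.get(key, default)` call falls back to the built-in default without any special handling in the check.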
## Configuration Value Limits
Configurable thresholds enforce hard limits. A value outside the documented range is **dropped with a warning** and the check falls back to its built-in default (the same as if the key were absent). These bounds are intentionally conservative: they are not the absolute service maxima but the range that still produces a meaningful security check.
Use this section as the reference when upgrading an existing config: if a value you set is being rejected, it is outside the range below.
Only fields with a numeric range, a fixed value set, or a length cap are listed. Fields typed as free-form strings or lists (for example `disallowed_regions`, `secrets_ignore_patterns`, `trusted_account_ids`) have no range limit — they are validated for shape only (a 12-digit account ID, a valid IP/CIDR, a dotted version string), not for magnitude.
### AWS
| Key | Allowed range | Notes |
| ------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------ |
| `max_unused_access_keys_days` | `30..180` days | CIS AWS 1.13 recommends 45; NIST IA-5 ≤90 |
| `max_console_access_days` | `30..180` days | CIS AWS 1.12 recommends 45 |
| `max_unused_sagemaker_access_days` | `7..180` days | |
| `max_security_group_rules` | `1..1000` | AWS hard limit is 1000 rules per security group |
| `max_ec2_instance_age_in_days` | `1..1095` days | 3 years |
| `ec2_high_risk_ports` | each port `1..65535` | port 0 is reserved |
| `max_idle_disconnect_timeout_in_seconds` | `60..1800` s | NIST AC-12: cap at 30 min |
| `max_disconnect_timeout_in_seconds` | `60..3600` s | |
| `max_session_duration_seconds` | `600..86400` s | 10 min .. 24 h (AppStream per-session hard limit) |
| `lambda_min_azs` | `1..6` | |
| `recommended_cdk_bootstrap_version` | `1..100` | |
| `log_group_retention_days` | one of `1, 3, 5, 7, 14, 30, 60, 90, 120, 150, 180, 365, 400, 545, 731, 1096, 1827, 2192, 2557, 2922, 3288, 3653` | only the CloudWatch Logs API-accepted retention values |
| `threat_detection_privilege_escalation_threshold` | `0.0..1.0` | fraction of suspicious actions |
| `threat_detection_privilege_escalation_minutes` | `5..43200` min | under 5 min the signal is mostly false positives |
| `threat_detection_enumeration_threshold` | `0.0..1.0` | |
| `threat_detection_enumeration_minutes` | `5..43200` min | |
| `threat_detection_llm_jacking_threshold` | `0.0..1.0` | |
| `threat_detection_llm_jacking_minutes` | `5..43200` min | |
| `days_to_expire_threshold` (ACM) | `7..365` days | PCI-DSS 4.2.1.1: alert ≥30 days before expiry |
| `elb_min_azs` | `1..6` | |
| `elbv2_min_azs` | `1..6` | |
| `minimum_snapshot_retention_period` | `1..35` days | ElastiCache service hard limit |
| `max_days_secret_unused` | `7..365` days | |
| `max_days_secret_unrotated` | `1..180` days | NIST IA-5: rotate quarterly; CIS ≤90 |
| `min_kinesis_stream_retention_hours` | `24..8760` h | 1 day .. 1 year |
| `shodan_api_key` | ≤512 chars | |
### Azure
| Key | Allowed range | Notes |
| --------------------------------------------- | -------------- | -------------------------------------------------------------------- |
| `vm_backup_min_daily_retention_days` | `7..9999` days | Azure Backup hard limit; under 7 days defeats DR/ransomware recovery |
| `apim_threat_detection_llm_jacking_threshold` | `0.0..1.0` | fraction of suspicious actions |
| `apim_threat_detection_llm_jacking_minutes` | `5..43200` min | under 5 min the signal is mostly false positives |
| `shodan_api_key` | ≤512 chars | |
### GCP
| Key | Allowed range | Notes |
| ---------------------------- | -------------- | ------- |
| `mig_min_zones` | `1..5` | |
| `max_snapshot_age_days` | `1..1095` days | 3 years |
| `max_unused_account_days` | `30..365` days | |
| `storage_min_retention_days` | `1..3650` days | |
| `shodan_api_key` | ≤512 chars | |
### Kubernetes
| Key | Allowed range | Notes |
| --------------------- | -------------- | ----------------------------------------- |
| `audit_log_maxbackup` | `2..1000` | CIS Kubernetes 1.2.18 recommends ≥10 |
| `audit_log_maxsize` | `10..10000` MB | CIS Kubernetes 1.2.19 recommends ≥100 MB |
| `audit_log_maxage` | `7..3650` days | CIS Kubernetes 1.2.17 recommends ≥30 days |
### M365
| Key | Allowed range | Notes |
| ----------------------------------------------- | --------------- | ----------------------------------------------------------- |
| `sign_in_frequency` | `1..168` h | 1 h .. 7 days; Conditional Access baseline for admins ≤24 h |
| `recommended_mailtips_large_audience_threshold` | `5..10000` | Microsoft default 25 |
| `audit_log_age` | `30..3650` days | M365 E3 default 90 days; SEC/FINRA require ≥7 years |
### GitHub
| Key | Allowed range | Notes |
| -------------------------------------- | --------------- | ------------------------- |
| `inactive_not_archived_days_threshold` | `30..3650` days | CIS GitHub recommends 180 |
### Cloudflare
| Key | Allowed range | Notes |
| ------------- | ------------- | ------------------ |
| `max_retries` | `0..10` | 0 disables retries |
### MongoDB Atlas
| Key | Allowed range | Notes |
| ------------------------------------------- | ------------- | -------------- |
| `max_service_account_secret_validity_hours` | `1..720` h | 1 h .. 30 days |
### Vercel
| Key | Allowed range | Notes |
| --------------------------------- | --------------- | -------------------------------------------------- |
| `days_to_expire_threshold` | `7..365` days | PCI-DSS 4.2.1.1: alert ≥30 days before expiry |
| `stale_token_threshold_days` | `30..3650` days | NIST AC-2(3) typical window 30..90 days |
| `stale_invitation_threshold_days` | `7..365` days | |
| `max_owner_percentage` | `1..50` % | guidance recommends ≤25% |
| `max_owners` | `1..1000` | absolute cap, overrides percentage for large teams |
These bounds live in the provider schemas under `prowler/config/schema/`; each field's `Field(ge=..., le=...)` (or `field_validator`) is the source of truth and the descriptions there carry the full rationale.
This approach ensures that checks are easily configurable, making Prowler highly adaptable to different environments and requirements.
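The bounds in the tables above can be pictured as simple range checks. The following is a minimal, dependency-free sketch of that idea, not the actual Prowler schema code: the real schemas use `Field(ge=..., le=...)` and `field_validator`, whereas `Bound`, `BOUNDS`, and `validate` below are hypothetical names introduced only for illustration, and only a handful of keys are included.

```python
from dataclasses import dataclass

# CloudWatch Logs retention values copied from the AWS table above.
ALLOWED_RETENTION_DAYS = {
    1, 3, 5, 7, 14, 30, 60, 90, 120, 150, 180, 365, 400,
    545, 731, 1096, 1827, 2192, 2557, 2922, 3288, 3653,
}


@dataclass(frozen=True)
class Bound:
    """Inclusive numeric range, a plain-Python analogue of Field(ge=..., le=...)."""

    low: float
    high: float

    def check(self, key: str, value: float) -> None:
        if not (self.low <= value <= self.high):
            raise ValueError(f"{key}={value} is outside {self.low}..{self.high}")


# Hypothetical registry holding a few of the bounds listed in the tables.
BOUNDS = {
    "max_console_access_days": Bound(30, 180),
    "lambda_min_azs": Bound(1, 6),
    "threat_detection_enumeration_threshold": Bound(0.0, 1.0),
}


def validate(config: dict) -> list[str]:
    """Return one message per invalid value; an empty list means the config passes."""
    errors = []
    for key, value in config.items():
        if key == "log_group_retention_days":
            # "One of" constraint: only the listed retention values are accepted.
            if value not in ALLOWED_RETENTION_DAYS:
                errors.append(f"{key}={value} is not an accepted retention value")
        elif key in BOUNDS:
            try:
                BOUNDS[key].check(key, value)
            except ValueError as error:
                errors.append(str(error))
    return errors
```

For example, `validate({"max_console_access_days": 10})` reports a single out-of-range error, while a value of `45` passes.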
# Debugging in Prowler
Source: https://docs.prowler.com/developer-guide/debugging
Debugging in Prowler simplifies the development process, allowing developers to efficiently inspect and resolve unexpected issues during execution.
## Debugging with Visual Studio Code
Visual Studio Code (also referred to as VSCode) provides an integrated debugger for executing and analyzing Prowler code. Refer to the official VSCode debugger [documentation](https://code.visualstudio.com/docs/editor/debugging) for detailed instructions.
### Debugging Configuration Example
The following file is an example of a [debugging configuration](https://code.visualstudio.com/docs/editor/debugging#_launch-configurations) file for [Visual Studio Code](https://code.visualstudio.com/).
This file must be placed inside the *.vscode* directory and named *launch.json*:
```json theme={null}
{
"version": "0.2.0",
"configurations": [
{
"name": "Debug AWS Check",
"type": "debugpy",
"request": "launch",
"program": "prowler-cli.py",
"args": [
"aws",
"--log-level",
"ERROR",
"-c",
""
],
"console": "integratedTerminal",
"justMyCode": false
},
{
"name": "Debug Azure Check",
"type": "debugpy",
"request": "launch",
"program": "prowler-cli.py",
"args": [
"azure",
"--sp-env-auth",
"--log-level",
"ERROR",
"-c",
""
],
"console": "integratedTerminal",
"justMyCode": false
},
{
"name": "Debug GCP Check",
"type": "debugpy",
"request": "launch",
"program": "prowler-cli.py",
"args": [
"gcp",
"--log-level",
"ERROR",
"-c",
""
],
"console": "integratedTerminal",
"justMyCode": false
},
{
"name": "Debug K8s Check",
"type": "debugpy",
"request": "launch",
"program": "prowler-cli.py",
"args": [
"kubernetes",
"--log-level",
"ERROR",
"-c",
""
],
"console": "integratedTerminal",
"justMyCode": false
}
]
}
```
In each configuration, replace the empty string that follows `-c` with the name of the check to debug.
# Contributing to Documentation
Source: https://docs.prowler.com/developer-guide/documentation
Prowler documentation is built using [Mintlify](https://www.mintlify.com/docs), allowing contributors to easily add or enhance documentation.
## Documentation Structure
The Prowler documentation is organized into several sections. The main ones are:
* **Getting Started**: Provides an overview of the Prowler platform and its different solutions, including Prowler Cloud/App, Prowler CLI, Prowler MCP Server, Prowler Hub, and Prowler Lighthouse AI. This section helps new users understand which Prowler solution best fits their needs and includes product comparisons.
* **Guides**: Contains practical tutorials and how-to guides organized by product (Prowler Cloud/App, CLI) and provider (AWS, Azure, GCP, Kubernetes, Microsoft 365, GitHub, etc.). This section covers authentication, integrations, compliance, and advanced usage scenarios.
* **Developer Guide**: Documentation for contributors looking to extend Prowler functionality. This includes guides on creating providers, services, checks, output formats, integrations, and compliance frameworks. Provider-specific implementation details and testing strategies are also covered here.
* **Troubleshooting**: Common issues, error messages, and their solutions. This section helps users resolve problems encountered during installation, configuration, or execution.
## AI-Driven Documentation
As mentioned in the [Introduction](/developer-guide/introduction#ai-driven-contributions), we have specialized resources to enhance AI-driven development.
This includes the [AGENTS.md](https://github.com/prowler-cloud/prowler/blob/master/docs/AGENTS.md) file that contains the guidelines and style guide for the AI agents in the Prowler documentation.
## Local Development
Install the Mintlify CLI globally with npm:
```bash theme={null}
npm install --global mint@4.2.560
```
For detailed instructions, check the [Mintlify documentation](https://www.mintlify.com/docs/installation).
Start the local development server to preview changes in real-time.
```bash theme={null}
mint dev
```
A local preview of your documentation will be available at [http://localhost:3000](http://localhost:3000).
Edit existing Markdown (.mdx) files inside the `docs` directory or add new documents.
For reference about formatting, check the [Mintlify documentation](https://www.mintlify.com/docs/create/text).
To add new sections or files, update the [`docs/docs.json`](https://github.com/prowler-cloud/prowler/blob/master/docs/docs.json) file to include them in the navigation.
Once documentation updates are complete, [submit a pull request for review](/developer-guide/introduction#sending-the-pull-request).
The Prowler team will assess and merge contributions.
Your efforts help improve Prowler documentation. Thank you for contributing! 🤘
# End-2-End Tests for Prowler App
Source: https://docs.prowler.com/developer-guide/end2end-testing
End-to-end (E2E) tests validate complete user flows in Prowler App (UI + API). These tests are implemented with [Playwright](https://playwright.dev/) under the `ui/tests` folder and are designed to run against a Prowler App environment.
## General Recommendations
When adding or maintaining E2E tests for Prowler App, follow these guidelines:
1. **Test real user journeys**
Focus on full workflows (for example, sign-up → login → add provider → launch scan) instead of low-level UI details already covered by unit or integration tests.
2. **Group tests by entity or feature area**
* Organize E2E tests by entity or feature area (for example, `providers.spec.ts`, `scans.spec.ts`, `invitations.spec.ts`, `sign-up.spec.ts`).
* Each entity should have its own test file and corresponding page model class (for example, `ProvidersPage`, `ScansPage`, `InvitationsPage`).
* Related tests for the same entity should be grouped together in the same test file to improve maintainability and make it easier to find and update tests for a specific feature.
3. **Use a Page Model (Page Object Model)**
* Encapsulate selectors and common actions in page classes instead of repeating them in each test.
* Leverage and extend the existing Playwright page models in `ui/tests`—such as `ProvidersPage`, `ScansPage`, and others—which are all based on the shared `BasePage`.
* Page models for Prowler App pages should be placed in their respective entity folders (for example, `ui/tests/providers/providers-page.ts`).
* Page models for external pages (not part of Prowler App) should be grouped in the `external` folder (for example, `ui/tests/external/github-page.ts`).
* This approach improves readability, reduces duplication, and makes refactors safer.
4. **Reuse authentication states (StorageState)**
* Multiple authentication setup projects are available that generate pre-authenticated state files stored in `playwright/.auth/`. Each project requires specific environment variables:
* `admin.auth.setup` – Admin users with full system permissions (requires `E2E_ADMIN_USER` / `E2E_ADMIN_PASSWORD`)
* `manage-scans.auth.setup` – Users with scan management permissions (requires `E2E_MANAGE_SCANS_USER` / `E2E_MANAGE_SCANS_PASSWORD`)
* `manage-integrations.auth.setup` – Users with integration management permissions (requires `E2E_MANAGE_INTEGRATIONS_USER` / `E2E_MANAGE_INTEGRATIONS_PASSWORD`)
* `manage-account.auth.setup` – Users with account management permissions (requires `E2E_MANAGE_ACCOUNT_USER` / `E2E_MANAGE_ACCOUNT_PASSWORD`)
* `manage-cloud-providers.auth.setup` – Users with cloud provider management permissions (requires `E2E_MANAGE_CLOUD_PROVIDERS_USER` / `E2E_MANAGE_CLOUD_PROVIDERS_PASSWORD`)
* `unlimited-visibility.auth.setup` – Users with unlimited visibility permissions (requires `E2E_UNLIMITED_VISIBILITY_USER` / `E2E_UNLIMITED_VISIBILITY_PASSWORD`)
* `invite-and-manage-users.auth.setup` – Users with user invitation and management permissions (requires `E2E_INVITE_AND_MANAGE_USERS_USER` / `E2E_INVITE_AND_MANAGE_USERS_PASSWORD`)
If fixtures have been applied (fixtures are used to populate the database with initial development data), you can use the user `e2e@prowler.com` with password `Thisisapassword123@` to configure the Admin credentials by setting `E2E_ADMIN_USER=e2e@prowler.com` and `E2E_ADMIN_PASSWORD=Thisisapassword123@`.
* Within test files, use `test.use({ storageState: "playwright/.auth/admin_user.json" })` to load the pre-authenticated state, avoiding redundant authentication steps in each test. Call it at the top level of the file or inside a `test.describe` block (not inside a test function) so that the authentication state applies to all tests in that scope. This approach is preferred over declaring dependencies in `playwright.config.ts` because it provides more control over which authentication states are used in specific tests.
**Example:**
```typescript theme={null}
// Use admin authentication state for all tests in this scope
test.use({ storageState: "playwright/.auth/admin_user.json" });
test("should perform admin action", async ({ page }) => {
// Test implementation
});
```
5. **Tag and document scenarios**
* Follow the existing naming convention for suites and test cases (for example, `SCANS-E2E-001`, `PROVIDER-E2E-003`) and use tags such as `@e2e`, `@serial`, and feature tags (for example, `@providers`, `@scans`, `@aws`) to filter and organize tests.
**Example:**
```typescript theme={null}
test(
"should add a new AWS provider with static credentials",
{
tag: [
"@critical",
"@e2e",
"@providers",
"@aws",
"@serial",
"@PROVIDER-E2E-001",
],
},
async ({ page }) => {
// Test implementation
}
);
```
* Document each one in the Markdown files under `ui/tests`, including **Priority**, **Tags**, **Description**, **Preconditions**, **Flow steps**, **Expected results**, **Key verification points**, and **Notes**.
**Example**
```Markdown theme={null}
## Test Case: `SCANS-E2E-001` - Execute On-Demand Scan
**Priority:** `critical`
**Tags:**
- type → @e2e, @serial
- feature → @scans
**Description/Objective:** Validates the complete flow to execute an on-demand scan selecting a provider by UID and confirming success on the Scans page.
**Preconditions:**
- Admin user authentication required (admin.auth.setup project)
- Environment variables configured for: E2E_AWS_PROVIDER_ACCOUNT_ID, E2E_AWS_PROVIDER_ACCESS_KEY, and E2E_AWS_PROVIDER_SECRET_KEY
- Remove any existing AWS provider with the same Account ID before starting the test
- This test must be run serially and never in parallel with other tests, as it requires the Account ID Provider to be already registered.
### Flow Steps:
1. Navigate to Scans page
2. Open provider selector and choose the entry whose text contains E2E_AWS_PROVIDER_ACCOUNT_ID
3. Optionally fill scan label (alias)
4. Click "Start now" to launch the scan
5. Verify the success toast appears
6. Verify a row in the Scans table contains the provided scan label (or shows the new scan entry)
### Expected Result:
- Scan is launched successfully
- Success toast is displayed to the user
- Scans table displays the new scan entry (including the alias when provided)
### Key verification points:
- Scans page loads correctly
- Provider select is available and lists the configured provider UID
- "Start now" button is rendered and enabled when form is valid
- Success toast message: "The scan was launched successfully."
- Table contains a row with the scan label or new scan state (queued/available/executing)
### Notes:
- The table may take a short time to reflect the new scan; assertions look for a row containing the alias.
- Provider cleanup performed before each test to ensure clean state
- Tests should run serially to avoid state conflicts.
```
6. **Use environment variables for secrets and dynamic data**
Credentials, provider identifiers, secrets, and tokens must come from environment variables (for example, `E2E_AWS_PROVIDER_ACCOUNT_ID`, `E2E_AWS_PROVIDER_ACCESS_KEY`, `E2E_AWS_PROVIDER_SECRET_KEY`, `E2E_GCP_PROJECT_ID`).
Never commit real secrets, tokens, or account IDs to the repository.
7. **Keep tests deterministic and isolated**
* Use Playwright's `test.beforeEach()` and `test.afterEach()` hooks to manage test state:
* **`test.beforeEach()`**: Execute cleanup or setup logic before each test runs (for example, delete existing providers with a specific account ID to ensure a clean state).
* **`test.afterEach()`**: Execute cleanup logic after each test completes (for example, remove test data created during the test execution to prevent interference with subsequent tests).
* Define tests as serial using `test.describe.serial()` when they share state or resources that could interfere with parallel execution (for example, tests that use the same provider account ID or create dependent resources). This ensures tests within the serial group run sequentially, preventing race conditions and data conflicts.
* Use unique identifiers (for example, random suffixes for emails or labels) to prevent data collisions.
8. **Use explicit waiting strategies**
* Avoid using `waitForLoadState('networkidle')` as it is unreliable and can lead to flaky tests or unnecessary delays.
* Leverage Playwright's auto-waiting capabilities by waiting for specific elements to be actionable (for example, `locator.click()`, `locator.fill()`, `locator.waitFor()`).
* **Prioritize selector strategies**: Prefer `page.getByRole()` over other approaches like `page.getByText()`. `getByRole()` is more resilient to UI changes, aligns with accessibility best practices, and better reflects how users interact with the application (by role and accessible name rather than implementation details).
* For dynamic content, wait for specific UI elements that indicate the page is ready (for example, a button becoming enabled or a specific text appearing).
* This approach makes tests more reliable, faster, and aligned with how users actually interact with the application.
**Common waiting patterns used in Prowler E2E tests:**
* **Element visibility assertions**: Use `expect(locator).toBeVisible()` or `expect(locator).not.toBeVisible()` to wait for elements to appear or disappear (Playwright automatically waits for these conditions).
* **URL changes**: Use `expect(page).toHaveURL(url)` or `page.waitForURL(url)` to wait for navigation to complete.
* **Element states**: Use `locator.waitFor({ state: "visible" })` or `locator.waitFor({ state: "hidden" })` when you need explicit state control.
* **Text content**: Use `expect(locator).toHaveText(text)` or `expect(locator).toContainText(text)` to wait for specific text to appear.
* **Element attributes**: Use `expect(locator).toHaveAttribute(name, value)` to wait for attributes like `aria-disabled="false"` indicating a button is enabled.
* **Custom conditions**: Use `page.waitForFunction(() => condition)` for complex conditions that cannot be expressed with locators (for example, checking DOM element dimensions or computed styles).
* **Retryable assertions**: Use `expect(async () => { ... }).toPass({ timeout })` for conditions that may take time to stabilize (for example, waiting for table rows to filter after a server request).
* **Scroll into view**: Use `locator.scrollIntoViewIfNeeded()` before interacting with elements that may be outside the viewport.
**Example from Prowler tests:**
```typescript theme={null}
// Wait for page to load by checking main content is visible
await expect(page.locator("main")).toBeVisible();
// Wait for URL change after form submission
await expect(page).toHaveURL("/providers");
// Wait for button to become enabled
await expect(submitButton).toHaveAttribute("aria-disabled", "false");
// Wait for loading spinner to disappear
await expect(page.getByText("Loading")).not.toBeVisible();
// Wait for custom condition
await page.waitForFunction(() => {
const main = document.querySelector("main");
return main && main.offsetHeight > 0;
});
// Wait for retryable condition (e.g., table filtering)
await expect(async () => {
const rowCount = await tableRows.count();
expect(rowCount).toBeLessThanOrEqual(1);
}).toPass({ timeout: 20000 });
```
## Running Prowler Tests
E2E tests for Prowler App run from the `ui` project using Playwright. The Playwright configuration lives in `ui/playwright.config.ts` and defines:
* `testDir: "./tests"` – location of E2E test files (relative to the `ui` project root, so `ui/tests`).
* `webServer` – how to start the Next.js development server and connect to Prowler API.
* `use.baseURL` – base URL for browser interactions (defaults to `http://localhost:3000` or `AUTH_URL` if set).
* `reporter: [["list"]]` – uses the list reporter to display test results in a concise format in the terminal. Other reporter options are available (for example, `html`, `json`, `junit`, `github`), and multiple reporters can be configured simultaneously. See the [Playwright reporter documentation](https://playwright.dev/docs/test-reporters) for all available options.
* `expect.timeout: 20000` – timeout for assertions (20 seconds). This is the maximum time Playwright will wait for an assertion to pass before considering it failed.
* **Test artifacts** (in `use` configuration): By default, `trace`, `screenshot`, and `video` are set to `"off"` to minimize resource usage. To review test failures or debug issues, these can be enabled in `playwright.config.ts` by changing them to `"on"`, `"on-first-retry"`, or `"retain-on-failure"` depending on your needs.
* `outputDir: "/tmp/playwright-tests"` – directory where Playwright stores test artifacts (screenshots, videos, traces) during test execution.
* **CI-specific configuration**: The configuration uses different settings when running in CI environments (detected via `process.env.CI`):
* **Retries**: `2` retries in CI (to handle flaky tests), `0` retries locally (for faster feedback during development).
* **Workers**: `1` worker in CI (sequential execution for stability), `undefined` locally (parallel execution by default for faster test runs).
### Prerequisites
Before running E2E tests:
* **Install root and UI dependencies**
* Follow the [developer guide introduction](/developer-guide/introduction#getting-the-code-and-installing-all-dependencies) to clone the repository and install core dependencies.
* From the `ui` directory, install frontend dependencies:
```bash theme={null}
cd ui
pnpm install
pnpm run test:e2e:install # Install Playwright browsers
```
* **Ensure Prowler API is available**
* By default, Playwright uses `UI_API_BASE_URL=http://localhost:8080/api/v1` (configured in `playwright.config.ts`).
* Start Prowler API so it is reachable on that URL (for example, via `docker-compose-dev.yml` or the development orchestration used locally).
* If a different API URL is required, set `UI_API_BASE_URL` accordingly before running the tests.
* **Ensure Prowler App UI is available**
* Playwright automatically starts the Next.js server through the `webServer` block in `playwright.config.ts` (`pnpm run dev` by default).
* If the UI is already running on `http://localhost:3000`, Playwright will reuse the existing server when `reuseExistingServer` is `true`.
* **Configure E2E environment variables**
* Suite-specific variables (for example, provider account IDs, credentials, and E2E user data) must be provided before running tests.
* They can be defined either:
* As exported environment variables in the shell before executing the Playwright commands, or
* In a `.env.local` or `.env` file under `ui/`, and then loaded into the shell before running tests, for example:
```bash theme={null}
cd ui
set -a
source .env.local # or .env
set +a
```
* Refer to the Markdown documentation files in `ui/tests` for each E2E suite (for example, the `*.md` files that describe sign-up, providers, scans, invitations, and other flows) to see the exact list of required variables and their meaning.
* Each E2E test suite explicitly checks that its required environment variables are defined at runtime and will fail with a clear error message if any mandatory variable is missing, making misconfiguration easy to detect.
### Executing Tests
To execute E2E tests for Prowler App:
1. **Run the full E2E suite (headless)**
From the `ui` directory:
```bash theme={null}
pnpm run test:e2e
```
This command runs Playwright with the configured projects.
2. **Run E2E tests with the Playwright UI runner**
```bash theme={null}
pnpm run test:e2e:ui
```
This opens the Playwright test runner UI to inspect, debug, and rerun specific tests or projects.
3. **Debug E2E tests interactively**
```bash theme={null}
pnpm run test:e2e:debug
```
Use this mode to step through flows, inspect selectors, and adjust timings. It runs tests in headed mode with debugging tools enabled.
4. **Run tests in headed mode without debugger**
```bash theme={null}
pnpm run test:e2e:headed
```
This is useful to visually confirm flows while still running the full suite.
5. **View previous test reports**
```bash theme={null}
pnpm run test:e2e:report
```
This opens the latest Playwright HTML report, including traces and screenshots when enabled.
6. **Run specific tests or subsets**
In addition to the predefined scripts, Playwright allows filtering which tests run. These examples use the Playwright CLI directly through `pnpm`:
* **By test ID (`@ID` in the test metadata or description)**
To run a single test case identified by its ID (for example, `@PROVIDER-E2E-001` or `@SCANS-E2E-001`):
```bash theme={null}
pnpm playwright test --grep @PROVIDER-E2E-001
```
* **By tags**
To run all tests that share a common tag (for example, all provider E2E tests tagged with `@providers`):
```bash theme={null}
pnpm playwright test --grep @providers
```
This is useful to focus on a specific feature area such as providers, scans, invitations, or sign-up.
* **By Playwright project**
To run only the tests associated with a given project defined in `playwright.config.ts` (for example, `providers` or `scans`):
```bash theme={null}
pnpm playwright test --project=providers
```
Combining project and grep filters is also supported, enabling very narrow runs (for example, a single test ID within the `providers` project). For additional CLI options and combinations, see the [Playwright command line documentation](https://playwright.dev/docs/test-cli).
For detailed flows, preconditions, and environment variable requirements per feature, always refer to the Markdown files in `ui/tests`. Those documents are the single source of truth for business expectations and validation points in each E2E suite.
# Environment Variable Naming Convention
Source: https://docs.prowler.com/developer-guide/environment-variables
Prowler is a monorepo composed of several runtime components — Prowler App (the web user interface), Prowler API (the backend), Prowler SDK, and Prowler MCP Server (Model Context Protocol) — that frequently share a single `.env` file. To keep that shared configuration unambiguous, each component namespaces its environment variables with a component-specific prefix.
## Component Prefixes
Each component owns a dedicated prefix for the environment variables it reads:
| Component | Prefix | Status |
| --------------------- | ------ | ------- |
| Prowler App (web UI) | `UI_` | Adopted |
| Prowler API (backend) | `API_` | Planned |
| Prowler SDK | `SDK_` | Planned |
| Prowler MCP Server | `MCP_` | Planned |
## Why Component Prefixes Matter
Component prefixes solve three concrete problems in a shared configuration file:
* **Collisions in a shared `.env`:** Several components historically read identically named variables. The API base URL, for example, is consumed by more than one component, so a single unprefixed name is ambiguous. A component prefix removes that ambiguity.
* **Explicit ownership:** A prefix states, at a glance, which component consumes a variable.
* **Reduced accidental exposure:** For Prowler App, scoping browser-facing configuration under one intentional prefix prevents server-only values from leaking into the client bundle.
## Prowler App
Prowler App has adopted the `UI_` prefix. Its public configuration is resolved from the container environment at runtime rather than inlined at build time, so a single pre-built image serves any deployment. For the operational details on changing these values without rebuilding the image, see [Troubleshooting](/troubleshooting).
The former build-time variables map to the new runtime variables as follows:
| Former variable | New variable |
| ------------------------------------------------------ | -------------------------- |
| `NEXT_PUBLIC_API_BASE_URL` | `UI_API_BASE_URL` |
| `NEXT_PUBLIC_API_DOCS_URL` | `UI_API_DOCS_URL` |
| `NEXT_PUBLIC_GOOGLE_TAG_MANAGER_ID` | `UI_GOOGLE_TAG_MANAGER_ID` |
| `NEXT_PUBLIC_SENTRY_DSN`, `SENTRY_DSN` | `UI_SENTRY_DSN` |
| `NEXT_PUBLIC_SENTRY_ENVIRONMENT`, `SENTRY_ENVIRONMENT` | `UI_SENTRY_ENVIRONMENT` |
The build-time-only Sentry variables used for source-map upload — `SENTRY_ORG`, `SENTRY_PROJECT`, `SENTRY_AUTH_TOKEN`, and `SENTRY_RELEASE` — keep their names, as they are not part of the App's runtime configuration.
## Upcoming Breaking Change
Adopting the `API_`, `SDK_`, and `MCP_` prefixes for Prowler API, Prowler SDK, and Prowler MCP Server is a planned breaking change in a future release. Migrate environment configuration to the new names when upgrading.
Prowler API, Prowler SDK, and Prowler MCP Server have not yet adopted the convention. In a future release, the variables each of these components reads will be namespaced under `API_`, `SDK_`, and `MCP_` respectively. The per-component mapping from current to prefixed names will be documented when each change is released.
## Deprecated Names
* **Prowler App:** The bare server-side `SENTRY_DSN` and `SENTRY_ENVIRONMENT` are no longer read; the server and edge runtimes now read `UI_SENTRY_DSN` and `UI_SENTRY_ENVIRONMENT`. The former `NEXT_PUBLIC_*` build-time variables are deprecated but still read at runtime as a fallback when the matching `UI_*` variable is unset. This fallback will be removed in a future release, so set the `UI_*` runtime variables on the running container.
* **Prowler API, Prowler SDK, and Prowler MCP Server:** The current, unprefixed variable names are deprecated. They continue to work today and will be removed once the prefixed convention is adopted for each component, as described in [Upcoming Breaking Change](#upcoming-breaking-change).
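The Prowler App fallback described above (prefer the `UI_*` variable, then fall back to the deprecated `NEXT_PUBLIC_*` name) can be sketched as a small lookup. This is an illustration in Python only; Prowler App itself is not written in Python, and `resolve_ui_setting` and `LEGACY_NAMES` are hypothetical names, not part of the codebase. Treating an empty string as unset is also an assumption.

```python
import os
from typing import Mapping, Optional

# Deprecated build-time names from the mapping table, keyed by the new runtime name.
LEGACY_NAMES = {
    "UI_API_BASE_URL": "NEXT_PUBLIC_API_BASE_URL",
    "UI_API_DOCS_URL": "NEXT_PUBLIC_API_DOCS_URL",
    "UI_GOOGLE_TAG_MANAGER_ID": "NEXT_PUBLIC_GOOGLE_TAG_MANAGER_ID",
    "UI_SENTRY_DSN": "NEXT_PUBLIC_SENTRY_DSN",
    "UI_SENTRY_ENVIRONMENT": "NEXT_PUBLIC_SENTRY_ENVIRONMENT",
}


def resolve_ui_setting(name: str, env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Hypothetical helper: return the UI_* value, else the deprecated fallback."""
    value = env.get(name)
    if value:  # assumption: an empty string counts as unset
        return value
    legacy = LEGACY_NAMES.get(name)
    # The bare SENTRY_DSN / SENTRY_ENVIRONMENT names are intentionally not consulted.
    return env.get(legacy) if legacy else None
```

Passing an explicit mapping instead of `os.environ` keeps the lookup easy to exercise in isolation.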
# Google Cloud Provider
Source: https://docs.prowler.com/developer-guide/gcp-details
This page details the [Google Cloud Platform (GCP)](https://cloud.google.com/) provider implementation in Prowler.
By default, Prowler will audit all the GCP projects that the authenticated identity can access. To configure it, follow the [GCP getting started guide](/user-guide/providers/gcp/getting-started-gcp).
## GCP Provider Classes Architecture
The GCP provider implementation follows the general [Provider structure](/developer-guide/provider). This section focuses on the GCP-specific implementation, highlighting how the generic provider concepts are realized for GCP in Prowler. For a full overview of the provider pattern, base classes, and extension guidelines, see [Provider documentation](/developer-guide/provider).
### Main Class
* **Location:** [`prowler/providers/gcp/gcp_provider.py`](https://github.com/prowler-cloud/prowler/blob/master/prowler/providers/gcp/gcp_provider.py)
* **Base Class:** Inherits from `Provider` (see [base class details](https://github.com/prowler-cloud/prowler/blob/master/prowler/providers/common/provider.py)).
* **Purpose:** Central orchestrator for GCP-specific logic, session management, credential validation, project and organization discovery, and configuration.
* **Key GCP Responsibilities:**
* Initializes and manages GCP sessions (supports Application Default Credentials, Service Account, OAuth, and impersonation).
* Validates credentials and sets up the GCP identity context.
* Loads and manages configuration, mutelist, and fixer settings.
* Discovers accessible GCP projects and organization metadata.
* Provides properties and methods for downstream GCP service classes to access session, identity, and configuration data.
### Data Models
* **Location:** [`prowler/providers/gcp/models.py`](https://github.com/prowler-cloud/prowler/blob/master/prowler/providers/gcp/models.py)
* **Purpose:** Define structured data for GCP identity, project, and organization info.
* **Key GCP Models:**
* `GCPIdentityInfo`: Holds GCP identity metadata, such as the profile name.
* `GCPOrganization`: Represents a GCP organization with ID, name, and display name.
* `GCPProject`: Represents a GCP project with number, ID, name, organization, labels, and lifecycle state.
### `GCPService` (Service Base Class)
* **Location:** [`prowler/providers/gcp/lib/service/service.py`](https://github.com/prowler-cloud/prowler/blob/master/prowler/providers/gcp/lib/service/service.py)
* **Purpose:** Abstract base class that all GCP service-specific classes inherit from. This implements the generic service pattern (described in [service page](/developer-guide/services#service-base-class)) specifically for GCP.
* **Key GCP Responsibilities:**
* Receives a `GcpProvider` instance to access session, identity, and configuration.
* Manages clients for all services by project.
* Filters projects to only those with the relevant API enabled.
* Provides `__threading_call__` method to make API calls in parallel by project or resource.
* Exposes common audit context (`project_ids`, `projects`, `default_project_id`, `audit_config`, `fixer_config`) to subclasses.
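The parallel-call idea behind `__threading_call__` can be sketched with the standard library. This is a minimal, self-contained illustration and not Prowler's implementation: `threading_call` and `list_buckets` are hypothetical stand-ins, and the real method's signature may differ.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def threading_call(call, iterator, max_workers=10):
    """Run `call(item)` for every item in parallel and wait for all of them."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(call, item) for item in iterator]
        for future in as_completed(futures):
            future.result()  # re-raise any exception from a worker


# Hypothetical shared resource container, keyed by project ID.
buckets = {}


def list_buckets(project_id):
    # A real service would call the GCP API here; this only fakes a result.
    buckets[project_id] = [f"{project_id}-logs", f"{project_id}-data"]


threading_call(list_buckets, ["project-a", "project-b"])
```

Each project is handled by its own worker, so a slow API response for one project does not block the others.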
### Exception Handling
* **Location:** [`prowler/providers/gcp/exceptions/exceptions.py`](https://github.com/prowler-cloud/prowler/blob/master/prowler/providers/gcp/exceptions/exceptions.py)
* **Purpose:** Custom exception classes for GCP-specific error handling, such as credential, session, and project access errors.
### Session and Utility Helpers
* **Location:** [`prowler/providers/gcp/lib/`](https://github.com/prowler-cloud/prowler/blob/master/prowler/providers/gcp/lib/)
* **Purpose:** Helpers for argument parsing, mutelist management, and other cross-cutting concerns.
## Retry Configuration
GCP services implement automatic retry functionality for rate limiting errors (HTTP 429). This is configured centrally and must be included in all API calls:
### Required Implementation
```python theme={null}
from googleapiclient import discovery
from prowler.providers.gcp.config import DEFAULT_RETRY_ATTEMPTS
# In discovery.build()
client = discovery.build(
    service, version, credentials=credentials,
    num_retries=DEFAULT_RETRY_ATTEMPTS
)
# In request.execute()
response = request.execute(num_retries=DEFAULT_RETRY_ATTEMPTS)
```
### Configuration
* **Default Value**: 3 attempts (configurable in `prowler/providers/gcp/config.py`)
* **Command Line Flag**: `--gcp-retries-max-attempts` for runtime configuration
* **Error Types**: HTTP 429 and quota exceeded errors
* **Backoff Strategy**: Exponential backoff with randomization
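The backoff strategy named above can be illustrated with a short sketch. When `num_retries` is passed, the client library takes care of retrying, so services do not need code like this; it only shows the general idea. `RateLimitError` and `call_with_retries` are hypothetical names introduced for the example.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for an HTTP 429 response."""


def call_with_retries(func, num_retries=3, base_delay=1.0, sleep=time.sleep):
    """Call `func`, retrying on RateLimitError with exponential backoff and jitter."""
    for attempt in range(num_retries + 1):
        try:
            return func()
        except RateLimitError:
            if attempt == num_retries:
                raise  # retries exhausted
            # Wait a random time in [0, base_delay * 2**attempt) before retrying.
            sleep(random.random() * base_delay * 2 ** attempt)


attempts = {"count": 0}


def flaky():
    # Fails twice with a simulated 429, then succeeds.
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RateLimitError("429 Too Many Requests")
    return "ok"


delays = []  # record the waits instead of actually sleeping
result = call_with_retries(flaky, num_retries=3, sleep=delays.append)
```

The random factor spreads retries from parallel workers over time, so they do not all hit the API again at the same moment.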
### Example Service Implementation
```python theme={null}
def _get_instances(self):
    for project_id in self.project_ids:
        try:
            client = discovery.build(
                "compute", "v1", credentials=self.credentials,
                num_retries=DEFAULT_RETRY_ATTEMPTS
            )
            request = client.instances().list(project=project_id)
            response = request.execute(num_retries=DEFAULT_RETRY_ATTEMPTS)
            # Process response...
        except Exception as error:
            logger.error(f"{error.__class__.__name__}: {error}")
```
## Specific Patterns in GCP Services
The generic service pattern is described in [service page](/developer-guide/services#service-structure-and-initialisation). You can find all the currently implemented services in the following locations:
* Directly in the code, in location [`prowler/providers/gcp/services/`](https://github.com/prowler-cloud/prowler/tree/master/prowler/providers/gcp/services)
* In the [Prowler Hub](https://hub.prowler.com/) for a more human-readable view.
The best way to understand how to implement a new service is to follow the [service implementation documentation](/developer-guide/services#adding-a-new-service) and use already implemented services as a reference. The next subsection lists common patterns that are used across all GCP services.
### GCP Service Common Patterns
* Services communicate with GCP using the Google Cloud Python SDK. The documentation for all its services is available [here](https://cloud.google.com/python/docs/reference).
* Every GCP service class inherits from `GCPService`, ensuring access to session, identity, configuration, and client utilities.
* The constructor (`__init__`) always calls `super().__init__` with the service name, provider, region (default "global"), and API version (default "v1"). Usually, the service name is the class name in lowercase, so it is called like `super().__init__(__class__.__name__, provider)`.
* Resource containers **must** be initialized in the constructor, typically as dictionaries that map each resource ID to its resource object.
* Only projects with the API enabled are included in the audit scope.
* Resource discovery and attribute collection can be parallelized using `self.__threading_call__`, typically by region/zone or resource.
* All GCP resources are represented as Pydantic `BaseModel` classes, providing type safety and structured access to resource attributes.
* Every GCP API call is wrapped in a try/except block, and errors are always logged.
* **Retry Configuration**: All `request.execute()` calls must include `num_retries=DEFAULT_RETRY_ATTEMPTS` for automatic retry on rate limiting errors (HTTP 429).
* Tags and additional attributes that cannot be retrieved from the default call should be collected and stored for each resource using dedicated methods and threading.
## Specific Patterns in GCP Checks
The GCP checks pattern is described in [checks page](/developer-guide/checks). You can find all the currently implemented checks:
* Directly in the code: within each service folder, each check has its own folder named after the check (e.g. [`prowler/providers/gcp/services/iam/iam_sa_user_managed_key_unused/`](https://github.com/prowler-cloud/prowler/tree/master/prowler/providers/gcp/services/iam/iam_sa_user_managed_key_unused)).
* In the [Prowler Hub](https://hub.prowler.com/) for a more human-readable view.
The best way to understand how to implement a new check is to follow the [GCP check implementation documentation](/developer-guide/checks#creating-a-check) and use other similar checks as a reference.
### Check Report Class
The `Check_Report_GCP` class models a single finding for a GCP resource in a check report. It is defined in [`prowler/lib/check/models.py`](https://github.com/prowler-cloud/prowler/blob/master/prowler/lib/check/models.py) and inherits from the generic `Check_Report` base class.
#### Purpose
`Check_Report_GCP` extends the base report structure with GCP-specific fields, enabling detailed tracking of the resource, project, and location associated with each finding.
#### Constructor and Attribute Population
When you instantiate `Check_Report_GCP`, you must provide the check metadata and a resource object. The class will attempt to automatically populate its GCP-specific attributes from the resource, using the following logic (in order of precedence):
* **`resource_id`**:
* Uses the explicit `resource_id` argument if provided.
* Otherwise, uses `resource.id` if present.
* Otherwise, uses `resource.name` if present.
* Defaults to an empty string if none are available.
* **`resource_name`**:
* Uses the explicit `resource_name` argument if provided.
* Otherwise, uses `resource.name` if present.
* Defaults to an empty string.
* **`project_id`**:
* Uses the explicit `project_id` argument if provided.
* Otherwise, uses `resource.project_id` if present.
* Defaults to an empty string.
* **`location`**:
* Uses the explicit `location` argument if provided.
* Otherwise, uses `resource.location` if present.
* Otherwise, uses `resource.region` if present.
* Defaults to "global" if none are available.
All these attributes can be overridden by passing the corresponding argument to the constructor. If the resource object does not contain the required attributes, you must set them manually.
Other attributes are inherited from the `Check_Report` class. Of those, you **always** have to set the `status` and `status_extended` attributes in the check logic.
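The precedence rules above can be sketched as a plain function. This is an illustrative sketch only: `resolve_gcp_attributes` is a hypothetical helper, not part of Prowler, and the real constructor may differ in detail (for example, in how it treats empty values).

```python
from types import SimpleNamespace


def resolve_gcp_attributes(resource, resource_id=None, resource_name=None,
                           project_id=None, location=None):
    """Hypothetical helper mirroring the documented order of precedence."""
    return {
        # Explicit argument, then resource.id, then resource.name, then "".
        "resource_id": resource_id
        or getattr(resource, "id", None)
        or getattr(resource, "name", ""),
        "resource_name": resource_name or getattr(resource, "name", ""),
        "project_id": project_id or getattr(resource, "project_id", ""),
        # Explicit argument, then resource.location, then resource.region, then "global".
        "location": location
        or getattr(resource, "location", None)
        or getattr(resource, "region", "global"),
    }


# A stand-in resource with no `id` and no `location` attribute.
bucket = SimpleNamespace(name="my-bucket", project_id="my-project", region="europe-west1")
attrs = resolve_gcp_attributes(bucket)
overridden = resolve_gcp_attributes(bucket, location="us-central1")
```

Here `attrs["resource_id"]` falls back to the resource name and `attrs["location"]` falls back to the region, while `overridden["location"]` takes the explicit argument.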
#### Example Usage
```python theme={null}
report = Check_Report_GCP(
    metadata=check_metadata,
    resource=resource_object,
    resource_id="custom-id",  # Optional override
    resource_name="custom-name",  # Optional override
    project_id="my-gcp-project",  # Optional override
    location="us-central1"  # Optional override
)
report.status = "PASS"
report.status_extended = "Resource is compliant."
```
# GitHub Provider
Source: https://docs.prowler.com/developer-guide/github-details
This page details the [GitHub](https://github.com/) provider implementation in Prowler.
By default, Prowler will audit the GitHub account - scanning all repositories, organizations, and applications that your configured credentials can access. To configure it, follow the [GitHub getting started guide](/user-guide/providers/github/getting-started-github).
## GitHub Provider Classes Architecture
The GitHub provider implementation follows the general [Provider structure](/developer-guide/provider). This section focuses on the GitHub-specific implementation, highlighting how the generic provider concepts are realized for GitHub in Prowler. For a full overview of the provider pattern, base classes, and extension guidelines, see [Provider documentation](/developer-guide/provider).
### `GithubProvider` (Main Class)
* **Location:** [`prowler/providers/github/github_provider.py`](https://github.com/prowler-cloud/prowler/blob/master/prowler/providers/github/github_provider.py)
* **Base Class:** Inherits from `Provider` (see [base class details](https://github.com/prowler-cloud/prowler/blob/master/prowler/providers/common/provider.py)).
* **Purpose:** Central orchestrator for GitHub-specific logic, session management, credential validation, and configuration.
* **Key GitHub Responsibilities:**
* Initializes and manages GitHub sessions (supports Personal Access Token, OAuth App, and GitHub App authentication).
* Validates credentials and sets up the GitHub identity context.
* Loads and manages configuration, mutelist, and fixer settings.
* Provides properties and methods for downstream GitHub service classes to access session, identity, and configuration data.
### Data Models
* **Location:** [`prowler/providers/github/models.py`](https://github.com/prowler-cloud/prowler/blob/master/prowler/providers/github/models.py)
* **Purpose:** Define structured data for GitHub identity, session, and output options.
* **Key GitHub Models:**
* `GithubSession`: Holds authentication tokens and keys for the session.
* `GithubIdentityInfo`, `GithubAppIdentityInfo`: Store account or app identity metadata.
### `GithubService` (Service Base Class)
* **Location:** [`prowler/providers/github/lib/service/service.py`](https://github.com/prowler-cloud/prowler/blob/master/prowler/providers/github/lib/service/service.py)
* **Purpose:** Abstract base class for all GitHub service-specific classes.
* **Key GitHub Responsibilities:**
* Receives a `GithubProvider` instance to access session, identity, and configuration.
* Manages GitHub API clients for the authenticated user or app.
* Exposes common audit context (`audit_config`, `fixer_config`) to subclasses.
### Exception Handling
* **Location:** [`prowler/providers/github/exceptions/exceptions.py`](https://github.com/prowler-cloud/prowler/blob/master/prowler/providers/github/exceptions/exceptions.py)
* **Purpose:** Custom exception classes for GitHub-specific error handling, such as credential and session errors.
### Session and Utility Helpers
* **Location:** [`prowler/providers/github/lib/`](https://github.com/prowler-cloud/prowler/blob/master/prowler/providers/github/lib/)
* **Purpose:** Helpers for argument parsing, mutelist management, and other cross-cutting concerns.
## Specific Patterns in GitHub Services
The generic service pattern is described in [service page](/developer-guide/services#service-structure-and-initialisation). You can find all the currently implemented services in the following locations:
* Directly in the code, in location [`prowler/providers/github/services/`](https://github.com/prowler-cloud/prowler/tree/master/prowler/providers/github/services)
* In the [Prowler Hub](https://hub.prowler.com/) for a more human-readable view.
The best way to understand how to implement a new service is to follow the [service implementation documentation](/developer-guide/services#adding-a-new-service) and use already implemented services as a reference.
### GitHub Service Common Patterns
* Services communicate with GitHub using the PyGithub Python SDK. See the [official documentation](https://pygithub.readthedocs.io/).
* Every GitHub service class inherits from `GithubService`, ensuring access to session, identity, configuration, and client utilities.
* The constructor (`__init__`) always calls `super().__init__` with the service name and provider (e.g. `super().__init__(__class__.__name__, provider)`). Ensure that the service name passed to the constructor matches the one used in PyGithub. `__class__.__name__` is usually used for this, because the service name is the same as the class name.
* Resource containers **must** be initialized in the constructor, typically as dictionaries keyed by resource ID or name.
* All GitHub resources are represented as Pydantic `BaseModel` classes, providing type safety and structured access to resource attributes.
* GitHub API calls are wrapped in try/except blocks, always logging errors.
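The patterns above can be seen together in a small, dependency-free sketch. Everything here is a stand-in: `GithubServiceStub` replaces the real `GithubService`, a dataclass replaces the Pydantic `BaseModel`, and `fetch_repos` is a hypothetical callable used instead of a PyGithub client.

```python
import logging
from dataclasses import dataclass

logger = logging.getLogger(__name__)


@dataclass
class Repo:
    """Stand-in for a Pydantic BaseModel resource (kept dependency-free here)."""
    id: int
    name: str
    private: bool


class GithubServiceStub:
    """Hypothetical stand-in for GithubService: stores service name and provider."""

    def __init__(self, service, provider):
        self.service = service
        self.provider = provider


class Repository(GithubServiceStub):
    def __init__(self, provider, fetch_repos):
        super().__init__(__class__.__name__, provider)
        self.repositories = {}  # container initialized in the constructor, keyed by ID
        self._list_repositories(fetch_repos)

    def _list_repositories(self, fetch_repos):
        # The API call is wrapped in try/except and the error is always logged.
        try:
            for raw in fetch_repos():
                self.repositories[raw["id"]] = Repo(**raw)
        except Exception as error:
            logger.error(f"{error.__class__.__name__}: {error}")


service = Repository(
    provider=None,
    fetch_repos=lambda: [{"id": 1, "name": "prowler", "private": False}],
)
```

Because errors are logged rather than raised, a failing API call leaves the container empty instead of aborting the scan.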
## Specific Patterns in GitHub Checks
The GitHub checks pattern is described in [checks page](/developer-guide/checks). You can find all the currently implemented checks in:
* Directly in the code: within each service folder, each check has its own folder named after the check (e.g. [`prowler/providers/github/services/repository/repository_secret_scanning_enabled/`](https://github.com/prowler-cloud/prowler/tree/master/prowler/providers/github/services/repository/repository_secret_scanning_enabled)).
* In the [Prowler Hub](https://hub.prowler.com/) for a more human-readable view.
The best way to understand how to implement a new check is to follow the [GitHub check implementation documentation](/developer-guide/checks#creating-a-check) and use other checks as a reference.
### Check Report Class
The `CheckReportGithub` class models a single finding for a GitHub resource in a check report. It is defined in [`prowler/lib/check/models.py`](https://github.com/prowler-cloud/prowler/blob/master/prowler/lib/check/models.py) and inherits from the generic `Check_Report` base class.
#### Purpose
`CheckReportGithub` extends the base report structure with GitHub-specific fields, enabling detailed tracking of the resource, name, and owner associated with each finding.
#### Constructor and Attribute Population
When you instantiate `CheckReportGithub`, you must provide the check metadata and a resource object. The class will attempt to automatically populate its GitHub-specific attributes from the resource, using the following logic (in order of precedence):
* **`resource_id`**:
* Uses the explicit `resource_id` argument if provided.
* Otherwise, uses `resource.id` if present.
* Defaults to an empty string if not available.
* **`resource_name`**:
* Uses the explicit `resource_name` argument if provided.
* Otherwise, uses `resource.name` if present.
* Defaults to an empty string if not available.
* **`owner`**:
* Uses the explicit `owner` argument if provided.
* Otherwise, uses `resource.owner` for repositories and `resource.name` for organizations.
* Defaults to an empty string if not available.
If the resource object does not contain the required attributes, you must set them manually in the check logic.
Other attributes are inherited from the `Check_Report` class, from which you **always** have to set the `status` and `status_extended` attributes in the check logic.
#### Example Usage
```python theme={null}
report = CheckReportGithub(
    metadata=check_metadata,
    resource=resource_object
)
report.status = "PASS"
report.status_extended = "Resource is compliant."
```
# Integration Tests
Source: https://docs.prowler.com/developer-guide/integration-testing
Coming soon ...
# Creating a New Integration
Source: https://docs.prowler.com/developer-guide/integrations
## Introduction
Integrating Prowler with external tools enhances its functionality and enables seamless workflow automation. Prowler supports a variety of integrations to optimize security assessments and reporting.
### Supported Integration Targets
* Messaging Platforms – Example: Slack
* Project Management Tools – Example: Jira
* Cloud Services – Example: AWS Security Hub
### Integration Guidelines
To integrate Prowler with a specific product:
* Refer to the [Prowler Developer Guide](https://docs.prowler.com/projects/prowler-open-source/en/latest/) to understand its architecture and integration mechanisms.
* Identify the most suitable integration method for the intended platform.
## Steps to Create an Integration
### Defining the Integration Purpose
* Before implementing an integration, clearly define its objective. Common purposes include:
* Sending Prowler findings to a platform for alerting, tracking, or further analysis.
* For inspiration and implementation examples, please review the existing integrations in the [`prowler/lib/outputs`](https://github.com/prowler-cloud/prowler/tree/master/prowler/lib/outputs) folder.
### Developing the Integration
* Script Development:
* Write a script to process Prowler’s output and interact with the target platform’s API.
* If the goal is to send findings, parse Prowler’s results and use the platform’s API to create entries or notifications.
* Configuration:
* Ensure the script supports environment-specific settings, such as:
* API endpoints
* Authentication tokens
* Any necessary configurable parameters.
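One way to keep such settings environment-specific is to read them from environment variables. The sketch below is illustrative only: the `MYTOOL_*` variable names, `IntegrationSettings`, and `load_settings` are hypothetical and not part of Prowler.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class IntegrationSettings:
    api_url: str
    api_token: str
    timeout_seconds: float = 10.0


def load_settings(env=os.environ):
    """Read settings from environment variables; the names are illustrative only."""
    required = ("MYTOOL_API_URL", "MYTOOL_API_TOKEN")
    missing = [key for key in required if not env.get(key)]
    if missing:
        raise ValueError(f"Missing required settings: {', '.join(missing)}")
    return IntegrationSettings(
        api_url=env["MYTOOL_API_URL"].rstrip("/"),
        api_token=env["MYTOOL_API_TOKEN"],
        timeout_seconds=float(env.get("MYTOOL_TIMEOUT_SECONDS", "10")),
    )


# A plain dict is passed here so the example does not depend on the real environment.
settings = load_settings({
    "MYTOOL_API_URL": "https://api.example.com/",
    "MYTOOL_API_TOKEN": "example-token",
})
```

Failing early on missing values makes misconfiguration visible before any finding is sent.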
### Fundamental Structure
* Integration Class:
* To implement an integration, create a class that encapsulates the required attributes and methods for interacting with the target platform. Example: Jira Integration
```python title="Jira Class" theme={null}
class Jira:
    """
    Jira class to interact with the Jira API

    [Note]
    This integration is limited to a single Jira Cloud instance, meaning all issues will be created under the same Jira Cloud ID. Future improvements will include the ability to specify a Jira Cloud ID for users associated with multiple accounts.

    Attributes:
        - _redirect_uri: The redirect URI used
        - _client_id: The client identifier
        - _client_secret: The client secret
        - _access_token: The access token
        - _refresh_token: The refresh token
        - _expiration_date: The authentication expiration
        - _cloud_id: The cloud identifier
        - _scopes: The scopes needed to authenticate, read:jira-user read:jira-work write:jira-work
        - AUTH_URL: The URL to authenticate with Jira
        - PARAMS_TEMPLATE: The template for the parameters to authenticate with Jira
        - TOKEN_URL: The URL to get the access token from Jira
        - API_TOKEN_URL: The URL to get the accessible resources from Jira

    Methods:
        - __init__: Initializes the Jira object
        - input_authorization_code: Inputs the authorization code
        - auth_code_url: Generates the URL to authorize the application
        - get_auth: Gets the access token and refreshes it
        - get_cloud_id: Gets the cloud identifier from Jira
        - get_access_token: Gets the access token
        - refresh_access_token: Refreshes the access token from Jira
        - test_connection: Tests the connection to Jira and returns a Connection object
        - get_projects: Gets the projects from Jira
        - get_available_issue_types: Gets the available issue types for a project
        - send_findings: Sends the findings to Jira and creates an issue

    Raises:
        - JiraGetAuthResponseError: Failed to get the access token and refresh token
        - JiraGetCloudIDNoResourcesError: No resources were found in Jira when getting the cloud id
        - JiraGetCloudIDResponseError: Failed to get the cloud ID, response code did not match 200
        - JiraGetCloudIDError: Failed to get the cloud ID from Jira
        - JiraAuthenticationError: Failed to authenticate
        - JiraRefreshTokenError: Failed to refresh the access token
        - JiraRefreshTokenResponseError: Failed to refresh the access token, response code did not match 200
        - JiraGetAccessTokenError: Failed to get the access token
        - JiraNoProjectsError: No projects found in Jira
        - JiraGetProjectsError: Failed to get projects from Jira
        - JiraGetProjectsResponseError: Failed to get projects from Jira, response code did not match 200
        - JiraInvalidIssueTypeError: The issue type is invalid
        - JiraGetAvailableIssueTypesError: Failed to get available issue types from Jira
        - JiraGetAvailableIssueTypesResponseError: Failed to get available issue types from Jira, response code did not match 200
        - JiraCreateIssueError: Failed to create an issue in Jira
        - JiraSendFindingsResponseError: Failed to send the findings to Jira
        - JiraTestConnectionError: Failed to test the connection

    Usage:
        jira = Jira(
            redirect_uri="http://localhost:8080",
            client_id="client_id",
            client_secret="client_secret"
        )
        jira.send_findings(findings=findings, project_key="KEY")
    """

    _redirect_uri: str = None
    _client_id: str = None
    _client_secret: str = None
    _access_token: str = None
    _refresh_token: str = None
    _expiration_date: int = None
    _cloud_id: str = None
    _scopes: list[str] = None
    AUTH_URL = "https://auth.atlassian.com/authorize"
    PARAMS_TEMPLATE = {
        "audience": "api.atlassian.com",
        "client_id": None,
        "scope": None,
        "redirect_uri": None,
        "state": None,
        "response_type": "code",
        "prompt": "consent",
    }
    TOKEN_URL = "https://auth.atlassian.com/oauth/token"
    API_TOKEN_URL = "https://api.atlassian.com/oauth/token/accessible-resources"

    def __init__(
        self,
        redirect_uri: str = None,
        client_id: str = None,
        client_secret: str = None,
    ):
        self._redirect_uri = redirect_uri
        self._client_id = client_id
        self._client_secret = client_secret
        self._scopes = ["read:jira-user", "read:jira-work", "write:jira-work"]
        auth_url = self.auth_code_url()
        authorization_code = self.input_authorization_code(auth_url)
        self.get_auth(authorization_code)

    # More properties and methods
```
* Test Connection Method:
* Validating Credentials or Tokens
To ensure a successful connection to the target platform, implement a method that validates authentication credentials or tokens.
#### Method Implementation
The following example demonstrates the `test_connection` method for the `Jira` class:
```python title="Test connection" theme={null}
@staticmethod
def test_connection(
    redirect_uri: str = None,
    client_id: str = None,
    client_secret: str = None,
    raise_on_exception: bool = True,
) -> Connection:
    """Test the connection to Jira

    Args:
        - redirect_uri: The redirect URI used
        - client_id: The client identifier
        - client_secret: The client secret
        - raise_on_exception: Whether to raise an exception or not

    Returns:
        - Connection: The connection object

    Raises:
        - JiraGetCloudIDNoResourcesError: No resources were found in Jira when getting the cloud id
        - JiraGetCloudIDResponseError: Failed to get the cloud ID, response code did not match 200
        - JiraGetCloudIDError: Failed to get the cloud ID from Jira
        - JiraAuthenticationError: Failed to authenticate
        - JiraTestConnectionError: Failed to test the connection
    """
    try:
        jira = Jira(
            redirect_uri=redirect_uri,
            client_id=client_id,
            client_secret=client_secret,
        )
        access_token = jira.get_access_token()
        if not access_token:
            # Raise (not return) so the generic handler below reports the failure
            raise ValueError("Failed to get access token")
        headers = {"Authorization": f"Bearer {access_token}"}
        response = requests.get(
            f"https://api.atlassian.com/ex/jira/{jira.cloud_id}/rest/api/3/myself",
            headers=headers,
        )
        if response.status_code == 200:
            return Connection(is_connected=True)
        else:
            return Connection(is_connected=False, error=response.json())
    except JiraGetCloudIDNoResourcesError as no_resources_error:
        logger.error(
            f"{no_resources_error.__class__.__name__}[{no_resources_error.__traceback__.tb_lineno}]: {no_resources_error}"
        )
        if raise_on_exception:
            raise no_resources_error
        return Connection(error=no_resources_error)
    except JiraGetCloudIDResponseError as response_error:
        logger.error(
            f"{response_error.__class__.__name__}[{response_error.__traceback__.tb_lineno}]: {response_error}"
        )
        if raise_on_exception:
            raise response_error
        return Connection(error=response_error)
    except JiraGetCloudIDError as cloud_id_error:
        logger.error(
            f"{cloud_id_error.__class__.__name__}[{cloud_id_error.__traceback__.tb_lineno}]: {cloud_id_error}"
        )
        if raise_on_exception:
            raise cloud_id_error
        return Connection(error=cloud_id_error)
    except JiraAuthenticationError as auth_error:
        logger.error(
            f"{auth_error.__class__.__name__}[{auth_error.__traceback__.tb_lineno}]: {auth_error}"
        )
        if raise_on_exception:
            raise auth_error
        return Connection(error=auth_error)
    except Exception as error:
        logger.error(f"Failed to test connection: {error}")
        if raise_on_exception:
            raise JiraTestConnectionError(
                message="Failed to test connection on the Jira integration",
                file=os.path.basename(__file__),
            )
        return Connection(is_connected=False, error=error)
```
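The `Connection` return value used above can be pictured as a small result object. This is a minimal sketch, assuming it carries only the two fields that appear in the example (`is_connected` and `error`); the class shipped with Prowler may differ, and `check` is a hypothetical helper added for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Connection:
    """Illustrative result type: whether the check succeeded and, if not, why."""
    is_connected: bool = False
    error: Optional[Exception] = None


def check(probe) -> Connection:
    """Run `probe()` and convert the outcome into a Connection instead of raising."""
    try:
        probe()
        return Connection(is_connected=True)
    except Exception as error:
        return Connection(is_connected=False, error=error)


def failing_probe():
    raise TimeoutError("no response from the platform")


ok = check(lambda: None)
failed = check(failing_probe)
```

Returning a result object lets callers such as a UI or an API report the failure reason without wrapping every call in its own try/except.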
* Send Findings Method:
* Add a method to send Prowler findings to the target platform, adhering to its API specifications.
#### Method Implementation
The following example demonstrates the `send_findings` method for the `Jira` class:
```python title="Send findings method" theme={null}
def send_findings(
    self,
    findings: list[Finding] = None,
    project_key: str = None,
    issue_type: str = None,
):
    """
    Send the findings to Jira

    Args:
        - findings: The findings to send
        - project_key: The project key
        - issue_type: The issue type

    Raises:
        - JiraRefreshTokenError: Failed to refresh the access token
        - JiraRefreshTokenResponseError: Failed to refresh the access token, response code did not match 200
        - JiraCreateIssueError: Failed to create an issue in Jira
        - JiraSendFindingsResponseError: Failed to send the findings to Jira
    """
    try:
        access_token = self.get_access_token()
        if not access_token:
            raise JiraNoTokenError(
                message="No token was found",
                file=os.path.basename(__file__),
            )
        projects = self.get_projects()
        if project_key not in projects:
            logger.error("The project key is invalid")
            raise JiraInvalidProjectKeyError(
                message="The project key is invalid",
                file=os.path.basename(__file__),
            )
        available_issue_types = self.get_available_issue_types(project_key)
        if issue_type not in available_issue_types:
            logger.error("The issue type is invalid")
            raise JiraInvalidIssueTypeError(
                message="The issue type is invalid", file=os.path.basename(__file__)
            )
        headers = {
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        }
        for finding in findings:
            status_color = self.get_color_from_status(finding.status.value)
            adf_description = self.get_adf_description(
                check_id=finding.metadata.CheckID,
                check_title=finding.metadata.CheckTitle,
                severity=finding.metadata.Severity.value.upper(),
                status=finding.status.value,
                status_color=status_color,
                status_extended=finding.status_extended,
                provider=finding.metadata.Provider,
                region=finding.region,
                resource_uid=finding.resource_uid,
                resource_name=finding.resource_name,
                risk=finding.metadata.Risk,
                recommendation_text=finding.metadata.Remediation.Recommendation.Text,
                recommendation_url=finding.metadata.Remediation.Recommendation.Url,
            )
            payload = {
                "fields": {
                    "project": {"key": project_key},
                    "summary": f"[Prowler] {finding.metadata.Severity.value.upper()} - {finding.metadata.CheckID} - {finding.resource_uid}",
                    "description": adf_description,
                    "issuetype": {"name": issue_type},
                }
            }
            response = requests.post(
                f"https://api.atlassian.com/ex/jira/{self.cloud_id}/rest/api/3/issue",
                json=payload,
                headers=headers,
            )
            if response.status_code != 201:
                response_error = f"Failed to send finding: {response.status_code} - {response.json()}"
                logger.warning(response_error)
                raise JiraSendFindingsResponseError(
                    message=response_error, file=os.path.basename(__file__)
                )
            else:
                logger.info(f"Finding sent successfully: {response.json()}")
    except JiraRefreshTokenError as refresh_error:
        raise refresh_error
    except JiraRefreshTokenResponseError as response_error:
        raise response_error
    except Exception as e:
        logger.error(f"Failed to send findings: {e}")
        raise JiraCreateIssueError(
            message="Failed to create an issue in Jira",
            file=os.path.basename(__file__),
        )
```
### Testing the Integration
* Conduct integration testing in a controlled environment to validate expected behavior. Ensure the following:
* Transmission Accuracy – Verify that Prowler findings are correctly sent and processed by the target platform.
* Error Handling – Simulate edge cases to assess robustness and failure recovery mechanisms.
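Both points can be exercised without touching a real platform by replacing the HTTP call with a mock. The sketch below is a minimal illustration using `unittest.mock`; `send_finding`, `SendError`, and the URL are hypothetical and stand in for whatever the integration actually exposes.

```python
from unittest.mock import Mock


class SendError(Exception):
    """Hypothetical error raised when the platform rejects a finding."""


def send_finding(http_post, url, payload):
    """Post one finding; treat anything other than HTTP 201 as a failure."""
    response = http_post(url, json=payload)
    if response.status_code != 201:
        raise SendError(f"Unexpected status {response.status_code}")
    return response.json()


def test_send_finding_success():
    # Transmission accuracy: the payload reaches the platform unchanged.
    post = Mock(return_value=Mock(status_code=201, json=lambda: {"id": "10001"}))
    result = send_finding(post, "https://example.invalid/issue", {"k": "v"})
    assert result == {"id": "10001"}
    post.assert_called_once_with("https://example.invalid/issue", json={"k": "v"})


def test_send_finding_rate_limited():
    # Error handling: a simulated HTTP 429 must surface as SendError.
    post = Mock(return_value=Mock(status_code=429))
    try:
        send_finding(post, "https://example.invalid/issue", {})
    except SendError as error:
        assert "429" in str(error)
    else:
        raise AssertionError("SendError was not raised")
```

Passing the HTTP function in as an argument keeps the example easy to mock; an integration that calls `requests` directly could patch it instead.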
### Documentation
* Ensure the following elements are included:
* Setup Instructions – List all necessary dependencies and installation steps.
* Configuration Details – Specify required environment variables, authentication steps, etc.
* Example Use Cases – Provide practical scenarios demonstrating functionality.
* Troubleshooting Guide – Document common issues and resolution steps.
* Comprehensive and clear documentation improves maintainability and simplifies onboarding.
# Introduction to developing in Prowler
Source: https://docs.prowler.com/developer-guide/introduction
Thanks for your interest in contributing to Prowler!
Prowler can be extended in various ways. This guide provides the different ways to contribute and how to get started.
Maintainers will assess whether a change fits the project roadmap and scope before merging.
## Contributing to Prowler
### Review Current Issues
Check out our [GitHub Issues](https://github.com/prowler-cloud/prowler/issues) page for ideas to contribute.
We tag issues as `good first issue` for new contributors. These are typically well-defined and manageable in scope.
We tag issues as `help wanted` for other issues that require more time to complete.
### Expand Prowler's Capabilities
Prowler is constantly evolving. Contributions to checks, services, or integrations help improve the tool for everyone. Here is how to get involved:
* Want to improve Prowler's detection capabilities for your favorite cloud provider? You can contribute by writing new checks.
* One key service for your favorite cloud provider is missing? Add it to Prowler! Do not forget to include relevant checks to validate functionality.
* If you would like to extend Prowler to work with a new cloud provider, this typically involves setting up new services and checks to ensure compatibility.
* Need to ensure Prowler supports a specific compliance framework? Add new security compliance frameworks to map checks against regulatory or industry standards.
* Want to tailor how results are displayed or exported? You can add custom output formats.
* Prowler can work with other tools and platforms through integrations.
* Propose brand-new features or enhancements to existing ones, or help implement community-requested improvements.
### Improve Documentation
Help make Prowler more accessible by enhancing our documentation, fixing typos, or adding examples/tutorials.
### Bug Fixes
If you find a bug, report it on the [GitHub Issues](https://github.com/prowler-cloud/prowler/issues) page. You are also welcome to fix it yourself.
Remember, our community is here to help! If you need guidance, do not hesitate to ask questions in the issues or join our [Slack workspace](https://goto.prowler.com/slack).
## Setting up your development environment
### Prerequisites
Before proceeding, ensure the following:
* Git is installed.
* Python 3.10 or higher is installed.
* `uv` is installed to manage dependencies.
### Forking the Prowler Repository
Fork the Prowler GitHub repository to contribute to Prowler. This allows proposing changes, submitting new features, and fixing bugs. For guidance on forking, refer to the [official GitHub documentation](https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/fork-a-repo?tool=webui#forking-a-repository).
### Cloning Your Forked Repository
Once your fork is created, clone it using the following commands (replace `<your-github-username>` with your GitHub username):
```
git clone https://github.com/<your-github-username>/prowler
cd prowler
```
### Dependency Management and Environment Isolation
To prevent conflicts between environments, we recommend using [`uv`](https://docs.astral.sh/uv/), a fast Python package and project manager. Install it by following the [official instructions](https://docs.astral.sh/uv/getting-started/installation/).
### Installing Dependencies
To install all required dependencies, including those needed for development, run:
```
uv sync
source .venv/bin/activate
```
### Running the Local API Development Stack
For API development, Prowler provides a Makefile-based local stack in addition to the manual and Docker Compose workflows documented in the API README. PostgreSQL, Valkey, and Neo4j run with Docker Compose, while Django and the Celery worker run natively through `uv`.
Before using this method, ensure `docker compose`, `tmux`, and `uv` are installed.
This workflow is designed for macOS and should also work on Linux when Docker, `tmux`, and `uv` are available. Windows requires script changes before it can be supported.
To start the local API stack, run:
```shell theme={null}
make dev
```
This command starts the required services, creates a `tmux` session with panes for the API, worker, and PostgreSQL logs, waits until the API responds, and prints the API URL and log file paths. The API is available at:
```text theme={null}
http://localhost:8080/api/v1
```
Use these commands to manage the stack:
```shell theme={null}
make dev-setup # Bootstrap dependencies, migrations, and fixtures
make dev-attach # Attach to the tmux session
make dev-launch # Start the stack on fixed ports and attach
make dev-stop # Stop the tmux session and containers
make dev-clean # Remove stopped development containers
make dev-wipe # Stop everything and delete local development data
make dev-status # Show development container status
```
The UI is not started by this workflow. Start it separately by following the UI development instructions in the `ui/` directory.
### Pre-Commit Hooks
This repository uses Git pre-commit hooks managed by the [prek](https://prek.j178.dev/) tool, which is installed with `uv sync`. After syncing, run the following command in the root of this repository:
```shell theme={null}
prek install
```
Successful installation should produce the following output:
```shell theme={null}
prek installed at `.git/hooks/pre-commit`
```
If hooks were previously installed with the `pre-commit` tool, run `prek install --overwrite` to replace the existing hook. Otherwise, both tools will run on each commit.
#### Enable TruffleHog as a Pre-Push Hook
By default, only `pre-commit` hooks are installed. To enable [`TruffleHog`](https://github.com/trufflesecurity/trufflehog) secret scanning on every push, install the `pre-push` hook type explicitly:
```shell theme={null}
prek install --hook-type pre-push
```
Successful installation should produce the following output:
```shell theme={null}
prek installed at `.git/hooks/pre-push`
```
Once installed, TruffleHog runs before each push and blocks the operation when verified secrets are detected.
### Code Quality and Security Checks
Several automated checks and utilities verify code security and keep dependencies up to date before pull requests are merged. If `uv sync` has already been run, these should already be installed:
* [`bandit`](https://pypi.org/project/bandit/) for code security review.
* [`osv-scanner`](https://github.com/google/osv-scanner) and [`dependabot`](https://github.com/features/security) for dependencies.
* [`hadolint`](https://github.com/hadolint/hadolint) and [`dockle`](https://github.com/goodwithtech/dockle) for container security.
* [`Snyk`](https://docs.snyk.io/integrations/snyk-container-integrations/container-security-with-docker-hub-integration) for container security in Docker Hub.
* [`clair`](https://github.com/quay/clair) for container security in Amazon ECR.
* [`vulture`](https://pypi.org/project/vulture/), [`flake8`](https://pypi.org/project/flake8/), [`black`](https://pypi.org/project/black/), and [`pylint`](https://pypi.org/project/pylint/) for formatting and best practices.
Additionally, ensure the latest version of [`TruffleHog`](https://github.com/trufflesecurity/trufflehog) is installed to scan for sensitive data in the code. Follow the official [installation guide](https://github.com/trufflesecurity/trufflehog?tab=readme-ov-file#floppy_disk-installation) for setup.
### AI-Driven Contributions
If you are using AI assistants to help with your contributions, Prowler provides specialized resources to enhance AI-driven development:
* **Prowler MCP Server**: The [Prowler MCP Server](/getting-started/products/prowler-mcp) provides AI assistants with access to the entire Prowler ecosystem, including security checks, compliance frameworks, documentation, and more. This enables AI tools to better understand Prowler's architecture and help you create contributions that align with project standards.
* **AGENTS.md Files**: Each component of the Prowler monorepo includes an `AGENTS.md` file that contains specific guidelines for AI agents working on that component. These files provide context about project structure, coding standards, and best practices. When working on a specific component, refer to the relevant `AGENTS.md` file (e.g., `prowler/AGENTS.md`, `ui/AGENTS.md`, `api/AGENTS.md`) to ensure your AI assistant follows the appropriate guidelines.
* **AI Skills System**: The [AI Skills system](/developer-guide/ai-skills) provides on-demand patterns, templates, and best practices for AI agents. Skills help AI assistants understand Prowler's conventions and generate code that aligns with project standards. The skills are located in the `skills/` directory and are registered in the `AGENTS.md` files.
These resources help ensure that AI-assisted contributions maintain consistency with Prowler's codebase and development practices.
### Dependency Management
All dependencies are listed in the `pyproject.toml` file.
The SDK keeps direct dependencies pinned to exact versions, while `uv.lock` records the full resolved dependency tree and the artifact hashes for every package. Use `uv sync` from the lock file instead of ad-hoc `pip` installs when you need a reproducible environment.
For code documentation, follow the practices in the [Google Python Style Guide - Comments and Docstrings](https://github.com/google/styleguide/blob/gh-pages/pyguide.md#38-comments-and-docstrings).
If the pre-commit hooks prevent you from committing to the Prowler repository and you cannot resolve the failure, pass the `--no-verify` flag to `git commit` to skip them.
### Repository Folder Structure
Knowing the Prowler codebase layout helps you quickly locate where to add new features, checks, or integrations. The following is a high-level overview from the root of the repository:
```
prowler/
├── prowler/ # Main source code for Prowler SDK (CLI, providers, services, checks, compliances, config, etc.)
├── api/ # API server and related code
├── dashboard/ # Local Dashboard extracted from the CLI output
├── ui/ # Web UI components
├── util/ # Utility scripts and helpers
├── tests/ # Prowler SDK test suite
├── docs/ # Documentation, including this guide
├── examples/ # Example output formats for providers and scripts
├── permissions/ # Permission-related files and policies
├── contrib/ # Community-contributed scripts or modules
├── kubernetes/ # Kubernetes deployment files
├── .github/ # GitHub-related files (workflows, issue templates, etc.)
├── pyproject.toml # Python project configuration (uv)
├── uv.lock # uv lock file
├── README.md # Project overview and getting started
├── Makefile # Common development commands
├── Dockerfile # SDK Docker container
├── docker-compose.yml # Prowler App Docker compose
└── ... # Other supporting files
```
## Sending the Pull Request
When creating or reviewing a pull request in [Prowler](https://github.com/prowler-cloud/prowler), follow [this template](https://github.com/prowler-cloud/prowler/blob/master/.github/pull_request_template.md) and fill it in with the relevant information:
* **Context** and **Description** of the change: This will help the reviewers to understand the change and the purpose of the pull request.
* **Steps to review**: A detailed description of how to review the change.
* **Checklist**: A mandatory checklist of the things that should be reviewed before merging the pull request.
## Contribution Appreciation
If you enjoy swag, we'd love to thank you for your contribution with laptop stickers or other Prowler merchandise!
To request swag: Share your pull request details in our [Slack workspace](https://goto.prowler.com/slack).
You can also reach out to Toni de la Fuente on [Twitter](https://twitter.com/ToniBlyx). His DMs are open!
## Testing a Pull Request from a Specific Branch
To test Prowler from a specific branch (for example, to try out changes from a pull request before it is merged), you can use `pipx` to install directly from GitHub:
```sh theme={null}
pipx install "git+https://github.com/prowler-cloud/prowler.git@branch-name"
```
Replace `branch-name` with the name of the branch you want to test. This will install Prowler in an isolated environment, allowing you to try out the changes safely.
For more details on testing, see the [Testing section](/developer-guide/unit-testing) of this documentation.
# Kubernetes Provider
Source: https://docs.prowler.com/developer-guide/kubernetes-details
This page details the [Kubernetes](https://kubernetes.io/) provider implementation in Prowler.
By default, Prowler will audit all namespaces in the Kubernetes cluster accessible by the configured context. To configure it, see the [In-Cluster Execution](/user-guide/providers/kubernetes/getting-started-k8s#in-cluster-execution) or [Non In-Cluster Execution](/user-guide/providers/kubernetes/getting-started-k8s#non-in-cluster-execution) guides.
## Kubernetes Provider Classes Architecture
The Kubernetes provider implementation follows the general [Provider structure](/developer-guide/provider). This section focuses on the Kubernetes-specific implementation, highlighting how the generic provider concepts are realized for Kubernetes in Prowler. For a full overview of the provider pattern, base classes, and extension guidelines, see [Provider documentation](/developer-guide/provider).
### `KubernetesProvider` (Main Class)
* **Location:** [`prowler/providers/kubernetes/kubernetes_provider.py`](https://github.com/prowler-cloud/prowler/blob/master/prowler/providers/kubernetes/kubernetes_provider.py)
* **Base Class:** Inherits from `Provider` (see [base class details](https://github.com/prowler-cloud/prowler/blob/master/prowler/providers/common/provider.py)).
* **Purpose:** Central orchestrator for Kubernetes-specific logic, session management, context and namespace discovery, credential validation, and configuration.
* **Key Kubernetes Responsibilities:**
  * Initializes and manages Kubernetes sessions (supports kubeconfig file or content, context selection, and namespace scoping).
  * Validates credentials and sets up the Kubernetes identity context.
  * Loads and manages configuration, mutelist, and fixer settings.
  * Discovers accessible namespaces and cluster metadata.
  * Provides properties and methods for downstream Kubernetes service classes to access session, identity, and configuration data.
### Data Models
* **Location:** [`prowler/providers/kubernetes/models.py`](https://github.com/prowler-cloud/prowler/blob/master/prowler/providers/kubernetes/models.py)
* **Purpose:** Define structured data for Kubernetes identity and session info.
* **Key Kubernetes Models:**
  * `KubernetesIdentityInfo`: Holds Kubernetes identity metadata, such as context, cluster, and user.
  * `KubernetesSession`: Stores the Kubernetes API client and context information.
### `KubernetesService` (Service Base Class)
* **Location:** [`prowler/providers/kubernetes/lib/service/service.py`](https://github.com/prowler-cloud/prowler/blob/master/prowler/providers/kubernetes/lib/service/service.py)
* **Purpose:** Abstract base class that all Kubernetes service-specific classes inherit from. It implements the generic service pattern (described in the [service page](/developer-guide/services#service-base-class)) specifically for Kubernetes.
* **Key Kubernetes Responsibilities:**
  * Receives a `KubernetesProvider` instance to access session, identity, and configuration.
  * Manages the Kubernetes API client and context.
  * Provides a `__threading_call__` method to make API calls in parallel by resource.
  * Exposes common audit context (`context`, `api_client`, `audit_config`, `fixer_config`) to subclasses.
### Exception Handling
* **Location:** [`prowler/providers/kubernetes/exceptions/exceptions.py`](https://github.com/prowler-cloud/prowler/blob/master/prowler/providers/kubernetes/exceptions/exceptions.py)
* **Purpose:** Custom exception classes for Kubernetes-specific error handling, such as session, API, and configuration errors.
### Session and Utility Helpers
* **Location:** [`prowler/providers/kubernetes/lib/`](https://github.com/prowler-cloud/prowler/blob/master/prowler/providers/kubernetes/lib/)
* **Purpose:** Helpers for argument parsing, mutelist management, and other cross-cutting concerns.
## Specific Patterns in Kubernetes Services
The generic service pattern is described in the [service page](/developer-guide/services#service-structure-and-initialisation). All currently implemented services can be found in the following locations:
* In the code, under [`prowler/providers/kubernetes/services/`](https://github.com/prowler-cloud/prowler/tree/master/prowler/providers/kubernetes/services).
* In the [Prowler Hub](https://hub.prowler.com/), for a more human-readable view.
To implement a new service, follow the [service implementation documentation](/developer-guide/services#adding-a-new-service) and use the already implemented services as a reference.
### Kubernetes Service Common Patterns
* Services communicate with Kubernetes using the Kubernetes Python SDK. See the [official documentation](https://github.com/kubernetes-client/python/blob/master/kubernetes/README.md/).
* Every Kubernetes service class inherits from `KubernetesService`, ensuring access to session, identity, configuration, and client utilities.
* The constructor (`__init__`) always calls `super().__init__` with the provider object, and initializes resource containers (typically as dictionaries keyed by resource UID or name).
* Resource discovery and attribute collection can be parallelized using `self.__threading_call__`.
* All Kubernetes resources are represented as Pydantic `BaseModel` classes, providing type safety and structured access to resource attributes.
* Kubernetes API calls are wrapped in try/except blocks, always logging errors.
* Additional attributes that cannot be retrieved from the default call should be collected and stored for each resource using dedicated methods and threading.
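The patterns above can be sketched in a few lines. This is a minimal, self-contained illustration rather than Prowler code: `FakeKubernetesService` is a hypothetical stand-in for `KubernetesService`, the `Pod` dataclass stands in for a Pydantic `BaseModel`, and the provider is a plain dictionary, so the sketch runs without Prowler or a cluster.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class Pod:
    """Stand-in resource model; Prowler models are Pydantic BaseModel classes."""

    uid: str
    name: str
    namespace: str


class FakeKubernetesService:
    """Hypothetical stand-in for KubernetesService (assumption, not the real class)."""

    def __init__(self, provider: dict):
        self.context = provider.get("context")
        self.thread_pool = ThreadPoolExecutor(max_workers=4)

    def __threading_call__(self, call: Callable, iterator: Iterable) -> None:
        # Run `call` once per item in parallel and wait for every task to finish.
        futures = [self.thread_pool.submit(call, item) for item in iterator]
        for future in futures:
            future.result()


class Core(FakeKubernetesService):
    def __init__(self, provider: dict):
        super().__init__(provider)
        self.pods: dict[str, Pod] = {}  # resource container keyed by UID
        self.__threading_call__(self._list_pods, provider["namespaces"])

    def _list_pods(self, namespace: str) -> None:
        try:
            # A real service would call the Kubernetes Python client here.
            uid = f"uid-{namespace}"
            self.pods[uid] = Pod(uid=uid, name="example", namespace=namespace)
        except Exception as error:
            # Follow the documented convention: log the error and keep going.
            print(f"{namespace} -- {error.__class__.__name__}: {error}")


core = Core({"context": "demo", "namespaces": ["default", "kube-system"]})
print(sorted(core.pods))
```

Checks would then iterate over `core.pods` rather than calling the API themselves, which keeps API access in the service layer.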
## Specific Patterns in Kubernetes Checks
The Kubernetes checks pattern is described in the [checks page](/developer-guide/checks). All currently implemented checks can be found in the following locations:
* In the code, within each service folder, where each check has its own folder named after the check (e.g. [`prowler/providers/kubernetes/services/rbac/rbac_minimize_wildcard_use_roles/`](https://github.com/prowler-cloud/prowler/tree/master/prowler/providers/kubernetes/services/rbac/rbac_minimize_wildcard_use_roles)).
* In the [Prowler Hub](https://hub.prowler.com/), for a more human-readable view.
To implement a new check, follow the [Kubernetes check implementation documentation](/developer-guide/checks#creating-a-check) and use existing checks as a reference.
### Check Report Class
The `Check_Report_Kubernetes` class models a single finding for a Kubernetes resource in a check report. It is defined in [`prowler/lib/check/models.py`](https://github.com/prowler-cloud/prowler/blob/master/prowler/lib/check/models.py) and inherits from the generic `Check_Report` base class.
#### Purpose
`Check_Report_Kubernetes` extends the base report structure with Kubernetes-specific fields, enabling detailed tracking of the resource, name, and namespace associated with each finding.
#### Constructor and Attribute Population
When you instantiate `Check_Report_Kubernetes`, you must provide the check metadata and a resource object. The class will attempt to automatically populate its Kubernetes-specific attributes from the resource, using the following logic (in order of precedence):
* **`resource_id`**:
  * Uses `resource.uid` if present.
  * Otherwise, uses `resource.name` if present.
  * Defaults to an empty string if none are available.
* **`resource_name`**:
  * Uses `resource.name` if present.
  * Defaults to an empty string if not available.
* **`namespace`**:
  * Uses `resource.namespace` if present.
  * Defaults to "cluster-wide" for cluster-scoped resources.
If the resource object does not contain the required attributes, you must set them manually in the check logic.
Other attributes are inherited from the `Check_Report` class. Of these, you must **always** set `status` and `status_extended` in the check logic.
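The precedence rules for the Kubernetes-specific attributes can be approximated as follows. This is a sketch of the documented behavior, not the code in `prowler/lib/check/models.py`: the function name `resolve_kubernetes_fields` is hypothetical, and treating empty values the same as missing ones is an assumption.

```python
from types import SimpleNamespace


def resolve_kubernetes_fields(resource) -> dict:
    """Approximate the documented precedence; not Prowler's actual code."""
    uid = getattr(resource, "uid", None)
    name = getattr(resource, "name", None)
    namespace = getattr(resource, "namespace", None)
    return {
        "resource_id": uid or name or "",  # uid first, then name, then ""
        "resource_name": name or "",
        "namespace": namespace or "cluster-wide",
    }


pod = SimpleNamespace(uid="1234", name="nginx", namespace="default")
cluster_role = SimpleNamespace(name="admin")  # cluster-scoped: no uid, no namespace

print(resolve_kubernetes_fields(pod))
print(resolve_kubernetes_fields(cluster_role))
```

The second call shows the fallbacks: the name is reused as the identifier and the namespace becomes "cluster-wide".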
#### Example Usage
```python theme={null}
report = Check_Report_Kubernetes(
    metadata=check_metadata,
    resource=resource_object,
)
report.status = "PASS"
report.status_extended = "Resource is compliant."
```
# Lighthouse AI Architecture
Source: https://docs.prowler.com/developer-guide/lighthouse-architecture
This document describes the internal architecture of Prowler Lighthouse AI, enabling developers to understand how components interact and where to add new functionality.
**Looking for user documentation?** See:
* [Lighthouse AI Overview](/getting-started/products/prowler-lighthouse-ai) - Capabilities and FAQs
* [How Lighthouse AI Works](/user-guide/tutorials/prowler-app-lighthouse) - Configuration and usage
* [Multi-LLM Provider Setup](/user-guide/tutorials/prowler-app-lighthouse-multi-llm) - Provider configuration
## Architecture Overview
Lighthouse AI operates as a Langchain-based agent that connects Large Language Models (LLMs) with Prowler security data through the Model Context Protocol (MCP).
### Three-Tier Architecture
The system follows a three-tier architecture:
1. **Frontend (Next.js)**: Chat interface, message rendering, model selection
2. **API Route**: Request handling, authentication, stream transformation
3. **Langchain Agent**: LLM orchestration, tool calling through MCP
### Request Flow
When a user sends a message through the Lighthouse chat interface, the system processes it through several stages:
1. **User Submits a Message**.
The chat component (`ui/components/lighthouse/chat.tsx`) captures the user's question (e.g., "What are my critical findings in AWS?") and sends it as an HTTP POST request to the backend API route.
2. **Authentication and Context Assembly**.
The API route (`ui/app/api/lighthouse/analyst/route.ts`) validates the user's session, extracts the JWT token (stored via `auth-context.ts`), and gathers context including the tenant's business context and current security posture data (assembled in `data.ts`).
3. **Agent Initialization**.
The workflow orchestrator (`ui/lib/lighthouse/workflow.ts`) creates a Langchain agent configured with:
* The selected LLM, instantiated through the factory (`llm-factory.ts`)
* A system prompt containing available tools and instructions (`system-prompt.ts`)
* Two meta-tools (`describe_tool` and `execute_tool`) for accessing Prowler data
4. **LLM Reasoning and Tool Calling**.
The agent sends the conversation to the LLM, which decides whether to respond directly or call tools to fetch data. When tools are needed, the meta-tools in `ui/lib/lighthouse/tools/meta-tool.ts` interact with the MCP client (`mcp-client.ts`) to:
* First call `describe_tool` to understand the tool's parameters
* Then call `execute_tool` to retrieve data from the MCP Server
* Continue reasoning with the returned data
5. **Streaming Response**.
As the LLM generates its response, the stream handler (`ui/lib/lighthouse/analyst-stream.ts`) transforms Langchain events into UI-compatible messages and streams tokens back to the browser in real-time using Server-Sent Events. The stream includes both text tokens and tool execution events (displayed as "chain of thought").
6. **Message Rendering**.
The frontend receives the stream and renders it through `message-item.tsx` with markdown formatting. Any tool calls that occurred during reasoning are displayed via `chain-of-thought-display.tsx`.
## Frontend Components
Frontend components reside in `ui/components/lighthouse/` and handle the chat interface and configuration workflows.
### Core Components
| Component | Location | Purpose |
| ------------------------------ | --------------------------- | --------------------------------------------------------------- |
| `chat.tsx` | `ui/components/lighthouse/` | Main chat interface managing message history and input handling |
| `message-item.tsx` | `ui/components/lighthouse/` | Individual message rendering with markdown support |
| `select-model.tsx` | `ui/components/lighthouse/` | Model and provider selection dropdown |
| `chain-of-thought-display.tsx` | `ui/components/lighthouse/` | Displays tool calls and reasoning steps during execution |
### Configuration Components
| Component | Location | Purpose |
| ------------------------------------ | --------------------------------- | --------------------------------------------------- |
| `lighthouse-settings.tsx` | `ui/components/lighthouse/` | Settings panel for business context and preferences |
| `connect-llm-provider.tsx` | `ui/components/lighthouse/` | Provider connection workflow |
| `llm-providers-table.tsx` | `ui/components/lighthouse/` | Provider management table |
| `forms/delete-llm-provider-form.tsx` | `ui/components/lighthouse/forms/` | Provider deletion confirmation dialog |
### Supporting Components
| Component | Location | Purpose |
| ---------------------------------- | --------------------------------------- | -------------------------------------------------------------------------- |
| `banner.tsx` / `banner-client.tsx` | `ui/components/lighthouse/` | Status banners and notifications |
| `workflow/` | `ui/components/lighthouse/workflow/` | Multi-step configuration workflows |
| `ai-elements/` | `ui/components/lighthouse/ai-elements/` | Custom UI primitives for chat interface (input, select, dropdown, tooltip) |
## Library Code
Core library code resides in `ui/lib/lighthouse/` and handles agent orchestration, MCP communication, and stream processing.
### Workflow Orchestrator
**Location:** `ui/lib/lighthouse/workflow.ts`
The workflow module serves as the core orchestrator, responsible for:
* Initializing the Langchain agent with system prompt and tools
* Loading tenant configuration (default provider, model, business context)
* Creating the LLM instance through the factory
* Generating dynamic tool listings from available MCP tools
```typescript theme={null}
// Simplified workflow initialization
export async function initLighthouseWorkflow(runtimeConfig?: RuntimeConfig) {
  await initializeMCPClient();
  const toolListing = generateToolListing();
  const systemPrompt = LIGHTHOUSE_SYSTEM_PROMPT_TEMPLATE.replace(
    "{{TOOL_LISTING}}",
    toolListing,
  );
  // providerType, modelId, and credentials are resolved earlier (omitted here)
  const llm = createLLM({
    provider: providerType,
    model: modelId,
    credentials,
    // ...
  });
  return createAgent({
    model: llm,
    tools: [describeTool, executeTool],
    systemPrompt,
  });
}
```
### MCP Client Manager
**Location:** `ui/lib/lighthouse/mcp-client.ts`
The MCP client manages connections to the Prowler MCP Server using a singleton pattern:
* **Connection Management**: Retry logic with configurable attempts and delays
* **Tool Discovery**: Fetches available tools from MCP server on initialization
* **Authentication Injection**: Automatically adds JWT tokens to `prowler_app_*` tool calls
* **Reconnection**: Supports forced reconnection after server restarts
Key constants:
* `MAX_RETRY_ATTEMPTS`: 3 connection attempts
* `RETRY_DELAY_MS`: 2000ms between retries
* `RECONNECT_INTERVAL_MS`: 5 minutes before retry after failure
```typescript theme={null}
// Authentication injection for Prowler App tools
private handleBeforeToolCall = ({ name, args }) => {
  // Only inject auth for prowler_app_* tools (user-specific data)
  if (!name.startsWith("prowler_app_")) {
    return { args };
  }
  const accessToken = getAuthContext();
  return {
    args,
    headers: { Authorization: `Bearer ${accessToken}` },
  };
};
```
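The retry constants listed above suggest a bounded retry loop with a fixed delay. The sketch below shows that pattern in Python for illustration only; the Lighthouse client itself is TypeScript, and `connect_with_retry` and `flaky_connect` are hypothetical names. `RECONNECT_INTERVAL_MS` is not modeled here.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")

MAX_RETRY_ATTEMPTS = 3
RETRY_DELAY_MS = 2000


def connect_with_retry(
    connect: Callable[[], T],
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `connect` up to MAX_RETRY_ATTEMPTS times, pausing between failures."""
    last_error = None
    for attempt in range(1, MAX_RETRY_ATTEMPTS + 1):
        try:
            return connect()
        except ConnectionError as error:
            last_error = error
            if attempt < MAX_RETRY_ATTEMPTS:
                sleep(RETRY_DELAY_MS / 1000)  # no pause after the final attempt
    raise ConnectionError(f"Failed after {MAX_RETRY_ATTEMPTS} attempts") from last_error


attempts = []


def flaky_connect() -> str:
    # Fails twice, then succeeds, to exercise the retry path.
    attempts.append(1)
    if len(attempts) < 3:
        raise ConnectionError("server not ready")
    return "connected"


# The sleep function is injected so the example finishes without waiting.
result = connect_with_retry(flaky_connect, sleep=lambda seconds: None)
print(result, len(attempts))
```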
### Meta-Tools
**Location:** `ui/lib/lighthouse/tools/meta-tool.ts`
Instead of registering all MCP tools directly with the agent, Lighthouse uses two meta-tools for dynamic tool discovery and execution:
| Tool | Purpose |
| --------------- | --------------------------------------------------------------- |
| `describe_tool` | Retrieves full schema and parameter details for a specific tool |
| `execute_tool` | Executes a tool with provided parameters |
This pattern reduces the number of tools the LLM must track while maintaining access to all MCP capabilities.
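To make the meta-tool idea concrete, here is a minimal sketch in Python (the actual implementation is TypeScript and talks to the MCP client). The `TOOLS` registry, the tool name `prowler_hub_list_checks`, and its handler are all hypothetical placeholders for tools discovered from the MCP Server.

```python
import json
from typing import Any

# Hypothetical registry standing in for the tools discovered from the MCP server.
TOOLS: dict[str, dict[str, Any]] = {
    "prowler_hub_list_checks": {
        "description": "List checks for a provider",
        "parameters": {"provider": "string"},
        "handler": lambda provider: [f"{provider}_example_check"],
    },
}


def describe_tool(tool_name: str) -> str:
    """Return the schema of one tool so the caller knows which arguments to send."""
    tool = TOOLS.get(tool_name)
    if tool is None:
        return json.dumps({"error": f"Unknown tool: {tool_name}"})
    return json.dumps(
        {
            "name": tool_name,
            "description": tool["description"],
            "parameters": tool["parameters"],
        }
    )


def execute_tool(tool_name: str, tool_input: dict[str, Any]) -> str:
    """Run one tool by name with the provided arguments."""
    tool = TOOLS.get(tool_name)
    if tool is None:
        return json.dumps({"error": f"Unknown tool: {tool_name}"})
    return json.dumps({"result": tool["handler"](**tool_input)})


print(describe_tool("prowler_hub_list_checks"))
print(execute_tool("prowler_hub_list_checks", {"provider": "aws"}))
```

The agent only ever sees two tool definitions, regardless of how many entries the registry holds.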
### Additional Library Modules
| Module | Location | Purpose |
| ------------------- | -------------------- | -------------------------------------------------------------------------- |
| `analyst-stream.ts` | `ui/lib/lighthouse/` | Transforms Langchain stream events to UI message format |
| `llm-factory.ts` | `ui/lib/lighthouse/` | Creates LLM instances for OpenAI, Bedrock, and OpenAI-compatible providers |
| `system-prompt.ts` | `ui/lib/lighthouse/` | System prompt template with dynamic tool listing injection |
| `auth-context.ts` | `ui/lib/lighthouse/` | AsyncLocalStorage for JWT token propagation across async boundaries |
| `types.ts` | `ui/lib/lighthouse/` | TypeScript type definitions |
| `constants.ts` | `ui/lib/lighthouse/` | Configuration constants and error messages |
| `utils.ts` | `ui/lib/lighthouse/` | Message conversion and model parameter extraction |
| `validation.ts` | `ui/lib/lighthouse/` | Input validation utilities |
| `data.ts` | `ui/lib/lighthouse/` | Current data section generation for context enrichment |
## API Route
**Location:** `ui/app/api/lighthouse/analyst/route.ts`
The API route handles chat requests and manages the streaming response pipeline:
1. **Request Parsing**: Extracts messages, model, and provider from request body
2. **Authentication**: Validates session and extracts access token
3. **Context Assembly**: Gathers business context and current data
4. **Agent Initialization**: Creates Langchain agent with runtime configuration
5. **Stream Processing**: Transforms agent events to UI-compatible format
6. **Error Handling**: Captures errors with Sentry integration
```typescript theme={null}
// Simplified route handler: construction of runtimeConfig is omitted
export async function POST(req: Request) {
  const { messages, model, provider } = await req.json();
  const session = await auth();
  if (!session?.accessToken) {
    return Response.json({ error: "Unauthorized" }, { status: 401 });
  }
  const accessToken = session.accessToken;
  return await authContextStorage.run(accessToken, async () => {
    const app = await initLighthouseWorkflow(runtimeConfig);
    const agentStream = app.streamEvents({ messages }, { version: "v2" });
    // Transform stream events to UI format
    const stream = new ReadableStream({
      async start(controller) {
        for await (const streamEvent of agentStream) {
          // Handle on_chat_model_stream, on_tool_start, on_tool_end, etc.
        }
      },
    });
    return createUIMessageStreamResponse({ stream });
  });
}
```
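The route relies on `authContextStorage.run(...)` to make the token available to nested async calls without passing it as an argument. The closest Python analogue is `contextvars.ContextVar`, used below purely as an illustration of the propagation idea. The helper names are hypothetical, and `prowler_hub_list_checks` is a made-up tool name.

```python
import asyncio
from contextvars import ContextVar

# Holds the token for the current request only (analogous to AsyncLocalStorage).
auth_token = ContextVar("auth_token", default=None)


async def call_tool(name: str) -> dict:
    await asyncio.sleep(0)  # yield to the event loop, as a real network call would
    headers = {}
    if name.startswith("prowler_app_"):
        headers["Authorization"] = f"Bearer {auth_token.get()}"
    return {"tool": name, "headers": headers}


async def handle_request(token: str, tool: str) -> dict:
    auth_token.set(token)  # visible only inside this task's context
    return await call_tool(tool)


async def main() -> list:
    # gather() runs each coroutine in its own task, each with its own context copy.
    return await asyncio.gather(
        handle_request("token-a", "prowler_app_search_findings"),
        handle_request("token-b", "prowler_app_search_findings"),
        handle_request("token-c", "prowler_hub_list_checks"),
    )


results = asyncio.run(main())
for item in results:
    print(item)
```

Each concurrent request sees only its own token, and tools outside the `prowler_app_` namespace receive no `Authorization` header.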
## Backend Components
Backend components handle LLM provider configuration, model management, and credential storage.
### Database Models
**Location:** `api/src/backend/api/models.py`
| Model | Purpose |
| --------------------------------- | --------------------------------------------------------------------------- |
| `LighthouseProviderConfiguration` | Per-tenant LLM provider credentials (encrypted with Fernet) |
| `LighthouseTenantConfiguration` | Tenant-level settings including business context and default provider/model |
| `LighthouseProviderModels` | Available models per provider configuration |
All models implement Row-Level Security (RLS) for tenant isolation.
#### LighthouseProviderConfiguration
Stores provider-specific credentials for each tenant:
* **provider\_type**: `openai`, `bedrock`, or `openai_compatible`
* **credentials**: Encrypted JSON containing API keys or AWS credentials
* **base\_url**: Custom endpoint for OpenAI-compatible providers
* **is\_active**: Connection validation status
#### LighthouseTenantConfiguration
Stores tenant-wide Lighthouse settings:
* **business\_context**: Optional context for personalized responses
* **default\_provider**: Default LLM provider type
* **default\_models**: JSON mapping provider types to default model IDs
#### LighthouseProviderModels
Catalogs available models for each provider:
* **model\_id**: Provider-specific model identifier
* **model\_name**: Human-readable display name
* **default\_parameters**: Optional model-specific parameters
### Background Jobs
**Location:** `api/src/backend/tasks/jobs/lighthouse_providers.py`
#### check\_lighthouse\_provider\_connection
Validates provider credentials by making a test API call:
* OpenAI: Lists models via `client.models.list()`
* Bedrock: Lists foundation models via `bedrock_client.list_foundation_models()`
* OpenAI-compatible: Lists models via custom base URL
Updates `is_active` status based on connection result.
#### refresh\_lighthouse\_provider\_models
Synchronizes available models from provider APIs:
* Fetches current model catalog from provider
* Filters out non-chat models (DALL-E, Whisper, TTS, embeddings)
* Upserts model records in `LighthouseProviderModels`
* Removes stale models no longer available
**Excluded OpenAI model prefixes:**
```python theme={null}
EXCLUDED_OPENAI_MODEL_PREFIXES = (
    "dall-e", "whisper", "tts-", "sora",
    "text-embedding", "text-moderation",
    # Legacy models
    "text-davinci", "davinci", "curie", "babbage", "ada",
)
```
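A prefix tuple like this can be applied with `str.startswith`, which accepts a tuple of prefixes. The following sketch shows one way the filtering step might look; `filter_chat_models` is a hypothetical helper name, not a function confirmed to exist in `lighthouse_providers.py`.

```python
EXCLUDED_OPENAI_MODEL_PREFIXES = (
    "dall-e", "whisper", "tts-", "sora",
    "text-embedding", "text-moderation",
    # Legacy models
    "text-davinci", "davinci", "curie", "babbage", "ada",
)


def filter_chat_models(model_ids: list[str]) -> list[str]:
    """Keep only model IDs that do not start with an excluded prefix."""
    # str.startswith returns True if any prefix in the tuple matches.
    return [
        model_id
        for model_id in model_ids
        if not model_id.startswith(EXCLUDED_OPENAI_MODEL_PREFIXES)
    ]


catalog = ["gpt-4o", "dall-e-3", "whisper-1", "tts-1", "text-embedding-3-small", "gpt-4o-mini"]
print(filter_chat_models(catalog))
```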
## MCP Server Integration
Lighthouse AI communicates with the Prowler MCP Server to access security data. For detailed MCP Server architecture, see [Extending the MCP Server](/developer-guide/mcp-server).
### Tool Namespacing
MCP tools are organized into three namespaces based on authentication requirements:
| Namespace | Auth Required | Description |
| ---------------- | ------------- | ----------------------------------------------------------------- |
| `prowler_app_*` | Yes (JWT) | Prowler Cloud/App tools for findings, providers, scans, resources |
| `prowler_hub_*` | No | Security checks catalog, compliance frameworks |
| `prowler_docs_*` | No | Documentation search and retrieval |
### Authentication Flow
1. User authenticates with Prowler App, receiving a JWT token
2. Token is stored in session and propagated via `authContextStorage`
3. MCP client injects `Authorization: Bearer ` header for `prowler_app_*` calls
4. MCP Server validates token and applies RLS filtering
### Tool Execution Pattern
The agent uses meta-tools rather than direct tool registration:
```
Agent needs data → describe_tool("prowler_app_search_findings")
→ Returns parameter schema → execute_tool with parameters
→ MCP client adds auth header → MCP Server executes
→ Results returned to agent → Agent continues reasoning
```
## Extension Points
### Adding New LLM Providers
To add a new LLM provider:
1. **Frontend**: Update `ui/lib/lighthouse/llm-factory.ts` with provider-specific initialization
2. **Backend**: Add provider type to `LighthouseProviderConfiguration.LLMProviderChoices`
3. **Jobs**: Add credential extraction and model fetching in `lighthouse_providers.py`
4. **UI**: Add connection workflow in `ui/components/lighthouse/workflow/`
### Modifying System Prompt
The system prompt template lives in `ui/lib/lighthouse/system-prompt.ts`. The `{{TOOL_LISTING}}` placeholder is dynamically replaced with available MCP tools during agent initialization.
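The placeholder substitution is plain string replacement. A minimal Python sketch of the idea follows; the template text, the tool names, and both helper functions are illustrative assumptions rather than the contents of `system-prompt.ts`.

```python
# Illustrative template; the real one lives in system-prompt.ts.
TEMPLATE = "You are a security analyst.\n\nAvailable tools:\n{{TOOL_LISTING}}"


def generate_tool_listing(tools: dict[str, str]) -> str:
    """Render one bullet per tool, sorted by name for a stable prompt."""
    return "\n".join(
        f"- {name}: {description}" for name, description in sorted(tools.items())
    )


def build_system_prompt(template: str, tools: dict[str, str]) -> str:
    return template.replace("{{TOOL_LISTING}}", generate_tool_listing(tools))


prompt = build_system_prompt(
    TEMPLATE,
    {
        "prowler_docs_search": "Search the documentation",  # hypothetical tool
        "prowler_app_search_findings": "Search findings",
    },
)
print(prompt)
```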
### Adding Stream Events
To handle new Langchain stream events, modify `ui/lib/lighthouse/analyst-stream.ts`. Current handlers include:
* `on_chat_model_stream`: Token-by-token text streaming
* `on_chat_model_end`: Model completion with tool call detection
* `on_tool_start`: Tool execution started
* `on_tool_end`: Tool execution completed
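One common way to organize such handlers is a dispatch table keyed by event name, so supporting a new event means registering one more function. The Python sketch below illustrates that pattern under stated assumptions: the event dictionaries are simplified, and the output message shapes (`text-delta`, `tool-start`, and so on) are hypothetical, not the format produced by `analyst-stream.ts`.

```python
from typing import Any, Callable, Iterable, Iterator

Event = dict[str, Any]


def on_chat_model_stream(event: Event) -> Event:
    return {"type": "text-delta", "text": event["data"]["chunk"]}


def on_chat_model_end(event: Event) -> Event:
    return {"type": "model-end", "has_tool_calls": bool(event["data"].get("tool_calls"))}


def on_tool_start(event: Event) -> Event:
    return {"type": "tool-start", "tool": event["name"]}


def on_tool_end(event: Event) -> Event:
    return {"type": "tool-end", "tool": event["name"]}


# Registering a new handler here is all that is needed to support a new event.
HANDLERS: dict[str, Callable[[Event], Event]] = {
    "on_chat_model_stream": on_chat_model_stream,
    "on_chat_model_end": on_chat_model_end,
    "on_tool_start": on_tool_start,
    "on_tool_end": on_tool_end,
}


def transform(events: Iterable[Event]) -> Iterator[Event]:
    for event in events:
        handler = HANDLERS.get(event["event"])
        if handler is not None:  # skip event types without a handler
            yield handler(event)


sample = [
    {"event": "on_chat_model_stream", "name": "llm", "data": {"chunk": "Hel"}},
    {"event": "on_tool_start", "name": "describe_tool", "data": {}},
    {"event": "on_chain_start", "name": "agent", "data": {}},
    {"event": "on_tool_end", "name": "describe_tool", "data": {}},
]
messages = list(transform(sample))
print(messages)
```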
### Adding MCP Tools
See [Extending the MCP Server](/developer-guide/mcp-server) for detailed instructions on adding new tools to the Prowler MCP Server.
## Configuration
### Environment Variables
| Variable | Description |
| ------------------------ | --------------------------------------------------------- |
| `PROWLER_MCP_SERVER_URL` | MCP server endpoint (e.g., `https://mcp.prowler.com/mcp`) |
### Database Configuration
Provider credentials are stored encrypted in `LighthouseProviderConfiguration`:
* **OpenAI**: `{"api_key": "sk-..."}`
* **Bedrock**: `{"access_key_id": "...", "secret_access_key": "...", "region": "us-east-1"}` or `{"api_key": "...", "region": "us-east-1"}`
* **OpenAI-compatible**: `{"api_key": "..."}` with `base_url` field
### Tenant Configuration
Business context and default settings are stored in `LighthouseTenantConfiguration`:
```python theme={null}
{
    "business_context": "Optional organization context for personalized responses",
    "default_provider": "openai",
    "default_models": {
        "openai": "gpt-4o",
        "bedrock": "anthropic.claude-3-5-sonnet-20240620-v1:0"
    }
}
```
## Related Documentation
* Adding new tools to the Prowler MCP Server
* Capabilities, FAQs, and limitations
* Configuring multiple LLM providers
* User-facing architecture and setup guide
# LLM Provider
Source: https://docs.prowler.com/developer-guide/llm-details
This page details the [Large Language Model (LLM)](https://en.wikipedia.org/wiki/Large_language_model) provider implementation in Prowler.
The LLM provider enables security testing of language models using red team techniques. By default, Prowler uses the built-in LLM configuration that targets OpenAI models with comprehensive security test suites. To configure it, follow the [LLM getting started guide](/user-guide/providers/llm/getting-started-llm).
## LLM Provider Classes Architecture
The LLM provider implementation follows the general [Provider structure](/developer-guide/provider). This section focuses on the LLM-specific implementation, highlighting how the generic provider concepts are realized for LLM security testing in Prowler. For a full overview of the provider pattern, base classes, and extension guidelines, see [Provider documentation](/developer-guide/provider).
### Main Class
* **Location:** [`prowler/providers/llm/llm_provider.py`](https://github.com/prowler-cloud/prowler/blob/master/prowler/providers/llm/llm_provider.py)
* **Base Class:** Inherits from `Provider` (see [base class details](https://github.com/prowler-cloud/prowler/blob/master/prowler/providers/common/provider.py)).
* **Purpose:** Central orchestrator for LLM-specific logic, configuration management, and integration with promptfoo for red team testing.
* **Key LLM Responsibilities:**
  * Initializes and manages LLM configuration using promptfoo.
  * Validates configuration and sets up the LLM testing context.
  * Loads and manages red team test configuration, plugins, and target models.
  * Provides properties and methods for downstream LLM security testing.
  * Integrates with promptfoo for comprehensive LLM security evaluation.
### Data Models
* **Location:** [`prowler/providers/llm/models.py`](https://github.com/prowler-cloud/prowler/blob/master/prowler/providers/llm/models.py)
* **Purpose:** Define structured data for LLM output options and configuration.
* **Key LLM Models:**
  * `LLMOutputOptions`: Customizes output filename logic for LLM-specific reporting.
### LLM Security Testing Integration
* **Location:** [`prowler/providers/llm/llm_provider.py`](https://github.com/prowler-cloud/prowler/blob/master/prowler/providers/llm/llm_provider.py)
* **Purpose:** Integrates with promptfoo for comprehensive LLM security testing.
* **Key LLM Responsibilities:**
  * Executes promptfoo red team evaluations against target LLMs.
  * Processes security test results and converts them to Prowler reports.
  * Manages test concurrency and progress tracking.
  * Handles real-time streaming of test results.
### Configuration Management
The LLM provider uses promptfoo configuration files to define:
* **Target Models**: The LLM models to test (e.g., OpenAI GPT, Anthropic Claude)
* **Red Team Plugins**: Security test suites (OWASP, MITRE, NIST, EU AI Act)
* **Test Parameters**: Concurrency, test counts, and evaluation criteria
### Default Configuration
Prowler includes a comprehensive default LLM configuration that:
* Targets OpenAI models by default
* Includes multiple security test frameworks (OWASP, MITRE, NIST, EU AI Act)
* Provides extensive test coverage for LLM security vulnerabilities
* Supports custom configuration for specific testing needs
## Specific Patterns in LLM Security Testing
The LLM provider implements security testing through integration with promptfoo, following these patterns:
### Red Team Testing Framework
* **Plugin-based Architecture**: Uses promptfoo plugins for different security test categories
* **Comprehensive Coverage**: Includes OWASP LLM Top 10, MITRE ATLAS, NIST AI Risk Management, and EU AI Act compliance
* **Real-Time Evaluation**: Streams test results as they are generated
* **Progress Tracking**: Provides detailed progress information during test execution
### Test Execution Flow
1. **Configuration Loading**: Loads promptfoo configuration with target models and test plugins
2. **Test Generation**: Generates security test cases based on configured plugins
3. **Concurrent Execution**: Runs tests with configurable concurrency limits
4. **Result Processing**: Converts promptfoo results to Prowler security reports
5. **Progress Monitoring**: Tracks and displays test execution progress
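The concurrency-limited execution and result conversion steps can be sketched with `asyncio`. This is only an illustration of the pattern: it does not invoke promptfoo, and `run_test`, `Report`, the test identifiers, and the pass/fail rule are all hypothetical.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Report:
    test_id: str
    status: str  # "PASS" or "FAIL"


async def run_test(test_id: str) -> dict:
    """Hypothetical stand-in for a single red team probe."""
    await asyncio.sleep(0)
    # Arbitrary rule for the demo: probes ending in "injection" fail.
    return {"id": test_id, "success": not test_id.endswith("injection")}


async def run_all(test_ids: list[str], max_concurrency: int = 2) -> list[Report]:
    """Run probes with a concurrency cap and convert raw results to reports."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def bounded(test_id: str) -> Report:
        async with semaphore:  # at most max_concurrency probes run at once
            raw = await run_test(test_id)
        return Report(raw["id"], "PASS" if raw["success"] else "FAIL")

    # gather() returns results in the same order as the inputs.
    return await asyncio.gather(*(bounded(t) for t in test_ids))


reports = asyncio.run(run_all(["pii-leak", "prompt-injection", "jailbreak"]))
```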
### Security Test Categories
The LLM provider supports comprehensive security testing across multiple frameworks:
* **OWASP LLM Top 10**: Covers prompt injection, data leakage, and model security
* **MITRE ATLAS**: Adversarial threat landscape for AI systems
* **NIST AI Risk Management**: AI system risk assessment and mitigation
* **EU AI Act**: European Union AI regulation compliance
* **Custom Tests**: Support for organization-specific security requirements
## Error Handling and Validation
The LLM provider includes comprehensive error handling for:
* **Configuration Validation**: Ensures valid promptfoo configuration files
* **Model Access**: Handles authentication and access issues with target LLMs
* **Test Execution**: Manages test failures and timeout scenarios
* **Result Processing**: Handles malformed or incomplete test results
## Integration with Prowler Ecosystem
The LLM provider seamlessly integrates with Prowler's existing infrastructure:
* **Output Formats**: Supports all Prowler output formats (JSON, CSV, HTML, etc.)
* **Compliance Frameworks**: Integrates with Prowler's compliance reporting
* **Fixer Integration**: Supports automated remediation recommendations
* **Dashboard Integration**: Compatible with Prowler App for centralized management
# Microsoft 365 (M365) Provider
Source: https://docs.prowler.com/developer-guide/m365-details
This page details the [Microsoft 365 (M365)](https://www.microsoft.com/en-us/microsoft-365) provider implementation in Prowler.
By default, Prowler will audit the Microsoft Entra ID tenant and its supported services. To configure it, follow the [M365 getting started guide](/user-guide/providers/microsoft365/getting-started-m365).
***
## PowerShell Requirements for M365 Checks
> **Most Microsoft 365 checks in Prowler require PowerShell, not just the Microsoft Graph API.**
* **PowerShell is essential** for retrieving data from Exchange Online, Teams, Defender, Purview, and other M365 services. Many checks cannot be performed using only the Graph API.
* **PowerShell 7.4 or higher is required** (7.5 recommended). PowerShell 5.1 and earlier versions are not supported for M365 checks.
* **Required modules:**
  * [ExchangeOnlineManagement](https://www.powershellgallery.com/packages/ExchangeOnlineManagement/3.6.0) (≥ 3.6.0)
  * [MicrosoftTeams](https://www.powershellgallery.com/packages/MicrosoftTeams/6.6.0) (≥ 6.6.0)
* If you use Prowler Cloud or the official containers, PowerShell is pre-installed. For local or pip installations, you must install PowerShell and the modules yourself. See [Authentication: Supported PowerShell Versions](/user-guide/providers/microsoft365/authentication#supported-powershell-versions) and [Needed PowerShell Modules](/user-guide/providers/microsoft365/authentication#required-powershell-modules).
* For more details and troubleshooting, see [Use of PowerShell in M365](/user-guide/providers/microsoft365/use-of-powershell).
***
## M365 Provider Classes Architecture
The M365 provider implementation follows the general [Provider structure](/developer-guide/provider). This section focuses on the M365-specific implementation, highlighting how the generic provider concepts are realized for M365 in Prowler. For a full overview of the provider pattern, base classes, and extension guidelines, see [Provider documentation](/developer-guide/provider).
### `M365Provider` (Main Class)
* **Location:** [`prowler/providers/m365/m365_provider.py`](https://github.com/prowler-cloud/prowler/blob/master/prowler/providers/m365/m365_provider.py)
* **Base Class:** Inherits from `Provider` (see [base class details](https://github.com/prowler-cloud/prowler/blob/master/prowler/providers/common/provider.py)).
* **Purpose:** Central orchestrator for M365-specific logic, session management, credential validation, region/authority configuration, and identity context.
* **Key M365 Responsibilities:**
  * Initializes and manages M365 sessions (supports Service Principal, environment variables, Azure CLI, browser, and user/password authentication).
  * Validates credentials and sets up the M365 identity context.
  * Manages the Microsoft Graph API client and the PowerShell client.
  * Loads and manages configuration, mutelist, and fixer settings.
  * Provides properties and methods for downstream M365 service classes to access session, identity, and configuration data.
### Data Models
* **Location:** [`prowler/providers/m365/models.py`](https://github.com/prowler-cloud/prowler/blob/master/prowler/providers/m365/models.py)
* **Purpose:** Define structured data for M365 identity, session, region configuration, and credentials.
* **Key M365 Models:**
  * `M365IdentityInfo`: Holds M365 identity metadata, including tenant ID, domain(s), user, and location.
  * `M365RegionConfig`: Stores the specific region/authority and API base URL for the tenant.
  * `M365Credentials`: Represents credentials for authentication (user, password, client ID, client secret, tenant ID, etc.).
### `M365Service` (Service Base Class)
* **Location:** [`prowler/providers/m365/lib/service/service.py`](https://github.com/prowler-cloud/prowler/blob/master/prowler/providers/m365/lib/service/service.py)
* **Purpose:** Abstract base class for all M365 service-specific classes.
* **Key M365 Responsibilities:**
  * Receives an `M365Provider` instance to access session, identity, and configuration.
  * Manages the Microsoft Graph API client for the service.
  * Initializes a PowerShell client for most services if credentials and identity are available.
  * Exposes common audit context (`audit_config`, `fixer_config`) to subclasses.
### Exception Handling
* **Location:** [`prowler/providers/m365/exceptions/exceptions.py`](https://github.com/prowler-cloud/prowler/blob/master/prowler/providers/m365/exceptions/exceptions.py)
* **Purpose:** Custom exception classes for M365-specific error handling, such as credential, session, region, and argument errors.
### Session and Utility Helpers
* **Location:** [`prowler/providers/m365/lib/`](https://github.com/prowler-cloud/prowler/blob/master/prowler/providers/m365/lib/)
* **Purpose:** Helpers for argument parsing, region/authority setup, mutelist management, PowerShell integration, and other cross-cutting concerns.
> **Key File: [`m365_powershell.py`](https://github.com/prowler-cloud/prowler/blob/master/prowler/providers/m365/lib/powershell/m365_powershell.py)**
>
> This is the core module for Microsoft 365 PowerShell integration. It manages authentication, session handling, and provides a comprehensive set of methods for interacting with Microsoft Teams, Exchange Online, and Defender policies via PowerShell.
>
> This module provides secure credential management and authentication using MSAL and PowerShell. It handles automated installation and initialization of required PowerShell modules. The module offers a rich set of methods for retrieving and managing Teams, Exchange, and Defender configurations. It serves as the central component for all M365 provider operations that require PowerShell automation.
## Specific Patterns in M365 Services
The generic service pattern is described in the [service page](/developer-guide/services#service-structure-and-initialisation). All currently implemented services can be found in the following locations:
* In the code, under [`prowler/providers/m365/services/`](https://github.com/prowler-cloud/prowler/tree/master/prowler/providers/m365/services)
* In the [Prowler Hub](https://hub.prowler.com/) for a more human-readable view.
The best way to learn how to implement a new service is to follow the [service implementation documentation](/developer-guide/services#adding-a-new-service) and use existing services as a reference.
### M365 Service Common Patterns
* Services communicate with Microsoft 365 using the Microsoft Graph API **and/or PowerShell**. See the [official documentation](https://learn.microsoft.com/en-us/graph/api/overview) and [PowerShell reference](https://learn.microsoft.com/en-us/powershell/).
* Every M365 service class inherits from `M365Service`, ensuring access to session, identity, configuration, and client utilities.
* The constructor (`__init__`) always calls `super().__init__` with the provider object, and initializes the Graph client and the PowerShell client.
* Resource containers **must** be initialized in the constructor, typically as objects that represent the different settings of the service.
* All M365 resources are represented as Pydantic `BaseModel` classes, providing type safety and structured access to resource attributes.
* Microsoft Graph API and PowerShell calls are wrapped in try/except blocks, always logging errors.
* Retrieving some data often requires adding a new method to the `m365_powershell.py` file, which the service then calls.
## Specific Patterns in M365 Checks
The M365 checks pattern is described in the [checks page](/developer-guide/checks). All currently implemented checks can be found in the following locations:
* In the code, within each service folder. Each check has its own folder named after the check (e.g. [`prowler/providers/m365/services/entra/entra_users_mfa_enabled/`](https://github.com/prowler-cloud/prowler/tree/master/prowler/providers/m365/services/entra/entra_users_mfa_enabled)).
* In the [Prowler Hub](https://hub.prowler.com/) for a more human-readable view.
The best way to learn how to implement a new check is to follow the [M365 check implementation documentation](/developer-guide/checks#creating-a-check) and use existing checks as a reference.
### Check Report Class
The `CheckReportM365` class models a single finding for a Microsoft 365 resource in a check report. It is defined in [`prowler/lib/check/models.py`](https://github.com/prowler-cloud/prowler/blob/master/prowler/lib/check/models.py) and inherits from the generic `Check_Report` base class.
#### Purpose
`CheckReportM365` extends the base report structure with M365-specific fields, enabling detailed tracking of the resource, name, and location associated with each finding.
#### Constructor and Attribute Population
When you instantiate `CheckReportM365`, you must provide the check metadata and a resource object. The M365-specific attributes are set as follows:
* **`resource_id`**: A required field that **must** be explicitly set in the constructor to identify the resource being checked.
* **`resource_name`**: A required field that **must** be explicitly set in the constructor to provide a human-readable name for the resource.
* **`location`**: An optional field that can be explicitly set in the constructor to indicate where the resource is located. If not specified, it defaults to `"global"`.
If the resource object does not contain the required attributes, you must set them manually in the check logic.
Other attributes are inherited from the `Check_Report` class, from which you **always** have to set the `status` and `status_extended` attributes in the check logic.
#### Example Usage
```python theme={null}
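report = CheckReportM365(
    metadata=check_metadata,
    resource=resource_object,
    # Required fields; the attribute names on the resource object are
    # illustrative and depend on the resource model being checked.
    resource_name=resource_object.name,
    resource_id=resource_object.id,
)
report.status = "PASS"
report.status_extended = "Resource is compliant."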
```
# Extending the MCP Server
Source: https://docs.prowler.com/developer-guide/mcp-server
This guide explains how to extend the Prowler MCP Server with new tools and features.
**New to Prowler MCP Server?** Start with the user documentation:
* [Overview](/getting-started/products/prowler-mcp) - Key capabilities, use cases, and deployment options
* [Installation](/getting-started/installation/prowler-mcp) - Install locally or use the managed server
* [Configuration](/getting-started/basic-usage/prowler-mcp) - Configure Claude Desktop, Cursor, and other MCP hosts
* [Tools Reference](/getting-started/basic-usage/prowler-mcp-tools) - Complete list of all available tools
## Introduction
The Prowler MCP Server brings the entire Prowler ecosystem to AI assistants through the [Model Context Protocol (MCP)](https://modelcontextprotocol.io). It enables seamless integration with AI tools like Claude Desktop, Cursor, and other MCP clients.
The server follows a modular architecture with three independent sub-servers:
| Sub-Server | Auth Required | Description |
| --------------------- | ------------- | -------------------------------------------------------------------------------------------- |
| Prowler App | Yes | Full access to Prowler Cloud and Self-Managed features |
| Prowler Hub | No | Security checks catalog with **over 1000 checks**, fixers, and **70+ compliance frameworks** |
| Prowler Documentation | No | Full-text search and retrieval of official documentation |
For a complete list of tools and their descriptions, see the [Tools Reference](/getting-started/basic-usage/prowler-mcp-tools).
## Architecture Overview
The MCP Server architecture is illustrated in the [Overview documentation](/getting-started/products/prowler-mcp#mcp-server-architecture). AI assistants connect through the MCP protocol to access Prowler's three main components.
### Server Structure
The main server orchestrates three sub-servers with prefixed namespacing:
```
mcp_server/prowler_mcp_server/
├── server.py                 # Main orchestrator
├── main.py                   # CLI entry point
├── prowler_hub/
├── prowler_app/
│   ├── tools/                # Tool implementations
│   ├── models/               # Pydantic models
│   └── utils/                # API client, auth, loader
└── prowler_documentation/
```
### Tool Registration Patterns
The MCP Server uses two patterns for tool registration:
1. **Direct Decorators** (Prowler Hub/Docs): Tools are registered using `@mcp.tool()` decorators
2. **Auto-Discovery** (Prowler App): All public methods of `BaseTool` subclasses are auto-registered
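To make the auto-discovery idea concrete, here is a minimal sketch using only the standard library. It is not the actual `tool_loader.py` logic: `BaseTool` below is an empty stand-in, `discover_tools` is a hypothetical helper, and it only inspects direct subclasses.

```python
import inspect


class BaseTool:
    """Empty stand-in for the real base class."""


class FindingsTools(BaseTool):
    async def search_findings(self) -> dict:
        return {}

    async def get_finding(self) -> dict:
        return {}

    async def _build_query(self) -> dict:
        # Leading underscore: treated as private and not registered.
        return {}


def discover_tools(base: type) -> dict[str, list[str]]:
    """Map each direct subclass to its public coroutine method names."""
    registry: dict[str, list[str]] = {}
    for cls in base.__subclasses__():
        registry[cls.__name__] = sorted(
            name
            for name, _ in inspect.getmembers(cls, inspect.iscoroutinefunction)
            if not name.startswith("_")
        )
    return registry


registry = discover_tools(BaseTool)
```

With this convention, adding a public `async` method to a tool class is enough to expose it, while helpers stay hidden behind a leading underscore.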
## Adding Tools to Prowler App
### Step 1: Create the Tool Class
Create a new file or add to an existing file in `prowler_app/tools/`:
```python theme={null}
# prowler_app/tools/new_feature.py
from typing import Any

from pydantic import Field

from prowler_mcp_server.prowler_app.models.new_feature import (
    FeatureListResponse,
    DetailedFeature,
)
from prowler_mcp_server.prowler_app.tools.base import BaseTool


class NewFeatureTools(BaseTool):
    """Tools for managing new features."""

    async def list_features(
        self,
        status: str | None = Field(
            default=None,
            description="Filter by status (active, inactive, pending)"
        ),
        page_size: int = Field(
            default=50,
            description="Number of results per page (1-100)"
        ),
    ) -> dict[str, Any]:
        """List all features with optional filtering.

        Returns a lightweight list of features optimized for LLM consumption.
        Use get_feature for complete information about a specific feature.
        """
        # Validate parameters
        self.api_client.validate_page_size(page_size)

        # Build query parameters
        params: dict[str, Any] = {"page[size]": page_size}
        if status:
            params["filter[status]"] = status

        # Make API request
        clean_params = self.api_client.build_filter_params(params)
        response = await self.api_client.get("/api/v1/features", params=clean_params)

        # Transform to LLM-friendly format
        return FeatureListResponse.from_api_response(response).model_dump()

    async def get_feature(
        self,
        feature_id: str = Field(description="The UUID of the feature"),
    ) -> dict[str, Any]:
        """Get detailed information about a specific feature.

        Returns complete feature details including configuration and metadata.
        """
        try:
            response = await self.api_client.get(f"/api/v1/features/{feature_id}")
            return DetailedFeature.from_api_response(response["data"]).model_dump()
        except Exception as e:
            self.logger.error(f"Failed to get feature {feature_id}: {e}")
            return {"error": str(e), "status": "failed"}
```
### Step 2: Create the Models
Create corresponding models in `prowler_app/models/`:
```python theme={null}
# prowler_app/models/new_feature.py
from typing import Any

from pydantic import Field

from prowler_mcp_server.prowler_app.models.base import MinimalSerializerMixin


class SimplifiedFeature(MinimalSerializerMixin):
    """Lightweight feature for list operations."""

    id: str = Field(description="Unique feature identifier")
    name: str = Field(description="Feature name")
    status: str = Field(description="Current status")

    @classmethod
    def from_api_response(cls, data: dict[str, Any]) -> "SimplifiedFeature":
        """Transform API response to simplified format."""
        attributes = data.get("attributes", {})
        return cls(
            id=data["id"],
            name=attributes["name"],
            status=attributes["status"],
        )


class DetailedFeature(SimplifiedFeature):
    """Extended feature with complete details."""

    description: str | None = Field(default=None, description="Feature description")
    configuration: dict[str, Any] | None = Field(default=None, description="Configuration")
    created_at: str = Field(description="Creation timestamp")
    updated_at: str = Field(description="Last update timestamp")

    @classmethod
    def from_api_response(cls, data: dict[str, Any]) -> "DetailedFeature":
        """Transform API response to detailed format."""
        attributes = data.get("attributes", {})
        return cls(
            id=data["id"],
            name=attributes["name"],
            status=attributes["status"],
            description=attributes.get("description"),
            configuration=attributes.get("configuration"),
            created_at=attributes["created_at"],
            updated_at=attributes["updated_at"],
        )


class FeatureListResponse(MinimalSerializerMixin):
    """Response wrapper for feature list operations."""

    count: int = Field(description="Total number of features")
    features: list[SimplifiedFeature] = Field(description="List of features")

    @classmethod
    def from_api_response(cls, response: dict[str, Any]) -> "FeatureListResponse":
        """Transform API response to list format."""
        data = response.get("data", [])
        features = [SimplifiedFeature.from_api_response(item) for item in data]
        return cls(count=len(features), features=features)
```
### Step 3: Verify Auto-Discovery
No manual registration is needed. The `tool_loader.py` automatically discovers and registers all `BaseTool` subclasses. Verify your tool is loaded by checking the server logs:
```
INFO - Auto-registered 2 tools from NewFeatureTools
INFO - Loaded and registered: NewFeatureTools
```
## Adding Tools to Prowler Hub/Docs
For Prowler Hub or Documentation tools, use the `@mcp.tool()` decorator directly:
```python theme={null}
# prowler_hub/server.py
from fastmcp import FastMCP

hub_mcp_server = FastMCP("prowler-hub")


@hub_mcp_server.tool()
async def get_new_artifact(
    artifact_id: str,
) -> dict:
    """Fetch a specific artifact from Prowler Hub.

    Args:
        artifact_id: The unique identifier of the artifact

    Returns:
        Dictionary containing artifact details
    """
    # prowler_hub_client is assumed to be the HTTP client defined elsewhere
    # in this module.
    response = prowler_hub_client.get(f"/artifact/{artifact_id}")
    response.raise_for_status()
    return response.json()
```
## Model Design Patterns
### MinimalSerializerMixin
All models should use `MinimalSerializerMixin` to optimize responses for LLM consumption:
```python theme={null}
from prowler_mcp_server.prowler_app.models.base import MinimalSerializerMixin


class MyModel(MinimalSerializerMixin):
    """Model that excludes empty values from serialization."""

    required_field: str
    optional_field: str | None = None  # Excluded if None
    empty_list: list = []  # Excluded if empty
```
This mixin automatically excludes:
* `None` values
* Empty strings
* Empty lists
* Empty dictionaries
### Two-Tier Model Pattern
Use two-tier models for efficient responses:
* **Simplified**: Lightweight models for list operations
* **Detailed**: Extended models for single-item retrieval
```python theme={null}
class SimplifiedItem(MinimalSerializerMixin):
    """Use for list operations - minimal fields."""

    id: str
    name: str
    status: str


class DetailedItem(SimplifiedItem):
    """Use for get operations - extends simplified with details."""

    description: str | None = None
    configuration: dict | None = None
    created_at: str
    updated_at: str
```
### Factory Method Pattern
Always implement `from_api_response()` for API transformation:
```python theme={null}
@classmethod
def from_api_response(cls, data: dict[str, Any]) -> "MyModel":
    """Transform API response to model.

    This method handles the JSON:API format used by Prowler API,
    extracting attributes and relationships as needed.
    """
    attributes = data.get("attributes", {})
    return cls(
        id=data["id"],
        name=attributes["name"],
        # ... map other fields
    )
```
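For a runnable view of the same factory pattern, the sketch below applies it to a sample JSON:API-style payload. A plain dataclass stands in for the Pydantic model so the example has no dependencies, and the payload contents are made up for illustration.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class MyModel:
    id: str
    name: str

    @classmethod
    def from_api_response(cls, data: dict[str, Any]) -> "MyModel":
        """Flatten a JSON:API resource object into the model's fields."""
        attributes = data.get("attributes", {})
        return cls(id=data["id"], name=attributes["name"])


# Hypothetical response body: the resource sits under the top-level "data" key.
payload = {"data": {"type": "items", "id": "42", "attributes": {"name": "demo"}}}
model = MyModel.from_api_response(payload["data"])
```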
## API Client Usage
The `ProwlerAPIClient` is a singleton that handles authentication and HTTP requests:
```python theme={null}
class MyTools(BaseTool):
    async def my_tool(self) -> dict:
        # GET request
        response = await self.api_client.get("/api/v1/endpoint", params={"key": "value"})

        # POST request
        response = await self.api_client.post(
            "/api/v1/endpoint",
            json_data={"data": {"type": "items", "attributes": {...}}}
        )

        # PATCH request
        response = await self.api_client.patch(
            f"/api/v1/endpoint/{id}",
            json_data={"data": {"attributes": {...}}}
        )

        # DELETE request
        response = await self.api_client.delete(f"/api/v1/endpoint/{id}")
```
### Helper Methods
The API client provides useful helper methods:
```python theme={null}
# Validate page size (1-1000)
self.api_client.validate_page_size(page_size)

# Normalize date range with max days limit
date_range = self.api_client.normalize_date_range(date_from, date_to, max_days=2)

# Build filter parameters (handles type conversion)
clean_params = self.api_client.build_filter_params({
    "filter[status]": "active",
    "filter[severity__in]": ["high", "critical"],  # Converts to comma-separated
    "filter[muted]": True,  # Converts to "true"
})

# Poll async task until completion
result = await self.api_client.poll_task_until_complete(
    task_id=task_id,
    timeout=60,
    poll_interval=1.0
)
```
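The type conversions noted in the comments above (lists to comma-separated strings, booleans to lowercase strings) could look roughly like the following. This is a standalone sketch, not the client's code: `sketch_build_filter_params` is a hypothetical name, and dropping `None` values is an assumption.

```python
from typing import Any


def sketch_build_filter_params(params: dict[str, Any]) -> dict[str, Any]:
    """Convert Python values into query-string friendly values."""
    clean: dict[str, Any] = {}
    for key, value in params.items():
        if value is None:
            continue  # assumption: unset filters are omitted
        if isinstance(value, bool):
            clean[key] = str(value).lower()  # True -> "true"
        elif isinstance(value, (list, tuple)):
            clean[key] = ",".join(str(item) for item in value)
        else:
            clean[key] = value
    return clean


clean_params = sketch_build_filter_params(
    {
        "filter[status]": "active",
        "filter[severity__in]": ["high", "critical"],
        "filter[muted]": True,
        "filter[region]": None,
    }
)
```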
## Best Practices
### Tool Docstrings
Tool docstrings become the tool descriptions that the LLM reads. Provide clear usage instructions and common workflows:
```python theme={null}
async def search_items(self, status: str = Field(...)) -> dict:
    """Search items with advanced filtering.

    Returns a lightweight list optimized for LLM consumption.
    Use get_item for complete details about a specific item.

    Common workflows:
    - Find critical items: status="critical"
    - Find recent items: Use date_from parameter
    """
```
### Error Handling
Return structured error responses instead of raising exceptions:
```python theme={null}
async def get_item(self, item_id: str) -> dict:
    try:
        response = await self.api_client.get(f"/api/v1/items/{item_id}")
        return DetailedItem.from_api_response(response["data"]).model_dump()
    except Exception as e:
        self.logger.error(f"Failed to get item {item_id}: {e}")
        return {"error": str(e), "status": "failed"}
```
### Parameter Descriptions
Use Pydantic `Field()` with clear descriptions. This also helps LLMs understand
the purpose of each parameter, so be as descriptive as possible:
```python theme={null}
async def list_items(
    self,
    severity: list[str] = Field(
        default=[],
        description="Filter by severity levels (critical, high, medium, low)"
    ),
    status: str | None = Field(
        default=None,
        description="Filter by status (PASS, FAIL, MANUAL)"
    ),
    page_size: int = Field(
        default=50,
        description="Results per page"
    ),
) -> dict:
    ...  # Tool body omitted; only the parameter declarations matter here
```
## Development Commands
```bash theme={null}
# Navigate to MCP server directory
cd mcp_server
# Run in STDIO mode (default)
uv run prowler-mcp
# Run in HTTP mode
uv run prowler-mcp --transport http --host 0.0.0.0 --port 8000
# Run with environment variables
PROWLER_APP_API_KEY="pk_xxx" uv run prowler-mcp
```
For complete installation and deployment options, see:
* [Installation Guide](/getting-started/installation/prowler-mcp#from-source-development) - Development setup instructions
* [Configuration Guide](/getting-started/basic-usage/prowler-mcp) - MCP client configuration
For development, the [Model Context Protocol Inspector](https://github.com/modelcontextprotocol/inspector) is recommended as an MCP client to test and debug your tools.
## Related Documentation
* Key capabilities, use cases, and deployment options
* Complete reference of all available tools
* Security checks and compliance frameworks catalog
* AI-powered security analyst
## Additional Resources
* [MCP Protocol Specification](https://modelcontextprotocol.io) - Model Context Protocol details
* [Prowler API Documentation](https://api.prowler.com/api/v1/docs) - API reference
* [Prowler Hub API](https://hub.prowler.com/api/docs) - Hub API reference
* [GitHub Repository](https://github.com/prowler-cloud/prowler) - Source code
# Create a Custom Output Format
Source: https://docs.prowler.com/developer-guide/outputs
## Introduction
Prowler supports multiple output formats, allowing users to tailor findings presentation to their needs. Custom output formats are valuable when integrating Prowler with third-party tools, generating specialized reports, or adapting data for specific workflows. By defining a custom output format, users can refine how findings are structured, extracting and displaying only the most relevant information.
* Output Organization in Prowler

  Prowler outputs are managed within the `/lib/outputs` directory. Each format, such as JSON, CSV, HTML, or SARIF, is implemented as a Python class.

  * Outputs are generated based on scan findings, which are stored as structured dictionaries containing details such as:
    * Resource IDs
    * Severities
    * Descriptions
    * Other relevant metadata

* Creation Guidelines

  Refer to the [Prowler Developer Guide](https://docs.prowler.com/projects/prowler-open-source/en/latest/) for insights into Prowler’s architecture and best practices for creating custom outputs.

  * Identify the most suitable integration method for the output being targeted.
## Steps to Create a Custom Output Format
### Schema
* Output Class:
  * The class must inherit from `Output`. Review the [Output Class](https://github.com/prowler-cloud/prowler/blob/master/prowler/lib/outputs/output.py).
  * Create a class that encapsulates the required attributes and methods for interacting with the target platform. The code for the `CSV` class is shown below:
```python title="CSV Class" theme={null}
class CSV(Output):
    def transform(self, findings: List[Finding]) -> None:
        """Transforms the findings into the CSV format.

        Args:
            findings (list[Finding]): a list of Finding objects
        """
        ...
```
* Transform Method:
* This method will transform the findings provided by Prowler to a specific format.
#### Method Implementation
The following example demonstrates the `transform` method for the `CSV` class:
```python title="Transform" theme={null}
def transform(self, findings: List[Finding]) -> None:
    """Transforms the findings into the CSV format.
    Args:
        findings (list[Finding]): a list of Finding objects
    """
    try:
        for finding in findings:
            finding_dict = {}
            finding_dict["AUTH_METHOD"] = finding.auth_method
            finding_dict["TIMESTAMP"] = finding.timestamp
            finding_dict["ACCOUNT_UID"] = finding.account_uid
            finding_dict["ACCOUNT_NAME"] = finding.account_name
            finding_dict["ACCOUNT_EMAIL"] = finding.account_email
            finding_dict["ACCOUNT_ORGANIZATION_UID"] = (
                finding.account_organization_uid
            )
            finding_dict["ACCOUNT_ORGANIZATION_NAME"] = (
                finding.account_organization_name
            )
            finding_dict["ACCOUNT_TAGS"] = unroll_dict(
                finding.account_tags, separator=":"
            )
            finding_dict["FINDING_UID"] = finding.uid
            finding_dict["PROVIDER"] = finding.metadata.Provider
            finding_dict["CHECK_ID"] = finding.metadata.CheckID
            finding_dict["CHECK_TITLE"] = finding.metadata.CheckTitle
            finding_dict["CHECK_TYPE"] = unroll_list(finding.metadata.CheckType)
            finding_dict["STATUS"] = finding.status.value
            finding_dict["STATUS_EXTENDED"] = finding.status_extended
            finding_dict["MUTED"] = finding.muted
            finding_dict["SERVICE_NAME"] = finding.metadata.ServiceName
            finding_dict["SUBSERVICE_NAME"] = finding.metadata.SubServiceName
            finding_dict["SEVERITY"] = finding.metadata.Severity.value
            finding_dict["RESOURCE_TYPE"] = finding.metadata.ResourceType
            finding_dict["RESOURCE_UID"] = finding.resource_uid
            finding_dict["RESOURCE_NAME"] = finding.resource_name
            finding_dict["RESOURCE_DETAILS"] = finding.resource_details
            finding_dict["RESOURCE_TAGS"] = unroll_dict(finding.resource_tags)
            finding_dict["PARTITION"] = finding.partition
            finding_dict["REGION"] = finding.region
            finding_dict["DESCRIPTION"] = finding.metadata.Description
            finding_dict["RISK"] = finding.metadata.Risk
            finding_dict["RELATED_URL"] = finding.metadata.RelatedUrl
            finding_dict["ADDITIONAL_URLS"] = unroll_list(finding.metadata.AdditionalURLs)
            finding_dict["REMEDIATION_RECOMMENDATION_TEXT"] = (
                finding.metadata.Remediation.Recommendation.Text
            )
            finding_dict["REMEDIATION_RECOMMENDATION_URL"] = (
                finding.metadata.Remediation.Recommendation.Url
            )
            finding_dict["REMEDIATION_CODE_NATIVEIAC"] = (
                finding.metadata.Remediation.Code.NativeIaC
            )
            finding_dict["REMEDIATION_CODE_TERRAFORM"] = (
                finding.metadata.Remediation.Code.Terraform
            )
            finding_dict["REMEDIATION_CODE_CLI"] = (
                finding.metadata.Remediation.Code.CLI
            )
            finding_dict["REMEDIATION_CODE_OTHER"] = (
                finding.metadata.Remediation.Code.Other
            )
            finding_dict["COMPLIANCE"] = unroll_dict(
                finding.compliance, separator=": "
            )
            finding_dict["CATEGORIES"] = unroll_list(finding.metadata.Categories)
            finding_dict["DEPENDS_ON"] = unroll_list(finding.metadata.DependsOn)
            finding_dict["RELATED_TO"] = unroll_list(finding.metadata.RelatedTo)
            finding_dict["NOTES"] = finding.metadata.Notes
            finding_dict["PROWLER_VERSION"] = finding.prowler_version
            self._data.append(finding_dict)
    except Exception as error:
        logger.error(
            f"{error.__class__.__name__}[{error.__traceback__.tb_lineno}]: {error}"
        )
```
* Batch Write Data To File Method:
* This method will write the modeled object to a file.
#### Method Implementation
The following example demonstrates the `batch_write_data_to_file` method for the `CSV` class:
```python title="Batch Write Data To File" theme={null}
def batch_write_data_to_file(self) -> None:
    """Writes the findings to a file using the CSV format using the `Output._file_descriptor`."""
    try:
        if (
            getattr(self, "_file_descriptor", None)
            and not self._file_descriptor.closed
            and self._data
        ):
            csv_writer = DictWriter(
                self._file_descriptor,
                fieldnames=self._data[0].keys(),
                delimiter=";",
            )
            csv_writer.writeheader()
            for finding in self._data:
                csv_writer.writerow(finding)
            self._file_descriptor.close()
    except Exception as error:
        logger.error(
            f"{error.__class__.__name__}[{error.__traceback__.tb_lineno}]: {error}"
        )
```
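As a rough illustration of the same two-method pattern applied to another format, the sketch below writes findings as JSON Lines. `SimpleOutput`, `FakeFinding`, and `JSONLines` are stand-ins invented for this example so that it runs without Prowler installed; a real implementation would inherit from Prowler's `Output` class and receive `Finding` objects.

```python
import io
import json
from dataclasses import dataclass
from typing import List


@dataclass
class FakeFinding:
    """Hypothetical stand-in for Prowler's Finding (only three fields)."""

    uid: str
    status: str
    resource_uid: str


class SimpleOutput:
    """Hypothetical stand-in for Prowler's Output base class."""

    def __init__(self, findings: List[FakeFinding], file_descriptor=None):
        self._data = []
        self._file_descriptor = file_descriptor
        if findings:
            self.transform(findings)

    def transform(self, findings: List[FakeFinding]) -> None:
        raise NotImplementedError

    def batch_write_data_to_file(self) -> None:
        raise NotImplementedError


class JSONLines(SimpleOutput):
    """Illustrative format: one JSON object per line."""

    def transform(self, findings: List[FakeFinding]) -> None:
        # Model each finding as a flat dictionary, as the CSV class does.
        for finding in findings:
            self._data.append(
                {
                    "FINDING_UID": finding.uid,
                    "STATUS": finding.status,
                    "RESOURCE_UID": finding.resource_uid,
                }
            )

    def batch_write_data_to_file(self) -> None:
        # Write only when there is an open descriptor and data to write.
        if self._file_descriptor and not self._file_descriptor.closed and self._data:
            for row in self._data:
                self._file_descriptor.write(json.dumps(row) + "\n")


buffer = io.StringIO()
output = JSONLines([FakeFinding("f-1", "FAIL", "bucket-a")], file_descriptor=buffer)
output.batch_write_data_to_file()
written = buffer.getvalue()
```

Unlike the `CSV` example, this sketch leaves the descriptor open so the in-memory buffer can still be inspected afterwards.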
### Integrating the Custom Output Format into Prowler
Once the custom output format is created, it must be integrated into Prowler to ensure compatibility with the existing architecture.
#### Reviewing Current Supported Outputs
Before implementing the new output format, examine the usage of currently supported formats to understand their structure and integration approach.
The following example shows how Prowler generates and processes CSV output within its [codebase](https://github.com/prowler-cloud/prowler/blob/master/prowler/__main__.py):
```python title="CSV creation" theme={null}
if mode == "csv":
    csv_output = CSV(
        findings=finding_outputs,
        create_file_descriptor=True,
        file_path=f"{filename}{csv_file_suffix}",
    )
    generated_outputs["regular"].append(csv_output)
    # Write CSV Finding Object to file.
    csv_output.batch_write_data_to_file()
```
### Testing
* Verify that Prowler’s findings are accurately represented in the desired output format.
* Error Handling – Simulate edge cases to assess robustness and failure recovery mechanisms.
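A test along these lines might look roughly like the sketch below. It checks that `;`-delimited CSV output survives a round trip (including a value that itself contains the delimiter) and that empty findings write nothing. `write_rows` is a hypothetical helper that mirrors the logic of `batch_write_data_to_file`; it is not taken from Prowler's own test suite.

```python
import csv
import io


def write_rows(rows, file_descriptor):
    """Hypothetical helper: write dict rows as ';'-delimited CSV, skipping empty input."""
    if not rows or file_descriptor.closed:
        return
    writer = csv.DictWriter(file_descriptor, fieldnames=rows[0].keys(), delimiter=";")
    writer.writeheader()
    for row in rows:
        writer.writerow(row)


def test_round_trip():
    # The STATUS_EXTENDED value contains the delimiter to exercise quoting.
    rows = [{"CHECK_ID": "check_a", "STATUS": "FAIL", "STATUS_EXTENDED": "a; b"}]
    buffer = io.StringIO()
    write_rows(rows, buffer)
    buffer.seek(0)
    parsed = list(csv.DictReader(buffer, delimiter=";"))
    assert parsed == rows


def test_empty_findings_write_nothing():
    # Edge case: no findings should produce no header and no rows.
    buffer = io.StringIO()
    write_rows([], buffer)
    assert buffer.getvalue() == ""


test_round_trip()
test_empty_findings_write_nothing()
```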
### Documentation
* Ensure the following elements are included:
  * Setup Instructions – List all necessary dependencies and installation steps.
  * Configuration Details.
  * Example Use Cases – Provide practical scenarios demonstrating functionality.
  * Troubleshooting Guide – Document common issues and resolution steps.
* Comprehensive and clear documentation improves maintainability and simplifies onboarding of new users.
# Prowler Providers
Source: https://docs.prowler.com/developer-guide/provider
## Introduction
Providers form the backbone of Prowler, enabling security assessments across various cloud environments.
A provider is any platform or service that offers resources, data, or functionality that can be audited for security and compliance. This includes:
* Cloud Infrastructure Providers (like Amazon Web Services, Microsoft Azure, and Google Cloud)
* Software as a Service (SaaS) Platforms (like Microsoft 365)
* Development Platforms (like GitHub)
* Container Orchestration Platforms (like Kubernetes)
* Database-as-a-Service Platforms (like MongoDB Atlas)
For providers supported by Prowler, refer to [Prowler Hub](https://hub.prowler.com/).
Some custom providers added by the community, like [NHN Cloud](https://www.nhncloud.com/), are not maintained by the Prowler team but can be used in the Prowler CLI. The main purpose of this documentation is to guide you through creating a new provider and integrating it not only in the CLI, but also in the API and UI. Unofficial providers can be found directly in the [Prowler GitHub repository](https://github.com/prowler-cloud/prowler/tree/master/prowler/providers).
***
## Provider Types in Prowler
Prowler supports several types of providers, each with its own implementation pattern and use case. Understanding these differences is key to designing your provider correctly.
### Classifying your Provider
Before implementing a new provider, you need to determine which type it belongs to. This classification will guide your implementation approach and help you choose the right patterns and libraries.
#### Decision Criteria
Once you have decided which provider you want or need to add to Prowler, the next step is to study how to retrieve data from it. Based on that, the provider will fall into one of the following types: SDK, API, or Tool/Wrapper (new types may be added in the future, but for now these are the only ones).
**Choose SDK Provider if:**
* The target platform/service has an **official Python SDK** available
* The target platform/service has a **non-official Python SDK** available that is actively updated and maintained
* You need to support **multiple authentication methods** (profiles, service principals, IAM roles, etc.)
* The SDK provides **built-in session management**, retry logic, and error handling
* You want to leverage **SDK-specific features** like credential chaining, role assumption, etc.
* The platform is a **major cloud provider** (AWS, Azure, GCP, etc.) or has mature SDK support
**Choose API Provider if:**
* The target platform has a **REST API** but **no official Python SDK**
* The target platform has a **non-official Python SDK** available, but it is not actively updated or maintained
* You need to implement **custom authentication flows** (OAuth, token-based, etc.)
* The platform is a **custom or community service** without official SDK support
* You want to use **standard HTTP libraries** like `requests` for API calls
* The platform exposes **well-documented REST endpoints** but lacks SDK tooling
**Choose Tool/Wrapper Provider if:**
* You're integrating a **third-party security tool** or library
* The tool provides **scanning capabilities** that need to be adapted to Prowler's interface
* You don't need **authentication or session management** (the tool handles this)
* You need to **map tool arguments** and **convert outputs** to Prowler's format
**Special Case - Hybrid Providers:**
* Some providers may **combine multiple approaches** (e.g., SDK + Tool wrapper, SDK + API, etc.)
* Example: M365 uses **msgraph SDK** for authentication and some checks, and **PowerShell wrapper** for other checks that the SDK doesn't support
* These require **custom implementation patterns** that blend different provider types
#### Classification Examples
| Provider | Type | Reasoning |
| ------------- | ------ | ---------------------------------------------------------------------------------------------- |
| AWS | SDK | Official boto3 SDK, multiple auth methods, mature ecosystem |
| Azure | SDK | Official azure-identity SDK, service principals, managed identity |
| GCP | SDK | Official google-auth SDK, service accounts, ADC support |
| Kubernetes    | SDK    | Official kubernetes SDK, kubeconfig and in-cluster authentication                              |
| NHN Cloud | API | Custom REST API, no official SDK, community provider |
| MongoDB Atlas | API | Custom REST API, no official SDK |
| IAC | Tool | Third-party security tool that uses trivy, no auth needed, output conversion |
| M365 | Hybrid | Combines msgraph SDK for auth + PowerShell wrapper for operations |
| GitHub        | Hybrid | Actively maintained non-official PyGithub SDK + official GraphQL API requests                  |
#### Questions to Ask Yourself
**1. Does the platform have an official Python SDK?**
* Yes → Consider SDK Provider
* No → Continue to question 2
**2. Does the platform have a non-official Python SDK?**
* Yes → If the SDK is actively updated and maintained, consider SDK Provider; otherwise continue to question 3
* No → Continue to question 3
**3. Is this a third-party security tool or library?**
* Yes → Consider Tool/Wrapper Provider
* No → Continue to question 4
**4. Does the platform expose a REST API?**
* Yes → Consider API Provider
* No → You may need a custom approach
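The four questions above can be sketched as a small decision function. This is only an illustration of the flow: `classify_provider` and its parameter names are invented here, and hybrid providers (which combine several approaches) are deliberately not modeled.

```python
def classify_provider(
    has_official_sdk: bool,
    has_unofficial_sdk: bool = False,
    unofficial_sdk_maintained: bool = False,
    is_third_party_tool: bool = False,
    has_rest_api: bool = False,
) -> str:
    """Walk the four questions in order and return a suggested provider type."""
    # Question 1: official Python SDK available?
    if has_official_sdk:
        return "SDK"
    # Question 2: non-official SDK that is still updated and maintained?
    if has_unofficial_sdk and unofficial_sdk_maintained:
        return "SDK"
    # Question 3: third-party security tool or library?
    if is_third_party_tool:
        return "Tool/Wrapper"
    # Question 4: REST API exposed?
    if has_rest_api:
        return "API"
    return "Custom"


aws_like = classify_provider(has_official_sdk=True)
nhn_like = classify_provider(has_official_sdk=False, has_rest_api=True)
iac_like = classify_provider(has_official_sdk=False, is_third_party_tool=True)
```

With these inputs, `aws_like` is `"SDK"`, `nhn_like` is `"API"`, and `iac_like` is `"Tool/Wrapper"`, matching the classification table.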
#### Implementation Complexity
* **SDK Providers**: Low complexity. You have mature examples like AWS, Azure, GCP, Kubernetes, etc. that you can leverage to implement your provider.
* **API Providers**: Medium complexity. You need to implement authentication, session management, and the API calls to the provider. NHN Cloud and MongoDB Atlas serve as examples to follow.
* **Tool/Wrapper Providers**: High complexity. You need to implement the argument/output mapping to the provider and handle problems that the tool/wrapper may have. IAC and the PowerShell wrapper serve as examples to follow.
* **Hybrid Providers**: High complexity. You need to customize your provider, mixing the other provider types to achieve the desired result. M365 (msgraph SDK + PowerShell wrapper) and GitHub (PyGithub SDK + GraphQL API requests) serve as examples.
### Determining Regional vs Non-Regional Architecture
After classifying your provider type, the next critical decision is determining whether your provider operates with **regional concepts** or is **global/non-regional**. This decision fundamentally affects how your provider and services are structured and executed.
#### Regional Providers
Regional providers operate across multiple geographic locations and require region-specific resource discovery and iteration.
**Examples:**
* **AWS**: Has regions like `us-east-1`, `eu-west-1`, `ap-southeast-2`
* **Azure**: Has regions like `East US`, `West Europe`, `Australia East`
* **GCP**: Has regions like `us-central1`, `europe-west1`, `asia-southeast1`
**Implementation Requirements:**
* Must implement region discovery and iteration
* Services must be instantiated per region or handle multi-region data
* Checks must execute across all available/specified regions
* Resource ARNs/IDs must include region information
* Region-specific client initialization
**Execution Pattern:**
```python theme={null}
# Regional provider execution pattern
for region in provider.get_regions():
    regional_client = service.get_regional_client(region)
    regional_resources = regional_client.discover_resources()
    # Process regional resources
```
#### Non-Regional (Global) Providers
Non-regional providers operate globally without geographic partitioning.
**Examples:**
* **GitHub**: Repositories, organizations are global concepts
* **M365**: Tenants operate globally across Microsoft datacenters
* **Kubernetes**: Clusters are independent units without regional concepts
**Implementation Requirements:**
* Single global client/session
* No region iteration required
* Global resource discovery
* Simpler resource identification (no region in ARNs/IDs)
* Single audit execution
**Execution Pattern:**
```python theme={null}
# Non-regional provider execution pattern
global_client = service.get_client()
global_resources = global_client.discover_resources()
# Process all resources in single iteration
```
#### Decision Matrix
| Aspect | Regional Provider | Non-Regional Provider |
| ------------------------ | ------------------------ | ---------------------- |
| **Client Init** | Per-region clients | Single global client |
| **Resource Discovery** | Iterate through regions | Single discovery call |
| **Resource ARN/ID** | Include region | Global identifier/None |
| **Audit Execution** | Multi-region loops | Single execution |
| **Service Architecture** | Region-aware services | Global services |
| **Performance** | Parallelizable by region | Linear execution |
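To make the "parallelizable by region" row concrete, here is a minimal sketch that contrasts the two execution styles using a thread pool from the standard library. The function names and the fake discovery callables are assumptions made for illustration; they do not correspond to Prowler's service classes.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable, List


def discover_regional(
    regions: Iterable[str], discover: Callable[[str], List[str]]
) -> Dict[str, List[str]]:
    """Run discover(region) once per region in parallel, keyed by region."""
    region_list = list(regions)
    with ThreadPoolExecutor(max_workers=max(1, len(region_list))) as pool:
        # pool.map preserves input order, so results line up with region_list.
        results = pool.map(discover, region_list)
        return dict(zip(region_list, results))


def discover_global(discover: Callable[[], List[str]]) -> List[str]:
    """Non-regional providers need a single discovery call."""
    return discover()


def fake_regional_discovery(region: str) -> List[str]:
    # Hypothetical stand-in for a per-region API call.
    return [f"{region}/resource-1"]


def fake_global_discovery() -> List[str]:
    # Hypothetical stand-in for a global API call.
    return ["org/repo-1", "org/repo-2"]


regional = discover_regional(["us-east-1", "eu-west-1"], fake_regional_discovery)
global_resources = discover_global(fake_global_discovery)
```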
#### Region Discovery
Region discovery is the process of getting the list of regions that are available for the account. This is done by the provider and is stored in the `prowler/providers//lib/regions/_regions.py` file.
```python theme={null}
# File: prowler/providers/aws/aws_provider.py
def get_aws_enabled_regions(self, current_session: Session) -> set:
    """get_aws_enabled_regions returns a set of enabled AWS regions"""
    try:
        # EC2 Client to check enabled regions
        service = "ec2"
        default_region = self.get_default_region(service)
        ec2_client = current_session.client(service, region_name=default_region)
        enabled_regions = set()
        # With AllRegions=False we only get the enabled regions for the account
        for region in ec2_client.describe_regions(AllRegions=False).get("Regions", []):
            enabled_regions.add(region.get("RegionName"))
        return enabled_regions
    except Exception as error:
        logger.error(f"{error.__class__.__name__}: {error}")
        return set()
```
The function above returns the set of regions enabled for the account. In addition, a JSON file contains the list of regions for the provider, grouped by service and partition. It is used to retrieve the provider’s regions and to validate the region specified by the user.
```json theme={null}
# File: prowler/providers/aws/aws_regions_by_service.json (extract)
{
"services": {
"ec2": {
"regions": {
"aws": [
"af-south-1", "ap-east-1", "ap-northeast-1", "ap-northeast-2",
"ap-northeast-3", "ap-south-1", "ap-southeast-1", "ap-southeast-2",
"ca-central-1", "eu-central-1", "eu-north-1", "eu-south-1",
"eu-west-1", "eu-west-2", "eu-west-3", "me-south-1",
"sa-east-1", "us-east-1", "us-east-2", "us-west-1", "us-west-2"
],
"aws-cn": ["cn-north-1", "cn-northwest-1"],
"aws-eusc": ["eusc-de-east-1"],
"aws-us-gov": ["us-gov-east-1", "us-gov-west-1"]
}
}
}
}
```
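As a sketch of how a file with this shape could be used for validation, the example below parses a shortened inline copy of the JSON and checks user-supplied regions against it. `get_service_regions` and `validate_regions` are hypothetical helpers written for this illustration, not Prowler functions.

```python
import json

# Shortened inline copy of the structure shown above (illustrative data only).
REGIONS_JSON = """
{
  "services": {
    "ec2": {
      "regions": {
        "aws": ["eu-west-1", "us-east-1"],
        "aws-cn": ["cn-north-1", "cn-northwest-1"]
      }
    }
  }
}
"""


def get_service_regions(data: dict, service: str, partition: str = "aws") -> set:
    """Return the regions listed for a service in a partition (empty set if unknown)."""
    service_entry = data["services"].get(service, {})
    return set(service_entry.get("regions", {}).get(partition, []))


def validate_regions(requested, available) -> set:
    """Raise ValueError if any requested region is not in the available set."""
    unknown = set(requested) - set(available)
    if unknown:
        raise ValueError(f"Unknown regions: {sorted(unknown)}")
    return set(requested)


regions_data = json.loads(REGIONS_JSON)
ec2_regions = get_service_regions(regions_data, "ec2")
selected = validate_regions(["us-east-1"], ec2_regions)
```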
### Regional Service Implementation
For detailed guidance on implementing services for regional providers, including code examples, service architecture, and check execution patterns, see the [Regional Service Implementation](./services#regional-service-implementation) section in the Services documentation.
**Key concepts covered:**
* Threading and parallel processing across regions
* Service implementation patterns for regional providers
* Cross-region resource attribution and ARN handling
* Best practices for performance and error isolation
## Step 1: Create the Provider Backend (CLI Integration)
Once the provider type and its regional architecture are determined, the next step is to write the provider code.
### SDK Providers
General aspects to consider when implementing a new SDK provider:
**Definition:**
* Use the official SDK of the provider to interact with its resources and APIs.
* Examples: AWS (boto3), Azure (azure-identity), GCP (google-auth), Kubernetes (kubernetes), M365 (msal/msgraph), GitHub (PyGithub).
**Typical Use Cases:**
* Cloud platforms and services with mature Python SDKs.
* Need to support multiple authentication methods (profiles, service principals, etc).
* Providers that offer comprehensive Python libraries for resource management.
**Key Characteristics:**
* Authentication and session management handled by the SDK.
* Arguments: These depend on the provider; typical examples are `profile`, `region`, `tenant_id`, `client_id`, and `client_secret`.
* Outputs: Standardized via SDK models and responses.
* Session objects that can be reused across multiple API calls.
* Built-in retry logic and error handling.
**Implementation Details:**
* SDK providers typically use credential objects or session objects provided by the official SDK.
* They often support multiple authentication methods (several types of credentials, configuration files, IAM roles, etc.).
* Session management includes token refresh, connection pooling, and retry mechanisms.
* Resource discovery and enumeration is usually straightforward through SDK methods.
***
### Implementation Guide for SDK Providers
Now it's time to start creating the code needed to implement the provider.
#### Step 1: Create the Provider Structure
**Explanation:**
SDK providers require a specific folder structure to organize authentication, configuration, and service management. This structure follows Prowler's conventions and ensures proper integration with the CLI and API.
**Required Structure:**
```
prowler/providers//
├── __init__.py
├── _provider.py
├── models.py
├── exceptions/
│   ├── __init__.py
│   └── exceptions.py
├── services/
│   ├── service_name1/
│   └── service_name2/
└── lib/
    ├── __init__.py
    ├── arguments/
    │   ├── __init__.py
    │   └── arguments.py
    ├── mutelist/
    │   ├── __init__.py
    │   └── mutelist.py
    ├── regions/
    │   ├── __init__.py
    │   └── _regions.py
    └── service/
        ├── __init__.py
        └── service.py
```
**Key Components:**
* **`_provider.py`**: Main provider class with authentication and session management
* **`models.py`**: Data structures for identity, session, and provider-specific information
* **`exceptions/`**: Custom exception classes for error handling
* **`services/`**: Folder that contains all the provider services. Creating a new service is explained in another section.
* **`lib/arguments/`**: CLI argument validation and parsing
* **`lib/mutelist/`**: Resource exclusion and muting functionality
* **`lib/regions/`**: Region management and validation. If the provider is NOT regional, this folder will not be created.
* **`lib/service/`**: Base service class for provider-specific services
#### Step 2: Implement the Provider Class
**Explanation:**
The provider class is the core component that handles authentication, session management, and identity information. It inherits from Prowler's base Provider class and implements SDK-specific authentication flows. All providers must share, as far as possible, common patterns for session setup, identity management, and credential validation.
Nevertheless, individual providers may differ, and you must adapt the implementation logic accordingly. A basic example of a common provider implementation follows:
**File:** `prowler/providers//_provider.py`
```python theme={null}
import os
from typing import Optional, Union
from prowler.providers.common.provider import Provider
from prowler.providers.common.models import Audit_Metadata, Connection
from prowler.config.config import (
    default_config_file_path,
    get_default_mute_file_path,
    load_and_validate_config_file,
)
from prowler.lib.logger import logger
from prowler.lib.utils.utils import print_boxes
# Import your SDK and all the needed libraries for the provider.
import your_sdk_library
from your_sdk_library.auth_methods import ClientSecretCredential, ProfileCredential, DefaultCredential
# Import the needed exceptions, mutelist and models for the provider.
from prowler.providers..exceptions.exceptions import Exceptions
from prowler.providers..mutelist.mutelist import Mutelist
from prowler.providers..models import NeededModels
class YourProvider(Provider):
"""
YourProvider class is the main class for the Your Provider.
This class is responsible for initializing the provider, setting up the session,
validating credentials, and managing identity information.
Attributes:
_type (str): The provider type.
_session (YourSDKSession): The provider session.
_identity (YourProviderIdentityInfo): The provider identity information.
_audit_config (dict): The audit configuration.
_mutelist (YourProviderMutelist): The provider mutelist.
audit_metadata (Audit_Metadata): The audit metadata.
"""
_type: str = "your_provider"
_session: your_sdk_library.Session
_identity: YourProviderIdentityInfo
_audit_config: dict
_mutelist: YourProviderMutelist
audit_metadata: Audit_Metadata
def __init__(
self,
# Authentication parameters
client_id: str = None,
client_secret: str = None,
tenant_id: str = None,
# Configuration
config_path: str = None,
config_content: dict = None,
mutelist_path: str = None,
mutelist_content: dict = None,
# Additional provider-specific parameters
region: str = None,
profile: str = None,
):
"""
Initializes the YourProvider instance.
Args:
client_id: The client ID for authentication
client_secret: The client secret for authentication
tenant_id: The tenant ID for authentication
config_path: Path to the configuration file
config_content: Configuration content as dictionary
mutelist_path: Path to the mutelist file
mutelist_content: Mutelist content as dictionary
region: The region to use
profile: The profile to use
Raises:
YourProviderSetUpSessionError: If session setup fails
YourProviderInvalidCredentialsError: If credentials are invalid
"""
logger.info("Initializing YourProvider ...")
# Setup session using SDK
self._session = self.setup_session(
client_id, client_secret, tenant_id, region, profile
)
# Get identity information
self._identity = self.setup_identity(self._session)
# Load configuration
if config_content:
self._audit_config = config_content
else:
if not config_path:
config_path = default_config_file_path
self._audit_config = load_and_validate_config_file(self._type, config_path)
# Setup mutelist
if mutelist_content:
self._mutelist = YourProviderMutelist(mutelist_content=mutelist_content)
else:
if not mutelist_path:
mutelist_path = get_default_mute_file_path(self._type)
self._mutelist = YourProviderMutelist(mutelist_path=mutelist_path)
Provider.set_global_provider(self)
@staticmethod
def setup_session(
client_id: str = None,
client_secret: str = None,
tenant_id: str = None,
region: str = None,
profile: str = None,
) -> your_sdk_library.Session:
"""
Sets up the provider session using the provided credentials.
This method handles the authentication flow and creates a session object
that can be used to interact with the provider's services.
Args:
client_id: The client ID for authentication
client_secret: The client secret for authentication
tenant_id: The tenant ID for authentication
region: The region to use
profile: The profile to use
Returns:
YourSDKSession: The authenticated session object
Raises:
YourProviderSetUpSessionError: If session setup fails
"""
try:
logger.debug("Creating session ...")
# Determine authentication method based on provided parameters
if client_id and client_secret and tenant_id:
# Use client credentials authentication
credentials = your_sdk_library.ClientSecretCredential(
tenant_id=tenant_id,
client_id=client_id,
client_secret=client_secret
)
auth_method = "Client Credentials"
elif profile:
# Use profile-based authentication
credentials = your_sdk_library.ProfileCredential(profile=profile)
auth_method = "Profile"
else:
# Use default authentication (environment variables, etc.)
credentials = your_sdk_library.DefaultCredential()
auth_method = "Default"
# Create session with credentials
session = your_sdk_library.Session(
credentials=credentials,
region=region
)
logger.debug(f"Session created using {auth_method} authentication")
return session
except Exception as error:
logger.critical(f"Failed to setup session: {error}")
raise YourProviderSetUpSessionError(
original_exception=error,
file=os.path.basename(__file__),
)
    @staticmethod
    def setup_identity(session: your_sdk_library.Session) -> YourProviderIdentityInfo:
"""
Gets identity information from the provider session.
This method retrieves account information, user details, and other
identity-related data from the provider.
Args:
session: The authenticated session object
Returns:
YourProviderIdentityInfo: The identity information
Raises:
YourProviderSetUpIdentityError: If identity setup fails
"""
try:
# Use SDK to get account/identity information
identity_info = session.get_identity()
return YourProviderIdentityInfo(
account_id=identity_info.account_id,
account_name=identity_info.account_name,
region=identity_info.region,
user_id=identity_info.user_id,
# Add other identity fields as needed
)
except Exception as e:
logger.error(f"Failed to get identity information: {e}")
raise YourProviderSetUpIdentityError(
original_exception=e,
file=os.path.basename(__file__),
)
@property
def identity(self):
"""Returns the provider identity information."""
return self._identity
@property
def session(self):
"""Returns the provider session object."""
return self._session
@property
def type(self):
"""Returns the provider type."""
return self._type
@property
def audit_config(self):
"""Returns the audit configuration."""
return self._audit_config
@property
def mutelist(self):
"""Returns the provider mutelist."""
return self._mutelist
def print_credentials(self):
"""
Display account information with color formatting.
This method prints the provider credentials and account information
in a formatted way using colorama for better readability.
"""
from colorama import Fore, Style
from prowler.lib.utils.utils import print_boxes
report_lines = [
f" Account ID: {Fore.YELLOW}{self._identity.account_id}{Style.RESET_ALL}",
f" Account Name: {Fore.YELLOW}{self._identity.account_name}{Style.RESET_ALL}",
f" Region: {Fore.YELLOW}{self._identity.region}{Style.RESET_ALL}",
f" User ID: {Fore.YELLOW}{self._identity.user_id}{Style.RESET_ALL}",
]
report_title = f"{Style.BRIGHT}Using the {self._type.upper()} credentials below:{Style.RESET_ALL}"
print_boxes(report_lines, report_title)
@staticmethod
def test_connection(
client_id: str = None,
client_secret: str = None,
tenant_id: str = None,
region: str = None,
profile: str = None,
raise_on_exception: bool = True,
provider_id: str = None,
) -> Connection:
"""
Test connection to the provider.
This method validates the provided credentials and tests the connection
to the provider's services.
Args:
client_id: The client ID for authentication
client_secret: The client secret for authentication
tenant_id: The tenant ID for authentication
region: The region to test
profile: The profile to use
raise_on_exception: Whether to raise exceptions or return Connection object
provider_id: The provider ID to validate against
Returns:
Connection: Connection test result
Raises:
YourProviderSetUpSessionError: If session setup fails
YourProviderInvalidCredentialsError: If credentials are invalid
"""
try:
# Create temporary session for testing
test_session = YourProvider.setup_session(
client_id, client_secret, tenant_id, region, profile
)
# Test the connection by getting identity
identity = YourProvider.setup_identity(test_session)
# Validate provider ID if provided
if provider_id and identity.account_id != provider_id:
raise YourProviderInvalidProviderIdError(
file=os.path.basename(__file__),
)
return Connection(
status=True,
message=f"Successfully connected to {provider_id or 'provider'}",
error=None,
)
except Exception as e:
if raise_on_exception:
raise e
return Connection(
status=False,
message="Failed to connect",
error=str(e),
)
def get_regions(self) -> set:
"""
Get available regions for the provider.
Returns:
set: Set of available region names
"""
# Implementation depends on your provider
# Example for cloud providers that support regions
return {"region1", "region2", "region3"}
def get_services(self) -> list:
"""
Get available services for the provider.
Returns:
list: List of available service names
"""
# Implementation depends on your provider
return ["service1", "service2", "service3"]
```
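The `__init__` method above resolves configuration in a fixed order: inline content first, then an explicit path, then the default path. The sketch below isolates that precedence so it can be exercised on its own. `resolve_config` and the fake loader are hypothetical names introduced here; in the provider class this role is played by `load_and_validate_config_file`.

```python
from typing import Callable, Optional


def resolve_config(
    config_content: Optional[dict],
    config_path: Optional[str],
    default_path: str,
    load_file: Callable[[str], dict],
) -> dict:
    """Inline content wins; otherwise load from the given path or the default one."""
    if config_content:
        return config_content
    return load_file(config_path or default_path)


loaded_paths = []


def fake_loader(path: str) -> dict:
    # Hypothetical loader that records which path was requested.
    loaded_paths.append(path)
    return {"loaded_from": path}


from_content = resolve_config({"max_items": 5}, "custom.yaml", "default.yaml", fake_loader)
from_path = resolve_config(None, "custom.yaml", "default.yaml", fake_loader)
from_default = resolve_config(None, None, "default.yaml", fake_loader)
```

Note that, as in the class above, an empty dictionary counts as "no content" and falls through to the file-based path.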
#### Step 3: Create Models
**Explanation:**
Models define the data structures used by your provider. They include identity information, session details, and provider-specific configurations. These models ensure type safety and consistent data handling across the provider.
**File:** `prowler/providers//models.py`
```python theme={null}
# Import the needed generic libraries for the provider.
from pydantic import BaseModel
from dataclasses import dataclass
from typing import Optional, List
# Import the needed Prowler libraries for the provider.
from prowler.providers.common.models import ProviderOutputOptions
from prowler.config.config import output_file_timestamp
class YourProviderIdentityInfo(BaseModel):
    """
    Identity information for the provider.
    This class holds all the identity-related information retrieved
    from the provider, including account details and user information.
    """
    account_id: str
    account_name: str
    region: str
    user_id: str
    # Add other identity fields as needed
@dataclass
class YourProviderSession:
    """
    Session object that contains the credentials and authentication details for the provider.
    This class holds the actual credentials and authentication information needed
    to establish a connection with the provider's services.
    """
    # Authentication credentials
    access_key: str
    secret_key: str
    # Or for other providers:
    # client_id: str
    # client_secret: str
    # tenant_id: str
    # Connection details
    region: str
```
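For a quick feel of how such models behave, the following sketch uses only standard-library dataclasses so it runs without Prowler or pydantic. `ExampleIdentityInfo` and `ExampleSession` are hypothetical names, and hiding the secret from `repr` is a design choice of this sketch rather than something the models above prescribe.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class ExampleIdentityInfo:
    """Hypothetical identity model with the same fields as above."""

    account_id: str
    account_name: str
    region: str
    user_id: str


@dataclass
class ExampleSession:
    """Hypothetical session model; the secret is excluded from repr()."""

    access_key: str
    secret_key: str = field(repr=False)
    region: str = "region1"


identity = ExampleIdentityInfo(
    account_id="123456", account_name="demo", region="region1", user_id="user-1"
)
session = ExampleSession(access_key="AK-1", secret_key="s3cr3t")

identity_as_dict = asdict(identity)
session_repr = repr(session)
```

Keeping secrets out of `repr` helps avoid leaking credentials when a session object ends up in a log line.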
#### Step 4: Implement Arguments
**Explanation:**
Argument validation ensures that the provider receives valid configuration parameters. This step is crucial for preventing runtime errors and providing clear error messages to users. The validation should check for required parameters and validate their format.
**File:** `prowler/providers//lib/arguments/arguments.py`
```python theme={null}
def init_parser(self):
    """Init the Provider CLI parser"""
    _parser = self.subparsers.add_parser(
        "", parents=[self.common_providers_parser], help=" Provider"
    )
    # Authentication Modes
    _auth_subparser = _parser.add_argument_group("Authentication Modes")
    _auth_modes_group = _auth_subparser.add_mutually_exclusive_group()
    _auth_modes_group.add_argument(
        "--credentials-file",
        nargs="?",
        metavar="FILE_PATH",
        help="Authenticate using a Service Account Application Credentials JSON file",
    )
    _auth_modes_group.add_argument(
        "--impersonate-service-account",
        nargs="?",
        metavar="SERVICE_ACCOUNT",
        help="Impersonate a Service Account",
    )
    _parser.add_argument(
        "--your-provider-region",
        help="Your Provider Region",
        type=str,
    )
    _parser.add_argument(
        "--env-auth",
        action="store_true",
        help="Use User and Password environment variables authentication to log in against ",
    )
    # More arguments for the provider.
```
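The same parser layout can be tried out in isolation with plain `argparse`. The sketch below is a standalone approximation: `build_parser`, the `exampleprovider` subcommand, and the `--example-region` and `--verbose` flags are invented for illustration, whereas the real `init_parser` attaches to Prowler's shared subparsers and common parent parser.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a standalone parser that mimics the provider subparser layout."""
    parser = argparse.ArgumentParser(prog="example-cli")
    # Stand-in for the parser holding arguments common to all providers.
    common = argparse.ArgumentParser(add_help=False)
    common.add_argument("--verbose", action="store_true")

    subparsers = parser.add_subparsers(dest="provider")
    provider_parser = subparsers.add_parser(
        "exampleprovider", parents=[common], help="Example Provider"
    )
    # Only one authentication mode may be chosen at a time.
    auth_group = provider_parser.add_argument_group("Authentication Modes")
    auth_modes = auth_group.add_mutually_exclusive_group()
    auth_modes.add_argument("--credentials-file", nargs="?", metavar="FILE_PATH")
    auth_modes.add_argument(
        "--impersonate-service-account", nargs="?", metavar="SERVICE_ACCOUNT"
    )
    provider_parser.add_argument("--example-region", type=str)
    return parser


args = build_parser().parse_args(
    ["exampleprovider", "--credentials-file", "creds.json", "--example-region", "region1"]
)
```

Passing both `--credentials-file` and `--impersonate-service-account` makes `argparse` exit with a usage error, which is the behavior the mutually exclusive group is there to enforce.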
##### Sensitive CLI Arguments
CLI flags that accept secrets (tokens, passwords, API keys) require special handling to protect credentials from leaking in HTML output and process listings:
1. **Use `nargs="?"` with `default=None`** so the flag works both with and without an inline value. This allows the provider to fall back to an environment variable when no value is passed.
2. **Add a `SENSITIVE_ARGUMENTS` frozenset** at the top of the `arguments.py` file listing every flag that accepts secret values:
```python theme={null}
SENSITIVE_ARGUMENTS = frozenset({"--your-provider-password", "--your-provider-token"})
```
Prowler automatically discovers these frozensets and uses them to redact values in HTML output and warn users who pass secrets directly on the command line.
3. **Document the environment variable** in the `help` text so users know the recommended alternative:
```python theme={null}
_parser.add_argument(
    "--your-provider-password",
    nargs="?",
    default=None,
    metavar="PASSWORD",
    help="Password for authentication. We recommend using the YOUR_PROVIDER_PASSWORD environment variable instead.",
)
```
Do not add new arguments that require passing secrets as CLI values without an environment variable fallback. Prowler CLI warns users when sensitive flags receive explicit values on the command line.
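The fallback behavior described above can be sketched with plain `argparse`. This is a minimal illustration, not Prowler's implementation: `build_parser`, `resolve_password`, and the `YOUR_PROVIDER_PASSWORD` variable are hypothetical names, and the redaction and warning logic that Prowler applies to `SENSITIVE_ARGUMENTS` is not reproduced here.

```python
import argparse
import os

# Hypothetical flag name, used for illustration only.
SENSITIVE_ARGUMENTS = frozenset({"--your-provider-password"})


def build_parser() -> argparse.ArgumentParser:
    """Build a tiny parser with one sensitive flag."""
    parser = argparse.ArgumentParser(prog="example")
    parser.add_argument(
        "--your-provider-password",
        nargs="?",
        default=None,
        metavar="PASSWORD",
        help="Password. Prefer the YOUR_PROVIDER_PASSWORD environment variable.",
    )
    return parser


def resolve_password(argv, environ=None):
    """Return (password, passed_inline), applying the environment fallback."""
    environ = os.environ if environ is None else environ
    args = build_parser().parse_args(argv)
    inline = args.your_provider_password
    # With nargs="?" and default=None, a bare flag and an absent flag both
    # yield None, so both cases fall back to the environment variable.
    if inline is not None:
        return inline, True
    return environ.get("YOUR_PROVIDER_PASSWORD"), False
```

The second element of the returned tuple tells the caller whether the secret was typed on the command line, which is the case a CLI would typically warn about.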
#### Step 5: Implement Mutelist
**Explanation:**
The mutelist functionality allows users to exclude specific resources or checks from the audit. This is useful for handling false positives or excluding resources that are intentionally configured differently.
**File:** `prowler/providers//lib/mutelist/mutelist.py`
```python theme={null}
from prowler.lib.mutelist.mutelist import Mutelist
from prowler.lib.check.models import CheckReportYourProvider


class YourProviderMutelist(Mutelist):
    """
    Mutelist implementation for YourProvider.

    This class handles the muting functionality for the provider,
    allowing users to exclude specific checks or resources from audits.
    """

    def is_finding_muted(self, finding: CheckReportYourProvider) -> bool:
        """
        Check if a specific finding is muted.

        Args:
            finding: The finding to check
        """
        return self.is_muted(finding.check_id, finding.resource_id)
```
#### Step 6: Implement Regions
**Explanation:**
Region management is essential for cloud providers that operate across multiple geographic locations. This component handles region validation and provides region-specific functionality.
Regions are optional and apply only to providers that have them. For example, AWS has regions, but GitHub does not.
**File:** `prowler/providers//lib/regions/_regions.py`
```python theme={null}
from typing import List, Set


def get_regions() -> List[str]:
    """
    Get list of available regions for the provider.

    Returns:
        List[str]: List of available region names
    """
    return [
        "region1",
        "region2",
        "region3",
        # ... other regions
    ]


def validate_region(region: str) -> bool:
    """
    Validate if a region is supported.

    Args:
        region: The region to validate

    Returns:
        bool: True if the region is valid, False otherwise
    """
    return region in get_regions()


def get_default_region() -> str:
    """
    Get the default region for the provider.

    Returns:
        str: The default region name
    """
    return "region1"


def get_global_region() -> str:
    """
    Get the global region for the provider.

    Returns:
        str: The global region name
    """
    return "global"
```
#### Step 7: Create Custom Exceptions
**Explanation:**
Custom exceptions let the provider handle errors more specifically. Prowler uses a structured exception system with error codes, messages, and remediation steps.
**File:** `prowler/providers//exceptions.py`
```python theme={null}
from prowler.exceptions.exceptions import ProwlerException


# Exception codes from 7000 to 7999 are reserved for YourProvider exceptions (numbers are examples)
class YourProviderBaseException(ProwlerException):
    """Base class for YourProvider Errors."""

    YOUR_PROVIDER_ERROR_CODES = {
        (7001, "YourProviderCredentialsError"): {
            "message": "Error loading credentials for YourProvider",
            "remediation": "Check the credentials and ensure they are properly set up. API_KEY and API_SECRET are required.",
        },
        (7002, "YourProviderAuthenticationError"): {
            "message": "Authentication failed with YourProvider",
            "remediation": "Check the API credentials and ensure they are valid and have proper permissions.",
        },
        (7003, "YourProviderInvalidRegionError"): {
            "message": "Invalid region provided for YourProvider",
            "remediation": "Check the region and ensure it is a valid region for YourProvider.",
        },
        (7004, "YourProviderSetUpSessionError"): {
            "message": "Error setting up session",
            "remediation": "Check the session setup and ensure it is properly configured.",
        },
        (7005, "YourProviderInvalidProviderIdError"): {
            "message": "Provider does not match with the expected account_id",
            "remediation": "Check the provider and ensure it matches the expected account_id.",
        },
    }

    def __init__(self, code, file=None, original_exception=None, message=None):
        provider = "YourProvider"
        error_info = self.YOUR_PROVIDER_ERROR_CODES.get((code, self.__class__.__name__))
        if message:
            # Copy before overriding so the shared class-level entry is not mutated
            error_info = {**(error_info or {}), "message": message}
        super().__init__(
            code=code,
            source=provider,
            file=file,
            original_exception=original_exception,
            error_info=error_info,
        )


# Each exception inherits directly from the base class, because the base
# __init__ is the only one that accepts the error code as first argument.
class YourProviderCredentialsError(YourProviderBaseException):
    """Exception for YourProvider credentials errors."""

    def __init__(self, file=None, original_exception=None, message=None):
        super().__init__(
            7001, file=file, original_exception=original_exception, message=message
        )


class YourProviderAuthenticationError(YourProviderBaseException):
    """Exception for YourProvider authentication errors."""

    def __init__(self, file=None, original_exception=None, message=None):
        super().__init__(
            7002, file=file, original_exception=original_exception, message=message
        )


class YourProviderInvalidRegionError(YourProviderBaseException):
    """Exception for invalid YourProvider regions."""

    def __init__(self, file=None, original_exception=None, message=None):
        super().__init__(
            7003, file=file, original_exception=original_exception, message=message
        )


class YourProviderSetUpSessionError(YourProviderBaseException):
    """Exception for YourProvider session setup errors."""

    def __init__(self, file=None, original_exception=None, message=None):
        super().__init__(
            7004, file=file, original_exception=original_exception, message=message
        )


class YourProviderInvalidProviderIdError(YourProviderBaseException):
    """Exception for a provider ID that does not match the expected account_id."""

    def __init__(self, file=None, original_exception=None, message=None):
        super().__init__(
            7005, file=file, original_exception=original_exception, message=message
        )
```
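To show how the `(code, class name)` lookup behaves at the call site, here is a self-contained sketch. `StandInProwlerException` is a hypothetical stand-in written for this example; the real `ProwlerException` may store or format its fields differently. The `Demo*` classes and `load_credentials` are likewise illustrative names.

```python
class StandInProwlerException(Exception):
    """Hypothetical stand-in for ProwlerException; the real class may differ."""

    def __init__(self, code, source=None, file=None, original_exception=None, error_info=None):
        error_info = error_info or {}
        self.code = code
        self.source = source
        self.file = file
        self.original_exception = original_exception
        self.message = error_info.get("message")
        self.remediation = error_info.get("remediation")
        super().__init__(f"[{code}] {self.message}")


class DemoBaseException(StandInProwlerException):
    ERROR_CODES = {
        (7001, "DemoCredentialsError"): {
            "message": "Error loading credentials",
            "remediation": "Check the credentials.",
        },
    }

    def __init__(self, code, file=None, original_exception=None, message=None):
        # The subclass name is part of the key, so each class finds its own entry.
        error_info = self.ERROR_CODES.get((code, self.__class__.__name__))
        if message:
            # Build a new dict so the class-level table keeps its default message.
            error_info = {**(error_info or {}), "message": message}
        super().__init__(
            code,
            source="Demo",
            file=file,
            original_exception=original_exception,
            error_info=error_info,
        )


class DemoCredentialsError(DemoBaseException):
    def __init__(self, file=None, original_exception=None, message=None):
        super().__init__(7001, file=file, original_exception=original_exception, message=message)


def load_credentials(env: dict) -> str:
    """Raise the structured error, keeping the original exception for debugging."""
    try:
        return env["API_KEY"]
    except KeyError as error:
        raise DemoCredentialsError(file="demo.py", original_exception=error) from error
```

Passing `original_exception` keeps the low-level cause available to callers while the code and remediation text stay uniform across the provider.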
#### Step 8: Implement Service Base Class
**Explanation:**
The service base class defines a common interface for all services in your provider, since they will inherit from it. It defines the client to make requests to, the audit configuration and the fixer configuration.
**File:** `prowler/providers//lib/service/service.py`
```python theme={null}
from prowler.providers.._provider import Provider


class YourProviderService(BaseService):
    """
    Base service class for YourProvider services.

    This class provides common functionality for all services
    within the provider, including session management and error handling.
    """

    def __init__(self, provider: Provider):
        """
        Initialize the service.

        Args:
            provider: The provider instance
        """
        self.client = provider.session.get_client(self.service_name)
        self.audit_config = provider.audit_config
        self.fixer_config = provider.fixer_config
```
#### Step 9: Register in CLI
**Explanation:**
Add your provider to the available providers in the CLI.
**File:** `prowler/lib/cli/parser.py`
```python theme={null}
class ProwlerArgumentParser:
# Set the default parser
def __init__(self):
# CLI Arguments
self.parser = argparse.ArgumentParser(
prog="prowler",
formatter_class=RawTextHelpFormatter,
usage="prowler [-h] [--version] {aws,azure,gcp,kubernetes,m365,github,nhn,dashboard,iac,your_provider} ...",
epilog="""
Available Providers:
{aws,azure,gcp,kubernetes,m365,github,iac,nhn,your_provider}
aws AWS Provider
azure Azure Provider
gcp GCP Provider
kubernetes Kubernetes Provider
m365 Microsoft 365 Provider
github GitHub Provider
iac IaC Provider (Preview)
nhn NHN Provider (Unofficial)
your_provider Your Provider
Available components:
dashboard Local dashboard
To see the different available options on a specific component, run:
prowler {provider|dashboard} -h|--help
Detailed documentation at https://docs.prowler.com
""",
```
#### Step 10: Register in Main
**Explanation:**
Registering the provider in `__main__.py` makes it discoverable by Prowler's core system. This step adds the provider to the output options and to the compliance evaluation.
**File:** `prowler/__main__.py`
```python theme={null}
# In the prowler setup output options section
if provider == "aws":
    output_options = AWSOutputOptions(
        args, bulk_checks_metadata, global_provider.identity
    )
elif provider == "azure":
    output_options = AzureOutputOptions(
        args, bulk_checks_metadata, global_provider.identity
    )
elif provider == "gcp":
    output_options = GCPOutputOptions(
        args, bulk_checks_metadata, global_provider.identity
    )
elif provider == "kubernetes":
    output_options = KubernetesOutputOptions(
        args, bulk_checks_metadata, global_provider.identity
    )
elif provider == "github":
    output_options = GithubOutputOptions(
        args, bulk_checks_metadata, global_provider.identity
    )
elif provider == "m365":
    output_options = M365OutputOptions(
        args, bulk_checks_metadata, global_provider.identity
    )
elif provider == "nhn":
    output_options = NHNOutputOptions(
        args, bulk_checks_metadata, global_provider.identity
    )
elif provider == "iac":
    output_options = IACOutputOptions(
        args, bulk_checks_metadata
    )
elif provider == "your_provider":
    output_options = YourProviderOutputOptions(
        args, bulk_checks_metadata, global_provider.identity
    )

# Setup Compliance Options
# (fragment: this branch belongs to the separate if/elif chain of the compliance section)
elif provider == "your_provider":
    for compliance_name in input_compliance_frameworks:
        if compliance_name.startswith("cis_"):
            # Generate CIS Finding Object (example of compliance with CIS framework)
            filename = (
                f"{output_options.output_directory}/compliance/"
                f"{output_options.output_filename}_{compliance_name}.csv"
            )
            cis = YourProviderCIS(
                findings=finding_outputs,
                compliance=bulk_compliance_frameworks[compliance_name],
                file_path=filename,
            )
            generated_outputs["compliance"].append(cis)
            cis.batch_write_data_to_file()
```
#### Step 11: Register in the list of providers
**Explanation:**
The generic checks require this registration. Register the provider in the `init_global_provider` method so that it handles the CLI arguments and initialization.
**File:** `prowler/providers/common/provider.py`
```python theme={null}
elif "your_provider" in provider_class_name.lower():
    provider_class(
        username=arguments.your_provider_username,
        password=arguments.your_provider_password,
        tenant_id=arguments.your_provider_tenant_id,
        config_path=arguments.config_file,
        mutelist_path=arguments.mutelist_file,
        fixer_config=fixer_config,
    )
```
#### Step 12: Add to Config
**Explanation:**
Configuration registration ensures your provider is recognized by Prowler's configuration system. This enables proper handling of provider-specific settings and defaults.
**File:** `prowler/config/config.py`
```python theme={null}
class Provider(str, Enum):
    AWS = "aws"
    AZURE = "azure"
    GCP = "gcp"
    KUBERNETES = "kubernetes"
    M365 = "m365"
    GITHUB = "github"
    YOUR_PROVIDER = "your_provider"  # Add your provider here
```
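Because the enum mixes in `str`, its members compare equal to their string values and can be built from user input. The sketch below illustrates that behavior with a reduced copy of the enum; `parse_provider` is a hypothetical helper and is not part of Prowler.

```python
from enum import Enum


class Provider(str, Enum):
    # Reduced copy of the enum above, for illustration only.
    AWS = "aws"
    YOUR_PROVIDER = "your_provider"


def parse_provider(value: str) -> Provider:
    """Convert user input into a Provider member, with a readable error."""
    try:
        return Provider(value.lower())
    except ValueError:
        valid = ", ".join(member.value for member in Provider)
        raise ValueError(f"Unknown provider {value!r}. Valid values: {valid}") from None
```

A provider that is missing from the enum fails this lookup, which is why the new member has to be added before the rest of the configuration can refer to it.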
In some cases, you may need to create a new configuration file for your provider, for example, the AWS one that is inside `prowler/providers/aws/config.py`.
#### Step 13: Create Compliance Files
**Explanation:**
Compliance files define the security checks and standards that your provider supports. These JSON files map security controls to specific checks and provide remediation guidance. The compliance folder and its init file are required for the provider to work; adding compliance files is optional.
**Folder:** `prowler/compliance//`
The shape of `Requirements` depends on the framework. CIS, for example, uses a requirements section that lists the checks and attributes.
```json theme={null}
{
  "Framework": "CIS",
  "Version": "1.0",
  "Provider": "your_provider",
  "Description": "Description of the compliance framework",
  "Requirements": [
    {
      "Id": "1.1.1",
      "Description": "Description of the requirement",
      "Checks": ["your_provider_check_1", "your_provider_check_2"],
      "Attributes": []
    }
  ]
}
```
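Before committing a compliance file, a quick structural check can catch missing keys. The following is a minimal sketch that only covers the fields shown in the example above; `validate_compliance` is a hypothetical helper, and the rule that check names start with the provider name is an assumption drawn from the sample, not a documented Prowler requirement.

```python
import json

REQUIRED_TOP_LEVEL = ("Framework", "Version", "Provider", "Description", "Requirements")
REQUIRED_REQUIREMENT = ("Id", "Description", "Checks", "Attributes")


def validate_compliance(raw: str, expected_provider: str) -> list:
    """Return a list of problems; an empty list means the structure looks sane."""
    problems = []
    data = json.loads(raw)
    for key in REQUIRED_TOP_LEVEL:
        if key not in data:
            problems.append(f"missing top-level key: {key}")
    if data.get("Provider") != expected_provider:
        problems.append(f"unexpected provider: {data.get('Provider')!r}")
    for index, requirement in enumerate(data.get("Requirements", [])):
        for key in REQUIRED_REQUIREMENT:
            if key not in requirement:
                problems.append(f"requirement {index}: missing key {key}")
        for check in requirement.get("Checks", []):
            # Assumed naming convention: checks are prefixed with the provider name.
            if not check.startswith(f"{expected_provider}_"):
                problems.append(f"requirement {index}: check {check!r} lacks provider prefix")
    return problems
```

Since `json.loads` rejects comments, running a file through this helper also confirms that it is strictly valid JSON.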
#### Step 14: Add Output Support
**Explanation:**
Output support ensures that your provider's results are properly formatted in Prowler's various output formats (CSV, JSON, HTML, etc.). This step integrates your provider with Prowler's reporting system.
**File:** `prowler/lib/outputs/summary_table.py`
```python theme={null}
# Add your provider case in the display_summary_table function
elif provider.type == "your_provider":
    entity_type = "Your Entity Type"
    audited_entities = provider.identity.your_entity_field
```
**File:** `prowler/lib/outputs/finding.py`
```python theme={null}
# Add your provider case in the fill_common_finding_data function
elif provider.type == "your_provider":
    output_data["auth_method"] = f"Your Auth Method: {get_nested_attribute(provider, 'identity.auth_type')}"
    output_data["account_uid"] = get_nested_attribute(provider, "identity.account_id")
    output_data["account_name"] = get_nested_attribute(provider, "identity.account_name")
    output_data["resource_name"] = check_output.resource_name
    output_data["resource_uid"] = check_output.resource_id
    output_data["region"] = check_output.location  # or your location field
```
**File:** `prowler/lib/outputs/outputs.py`
```python theme={null}
# Add your provider case in the stdout_report function
if finding.check_metadata.Provider == "your_provider":
    details = finding.your_location_field  # e.g., finding.location, finding.namespace, etc.
```
#### Step 15: Generate the HTML Report
**Explanation:**
Prowler needs a provider-specific assessment summary to generate the HTML report. This step adds support for your provider in the HTML output generation system so that assessment summaries and findings display properly.
**File:** `prowler/lib/outputs/html/html.py`
```python theme={null}
@staticmethod
def get_your_provider_assessment_summary(provider: Provider) -> str:
"""
get_your_provider_assessment_summary gets the HTML assessment summary for your provider
Args:
provider (Provider): the provider object
Returns:
str: the HTML assessment summary
"""
try:
return f"""
-
Your Entity Type: {provider.identity.your_entity_field}
-
Your Location Field: {provider.identity.your_location_field}
-
Authentication Method: {provider.auth_method}
-
Identity ID: {provider.identity.identity_id}
"""
except Exception as error:
logger.error(
f"{error.__class__.__name__}[{error.__traceback__.tb_lineno}] -- {error}"
)
return ""
```
#### Step 16: Add the Check Report Model
**Explanation:**
Add a check report model for the provider to the generic models. The generic checks need this model to work with the provider.
**File:** `prowler/lib/check/models.py`
```python theme={null}
@dataclass
class CheckReportYourProvider(CheckReport):
    """
    Check report for YourProvider.
    """

    resource_name: str
    resource_id: str

    def __init__(self, metadata: Dict, resource: Any) -> None:
        super().__init__(metadata, resource)
        self.resource_name = resource.name
        self.resource_id = resource.id
```
#### Step 17: Add Dependencies
**Explanation:**
Dependencies ensure that your provider's required libraries are available when Prowler is installed. This step adds the necessary SDK or API client to Prowler's dependency management.
**File:** `pyproject.toml`
```toml theme={null}
[project]
requires-python = ">=3.10,<3.13"
dependencies = [
# ... other dependencies
"your-sdk-library>=1.0.0,<2.0.0", # Add your SDK dependency
]
```
#### Step 18: Create Tests
**Explanation:**
Testing ensures that your provider works correctly and maintains compatibility as Prowler evolves. Comprehensive tests cover authentication, session management, and provider-specific functionality.
**Folder:** `tests/providers//`
```python theme={null}
import pytest

from prowler.providers.your_provider.your_provider import YourProvider


class TestYourProvider:
    """Test cases for YourProvider."""

    def test_provider_initialization_with_client_credentials(self):
        """Test provider initialization with client credentials."""
        provider = YourProvider(
            client_id="test_client_id",
            client_secret="test_client_secret",
            tenant_id="test_tenant_id"
        )
        assert provider.type == "your_provider"
        assert provider.identity is not None
        assert provider.session is not None

    def test_provider_initialization_with_profile(self):
        """Test provider initialization with profile."""
        provider = YourProvider(
            profile="test_profile"
        )
        assert provider.type == "your_provider"
        assert provider.identity is not None

    def test_connection_test(self):
        """Test connection functionality."""
        result = YourProvider.test_connection(
            client_id="test_client_id",
            client_secret="test_client_secret",
            tenant_id="test_tenant_id"
        )
        # Add assertions based on expected behavior

    def test_identity_retrieval(self):
        """Test identity information retrieval."""
        provider = YourProvider(
            client_id="test_client_id",
            client_secret="test_client_secret",
            tenant_id="test_tenant_id"
        )
        assert provider.identity.account_id is not None
        assert provider.identity.account_name is not None

    def test_argument_validation(self):
        """Test argument validation."""
        from prowler.providers.your_provider.lib.arguments.arguments import (
            validate_your_provider_arguments
        )

        # Valid arguments
        validate_your_provider_arguments(
            client_id="test_client_id",
            client_secret="test_client_secret",
            tenant_id="test_tenant_id"
        )
        # Invalid arguments
        with pytest.raises(ValueError, match="at least one authentication method"):
            validate_your_provider_arguments()
```
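Tests like the ones above should not reach a real service. One common approach is to replace the SDK client with a `unittest.mock.Mock`. The sketch below is self-contained and uses hypothetical names (`fetch_identity`, `get_account`); it illustrates the mocking pattern only and does not reflect any specific SDK or Prowler helper.

```python
from unittest import mock


def fetch_identity(client) -> dict:
    """Hypothetical helper: ask an SDK client who we are and normalize the answer."""
    account = client.get_account()
    return {"account_id": account["id"], "account_name": account.get("name", "")}


def test_fetch_identity_uses_the_client():
    # The mock stands in for the SDK client, so no network call is made.
    client = mock.Mock()
    client.get_account.return_value = {"id": "123", "name": "demo"}
    assert fetch_identity(client) == {"account_id": "123", "account_name": "demo"}
    client.get_account.assert_called_once_with()


def test_fetch_identity_propagates_sdk_errors():
    # side_effect makes the mocked call raise, simulating an unreachable API.
    client = mock.Mock()
    client.get_account.side_effect = ConnectionError("unreachable")
    try:
        fetch_identity(client)
    except ConnectionError:
        pass
    else:
        raise AssertionError("expected ConnectionError")
```

The same idea applies to the provider class: patch the session or client creation so that initialization can be exercised with fixed responses.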
#### Step 19: Update Documentation
**Explanation:**
Documentation updates ensure that users can find information about your provider in Prowler's documentation. This includes examples, configuration guides, and troubleshooting information.
Update the provider documentation to include your new provider in the examples and implementation guidance.
***
### API Providers
**Definition:**
* Interact directly with the provider's REST API using HTTP requests (e.g., via `requests`).
* Examples: NHN Cloud.
**Typical Use Cases:**
* Providers without an official Python SDK.
* Providers with an unofficial Python SDK that is not actively updated or maintained.
* Providers that expose REST APIs and meet the above requirements.
**Key Characteristics:**
* Manual management of authentication (tokens, username/password, etc.).
* Arguments: Depends on the provider, for example, `username`, `password`, `tenant_id`, etc.
* Outputs: Dicts or custom models based on API responses.
* Custom HTTP session management with headers and authentication.
* Manual handling of pagination, rate limiting, and error responses.
**Implementation Details:**
* API providers require manual HTTP request management using libraries like `requests`.
* Authentication typically involves obtaining tokens via login endpoints or OAuth flows.
* Session management includes setting appropriate headers (Authorization, Content-Type, etc.).
* Resource discovery often requires multiple API calls to different endpoints.
* Error handling and retry logic must be implemented manually.
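
The manual pagination and retry handling mentioned above can be sketched as a small generator. This is an illustration under stated assumptions, not code from Prowler: `paginate`, `RateLimited`, and the page shape `{"items": [...], "next": marker}` are hypothetical, and the HTTP call is injected as `fetch_page` so the logic stays independent of any specific API.

```python
import time
from typing import Callable, Dict, Iterator, Optional


class RateLimited(Exception):
    """Hypothetical signal that the API answered with a rate-limit status."""


def paginate(
    fetch_page: Callable[[Optional[str]], Dict],
    max_retries: int = 3,
    backoff_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[dict]:
    """Yield items from every page, retrying a page when it is rate limited.

    `fetch_page(marker)` is assumed to return {"items": [...], "next": marker_or_None}.
    """
    marker = None
    while True:
        for attempt in range(max_retries + 1):
            try:
                page = fetch_page(marker)
                break
            except RateLimited:
                if attempt == max_retries:
                    raise
                # Exponential backoff: 1x, 2x, 4x, ... the base delay
                sleep(backoff_seconds * (2 ** attempt))
        yield from page["items"]
        marker = page.get("next")
        if marker is None:
            return
```

Injecting `sleep` keeps unit tests fast, since a test can pass a recording function instead of waiting for real delays.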
***
### Implementation Guide for API Providers
#### Step 1: Create the Provider Structure
**Explanation:**
API providers use the same structure as SDK providers. The main difference is that, without an official Python SDK, some methods may be implemented differently or omitted.
**Required Structure:**
```
prowler/providers//
├── __init__.py
├── _provider.py
├── models.py
├── exceptions/
│   ├── __init__.py
│   └── exceptions.py
├── services/
│   ├── service_name1/
│   └── service_name2/
└── lib/
    ├── __init__.py
    ├── arguments/
    │   ├── __init__.py
    │   └── arguments.py
    ├── mutelist/
    │   ├── __init__.py
    │   └── mutelist.py
    ├── regions/
    │   ├── __init__.py
    │   └── regions.py
    └── service/
        ├── __init__.py
        └── service.py
```
**Key Components:**
* **`_provider.py`**: Main provider class with HTTP session management
* **`models.py`**: Data structures for identity and API responses
* **`exceptions/`**: Custom exception classes for API errors
* **`services/`**: Folder that contains all the provider services
* **`lib/arguments/`**: CLI argument validation and parsing
* **`lib/mutelist/`**: Resource exclusion and muting functionality
* **`lib/regions/`**: Region management and validation. Omit this folder if the provider has no regions.
* **`lib/service/`**: Base service class for provider-specific services
#### Step 2: Implement the Provider Class
**Explanation:**
The provider class is the core component that handles HTTP session management, authentication, and identity information. It inherits from Prowler's base Provider class and implements API-specific authentication flows using direct HTTP requests.
**File:** `prowler/providers//_provider.py`
```python theme={null}
import os
from typing import Optional

import requests

from prowler.providers.common.provider import Provider
from prowler.providers.common.models import Audit_Metadata, Connection
from prowler.config.config import (
    default_config_file_path,
    get_default_mute_file_path,
    load_and_validate_config_file,
)
from prowler.lib.logger import logger
from prowler.lib.utils.utils import print_boxes

# Import the needed exceptions, mutelist and models for the provider.
from prowler.providers..exceptions.exceptions import Exceptions
from prowler.providers..lib.mutelist.mutelist import Mutelist
from prowler.providers..models import NeededModels


class APIProvider(Provider):
    """
    APIProvider class is the main class for the API Provider.

    This class is responsible for initializing the provider, setting up the HTTP session,
    validating credentials, and managing identity information through direct API calls.

    Attributes:
        _type (str): The provider type.
        _session (requests.Session): The HTTP session for API calls.
        _identity (APIIdentityInfo): The provider identity information.
        _audit_config (dict): The audit configuration.
        _fixer_config (dict): The fixer configuration.
        _mutelist (APIMutelist): The provider mutelist.
        audit_metadata (Audit_Metadata): The audit metadata.
    """

    _type: str = "api_provider"
    _session: Optional[requests.Session]
    _identity: APIIdentityInfo
    _audit_config: dict
    _fixer_config: dict
    _mutelist: APIMutelist
    audit_metadata: Audit_Metadata

    def __init__(
        self,
        # Authentication parameters
        username: str = None,
        password: str = None,
        tenant_id: str = None,
        # Configuration
        config_path: str = None,
        config_content: dict = None,
        mutelist_path: str = None,
        mutelist_content: dict = None,
        fixer_config: dict = None,
    ):
        """
        Initializes the APIProvider instance.

        Args:
            username: The API username for authentication
            password: The API password for authentication
            tenant_id: The tenant ID for authentication
            config_path: Path to the configuration file
            config_content: Configuration content as dictionary
            mutelist_path: Path to the mutelist file
            mutelist_content: Mutelist content as dictionary
            fixer_config: Fixer configuration dictionary

        Raises:
            ValueError: If required authentication parameters are missing
        """
        logger.info("Initializing APIProvider ...")

        # 1) Store argument values with environment variable fallback
        self._username = username or os.getenv("YOUR_PROVIDER_USERNAME")
        self._password = password or os.getenv("YOUR_PROVIDER_PASSWORD")
        self._tenant_id = tenant_id or os.getenv("YOUR_PROVIDER_TENANT_ID")

        # Validate required parameters
        if not all([self._username, self._password, self._tenant_id]):
            raise ValueError("APIProvider requires username, password and tenant_id")

        # 2) Load audit_config, fixer_config, mutelist
        self._fixer_config = fixer_config if fixer_config else {}
        if config_content:
            self._audit_config = config_content
        else:
            if not config_path:
                config_path = default_config_file_path
            self._audit_config = load_and_validate_config_file(self._type, config_path)

        if mutelist_content:
            self._mutelist = APIMutelist(mutelist_content=mutelist_content)
        else:
            if not mutelist_path:
                mutelist_path = get_default_mute_file_path(self._type)
            self._mutelist = APIMutelist(mutelist_path=mutelist_path)

        # 3) Initialize session/token
        self._token = None
        self._session = None
        self.setup_session()

        # 4) Create identity object
        self._identity = APIIdentityInfo(
            tenant_id=self._tenant_id,
            username=self._username,
        )

        Provider.set_global_provider(self)

    @property
    def type(self) -> str:
        """Returns the type of the provider."""
        return self._type

    @property
    def identity(self) -> APIIdentityInfo:
        """Returns the provider identity information."""
        return self._identity

    @property
    def session(self) -> requests.Session:
        """Returns the HTTP session for API calls."""
        return self._session

    @property
    def audit_config(self) -> dict:
        """Returns the audit configuration."""
        return self._audit_config

    @property
    def fixer_config(self) -> dict:
        """Returns the fixer configuration."""
        return self._fixer_config

    @property
    def mutelist(self) -> APIMutelist:
        """Returns the provider mutelist."""
        return self._mutelist

    def print_credentials(self) -> None:
        """
        Display account information with color formatting.

        This method prints the provider credentials and account information
        in a formatted way using colorama for better readability.
        """
        from colorama import Style

        report_lines = [
            f" Username: {self._username}",
            f" TenantID: {self._tenant_id}",
        ]
        report_title = f"{Style.BRIGHT}Using the {self._type.upper()} credentials below:{Style.RESET_ALL}"
        print_boxes(report_lines, report_title)

    def setup_session(self) -> None:
        """
        Implement API authentication method by calling the provider's authentication endpoint.

        This method performs the authentication flow to obtain an access token
        and creates a requests.Session with the appropriate headers for API calls.
        """
        # Example for a Keystone-like authentication
        url = "https://api.your-provider.com/v2.0/tokens"
        data = {
            "auth": {
                "tenantId": self._tenant_id,
                "passwordCredentials": {
                    "username": self._username,
                    "password": self._password,
                },
            }
        }
        try:
            response = requests.post(url, json=data, timeout=10)
            if response.status_code == 200:
                resp_json = response.json()
                self._token = resp_json["access"]["token"]["id"]
                # Create session with authentication headers
                sess = requests.Session()
                sess.headers.update({
                    "X-Auth-Token": self._token,
                    "Content-Type": "application/json"
                })
                self._session = sess
                logger.info("API token acquired successfully and session is set up.")
            else:
                logger.critical(
                    f"Failed to get token. Status: {response.status_code}, Body: {response.text}"
                )
                raise ValueError("Failed to get API token")
        except Exception as e:
            logger.critical(f"[setup_session] Error: {e}")
            # Bare raise keeps the original traceback
            raise

    @staticmethod
    def test_connection(
        username: str,
        password: str,
        tenant_id: str,
        raise_on_exception: bool = True,
    ) -> Connection:
        """
        Test connection to the API provider by performing:
        1) Authentication token request
        2) (Optional) a small test API call to confirm credentials are valid

        Args:
            username: The API username
            password: The API password
            tenant_id: The tenant ID
            raise_on_exception: If True, raise the caught exception;
                if False, return Connection(error=exception).

        Returns:
            Connection: Connection test result
        """
        try:
            # 1) Validate arguments
            if not username or not password or not tenant_id:
                error_msg = "API test_connection error: missing username/password/tenant_id"
                logger.error(error_msg)
                raise ValueError(error_msg)

            # 2) Request authentication token
            token_url = "https://api.your-provider.com/v2.0/tokens"
            data = {
                "auth": {
                    "tenantId": tenant_id,
                    "passwordCredentials": {
                        "username": username,
                        "password": password,
                    },
                }
            }
            resp = requests.post(token_url, json=data, timeout=10)
            if resp.status_code != 200:
                error_msg = f"Failed to get token. Status: {resp.status_code}, Body: {resp.text}"
                logger.error(error_msg)
                if raise_on_exception:
                    raise Exception(error_msg)
                return Connection(error=Exception(error_msg))

            # Success
            token_json = resp.json()
            api_token = token_json["access"]["token"]["id"]
            logger.info("API test_connection: Successfully acquired token.")

            # 3) (Optional) Test API call to confirm credentials are valid
            test_endpoint = f"https://api.your-provider.com/v2/{tenant_id}/test"
            headers = {
                "X-Auth-Token": api_token,
                "Content-Type": "application/json",
            }
            test_resp = requests.get(test_endpoint, headers=headers, timeout=10)
            if test_resp.status_code == 200:
                logger.info("API test_connection: Test call success. Credentials valid.")
                return Connection(is_connected=True)
            else:
                error_msg = f"Test call failed. Status: {test_resp.status_code}, Body: {test_resp.text}"
                logger.error(error_msg)
                if raise_on_exception:
                    raise Exception(error_msg)
                return Connection(error=Exception(error_msg))
        except Exception as e:
            logger.critical(f"{e.__class__.__name__}[{e.__traceback__.tb_lineno}]: {e}")
            if raise_on_exception:
                raise
            return Connection(error=e)

    @staticmethod
    def validate_arguments(username: str, password: str, tenant_id: str) -> None:
        """
        Ensures that username, password, and tenant_id are not empty.

        Args:
            username: The username to validate
            password: The password to validate
            tenant_id: The tenant ID to validate

        Raises:
            ValueError: If any required parameter is missing
        """
        if not username or not password or not tenant_id:
            raise ValueError("API Provider requires username, password and tenant_id.")
```
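The `raise_on_exception` switch in `test_connection` follows a simple pattern: either re-raise the failure or wrap it in a result object. The sketch below isolates that pattern. `ConnectionResult` is a hypothetical stand-in for Prowler's `Connection` model, whose exact fields are assumed here, and `check_connection` is an illustrative helper.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ConnectionResult:
    """Hypothetical stand-in for the Connection model; fields are assumed."""

    is_connected: bool = False
    error: Optional[Exception] = None


def check_connection(
    probe: Callable[[], None], raise_on_exception: bool = True
) -> ConnectionResult:
    """Run `probe` and report success, re-raising or wrapping any failure."""
    try:
        probe()
        return ConnectionResult(is_connected=True)
    except Exception as error:
        if raise_on_exception:
            # Bare raise keeps the original traceback for the caller.
            raise
        return ConnectionResult(error=error)
```

Callers that need a yes/no answer, such as a UI that validates stored credentials, can pass `raise_on_exception=False` and inspect the returned object instead of handling exceptions.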
#### Step 3: Create Models
**Explanation:**
Models define the data structures used by your API provider. They include identity information and API response structures. These models ensure type safety and consistent data handling across the provider.
**File:** `prowler/providers//models.py`
This step is the same as for SDK providers, so follow the pattern in [Step 3: Create Models](#step-3-create-models).
#### Step 4: Implement Arguments
**Explanation:**
Argument validation ensures that the API provider receives valid configuration parameters. This step is crucial for preventing runtime errors and providing clear error messages to users.
**File:** `prowler/providers//lib/arguments/arguments.py`
Arguments depend on the provider, not on the provider type, so this step follows the same pattern as for [SDK providers](#step-4-implement-arguments).
#### Step 5: Implement Mutelist
**Explanation:**
The mutelist functionality allows users to exclude specific resources or checks from the audit. This is useful for handling false positives or excluding resources that are intentionally configured differently.
**File:** `prowler/providers//lib/mutelist/mutelist.py`
The implementation of the mutelist is the same as the [SDK providers](#step-5-implement-mutelist).
#### Step 6: Implement Regions
**Explanation:**
Region management is essential for cloud providers that operate across multiple geographic locations. This component handles region validation and provides region-specific functionality.
Regions are optional and apply only to providers that have them. For example, AWS has regions, but GitHub does not.
**File:** `prowler/providers//lib/regions/_regions.py`
The implementation of the regions is the same as the [SDK providers](#step-6-implement-regions).
#### Step 7: Create Custom Exceptions
**Explanation:**
Custom exceptions provide specific error handling for API-related issues, making debugging and error reporting more effective. Prowler uses a structured exception system with error codes, messages, and remediation steps.
**File:** `prowler/providers//exceptions/exceptions.py`
```python theme={null}
from prowler.exceptions.exceptions import ProwlerException


# Exception codes from 8000 to 8999 are reserved for API Provider exceptions (example numbers)
class APIProviderBaseException(ProwlerException):
    """Base class for API Provider Errors."""

    APIProvider_ERROR_CODES = {
        (8000, "APIProviderCredentialsError"): {
            "message": "API Provider credentials not found or invalid",
            "remediation": "Check the API Provider API credentials and ensure they are properly set.",
        },
        (8001, "APIProviderAuthenticationError"): {
            "message": "API Provider authentication failed",
            "remediation": "Check the API Provider API credentials and ensure they are valid.",
        },
        (8002, "APIProviderSessionError"): {
            "message": "API Provider session setup failed",
            "remediation": "Check the session setup and ensure it is properly configured.",
        },
        (8003, "APIProviderIdentityError"): {
            "message": "API Provider identity setup failed",
            "remediation": "Check credentials and ensure they are properly set up for API Provider.",
        },
        (8004, "APIProviderAPIError"): {
            "message": "API Provider API call failed",
            "remediation": "Check the API request and ensure it is properly formatted.",
        },
        (8005, "APIProviderRateLimitError"): {
            "message": "API Provider API rate limit exceeded",
            "remediation": "Reduce the number of API requests or wait before making more requests.",
        },
    }

    def __init__(self, code, file=None, original_exception=None, message=None):
        provider = "API Provider"
        error_info = self.APIProvider_ERROR_CODES.get((code, self.__class__.__name__))
        if message:
            # Copy before overriding so the shared class-level entry is not mutated
            error_info = {**(error_info or {}), "message": message}
        super().__init__(
            code=code,
            source=provider,
            file=file,
            original_exception=original_exception,
            error_info=error_info,
        )


class APIProviderCredentialsError(APIProviderBaseException):
    """Exception for API Provider credentials errors"""

    def __init__(self, file=None, original_exception=None, message=None):
        super().__init__(
            code=8000,
            file=file,
            original_exception=original_exception,
            message=message,
        )


class APIProviderAuthenticationError(APIProviderBaseException):
    """Exception for API Provider authentication errors"""

    def __init__(self, file=None, original_exception=None, message=None):
        super().__init__(
            code=8001,
            file=file,
            original_exception=original_exception,
            message=message,
        )


class APIProviderSessionError(APIProviderBaseException):
    """Exception for API Provider session setup errors"""

    def __init__(self, file=None, original_exception=None, message=None):
        super().__init__(
            code=8002,
            file=file,
            original_exception=original_exception,
            message=message,
        )


class APIProviderIdentityError(APIProviderBaseException):
    """Exception for API Provider identity setup errors"""

    def __init__(self, file=None, original_exception=None, message=None):
        super().__init__(
            code=8003,
            file=file,
            original_exception=original_exception,
            message=message,
        )


class APIProviderAPIError(APIProviderBaseException):
    """Exception for API Provider API errors"""

    def __init__(self, file=None, original_exception=None, message=None):
        super().__init__(
            code=8004,
            file=file,
            original_exception=original_exception,
            message=message,
        )


class APIProviderRateLimitError(APIProviderBaseException):
    """Exception for API Provider rate limit errors"""

    def __init__(self, file=None, original_exception=None, message=None):
        super().__init__(
            code=8005,
            file=file,
            original_exception=original_exception,
            message=message,
        )
```
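The lookup used above, where each exception class finds its entry through a `(code, class name)` key, can be tried in isolation. The following is a minimal sketch, not Prowler's implementation: `BaseError` is a hypothetical stand-in for `ProwlerException`, and `DemoError`, `DemoRateLimitError`, and the code 9000 are names invented for illustration.

```python
class BaseError(Exception):
    """Hypothetical stand-in for ProwlerException (not the real class)."""

    def __init__(self, code, source, error_info):
        self.code = code
        self.source = source
        self.message = error_info.get("message")
        self.remediation = error_info.get("remediation")
        super().__init__(f"[{code}] {self.message}")


class DemoError(BaseError):
    """Base class holding the error table for a hypothetical provider."""

    ERROR_CODES = {
        (9000, "DemoRateLimitError"): {
            "message": "Rate limit exceeded",
            "remediation": "Wait before sending more requests.",
        },
    }

    def __init__(self, code, message=None):
        # Copy the entry so a custom message never changes the shared table.
        error_info = dict(self.ERROR_CODES.get((code, self.__class__.__name__), {}))
        if message:
            error_info["message"] = message
        super().__init__(code=code, source="Demo", error_info=error_info)


class DemoRateLimitError(DemoError):
    """Concrete error that only needs to supply its own code."""

    def __init__(self, message=None):
        super().__init__(code=9000, message=message)


def describe(error: BaseError) -> str:
    """Format an error the way a caller might log it."""
    return f"{error.source} {error.code}: {error.message} ({error.remediation})"


try:
    raise DemoRateLimitError(message="429 returned by /v1/items")
except DemoError as caught:
    summary = describe(caught)
```

Catching the base class (`DemoError` here) handles every provider-specific error in one place, while the per-class code keeps the remediation text attached to the failure.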
#### Step 8: Implement Service Base Class
**Explanation:**
The service base class defines a common interface that every service in your provider inherits. It sets up the client used to make requests, the audit configuration, and the fixer configuration.
**File:** `prowler/providers//lib/service/service.py`
```python theme={null}
from requests.auth import HTTPDigestAuth

from prowler.providers.._provider import Provider


# NOTE: import BaseService from wherever it is defined in your code base.
class APIProviderService(BaseService):
    """
    Base service class for API Provider services.

    This class provides common functionality for all services
    within the provider, including session management and error handling.
    """

    def __init__(self, provider: Provider):
        """
        Initialize the service.

        Args:
            provider: The provider instance
        """
        # service_name is expected to be defined by each concrete service subclass
        self.client = provider.session.get_client(self.service_name)
        self.audit_config = provider.audit_config
        self.fixer_config = provider.fixer_config
        self.session = provider.session
        self.base_url = provider.session.base_url
        # Digest credentials are negotiated per request, so pass `auth=self.auth`
        # to each requests call instead of building an Authorization header by hand.
        self.auth = HTTPDigestAuth(
            provider.session.public_key,
            provider.session.private_key,
        )
        self.headers = {
            "Content-Type": "application/json",
        }
```
#### Step 9: Register in CLI
**Explanation:**
Add your provider to the available providers in the CLI.
**File:** `prowler/lib/cli/parser.py`
This step is the same as the [SDK providers](#step-9-register-in-cli).
#### Step 10: Register in Main
**Explanation:**
Main registration makes your provider discoverable by Prowler's core system. It also adds your provider to the output options and to the compliance evaluation.
**File:** `prowler/__main__.py`
This step is the same as the [SDK providers](#step-10-register-in-main).
#### Step 11: Register in the list of providers
**Explanation:**
Registering the provider here lets the generic checks use it. The provider must be registered in the `init_global_provider` method to handle CLI arguments and initialization.
**File:** `prowler/providers/common/provider.py`
This step is the same as the [SDK providers](#step-11-register-in-the-list-of-providers).
#### Step 12: Add to Config
**Explanation:**
Configuration registration ensures your API provider is recognized by Prowler's configuration system. This enables proper handling of provider-specific settings and defaults.
**File:** `prowler/config/config.py`
This step is the same as the [SDK providers](#step-12-add-to-config).
#### Step 13: Create Compliance Files
**Explanation:**
Compliance files define the security checks and standards that your provider supports. These JSON files map security controls to specific checks and provide remediation guidance. The folder, including an init file, must exist for the provider to work; adding compliance files to it is optional.
**Folder:** `prowler/compliance//`
This step is the same as the [SDK providers](#step-13-create-compliance-files).
#### Step 14: Add Output Support
**Explanation:**
Output support ensures that your provider's results are properly formatted in Prowler's various output formats (CSV, JSON, HTML, etc.). This step integrates your provider with Prowler's reporting system.
**File:** `prowler/lib/outputs/summary_table.py`
This step is the same as the [SDK providers](#step-14-add-output-support).
#### Step 15: Generate the HTML Report
**Explanation:**
The HTML report can only be generated once the HTML output module supports your provider. This step adds that support so that assessment summaries and findings are displayed properly.
**File:** `prowler/lib/outputs/html/html.py`
This step is the same as the [SDK providers](#step-15-generate-the-html-report).
#### Step 16: Add the Check Report Model
**Explanation:**
Add the provider to the generic models so that the generic checks can use it.
**File:** `prowler/providers/check/models.py`
This step is the same as the [SDK providers](#step-16-add-the-check-report-model).
#### Step 17: Create Tests
**Explanation:**
Testing ensures that your API provider works correctly and maintains compatibility as Prowler evolves. Comprehensive tests cover authentication, session management, and API-specific functionality.
**Folder:** `tests/providers//`
```python theme={null}
import pytest
from prowler.providers.api_provider.api_provider import APIProvider
class TestAPIProvider:
"""Test cases for APIProvider."""
def test_provider_initialization(self):
"""Test provider initialization with valid credentials."""
provider = APIProvider(
username="test_user",
password="test_password",
tenant_id="test_tenant"
)
assert provider.type == "your_api_provider"
assert provider.identity is not None
assert provider.session is not None
def test_connection_test(self):
"""Test connection functionality."""
result = APIProvider.test_connection(
username="test_user",
password="test_password",
tenant_id="test_tenant"
)
# Add assertions based on expected behavior
def test_argument_validation(self):
"""Test argument validation."""
from prowler.providers.api_provider.api_provider import (
APIProvider
)
# Valid arguments
APIProvider.validate_arguments(
username="test_user",
password="test_password",
tenant_id="test_tenant"
)
# Invalid arguments
with pytest.raises(ValueError, match="requires username, password and tenant_id"):
APIProvider.validate_arguments("", "", "")
def test_session_setup(self):
"""Test session setup."""
provider = APIProvider(
username="test_user",
password="test_password",
tenant_id="test_tenant"
)
assert provider.session is not None
assert "X-Auth-Token" in provider.session.headers
```
#### Step 18: Update Documentation
**Explanation:**
Documentation updates ensure that users can find information about your API provider in Prowler's documentation. This includes examples, configuration guides, and troubleshooting information.
Update the provider documentation to include your new API provider in the examples and implementation guidance.
***
### Tool/Wrapper Providers
**Definition:**
* Integrate third-party tools as libraries or subprocesses (e.g., Trivy for IaC).
* Examples: IaC (Trivy).
**Typical Use Cases:**
* Providers that require integration with external security tools.
* Tools that need to be executed as subprocesses or imported as libraries.
* Providers that require specific tool configurations and argument mapping.
* Legacy systems or tools that don't have direct API access.
**Key Characteristics:**
* No session/identity management required (tool handles this internally).
* Arguments: tool-specific, for example `scan_path`, `frameworks`, `exclude_path`, `scan_repository_url`, etc.
* Outputs: Tool-specific output formats that need to be parsed and converted.
* Tool execution and output parsing.
* Configuration file mapping and argument translation.
**Implementation Details:**
* Tool providers typically execute external tools as subprocesses (e.g., `pwsh` or `trivy` command).
* They require mapping between Prowler's interface and the tool's arguments.
* Output parsing and conversion to Prowler's standard format is crucial.
* Tool-specific configuration files and validation.
* Repository cloning and temporary file management for remote scans (if needed).
**Note:** This guide provides a general framework for integrating any external tool. The specific implementation details (like repository cloning, authentication tokens, etc.) will depend on your particular tool's requirements. The core pattern is: integrate with Prowler's CLI, execute your tool via subprocess, and parse the output into Prowler's format.
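The core pattern named in the note (run the tool as a subprocess, then parse its output) can be sketched without any real tool installed. This is an illustrative sketch under stated assumptions: `run_tool` and `to_reports` are hypothetical helpers, the finding fields `id` and `ok` are invented, and the "tool" is a Python one-liner that prints JSON.

```python
import json
import subprocess
import sys


def run_tool(command: list[str]) -> list[dict]:
    """Run an external command and return its JSON output as a list of findings."""
    process = subprocess.run(command, capture_output=True, text=True)
    if process.returncode != 0:
        raise RuntimeError(f"Tool failed: {process.stderr.strip()}")
    output = json.loads(process.stdout or "[]")
    # Some tools emit a single object instead of a list; normalize to a list.
    return [output] if isinstance(output, dict) else output


def to_reports(findings: list[dict]) -> list[dict]:
    """Convert tool findings into simple status records (hypothetical format)."""
    return [
        {"check_id": item["id"], "status": "PASS" if item["ok"] else "FAIL"}
        for item in findings
    ]


# Stand-in for a real tool: a Python one-liner that prints JSON findings.
fake_tool = [
    sys.executable,
    "-c",
    'import json; print(json.dumps([{"id": "T1", "ok": True}, {"id": "T2", "ok": False}]))',
]
reports = to_reports(run_tool(fake_tool))
```

Passing the command as a list rather than a shell string avoids quoting problems when paths contain spaces.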
***
### Implementation Guide for Tool/Wrapper Providers
#### Step 1: Create the Provider Structure
**Explanation:**
Tool/Wrapper providers require a specific folder structure to organize tool integration, configuration, and service management. This structure follows Prowler's conventions and ensures proper integration with the CLI and API.
**Required Structure:**
```
prowler/providers//
├── __init__.py
├── _provider.py
├── models.py
└── lib/
    ├── __init__.py
    └── arguments/
        ├── __init__.py
        └── arguments.py
```
**Key Components:**
* **`_provider.py`**: Main provider class with tool integration
* **`models.py`**: Data structures for tool output and configuration
* **`lib/arguments/`**: CLI argument validation and parsing
#### Step 2: Implement the Provider Class
**Explanation:**
The provider class is the core component that handles tool integration, execution, and output parsing. It inherits from Prowler's base Provider class and implements tool-specific execution flows using subprocesses or library calls.
**File:** `prowler/providers//_provider.py`
```python theme={null}
import json
import subprocess
import sys
from typing import List
from colorama import Fore, Style
from prowler.config.config import (
default_config_file_path,
load_and_validate_config_file,
)
from prowler.lib.check.models import CheckReportYourTool
from prowler.lib.logger import logger
from prowler.lib.utils.utils import print_boxes
from prowler.providers.common.models import Audit_Metadata
from prowler.providers.common.provider import Provider
class ToolProvider(Provider):
"""
ToolProvider class is the main class for the Your Tool Provider.
This class is responsible for initializing the provider, executing the external tool,
parsing tool output, and converting results to Prowler's standard format.
Attributes:
_type (str): The provider type.
_session: Not used for tool providers.
_identity (str): Simple identity for tool providers.
_audit_config (dict): The audit configuration.
audit_metadata (Audit_Metadata): The audit metadata.
"""
_type: str = "your_tool_provider"
audit_metadata: Audit_Metadata
def __init__(
self,
# Tool-specific parameters
scan_path: str = ".",
tool_specific_arg: str = "default_value",
exclude_path: list[str] = [],
# Configuration
config_path: str = None,
config_content: dict = None,
fixer_config: dict = {},
# Authentication (if needed for your tool)
auth_token: str = None,
auth_username: str = None,
):
"""
Initializes the ToolProvider instance.
Args:
scan_path: Path to the folder containing files to scan
tool_specific_arg: Tool-specific argument for your external tool
exclude_path: List of paths to exclude from scan
config_path: Path to the configuration file
config_content: Configuration content as dictionary
fixer_config: Fixer configuration dictionary
auth_token: Authentication token for your tool (if needed)
auth_username: Username for your tool (if needed)
Raises:
ValueError: If required parameters are missing
"""
logger.info("Instantiating YourTool Provider...")
# Store tool-specific parameters
self.scan_path = scan_path
self.tool_specific_arg = tool_specific_arg
self.exclude_path = exclude_path
self.region = "global"
self.audited_account = "local-tool"
self._session = None
self._identity = "prowler"
self._auth_method = "No auth"
# Handle tool authentication if needed
if auth_token:
self.auth_token = auth_token
self._auth_method = "Token"
logger.info("Using token for tool authentication")
elif auth_username:
self.auth_username = auth_username
self._auth_method = "Username"
logger.info("Using username for tool authentication")
else:
logger.debug("No authentication method provided; proceeding without authentication.")
# Audit Config
if config_content:
self._audit_config = config_content
else:
if not config_path:
config_path = default_config_file_path
self._audit_config = load_and_validate_config_file(self._type, config_path)
# Fixer Config
self._fixer_config = fixer_config
# Mutelist (not needed for tool providers since tools have their own mutelist logic)
self._mutelist = None
Provider.set_global_provider(self)
@property
def auth_method(self):
"""Returns the authentication method used."""
return self._auth_method
@property
def type(self):
"""Returns the type of the provider."""
return self._type
@property
def identity(self):
"""Returns the provider identity."""
return self._identity
@property
def session(self):
"""Returns the session (not used for tool providers)."""
return self._session
@property
def audit_config(self):
"""Returns the audit configuration."""
return self._audit_config
@property
def fixer_config(self):
"""Returns the fixer configuration."""
return self._fixer_config
def setup_session(self):
"""Tool providers don't need a session since they use external tools directly"""
return None
def _process_check(self, finding: dict, check: dict, status: str) -> CheckReportYourTool:
"""
Process a single check (failed or passed) and create a CheckReportYourTool object.
Args:
finding: The finding object from tool output
check: The individual check data
status: The status of the check ("FAIL", "PASS", or "MUTED")
Returns:
CheckReportYourTool: The processed check report
"""
try:
metadata_dict = {
"Provider": "your_tool_provider",
"CheckID": check.get("check_id", ""),
"CheckTitle": check.get("check_name", ""),
"CheckType": ["Your Tool Provider"],
"ServiceName": finding["check_type"],
"SubServiceName": "",
"ResourceIdTemplate": "",
"Severity": (
check.get("severity", "low").lower()
if check.get("severity")
else "low"
),
"ResourceType": "your_tool",
"Description": check.get("check_name", ""),
"Risk": "",
"RelatedUrl": (
check.get("guideline", "") if check.get("guideline") else ""
),
"Remediation": {
"Code": {
"NativeIaC": "",
"Terraform": "",
"CLI": "",
"Other": "",
},
"Recommendation": {
"Text": "",
"Url": (
check.get("guideline", "") if check.get("guideline") else ""
),
},
},
"Categories": [],
"DependsOn": [],
"RelatedTo": [],
"Notes": "",
}
# Convert metadata dict to JSON string
metadata = json.dumps(metadata_dict)
report = CheckReportYourTool(metadata=metadata, finding=check)
report.status = status
report.resource_tags = check.get("entity_tags", {})
report.status_extended = check.get("check_name", "")
if status == "MUTED":
report.muted = True
return report
except Exception as error:
logger.critical(
f"{error.__class__.__name__}:{error.__traceback__.tb_lineno} -- {error}"
)
sys.exit(1)
def run(self) -> List[CheckReportYourTool]:
"""
Main execution method that handles tool execution.
Returns:
List[CheckReportYourTool]: List of check reports from the tool scan
"""
return self.run_scan(self.scan_path, self.exclude_path)
def run_scan(
self, directory: str, exclude_path: list[str]
) -> List[CheckReportYourTool]:
"""
        Execute the external tool and parse its output.

        Args:
            directory: Directory to scan
            exclude_path: List of paths to exclude

        Returns:
            List[CheckReportYourTool]: List of check reports
        """
        try:
            logger.info(f"Running YourTool scan on {directory} ...")
            # Build the tool command
            tool_command = [
                "your_tool_command",
                # Add your tool-specific arguments here; these are just examples
                "-d",
                directory,
                "-o",
                "json",
            ]
if exclude_path:
tool_command.extend(["--skip-path", ",".join(exclude_path)])
# Run the tool with JSON output
process = subprocess.run(
tool_command,
capture_output=True,
text=True,
)
# Log tool's error output if any
if process.stderr:
logger.error(process.stderr)
try:
output = json.loads(process.stdout)
if not output:
logger.warning("No findings returned from YourTool scan")
return []
except Exception as error:
logger.critical(
f"{error.__class__.__name__}:{error.__traceback__.tb_lineno} -- {error}"
)
sys.exit(1)
reports = []
# If only one framework has findings, the output is a dict, otherwise it's a list of dicts
if isinstance(output, dict):
output = [output]
# Process all frameworks findings
for finding in output:
results = finding.get("results", {})
# Process failed checks
failed_checks = results.get("failed_checks", [])
for failed_check in failed_checks:
report = self._process_check(finding, failed_check, "FAIL")
reports.append(report)
# Process passed checks
passed_checks = results.get("passed_checks", [])
for passed_check in passed_checks:
report = self._process_check(finding, passed_check, "PASS")
reports.append(report)
# Process skipped checks (muted)
skipped_checks = results.get("skipped_checks", [])
for skipped_check in skipped_checks:
report = self._process_check(finding, skipped_check, "MUTED")
reports.append(report)
return reports
except Exception as error:
if "No such file or directory: 'your_tool_command'" in str(error):
logger.critical("Please, install your_tool using 'pip install your_tool'")
sys.exit(1)
logger.critical(
f"{error.__class__.__name__}:{error.__traceback__.tb_lineno} -- {error}"
)
sys.exit(1)
def print_credentials(self):
"""
Display scan information with color formatting.
This method prints the tool scan information in a formatted way
using colorama for better readability.
"""
        # scan_repository_url is only set by providers that support remote scans
        scan_repository_url = getattr(self, "scan_repository_url", None)
        if scan_repository_url:
            report_title = (
                f"{Style.BRIGHT}Scanning remote repository:{Style.RESET_ALL}"
            )
            report_lines = [
                f"Repository: {Fore.YELLOW}{scan_repository_url}{Style.RESET_ALL}",
            ]
        else:
            report_title = (
                f"{Style.BRIGHT}Scanning local directory:{Style.RESET_ALL}"
            )
            report_lines = [
                f"Directory: {Fore.YELLOW}{self.scan_path}{Style.RESET_ALL}",
            ]
        if self.exclude_path:
            report_lines.append(
                f"Excluded paths: {Fore.YELLOW}{', '.join(self.exclude_path)}{Style.RESET_ALL}"
            )
        report_lines.append(
            f"Tool-specific argument: {Fore.YELLOW}{self.tool_specific_arg}{Style.RESET_ALL}"
        )
report_lines.append(
f"Authentication method: {Fore.YELLOW}{self.auth_method}{Style.RESET_ALL}"
)
print_boxes(report_lines, report_title)
```
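The parsing half of `run_scan` is easier to follow, and to unit test, as a pure function. The sketch below assumes the same output shape used in the example above (a dict or list of dicts with `results.failed_checks`, `results.passed_checks`, and `results.skipped_checks`); `flatten_results`, `STATUS_BY_KEY`, and the sample data are hypothetical.

```python
# Map each result bucket in the tool output to a Prowler-style status.
STATUS_BY_KEY = {
    "failed_checks": "FAIL",
    "passed_checks": "PASS",
    "skipped_checks": "MUTED",
}


def flatten_results(output) -> list[tuple[str, str]]:
    """Return (status, check_id) pairs from tool output shaped like the example."""
    if not output:
        return []
    # A single framework yields a dict; several yield a list of dicts.
    if isinstance(output, dict):
        output = [output]
    pairs = []
    for finding in output:
        results = finding.get("results", {})
        for key, status in STATUS_BY_KEY.items():
            for check in results.get(key, []):
                pairs.append((status, check.get("check_id", "")))
    return pairs


# Invented sample output for illustration only.
sample = {
    "check_type": "example_framework",
    "results": {
        "failed_checks": [{"check_id": "CHECK_1"}],
        "passed_checks": [{"check_id": "CHECK_2"}],
        "skipped_checks": [{"check_id": "CHECK_3"}],
    },
}
pairs = flatten_results(sample)
```

Keeping the status mapping in one table means a new result bucket needs a single new entry rather than another loop.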
#### Step 3: Create Models
**Explanation:**
Models define the data structures used by your tool provider. They include output options and tool-specific configurations. These models ensure type safety and consistent data handling across the provider.
**File:** `prowler/providers//models.py`
```python theme={null}
from prowler.config.config import output_file_timestamp
from prowler.providers.common.models import ProviderOutputOptions
class YourToolOutputOptions(ProviderOutputOptions):
"""
YourToolOutputOptions overrides ProviderOutputOptions for tool-specific output logic.
For example, generating a filename that includes the tool name.
Attributes inherited from ProviderOutputOptions:
- output_filename (str): The base filename used for generated reports.
- output_directory (str): The directory to store the output files.
- ... see ProviderOutputOptions for more details.
Methods:
- __init__: Customizes the output filename logic for the tool provider.
"""
def __init__(self, arguments, bulk_checks_metadata):
super().__init__(arguments, bulk_checks_metadata)
# If --output-filename is not specified, build a default name.
if not getattr(arguments, "output_filename", None):
self.output_filename = f"prowler-output-your_tool-{output_file_timestamp}"
# If --output-filename was explicitly given, respect that
else:
self.output_filename = arguments.output_filename
```
#### Step 4: Implement Arguments
**Explanation:**
Argument validation ensures that the tool provider receives valid configuration parameters. This step is crucial for preventing runtime errors and providing clear error messages to users.
**File:** `prowler/providers//lib/arguments/arguments.py`
```python theme={null}
# Add your tool-specific choices if needed
TOOL_SPECIFIC_CHOICES = [
"option1",
"option2",
"option3",
# Add your tool's supported options
]
def init_parser(self):
"""Init the Provider CLI parser"""
_parser = self.subparsers.add_parser(
"", parents=[self.common_providers_parser], help=" Provider"
)
# Scan Path
_scan_subparser = _parser.add_argument_group("Scan Path")
_scan_subparser.add_argument(
"--scan-path",
"-P",
dest="scan_path",
default=".",
help="Path to the folder containing your files to scan. Default: current directory.",
)
_scan_subparser.add_argument(
"--tool-specific-arg",
dest="tool_specific_arg",
default="default_value",
choices=TOOL_SPECIFIC_CHOICES,
help="Tool-specific argument for your external tool. Default: default_value",
)
_scan_subparser.add_argument(
"--exclude-path",
dest="exclude_path",
nargs="+",
default=[],
        help="Space-separated list of paths to exclude from the scan. Default: none",
)
# Authentication (if needed for your tool)
_scan_subparser.add_argument(
"--auth-token",
dest="auth_token",
nargs="?",
default=None,
help="Authentication token for your tool. If not provided, will use YOUR_TOOL_AUTH_TOKEN env var.",
)
_scan_subparser.add_argument(
"--auth-username",
dest="auth_username",
nargs="?",
default=None,
help="Username for your tool authentication. If not provided, will use YOUR_TOOL_AUTH_USERNAME env var.",
)
def validate_arguments(arguments):
"""
Validate tool-specific arguments.
Args:
arguments: The parsed arguments
Returns:
tuple: (is_valid, error_message)
"""
scan_path = getattr(arguments, "scan_path", None)
scan_repository_url = getattr(arguments, "scan_repository_url", None)
if scan_path and scan_repository_url:
# If scan_path is set to default ("."), allow scan_repository_url
if scan_path != ".":
return (
False,
"--scan-path (-P) and --scan-repository-url (-R) are mutually exclusive. Please specify only one.",
)
return (True, "")
```
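To see how these arguments behave once parsed, the following standalone sketch rebuilds a small subset with plain `argparse`. It is an illustration only: `build_parser` and the `demo-tool` program name are hypothetical, and the `--scan-repository-url` option is defined here solely because the validation message above refers to it.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a small parser that mirrors a few of the arguments shown above."""
    parser = argparse.ArgumentParser(prog="demo-tool")
    parser.add_argument("--scan-path", "-P", dest="scan_path", default=".")
    # nargs="+" collects one or more space-separated values into a list.
    parser.add_argument("--exclude-path", dest="exclude_path", nargs="+", default=[])
    parser.add_argument(
        "--scan-repository-url", "-R", dest="scan_repository_url", default=None
    )
    return parser


def validate_arguments(arguments) -> tuple[bool, str]:
    """Reject a non-default scan path combined with a repository URL."""
    if arguments.scan_repository_url and arguments.scan_path != ".":
        return (False, "--scan-path and --scan-repository-url are mutually exclusive.")
    return (True, "")


parser = build_parser()
local_args = parser.parse_args(["-P", "infra", "--exclude-path", "tests", "docs"])
mixed_args = parser.parse_args(["-P", "infra", "-R", "https://example.com/repo.git"])
```

Returning an `(is_valid, error_message)` tuple instead of raising lets the caller decide how to report the problem to the user.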
#### Step 5: Register in CLI
**Explanation:**
Add your provider to the available providers in the CLI.
**File:** `prowler/lib/cli/parser.py`
This step is the same as the [SDK providers](#step-9-register-in-cli).
#### Step 6: Register in Main
**Explanation:**
Main registration makes your provider discoverable by Prowler's core system. It also adds your provider to the output options and to the compliance evaluation.
**File:** `prowler/__main__.py`
This step is the same as the [SDK providers](#step-10-register-in-main).
#### Step 7: Register in the list of providers
**Explanation:**
Registering the provider here lets the generic checks use it. The provider must be registered in the `init_global_provider` method to handle CLI arguments and initialization.
**File:** `prowler/providers/common/provider.py`
This step is the same as the [SDK providers](#step-11-register-in-the-list-of-providers).
#### Step 8: Add to Config
**Explanation:**
Configuration registration ensures your tool provider is recognized by Prowler's configuration system. This enables proper handling of provider-specific settings and defaults.
**File:** `prowler/config/config.py`
This step is the same as the [SDK providers](#step-12-add-to-config).
In some cases, you may need to create a new configuration file for your provider, for example, the AWS one that is inside `prowler/providers/aws/config.py`.
#### Step 9: Create Compliance Files
**Explanation:**
Compliance files define the security checks and standards that your provider supports. These JSON files map security controls to specific checks and provide remediation guidance. The folder, including an init file, must exist for the provider to work; adding compliance files to it is optional.
**Folder:** `prowler/compliance//`
This step is the same as the [SDK providers](#step-13-create-compliance-files).
#### Step 10: Add Output Support
**Explanation:**
Output support ensures that your provider's results are properly formatted in Prowler's various output formats (CSV, JSON, HTML, etc.). This step integrates your provider with Prowler's reporting system.
**File:** `prowler/lib/outputs/summary_table.py`
This step is the same as the [SDK providers](#step-14-add-output-support).
#### Step 11: Generate the HTML Report
**Explanation:**
The HTML report can only be generated once the HTML output module supports your provider. This step adds that support so that assessment summaries and findings are displayed properly.
**File:** `prowler/lib/outputs/html/html.py`
This step is the same as the [SDK providers](#step-15-generate-the-html-report).
#### Step 12: Add the Check Report Model
**Explanation:**
Add the provider to the generic models so that the generic checks can use it.
**File:** `prowler/providers/check/models.py`
This step is the same as the [SDK providers](#step-16-add-the-check-report-model).
#### Step 13: Create Tests
**Explanation:**
Testing ensures that your tool provider works correctly and maintains compatibility as Prowler evolves. Comprehensive tests cover tool execution, output parsing, and provider-specific functionality.
**Folder:** `tests/providers//`
```python theme={null}
import pytest

from prowler.providers.your_tool_provider.your_tool_provider import ToolProvider


class TestToolProvider:
    """Test cases for ToolProvider."""

    def test_provider_initialization(self):
        """Test provider initialization with valid parameters."""
        # Only pass arguments that ToolProvider.__init__ actually accepts
        provider = ToolProvider(scan_path=".")
        assert provider.type == "your_tool_provider"
        assert provider.identity == "prowler"
        assert provider.scan_path == "."

    def test_tool_execution(self):
        """Test tool execution and output parsing."""
        provider = ToolProvider(scan_path=".")
        # Mock the subprocess call and test output parsing
        # This will depend on your specific tool's output format

    def test_argument_validation(self):
        """Test argument validation."""
        from prowler.providers.your_tool_provider.lib.arguments.arguments import (
            validate_arguments,
        )

        # Valid arguments
        class MockArgs:
            scan_path = "."
            tool_specific_arg = "value"

        is_valid, message = validate_arguments(MockArgs())
        assert is_valid is True
        # Add more test cases as needed for your specific tool provider

    def test_print_credentials(self):
        """Test print_credentials method."""
        provider = ToolProvider(scan_path="/test/path")
        # This should not raise any exceptions
        provider.print_credentials()
```
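The `test_tool_execution` placeholder above asks for the subprocess call to be mocked. One possible shape for that, shown as a self-contained sketch: `scan` is a hypothetical wrapper standing in for the provider's scan method, and the command and finding fields are invented for the example.

```python
import json
import subprocess
from unittest import mock


def scan(directory: str) -> list[dict]:
    """Hypothetical wrapper: run a tool on a directory and return parsed JSON."""
    process = subprocess.run(
        ["your_tool_command", "-d", directory, "-o", "json"],
        capture_output=True,
        text=True,
    )
    return json.loads(process.stdout)


def test_scan_parses_tool_output():
    """Check command construction and parsing without running a real tool."""
    fake = subprocess.CompletedProcess(
        args=["your_tool_command"],
        returncode=0,
        stdout=json.dumps([{"check_id": "CHECK_1", "status": "FAIL"}]),
        stderr="",
    )
    # Patch subprocess.run so the test never needs the real tool installed.
    with mock.patch("subprocess.run", return_value=fake) as mocked_run:
        findings = scan("/tmp/project")
    mocked_run.assert_called_once()
    command = mocked_run.call_args.args[0]
    assert command[:3] == ["your_tool_command", "-d", "/tmp/project"]
    assert findings == [{"check_id": "CHECK_1", "status": "FAIL"}]
```

Returning a real `subprocess.CompletedProcess` from the mock keeps the fake close to what the code sees in production, so attribute typos such as `process.stdout` are still caught.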
#### Step 14: Update Documentation
**Explanation:**
Documentation updates ensure that users can find information about your tool provider in Prowler's documentation. This includes examples, configuration guides, and troubleshooting information.
Update the provider documentation to include your new tool provider in the examples and implementation guidance.
***
## Step 2: Integrate the Provider in the API
This step is required only if you want your provider to be available in the API and UI. The API integration involves several components:
### 2.1. Backend API Models
**Location:** `api/src/backend/api/models.py`
Add your provider to the `ProviderChoices` enum and implement UID validation:
```python theme={null}
class ProviderChoices(models.TextChoices):
AWS = "aws", "AWS"
AZURE = "azure", "Azure"
GCP = "gcp", "GCP"
KUBERNETES = "kubernetes", "Kubernetes"
M365 = "m365", "Microsoft 365"
GITHUB = "github", "GitHub"
NHN = "nhn", "NHN Cloud"
IAC = "iac", "Infrastructure as Code"
YOUR_PROVIDER = "your_provider", "Your Provider" # Add your provider here
@staticmethod
def validate_your_provider_uid(value):
"""Validate your provider UID format."""
if not re.match(r"^your-regex-pattern$", value):
raise ModelValidationError(
detail="Your provider UID must follow the specified format.",
code="your-provider-uid",
pointer="/data/attributes/uid",
)
```
**Provider Model:**
The `Provider` model already exists and supports all provider types. Ensure your provider type is included in the choices.
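The UID check can be prototyped outside Django before wiring it into the model. The sketch below is a standalone illustration: the pattern is an invented rule, and `UIDValidationError`, `validate_demo_provider_uid`, and `is_valid_uid` are hypothetical names, with the error class standing in for the API's `ModelValidationError`.

```python
import re

# Hypothetical rule: 3 to 63 characters, lowercase letters, digits, and hyphens,
# starting with a letter. Replace it with the format your provider really uses.
UID_PATTERN = re.compile(r"[a-z][a-z0-9-]{2,62}")


class UIDValidationError(ValueError):
    """Hypothetical stand-in for the API's ModelValidationError."""


def validate_demo_provider_uid(value: str) -> str:
    """Return the UID unchanged if it matches the pattern, otherwise raise."""
    # fullmatch requires the whole string to match, including any trailing newline.
    if not UID_PATTERN.fullmatch(value):
        raise UIDValidationError(
            f"UID {value!r} must be 3-63 lowercase letters, digits, or hyphens."
        )
    return value


def is_valid_uid(value: str) -> bool:
    """Convenience wrapper that reports validity instead of raising."""
    try:
        validate_demo_provider_uid(value)
    except UIDValidationError:
        return False
    return True
```

Using `fullmatch` instead of `match` with `^...$` anchors avoids accepting a value that ends in a newline.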
### 2.2. Add the provider to the Provider Choices
Update the `return_prowler_provider` function to include your provider. This function is crucial for the API to instantiate the correct provider class.
**File:** `api/src/backend/api/utils.py`
```python theme={null}
from prowler.providers.your_provider.your_provider import YourProvider # Add your import
def return_prowler_provider(
provider: Provider,
) -> [
AwsProvider
| AzureProvider
| GcpProvider
| GithubProvider
| KubernetesProvider
| M365Provider
| YourProvider # Add your provider to the return type annotation
]:
"""Return the Prowler provider class based on the given provider type."""
match provider.provider:
case Provider.ProviderChoices.AWS.value:
prowler_provider = AwsProvider
case Provider.ProviderChoices.AZURE.value:
prowler_provider = AzureProvider
case Provider.ProviderChoices.GCP.value:
prowler_provider = GcpProvider
case Provider.ProviderChoices.KUBERNETES.value:
prowler_provider = KubernetesProvider
case Provider.ProviderChoices.M365.value:
prowler_provider = M365Provider
case Provider.ProviderChoices.GITHUB.value:
prowler_provider = GithubProvider
case Provider.ProviderChoices.YOUR_PROVIDER.value: # Add your provider here
prowler_provider = YourProvider
case _:
raise ValueError(f"Provider type {provider.provider} not supported")
return prowler_provider
```
**Also update the `initialize_prowler_provider` function:**
```python theme={null}
def initialize_prowler_provider(
provider: Provider,
mutelist_processor: Processor | None = None,
) -> (
AwsProvider
| AzureProvider
| GcpProvider
| GithubProvider
| KubernetesProvider
| M365Provider
| YourProvider # Add your provider to the return type annotation
):
"""Initialize a Prowler provider instance based on the given provider type."""
prowler_provider = return_prowler_provider(provider)
prowler_provider_kwargs = get_prowler_provider_kwargs(provider, mutelist_processor)
return prowler_provider(**prowler_provider_kwargs)
```
**Note:** The `match` statement requires Python 3.10+. If you're using an older version, you can use traditional `if-elif` statements instead.
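Besides the `if-elif` chain mentioned in the note, a dictionary lookup is another option that does not depend on `match`. The sketch below swaps in that technique for illustration; it is not how the API module is written, and the stub classes, `PROVIDER_CLASSES`, and `return_provider_class` are hypothetical names.

```python
class AwsProviderStub:
    """Placeholder for a real provider class."""


class YourProviderStub:
    """Placeholder for a real provider class."""


# Map provider type strings to classes; adding a provider is one new entry.
PROVIDER_CLASSES = {
    "aws": AwsProviderStub,
    "your_provider": YourProviderStub,
}


def return_provider_class(provider_type: str) -> type:
    """Return the class registered for provider_type, or raise ValueError."""
    try:
        return PROVIDER_CLASSES[provider_type]
    except KeyError:
        raise ValueError(f"Provider type {provider_type} not supported") from None


selected = return_provider_class("your_provider")
```

The function returns the class itself rather than an instance, matching the split between `return_prowler_provider` and `initialize_prowler_provider` shown above.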
### 2.3. API Serializers
Create or update serializers for your provider. You'll need to add your provider to the validation logic:
**File:** `api/src/backend/api/v1/serializers.py`
```python theme={null}
def validate_secret_based_on_provider(provider_type, secret):
    """Validate provider-specific secrets."""
    if provider_type == Provider.ProviderChoices.AWS.value:
        serializer = AWSProviderSecret(data=secret)
    elif provider_type == Provider.ProviderChoices.AZURE.value:
        serializer = AzureProviderSecret(data=secret)
    elif provider_type == Provider.ProviderChoices.GCP.value:
        serializer = GCPProviderSecret(data=secret)
    elif provider_type == Provider.ProviderChoices.YOUR_PROVIDER.value:  # Add your provider here
        serializer = YourProviderSecret(data=secret)
    # ... other providers
    if serializer.is_valid():
        return serializer.validated_data
    else:
        raise serializers.ValidationError(serializer.errors)


class YourProviderSecret(serializers.Serializer):
    """Serializer for your provider credentials."""

    your_auth_field = serializers.CharField(required=True)
    your_optional_field = serializers.CharField(required=False)

    class Meta:
        resource_name = "provider-secrets"
```
Also update the providers included in the serializer:
**File:** `api/src/backend/api/v1/serializer_utils/providers.py`
```python theme={null}
@extend_schema_field(
{
"oneOf": [
# ... existing provider schemas ...
{
"type": "object",
"title": "Your Provider Credentials",
"properties": {
"your_auth_field": {
"type": "string",
"description": "Your provider authentication field description.",
},
"your_optional_field": {
"type": "string",
"description": "Optional field for your provider (if applicable).",
},
"your_required_field": {
"type": "string",
"description": "Required field for your provider authentication.",
}
},
"required": ["your_auth_field", "your_required_field"]
},
# ... other existing schemas ...
]
}
)
```
### 2.4. Database Migration
Create a new migration to add your provider to the database. This is crucial for the API to recognize your provider type.
**File:** `api/src/backend/api/migrations/XXXX_your_provider.py`
```python theme={null}
# Generated by Django X.X.X on YYYY-MM-DD
from django.db import migrations

import api.db_utils


class Migration(migrations.Migration):
    dependencies = [
        ("api", "previous_migration_name"),  # Update this to the latest migration
    ]

    operations = [
        migrations.AlterField(
            model_name="provider",
            name="provider",
            field=api.db_utils.ProviderEnumField(
                choices=[
                    ("aws", "AWS"),
                    ("azure", "Azure"),
                    ("gcp", "GCP"),
                    ("kubernetes", "Kubernetes"),
                    ("m365", "M365"),
                    ("github", "GitHub"),
                    ("your_provider", "Your Provider"),  # Add your provider here
                ],
                default="aws",
            ),
        ),
        migrations.RunSQL(
            "ALTER TYPE provider ADD VALUE IF NOT EXISTS 'your_provider';",
            reverse_sql=migrations.RunSQL.noop,
        ),
    ]
```
**Important Notes:**
* **Migration Number**: Use the next sequential number (e.g., if latest is 0044, use 0045)
* **Dependencies**: Update the `dependencies` list to point to the most recent migration
* **Choices Array**: Add your provider to the `choices` array with proper display name
* **SQL Operation**: The `RunSQL` operation adds your provider to the PostgreSQL enum type
* **Reverse SQL**: Use `migrations.RunSQL.noop` since adding enum values cannot be easily reversed
**Migration Naming Convention:**
* Format: `XXXX_your_provider.py` (e.g., `0045_your_provider.py`)
* Use descriptive names that indicate what the migration does
* Follow the existing pattern in the migrations folder
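The "next sequential number" rule can be sketched as a small helper. `next_migration_name` is a hypothetical function written for this guide, not part of Prowler or Django; it assumes existing migration files start with a four-digit prefix followed by an underscore.

```python
import re


def next_migration_name(existing: list[str], slug: str) -> str:
    """Build the next sequential migration filename from existing filenames."""
    numbers = [
        int(match.group(1))
        for name in existing
        # Files without a four-digit prefix (e.g. __init__.py) are ignored.
        if (match := re.match(r"(\d{4})_", name))
    ]
    next_number = max(numbers, default=0) + 1
    return f"{next_number:04d}_{slug}.py"


print(next_migration_name(["__init__.py", "0001_initial.py", "0044_previous.py"], "your_provider"))
# prints "0045_your_provider.py"
```

In practice, `existing` would come from listing the migrations folder; the `dependencies` entry in the new migration should then name the file with the highest prefix.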
### 2.5. Update the V1 Yaml
Update the OpenAPI specification (`v1.yaml`) to include your provider in all relevant endpoints and schemas. This is crucial for API documentation and client generation.
**File:** `api/src/backend/api/specs/v1.yaml`
#### 2.5.1. Provider Enum Values
Add your provider to the provider enum in the Provider schema:
```yaml theme={null}
# Around line 12150 in v1.yaml
Provider:
  type: object
  properties:
    attributes:
      properties:
        provider:
          enum:
          - aws
          - azure
          - gcp
          - kubernetes
          - m365
          - github
          - your_provider  # Add your provider here
          type: string
          # Also add your provider to the description list below
          description: |-
            * `aws` - AWS
            * `azure` - Azure
            * `gcp` - GCP
            * `kubernetes` - Kubernetes
            * `m365` - M365
            * `github` - GitHub
            * `your_provider` - Your Provider
```
#### 2.5.2. Provider Credential Schemas
Add your provider's credential schema to the integration configuration. This defines how your provider's credentials are structured:
```yaml theme={null}
# Around line 11100 in v1.yaml, in the integration configuration
- type: object
  title: Your Provider Credentials  # Add your provider here
  properties:
    your_auth_field:
      type: string
      description: Your provider authentication field description.
    your_optional_field:
      type: string
      description: Optional field for your provider (if applicable).
    your_required_field:
      type: string
      description: Required field for your provider authentication.
  required:
  - your_auth_field
  - your_required_field
```
#### 2.5.3. Example Provider Schemas
Here are examples of how existing providers are documented:
**AWS Provider:**
```yaml theme={null}
- type: object
  title: AWS Static Credentials
  properties:
    aws_access_key_id:
      type: string
      description: The AWS access key ID.
    aws_secret_access_key:
      type: string
      description: The AWS secret access key.
  required:
  - aws_access_key_id
  - aws_secret_access_key
- type: object
  title: AWS Assume Role
  properties:
    role_arn:
      type: string
      description: The Amazon Resource Name (ARN) of the role to assume.
    external_id:
      type: string
      description: An identifier to enhance security for role assumption.
  required:
  - role_arn
  - external_id
```
**GitHub Provider:**
```yaml theme={null}
- type: object
  title: GitHub Personal Access Token
  properties:
    personal_access_token:
      type: string
      description: GitHub personal access token for authentication.
  required:
  - personal_access_token
- type: object
  title: GitHub OAuth App Token
  properties:
    oauth_app_token:
      type: string
      description: GitHub OAuth App token for authentication.
  required:
  - oauth_app_token
```
**M365 Provider:**
```yaml theme={null}
- type: object
  title: M365 Static Credentials
  properties:
    client_id:
      type: string
      description: The Azure application (client) ID for authentication in Azure AD.
    client_secret:
      type: string
      description: The client secret associated with the application (client) ID.
    tenant_id:
      type: string
      description: The Azure tenant ID, representing the directory where the application is registered.
    user:
      type: email
      description: User microsoft email address.
    password:
      type: string
      description: User password.
  required:
  - client_id
  - client_secret
  - tenant_id
  - user
  - password
```
#### 2.5.4. Important Notes
* **Position**: Add your schema in the `oneOf` array alongside existing providers
* **Structure**: Follow the exact pattern of other providers (title, properties, required fields)
* **Descriptions**: Provide clear, helpful descriptions for each field
* **Required Fields**: Specify which fields are mandatory in the `required` array
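* **Field Types**: Use appropriate JSON Schema types (`string`, `integer`, `boolean`, etc.). In standard JSON Schema and OpenAPI, `email` is a `format` of `string` rather than a type of its own, even though the existing M365 schema writes `type: email`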
* **Validation**: Add any field-specific validation patterns or constraints
* **Documentation**: Ensure your provider appears in the generated API documentation
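A mismatch between the `required` array and the declared `properties` is easy to miss in a long spec. The sketch below shows one way to check for it; `undeclared_required_fields` is a hypothetical helper, not part of Prowler, and it assumes the schema has already been loaded into a plain dictionary.

```python
def undeclared_required_fields(schema: dict) -> list[str]:
    """Return names listed in `required` that are missing from `properties`."""
    properties = schema.get("properties", {})
    return [name for name in schema.get("required", []) if name not in properties]


# Example schema with a deliberate mistake: `your_required_field` is
# required but never declared under `properties`.
example_schema = {
    "type": "object",
    "title": "Your Provider Credentials",
    "properties": {
        "your_auth_field": {"type": "string"},
        "your_optional_field": {"type": "string"},
    },
    "required": ["your_auth_field", "your_required_field"],
}

print(undeclared_required_fields(example_schema))
# prints ['your_required_field']
```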
### 2.6. Testing API Integration
Create tests for your provider:
**Location:** `api/src/backend/api/tests/`
```python theme={null}
class YourProviderAPITestCase(APITestCase):
    def setUp(self):
        self.user = User.objects.create_user(username='testuser', password='testpass')
        self.client.force_authenticate(user=self.user)

    def test_create_your_provider(self):
        data = {
            'provider': 'your_provider',
            'uid': 'valid-uid-123',
            'alias': 'Test Account'
        }
        response = self.client.post('/api/v1/providers/', data)
        self.assertEqual(response.status_code, 201)
        self.assertEqual(response.data['provider'], 'your_provider')

    def test_your_provider_uid_validation(self):
        """Test UID validation for your provider."""
        invalid_uids = [
            'invalid@uid',
            '-invalid-start',
            'a' * 40,  # Too long
        ]
        for invalid_uid in invalid_uids:
            data = {
                'provider': 'your_provider',
                'uid': invalid_uid,
                'alias': 'Test'
            }
            response = self.client.post('/api/v1/providers/', data)
            self.assertEqual(response.status_code, 400)
            self.assertIn('your-provider-uid', str(response.data))

    def test_add_your_provider_credentials(self):
        # Create provider first
        provider = Provider.objects.create(
            user=self.user,
            provider='your_provider',
            uid='valid-uid-123'
        )
        # Add credentials
        credentials_data = {
            'secret_type': 'your_provider_credentials',
            'secret': {
                'your_auth_field': 'auth_value',
                'your_optional_field': 'optional_value'
            },
            'provider': provider.id
        }
        response = self.client.post('/api/v1/providers/secrets/', credentials_data)
        self.assertEqual(response.status_code, 201)
```
#### 2.6.1. Add your mocked provider to the tests
If needed, add your mocked provider to the tests configuration file so that it is available as a fixture in the tests.
**File:** `api/src/backend/conftest.py`
```python theme={null}
@pytest.fixture
def providers_fixture(tenants_fixture):
    tenant, *_ = tenants_fixture
    # ... existing providers (provider1, provider2, provider3, ...) ...
    providerX = Provider.objects.create(
        provider="your_provider",
        uid="your_uid",
        alias="your_alias",
        tenant_id=tenant.id,
    )
    # Return the existing providers followed by the new one
    return provider1, provider2, provider3, providerX
```
### 2.7. Compliance and Output Support
Add your provider to the compliance export functionality:
**File:** `api/src/backend/tasks/jobs/export.py`
```python theme={null}
COMPLIANCE_FRAMEWORKS = {
"aws": [...],
"azure": [...],
"gcp": [...],
"kubernetes": [...],
"m365": [...],
"github": [...],
"your_provider": [ # Add your provider here
(lambda name: name.startswith("cis_"), YourProviderCIS),
(lambda name: name.startswith("iso27001_"), YourProviderISO27001),
],
}
```
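Each entry in the mapping pairs a predicate on the compliance name with an output class. One plausible way to consume such a list is to return the class of the first predicate that matches, as sketched below. This is an assumption about how the lookup could work, not a copy of the logic in `export.py`; the two classes are empty stand-ins and `find_compliance_class` is a hypothetical helper.

```python
from typing import Callable, Optional


# Stand-in classes: hypothetical placeholders for real compliance outputs.
class YourProviderCIS:
    pass


class YourProviderISO27001:
    pass


COMPLIANCE_FRAMEWORKS: dict[str, list[tuple[Callable[[str], bool], type]]] = {
    "your_provider": [
        (lambda name: name.startswith("cis_"), YourProviderCIS),
        (lambda name: name.startswith("iso27001_"), YourProviderISO27001),
    ],
}


def find_compliance_class(provider_type: str, compliance_name: str) -> Optional[type]:
    """Return the class of the first matching predicate, or None if none match."""
    for predicate, compliance_class in COMPLIANCE_FRAMEWORKS.get(provider_type, []):
        if predicate(compliance_name):
            return compliance_class
    return None
```

Because the first match wins in this sketch, more specific predicates would need to come before broader ones.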
If your provider has specific fields, add them to the finding transformation:
**File:** `prowler/lib/outputs/finding.py`
```python theme={null}
def transform_api_finding(cls, finding, provider) -> "Finding":
    # ... existing code ...
    # Your provider specific field
    if provider.type == "your_provider":
        finding.your_field = resource.your_field
    # ... rest of the code ...
```
### 2.8. API Endpoints
Your provider will be available through these endpoints:
* `GET /api/v1/providers/` - List all providers
* `POST /api/v1/providers/` - Create a new provider
* `GET /api/v1/providers/{id}/` - Get provider details
* `PUT /api/v1/providers/{id}/` - Update provider
* `DELETE /api/v1/providers/{id}/` - Delete provider
* `POST /api/v1/providers/secrets/` - Add provider credentials
### 2.9. Update the provider if needed
Depending on your provider's authentication requirements, you may need to add new authentication methods that are compatible with the API. This involves updating the provider class to support additional credential types beyond the basic ones.
#### 2.9.1. Adding New Authentication Methods
If your provider requires specific authentication methods, you'll need to:
1. **Update the provider constructor** to accept new authentication parameters
2. **Extend the credential handling** to support the new authentication method
3. **Update the API serializers** to include the new credential fields
4. **Modify the OpenAPI specification** to document the new authentication schema
#### 2.9.2. Example: GitHub Provider Authentication Methods
The GitHub provider demonstrates how to implement multiple authentication methods:
```python theme={null}
# In prowler/providers/github/github_provider.py
def __init__(
    self,
    # Authentication methods
    personal_access_token: str = "",
    oauth_app_token: str = "",
    github_app_key: str = "",
    # Needed for the API integration
    github_app_key_content: str = "",
    github_app_id: int = 0,
    # Provider configuration
    config_path: str = None,
    # ... other parameters
):
    """
    Initialize GitHub provider.

    Args:
        personal_access_token (str): GitHub personal access token.
        oauth_app_token (str): GitHub OAuth App token.
        github_app_key (str): GitHub App key.
        github_app_key_content (str): GitHub App key content.
        github_app_id (int): GitHub App ID.
        config_path (str): Path to the audit configuration file.
        # ... other parameters
    """
    super().__init__(
        personal_access_token,
        oauth_app_token,
        github_app_id,
        github_app_key,
        github_app_key_content,
    )
```
***
## Step 3: Integrate the Provider in the UI
TBD
***
## Provider Implementation Guidance
Use existing providers as templates. Studying them clarifies the expected structure and makes the implementation easier:
* [AWS (SDK)](https://github.com/prowler-cloud/prowler/blob/master/prowler/providers/aws/aws_provider.py)
* [Azure (SDK)](https://github.com/prowler-cloud/prowler/blob/master/prowler/providers/azure/azure_provider.py)
* [GCP (SDK)](https://github.com/prowler-cloud/prowler/blob/master/prowler/providers/gcp/gcp_provider.py)
* [Kubernetes (SDK)](https://github.com/prowler-cloud/prowler/blob/master/prowler/providers/kubernetes/kubernetes_provider.py)
* [M365 (SDK/Wrapper)](https://github.com/prowler-cloud/prowler/blob/master/prowler/providers/m365/m365_provider.py)
* [GitHub (SDK)](https://github.com/prowler-cloud/prowler/blob/master/prowler/providers/github/github_provider.py)
* [NHN (API)](https://github.com/prowler-cloud/prowler/blob/master/prowler/providers/nhn/nhn_provider.py)
* [IAC (Tool)](https://github.com/prowler-cloud/prowler/blob/master/prowler/providers/iac/iac_provider.py)
* [MongoDB Atlas](https://github.com/prowler-cloud/prowler/blob/master/prowler/providers/mongodbatlas/mongodbatlas_provider.py)
***
## Best Practices
* **Code Quality & Documentation**
  * **Comprehensive Docstrings**: Every class, method, and function should have detailed docstrings following Prowler's format
    ```python theme={null}
    def method_name(self, param: str) -> str:
        """
        Brief description of what the method does.

        Args:
            param: Description of the parameter

        Returns:
            Description of the return value

        Raises:
            ExceptionType: When and why this exception occurs
        """
    ```
  * **Type Hints**: Use type hints for all function parameters and return values
    ```python theme={null}
    from typing import Optional, List, Dict, Any
    ```
  * **Logging**: Implement proper logging using Prowler's logger
    ```python theme={null}
    from prowler.lib.logger import logger

    logger.info("Operation completed successfully")
    logger.warning("Something to be aware of")
    logger.error("Something went wrong")
    logger.critical("Critical error that may cause failure")
    ```
* **Error Handling & Validation**
  * **Custom Exceptions**: Create provider-specific exceptions for better error handling
  * **Input Validation**: Validate all inputs and provide clear error messages
  * **Graceful Degradation**: Handle errors gracefully without crashing the entire scan
  * **Raise on Exception**: Use `raise_on_exception` parameter for test methods
* **Testing & Quality Assurance**
  * **Comprehensive Test Coverage**: Aim for >80% test coverage
  * **Test Naming**: Use descriptive test names: `test_method_name_scenario`
  * **Test Organization**: Group related tests in test classes
  * **Mock External Dependencies**: Mock external API calls and services
  * **Test Edge Cases**: Include tests for error conditions and edge cases
  * **End-to-End Testing**: Test the provider on real infrastructure
* **Performance & Security**
  * **Session Management**: Reuse sessions when possible, don't create new ones unnecessarily
  * **Rate Limiting**: Implement rate limiting for API calls to avoid hitting limits
  * **Resource Cleanup**: Ensure proper cleanup of temporary resources
  * **Authentication Security**: Never log sensitive credentials or tokens
* **Code Organization**
  * **Single Responsibility**: Each method should have one clear purpose
  * **Consistent Naming**: Follow Prowler's naming conventions
  * **Modular Design**: Break complex functionality into smaller, testable methods
  * **Configuration Management**: Use configuration files for provider-specific settings
* **Documentation & Maintenance**
  * **README Updates**: Update provider-specific documentation
  * **Changelog**: Document changes and new features
  * **Examples**: Provide usage examples and common scenarios
  * **Troubleshooting**: Include common issues and solutions
  * **Documentation**: Update the provider documentation to include your new provider in the examples and implementation guidance
* **Integration Standards**
  * **CLI Consistency**: Follow Prowler's CLI argument patterns
  * **Output Format**: Ensure outputs are compatible with Prowler's reporting system
  * **Compliance Mapping**: Map provider checks to relevant compliance frameworks
  * **Backward Compatibility**: Maintain compatibility when possible
* **AI-Assisted Development**
  * **Use Rules**: Use rules so that AI-generated code follows Prowler's conventions and way of working
***
## OCSF Field Requirements for Prowler Cloud Integration
When implementing a new provider that supports the `--push-to-cloud` feature, specific OCSF fields must be correctly populated to ensure proper findings ingestion into Prowler Cloud.
### Required OCSF Fields
The following fields in the OCSF output are critical for successful ingestion:
| Field | Requirement | Description |
| ------------------ | ---------------------------------------------------------------- | ------------------------------------------------------------------------------------ |
| `provider_uid` | Must match the UID used when registering the provider in the API | This identifier links findings to the correct provider in Prowler Cloud |
| `provider` | Must be the provider name | The name of the provider (e.g., `aws`, `azure`, `gcp`, `googleworkspace`) |
| `finding_info.uid` | Must be unique | Each finding must have a unique identifier to avoid duplicates |
| `resources.uid` | Must have a value | The resource UID cannot be empty; it identifies the specific resource being assessed |
### Implementation Reference
These fields are set in the OCSF output generation. See the [OCSF output implementation](https://github.com/prowler-cloud/prowler/blob/master/prowler/lib/outputs/ocsf/ocsf.py) for reference.
### Validation Checklist
Before releasing a new provider with `--push-to-cloud` support:
* [ ] Verify `provider_uid` matches the UID used in the API to register the provider
* [ ] Confirm `provider` field contains the correct provider name
* [ ] Ensure all `finding_info.uid` values are unique across findings
* [ ] Validate that `resources.uid` is populated for every finding
Use `python scripts/validate_ocsf_output.py output/*.ocsf.json` to automate these checks.
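The four checks in the list above can be sketched in a few lines of Python. The function below is a hypothetical illustration, not code from the Prowler repository. It assumes each finding is a dictionary that exposes `provider` and `provider_uid` at the top level, `finding_info.uid` as a nested key, and `resources` as a list of dictionaries with a `uid` key. The real OCSF JSON may nest these fields differently, so adjust the lookups to the actual output.

```python
def validate_ocsf_findings(
    findings: list[dict], expected_provider: str, expected_provider_uid: str
) -> list[str]:
    """Return human-readable problems found in a list of finding dictionaries."""
    problems = []
    seen_uids = set()
    for index, finding in enumerate(findings):
        # Assumed layout: provider and provider_uid are top-level keys.
        if finding.get("provider") != expected_provider:
            problems.append(f"finding {index}: unexpected provider")
        if finding.get("provider_uid") != expected_provider_uid:
            problems.append(f"finding {index}: provider_uid mismatch")
        uid = (finding.get("finding_info") or {}).get("uid")
        if not uid:
            problems.append(f"finding {index}: missing finding_info.uid")
        elif uid in seen_uids:
            problems.append(f"finding {index}: duplicate finding_info.uid {uid}")
        else:
            seen_uids.add(uid)
        resources = finding.get("resources") or []
        if not resources or any(not resource.get("uid") for resource in resources):
            problems.append(f"finding {index}: empty resources.uid")
    return problems
```

An empty return value means all four checks passed for every finding in the list.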
## Checklist for New Providers
### CLI Integration Only
**Phase 1: Research & Planning**
* [ ] Soft research completed
* [ ] Spike date scheduled
* [ ] Deeper research completed
* [ ] Action plan created
**Phase 2: Implementation**
* [ ] Folder and files created in `prowler/providers/`
* [ ] Provider class implemented and inherits from `Provider`
* [ ] Authentication/session logic implemented
* [ ] Arguments/flags mapped and documented
* [ ] Outputs and metadata standardized
* [ ] Registered in the CLI
* [ ] Minimal usage example provided
**Phase 3: Delivery**
* [ ] PoC delivered
* [ ] MVP delivered
* [ ] Version 1 completed
* [ ] QA and documentation completed
* [ ] GA release ready
### API Integration
* [ ] All CLI integration items completed
* [ ] Provider added to `ProviderChoices` enum in API models
* [ ] API serializers created/updated for the provider
* [ ] API views support the new provider type
* [ ] Provider credentials model supports the new provider
* [ ] API endpoints tested and working
* [ ] Provider-specific validation implemented
* [ ] API tests created and passing
### UI Integration
* TBD
***
## Next Steps
* [How to add a new Service](./services)
* [How to add new Checks](./checks)
* [How to contribute](./introduction#contributing-to-prowler)
# Prowler Studio
Source: https://docs.prowler.com/developer-guide/prowler-studio
**Prowler Studio is an AI workflow that ensures Claude Code follows Prowler's skills, guardrails, and best practices when creating new security checks.** What lands in the resulting pull request is consistent, tested, and ready for human review — not half-correct boilerplate that needs to be rewritten.
**Contributor Tool**: Prowler Studio is a workflow for advanced contributors adding new Prowler security checks. It is not part of Prowler Cloud, Prowler App, or Prowler CLI.
**Preview Feature**: Prowler Studio is under active development and breaking changes are expected. Please report issues or share feedback on [GitHub](https://github.com/prowler-cloud/prowler-studio/issues) or in the [Slack community](https://goto.prowler.com/slack).
Clone the source code, install Prowler Studio, and explore the agent workflow in detail.
## The Problem
Adding a new check to [Prowler](https://github.com/prowler-cloud/prowler) is more than writing detection logic. A correct check has to:
* Match Prowler's exact service and check folder structure and naming conventions
* Wire up metadata, severity, remediation, tests, and compliance mappings
* Mirror the patterns used by the hundreds of existing checks in the same provider
* Actually load when Prowler scans for available checks — silent structural mistakes are easy to make
Asking a general-purpose AI assistant to do this usually means guessing. It misses conventions, skips tests, or invents structure that looks right but does not load. The result is a half-correct PR that needs to be reviewed line by line or rewritten.
## The Solution
Prowler Studio enforces the workflow end-to-end. Describe the check once — a markdown ticket, a Jira issue, or a GitHub issue — and the workflow:
1. **Loads Prowler-specific skills into every agent.** Every step starts with the same context an experienced Prowler engineer would have in mind. See [AI Skills System](/developer-guide/ai-skills) for how skills are structured.
2. **Runs specialized agents in sequence.** Implementation → testing → compliance mapping → review → PR creation. Each agent has one job and a tight scope.
3. **Verifies as it goes.** The check must load in Prowler. Tests must pass. If something fails, the agent fixes it and re-runs (up to a bounded number of attempts) before moving on.
4. **Produces a complete pull request.** Branch, passing check, tests, compliance mappings, and a pull request waiting for human review.
The result is a consistent starting point, every time, on every supported provider.
## Quick Start
### Install
Prowler Studio requires [`uv`](https://docs.astral.sh/uv/getting-started/installation/) — see the official [installation guide](https://docs.astral.sh/uv/getting-started/installation/).
```bash theme={null}
git clone https://github.com/prowler-cloud/prowler-studio
cd prowler-studio
uv sync
source .venv/bin/activate
```
### Describe the Check
A ticket is a structured markdown description of the check to create. It is the only input the workflow needs; every agent (implementation, testing, compliance mapping, review, PR creation) uses it as the source of truth, so the more concrete it is, the closer the first PR will land to the desired outcome.
The ticket can be supplied in three ways:
* **Local markdown file** → `--ticket path/to/ticket.md`
* **Jira issue** → `--jira-url https://...` (uses the issue body)
* **GitHub issue** → `--github-url https://...` (uses the issue body)
The content should follow the **New Check Request** template:
* The local copy at [`check_ticket_template.md`](https://github.com/prowler-cloud/prowler-studio/blob/main/check_ticket_template.md) covers `--ticket` and Jira tickets.
* A prefilled GitHub form is also available: [Create a New Check Request issue](https://github.com/prowler-cloud/prowler/issues/new?template=new-check-request.yml).
Sections marked *Optional* can be skipped; everything else helps the agents make the right decisions.
### Run the Workflow
From a local markdown ticket:
```bash theme={null}
prowler-studio --ticket check_ticket.md
```
From a Jira ticket:
```bash theme={null}
prowler-studio --jira-url https://mycompany.atlassian.net/browse/PROJ-123
```
From a GitHub issue:
```bash theme={null}
prowler-studio --github-url https://github.com/owner/repo/issues/123
```
Provide exactly one of `--ticket`, `--jira-url`, or `--github-url`.
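The "exactly one of" rule is the kind of constraint that `argparse` expresses with a required mutually exclusive group. The sketch below only illustrates that constraint with the option names from this page; it is not Prowler Studio's actual argument parser, and the program name is made up.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a demo parser that accepts exactly one ticket source."""
    parser = argparse.ArgumentParser(prog="ticket-source-demo")
    # required=True rejects zero sources; the group itself rejects two or more.
    source = parser.add_mutually_exclusive_group(required=True)
    source.add_argument("--ticket", "-t", help="Path to a markdown check ticket file")
    source.add_argument("--jira-url", "-j", help="Jira ticket URL")
    source.add_argument("--github-url", "-g", help="GitHub issue URL")
    return parser


args = build_parser().parse_args(["--ticket", "check_ticket.md"])
print(args.ticket)
# prints "check_ticket.md"
```

Passing two sources, or none, makes `parse_args` print a usage error and exit with a non-zero status.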
Keep changes local (no push, no pull request):
```bash theme={null}
prowler-studio -b feat/my-check --ticket check_ticket.md --local
```
### What You Get
After a successful run the working environment contains:
* A new branch on a clean Prowler worktree containing the check, metadata, tests, and compliance mappings
* A pull request opened against Prowler (skipped with `--local`)
* A timestamped log file under `logs/` capturing every step the agents took
## CLI Options
| Option | Short | Description |
| -------------------- | ----- | -------------------------------------------------------------------------- |
| `--branch` | `-b` | Branch name (default: `feat/-` or `feat/`) |
| `--ticket` | `-t` | Path to a markdown check ticket file |
| `--jira-url` | `-j` | Jira ticket URL (e.g., `https://mycompany.atlassian.net/browse/PROJ-123`) |
| `--github-url` | `-g` | GitHub issue URL (e.g., `https://github.com/owner/repo/issues/123`) |
| `--working-dir` | `-w` | Working directory for the Prowler clone (default: `./working`) |
| `--no-worktree` | | Legacy mode — work directly on the main clone instead of using worktrees |
| `--cleanup-worktree` | | Remove the worktree after a successful pull request is created |
| `--local` | | Keep changes local — skip push and pull request creation |
## Configuration
Set these environment variables depending on the input source:
| Variable | When Needed | Purpose |
| ---------------- | ---------------------------- | ---------------------------------------------------------- |
| `GITHUB_TOKEN` | `--github-url` (recommended) | Higher GitHub API rate limits and access to private issues |
| `JIRA_SITE_URL` | `--jira-url` | Jira site, e.g. `https://mycompany.atlassian.net` |
| `JIRA_EMAIL` | `--jira-url` | Email of the Jira account used to fetch the ticket |
| `JIRA_API_TOKEN` | `--jira-url` | API token for the Jira account |
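A quick pre-flight check can report which variables are still missing before starting a run. The helper below is a hypothetical sketch, not part of Prowler Studio. It treats the three Jira variables as required for `--jira-url` and leaves `GITHUB_TOKEN` out because the table marks it as recommended only.

```python
import os
from collections.abc import Mapping

# Variables treated as required per input source (an assumption based on
# the table above; GITHUB_TOKEN is only recommended, so it is not listed).
REQUIRED_ENV = {
    "ticket": (),
    "github": (),
    "jira": ("JIRA_SITE_URL", "JIRA_EMAIL", "JIRA_API_TOKEN"),
}


def missing_env_vars(source: str, env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the required variables for `source` that are unset or empty."""
    return [name for name in REQUIRED_ENV[source] if not env.get(name)]


print(missing_env_vars("jira", {"JIRA_SITE_URL": "https://mycompany.atlassian.net"}))
# prints ['JIRA_EMAIL', 'JIRA_API_TOKEN']
```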
# Renaming Checks in Prowler
Source: https://docs.prowler.com/developer-guide/renaming-checks
Follow these steps to rename a check in Prowler, for example to align it with the Check ID structure, fix a typo, or reflect updated check logic that requires a new name.
When changing a Check ID, update the following files:
## Update Check Folder Structure
First, rename the check folder with the new check name.
**Path:** `prowler/providers//services//`
**Example:**
```
# Before
prowler/providers/aws/services/inspector2/inspector2_findings_exist/
# After
prowler/providers/aws/services/inspector2/inspector2_active_findings_exist/
```
Next, rename the file that contains the check logic. Inside that file, also rename the class name to match the new check name.
**Path:** `prowler/providers//services///.py`
**Example:**
```python theme={null}
# Before
class inspector2_findings_exist(Check):
    def execute(self):
        findings = []
        # ... check logic ...


# After
class inspector2_active_findings_exist(Check):
    def execute(self):
        findings = []
        # ... check logic ...
```
Then, rename the file that contains the check metadata. Inside that file, add the old check name as an alias in the `CheckAliases` field and modify the `CheckID` to the new check name.
**Path:** `prowler/providers//services///.metadata.json`
**Example:**
```json theme={null}
{
"Provider": "aws",
"CheckID": "inspector2_active_findings_exist",
"CheckTitle": "Check if Inspector2 active findings exist",
"CheckAliases": [
"inspector2_findings_exist"
],
"CheckType": [],
"ServiceName": "inspector2",
"SubServiceName": "",
"ResourceIdTemplate": "arn:aws:inspector2:region:account-id/detector-id",
"Severity": "medium",
"ResourceType": "Other",
"Description": "This check determines if there are any active findings in your AWS account that have been detected by AWS Inspector2.",
"Risk": "Without using AWS Inspector, you may not be aware of all the security vulnerabilities in your AWS resources.",
"RelatedUrl": "https://docs.aws.amazon.com/inspector/latest/user/findings-understanding.html",
"Remediation": {
"Code": {
"CLI": "",
"NativeIaC": "",
"Other": "https://www.trendmicro.com/cloudoneconformity/knowledge-base/aws/Inspector/amazon-inspector-findings.html",
"Terraform": ""
},
"Recommendation": {
"Text": "Review the active findings from Inspector2",
"Url": "https://docs.aws.amazon.com/inspector/latest/user/what-is-inspector.html"
}
},
"Categories": [],
"DependsOn": [],
"RelatedTo": [],
"Notes": ""
}
```
## Update Test Files
Second, rename the tests folder with the new check name.
**Path:** `tests/providers//services//`
**Example:**
```
# Before
tests/providers/aws/services/inspector2/inspector2_findings_exist/
# After
tests/providers/aws/services/inspector2/inspector2_active_findings_exist/
```
Next, rename the test file that contains all the unit tests. Inside that file, rename all appearances of the old check name to the new check name.
**Path:** `tests/providers//services///_test.py`
**Example:**
```python theme={null}
# Before
from prowler.providers.aws.services.inspector2.inspector2_findings_exist.inspector2_findings_exist import (
    inspector2_findings_exist,
)


class Test_inspector2_findings_exist:
    def test_inspector2_no_findings(self):
        ...  # test logic

    def test_inspector2_with_findings(self):
        ...  # test logic


# After
from prowler.providers.aws.services.inspector2.inspector2_active_findings_exist.inspector2_active_findings_exist import (
    inspector2_active_findings_exist,
)


class Test_inspector2_active_findings_exist:
    def test_inspector2_no_findings(self):
        ...  # test logic

    def test_inspector2_with_findings(self):
        ...  # test logic
```
**Important:** Update all references to the old check name in the test file, including:
* Import statements at the top of the file
* Class name in the test class
* Any function calls to the check
* Any string references to the check name
* Mock patches that reference the check
**Complete example of all changes needed in test files:**
```python theme={null}
# Before
from unittest import mock

from prowler.providers.aws.services.inspector2.inspector2_findings_exist.inspector2_findings_exist import (
    inspector2_findings_exist,
)


class Test_inspector2_findings_exist:
    def test_inspector2_no_findings(self):
        # Mock setup: inspector2_client is the mocked service client prepared by the test
        with mock.patch(
            "prowler.providers.aws.services.inspector2.inspector2_findings_exist.inspector2_findings_exist.inspector2_client",
            inspector2_client,
        ):
            check = inspector2_findings_exist()
            result = check.execute()
            assert len(result) == 1
            assert result[0].status == "PASS"
            assert "No active findings found" in result[0].status_extended


# After
from unittest import mock

from prowler.providers.aws.services.inspector2.inspector2_active_findings_exist.inspector2_active_findings_exist import (
    inspector2_active_findings_exist,
)


class Test_inspector2_active_findings_exist:
    def test_inspector2_no_findings(self):
        # Mock setup: inspector2_client is the mocked service client prepared by the test
        with mock.patch(
            "prowler.providers.aws.services.inspector2.inspector2_active_findings_exist.inspector2_active_findings_exist.inspector2_client",
            inspector2_client,
        ):
            check = inspector2_active_findings_exist()
            result = check.execute()
            assert len(result) == 1
            assert result[0].status == "PASS"
            assert "No active findings found" in result[0].status_extended
```
## Update Compliance Mappings
Finally, rename all appearances of the old check name to the new check name inside any compliance framework where the check is mapped.
* `prowler/compliance//.json`
**Example:**
```json theme={null}
{
"Framework": "CIS",
"Version": "2.0",
"Provider": "AWS",
"Description": "The CIS Amazon Web Services Foundations Benchmark provides prescriptive guidance for configuring security options for a subset of Amazon Web Services.",
"Requirements": [
{
"Id": "4.1",
"Description": "Ensure a log metric filter and alarm exist for unauthorized API calls",
"Checks": [
"inspector2_active_findings_exist"
],
"Attributes": [
{
"Section": "4 Logging and Monitoring",
"Profile": "Level 1",
"AssessmentStatus": "Automated",
"Description": "Real-time monitoring of API calls can be achieved by directing CloudTrail Logs to CloudWatch Logs and establishing corresponding metric filters and alarms."
}
]
}
]
}
```
The development compliance file may contain examples of the check being renamed. If so, modify this file as well:
* `api/src/backend/api/fixtures/dev/7_dev_compliance.json`
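After the rename, it can help to search the repository for leftover mentions of the old name. The helper below is a hypothetical sketch, not a Prowler utility; it scans `.py` and `.json` files under a root folder and returns the relative path and line number of every line that still contains the given name.

```python
from pathlib import Path


def find_references(
    root: Path, name: str, suffixes: tuple[str, ...] = (".py", ".json")
) -> list[tuple[str, int]]:
    """Return (relative path, line number) for each line that contains name."""
    hits = []
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            text = path.read_text(encoding="utf-8", errors="ignore")
            for number, line in enumerate(text.splitlines(), start=1):
                if name in line:
                    hits.append((path.relative_to(root).as_posix(), number))
    return hits
```

For the example in this guide, `find_references(Path("."), "inspector2_findings_exist")` should report only the `CheckAliases` entry in the metadata file once the rename is complete; any other hit points at a file that still needs updating.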
# Secret-Scanning Checks
Source: https://docs.prowler.com/developer-guide/secret-scanning-checks
Prowler scans audited resources for plaintext secrets using [Kingfisher](https://github.com/mongodb/kingfisher), an open-source secret-scanning engine that Prowler invokes as a subprocess. This guide explains the structure every secret-scanning check must follow to keep scanning correct and efficient on large accounts.
Since Prowler 5.32.0 the secret-scanning checks scan with Kingfisher. Earlier versions used the `detect-secrets` library.
## Overview
Secret detection runs through a single helper in `prowler/lib/utils/utils.py`:
* **`detect_secrets_scan_batch(payloads, excluded_secrets=..., validate=...)`** scans many payloads in chunked subprocess invocations and returns a `{key: [findings]}` dictionary. To scan a single payload, pass a one-entry mapping (for example, `{0: data}`).
Every Kingfisher invocation carries a fixed process-startup cost (around 100 ms). Scanning once per resource would spawn thousands of subprocesses on large accounts (for example, thousands of CloudWatch log groups). `detect_secrets_scan_batch` amortizes that cost: it writes each payload to a temporary file as it consumes the input, runs one subprocess per chunk (500 payloads by default), and maps the findings back to each payload by key.
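To make the saving concrete, the sketch below compares the startup overhead of per-resource scanning with batched scanning. It uses the figures quoted above (about 100 ms per invocation, 500 payloads per chunk). The 5,000-resource account and the helper `startup_overhead_ms` are illustrative assumptions, and actual scan time is not modelled.

```python
import math

STARTUP_COST_MS = 100     # approximate per-invocation cost quoted above
DEFAULT_CHUNK_SIZE = 500  # default payloads per subprocess


def startup_overhead_ms(payload_count: int, payloads_per_process: int) -> int:
    """Total process-startup overhead in milliseconds."""
    invocations = math.ceil(payload_count / payloads_per_process)
    return invocations * STARTUP_COST_MS


per_resource_ms = startup_overhead_ms(5000, 1)               # 5000 invocations
batched_ms = startup_overhead_ms(5000, DEFAULT_CHUNK_SIZE)   # 10 invocations
```

Under these assumptions the startup overhead drops from roughly 500 seconds to about 1 second.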
## The Batched Structure
Every secret-scanning check follows three phases.
### Phase 1: Collect
Define a generator that yields `(key, payload)` for each scannable unit. The generator builds payload strings only — it does not call Kingfisher. Lazy yielding keeps memory and temporary-disk usage bounded to a single chunk, which matters when an account holds thousands of resources.
### Phase 2: Batch
Call `detect_secrets_scan_batch` once with the generator. The helper consumes it in chunks, runs Kingfisher per chunk, and returns the keys that produced findings mapped to their finding lists.
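The lazy, chunked consumption described above can be sketched in plain Python. This is not the helper's actual implementation: `iter_chunks` is a hypothetical stand-in that only shows how a generator can be drained one chunk at a time, so payload strings are built just before they are needed.

```python
from itertools import islice
from typing import Iterable, Iterator, List, Tuple

Payload = Tuple[object, str]


def iter_chunks(payloads: Iterable[Payload], chunk_size: int) -> Iterator[List[Payload]]:
    """Yield lists of at most chunk_size items, pulling from the source lazily."""
    iterator = iter(payloads)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk


built = []  # records when each payload string is actually built


def payloads() -> Iterator[Payload]:
    for index in range(5):
        built.append(index)
        yield index, f"payload-{index}"


chunks = iter_chunks(payloads(), chunk_size=2)
first = next(chunks)  # only the first two payloads have been built so far
```

Because the generator is advanced one chunk at a time, at most `chunk_size` payload strings exist in memory at once.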
### Phase 3: Report
Iterate the resources, look up the findings by key, and build one report per resource. Emit a finding for **every** iterated resource — never drop one silently. When a resource's payload cannot be prepared for scanning (for example, user data that fails to base64-decode or decompress), report it as `MANUAL` with a status explaining the scan could not inspect it, rather than omitting it or claiming `PASS`.
```python theme={null}
from prowler.lib.check.models import Check, Check_Report_AWS
from prowler.lib.utils.utils import (
    annotate_verified_secrets,
    detect_secrets_scan_batch,
)
from prowler.providers.aws.services.example.example_client import example_client


class example_resource_no_secrets(Check):
    def execute(self):
        findings = []
        excluded = example_client.audit_config.get("secrets_ignore_patterns", [])
        validate = example_client.audit_config.get("secrets_validate", False)
        resources = list(example_client.resources)

        # Phase 1: collect. Builds strings only, no scan.
        # `serialize` stands for the check-specific code that turns a
        # resource into the string to scan.
        def payloads():
            for index, resource in enumerate(resources):
                if resource.scannable_data:
                    yield index, serialize(resource)

        # Phase 2: batch. One call, chunked subprocesses.
        batch_results = detect_secrets_scan_batch(
            payloads(), excluded_secrets=excluded, validate=validate
        )

        # Phase 3: report. Look up findings by key.
        for index, resource in enumerate(resources):
            report = Check_Report_AWS(metadata=self.metadata(), resource=resource)
            report.status = "PASS"
            report.status_extended = f"No secrets found in {resource.name}."
            detect_secrets_output = batch_results.get(index)
            if detect_secrets_output:
                report.status = "FAIL"
                report.status_extended = (
                    f"Potential secret found in {resource.name} -> ..."
                )
                annotate_verified_secrets(report, detect_secrets_output)
            findings.append(report)

        return findings
```
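The `MANUAL` branch mentioned in Phase 3 is not shown in the example above. The following sketch illustrates one way to detect a payload that cannot be prepared, assuming user data that is base64-encoded and optionally gzip-compressed. `prepare_user_data` and `initial_status` are hypothetical helpers, not Prowler functions.

```python
import base64
import binascii
import gzip
import zlib
from typing import Optional


def prepare_user_data(encoded: str) -> Optional[str]:
    """Return decoded text, or None when the payload cannot be prepared."""
    try:
        raw = base64.b64decode(encoded, validate=True)
        if raw[:2] == b"\x1f\x8b":  # gzip magic number
            raw = gzip.decompress(raw)
        return raw.decode("utf-8")
    except (binascii.Error, OSError, EOFError, zlib.error, UnicodeDecodeError):
        return None


def initial_status(encoded: str) -> str:
    """MANUAL when the data could not be inspected, PASS until a finding says otherwise."""
    return "MANUAL" if prepare_user_data(encoded) is None else "PASS"


readable = base64.b64encode(b"#!/bin/bash\necho hello").decode("ascii")
compressed = base64.b64encode(gzip.compress(b"echo hello")).decode("ascii")
```

In a real check, Phase 1 would yield only the payloads that decode successfully, and Phase 3 would use the `MANUAL` status for the rest instead of dropping them.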
## Choosing the Key
The key maps each finding back to its source. Two shapes cover every check:
* **One payload per resource:** use the resource index. This fits checks that serialize a single payload per resource, such as launch configurations, CloudFormation outputs, SSM documents, Step Functions definitions, and OpenStack metadata.
* **Several payloads per resource:** use a `(resource_index, fragment)` tuple, where the fragment identifies the variable, log stream, container, file, or version. Phase 3 groups the per-fragment findings to build the resource report. This fits CloudWatch log streams, ECS containers, CodeBuild variables, Glue arguments, and Lambda code files.
Derive the indices from the same `list(...)` of resources in both Phase 1 and Phase 3 so the order stays stable and the keys align.
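For the tuple-key shape, Phase 3 has to regroup the per-fragment findings under each resource. The sketch below shows one way to do that. The `batch_results` dictionary is a hand-written stand-in for the value returned by the batch scan, the fragment names are invented, and `group_by_resource` is a hypothetical helper.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Stand-in for the batch result: only keys that produced findings are present.
batch_results: Dict[Tuple[int, str], List[dict]] = {
    (0, "DB_PASSWORD"): [{"type": "...", "line_number": 1}],
    (0, "API_TOKEN"): [{"type": "...", "line_number": 1}],
    (2, "WEBHOOK_URL"): [{"type": "...", "line_number": 1}],
}


def group_by_resource(results: Dict[Tuple[int, str], List[dict]]) -> Dict[int, Dict[str, List[dict]]]:
    """Collect the fragments with findings under each resource index."""
    grouped: Dict[int, Dict[str, List[dict]]] = defaultdict(dict)
    for (resource_index, fragment), findings in results.items():
        grouped[resource_index][fragment] = findings
    return dict(grouped)


grouped = group_by_resource(batch_results)
failing_fragments = sorted(grouped[0])  # fragments to name in the report for resource 0
```

Resources whose index is absent from `grouped` (index 1 here) had no findings and keep their `PASS` report.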
## Preserving Per-Payload Results
`detect_secrets_scan_batch` runs Kingfisher with `--no-dedup`, so a secret that appears in more than one payload is reported for each one. This reproduces the result of scanning each payload individually. Build payload strings exactly as a single scan would: serialize the same data and keep line ordering, because the check's status messages often map a finding's `line_number` back to a variable name or metadata key.
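The reason line ordering matters can be seen in a small sketch. It assumes one variable per line and a 1-based `line_number`, which is an assumption about the finding format. `build_payload` and `variable_for` are hypothetical helpers.

```python
from typing import Dict, List, Tuple


def build_payload(variables: Dict[str, str]) -> Tuple[str, List[str]]:
    """Serialize one variable per line and remember which name owns each line."""
    names = list(variables)
    payload = "\n".join(f"{name}={variables[name]}" for name in names)
    return payload, names


def variable_for(finding: dict, names: List[str]) -> str:
    """Translate a 1-based line_number back to the variable name."""
    return names[finding["line_number"] - 1]


variables = {"REGION": "eu-west-1", "API_TOKEN": "example-value"}
payload, names = build_payload(variables)
culprit = variable_for({"type": "...", "line_number": 2}, names)
```

If the payload were serialized in a different order, or a value spanned several lines, the same `line_number` would point at the wrong variable.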
## Validation and Severity
`detect_secrets_scan_batch` accepts `validate`, read from `secrets_validate` in the provider configuration or the `--scan-secrets-validate` flag. When enabled, Kingfisher confirms whether each secret is live, and confirmed secrets carry `is_verified: True`.
After marking a report as `FAIL`, pass the findings to `annotate_verified_secrets(report, findings)`. When any secret is verified, the helper escalates the finding to critical severity and appends a note that the secret was confirmed live. Validation stays off by default because it sends the discovered secret to the provider API.
## Excluded Secrets
`detect_secrets_scan_batch` applies `secrets_ignore_patterns` — regular expressions from the provider configuration — against each finding's source line and drops the matches, mirroring single-scan behavior.
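The exclusion step can be pictured as a regular-expression filter over source lines. The sketch below is a simplified stand-in and not the helper's implementation: it assumes 1-based line numbers and uses `re.search`, and `drop_excluded` is a hypothetical name.

```python
import re
from typing import Iterable, List


def drop_excluded(findings: List[dict], payload: str, patterns: Iterable[str]) -> List[dict]:
    """Keep only findings whose source line matches none of the patterns."""
    compiled = [re.compile(pattern) for pattern in patterns]
    lines = payload.splitlines()
    kept = []
    for finding in findings:
        source_line = lines[finding["line_number"] - 1]
        if not any(regex.search(source_line) for regex in compiled):
            kept.append(finding)
    return kept


payload = "PASSWORD=changeme-example\nTOKEN=value"
findings = [{"line_number": 1}, {"line_number": 2}]
kept = drop_excluded(findings, payload, [r"changeme-example"])
```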
## Testing
To assert on the verified-secret path, mock `detect_secrets_scan_batch` in the check module and return the keyed dictionary. For a single resource scanned at index `0`:
```python theme={null}
from unittest import mock

mock.patch(
    "prowler.providers.aws.services.example.example_resource_no_secrets.example_resource_no_secrets.detect_secrets_scan_batch",
    return_value={
        0: [{"type": "...", "line_number": 1, "is_verified": True}]
    },
)
```
Most tests need no mock at all: they seed resources that contain example secrets and assert on the `FAIL` status and message, which exercises the real batched path. Refer to the [Testing](/developer-guide/unit-testing) documentation for the general structure.
# Creating a New Security Compliance Framework in Prowler
Source: https://docs.prowler.com/developer-guide/security-compliance-framework
This guide explains how to add a new security compliance framework to Prowler, end to end. It covers directory layout, the two supported JSON schemas (universal and legacy), the Pydantic models that validate each framework, check mapping conventions, output formatting, local validation, testing, and the pull request process.
## Introduction
A compliance framework in Prowler maps a public or custom control catalog (for example CIS, NIST 800-53, PCI DSS, HIPAA, ENS, CCC, DORA) to the security checks that Prowler already runs. Each requirement links to zero, one or more Prowler checks. When a scan executes, findings are aggregated per requirement to produce the compliance report rendered by Prowler CLI and Prowler Cloud.
Prowler ships 85+ compliance frameworks across all providers. The catalog lives under `prowler/compliance//` (legacy, per-provider) or `prowler/compliance/` (universal, multi-provider).
A compliance framework must represent the **complete state** of the source catalog. Every requirement defined by the framework has to be present in the JSON file, even when no Prowler check can automate it. In that case, leave the requirement's check list empty, but do not omit the requirement.
Requirement coverage feeds the compliance percentage calculations and the metadata surfaces (dashboards, widgets, exports). Missing requirements skew those metrics and break the report as a faithful snapshot of the framework.
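A short calculation shows how omitted requirements distort the result. It assumes the percentage is simply passed requirements divided by total requirements, which may differ from the exact formula Prowler uses. The figures and `compliance_percent` are illustrative.

```python
def compliance_percent(passed: int, total: int) -> float:
    """Share of requirements that pass, as a percentage."""
    return round(100 * passed / total, 1)


# A catalog with 10 requirements, 6 passing, 2 of them not automatable.
complete = compliance_percent(passed=6, total=10)   # every requirement listed
truncated = compliance_percent(passed=6, total=8)   # the 2 manual ones omitted
```

Dropping the two non-automatable requirements inflates the score from 60% to 75% without any change in the account's actual posture.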
### Two supported schemas
| Schema | When to use | File location | Discovered as |
| ---------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------- | ----------------------------------------------------------------------------------- |
| **Universal (recommended for new frameworks)** | Multi-provider frameworks, or single-provider frameworks that benefit from declarative table/PDF rendering | `prowler/compliance/.json` (top-level) | Available for **every** provider whose key appears in any `requirement.checks` dict |
| **Legacy provider-specific** | Single-provider frameworks with framework-specific attribute classes already declared in the codebase (CIS, ENS, ISO 27001, etc.) | `prowler/compliance//__.json` | Available only under that provider |
Auto-discovery happens in `get_bulk_compliance_frameworks_universal(provider)` (`prowler/lib/check/compliance_models.py`), which scans **both** the top-level `prowler/compliance/` directory and every per-provider sub-directory. Legacy frameworks are transparently converted to the universal `ComplianceFramework` model via `adapt_legacy_to_universal()` before being returned, so the rest of Prowler — CLI table rendering, CSV/OCSF outputs, PDF generation — works the same regardless of the source schema.
> The legacy entry-point `Compliance.get_bulk(provider)` (used by older code paths) only scans per-provider sub-directories. Universal top-level files are picked up exclusively via the universal loader; this matters if you are wiring a new code path against the legacy API.
For **new** frameworks, prefer the universal schema: it requires no Python code changes, supports multiple providers in a single file, and table/PDF rendering is driven entirely from declarative configuration inside the JSON.
> All Pydantic models in `compliance_models.py` are imported from `pydantic.v1`. Subclasses you add for the legacy schema must use `from pydantic.v1 import BaseModel`.
### Prerequisites
Before adding a new framework, complete the following checks:
* **Verify the framework is not already supported.** Inspect `prowler/compliance/` and every `prowler/compliance//` for an existing JSON file matching the name and version.
* **Confirm the required checks exist.** Every requirement that can be automated must point to one or more existing Prowler checks. For each missing check, implement it first by following the [Prowler Checks](/developer-guide/checks) guide.
* **Review a reference framework.** Use an existing framework with a similar structure as your template:
* Universal: `prowler/compliance/dora_2022_2554.json`, `prowler/compliance/csa_ccm_4.0.json`.
* Legacy: `prowler/compliance/aws/cis_2.0_aws.json` (canonical CIS shape), `prowler/compliance/aws/ccc_aws.json`, `prowler/compliance/aws/ens_rd2022_aws.json`, `prowler/compliance/aws/nist_800_53_revision_5_aws.json`.
## Universal Compliance Framework
### Where the file lives
Place the file at the top level of the compliance directory:
```
prowler/compliance/.json
```
Examples in the repository: `prowler/compliance/csa_ccm_4.0.json`, `prowler/compliance/dora_2022_2554.json`.
The file is auto-discovered — there is **no** need to register it in any `__init__.py`, modify `prowler/lib/outputs/`, or update any other Python module. The framework key Prowler CLI accepts via `--compliance` is the basename of the JSON file without `.json` (`dora_2022_2554.json` → `dora_2022_2554`).
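The key derivation is a plain basename operation. As a minimal illustration (the helper `framework_key` is hypothetical), `pathlib.Path.stem` removes only the final suffix, so version dots inside the name survive:

```python
from pathlib import Path


def framework_key(path: str) -> str:
    """Strip the directory and the final .json suffix."""
    return Path(path).stem


dora_key = framework_key("prowler/compliance/dora_2022_2554.json")
ccm_key = framework_key("prowler/compliance/csa_ccm_4.0.json")
```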
### Top-level structure
```json theme={null}
{
  "framework": "",
  "name": "",
  "version": "",
  "description": "",
  "icon": "",
  "attributes_metadata": [ /* see below */ ],
  "outputs": { /* see below, optional */ },
  "requirements": [ /* see below */ ]
}
```
A `provider` field at the top level is **optional**. The framework's effective provider list is derived by `ComplianceFramework.get_providers()` (`compliance_models.py`) from the union of all keys appearing in `requirement.checks` across all requirements; the explicit `provider` field is used **only as a fallback** when no requirement carries any `checks` key. This is what enables a single file (e.g. `dora_2022_2554.json`) to cover AWS today and add Azure / GCP / etc. tomorrow without restructuring.
Provider keys inside `requirement.checks` must match the directory names under `prowler/providers/`. The valid keys at present are: `aws`, `azure`, `gcp`, `m365`, `kubernetes`, `iac`, `github`, `googleworkspace`, `alibabacloud`, `cloudflare`, `mongodbatlas`, `nhn`, `openstack`, `oraclecloud`, `llm`. Comparison in `supports_provider()` is case-insensitive, but lowercase is the convention used everywhere in the repository.
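The provider derivation described above can be sketched over a raw framework dictionary. This is a simplified stand-in for `ComplianceFramework.get_providers()`, not its actual code. `effective_providers` is a hypothetical name, and lowercasing the fallback value is an assumption.

```python
from typing import List


def effective_providers(framework: dict) -> List[str]:
    """Union of provider keys across requirement checks, else the fallback field."""
    providers = set()
    for requirement in framework.get("requirements", []):
        providers.update(key.lower() for key in requirement.get("checks", {}))
    if not providers and framework.get("provider"):
        providers.add(framework["provider"].lower())
    return sorted(providers)


multi = {
    "requirements": [
        {"id": "R1", "checks": {"aws": ["iam_root_mfa_enabled"], "azure": []}},
        {"id": "R2", "checks": {"gcp": []}},
    ]
}
fallback_only = {"provider": "AWS", "requirements": [{"id": "R1"}]}
```

Note that a provider key with an empty check list still counts, because the derivation looks at keys rather than at the checks behind them.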
### `attributes_metadata`
Declares the shape of the per-requirement `attributes` dict. When this field is present, the root validator `validate_attributes_against_metadata` (`compliance_models.py`) enforces the schema at load time and rejects:
* Missing keys marked `required: true`.
* Keys present in `attributes` but not declared in `attributes_metadata` (typo / drift guard).
* Values that violate a declared `enum`.
* Values whose Python type does not match a declared `int`, `float` or `bool`.
The runtime type check **only** covers `int`, `float` and `bool`. For `str`, `list_str` and `list_dict` the type is documentation-only — non-conforming values won't fail validation. If `attributes_metadata` is omitted, **no per-requirement validation runs at all**.
```json theme={null}
"attributes_metadata": [
  {
    "key": "Pillar",
    "label": "Pillar",
    "type": "str",
    "required": true,
    "enum": [
      "ICT Risk Management",
      "ICT-Related Incident Reporting",
      "Digital Operational Resilience Testing",
      "ICT Third-Party Risk Management",
      "Information Sharing"
    ],
    "output_formats": { "csv": true, "ocsf": true }
  },
  {
    "key": "Article",
    "label": "Article",
    "type": "str",
    "required": true,
    "output_formats": { "csv": true, "ocsf": true }
  }
]
```
Per attribute:
* `key` (required): attribute name as it will appear in `requirement.attributes`.
* `label`: human-readable label used in CSV headers and PDF.
* `type`: one of `str`, `int`, `float`, `bool`, `list_str`, `list_dict`. Defaults to `str`.
* `enum`: optional list of allowed values; non-conforming values are rejected at load time.
* `required`: if `true`, every requirement must include this key with a non-null value.
* `enum_display` / `enum_order`: optional per-enum-value visual metadata (label, abbreviation, color, icon) and explicit ordering for PDF rendering.
* `output_formats`: `{ "csv": , "ocsf": }` — toggles inclusion in each output format. Both default to `true`.
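The four rejection rules listed earlier in this section can be sketched as a standalone function. This is a simplified illustration, not the `validate_attributes_against_metadata` validator itself. `validate_attributes` and its error strings are hypothetical, and it relies on plain `isinstance`, so a `bool` value would pass an `int` check because `bool` subclasses `int` in Python.

```python
from typing import Any, Dict, List

# Only these declared types are checked at runtime; the rest are documentation-only.
RUNTIME_CHECKED = {"int": int, "float": float, "bool": bool}


def validate_attributes(attributes: Dict[str, Any], metadata: List[dict]) -> List[str]:
    """Return human-readable violations; an empty list means the attributes pass."""
    declared = {entry["key"]: entry for entry in metadata}
    errors = []
    for key, entry in declared.items():
        if entry.get("required") and attributes.get(key) is None:
            errors.append(f"missing required attribute: {key}")
    for key, value in attributes.items():
        entry = declared.get(key)
        if entry is None:
            errors.append(f"undeclared attribute: {key}")
            continue
        if value is None:
            continue
        if "enum" in entry and value not in entry["enum"]:
            errors.append(f"invalid enum value for {key}: {value!r}")
        expected = RUNTIME_CHECKED.get(entry.get("type", "str"))
        if expected is not None and not isinstance(value, expected):
            errors.append(f"wrong type for {key}: expected {entry['type']}")
    return errors


metadata = [
    {"key": "Pillar", "type": "str", "required": True,
     "enum": ["ICT Risk Management", "Information Sharing"]},
    {"key": "Article", "type": "str"},
    {"key": "Weight", "type": "int"},
]
ok = validate_attributes(
    {"Pillar": "ICT Risk Management", "Article": 5, "Weight": 3}, metadata
)
bad = validate_attributes(
    {"Pillar": "Governance", "Weight": "3", "Owner": "team-a"}, metadata
)
```

The `ok` case passes even though `Article` holds an integer, which mirrors the documentation-only status of the `str` type.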
### `outputs`
Optional. Controls how the framework is rendered in the console table and in the generated PDF report. Skipping it falls back to sensible defaults.
```json theme={null}
"outputs": {
  "table_config": {
    "group_by": "Pillar"
  },
  "pdf_config": {
    "language": "en",
    "primary_color": "#003399",
    "secondary_color": "#0055A5",
    "bg_color": "#F0F4FA",
    "group_by_field": "Pillar",
    "sections": [ "ICT Risk Management", "ICT-Related Incident Reporting", "..." ],
    "section_short_names": { "ICT Risk Management": "ICT Risk Mgmt" },
    "charts": [
      {
        "id": "pillar_compliance",
        "type": "horizontal_bar",
        "group_by": "Pillar",
        "title": "Compliance Score by Pillar",
        "y_label": "Pillar",
        "x_label": "Compliance %",
        "value_source": "compliance_percent",
        "color_mode": "by_value"
      }
    ],
    "filter": { "only_failed": true, "include_manual": false }
  }
}
```
`table_config.group_by` must reference an attribute key declared in `attributes_metadata`. The same applies to `pdf_config.group_by_field` and to every `charts[].group_by`.
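A quick pre-flight check for this rule can be run against the raw JSON before opening a pull request. The sketch below is a hypothetical helper (`undeclared_group_keys`) and is not part of Prowler. It only inspects the three reference points named above.

```python
from typing import List


def undeclared_group_keys(framework: dict) -> List[str]:
    """List every group_by reference that is missing from attributes_metadata."""
    declared = {entry["key"] for entry in framework.get("attributes_metadata", [])}
    outputs = framework.get("outputs", {})
    table = outputs.get("table_config", {})
    pdf = outputs.get("pdf_config", {})

    references = [
        ("table_config.group_by", table.get("group_by")),
        ("pdf_config.group_by_field", pdf.get("group_by_field")),
    ]
    for position, chart in enumerate(pdf.get("charts", [])):
        references.append((f"pdf_config.charts[{position}].group_by", chart.get("group_by")))

    return [
        f"{location} -> {key}"
        for location, key in references
        if key is not None and key not in declared
    ]


framework = {
    "attributes_metadata": [{"key": "Pillar"}],
    "outputs": {
        "table_config": {"group_by": "Pillar"},
        "pdf_config": {
            "group_by_field": "Pillar",
            "charts": [{"id": "pillar_compliance", "group_by": "Piller"}],  # typo
        },
    },
}
problems = undeclared_group_keys(framework)
```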
For frameworks with weighted scoring (e.g. ThreatScore), declare `pdf_config.scoring` with `risk_field` / `weight_field` / `risk_boost_factor`. For column splitting (e.g. CIS Level 1 vs Level 2), use `table_config.split_by`.
### `requirements`
```json theme={null}
"requirements": [
  {
    "id": "DORA-Art5",
    "name": "Governance and organisation",
    "description": "Financial entities shall have a sound, comprehensive and well-documented ICT internal governance and control framework. ...",
    "attributes": {
      "Pillar": "ICT Risk Management",
      "Article": "Article 5",
      "ArticleTitle": "Governance and organisation"
    },
    "checks": {
      "aws": [
        "iam_avoid_root_usage",
        "iam_no_root_access_key",
        "iam_root_mfa_enabled"
      ],
      "azure": [],
      "gcp": []
    }
  }
]
```
Per requirement:
* `id` (required): unique identifier within the framework.
* `description` (required): the requirement text as authored by the framework.
* `name`: short title shown alongside the id.
* `attributes`: flat dict; keys must conform to `attributes_metadata`.
* `checks`: dict keyed by provider name (the same lowercase keys listed in the previous section). Each value is a list of Prowler check names that evidence this requirement for that provider. The list **may be empty** and the dict itself defaults to `{}` if omitted; either way the requirement is still loaded and listed by `--list-compliance-requirements`, it just has zero checks to execute. Note: there is **no automatic check-existence validation** at load time — referencing a non-existent check name will silently produce a requirement with no findings. Validate this yourself (see "Validating Your Framework" below).
* `config_requirements`: optional list of configuration guardrails. Each entry asserts that a configurable check referenced by this requirement ran with a configuration strict enough to actually satisfy the requirement; otherwise the requirement is forced to FAIL. See [Configuration Guardrails for Requirements](#configuration-guardrails-for-requirements) for the full schema and semantics. In the universal schema the field name is lowercase (`config_requirements`); legacy files use `ConfigRequirements`.
For MITRE-style frameworks, additional optional fields are available on the requirement: `tactics`, `sub_techniques`, `platforms`, `technique_url` (these are populated automatically when adapting a legacy MITRE JSON to the universal model).
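Because check names are not validated at load time, a small script can catch typos before they turn into silent gaps. The sketch below assumes the caller supplies the set of valid check names per provider (for example, collected from the `--list-checks` output). `unknown_checks` is a hypothetical helper, and the misspelled check name is deliberate.

```python
from typing import Dict, List, Set


def unknown_checks(framework: dict, known: Dict[str, Set[str]]) -> List[str]:
    """Report check names that do not exist for the provider they are listed under."""
    problems = []
    for requirement in framework.get("requirements", []):
        for provider, checks in requirement.get("checks", {}).items():
            for check in checks:
                if check not in known.get(provider, set()):
                    problems.append(f"{requirement['id']}: {provider}/{check}")
    return problems


framework = {
    "requirements": [
        {
            "id": "DORA-Art5",
            "checks": {
                "aws": ["iam_root_mfa_enabled", "iam_root_mfa_enabld"],
                "azure": [],
            },
        }
    ]
}
known = {"aws": {"iam_root_mfa_enabled"}}
problems = unknown_checks(framework, known)
```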
### Multi-provider frameworks
A single universal file can cover any number of providers. The framework appears under each provider's `--list-compliance` output as long as **at least one** requirement has that provider key in its `checks` dict.
When extending an existing universal framework with a new provider, the only change required is editing `requirement.checks`:
```diff theme={null}
"checks": {
"aws": ["iam_avoid_root_usage", "iam_no_root_access_key"],
+ "azure": ["entra_policy_ensure_mfa_for_admin_roles"]
}
```
No code changes, no new file, no registration step.
## Legacy Provider-Specific Compliance Framework
The legacy schema is still fully supported and remains the format used by most frameworks shipped today (CIS, NIST, ISO 27001, FedRAMP, PCI DSS, GDPR, HIPAA, ENS, etc.). It binds a framework to a single provider and validates each requirement against a framework-specific Pydantic attribute class.
The legacy schema spans **four layers** — a complete contribution must touch every layer that applies:
* **Layer 1 — Schema validation:** the Pydantic models in `prowler/lib/check/compliance_models.py` define the canonical schema for each attribute shape.
* **Layer 2 — JSON catalog:** the framework JSON file in `prowler/compliance//` lists every requirement and maps it to checks.
* **Layer 3 — Output formatter:** the Python module in `prowler/lib/outputs/compliance//` builds the CSV row model, the per-provider transformer, and the CLI summary table.
* **Layer 4 — Output dispatchers:** the dispatchers in `prowler/lib/outputs/compliance/compliance.py` and `prowler/lib/outputs/compliance/compliance_output.py` route findings to the right formatter based on the framework identifier.
The universal schema collapses Layers 3 and 4 into declarative configuration inside the JSON — that is the main reason it is preferred for new contributions.
### Directory structure and file naming
Compliance frameworks live at:
```
prowler/compliance//__.json
```
The filename conventions are:
* All lowercase, words separated with underscores.
* `` is a supported provider identifier (same lowercase list as the universal section above).
* `` is optional but recommended. Omit only when the framework has no versioning (e.g. `ccc_aws.json`).
* The file basename (without `.json`) is the framework key that Prowler CLI accepts via `--compliance`.
Examples:
* `prowler/compliance/aws/cis_2.0_aws.json`
* `prowler/compliance/aws/nist_800_53_revision_5_aws.json`
* `prowler/compliance/azure/ens_rd2022_azure.json`
* `prowler/compliance/kubernetes/cis_1.10_kubernetes.json`
* `prowler/compliance/aws/ccc_aws.json`
The output formatter directory mirrors the framework name:
```
prowler/lib/outputs/compliance//
├── .py # CLI summary-table dispatcher
├── _.py # Per-provider transformer class
├── models.py # Pydantic CSV row model
└── __init__.py
```
### JSON schema reference
Every legacy compliance file is a JSON document with the following top-level keys. `Framework`, `Name` and `Provider` are validated non-empty by the root validator `framework_and_provider_must_not_be_empty` (`compliance_models.py`).
| Field | Type | Required | Description |
| -------------- | ------ | ----------------- | --------------------------------------------------------------------------------------------------------------------- |
| `Framework` | string | Yes | Canonical framework identifier, for example `CIS`, `NIST-800-53-Revision-5`, `ENS`, `CCC`. |
| `Name` | string | Yes | Human-readable framework name displayed by Prowler App. |
| `Version` | string | Yes (recommended) | Framework version, e.g. `2.0`. See [Version Handling](#version-handling). |
| `Provider` | string | Yes | Upper-cased provider identifier: `AWS`, `AZURE`, `GCP`, `KUBERNETES`, `M365`, `GITHUB`, `GOOGLEWORKSPACE`, and so on. |
| `Description` | string | Yes | Short description of the framework's scope and purpose. |
| `Requirements` | array | Yes | List of [requirement objects](#requirement-object). |
#### Requirement Object
Each entry in `Requirements` describes one control or requirement.
| Field | Type | Required | Description |
| -------------------- | ---------------- | -------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `Id` | string | Yes | Unique identifier within the framework, for example `1.10` or `CCC.Core.CN01.AR01`. |
| `Name` | string | No | Optional human-readable name (frameworks like NIST distinguish control name from description). |
| `Description` | string | Yes | Verbatim description from the source framework. |
| `Attributes` | array | Yes | List of [attribute objects](#attribute-objects). The shape depends on the framework. |
| `Checks` | array of strings | Yes | Prowler check identifiers that automate the requirement. Leave the list empty when the control cannot be automated. |
| `ConfigRequirements` | array of objects | No | Optional [configuration guardrails](#configuration-guardrails-for-requirements). Each entry asserts that a configurable check ran with a configuration strict enough to satisfy the requirement; when it did not, the requirement is forced to FAIL. |
#### Attribute Objects
`Attributes` is parsed against the union declared in `Compliance_Requirement.Attributes` (`compliance_models.py`). Pydantic v1 tries each member of the union in declaration order and falls back to `Generic_Compliance_Requirement_Attribute` (the last entry) when nothing else matches — so a brand-new shape that doesn't match any existing class will silently be accepted as Generic, losing its specific fields.
As of today, the registered attribute classes are: `CIS_Requirement_Attribute`, `ENS_Requirement_Attribute`, `ASDEssentialEight_Requirement_Attribute`, `ISO27001_2013_Requirement_Attribute`, `AWS_Well_Architected_Requirement_Attribute`, `KISA_ISMSP_Requirement_Attribute`, `Prowler_ThreatScore_Requirement_Attribute`, `CCC_Requirement_Attribute`, `C5Germany_Requirement_Attribute`, `CSA_CCM_Requirement_Attribute`, and `Generic_Compliance_Requirement_Attribute` (fallback). MITRE-style frameworks use the separate `Mitre_Requirement` model with `Tactics` / `SubTechniques` / `Platforms` / `TechniqueURL` at the requirement top level. The most common shapes are summarized below.
##### CIS\_Requirement\_Attribute
Used by every CIS benchmark.
| Field | Type | Required | Notes |
| ----------------------- | ------ | -------- | ------------------------------------------------------------------------------------ |
| `Section` | string | Yes | Top-level section, e.g. `1 Identity and Access Management`. |
| `SubSection` | string | No | Optional second-level grouping. |
| `Profile` | enum | Yes | One of `Level 1`, `Level 2`, `E3 Level 1`, `E3 Level 2`, `E5 Level 1`, `E5 Level 2`. |
| `AssessmentStatus` | enum | Yes | `Manual` or `Automated`. |
| `Description` | string | Yes | Control description. |
| `RationaleStatement` | string | Yes | Reason the control exists. |
| `ImpactStatement` | string | Yes | Impact of non-compliance. |
| `RemediationProcedure` | string | Yes | Remediation steps. |
| `AuditProcedure` | string | Yes | Audit steps. |
| `AdditionalInformation` | string | Yes | Free-form notes. |
| `DefaultValue` | string | No | Default configuration value, when relevant. |
| `References` | string | Yes | Colon-separated list of reference URLs. |
##### ENS\_Requirement\_Attribute
Used by the Spanish ENS (Esquema Nacional de Seguridad) frameworks.
| Field | Type | Required | Notes |
| -------------------- | ---------------- | -------- | --------------------------------------------------------------------------------------------- |
| `IdGrupoControl` | string | Yes | Control group identifier. |
| `Marco` | string | Yes | Framework block (`operacional`, `organizativo`, `proteccion`). |
| `Categoria` | string | Yes | Control category. |
| `DescripcionControl` | string | Yes | Control description in Spanish. |
| `Tipo` | enum | Yes | `refuerzo`, `requisito`, `recomendacion`, `medida`. |
| `Nivel` | enum | Yes | `opcional`, `bajo`, `medio`, `alto`. |
| `Dimensiones` | array of enum | Yes | Subset of `confidencialidad`, `integridad`, `trazabilidad`, `autenticidad`, `disponibilidad`. |
| `ModoEjecucion` | string | Yes | Execution mode (`manual`, `automático`, `híbrido`). |
| `Dependencias` | array of strings | Yes | Ids of prerequisite controls. Empty list when none. |
##### CCC\_Requirement\_Attribute
Used by the Common Cloud Controls Catalog.
| Field | Type | Required | Notes |
| -------------------------- | ---------------- | -------- | --------------------------------------------------------------- |
| `FamilyName` | string | Yes | Control family, e.g. `Data`. |
| `FamilyDescription` | string | Yes | Description of the family. |
| `Section` | string | Yes | Section title. |
| `SubSection` | string | Yes | Subsection title, or empty string. |
| `SubSectionObjective` | string | Yes | Stated objective for the subsection. |
| `Applicability` | array of strings | Yes | Applicability tags such as `tlp-green`, `tlp-amber`, `tlp-red`. |
| `Recommendation` | string | Yes | Implementation recommendation. |
| `SectionThreatMappings` | array of objects | Yes | Each entry has `ReferenceId` and `Identifiers`. |
| `SectionGuidelineMappings` | array of objects | Yes | Each entry has `ReferenceId` and `Identifiers`. |
##### Generic\_Compliance\_Requirement\_Attribute
The fallback attribute model used when no framework-specific schema applies (e.g. NIST 800-53, PCI DSS, GDPR, HIPAA). It is **always the last** element of the `Compliance_Requirement.Attributes` Union; that ordering is load-bearing.
| Field | Type | Required | Notes |
| ------------ | ------ | -------- | ----------------------------- |
| `ItemId` | string | No | Item identifier. |
| `Section` | string | No | Section name. |
| `SubSection` | string | No | Subsection name. |
| `SubGroup` | string | No | Subgroup name. |
| `Service` | string | No | Affected service, e.g. `iam`. |
| `Type` | string | No | Control type. |
| `Comment` | string | No | Free-form comment. |
For the remaining attribute classes (`AWS_Well_Architected_Requirement_Attribute`, `ISO27001_2013_Requirement_Attribute`, `Mitre_Requirement_Attribute_`, `KISA_ISMSP_Requirement_Attribute`, `Prowler_ThreatScore_Requirement_Attribute`, `C5Germany_Requirement_Attribute`, `CSA_CCM_Requirement_Attribute`) consult `prowler/lib/check/compliance_models.py` for the full field sets.
The `Attributes` field is a Pydantic `Union`. The generic attribute model **must** remain the last element of that Union — otherwise Pydantic v1 silently coerces every framework into the generic shape and your specialized fields are dropped. Adding a brand-new attribute shape requires inserting the Pydantic class **before** `Generic_Compliance_Requirement_Attribute`.
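The ordering rule can be illustrated without Pydantic. The sketch below is a plain-Python simulation of "first member that matches wins", not Pydantic's actual matching, which also validates types and enums. `first_match` is hypothetical, and the required-field set for the CIS shape is reduced to three fields for brevity.

```python
from typing import Dict, List, Set, Tuple

Shape = Tuple[str, Set[str]]  # (class name, required field names)


def first_match(attribute: Dict[str, object], union: List[Shape]) -> str:
    """Return the first shape whose required fields are all present."""
    for name, required in union:
        if required.issubset(attribute):
            return name
    raise ValueError("no shape matched")


cis: Shape = ("CIS_Requirement_Attribute", {"Section", "Profile", "AssessmentStatus"})
generic: Shape = ("Generic_Compliance_Requirement_Attribute", set())  # all fields optional

attribute = {
    "Section": "4 Logging and Monitoring",
    "Profile": "Level 1",
    "AssessmentStatus": "Automated",
}
generic_last = first_match(attribute, [cis, generic])
generic_first = first_match(attribute, [generic, cis])
```

With the generic shape first, it matches everything because none of its fields are required, so the specialized class is never reached.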
#### Minimal working example
The following snippet is a complete, valid framework file named `my_framework_1.0_aws.json`, saved at `prowler/compliance/aws/my_framework_1.0_aws.json`. It uses the generic attribute shape for simplicity.
```json title="prowler/compliance/aws/my_framework_1.0_aws.json" theme={null}
{
  "Framework": "My-Framework",
  "Name": "My Framework 1.0 for AWS",
  "Version": "1.0",
  "Provider": "AWS",
  "Description": "Internal baseline for AWS accounts.",
  "Requirements": [
    {
      "Id": "MF-1.1",
      "Description": "Root account must have multi-factor authentication enabled.",
      "Attributes": [
        {
          "ItemId": "MF-1.1",
          "Section": "Identity and Access Management",
          "SubSection": "Root Account",
          "Service": "iam"
        }
      ],
      "Checks": [
        "iam_root_mfa_enabled",
        "iam_root_hardware_mfa_enabled"
      ]
    },
    {
      "Id": "MF-2.1",
      "Description": "S3 buckets must block public access at the account level.",
      "Attributes": [
        {
          "ItemId": "MF-2.1",
          "Section": "Data Protection",
          "Service": "s3"
        }
      ],
      "Checks": [
        "s3_account_level_public_access_blocks"
      ]
    }
  ]
}
```
### Mapping checks to requirements
Each requirement links to the Prowler checks that, together, produce a PASS or FAIL verdict for that control.
* **Include every requirement from the source catalog.** The framework file must mirror the full control list, one-to-one. Compliance percentages, dashboards, and exported metadata are computed against the total requirement count.
* List every check by its canonical identifier — the value of `CheckID` inside the check's `.metadata.json` file.
* One requirement can reference multiple checks. The requirement is evaluated as FAIL when any referenced check produces a FAIL finding for a resource in scope.
* Leave `Checks` (legacy) or `checks.` (universal) as an empty array when the requirement cannot be automated. The requirement still appears in the report and contributes to the total.
* Reuse checks across requirements when the same control applies in multiple places. Do not duplicate check logic to match framework structure.
* Avoid referencing checks from a different provider. A legacy compliance file is bound to one provider, and cross-provider checks will never match findings in the scan.
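The any-FAIL rule from the list above can be written as a small function. This is a sketch of the stated rule only. `requirement_verdict` is hypothetical, muted findings are not modelled, and returning `None` for a requirement without findings is an assumption rather than Prowler's documented behavior.

```python
from typing import List, Optional


def requirement_verdict(finding_statuses: List[str]) -> Optional[str]:
    """FAIL if any finding failed, PASS if there are findings and none failed."""
    if not finding_statuses:
        return None  # no mapped checks or no findings: nothing to evaluate here
    return "FAIL" if "FAIL" in finding_statuses else "PASS"


mixed = requirement_verdict(["PASS", "FAIL", "PASS"])
clean = requirement_verdict(["PASS", "PASS"])
unmapped = requirement_verdict([])
```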
To discover available checks:
```bash theme={null}
uv run python prowler-cli.py --list-checks
```
### Supporting multiple providers (legacy)
The legacy schema binds each file to a single provider. To cover several providers with the same framework, ship one JSON file per provider:
```
prowler/compliance/aws/cis_2.0_aws.json
prowler/compliance/azure/cis_2.0_azure.json
prowler/compliance/gcp/cis_2.0_gcp.json
```
Keep the `Framework` and `Version` values identical across the files so the dispatcher matches them; change only the `Provider`, `Checks`, and provider-specific metadata. The CIS output formatter already supports every provider listed above.
For a brand-new framework that spans several providers, **prefer the universal schema** — it covers every provider from a single file. If you must use the legacy schema, add one transformer per provider in `prowler/lib/outputs/compliance//` and extend the summary-table dispatcher accordingly. See [Output Formatter](#output-formatter).
### Output formatter
Legacy frameworks render in two forms: a detailed CSV report written to disk, and a summary table printed in the CLI. Both are produced by the output formatter package for the framework. Universal frameworks do **not** need a Python output formatter — the `outputs` config inside the JSON drives rendering — so this section applies only to the legacy schema.
For a new legacy framework named `my_framework`, create:
```
prowler/lib/outputs/compliance/my_framework/
├── __init__.py
├── my_framework.py # CLI summary table dispatcher
├── my_framework_aws.py # Per-provider transformer
└── models.py # CSV row Pydantic model
```
#### Step 1 — Define the CSV row model
In `models.py`, declare a Pydantic v1 model with one field per CSV column. Use existing models such as `AWSCISModel` in `prowler/lib/outputs/compliance/cis/models.py` as the reference. Fields typically include `Provider`, `Description`, `AccountId`, `Region`, `AssessmentDate`, `Requirements_Id`, `Requirements_Description`, one `Requirements_Attributes_*` field per attribute key, plus the finding fields `Status`, `StatusExtended`, `ResourceId`, `ResourceName`, `CheckId`, `Muted`, `Framework`, `Name`.
#### Step 2 — Implement the transformer
In `my_framework_aws.py`, subclass `ComplianceOutput` from `prowler.lib.outputs.compliance.compliance_output` and implement `transform(findings, compliance, compliance_name)`. Iterate over `findings`, match each finding to the requirements it satisfies through `finding.compliance.get(compliance_name, [])`, and append one row per attribute to `self._data`.
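The matching loop inside `transform` can be sketched with plain Python stand-ins. This is an illustration under stated assumptions, not the real transformer: `FakeFinding`, `build_rows`, and the dictionary-shaped requirements are hypothetical, and a real implementation would build instances of the CSV row model and append them to `self._data`.

```python
from dataclasses import dataclass, field


@dataclass
class FakeFinding:
    """Hypothetical stand-in for a finding; only the fields used below."""
    check_id: str
    status: str
    resource_id: str
    # Maps a compliance name to the requirement IDs the finding belongs to.
    compliance: dict = field(default_factory=dict)


def build_rows(findings, requirements, compliance_name):
    """Emit one row per (finding, matched requirement, attribute)."""
    rows = []
    for finding in findings:
        matched_ids = finding.compliance.get(compliance_name, [])
        for requirement in requirements:
            if requirement["Id"] not in matched_ids:
                continue
            for attribute in requirement["Attributes"]:
                rows.append({
                    "Requirements_Id": requirement["Id"],
                    "Requirements_Attributes_Section": attribute["Section"],
                    "Status": finding.status,
                    "ResourceId": finding.resource_id,
                    "CheckId": finding.check_id,
                })
    return rows


requirements = [
    {"Id": "1.1", "Attributes": [{"Section": "IAM"}]},
    {"Id": "1.2", "Attributes": [{"Section": "IAM"}, {"Section": "Logging"}]},
]
findings = [
    FakeFinding("check_a", "PASS", "res-1", {"MyFramework-1.0": ["1.1", "1.2"]}),
    FakeFinding("check_b", "FAIL", "res-2", {"OtherFramework-3.0": ["9.9"]}),
]
rows = build_rows(findings, requirements, "MyFramework-1.0")
```

The second finding contributes no rows because it carries no entry for `MyFramework-1.0`, which is the effect of the `[]` default in `.get(compliance_name, [])`.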
#### Step 3 — Add the summary-table dispatcher
In `my_framework.py`, implement `get_my_framework_table(findings, bulk_checks_metadata, compliance_framework, output_filename, output_directory, compliance_overview)` following the pattern in `prowler/lib/outputs/compliance/cis/cis.py`.
#### Step 4 — Register the framework in the dispatchers
* Add the dispatcher call in `prowler/lib/outputs/compliance/compliance.py`, inside `display_compliance_table`, with a branch such as `elif "my_framework" in compliance_framework:`.
* Register the CSV model and transformer in `prowler/lib/outputs/compliance/compliance_output.py` so the CSV file is emitted during the scan.
For NIST-style catalogs that use `Generic_Compliance_Requirement_Attribute`, no custom formatter is needed. The generic formatter in `prowler/lib/outputs/compliance/generic/` handles them automatically, provided the JSON validates against the generic attribute schema.
### Legacy-to-universal adapter
At load time, every legacy file is transparently adapted to a `ComplianceFramework` via `adapt_legacy_to_universal()` (`compliance_models.py`), which (a) flattens the first element of `Attributes` into a flat `attributes` dict, (b) wraps `Checks` as `{provider_lower: [...]}`, and (c) infers `attributes_metadata` from the matched Pydantic class via `_infer_attribute_metadata()`. The rest of Prowler (CSV/OCSF/PDF output, CLI table) then treats both formats identically.
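Steps (a) and (b) can be pictured on a single requirement with plain dictionaries. The sketch below is illustrative only: `adapt_requirement_sketch` is a hypothetical function, not the body of `adapt_legacy_to_universal()`, and it leaves out step (c) because metadata inference depends on the Pydantic classes.

```python
def adapt_requirement_sketch(legacy_requirement, provider):
    """Illustrative only: mirror steps (a) and (b) for one legacy requirement."""
    attributes_list = legacy_requirement.get("Attributes") or [{}]
    return {
        "id": legacy_requirement["Id"],
        "description": legacy_requirement.get("Description", ""),
        # (a) the first Attributes element becomes a flat dict
        "attributes": dict(attributes_list[0]),
        # (b) Checks are wrapped under the lowercase provider name
        "checks": {provider.lower(): list(legacy_requirement.get("Checks", []))},
    }


legacy = {
    "Id": "2.11",
    "Description": "Ensure credentials unused for 45 days or more are disabled.",
    "Checks": ["iam_user_accesskey_unused", "iam_user_console_access_unused"],
    "Attributes": [{"Section": "Identity and Access Management"}],
}
adapted = adapt_requirement_sketch(legacy, "AWS")
```

After the call, `adapted["checks"]` is keyed by `aws`, which is the shape the universal example requirements use.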
Loader-error behaviour differs between the two entry points:
* `load_compliance_framework()` (legacy) is **fail-fast**: it calls `sys.exit(1)` on any `ValidationError` (`compliance_models.py`).
* `load_compliance_framework_universal()` is more lenient — it logs the error and returns `None`, so `get_bulk_compliance_frameworks_universal()` simply skips the broken file and keeps loading the rest.
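The practical difference between the two behaviours is easier to see side by side. The following sketch uses hypothetical functions and a simplified validation rule (valid JSON with a non-empty `Framework` key) in place of the real Pydantic validation; it only models the control flow described above.

```python
import json
import logging
import sys

logger = logging.getLogger("compliance_loader_sketch")


def parse_framework(text):
    """Stand-in validation: require valid JSON with a non-empty 'Framework' key."""
    data = json.loads(text)
    if not data.get("Framework"):
        raise ValueError("Framework is required")
    return data


def load_fail_fast(text):
    """Legacy-style behaviour: any validation problem ends the process."""
    try:
        return parse_framework(text)
    except ValueError as error:  # json.JSONDecodeError is a ValueError subclass
        logger.critical("invalid framework: %s", error)
        sys.exit(1)


def load_lenient(text):
    """Universal-style behaviour: log the problem and return None."""
    try:
        return parse_framework(text)
    except ValueError as error:
        logger.error("skipping invalid framework: %s", error)
        return None


def load_bulk_lenient(named_texts):
    """Keep every framework that loads; skip the ones that returned None."""
    loaded = {}
    for name, text in named_texts.items():
        framework = load_lenient(text)
        if framework is not None:
            loaded[name] = framework
    return loaded


bulk = load_bulk_lenient({"good": '{"Framework": "CIS"}', "broken": "{}"})
```

With the lenient loader one broken file costs only its own entry, while the fail-fast loader would stop the whole run on the same input.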
## Configuration Guardrails for Requirements
Some requirements are only truly satisfied when the configurable checks behind them ran with a configuration strict enough to meet the control. A [configurable check](/developer-guide/configurable-checks) reads thresholds from the scan's `audit_config`, so loosening a value can make the check PASS while the requirement it backs is, in fact, not satisfied.
A worked example: CIS AWS 6.0 requirement 2.11 ("credentials unused for 45 days or more are disabled") maps to `iam_user_accesskey_unused`, which is driven by the `max_unused_access_keys_days` config key. If a user raises that value to `120`, the check passes for a key unused for 90 days — yet the requirement explicitly demands a 45-day threshold, so the PASS is misleading.
Configuration guardrails close that gap. A requirement declares the configuration it expects, and when the scan ran with a configuration too loose to honor it, the requirement is forced to **FAIL** in every compliance output, with the reason surfaced in the finding's extended status.
Guardrails are an **optional** safety net for configurable checks. A requirement that maps only to non-configurable checks does not need them. When the field is absent, behavior is unchanged.
### Where guardrails are declared
The field is attached to each requirement and exists in both schemas:
* **Legacy** (`prowler/compliance//...`): `ConfigRequirements`, a list of objects, validated against the `Compliance_Requirement_ConfigConstraint` Pydantic model (`prowler/lib/check/compliance_models.py`).
* **Universal** (`prowler/compliance/...`): `config_requirements`, the same list of objects as plain dicts on `UniversalComplianceRequirement`.
When a legacy file is adapted to the universal model, `adapt_legacy_to_universal()` copies `ConfigRequirements` into `config_requirements` (`compliance_models.py`), so downstream code only ever reads one shape.
### Constraint schema
Each entry in the list is a single constraint with the following fields:
| Field | Type | Required | Description |
| ----------- | --------------------------------- | -------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `Check` | string | Yes | The configurable check this constraint guards. Should be one of the requirement's `Checks`. Used only to build a human-readable reason. |
| `ConfigKey` | string | Yes | The `audit_config` key the check reads (for example `max_unused_access_keys_days`). |
| `Operator` | enum | Yes | How to compare the applied value against `Value`. One of `lte`, `gte`, `eq`, `in`, `subset`, `superset`. |
| `Value` | bool, int, float, string, or list | Yes | The strictest configuration the requirement tolerates. The accepted Python type depends on the operator (see below). |
| `Provider` | string | No | The provider this constraint applies to (e.g. `aws`). **Required for universal (multi-provider) frameworks**, where the same requirement maps checks across providers — the constraint is only evaluated when the scanned provider matches. Single-provider (legacy) frameworks omit it. |
### Operators
| Operator | Applied value satisfies the guardrail when… | Typical use |
| ---------- | ------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `lte` | `applied <= Value` | Maximum-age / maximum-count thresholds (e.g. `max_unused_access_keys_days <= 45`). |
| `gte` | `applied >= Value` | Minimum-retention / minimum-count thresholds. |
| `eq` | `applied == Value` | Boolean toggles or an exact required value (e.g. `mute_non_default_regions == false`). |
| `in` | `applied` is one of `Value` (a list) | The applied scalar must belong to an allowed set. |
| `subset` | `set(applied) <= set(Value)` | **Allowlist** configs — every applied value must already be permitted. Widening the allowlist with a weaker value (e.g. adding TLS `1.0` to `recommended_minimal_tls_versions`) breaks the guardrail. |
| `superset` | `set(applied) >= set(Value)` | **Denylist** configs — every forbidden value must remain forbidden. Removing an entry from a denylist (e.g. dropping a weak algorithm from `insecure_key_algorithms`) breaks the guardrail. |
`subset` / `superset` require both the applied value and `Value` to be lists; any other type is treated as not satisfied. For `eq` against a boolean, declare `Value` as a JSON boolean (`false`, not `0`) — the model keeps booleans distinct from integers.
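The operator table translates almost directly into code. The sketch below is a minimal illustration, not the implementation in `compliance_config_eval.py`: `satisfies` is a hypothetical name, and the way it keeps booleans apart from integers for `eq` is an assumption about how that rule could be enforced at comparison time.

```python
def satisfies(operator, applied, value):
    """Return True when `applied` honors the guardrail `value` under `operator`."""
    if operator == "lte":
        return applied <= value
    if operator == "gte":
        return applied >= value
    if operator == "eq":
        # Assumption of this sketch: a bool never equals a non-bool (False != 0).
        if isinstance(applied, bool) != isinstance(value, bool):
            return False
        return applied == value
    if operator == "in":
        return applied in value
    if operator in ("subset", "superset"):
        # Both operands must be lists; any other type is not satisfied.
        if not (isinstance(applied, list) and isinstance(value, list)):
            return False
        if operator == "subset":
            return set(applied) <= set(value)
        return set(applied) >= set(value)
    # An unrecognized operator does not block the requirement.
    return True


tls_allowlist = ["TLS 1.2", "TLS 1.3"]
widened = satisfies("subset", ["TLS 1.0", "TLS 1.2"], tls_allowlist)
strict = satisfies("subset", ["TLS 1.2"], tls_allowlist)
```

Here `widened` is `False` because `TLS 1.0` is not in the allowlist, while `strict` is `True` because every applied value is already permitted.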
### How guardrails are evaluated
All evaluation lives in one shared module, `prowler/lib/check/compliance_config_eval.py`, consumed by every compliance output (CSV, OCSF, and the CLI tables) and reused by the Prowler App backend so the rule is defined exactly once.
1. The applied configuration is the scan-global `audit_config` (the same mapping for every resource and region), resolved via `get_scan_audit_config()`.
2. For each requirement that declares constraints, `evaluate_config_constraints()` walks the list and returns `(is_compliant, reason)`. The requirement is compliant when **every** explicitly-set key satisfies its constraint.
3. A constraint tagged with a `Provider` that does **not** match the provider being scanned (resolved via `get_scan_provider_type()`) is **skipped**. This scopes a universal framework's constraints to the right provider, so a guardrail authored for an AWS check never affects a GCP or Azure scan of the same requirement. Untagged constraints (legacy single-provider frameworks) always apply.
4. A constraint whose `ConfigKey` is **not present** in `audit_config` is **skipped** — the check's built-in default is assumed to already match what the requirement expects. This is why nothing changes for the default configuration.
5. When a constraint is violated, the finding's status is overridden to `FAIL` and a plain-language explanation is prepended to `status_extended` (via `apply_config_status()`). The message opens with `Configuration not valid for this requirement.` and names the check, the value the scan applied, what the requirement needs and how to fix it. For the table generators, `get_effective_status()` applies the same FAIL roll-up so per-section counts stay consistent.
Guardrails only ever make a result **stricter** (they can turn PASS into FAIL); they never relax a real FAIL into PASS. A requirement with no constraints, or whose keys all use defaults, is reported exactly as before.
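The skip rules in steps 3 and 4 are what keep default scans unchanged, and a short sketch makes them concrete. This is not the real `evaluate_config_constraints()`: the function name, the reduced operator set, the reason wording, and the empty reason on success are all assumptions made for illustration.

```python
import operator

# Reduced operator set for the sketch; the real evaluator supports more.
COMPARATORS = {"lte": operator.le, "gte": operator.ge, "eq": operator.eq}


def evaluate_constraints_sketch(constraints, audit_config, provider_type):
    """Return (is_compliant, reason) following the numbered steps above."""
    for constraint in constraints:
        tagged_provider = constraint.get("Provider")
        if tagged_provider and tagged_provider != provider_type:
            continue  # step 3: constraint belongs to another provider
        key = constraint["ConfigKey"]
        if key not in audit_config:
            continue  # step 4: key not set, the built-in default is assumed
        compare = COMPARATORS.get(constraint["Operator"])
        if compare is None:
            continue  # operator outside this sketch
        applied = audit_config[key]
        if not compare(applied, constraint["Value"]):
            check = constraint["Check"]
            expected = f"{constraint['Operator']} {constraint['Value']!r}"
            return False, f"{check}: {key} is {applied!r}, expected {expected}"
    return True, ""


constraints = [{
    "Check": "iam_user_accesskey_unused",
    "ConfigKey": "max_unused_access_keys_days",
    "Operator": "lte",
    "Value": 45,
}]
default_scan = evaluate_constraints_sketch(constraints, {}, "aws")
loose_scan = evaluate_constraints_sketch(
    constraints, {"max_unused_access_keys_days": 120}, "aws"
)
```

`default_scan` is compliant because the key is absent from `audit_config`, while `loose_scan` is not, since `120` exceeds the declared maximum of `45`.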
### Example: legacy framework
From `prowler/compliance/aws/cis_6.0_aws.json`, requirement 2.11 declares two guardrails — one per configurable check it maps to:
```json title="prowler/compliance/aws/cis_6.0_aws.json" theme={null}
{
  "Id": "2.11",
  "Description": "Ensure credentials unused for 45 days or more are disabled.",
  "Checks": [
    "iam_user_accesskey_unused",
    "iam_user_console_access_unused"
  ],
  "ConfigRequirements": [
    {
      "Check": "iam_user_accesskey_unused",
      "ConfigKey": "max_unused_access_keys_days",
      "Operator": "lte",
      "Value": 45
    },
    {
      "Check": "iam_user_console_access_unused",
      "ConfigKey": "max_console_access_days",
      "Operator": "lte",
      "Value": 45
    }
  ],
  "Attributes": [ /* ... */ ]
}
```
A boolean guardrail from the same file: requirement 2.5 (IAM Access Analyzer) only holds when regions are not muted, so a scan with `mute_non_default_regions: true` cannot be trusted for it:
```json theme={null}
"ConfigRequirements": [
  {
    "Check": "accessanalyzer_enabled",
    "ConfigKey": "mute_non_default_regions",
    "Operator": "eq",
    "Value": false
  }
]
```
### Example: universal framework
The universal schema uses the lowercase `config_requirements` key with the identical object shape:
```json theme={null}
{
  "id": "MF-2.1",
  "name": "Restrict TLS to modern versions",
  "description": "Endpoints must negotiate only TLS 1.2 or higher.",
  "checks": {
    "aws": ["elbv2_listener_ssl_listeners"]
  },
  "config_requirements": [
    {
      "Check": "elbv2_listener_ssl_listeners",
      "Provider": "aws",
      "ConfigKey": "recommended_minimal_tls_versions",
      "Operator": "subset",
      "Value": ["TLS 1.2", "TLS 1.3"]
    }
  ]
}
```
Each constraint declares the `Provider` it targets so the guardrail is only evaluated on scans of that provider — essential for universal frameworks like CSA CCM and DORA, where one requirement maps checks across `aws`, `azure`, `gcp` and more. Because the operator is `subset`, adding `"TLS 1.0"` to `recommended_minimal_tls_versions` widens the allowlist beyond `["TLS 1.2", "TLS 1.3"]` and the requirement is forced to FAIL.
### What the user sees
With a loosened config, the affected requirement's findings report:
```text theme={null}
Status: FAIL
StatusExtended: Configuration not valid for this requirement. The check
iam_user_accesskey_unused has max_unused_access_keys_days set
to 120, but the requirement needs a value of 45 or lower.
Update it to 45 or lower.
```
The same `Configuration not valid for this requirement.` message appears identically across the CSV, OCSF, and console-table outputs.
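The override itself is a small, one-directional transformation. The sketch below is a hypothetical rendering of that step, not the real `apply_config_status()`: the function name, its signature, and the exact way the reason is joined to the original extended status are assumptions.

```python
PREFIX = "Configuration not valid for this requirement."


def override_status_sketch(status, status_extended, is_compliant, reason):
    """Force FAIL and prepend the explanation when the guardrail is violated."""
    if is_compliant:
        # Guardrails never relax a result: the original status is kept as-is.
        return status, status_extended
    return "FAIL", f"{PREFIX} {reason} {status_extended}".strip()


reason = (
    "The check iam_user_accesskey_unused has max_unused_access_keys_days set "
    "to 120, but the requirement needs a value of 45 or lower. "
    "Update it to 45 or lower."
)
overridden = override_status_sketch(
    "PASS", "User has no unused access keys.", False, reason
)
untouched = override_status_sketch("FAIL", "User has unused access keys.", True, "")
```

A violated guardrail turns the `PASS` into `FAIL`, whereas a satisfied guardrail leaves an existing `FAIL` exactly as the check reported it.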
### Authoring guidelines
* Declare a guardrail only for keys whose value actually changes whether the requirement is met. Most configurable checks do not need one.
* Set `Value` to the **strictest** configuration the control tolerates — the same number the control text cites (CIS 45 days, NIST ≤90, and so on).
* Keep `ConfigKey` spelled exactly as the check reads it from `audit_config`; an unknown key is never present in the config and the constraint is silently skipped.
* In a **universal (multi-provider) framework**, always set `Provider` to the provider that owns `Check` — otherwise the guardrail would leak onto scans of the other providers the requirement maps. Legacy single-provider files omit it.
* Pick the operator from the value's role: a max threshold is `lte`, a min threshold is `gte`, a toggle is `eq`, an allowlist is `subset`, a denylist is `superset`.
* An unrecognized operator does **not** block the requirement — a malformed constraint is treated as satisfied rather than failing the whole framework. Validate your JSON with the tests below.
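Because a misspelled key or operator is silently tolerated at scan time, a small lint pass over the JSON can help while authoring. The following is a minimal sketch with a hypothetical `lint_guardrails` helper; it covers only the guidelines about operators, `Provider`, and `Check`, and it does not replace the shipped tests.

```python
ALLOWED_OPERATORS = {"lte", "gte", "eq", "in", "subset", "superset"}


def lint_guardrails(requirement):
    """Return a list of problems found in one requirement's guardrails."""
    problems = []
    universal = "config_requirements" in requirement
    if universal:
        constraints = requirement["config_requirements"]
    else:
        constraints = requirement.get("ConfigRequirements", [])

    for index, constraint in enumerate(constraints):
        label = f"constraint #{index}"
        op = constraint.get("Operator")
        if op not in ALLOWED_OPERATORS:
            problems.append(f"{label}: unknown operator {op!r}")

        if universal:
            provider = constraint.get("Provider")
            if not provider:
                problems.append(f"{label}: Provider is missing in a universal framework")
                continue
            mapped_checks = requirement.get("checks", {}).get(provider, [])
        else:
            mapped_checks = requirement.get("Checks", [])

        check = constraint.get("Check")
        if check not in mapped_checks:
            problems.append(f"{label}: Check {check!r} is not mapped by the requirement")
    return problems


legacy_ok = {
    "Checks": ["iam_user_accesskey_unused"],
    "ConfigRequirements": [{
        "Check": "iam_user_accesskey_unused",
        "ConfigKey": "max_unused_access_keys_days",
        "Operator": "lte",
        "Value": 45,
    }],
}
universal_bad = {
    "checks": {"aws": ["elbv2_listener_ssl_listeners"]},
    "config_requirements": [{
        "Check": "elbv2_listener_ssl_listeners",
        "ConfigKey": "recommended_minimal_tls_versions",
        "Operator": "subset",
        "Value": ["TLS 1.2", "TLS 1.3"],
    }],
}
legacy_problems = lint_guardrails(legacy_ok)
universal_problems = lint_guardrails(universal_bad)
```

The universal example is flagged because its constraint omits `Provider`; the sketch cannot verify `ConfigKey` spelling, since that requires knowing which keys each check reads.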
### Testing guardrails
The shared evaluator and the per-output integration are covered by:
* `tests/lib/check/compliance_config_eval_test.py` — operator semantics, skipped-key behavior, and the FAIL override.
* `tests/lib/check/compliance_config_constraint_model_test.py` — model validation (types, operator enum, bool-vs-int).
* `tests/lib/check/compliance_config_requirements_data_test.py` — sanity-checks the guardrails shipped in the JSON catalog.
* Per-output tests under `tests/lib/outputs/compliance/` (CIS AWS/Azure, ENS AWS, OCSF, universal table) confirm the override reaches each format.
Run them with:
```bash theme={null}
uv run pytest -n auto \
tests/lib/check/compliance_config_eval_test.py \
tests/lib/check/compliance_config_constraint_model_test.py \
tests/lib/check/compliance_config_requirements_data_test.py \
tests/lib/outputs/compliance/
```
## Version handling
Prowler matches frameworks by concatenating `Framework` and `Version`. A missing or empty `Version` collapses several frameworks to the same key and breaks CLI filtering with `--compliance`.
* Always set `Version` (or `version` for universal frameworks) to a non-empty string, even for frameworks that rename editions rather than version them. Use the edition identifier (for example `RD2022`, `v2025.10`, `4.0`, `2022/2554`).
* When the source catalog has no version, use the first year of adoption or the release date.
* For **legacy** files, make sure the version substring embedded in the filename matches `Version`, because the CLI dispatcher reads `compliance_framework.split("_")[1]` to select the correct version.
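The filename rule follows from how `str.split` behaves on the framework key. A minimal sketch, using hypothetical helper names, shows what the dispatcher expression yields for a CIS-style filename and how a mismatch could be detected:

```python
def version_from_key(compliance_framework):
    """Mirror the expression above: the second '_'-separated token."""
    return compliance_framework.split("_")[1]


def filename_matches_version(compliance_framework, version):
    """True when the version embedded in the key equals the JSON `Version`."""
    return version_from_key(compliance_framework) == version


extracted = version_from_key("cis_2.0_aws")
matches = filename_matches_version("cis_2.0_aws", "2.0")
mismatch = filename_matches_version("cis_2.0_aws", "2.0.0")
```

In this example `extracted` is `"2.0"`, so a JSON file declaring `"Version": "2.0.0"` under that filename would not line up with what the dispatcher reads.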
## Validating Your Framework
Before opening a PR, validate that the JSON loads cleanly against the model and that every referenced check actually exists.
### 1. Schema validation
For **universal** frameworks, load the file and inspect what was parsed. The framework key inside `bulk` is the **basename of the JSON file** (without `.json`); for `prowler/compliance/dora_2022_2554.json` that key is `dora_2022_2554`, for `prowler/compliance/aws/cis_5.0_aws.json` it is `cis_5.0_aws`.
```python theme={null}
from prowler.lib.check.compliance_models import (
    load_compliance_framework_universal,
    get_bulk_compliance_frameworks_universal,
)
fw = load_compliance_framework_universal("prowler/compliance/.json")
assert fw is not None, "load returned None — check the logs for the validation error"
print(fw.framework, len(fw.requirements), fw.get_providers())
bulk = get_bulk_compliance_frameworks_universal("aws")
assert "" in bulk
```
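The basename rule for the framework key can be reproduced with `pathlib`, which is handy when scripting the check for several files. This is a small sketch with a hypothetical `framework_key` helper; it only derives the key and does not load anything from Prowler.

```python
from pathlib import Path


def framework_key(json_path):
    """Return the file's basename without the trailing .json extension."""
    return Path(json_path).stem


universal_key = framework_key("prowler/compliance/dora_2022_2554.json")
legacy_key = framework_key("prowler/compliance/aws/cis_5.0_aws.json")
```

`Path.stem` strips only the final suffix, so the dot inside `cis_5.0_aws` survives and `legacy_key` is `cis_5.0_aws`.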
### 2. Check existence cross-check
There is **no automatic check-existence validation** at load time. Cross-check that every check name in your framework maps to a real check directory:
```python theme={null}
import os

# Collect every real AWS check: a directory that contains <check>.metadata.json.
real = set()
for svc in os.listdir("prowler/providers/aws/services"):
    svc_path = f"prowler/providers/aws/services/{svc}"
    if not os.path.isdir(svc_path):
        continue
    for entry in os.listdir(svc_path):
        if os.path.isfile(f"{svc_path}/{entry}/{entry}.metadata.json"):
            real.add(entry)

# `fw` is the framework object loaded in the schema-validation step.
referenced = {c for r in fw.requirements for c in r.checks.get("aws", [])}
missing = referenced - real
assert not missing, f"checks referenced in framework but not found in repo: {sorted(missing)}"
```
### 3. CLI smoke test
```bash theme={null}
uv run python prowler-cli.py --list-compliance
```
The framework must appear in the output. A validation error indicates a schema mismatch.
```bash theme={null}
uv run python prowler-cli.py \
--compliance \
--log-level ERROR
```
Verify that:
* Prowler produces a CSV file under `output/compliance/` with the expected name.
* The CLI summary table lists every section / pillar of the framework.
* Findings roll up under the expected requirements.
### 4. Inspect the CSV output
Open the generated CSV and confirm:
* All columns defined in `models.py` (legacy) or in `attributes_metadata` (universal) appear.
* Every requirement has at least one row per scanned resource (when there are findings).
* Attribute values such as `Requirements_Attributes_Section` reflect the JSON content.
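The column check can be scripted instead of done by eye. The sketch below uses a hypothetical `missing_columns` helper and an inline sample instead of a real report; the semicolon delimiter is an assumption, so adjust it if the generated file uses a different one.

```python
import csv
import io


def missing_columns(csv_text, expected_columns, delimiter=";"):
    """Return the expected columns that are absent from the CSV header."""
    reader = csv.DictReader(io.StringIO(csv_text), delimiter=delimiter)
    present = set(reader.fieldnames or [])
    return [column for column in expected_columns if column not in present]


# Inline sample standing in for a generated compliance CSV.
sample = (
    "Provider;Requirements_Id;Requirements_Attributes_Section;Status\n"
    "aws;2.11;Identity and Access Management;FAIL\n"
)
absent = missing_columns(sample, ["Provider", "Status", "Muted"])
```

For the inline sample, `absent` contains only `Muted`, which signals that the row model and the emitted file disagree on that column.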
### 5. Verify the framework in Prowler App
Launch Prowler App locally (`docker compose up` from the repository root) and run a scan with the new compliance framework. Confirm the compliance page renders the requirements, sections, and status widgets correctly.
## Testing
Compliance contributions require two layers of tests.
* **Schema tests** exercise the Pydantic models. Extend `tests/lib/check/universal_compliance_models_test.py` with a case that loads the new JSON file and asserts the attribute type matches the expected model.
* **Output tests** (legacy frameworks only) exercise the transformer. Mirror the structure under `tests/lib/outputs/compliance//` with fixtures that feed synthetic findings through the transformer and assert the resulting CSV rows.
Run the suite with:
```bash theme={null}
uv run pytest -n auto tests/lib/check/universal_compliance_models_test.py \
tests/lib/outputs/compliance/
```
For guidance on writing Prowler SDK tests, refer to [Unit Testing](/developer-guide/unit-testing).
## Running and listing your framework
Once the file is in place, the CLI auto-discovers it:
```sh theme={null}
prowler --list-compliance # framework appears in the list
prowler --compliance --list-checks
prowler --compliance # full scan + compliance report
prowler --compliance --list-compliance-requirements
```
For end-user-facing tutorials (recommended for high-profile frameworks), add a dedicated page under `docs/user-guide/compliance/tutorials/` and register it in the `"Compliance"` group of `docs/docs.json`. See `docs/user-guide/compliance/tutorials/threatscore.mdx` as a reference.
## Submitting the pull request
Before opening the pull request:
1. Run the complete QA pipeline:
```bash theme={null}
uv run pre-commit run --all-files
uv run pytest -n auto
```
2. Add a changelog entry under the `### 🚀 Added` section of `prowler/CHANGELOG.md`, describing the new framework and the providers it covers.
3. Follow the [Pull Request Template](https://github.com/prowler-cloud/prowler/blob/master/.github/pull_request_template.md) and set the PR title using Conventional Commits, e.g. `feat(compliance): add My Framework 1.0 for AWS`.
4. Request review from the compliance codeowners listed in `.github/CODEOWNERS`.
## Troubleshooting
The following issues are the most common when contributing a compliance framework.
* **`ValidationError: field required` during scan (legacy).** The JSON is missing a required attribute field. Re-check the matching Pydantic model in `prowler/lib/check/compliance_models.py`.
* **All attributes collapse to `Generic_Compliance_Requirement_Attribute` values (legacy).** The Pydantic `Union` is ordered incorrectly, or the JSON matches only the generic shape. Keep the generic model in the last Union position and ensure every required field is present in the JSON.
* **`attributes_metadata validation failed` (universal).** The root validator in `compliance_models.py` rejected the file. The error message lists each offending requirement; common causes are unknown attribute keys (typo or missing entry in `attributes_metadata`), enum violations, or missing required keys.
* **`--compliance` filter does not find the framework.** For legacy: the filename does not match `__.json`, the version is empty, or the file lives outside `prowler/compliance/