Prowler Lighthouse AI¶

Prowler Lighthouse AI is a Cloud Security Analyst chatbot that helps you understand, prioritize, and remediate security findings in your cloud environments. It's designed to provide security expertise for teams without dedicated resources, acting as your 24/7 virtual cloud security analyst.

Prowler Lighthouse

How It Works¶

Prowler Lighthouse AI uses OpenAI's language models and integrates with your Prowler security findings data.

Here's what's happening behind the scenes:

The system uses a multi-agent architecture built with LanggraphJS for LLM logic and Vercel AI SDK UI for frontend chatbot.
It uses a "supervisor" architecture that interacts with different agents for specialized tasks. For example, findings_agent can analyze detected security findings, while overview_agent provides a summary of connected cloud accounts.
The system connects to OpenAI models to understand, fetch the right data, and respond to the user's query.

Note

Lighthouse AI is tested against gpt-4o and gpt-4o-mini OpenAI models.

The supervisor agent is the main contact point. It is what users interact with directly from the chat interface. It coordinates with other agents to answer users' questions comprehensively.

Lighthouse AI Architecture

Note

All agents can only read relevant security data. They cannot modify your data or access sensitive information like configured secrets or tenant details.

Set up¶

Getting started with Prowler Lighthouse AI is easy:

Go to the configuration page in your Prowler dashboard.
Enter your OpenAI API key.
Select your preferred model. The recommended one for best results is gpt-4o.
(Optional) Add business context to improve response quality and prioritization.

Lighthouse AI Configuration

Adding Business Context¶

The optional business context field lets you provide additional information to help Lighthouse AI understand your environment and priorities, including:

Your organization's cloud security goals
Information about account owners or responsible teams
Compliance requirements for your organization
Current security initiatives or focus areas

Better context leads to more relevant responses and prioritization that aligns with your needs.

Capabilities¶

Prowler Lighthouse AI is designed to be your AI security team member, with capabilities including:

Natural Language Querying¶

Ask questions in plain English about your security findings. Examples:

"What are my highest risk findings?"
"Show me all S3 buckets with public access."
"What security issues were found in my production accounts?"

Natural language querying

Detailed Remediation Guidance¶

Get tailored step-by-step instructions for fixing security issues:

Clear explanations of the problem and its impact
Commands or console steps to implement fixes
Alternative approaches with different solutions

Detailed Remediation

Enhanced Context and Analysis¶

Lighthouse AI can provide additional context to help you understand the findings:

Explain security concepts related to findings in simple terms
Provide risk assessments based on your environment and context
Connect related findings to show broader security patterns

Business Context

Contextual Responses

Important Notes¶

Prowler Lighthouse AI is powerful, but there are limitations:

Continuous improvement: Please report any issues, as the feature may make mistakes or encounter errors, despite extensive testing.
Access limitations: Lighthouse AI can only access data the logged-in user can view. If you can't see certain information, Lighthouse AI can't see it either.
NextJS session dependence: If your Prowler application session expires or logs out, Lighthouse AI will error out. Refresh and log back in to continue.
Response quality: The response quality depends on the selected OpenAI model. For best results, use gpt-4o.

Getting Help¶

If you encounter issues with Prowler Lighthouse AI or have suggestions for improvements, please reach out through our Slack channel.

What Data Is Shared to OpenAI?¶

The following API endpoints are accessible to Prowler Lighthouse AI. Data from the following API endpoints could be shared with OpenAI depending on the scope of user's query:

Accessible API Endpoints¶

User Management:

List all users - /api/v1/users
Retrieve the current user's information - /api/v1/users/me

Provider Management:

List all providers - /api/v1/providers
Retrieve data from a provider - /api/v1/providers/{id}

Scan Management:

List all scans - /api/v1/scans
Retrieve data from a specific scan - /api/v1/scans/{id}

Resource Management:

List all resources - /api/v1/resources
Retrieve data for a resource - /api/v1/resources/{id}

Findings Management:

List all findings - /api/v1/findings
Retrieve data from a specific finding - /api/v1/findings/{id}
Retrieve metadata values from findings - /api/v1/findings/metadata

Overview Data:

Get aggregated findings data - /api/v1/overviews/findings
Get findings data by severity - /api/v1/overviews/findings_severity
Get aggregated provider data - /api/v1/overviews/providers
Get findings data by service - /api/v1/overviews/services

Compliance Management:

List compliance overviews for a scan - /api/v1/compliance-overviews
Retrieve data from a specific compliance overview - /api/v1/compliance-overviews/{id}

Excluded API Endpoints¶

Not all Prowler API endpoints are integrated with Lighthouse AI. They are intentionally excluded for the following reasons:

OpenAI/other LLM providers shouldn't have access to sensitive data (like fetching provider secrets and other sensitive config)
Users queries don't need responses from those API endpoints (ex: tasks, tenant details, downloading zip file, etc.)

Excluded Endpoints:

User Management:

List specific users information - /api/v1/users/{id}
List user memberships - /api/v1/users/{user_pk}/memberships
Retrieve membership data from the user - /api/v1/users/{user_pk}/memberships/{id}

Tenant Management:

List all tenants - /api/v1/tenants
Retrieve data from a tenant - /api/v1/tenants/{id}
List tenant memberships - /api/v1/tenants/{tenant_pk}/memberships
List all invitations - /api/v1/tenants/invitations
Retrieve data from tenant invitation - /api/v1/tenants/invitations/{id}

Security and Configuration:

List all secrets - /api/v1/providers/secrets
Retrieve data from a secret - /api/v1/providers/secrets/{id}
List all provider groups - /api/v1/provider-groups
Retrieve data from a provider group - /api/v1/provider-groups/{id}

Reports and Tasks:

Download zip report - /api/v1/scans/{v1}/report
List all tasks - /api/v1/tasks
Retrieve data from a specific task - /api/v1/tasks/{id}

Lighthouse AI Configuration:

List OpenAI configuration - /api/v1/lighthouse-config
Retrieve OpenAI key and configuration - /api/v1/lighthouse-config/{id}

Note

Agents only have access to hit GET endpoints. They don't have access to other HTTP methods.

FAQs¶

1. Why only OpenAI models?

During feature development, we evaluated other LLM models.

Claude AI - Claude models have tier-based ratelimits. For Lighthouse AI to answer slightly complex questions, there are a handful of API calls to the LLM provider within few seconds. With Claude's tiering system, users must purchase $400 credits or convert their subscription to monthly invoicing after talking to their sales team. This pricing may not suit all Prowler users.
Gemini Models - Gemini lacks a solid tool calling feature like OpenAI. It calls functions recursively until exceeding limits. Gemini-2.5-Pro-Experimental is better than previous models regarding tool calling and responding, but it's still experimental.
Deepseek V3 - Doesn't support system prompt messages.

2. Why a multi-agent supervisor model?

Context windows are limited. While demo data fits inside the context window, querying real-world data often exceeds it. A multi-agent architecture is used so different agents fetch different sizes of data and respond with the minimum required data to the supervisor. This spreads the context window usage across agents.

3. Is my security data shared with OpenAI?

Minimal data is shared to generate useful responses. Agents can access security findings and remediation details when needed. Provider secrets are protected by design and cannot be read. The OpenAI key configured with Lighthouse AI is only accessible to our NextJS server and is never sent to LLMs. Resource metadata (names, tags, account/project IDs, etc) may be shared with OpenAI based on your query requirements.

4. Can the Lighthouse AI change my cloud environment?

No. The agent doesn't have the tools to make the changes, even if the configured cloud provider API keys contain permissions to modify resources.