I describe a practical pipeline I built to move raw security signals into clear, repeatable steps that my team can run during incidents. I focus on hands-on setup and what I actually run in production, not just theory.
I rely on file integrity monitoring to spot new or changed files, then trigger a signature scan and containment. I combine those detections with an artificial intelligence model to add context that matters to the business.
My flow ties server and endpoint logs together so analysts see correlated evidence fast. I sketch the components, show where cloud and local options fit, and explain how LLMs speed triage while keeping human review in the loop.
The problem I solve is translating high-volume signals into precise, safe actions for responders. I process large amounts of security information so my team can focus on real risk, not noise.
How this helps: a model speeds identification by synthesizing logs and spotting subtle anomalies that humans can miss. That reduces false positives and shortens the detection-to-response loop.
I map clear roles so everyone knows who validates recommendations, who executes changes, and who documents outcomes. I also use search over historical logs to place a suspicious file or process in context before escalation.
Result: narrower Wazuh agent configs, higher signal quality, and faster, better-documented responses that strengthen security posture over time.
I map a compact reference architecture that keeps logs flowing, models accessible, and endpoints monitored for fast triage.
I run central components (server, indexer, and dashboard) on Ubuntu 24.04. The baseline is Wazuh 4.12.0 with at least 16 GB RAM and 4 CPUs to support scalable log ingestion and efficient search.
I enroll Ubuntu and Windows 11 endpoint agents to capture diverse telemetry. Agent policies target key directories and file types so FIM events are consistent across OS types.
Models include a cloud option (OpenAI ChatGPT), a local Llama 3 via Ollama for private inference, and Claude Haiku via Amazon Bedrock integrated into the dashboard UI. I size local models to match CPU/RAM limits to keep response times predictable.
| Component | Example | Notes |
|---|---|---|
| Central server | Ubuntu 24.04, 16 GB RAM, 4 CPUs | Holds server, indexer, dashboard for ingestion and search |
| Endpoint | Ubuntu / Windows 11 agents | Monitored directories standardized across OS |
| Model | OpenAI ChatGPT, Llama 3 (Ollama), Claude Haiku (Bedrock) | Mix of cloud and local inference; name groups for maintenance |
| Checks | curl health and _post tests | Validates connector responses and expected formats |
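For the checks row, these are the kinds of probes I mean; ports are the defaults and the credentials are placeholders for whatever your install uses:

```bash
# Indexer health: a green or yellow cluster status means ingestion and search are usable
curl -k -u admin:'<indexer-password>' "https://localhost:9200/_cluster/health?pretty"

# Wazuh manager API: getting a token back confirms the server side is reachable
curl -k -u wazuh-wui:'<api-password>' -X POST "https://localhost:55000/security/user/authenticate"
```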
Finally, I document the minimal install and the follow-on steps so teams can replicate the setup. I keep names clear for models, connectors, and groups to simplify later updates and audits.
I set a tight monitoring surface so my team sees high-quality signals and fewer false positives. Small, intentional changes to agent scope cut noise and speed validation.
I add this snippet to /var/ossec/etc/ossec.conf to monitor home directories in real time:
```xml
<directories realtime="yes">/home</directories>
```
After editing, I restart the agent to apply configuration changes and watch for any error in the startup logs.
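On Ubuntu that restart and check is short, assuming the default systemd unit name and log location:

```bash
# Apply the new FIM scope, then confirm the agent came back cleanly
sudo systemctl restart wazuh-agent
sudo tail -n 50 /var/ossec/logs/ossec.log | grep -iE "error|critical" || echo "no startup errors"
```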
On Windows I add:
```xml
<directories realtime="yes">C:\Users\*\Downloads</directories>
```
I confirm service permissions so the agent can read long file names and nested paths. Then I run `Restart-Service -Name wazuh`.
Archives live in /var/ossec/logs/archives as date-based JSON or JSON.GZ files. I validate that new events appear there and on the Wazuh server.
I check parameters in ossec.conf to confirm the directory scope and then parse fields like file path, directory, and rule name for downstream automation.
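A quick way to confirm those fields are present, assuming jq is installed and JSON archiving is enabled:

```bash
# Pull current FIM events out of the archive with the fields used downstream
sudo cat /var/ossec/logs/archives/archives.json | jq -c \
  'select(.syscheck != null) | {agent: .agent.name, path: .syscheck.path, event: .syscheck.event, rule: .rule.description}'
```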
I built a compact active response flow that ties YARA detection to enrichment and immediate remediation. The goal is clear: detect a suspicious file, enrich the finding, attempt a measured response, and write a consistent log entry for the server and analysts.
I install YARA from source, fetch community rules via curl and valhallaAPI, then set ownership to root:wazuh and permissions to 750. I validate that each rule contains description metadata so enrichment can reference human-friendly context.
The yara.sh script reads parameters, waits for file writes to complete, runs YARA, captures output, and handles error conditions. On a match it attempts deletion and posts results to the model endpoint. All combined YARA and model text is appended to logs/active-responses.log.
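Below is a stripped-down sketch of that flow rather than the production script; it assumes the Wazuh 4.x convention of passing the alert JSON on stdin, a rules file under /tmp/yara/rules, and a hypothetical model endpoint with the key loaded from a file outside the script:

```bash
#!/bin/bash
# Minimal sketch of the yara.sh idea: scan the flagged file, enrich on a match,
# attempt deletion, and append one consistent line to the active response log.
LOG_FILE="/var/ossec/logs/active-responses.log"
RULES="/tmp/yara/rules/yara_rules.yar"
API_KEY=$(cat /var/ossec/etc/model_api.key)             # key kept outside the script
MODEL_ENDPOINT="https://example.internal/v1/analyze"    # hypothetical enrichment endpoint

read -r INPUT_JSON                                       # Wazuh 4.x passes the alert JSON on stdin
FILE_PATH=$(echo "$INPUT_JSON" | jq -r '.parameters.alert.syscheck.path')

sleep 2                                                  # give the file write a moment to settle
MATCHES=$(/usr/local/bin/yara -w "$RULES" "$FILE_PATH")

if [ -n "$MATCHES" ]; then
    PAYLOAD=$(jq -n --arg m "$MATCHES" '{prompt: ("Summarize the risk of this YARA match: " + $m)}')
    RESPONSE=$(curl -s -H "Authorization: Bearer $API_KEY" -H "Content-Type: application/json" \
        -d "$PAYLOAD" "$MODEL_ENDPOINT")
    rm -f "$FILE_PATH" && DELETED="yes" || DELETED="no"
    echo "$(date) wazuh-yara: INFO - match=\"$MATCHES\" deleted=$DELETED model=\"$RESPONSE\"" >> "$LOG_FILE"
fi
```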
I install Python and YARA, download Valhalla rules, and convert yara.py to yara.exe via PyInstaller for the Active Response bin. The script sends a POST with headers (Authorization, Content-Type), handles 401 invalid key responses, logs the event, and records deletion attempts.
| Platform | Script | Key behavior |
|---|---|---|
| Ubuntu | yara.sh | External key, flags invalid key, writes logs |
| Windows | yara.exe | Headers, 401 handling, audit logging |
| Server | Log store | Harmonized fields for decoders |
I store the API key outside source, explicitly handle invalid key responses so the pipeline never stalls, and verify consistent logs per endpoint. These steps keep the response predictable and auditable.
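One simple way to keep the key outside the source (the path here is just an example): a root-owned file readable by the wazuh group, loaded at runtime.

```bash
# Store the key once, outside the script and outside version control
echo '<your-api-key>' | sudo tee /var/ossec/etc/model_api.key > /dev/null
sudo chown root:wazuh /var/ossec/etc/model_api.key
sudo chmod 640 /var/ossec/etc/model_api.key

# Inside yara.sh, read it at runtime instead of hard-coding it
API_KEY=$(cat /var/ossec/etc/model_api.key)
```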
I centralize server-side parsing so each detection yields structured fields an analyst can trust.
Decoders in local_decoder.xml extract many named values from YARA text. I parse log_type, rule_name, rule_description, author, reference, date, threat_score, api_customer, file_hash, tags, minimum_YARA_version, scanned_file, chatgpt_response, and deletion indicators.
Custom decoders focus on the fields analysts rely on most: rule name, description, tags, and threat_score. I also capture the model-generated chatgpt_response field for downstream review.
I write rules in local_rules.xml to trigger on FIM events in /home and C:\Users\*\Downloads (IDs 100300–100303) and on YARA groups (108000–108003).
The active response module is configured in ossec.conf to run YARA and orchestrate the response when those rules fire. This ties detection to remediation and logging.
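To make the shape concrete, here is a trimmed sketch of what those entries can look like; the rule ID and level, the file field value, and the yara_linux command name are illustrative and should match your own decoders and script locations:

```xml
<!-- local_rules.xml: fire on a FIM file-added event under /home -->
<group name="syscheck,">
  <rule id="100300" level="7">
    <if_sid>554</if_sid>
    <field name="file">/home</field>
    <description>File added under /home - trigger YARA scan</description>
  </rule>
</group>

<!-- ossec.conf: wire the rule to the YARA active response script -->
<command>
  <name>yara_linux</name>
  <executable>yara.sh</executable>
  <timeout_allowed>no</timeout_allowed>
</command>
<active-response>
  <command>yara_linux</command>
  <location>local</location>
  <rules_id>100300</rules_id>
</active-response>
```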
“Version your configuration so changes to name mappings or expected values can be rolled back if an error appears.”
My approach starts by shaping short prompts that extract impact, scope, and clear fixes from each detection. I keep prompts focused so the model returns a compact mitigation suggestion, a risk value, and a short rationale.
I use templates that feed the model: rule name, file path, file hash, timestamps, and a short excerpt from the detection message. These fields force the model to ground its content in the evidence I provide.
Standardized outputs include a one-line response, key parameters, a value statement about risk reduction, and 2–3 remediation steps. Consistent structure reduces analyst decisions and speeds action.
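As one way to picture the template, this is roughly how those fields assemble into a single request; the OpenAI chat completions endpoint and model name are just one of the connector options, and the variables are placeholders filled from the detection:

```bash
# Build a grounded prompt from the detection fields and ask for a structured answer
PROMPT="Rule: $RULE_NAME
File: $FILE_PATH (sha256: $FILE_HASH)
Detected at: $TIMESTAMP
Excerpt: $DETECTION_EXCERPT
Return: one-line impact summary, a risk value 1-10, and 2-3 remediation steps."

curl -s https://api.openai.com/v1/chat/completions \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d "$(jq -n --arg p "$PROMPT" '{model: "gpt-4o-mini", messages: [{role: "user", content: $p}]}')"
```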
I map each rule hit to a specific role and endpoint. That mapping becomes the playbook step so the operator sees who acts, which host to target, and what to run.
I log every model query and response, group changes, and playbook names and versions. This creates an audit trail for security reviews and supports iterative improvements.
“Keep prompts short, explicit, and tied to evidence so recommendations remain practical and verifiable.”
I host the model on my server so semantic search runs close to the logs and returns fast results.
I install Ollama using the curl installer, then pull llama3 and confirm CPU and RAM are sufficient for responsive inference.
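The install is short (the official installer script plus a model pull), and I add a quick smoke test before wiring it into the pipeline:

```bash
# Install Ollama, pull the model, and confirm it answers locally
curl -fsSL https://ollama.com/install.sh | sh
ollama pull llama3
ollama run llama3 "Summarize: a new executable appeared in /home/user and matched a YARA rule."
```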
I enable JSON archiving so raw events land in /var/ossec/logs/archives/archives.json. threat_hunter.py loads those files, builds embeddings with all-MiniLM-L6-v2, and creates a FAISS vector store for fast semantic search.
I run a WebSocket chatbot that accepts /help, /reload, /set days, and /stat. Short queries and focused messages yield better output and faster retrieval.
Remote mode uses SSH with group permissions assigned to wazuh. I limit machine access, audit messages, and verify endpoint permissions before exposing file reads.
“Keep queries concise and ground them in evidence so retrieval surfaces meaningful clusters across time windows.”
I explain the steps to enable Claude 3.5 Haiku in Bedrock, wire it into OpenSearch Assistant, and make the chat available in the Wazuh dashboard.
First, I create an IAM user, generate access keys, and attach AmazonBedrockFullAccess plus a custom marketplace policy so the model can be invoked. I keep keys out of source and record them in a secrets store.
On the host I copy the OpenSearch dashboard plugins (observabilityDashboards, mlCommonsDashboards, assistantDashboards), set ownership and permissions, and enable assistant.chat.enabled. On the indexer I install opensearch-flow-framework and opensearch-skills so model calls can route correctly.
Using DevTools I set ML to run on any node, then POST a connector with secure headers, access_key, secret_key, region (use us-west-2 for Haiku), anthropic_version, and model parameters. I register a model group, deploy the model, and test with _predict to verify output and response time.
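The same _predict check can be scripted outside DevTools; the parameters body depends on how the connector's request template was defined, so treat this as the shape of the call rather than a recipe:

```bash
# Replace MODEL_ID with the ID returned when the model was deployed
curl -k -u admin:'<indexer-password>' \
  -X POST "https://localhost:9200/_plugins/_ml/models/MODEL_ID/_predict" \
  -H "Content-Type: application/json" \
  -d '{"parameters": {"prompt": "Reply with OK if you can read this."}}'
```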
“Test _predict and logs first; a healthy connector produces consistent output and clear response codes.”
I operationalize model recommendations by turning them into precise, auditable runbooks. Each playbook lists the actor, the file or directory target, and the exact commands to run. This makes the guidance repeatable and reviewable.
I map model output into an active response module entry that contains clear steps. Each step includes a dry-run command, a curl-based health check for the server, and a final remediation command for the matching file.
I always include an explicit rollback and approval gate for high-impact actions. Low-risk types get automated deletion when YARA matches and the model confidence is high. High-risk changes require human sign-off before the response executes.
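Reduced to its bones, a runbook entry looks something like this; paths and credentials are placeholders, and the approval gate sits between the dry run and the remediation:

```bash
# 1. Dry run: show what would be removed, touch nothing
ls -l "/home/user/suspicious_file" && sha256sum "/home/user/suspicious_file"

# 2. Health check: confirm the manager API is reachable before acting
curl -k -s -o /dev/null -w "%{http_code}\n" -X POST \
  -u wazuh-wui:'<api-password>' "https://localhost:55000/security/user/authenticate"

# 3. Remediation (runs only after approval for high-impact targets)
rm -f "/home/user/suspicious_file"

# 4. Rollback note: restore from quarantine or backup if the match was a false positive
# cp /var/quarantine/suspicious_file /home/user/suspicious_file
```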
I track metrics on log volume, time-to-detect, and time-to-contain to demonstrate the value of each change. The Llama 3 threat_hunter.py app speeds log review while Claude Haiku provides in-dashboard Q&A for analysts.
“Automate small, reversible changes first; widen scope only after you measure consistent, low-risk outcomes.”
Conclusion:
This method ties raw events to specific roles, machines, and commands so work is consistent and traceable. I combined detection, enrichment, and action to produce repeatable runbooks that teams can execute with confidence.
I keep configuration disciplined, schedule periodic Wazuh restart windows, and store each key and secret in a dedicated vault. I log all attempts, approvals, and the resulting entries to satisfy audits.
Users see in-dashboard content and text interactions tied to agent group mappings. I capture machine and information context for every file or directory action. Human oversight remains central: operators validate suggestions, refine prompts, and keep role clarity.
Lightweight post-checks and guardrails limit unintended changes. Next, I expect tighter feedback loops, broader coverage, and steady, measurable improvement across the pipeline.
I aim to convert detection events into clear, repeatable remediation steps. I use large language models to enrich alerts with context, proposed commands, and role-based tasks so analysts can act faster and with confidence.
I want to reduce mean time to resolution and lower cognitive load for security teams. With more telemetry and complex detections, I find automated intelligence helps triage, suggest safe fixes, and maintain audit trails.
I run a server with the manager, an indexer (OpenSearch/Elasticsearch), and the dashboard on Ubuntu. Agents on Ubuntu and Windows report logs and file integrity events. For models I evaluate OpenAI ChatGPT, local Llama 3 via Ollama, and Anthropic Claude via Amazon Bedrock.
On Ubuntu agents I set syscheck directories and enable real-time monitoring for critical paths. On Windows I monitor Users\Downloads and other high-risk folders and ensure permissions allow the agent to read target files. I apply configuration changes centrally and push them to agents.
I inspect archived logs and look for key fields: log, file, name, directory, server, and type. I use the dashboard and CLI tools to filter by agent ID, timestamp, and event type to confirm telemetry completeness before enrichment.
I deploy YARA rules on endpoints, capture detections, and pass metadata to an LLM for enrichment. The pipeline runs an Ubuntu shell helper (yara.sh) or a Windows helper (yara.py → yara.exe) to gather context, call the model API, and take safe actions such as quarantining or alerting.
I install YARA via package manager or build from source, ensure executable permissions, and include required rule metadata. I use curl to fetch rules or valhallaAPI integrations and validate rule format and file permissions before deployment.
My scripts accept parameters like extra_args, output paths, and message payloads. I add robust error handling for timeouts and permission errors, log all attempts, and report stdout/stderr so the manager and dashboard show precise results.
The Windows helper includes API key headers, builds POST requests, and retries safe delete attempts if allowed. I log API responses and increment counters for failed deletions so I can audit automatic remediation actions.
I store keys in secure vaults or use OS-level protected files with strict permissions. I limit key scope, rotate keys regularly, and monitor audit logs for invalid key responses or suspicious usage.
I write decoders to parse events and extract rule_name, description, tags, threat_score, and chatgpt_response. Those fields feed rules and the assistant so the model sees structured context instead of raw text blobs.
I configure rules to trigger on FIM events under /home and Users\Downloads with a level appropriate to the threat score. Higher levels can auto-run the active response module; lower ones generate analyst prompts for review.
When a rule fires, I map it to a module that executes scripts, calls the model, and writes a consistent response object. That object includes parameters, suggested commands, and a severity tag so operators can follow playbook steps.
I structure prompts to include role, endpoint type, detection metadata, and desired output format. I enforce a response schema with fields for steps, commands, rollback, and rationale so outputs are predictable and machine-parsable.
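The schema I enforce looks roughly like the sketch below; the exact field names are illustrative, the point is that every answer comes back machine-parsable:

```json
{
  "severity": "low | medium | high",
  "steps": ["Isolate the host", "Delete the matched file"],
  "commands": ["rm -f /path/to/matched_file"],
  "rollback": "Restore the file from quarantine if the match is a false positive",
  "rationale": "Why these steps fit the detection evidence"
}
```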
I store playbooks and enrichment outputs in a versioned repository or index. I log playbook runs, group actions by agent and incident, and snapshot model outputs to allow audits and repeated testing.
I run Llama 3 via Ollama on the same server to reduce latency and avoid external data exfiltration. I install and pull models, manage resource constraints, and isolate the runtime for security.
I create embeddings from archives.json and store vectors in a FAISS index. My threat_hunter.py loads vectors, executes similarity searches, and formulates queries to the local model for focused hunting sessions.
My chatbot supports commands like /help, /reload, /set days, and /stat. It maintains conversation flow, reloads context from archives, and allows running targeted queries while maintaining audit logs.
I use SSH with key-based auth, restrict sudo where possible, and ensure agents run under least-privilege accounts. I audit access logs and revoke credentials when not needed.
I create an IAM user with restricted policies, enable the model in the targeted region, and register the model group in OpenSearch Assistant. Then I create a connector, deploy the predictor, and test using the _predict endpoint before mapping it to agents in the dashboard.
I enable mlCommons, skills, observability, and the assistant plugin. I create connectors for data sources, register model groups, and deploy them so the assistant can return enriched predictions inside the dashboard.
I register agents in the OpenSearch Assistant, map them to model groups, and refresh the dashboard index. This allows direct agent context to appear in assistant queries and links detections to conversation snippets.
I convert model suggestions into active response steps that include curl-based checks, safe commands, and rollback plans. Each runbook includes preconditions, impact statements, and manual approval gates where needed.
I track metrics like mean time to acknowledge, mean time to remediate, false positive rates, and number of automated vs. manual interventions. These metrics show reduced noise and faster, clearer mitigations over time.