
Export AWS CloudTrail to VictoriaLogs

AWS CloudTrail records every API call made in your account. When a trail is configured to deliver logs to S3, they arrive as compressed JSON files – cheap to store but difficult to search. Exporting CloudTrail events to VictoriaLogs gives you fast full-text search, structured filtering by account or event name, and configurable retention – all without paying for a managed SIEM.

This post walks through setting up the pipeline with Terraform for AWS infrastructure and Vector as the log forwarder.

Architecture

The pipeline has four components:

flowchart LR
    CT["CloudTrail"] --> S3["S3 Bucket"]
    S3 -->|ObjectCreated event| SQS["SQS Queue"]
    SQS --> V["Vector"]
    V --> VL["VictoriaLogs"]

1. CloudTrail writes JSON log files to an S3 bucket.
2. S3 event notifications send a message to an SQS queue whenever a new object is created.
3. Vector polls the SQS queue, downloads the S3 object, parses CloudTrail records, and forwards them.
4. VictoriaLogs ingests the events via its JSON line protocol and stores them with configurable retention.

Using SQS as an intermediary avoids polling S3 directly and provides at-least-once delivery with automatic retries.

Terraform: CloudTrail, S3, and SQS

resource "aws_cloudtrail" "main" {
  name           = "main"
  s3_bucket_name = aws_s3_bucket.cloudtrail.id

  is_multi_region_trail         = true
  include_global_service_events = true
  enable_logging                = true
}

is_multi_region_trail captures API calls from all regions. include_global_service_events adds IAM, STS, and CloudFront events that are only logged in us-east-1.

The S3 bucket needs public access blocked, server-side encryption, and a bucket policy that allows writes only from the CloudTrail service principal.
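As a sketch, assuming the bucket resource is named aws_s3_bucket.cloudtrail, that hardening might look like this (the statement SIDs and prefix follow the standard CloudTrail bucket policy shape):

```hcl
# Block all forms of public access to the CloudTrail bucket.
resource "aws_s3_bucket_public_access_block" "cloudtrail" {
  bucket                  = aws_s3_bucket.cloudtrail.id
  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}

# Allow only the CloudTrail service principal to write log files.
data "aws_iam_policy_document" "cloudtrail_bucket" {
  statement {
    sid       = "AWSCloudTrailAclCheck"
    actions   = ["s3:GetBucketAcl"]
    resources = [aws_s3_bucket.cloudtrail.arn]
    principals {
      type        = "Service"
      identifiers = ["cloudtrail.amazonaws.com"]
    }
  }
  statement {
    sid       = "AWSCloudTrailWrite"
    actions   = ["s3:PutObject"]
    resources = ["${aws_s3_bucket.cloudtrail.arn}/AWSLogs/*"]
    principals {
      type        = "Service"
      identifiers = ["cloudtrail.amazonaws.com"]
    }
    # CloudTrail must write objects owned by the bucket owner.
    condition {
      test     = "StringEquals"
      variable = "s3:x-amz-acl"
      values   = ["bucket-owner-full-control"]
    }
  }
}

resource "aws_s3_bucket_policy" "cloudtrail" {
  bucket = aws_s3_bucket.cloudtrail.id
  policy = data.aws_iam_policy_document.cloudtrail_bucket.json
}
```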

The key piece that connects S3 to Vector is the SQS notification:

resource "aws_sqs_queue" "notifications" {
  name                       = "cloudtrail-notifications"
  message_retention_seconds  = 604800
  visibility_timeout_seconds = 300
  sqs_managed_sse_enabled    = true
}

resource "aws_s3_bucket_notification" "cloudtrail" {
  bucket = aws_s3_bucket.cloudtrail.id

  queue {
    queue_arn = aws_sqs_queue.notifications.arn
    events    = ["s3:ObjectCreated:*"]
  }
}

The queue retains messages for 7 days (604800 seconds), giving Vector enough time to process a backlog after maintenance. The visibility timeout of 300 seconds prevents other consumers from picking up the same message while Vector is downloading and processing the S3 object. Add a queue policy that allows sqs:SendMessage from the S3 service principal, scoped to your bucket with an aws:SourceArn condition so no other bucket can enqueue messages.
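A sketch of that queue policy, reusing the resource names from the snippets above:

```hcl
data "aws_iam_policy_document" "queue" {
  statement {
    actions   = ["sqs:SendMessage"]
    resources = [aws_sqs_queue.notifications.arn]
    principals {
      type        = "Service"
      identifiers = ["s3.amazonaws.com"]
    }
    # Only notifications originating from this bucket are accepted.
    condition {
      test     = "ArnEquals"
      variable = "aws:SourceArn"
      values   = [aws_s3_bucket.cloudtrail.arn]
    }
  }
}

resource "aws_sqs_queue_policy" "notifications" {
  queue_url = aws_sqs_queue.notifications.id
  policy    = data.aws_iam_policy_document.queue.json
}
```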

The host running Vector needs IAM permissions to read from the S3 bucket and consume the SQS queue. On EC2 this is done via an instance profile; outside AWS, use access keys or IAM Roles Anywhere.
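A minimal policy document for that role or user might look like the following sketch (resource names assumed from the snippets above):

```hcl
data "aws_iam_policy_document" "vector" {
  statement {
    # Download the CloudTrail log objects referenced in notifications.
    actions   = ["s3:GetObject"]
    resources = ["${aws_s3_bucket.cloudtrail.arn}/*"]
  }
  statement {
    # Receive notifications, delete them once processed,
    # and read queue attributes for long polling.
    actions = [
      "sqs:ReceiveMessage",
      "sqs:DeleteMessage",
      "sqs:GetQueueAttributes",
    ]
    resources = [aws_sqs_queue.notifications.arn]
  }
}
```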

Vector Pipeline

data_dir: /var/lib/vector

sources:
  cloudtrail_s3:
    type: aws_s3
    # Credentials and region are resolved from the environment
    # (instance profile, AWS_* variables, or shared config files).
    sqs:
      queue_url: "https://sqs.us-east-1.amazonaws.com/<account-id>/<queue-name>"

transforms:
  parse_and_split:
    type: remap
    inputs:
      - cloudtrail_s3
    source: |-
      # Each CloudTrail object is one JSON document: {"Records": [...]}.
      records = parse_json!(.message).Records
      # Setting "." to an array makes remap emit one event per element.
      . = []
      for_each(array!(records)) -> |_index, record| {
        record.log_type = "aws_cloudtrail"
        record.aws_account = record.recipientAccountId
        record._msg = record.eventName
        record.timestamp = record.eventTime
        # Never forward temporary credentials downstream.
        if exists(record.responseElements.credentials.sessionToken) {
          record.responseElements.credentials.sessionToken = "<redacted>"
        }
        . = push(., record)
      }

sinks:
  victorialogs:
    type: http
    inputs:
      - parse_and_split
    uri: "http://127.0.0.1:9428/insert/jsonline"
    encoding:
      codec: json
    framing:
      method: newline_delimited
    compression: gzip
    healthcheck:
      enabled: false
    request:
      headers:
        # Fields that define the VictoriaLogs stream; keep cardinality low.
        VL-Stream-Fields: log_type,aws_account
        # Use the CloudTrail eventTime as the log timestamp.
        VL-Time-Field: timestamp
        AccountID: "0"
        ProjectID: "0"

Querying CloudTrail in VictoriaLogs

Once events are flowing, you can query them with LogsQL, the VictoriaLogs query language.
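Queries can be run through the built-in web UI or, as sketched here, against the HTTP query endpoint (host and port assumed from the sink config above):

```shell
# Count CloudTrail events per event name over the last hour.
curl -s http://127.0.0.1:9428/select/logsql/query \
  -d 'query=log_type:aws_cloudtrail _time:1h | stats by (eventName) count() events | sort by (events) desc'
```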

Find all ConsoleLogin events:

log_type:aws_cloudtrail AND _msg:ConsoleLogin

Find AccessDenied errors:

log_type:aws_cloudtrail AND errorCode:AccessDenied

Find all actions by a specific IAM user:

log_type:aws_cloudtrail AND userIdentity.userName:alice

Find all S3 operations in a specific time range:

log_type:aws_cloudtrail AND eventSource:s3.amazonaws.com AND _time:[2026-02-24T00:00:00Z, 2026-02-24T23:59:59Z]

Filter by AWS account in a multi-account setup:

aws_account:123456789012 AND _msg:RunInstances

These same LogsQL queries can be used with vmalert for anomaly detection – schedule queries like root account usage or AccessDenied spikes and route alerts through Alertmanager to Slack, PagerDuty, or email.
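As an example, a vmalert rule group flagging root account usage might look like the following sketch (the datasource type, interval, and field names here are assumptions to adapt to your setup):

```yaml
groups:
  - name: cloudtrail
    # Evaluate rules against a VictoriaLogs datasource rather than Prometheus.
    type: vlogs
    interval: 1m
    rules:
      - alert: RootAccountUsed
        # Fires when any root API call appears in the last 5 minutes.
        expr: |
          log_type:aws_cloudtrail userIdentity.type:Root _time:5m
          | stats count() events
        annotations:
          summary: "AWS root account activity detected"
```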