Pulumi

Getting started with Pulumi

You can use the Pulumi TypeScript functions below to set up the IAM role, IAM policies, SNS topic, and S3 bucket needed to integrate with Scanner.

You provide the data lake S3 bucket values (dataLakeS3BucketName, dataLakeS3BucketArn, and, if the bucket is encrypted with a KMS key, dataLakeS3BucketKmsKeyArn) to indicate which S3 bucket you want Scanner to index. Scanner will provide the values for:

  • scannerInstanceStsExternalId

  • scannerInstanceAwsAccountId

  • scannerInstanceSqsIndexQueueArn
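
Once Scanner shares these values with you, one way to pass them into your program is through Pulumi stack configuration. The scanner: config namespace and the key names below are hypothetical, not names defined by Scanner:

pulumi config set scanner:stsExternalId <value-provided-by-scanner>
pulumi config set scanner:awsAccountId <value-provided-by-scanner>
pulumi config set scanner:sqsIndexQueueArn <value-provided-by-scanner>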

import * as aws from "@pulumi/aws";
import * as pulumi from "@pulumi/pulumi";

export function setUpScannerInfra(
  envName: string,
  dataLakeS3BucketName: pulumi.Input<string>,
  dataLakeS3BucketArn: pulumi.Input<string>,
  dataLakeS3BucketKmsKeyArn: pulumi.Input<string> | undefined,
  bucketNotificationSnsTopic: aws.sns.Topic,
  scannerInstanceStsExternalId: string,
  scannerInstanceAwsAccountId: string,
  scannerInstanceSqsIndexQueueArn: string,
): { scannerRole: aws.iam.Role } {
  const scannerRole = new aws.iam.Role(`ScannerRole-${envName}`, {
    name: `ScannerRole-${envName}`,
    assumeRolePolicy: aws.iam.getPolicyDocumentOutput({
      statements: [
        {
          actions: ["sts:AssumeRole"],
          effect: "Allow",
          principals: [
            {
              type: "AWS",
              identifiers: [scannerInstanceAwsAccountId],
            },
          ],
          conditions: [
            {
              test: "StringEquals",
              variable: "sts:ExternalId",
              values: [scannerInstanceStsExternalId],
            },
          ],
        },
      ],
    }).json,
  });

  // Set up topic subscription for the scanner instance to receive notifications
  // when new objects are created in the data lake S3 bucket.
  new aws.sns.TopicSubscription(`ScannerTopicSubscription-${envName}`, {
    topic: bucketNotificationSnsTopic,
    protocol: "sqs",
    endpoint: scannerInstanceSqsIndexQueueArn,
  });

  // The new S3 bucket in the customer's account where Scanner will store its
  // index files.
  const scannerIndexFilesS3BucketName = `scanner-index-files-${scannerInstanceStsExternalId}`;
  const scannerIndexFilesS3Bucket = new aws.s3.Bucket(
    `ScannerIndexFilesBucket-${envName}`,
    {
      bucket: scannerIndexFilesS3BucketName,
      acl: "private",
      // To clear out expired data groups:
      lifecycleRules: [
        {
          id: "ExpireTagging",
          enabled: true,
          tags: {
            "Scnr-Lifecycle": "expire",
          },
          expiration: {
            days: 1,
          },
        },
        {
          id: "AbortIncompleteMultiPartUploads",
          enabled: true,
          abortIncompleteMultipartUploadDays: 1,
        },
      ],
      serverSideEncryptionConfiguration: {
        rule: {
          applyServerSideEncryptionByDefault: {
            sseAlgorithm: "aws:kms",
          },
          // Enabling bucket keys gives us 99% savings in KMS request costs. See:
          // https://docs.aws.amazon.com/AmazonS3/latest/userguide/bucket-key.html
          bucketKeyEnabled: true,
        },
      },
    },
    {
      // Since we now allow multiple scanner instances, parameterized by
      // `envName`, we want to keep the original index files bucket for the
      // `dev` env Scanner instance - i.e. we don't want to delete it just
      // because the name changed. Hence, we add an alias to the bucket for
      // just this case.
      aliases:
        envName === "dev" ? [{ name: "ScannerIndexFilesBucket" }] : undefined,
    },
  );

  new aws.s3.BucketPublicAccessBlock(
    `ScannerIndexFilesBucketPublicAccessBlock-${envName}`,
    {
      bucket: scannerIndexFilesS3Bucket.id,
      blockPublicAcls: true,
      blockPublicPolicy: true,
      ignorePublicAcls: true,
      restrictPublicBuckets: true,
    },
  );

  const scannerPolicyStatements = [
    {
      actions: [
        "s3:ListAllMyBuckets",
        "s3:GetBucketLocation",
        "s3:GetBucketTagging",
      ],
      effect: "Allow",
      resources: ["*"],
    },
    {
      actions: [
        "s3:GetBucketEncryption",
        "s3:GetBucketNotification",
        "s3:ListBucket",
        "s3:GetObject",
        "s3:GetObjectTagging",
      ],
      effect: "Allow",
      resources: [
        dataLakeS3BucketArn,
        pulumi.interpolate`${dataLakeS3BucketArn}/*`,
      ],
    },
    {
      actions: [
        "s3:GetLifecycleConfiguration",
        "s3:ListBucket",
        "s3:GetObject",
        "s3:GetObjectTagging",
        "s3:PutObject",
        "s3:PutObjectTagging",
        "s3:DeleteObject",
        "s3:DeleteObjectTagging",
        "s3:DeleteObjectVersion",
        "s3:DeleteObjectVersionTagging",
      ],
      effect: "Allow",
      resources: [
        scannerIndexFilesS3Bucket.arn,
        pulumi.interpolate`${scannerIndexFilesS3Bucket.arn}/*`,
      ],
    },
  ];

  // This statement is only necessary if your `dataLakeS3Bucket` is encrypted with a KMS key.
  if (dataLakeS3BucketKmsKeyArn) {
    scannerPolicyStatements.push({
      actions: ["kms:Decrypt", "kms:DescribeKey"],
      effect: "Allow",
      resources: [dataLakeS3BucketKmsKeyArn],
    });
  }

  const scannerPolicy = new aws.iam.Policy(`ScannerPolicy-${envName}`, {
    description: "Allow ScannerRole to interact with data lake S3 bucket",
    path: "/",
    policy: aws.iam.getPolicyDocumentOutput({
      statements: scannerPolicyStatements,
    }).json,
  });

  new aws.iam.RolePolicyAttachment(`ScannerRpa-${envName}`, {
    role: scannerRole.name,
    policyArn: scannerPolicy.arn,
  });

  return {
    scannerRole,
  };
}

export function setUpSnsTopicForBucket(
  envName: string,
  bucketName: pulumi.Input<string>,
): aws.sns.Topic {
  const snsTopic = new aws.sns.Topic(`SnsTopic-${envName}`);
  const snsTopicPolicy = new aws.sns.TopicPolicy(`SnsTopicPolicy-${envName}`, {
    arn: snsTopic.arn,
    policy: aws.iam.getPolicyDocumentOutput({
      statements: [
        {
          actions: ["sns:Publish"],
          effect: "Allow",
          principals: [
            {
              type: "Service",
              identifiers: ["s3.amazonaws.com"],
            },
          ],
          resources: [snsTopic.arn],
        },
      ],
    }).json,
  });

  new aws.s3.BucketNotification(
    `BucketNotification-${envName}`,
    {
      bucket: bucketName,
      topics: [
        {
          topicArn: snsTopic.arn,
          events: ["s3:ObjectCreated:*"],
        },
      ],
    },
    { dependsOn: [snsTopic, snsTopicPolicy] },
  );

  return snsTopic;
}

You can include these functions in your Pulumi codebase as-is, or use them as a starting point as you update your infrastructure code to support what Scanner needs.
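
For reference, a hypothetical index.ts that wires the two functions together might look like the following. The module path ./scanner, the bucket name my-data-lake-bucket, and the scanner: config keys are placeholder assumptions, not values defined by Scanner:

import * as aws from "@pulumi/aws";
import * as pulumi from "@pulumi/pulumi";
import { setUpScannerInfra, setUpSnsTopicForBucket } from "./scanner";

const config = new pulumi.Config("scanner");
const envName = pulumi.getStack();

// Reference the existing data lake bucket that Scanner should index.
const dataLakeBucket = aws.s3.Bucket.get("DataLakeBucket", "my-data-lake-bucket");

// SNS topic that receives s3:ObjectCreated:* notifications from the bucket.
const snsTopic = setUpSnsTopicForBucket(envName, dataLakeBucket.bucket);

setUpScannerInfra(
  envName,
  dataLakeBucket.bucket,
  dataLakeBucket.arn,
  undefined, // pass your KMS key ARN here if the bucket is KMS-encrypted
  snsTopic,
  config.require("stsExternalId"),
  config.require("awsAccountId"),
  config.require("sqsIndexQueueArn"),
);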

If your S3 buckets are in multiple regions

If the S3 buckets that you want to index are in multiple regions, edit this Pulumi file to do the following:

  • Create one aws.sns.Topic per region.

  • Create an aws.sns.TopicSubscription for each SNS topic, all pointing to the same SQS queue in your Scanner instance.

  • Create an aws.s3.BucketNotification for each S3 bucket, and point it to the SNS topic that is in the same region as the S3 bucket.
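
The steps above can be sketched as a helper that loops over your regions. This is an illustrative sketch, not part of Scanner's published code: the region list, bucket names, and function name are assumptions, and each region gets its own aws.Provider so resources land next to their bucket:

import * as aws from "@pulumi/aws";

export function setUpMultiRegionNotifications(
  envName: string,
  scannerInstanceSqsIndexQueueArn: string,
  // Hypothetical list of data lake buckets, one per region.
  buckets: { name: string; region: aws.Region }[],
) {
  for (const b of buckets) {
    // One provider per region, so each SNS topic is created in the same
    // region as its S3 bucket.
    const provider = new aws.Provider(`aws-${b.region}-${envName}`, {
      region: b.region,
    });

    const topic = new aws.sns.Topic(
      `SnsTopic-${b.region}-${envName}`,
      {},
      { provider },
    );

    const topicPolicy = new aws.sns.TopicPolicy(
      `SnsTopicPolicy-${b.region}-${envName}`,
      {
        arn: topic.arn,
        policy: aws.iam.getPolicyDocumentOutput({
          statements: [
            {
              actions: ["sns:Publish"],
              effect: "Allow",
              principals: [
                { type: "Service", identifiers: ["s3.amazonaws.com"] },
              ],
              resources: [topic.arn],
            },
          ],
        }).json,
      },
      { provider },
    );

    // Every subscription points at the same SQS index queue in your
    // Scanner instance.
    new aws.sns.TopicSubscription(
      `ScannerSub-${b.region}-${envName}`,
      {
        topic,
        protocol: "sqs",
        endpoint: scannerInstanceSqsIndexQueueArn,
      },
      { provider },
    );

    // Each bucket notifies the SNS topic in its own region.
    new aws.s3.BucketNotification(
      `BucketNotification-${b.region}-${envName}`,
      {
        bucket: b.name,
        topics: [{ topicArn: topic.arn, events: ["s3:ObjectCreated:*"] }],
      },
      { provider, dependsOn: [topic, topicPolicy] },
    );
  }
}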
