<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom"><title>The Wombelix Post - Cloud</title><link href="https://dominik.wombacher.cc/" rel="alternate"/><link href="/feeds/category_cloud.atom.xml" rel="self"/><id>https://dominik.wombacher.cc/</id><updated>2025-07-30T00:00:00+02:00</updated><entry><title>Serverless RAG without monthly costs using AWS Bedrock and S3 Vectors</title><link href="https://dominik.wombacher.cc/posts/serverless-rag-without-monthly-costs-using-aws-bedrock-and-s3-vectors.html" rel="alternate"/><published>2025-07-30T00:00:00+02:00</published><updated>2025-07-30T00:00:00+02:00</updated><author><name>Dominik Wombacher</name></author><id>tag:dominik.wombacher.cc,2025-07-30:/posts/serverless-rag-without-monthly-costs-using-aws-bedrock-and-s3-vectors.html</id><summary type="html">&lt;!-- SPDX-FileCopyrightText: 2025 Dominik Wombacher &lt;dominik@wombacher.cc&gt; --&gt;
&lt;!--  --&gt;
&lt;!-- SPDX-License-Identifier: CC-BY-SA-4.0 --&gt;
&lt;p&gt;I was curious if it's possible to build a Retrieval-Augmented
Generation (RAG) system for an AI chatbot in a pure serverless way without
monthly fixed costs. The main goal was  ... &lt;a class="read-more" href="/posts/serverless-rag-without-monthly-costs-using-aws-bedrock-and-s3-vectors.html"&gt; [read more]&lt;/a&gt;&lt;/p&gt;</summary><content type="html">&lt;!-- SPDX-FileCopyrightText: 2025 Dominik Wombacher &lt;dominik@wombacher.cc&gt; --&gt;
&lt;!--  --&gt;
&lt;!-- SPDX-License-Identifier: CC-BY-SA-4.0 --&gt;
&lt;p&gt;I was curious if it's possible to build a Retrieval-Augmented
Generation (RAG) system for an AI chatbot in a pure serverless way without
monthly fixed costs. The main goal was to test some ideas without
investing much in infrastructure upfront. When AWS launched S3 Vectors
on July 15th, it immediately caught my attention because it promised
exactly what I was looking for.&lt;/p&gt;
&lt;p&gt;Part of the
&lt;a class="reference external" href="https://aws.amazon.com/blogs/aws/introducing-amazon-s3-vectors-first-cloud-storage-with-native-vector-support-at-scale/"&gt;S3 Vectors announcement&lt;/a&gt;
was that it can be used as a vector store in AWS Bedrock Knowledge Base
too, which made it even more interesting. This combination could potentially
solve the cost challenge I was facing with traditional vector databases that
come with monthly fees regardless of usage.&lt;/p&gt;
&lt;p&gt;S3 Vectors is currently in preview and available in five regions:
US East (N. Virginia), US East (Ohio), US West (Oregon), EU Central 1
(Frankfurt), and Asia Pacific (Sydney). I picked eu-central-1, which
is closest to me, for my tests and pricing calculations.
I use around 500 markdown files from the
&lt;a class="reference external" href="https://github.com/rancher/rancher-docs"&gt;Rancher Manager Documentation&lt;/a&gt;
as my test dataset. Unlike regular S3 buckets, S3 Vectors doesn't
require a globally unique name since you reference it by ARN, which
means it only needs to be unique within the same account and region.&lt;/p&gt;
&lt;p&gt;Setting up the Bedrock Knowledge Base was overall pretty straightforward.
The wizard guides you through all the steps, though I created the S3 bucket
and S3 Vectors bucket upfront. I had the
&lt;a class="reference external" href="https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base.html"&gt;Bedrock Knowledge Base documentation&lt;/a&gt;
and
&lt;a class="reference external" href="https://docs.aws.amazon.com/AmazonS3/latest/userguide/s3-vectors.html"&gt;S3 Vectors documentation&lt;/a&gt;
open in parallel to learn more and make decisions about parsing strategy, chunking,
vector dimensions and configuration options. The knowledge base, data
source, and vector store all need to be in the same region, which makes
sense from a performance and traffic cost perspective.&lt;/p&gt;
&lt;p&gt;I chose the Amazon Bedrock default parser as my parsing strategy. This
parser claims to work well with various file formats including markdown, HTML,
PDF, and Office documents. The main advantage is that it doesn't incur
additional charges, making it perfect for projects where you want to
keep costs low. Since my content was primarily text-based markdown
files, the default parser seems more than sufficient for this use case.&lt;/p&gt;
&lt;p&gt;For the vector configuration, I used a 1024-dimension index aligned
with the
&lt;a class="reference external" href="https://docs.aws.amazon.com/bedrock/latest/userguide/titan-embedding-models.html"&gt;Amazon Titan Text Embeddings V2&lt;/a&gt;
model. I also went with the default chunking strategy, which splits
content into approximately 300-token chunks while preserving sentence
boundaries. I want to experiment with this another time to see whether
raising the chunk size brings an improvement. The
embedding model can handle up to 8192 tokens.&lt;/p&gt;
&lt;p&gt;The data flow is now: documents from the S3 source bucket pass through
the AWS Bedrock parser, are processed by the embedding model, and are stored
in the vector store. Testing the result and interacting with the synced
data is easy using the
&lt;a class="reference external" href="https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-chatdoc.html"&gt;Chat with your document&lt;/a&gt;
feature in the Amazon Bedrock console. So you don't have to build
your own chatbot interface to see first results.&lt;/p&gt;
&lt;p&gt;Several things became clear during my test setup. The overall
process is simpler than expected, with the wizard handling
resource creation and permissions.
Since S3 Vectors is in preview, it's not yet available in
infrastructure-as-code tools like the AWS Terraform provider. I didn't
check if it's available in CloudFormation though. I would love to
manage the knowledge base and related resources through IaC next time
instead of click-ops in the console.&lt;/p&gt;
&lt;p&gt;The cost aspect, which was my primary motivation, exceeded expectations.
With S3 Vectors, we're talking about the cost of a cup of coffee when
testing or building a proof of concept. This opens up possibilities for
builders who want to experiment with ideas without upfront investment.&lt;/p&gt;
&lt;p&gt;One challenge I encountered was with permissions. The Knowledge Base
wizard is strict and defaults to claiming ownership of the entire
bucket, at least from an IAM role perspective. The managed roles don't
cover individual subfolders / prefixes someone might create when uploading data.
Manual IAM adjustments work but can cause issues when editing the
knowledge base later. There's likely a better way to configure this
properly from the start. I have to figure out if the answer is
really one source S3 bucket per knowledge base or if there's a better
way to leverage prefixes without the clunky IAM role and policy
handling.&lt;/p&gt;
&lt;p&gt;I'm also curious about further reducing costs for the S3 data source,
which could grow over time.
&lt;a class="reference external" href="https://docs.aws.amazon.com/AmazonS3/latest/userguide/intelligent-tiering-overview.html"&gt;S3 Intelligent-Tiering&lt;/a&gt;
might help reduce ongoing storage costs. The knowledge base index can be
recreated from the source data in the S3 bucket, though that incurs
embedding model costs again.&lt;/p&gt;
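&lt;p&gt;As a rough idea of how that could look once the source bucket is
managed through IaC, here is a minimal sketch of a lifecycle rule that
moves objects to Intelligent-Tiering. The bucket name and resource label
are hypothetical, and this is an illustration, not something I tested in
this setup. Keep in mind that objects smaller than 128 KB are not eligible
for automatic tiering, so the effect on small markdown files may be limited.&lt;/p&gt;
&lt;pre class="code text literal-block"&gt;
# Sketch: transition objects of the knowledge base source bucket
# to S3 Intelligent-Tiering right after upload.
resource &amp;quot;aws_s3_bucket_lifecycle_configuration&amp;quot; &amp;quot;kb_source&amp;quot; {
  bucket = &amp;quot;my-knowledge-base-source&amp;quot; # hypothetical bucket name

  rule {
    id     = &amp;quot;intelligent-tiering&amp;quot;
    status = &amp;quot;Enabled&amp;quot;

    filter {} # empty filter: rule applies to all objects

    transition {
      days          = 0
      storage_class = &amp;quot;INTELLIGENT_TIERING&amp;quot;
    }
  }
}

&lt;/pre&gt;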
&lt;p&gt;Embedding tokens are another cost factor, though not a major one. My
test with approximately 5MB of Rancher documentation in markdown format
resulted in about 1 million tokens, translating to roughly $0.02 USD
with Amazon Titan Text Embeddings V2, which doesn't seem like much.
Using batch processing outside of Bedrock could potentially cut costs
in half, but batch processing doesn't work in the context of Bedrock
Knowledge Base. The knowledge base is convenient to use, but if you
build your own stack instead, then batching can become a cost saver,
especially at a higher scale beyond simple testing.&lt;/p&gt;
&lt;p&gt;I ran into an issue with metadata handling. Using S3 Vectors as vector
storage with Bedrock Knowledge Base requires adding the Bedrock-related
metadata keys as non-filterable, otherwise the sync fails. By default,
all metadata keys in S3 Vectors are considered filterable, but Bedrock's
metadata exceeds the 2KB limit for filterable metadata. I found the
solution in the
&lt;a class="reference external" href="https://docs.aws.amazon.com/AmazonS3/latest/userguide/s3-vectors.html"&gt;S3 Vectors documentation&lt;/a&gt;
and this
&lt;a class="reference external" href="https://repost.aws/questions/QUWezLMjc0S8GOiaa3jOOKGQ/s3-vector-big-metadata-error"&gt;AWS re:Post discussion&lt;/a&gt;:
create the vector index with &lt;code&gt;AMAZON_BEDROCK_TEXT&lt;/code&gt; and
&lt;code&gt;AMAZON_BEDROCK_METADATA&lt;/code&gt; as non-filterable metadata keys.&lt;/p&gt;
&lt;p&gt;Additional (slight) costs occur when interacting with the knowledge base,
because the input and output tokens of the LLM used through AWS Bedrock
are billed as well. For example, expect something between $0.05 and $0.10 USD
per 1 million tokens when using one of the
&lt;a class="reference external" href="https://aws.amazon.com/bedrock/pricing/"&gt;Amazon Nova Models&lt;/a&gt;.
Based on the S3 Vectors pricing for &lt;code&gt;eu-central-1&lt;/code&gt;,
storage costs $0.064 per GB per month, PUT requests cost $0.214 per GB,
and query requests are $0.0027 per 1,000 requests. For small datasets
like my 5MB test, these costs remain minimal.&lt;/p&gt;
&lt;p&gt;With my tests I achieved what I wanted. Vector storage
was previously the component with higher monthly costs in Bedrock
Knowledge Base, but S3 Vectors makes this purely pay-per-use. Now
we're talking about costs comparable to a cup of coffee for testing and
building MVPs, just perfect for experimentation and idea validation.
Even though I didn't run the numbers, I wouldn't be surprised if this
stays very interesting and cost efficient up to a certain scale.&lt;/p&gt;
</content><category term="Cloud"/><category term="AWS"/><category term="Bedrock"/><category term="S3"/><category term="Vectors"/><category term="RAG"/><category term="AI"/></entry><entry><title>OpenTofu State and Plan Encryption with AWS KMS</title><link href="https://dominik.wombacher.cc/posts/opentofu-state-and-plan-encryption-with-aws-kms.html" rel="alternate"/><published>2024-12-22T00:00:00+01:00</published><updated>2024-12-22T00:00:00+01:00</updated><author><name>Dominik Wombacher</name></author><id>tag:dominik.wombacher.cc,2024-12-22:/posts/opentofu-state-and-plan-encryption-with-aws-kms.html</id><summary type="html">&lt;!-- SPDX-FileCopyrightText: 2024 Dominik Wombacher &lt;dominik@wombacher.cc&gt; --&gt;
&lt;!--  --&gt;
&lt;!-- SPDX-License-Identifier: CC-BY-SA-4.0 --&gt;
&lt;p&gt;It's been a while since &lt;a class="reference external" href="https://opentofu.org/blog/opentofu-1-7-0/"&gt;OpenTofu 1.7.0&lt;/a&gt;
(Archive: &lt;a class="reference external" href="https://archive.today/2024.04.30-155242/https://opentofu.org/blog/opentofu-1-7-0/"&gt;[1]&lt;/a&gt;,
&lt;a class="reference external" href="https://web.archive.org/web/20250113215433/https://opentofu.org/blog/opentofu-1-7-0/"&gt;[2]&lt;/a&gt;)
introduced an exciting new feature, the
&lt;a class="reference external" href="https://opentofu.org/docs/language/state/encryption/"&gt;State and Plan Encryption&lt;/a&gt;
(Archive: &lt;a class="reference external" href="https://web.archive.org/web/20241215184404/https://opentofu.org/docs/language/state/encryption/"&gt;[1]&lt;/a&gt;,
&lt;a class="reference external" href="https://archive.today/2025.01.13-221839/https://opentofu.org/docs/language/state/encryption/"&gt;[2]&lt;/a&gt;)
for files at rest, for  ... &lt;a class="read-more" href="/posts/opentofu-state-and-plan-encryption-with-aws-kms.html"&gt; [read more]&lt;/a&gt;&lt;/p&gt;</summary><content type="html">&lt;!-- SPDX-FileCopyrightText: 2024 Dominik Wombacher &lt;dominik@wombacher.cc&gt; --&gt;
&lt;!--  --&gt;
&lt;!-- SPDX-License-Identifier: CC-BY-SA-4.0 --&gt;
&lt;p&gt;It's been a while since &lt;a class="reference external" href="https://opentofu.org/blog/opentofu-1-7-0/"&gt;OpenTofu 1.7.0&lt;/a&gt;
(Archive: &lt;a class="reference external" href="https://archive.today/2024.04.30-155242/https://opentofu.org/blog/opentofu-1-7-0/"&gt;[1]&lt;/a&gt;,
&lt;a class="reference external" href="https://web.archive.org/web/20250113215433/https://opentofu.org/blog/opentofu-1-7-0/"&gt;[2]&lt;/a&gt;)
introduced an exciting new feature, the
&lt;a class="reference external" href="https://opentofu.org/docs/language/state/encryption/"&gt;State and Plan Encryption&lt;/a&gt;
(Archive: &lt;a class="reference external" href="https://web.archive.org/web/20241215184404/https://opentofu.org/docs/language/state/encryption/"&gt;[1]&lt;/a&gt;,
&lt;a class="reference external" href="https://archive.today/2025.01.13-221839/https://opentofu.org/docs/language/state/encryption/"&gt;[2]&lt;/a&gt;)
for files at rest, for local storage and remote backends.
I always found it challenging to keep &lt;code&gt;.tfstate&lt;/code&gt; files secure.
Now I can use &lt;a class="reference external" href="https://aws.amazon.com/kms/"&gt;AWS Key Management Service (AWS KMS)&lt;/a&gt;
with a customer managed KMS key to encrypt the state before it's uploaded to
&lt;a class="reference external" href="https://aws.amazon.com/s3/"&gt;Amazon Simple Storage Service (Amazon S3)&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;I decided to use CloudFormation to create a
&lt;a class="reference external" href="https://git.sr.ht/~wombelix/aws-sideprojects-infrastructure/tree/main/item/cfn/kms-key-backend-encryption.yaml"&gt;KMS key&lt;/a&gt;
in &lt;code&gt;eu-central-1&lt;/code&gt;, Europe (Frankfurt) and a
&lt;a class="reference external" href="https://git.sr.ht/~wombelix/aws-sideprojects-infrastructure/tree/main/item/cfn/kms-key-backend-encryption-replica.yaml"&gt;replica&lt;/a&gt;
in &lt;code&gt;eu-west-1&lt;/code&gt;, Europe (Ireland) for backup purposes. Other AWS resources I've created:
&lt;a class="reference external" href="https://git.sr.ht/~wombelix/aws-sideprojects-infrastructure/tree/main/item/cfn/iac-opentofu.yaml"&gt;IAM User, IAM Roles, IAM Policies, S3 Bucket, DynamoDB Table and AWS Lambda function.&lt;/a&gt;.
The Lambda function is based on
&lt;a class="reference external" href="https://git.sr.ht/~wombelix/cfn-custom-resource-aws-ssm-securestring"&gt;CloudFormation Custom Resource AWS SSM Parameter Store SecureString&lt;/a&gt;
and is necessary because of
&lt;a class="reference external" href="https://dominik.wombacher.cc/posts/aws-cloudformation-and-cdk-doesnt-support-aws-ssm-parameter-store-securestring.html"&gt;AWS CloudFormation and CDK doesn't support AWS SSM Parameter Store SecureString?!&lt;/a&gt;.
Access and Secret Key for the IAM User are then auto-generated by CloudFormation and stored in AWS SSM Parameter Store.&lt;/p&gt;
&lt;p&gt;When the AWS resources are ready, an example config that uses state encryption and an S3-based remote backend looks like this:&lt;/p&gt;
&lt;pre class="code text literal-block"&gt;
# SPDX-FileCopyrightText: 2024 Dominik Wombacher &amp;lt;dominik&amp;#64;wombacher.cc&amp;gt;
#
# SPDX-License-Identifier: MIT

terraform {
  required_version = &amp;quot;&amp;gt;= 1.8&amp;quot;
  encryption {
    key_provider &amp;quot;aws_kms&amp;quot; &amp;quot;wombelix-sideprojects&amp;quot; {
      kms_key_id = &amp;quot;arn:${var.aws_partition}:kms:${var.aws_region}:${var.aws_account_id}:key/${var.aws_kms_name}&amp;quot;
      region     = var.aws_region
      key_spec   = &amp;quot;AES_256&amp;quot;
      assume_role = {
        role_arn = &amp;quot;arn:${var.aws_partition}:iam::${var.aws_account_id}:role/OpenTofuStateEncryptionRole&amp;quot;
      }
    }
    method &amp;quot;aes_gcm&amp;quot; &amp;quot;wombelix-sideprojects&amp;quot; {
      keys = key_provider.aws_kms.wombelix-sideprojects
    }
    state {
      method = method.aes_gcm.wombelix-sideprojects
    }
  }
  backend &amp;quot;s3&amp;quot; {
    bucket                  = var.aws_s3_bucket
    key                     = &amp;quot;opentofu-states/${var.project}/terraform.tfstate&amp;quot;
    region                  = var.aws_region
    skip_metadata_api_check = true
    encrypt                 = true
    kms_key_id              = &amp;quot;arn:${var.aws_partition}:kms:${var.aws_region}:${var.aws_account_id}:key/${var.aws_kms_name}&amp;quot;
    dynamodb_table          = &amp;quot;arn:${var.aws_partition}:dynamodb:${var.aws_region}:${var.aws_account_id}:table/iac-opentofu-remote-backend&amp;quot;
    assume_role = {
      role_arn = &amp;quot;arn:${var.aws_partition}:iam::${var.aws_account_id}:role/OpenTofuRemoteBackendRole&amp;quot;
    }
  }
}

&lt;/pre&gt;
&lt;p&gt;For the above example config to interact with AWS, the following environment variables have to be set:&lt;/p&gt;
&lt;pre class="code text literal-block"&gt;
AWS_REGION
AWS_ACCESS_KEY_ID
AWS_SECRET_ACCESS_KEY
TF_VAR_aws_region
TF_VAR_aws_account_id
TF_VAR_aws_kms_name
TF_VAR_aws_s3_bucket
TF_VAR_project

&lt;/pre&gt;
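&lt;p&gt;The &lt;code&gt;TF_VAR_*&lt;/code&gt; entries map to input variables that the example config references.
A minimal sketch of matching declarations could look like the following. The default for &lt;code&gt;aws_partition&lt;/code&gt;
is an assumption, based on the fact that it doesn't appear in the list above, and the descriptions are illustrative only.&lt;/p&gt;
&lt;pre class="code text literal-block"&gt;
# Hypothetical variables.tf matching the example config.
# Values are provided through the TF_VAR_* environment variables.

variable &amp;quot;aws_partition&amp;quot; {
  description = &amp;quot;AWS partition used to build ARNs&amp;quot;
  type        = string
  default     = &amp;quot;aws&amp;quot; # assumption: not part of the environment variable list
}

variable &amp;quot;aws_region&amp;quot; {
  description = &amp;quot;Region of the KMS key, S3 bucket and DynamoDB table&amp;quot;
  type        = string
}

variable &amp;quot;aws_account_id&amp;quot; {
  description = &amp;quot;Account ID used to build ARNs&amp;quot;
  type        = string
}

variable &amp;quot;aws_kms_name&amp;quot; {
  description = &amp;quot;Key ID part of the KMS key ARN&amp;quot;
  type        = string
}

variable &amp;quot;aws_s3_bucket&amp;quot; {
  description = &amp;quot;Name of the S3 bucket that holds the state files&amp;quot;
  type        = string
}

variable &amp;quot;project&amp;quot; {
  description = &amp;quot;Project name, used in the S3 key of the state file&amp;quot;
  type        = string
}

&lt;/pre&gt;
&lt;p&gt;Because these variables are used inside the &lt;code&gt;backend&lt;/code&gt; and &lt;code&gt;encryption&lt;/code&gt; blocks,
they have to be resolvable without any provider or resource data. As far as I understand, that early evaluation
is what OpenTofu 1.8 added, which fits the &lt;code&gt;required_version&lt;/code&gt; in the example.&lt;/p&gt;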
&lt;p&gt;After bootstrapping with CloudFormation, all subsequent IaC can be implemented with OpenTofu.
Each project gets its own unique S3 key in the form of &lt;code&gt;opentofu-states/&amp;lt;PROJECT&amp;gt;/terraform.tfstate&lt;/code&gt;.
That's all the customization it needs, which makes the solution basically maintenance free.&lt;/p&gt;
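&lt;p&gt;The example config only enables encryption for the state. Saved plan files can be covered the same way
by adding a &lt;code&gt;plan&lt;/code&gt; block next to the &lt;code&gt;state&lt;/code&gt; block. The following is a minimal sketch that reuses
the key provider and method from the example. It is an illustration and not taken from the linked repository:&lt;/p&gt;
&lt;pre class="code text literal-block"&gt;
terraform {
  encryption {
    # Key provider and method, same as in the example config
    key_provider &amp;quot;aws_kms&amp;quot; &amp;quot;wombelix-sideprojects&amp;quot; {
      kms_key_id = &amp;quot;arn:${var.aws_partition}:kms:${var.aws_region}:${var.aws_account_id}:key/${var.aws_kms_name}&amp;quot;
      region     = var.aws_region
      key_spec   = &amp;quot;AES_256&amp;quot;
    }
    method &amp;quot;aes_gcm&amp;quot; &amp;quot;wombelix-sideprojects&amp;quot; {
      keys = key_provider.aws_kms.wombelix-sideprojects
    }

    state {
      method = method.aes_gcm.wombelix-sideprojects
    }
    # Sketch: also encrypt plan files written with &amp;quot;tofu plan -out=...&amp;quot;
    plan {
      method = method.aes_gcm.wombelix-sideprojects
    }
  }
}

&lt;/pre&gt;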
&lt;p&gt;I like that the content of the state file is always encrypted. I also encrypt the S3 Bucket with the customer managed key.
When I use OpenTofu on my local system or in sr.ht builds, I rely on long-term Access and Secret key credentials.
But the IAM User has no permissions directly assigned and can only assume a specific Role that allows access to KMS, S3 and DynamoDB.
&lt;code&gt;s3:DeleteObject&lt;/code&gt; is explicitly set to &lt;code&gt;Deny&lt;/code&gt; and versioning is enabled on the S3 Bucket.
The potential attack surface is therefore very limited. In the future I plan to avoid any usage of long-term credentials.&lt;/p&gt;
&lt;p&gt;But for now the setup is already pretty decent and secure.&lt;/p&gt;
</content><category term="Cloud"/><category term="OpenTofu"/><category term="Amazon"/><category term="AWS"/><category term="KMS"/><category term="S3"/><category term="DynamoDB"/></entry><entry><title>AWS CloudFormation and CDK doesn't support AWS SSM Parameter Store SecureString?!</title><link href="https://dominik.wombacher.cc/posts/aws-cloudformation-and-cdk-doesnt-support-aws-ssm-parameter-store-securestring.html" rel="alternate"/><published>2024-05-12T00:00:00+02:00</published><updated>2024-07-17T00:00:00+02:00</updated><author><name>Dominik Wombacher</name></author><id>tag:dominik.wombacher.cc,2024-05-12:/posts/aws-cloudformation-and-cdk-doesnt-support-aws-ssm-parameter-store-securestring.html</id><summary type="html">&lt;!-- SPDX-FileCopyrightText: 2024 Dominik Wombacher &lt;dominik@wombacher.cc&gt; --&gt;
&lt;!--  --&gt;
&lt;!-- SPDX-License-Identifier: CC-BY-SA-4.0 --&gt;
&lt;p&gt;I recently started to set up some resources on AWS for my side projects.
For starters an AWS KMS key so I can encrypt data on S3 and in the  ... &lt;a class="read-more" href="/posts/aws-cloudformation-and-cdk-doesnt-support-aws-ssm-parameter-store-securestring.html"&gt; [read more]&lt;/a&gt;&lt;/p&gt;</summary><content type="html">&lt;!-- SPDX-FileCopyrightText: 2024 Dominik Wombacher &lt;dominik@wombacher.cc&gt; --&gt;
&lt;!--  --&gt;
&lt;!-- SPDX-License-Identifier: CC-BY-SA-4.0 --&gt;
&lt;p&gt;I recently started to set up some resources on AWS for my side projects.
For starters an AWS KMS key so I can encrypt data on S3 and in the AWS SSM Parameter Store.
To use S3 and DynamoDB as backend and perform end-to-end state encryption for OpenTofu,
I also needed an IAM User. So the idea was to write a CloudFormation template that
creates all these resources for me and then use it to deploy other Infrastructure as Code via OpenTofu.
I'm not a huge fan of IAM Users and access keys, but in this case it's good enough to get started.&lt;/p&gt;
&lt;p&gt;What I wanted: The generated access and secret key are stored in AWS SSM Parameter Store.
That way I don't have to deal with clear text credentials in CloudFormation.&lt;/p&gt;
&lt;p&gt;SSM Parameter Store can save Strings and SecureStrings. As the name implies, a SecureString
is encrypted via AWS KMS before it is put into SSM Parameter Store. But then I learned that neither Cfn nor CDK
supports it. They can only write clear text Strings to the Parameter Store. What a bummer and pretty unexpected.&lt;/p&gt;
&lt;p&gt;So after some research, a Cfn CustomResource is what I need. It's basically a Lambda function
that receives a Create/Update/Delete request from Cfn, performs an action and sends the result back to the Stack.
It took me a bit to get something together but now it works like a charm.&lt;/p&gt;
&lt;p&gt;I'm still a bit disappointed that such a common feature isn't supported. The argument is mostly
that Cfn and CDK are not supposed to deal with secrets. I can understand that, but putting some
data that was generated during a Cfn run into the Parameter Store can't be that unusual.&lt;/p&gt;
&lt;p&gt;I published my Lambda Function to interact with AWS SSM Parameter Store SecureString under MIT:
&lt;a class="reference external" href="https://git.sr.ht/~wombelix/cfn-custom-resource-aws-ssm-securestring"&gt;https://git.sr.ht/~wombelix/cfn-custom-resource-aws-ssm-securestring&lt;/a&gt;&lt;/p&gt;
</content><category term="Cloud"/><category term="AWS"/><category term="SSM"/><category term="CloudFormation"/><category term="CDK"/><category term="Lambda"/></entry></feed>