Using managed services on AWS allows you to create powerful applications with minimal provisioning. This example demonstrates how to use AWS Lambda to build a sitemap of files in an S3 bucket using an event-driven approach. The process is automated with Terraform for easy deployment and cleanup.
Key Components:
- AWS Lambda – Serverless compute to process S3 events.
- Amazon S3 – Storage for files and generated sitemap.
- OpenGraph Protocol – Defines metadata for web content.
- Terraform – Infrastructure as Code (IaC) for deployment.
You Should Know:
1. Terraform Setup for AWS Lambda & S3
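provider "aws" {
  region = "us-east-1"
}

# aws_iam_role.lambda_role is referenced below but not defined in this snippet.
# It must be declared elsewhere, trusting lambda.amazonaws.com and allowing
# read/write access to the bucket.
resource "aws_lambda_function" "sitemap_generator" {
  filename      = "lambda_function.zip"
  function_name = "s3_sitemap_generator"
  role          = aws_iam_role.lambda_role.arn
  handler       = "lambda_function.lambda_handler"
  runtime       = "python3.8" # past end of support on Lambda; a newer runtime such as python3.12 may be needed
}

resource "aws_s3_bucket" "data_bucket" {
  bucket = "my-sitemap-bucket"
  acl    = "private" # deprecated in AWS provider v4+, which uses aws_s3_bucket_acl instead
}

# Allow S3 to invoke the function for events from this bucket
resource "aws_lambda_permission" "allow_s3" {
  statement_id  = "AllowExecutionFromS3"
  action        = "lambda:InvokeFunction"
  function_name = aws_lambda_function.sitemap_generator.function_name
  principal     = "s3.amazonaws.com"
  source_arn    = aws_s3_bucket.data_bucket.arn
}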
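The configuration above expects a `lambda_function.zip` archive containing a `lambda_function.py` module, matching the handler string `lambda_function.lambda_handler`. A minimal sketch of building that archive with Python's standard library follows. The helper name `build_lambda_zip` is hypothetical, and third-party dependencies such as `opengraph` would still need to be bundled into the archive or supplied as a Lambda layer.

```python
import zipfile
from pathlib import Path


def build_lambda_zip(source_file: str, zip_path: str = "lambda_function.zip") -> str:
    """Package a single handler module into a deployment archive.

    The file is stored at the archive root so that the handler string
    "lambda_function.lambda_handler" resolves to it.
    """
    source = Path(source_file)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        # arcname drops any leading directories from the stored path
        archive.write(source, arcname=source.name)
    return zip_path
```

Calling `build_lambda_zip("lambda_function.py")` in the Terraform working directory before `terraform apply` would produce the file the `filename` argument points to.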
2. Python Lambda Function for Sitemap Generation
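import urllib.parse

import boto3
from opengraph import OpenGraph

s3 = boto3.client('s3')
SITEMAP_KEY = "sitemap.xml"


def lambda_handler(event, context):
    for record in event['Records']:
        bucket = record['s3']['bucket']['name']
        # Object keys arrive URL-encoded in S3 event notifications
        key = urllib.parse.unquote_plus(record['s3']['object']['key'])
        # Skip the sitemap itself so that writing it does not re-trigger processing
        if key == SITEMAP_KEY:
            continue

        # Fetch OpenGraph metadata
        obj = s3.get_object(Bucket=bucket, Key=key)
        html_content = obj['Body'].read().decode('utf-8')
        og_data = OpenGraph(html=html_content)

        # Generate sitemap entry
        sitemap_entry = f"<url><loc>{og_data.url}</loc><title>{og_data.title}</title></url>"

        # Append to sitemap.xml (put_object overwrites, so read the existing content first)
        try:
            existing = s3.get_object(Bucket=bucket, Key=SITEMAP_KEY)['Body'].read().decode('utf-8')
        except s3.exceptions.NoSuchKey:
            existing = ""
        s3.put_object(
            Bucket=bucket,
            Key=SITEMAP_KEY,
            Body=existing + sitemap_entry + "\n",
            ContentType="application/xml"
        )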
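The handler relies on the third-party `opengraph` package to read `og:` tags. To illustrate what that step extracts, here is a minimal sketch using only the standard library's `html.parser`. The names `OGParser` and `extract_og` are hypothetical, not part of any library, and the sketch only handles `<meta property="og:..." content="...">` tags.

```python
from html.parser import HTMLParser


class OGParser(HTMLParser):
    """Collects <meta property="og:*" content="..."> pairs."""

    def __init__(self):
        super().__init__()
        self.og = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = dict(attrs)
        prop = attr_map.get("property") or ""
        content = attr_map.get("content")
        if prop.startswith("og:") and content is not None:
            # Store "og:title" under the key "title"
            self.og[prop[3:]] = content


def extract_og(html: str) -> dict:
    """Return the OpenGraph properties found in an HTML string."""
    parser = OGParser()
    parser.feed(html)
    return parser.og


sample = """<html><head>
<meta property="og:title" content="Example Page" />
<meta property="og:url" content="https://example.com/page.html" />
</head><body></body></html>"""

print(extract_og(sample))
```

A page without `og:url` or `og:title` yields a dict missing those keys, which is the case the handler would need to guard against.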
3. Triggering Lambda on S3 Upload
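resource "aws_s3_bucket_notification" "bucket_notification" {
  bucket = aws_s3_bucket.data_bucket.id

  lambda_function {
    lambda_function_arn = aws_lambda_function.sitemap_generator.arn
    events              = ["s3:ObjectCreated:*"]
  }

  # The invoke permission must exist before S3 will accept the notification
  depends_on = [aws_lambda_permission.allow_s3]
}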
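When the notification fires, Lambda receives an event whose `Records` list carries the bucket name and a URL-encoded object key. A minimal sketch for exercising that parsing locally follows, assuming a trimmed event with only the fields the handler reads. The sample bucket and key values are illustrative, and `extract_targets` is a hypothetical helper.

```python
from urllib.parse import unquote_plus

# Trimmed S3 "ObjectCreated" event: only the fields the handler reads.
sample_event = {
    "Records": [
        {
            "eventName": "ObjectCreated:Put",
            "s3": {
                "bucket": {"name": "my-sitemap-bucket"},
                "object": {"key": "pages/hello+world.html"},
            },
        }
    ]
}


def extract_targets(event: dict) -> list:
    """Return (bucket, key) pairs with the keys URL-decoded."""
    targets = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Spaces in keys arrive as "+"; other characters are percent-encoded
        key = unquote_plus(record["s3"]["object"]["key"])
        targets.append((bucket, key))
    return targets


print(extract_targets(sample_event))
```

Passing a dict like `sample_event` straight to `lambda_handler` is a quick way to test the function before deploying, provided the S3 client calls are stubbed or pointed at a real bucket.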
4. Deploying with Terraform
terraform init
terraform plan
terraform apply -auto-approve
5. Destroying Resources
terraform destroy -auto-approve
What Undercode Say:
This approach uses AWS serverless services to process files as they arrive, with no servers to manage. Terraform makes the deployment reproducible, and the OpenGraph tags supply the URL and title for each sitemap entry. Possible improvements:
- Add error handling in Lambda for malformed HTML.
- Use DynamoDB to track processed files.
- Enable CloudWatch Logs for debugging.
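The error-handling and logging suggestions can be sketched together. The example below assumes the OpenGraph fields have already been parsed into a plain dict, and `build_entry` is a hypothetical helper. It covers pages whose HTML lacks the expected tags by returning `None` instead of raising, and it logs through the standard `logging` module, whose output Lambda forwards to CloudWatch Logs when the execution role permits it.

```python
import logging
from typing import Optional
from xml.sax.saxutils import escape

logger = logging.getLogger(__name__)
logger.setLevel(logging.INFO)


def build_entry(key: str, og: dict) -> Optional[str]:
    """Return a sitemap entry, or None if required OpenGraph tags are missing."""
    url = og.get("url")
    title = og.get("title")
    if not url or not title:
        logger.warning("Skipping %s: missing og:url or og:title", key)
        return None
    # Escape &, < and > so page metadata cannot break the XML
    entry = f"<url><loc>{escape(url)}</loc><title>{escape(title)}</title></url>"
    logger.info("Built sitemap entry for %s", key)
    return entry
```

Returning `None` lets the handler skip a bad file and continue with the remaining records rather than failing the whole invocation.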
Expected Output:
A dynamically updated `sitemap.xml` in your S3 bucket, listing all processed files with OpenGraph metadata.
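For crawlers to accept the file as a sitemap, the entries need to sit inside a `<urlset>` root in the sitemaps.org namespace. A minimal sketch of rendering such a document from a list of URLs with `xml.etree.ElementTree` follows. The function name `render_sitemap` is hypothetical. The sitemap protocol defines `<loc>` but no `<title>` element, so titles are omitted here.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def render_sitemap(urls: list) -> str:
    """Render a sitemap document with one <url><loc> element per URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        url_el = ET.SubElement(urlset, "url")
        # ElementTree escapes special characters in text on serialization
        ET.SubElement(url_el, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


print(render_sitemap(["https://example.com/a.html", "https://example.com/b.html"]))
```

Rebuilding the whole document from a stored list of URLs on each invocation avoids having to splice new entries into an existing XML string.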
Prediction:
As serverless adoption grows, more enterprises will shift to event-driven architectures for real-time data processing, reducing operational overhead.
Reference: AWS S3 & Lambda Example
Reported By: Darryl Ruggles – Hackers Feeds