Building AI-Ready Infrastructure: Terraform Modules for GraphRAG on AWS

RAG (Retrieval-Augmented Generation) has become the go-to pattern for grounding LLMs with enterprise data. Ask a question, retrieve relevant documents, generate an answer based on those documents. It works remarkably well for many use cases.
But traditional vector-only RAG has limitations. It struggles with relationships between entities, loses context across document boundaries, and can't handle multi-hop reasoning where answering one question requires answering several intermediate questions first.
Enter GraphRAG: combining knowledge graphs with vector search for smarter retrieval. Instead of just finding similar text, you can traverse relationships. "Who reported to the CEO in Q3?" becomes answerable because you have entity relationships, not just document embeddings.
The infrastructure for GraphRAG is more complex than basic RAG - you need both a graph database and a vector store. But Terraform makes it manageable and repeatable.
Let me walk you through building production-ready GraphRAG infrastructure on AWS.
______
GraphRAG Architecture Overview

- Neptune stores entities and relationships as a knowledge graph. Think of it as the "who knows whom" and "what relates to what" layer.
- OpenSearch handles vector embeddings for semantic search. This is the "find similar content" layer.
Together, they enable hybrid retrieval - combining graph traversal with similarity search. The graph tells you what's connected; the vectors tell you what's similar.
______
Module Structure
modules/
├── graphrag/
│   ├── main.tf
│   ├── variables.tf
│   ├── outputs.tf
│   ├── neptune.tf
│   ├── opensearch.tf
│   ├── s3.tf
│   ├── lambda.tf
│   └── iam.tf
______
Core Infrastructure Module
Variables Definition
# modules/graphrag/variables.tf
variable "environment" {
  description = "Environment name (dev, staging, prod)"
  type        = string
}

variable "project_name" {
  description = "Project identifier"
  type        = string
}

variable "vpc_id" {
  description = "VPC ID for deploying resources"
  type        = string
}

variable "private_subnet_ids" {
  description = "Private subnet IDs"
  type        = list(string)
}

variable "neptune_instance_class" {
  description = "Neptune instance type"
  type        = string
  default     = "db.r5.large"
}

variable "opensearch_instance_type" {
  description = "OpenSearch instance type"
  type        = string
  default     = "r6g.large.search"
}

variable "opensearch_volume_size" {
  description = "OpenSearch EBS volume size in GB"
  type        = number
  default     = 100
}
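
One addition worth considering (not in the module as written): since environment gates instance counts, multi-AZ, and snapshot behavior throughout the module, a validation block turns a typo into a plan-time error instead of a single-instance "prod" cluster:

variable "environment" {
  description = "Environment name (dev, staging, prod)"
  type        = string

  validation {
    condition     = contains(["dev", "staging", "prod"], var.environment)
    error_message = "environment must be one of: dev, staging, prod."
  }
}
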
Neptune Graph Database
# modules/graphrag/neptune.tf
resource "aws_neptune_cluster" "graph" {
  cluster_identifier                  = "${var.project_name}-${var.environment}"
  engine                              = "neptune"
  engine_version                      = "1.3.1.0"
  backup_retention_period             = 7
  preferred_backup_window             = "02:00-03:00"
  skip_final_snapshot                 = var.environment != "prod"
  # Required when skip_final_snapshot is false (prod), or destroy will fail
  final_snapshot_identifier           = "${var.project_name}-${var.environment}-final"
  iam_database_authentication_enabled = true
  storage_encrypted                   = true
  kms_key_arn                         = aws_kms_key.graphrag.arn

  vpc_security_group_ids    = [aws_security_group.neptune.id]
  neptune_subnet_group_name = aws_neptune_subnet_group.main.name

  tags = local.common_tags
}

resource "aws_neptune_cluster_instance" "graph" {
  count              = var.environment == "prod" ? 2 : 1
  cluster_identifier = aws_neptune_cluster.graph.id
  instance_class     = var.neptune_instance_class
  engine             = "neptune"

  tags = local.common_tags
}

resource "aws_neptune_subnet_group" "main" {
  name       = "${var.project_name}-${var.environment}"
  subnet_ids = var.private_subnet_ids

  tags = local.common_tags
}

resource "aws_security_group" "neptune" {
  name        = "${var.project_name}-neptune-${var.environment}"
  description = "Security group for Neptune cluster"
  vpc_id      = var.vpc_id

  ingress {
    description     = "Neptune from application"
    from_port       = 8182
    to_port         = 8182
    protocol        = "tcp"
    security_groups = [aws_security_group.application.id]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  tags = local.common_tags
}
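
The Neptune resources above already reference aws_kms_key.graphrag, aws_security_group.application, and local.common_tags, which live in the module's main.tf (listed in the tree but not shown in this walkthrough). A minimal sketch of what that file plausibly contains, with the names taken from those references:

# modules/graphrag/main.tf -- sketch of the shared pieces referenced elsewhere
data "aws_caller_identity" "current" {}

locals {
  common_tags = {
    Project     = var.project_name
    Environment = var.environment
    ManagedBy   = "terraform"
  }
}

# One KMS key shared by Neptune, OpenSearch, and S3
resource "aws_kms_key" "graphrag" {
  description         = "Encryption key for ${var.project_name}-${var.environment} GraphRAG stack"
  enable_key_rotation = true

  tags = local.common_tags
}

# Attach this to anything (Lambda, ECS, EC2) that needs to reach Neptune or OpenSearch
resource "aws_security_group" "application" {
  name        = "${var.project_name}-application-${var.environment}"
  description = "Security group for GraphRAG client workloads"
  vpc_id      = var.vpc_id

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  tags = local.common_tags
}
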
OpenSearch Vector Store
# modules/graphrag/opensearch.tf
resource "aws_opensearch_domain" "vectors" {
  domain_name    = "${var.project_name}-${var.environment}"
  engine_version = "OpenSearch_2.11"

  cluster_config {
    instance_type          = var.opensearch_instance_type
    instance_count         = var.environment == "prod" ? 3 : 1
    zone_awareness_enabled = var.environment == "prod"

    dynamic "zone_awareness_config" {
      for_each = var.environment == "prod" ? [1] : []
      content {
        availability_zone_count = 3
      }
    }
  }

  ebs_options {
    ebs_enabled = true
    volume_size = var.opensearch_volume_size
    volume_type = "gp3"
  }

  encrypt_at_rest {
    enabled    = true
    kms_key_id = aws_kms_key.graphrag.key_id
  }

  node_to_node_encryption {
    enabled = true
  }

  vpc_options {
    subnet_ids         = var.environment == "prod" ? var.private_subnet_ids : [var.private_subnet_ids[0]]
    security_group_ids = [aws_security_group.opensearch.id]
  }

  advanced_security_options {
    enabled                        = true
    internal_user_database_enabled = false
    master_user_options {
      master_user_arn = aws_iam_role.opensearch_master.arn
    }
  }

  domain_endpoint_options {
    enforce_https       = true
    tls_security_policy = "Policy-Min-TLS-1-2-2019-07"
  }

  tags = local.common_tags
}

resource "aws_security_group" "opensearch" {
  name        = "${var.project_name}-opensearch-${var.environment}"
  description = "Security group for OpenSearch domain"
  vpc_id      = var.vpc_id

  ingress {
    description     = "HTTPS from application"
    from_port       = 443
    to_port         = 443
    protocol        = "tcp"
    security_groups = [aws_security_group.application.id]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  tags = local.common_tags
}
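
Fine-grained access control needs aws_iam_role.opensearch_master, which this module keeps in iam.tf (not shown above). A plausible minimal version; the trust policy here is an assumption, so scope it to whichever principals should actually hold master access:

# modules/graphrag/iam.tf -- hypothetical master-user role for fine-grained access control
resource "aws_iam_role" "opensearch_master" {
  name = "${var.project_name}-opensearch-master-${var.environment}"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect = "Allow"
      Action = "sts:AssumeRole"
      # Assumption: principals in this account administer the domain
      Principal = { AWS = "arn:aws:iam::${data.aws_caller_identity.current.account_id}:root" }
    }]
  })

  tags = local.common_tags
}
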
Document Storage and Processing Trigger
# modules/graphrag/s3.tf
resource "aws_s3_bucket" "documents" {
  bucket = "${var.project_name}-documents-${var.environment}-${data.aws_caller_identity.current.account_id}"

  tags = local.common_tags
}

resource "aws_s3_bucket_versioning" "documents" {
  bucket = aws_s3_bucket.documents.id
  versioning_configuration {
    status = "Enabled"
  }
}

resource "aws_s3_bucket_server_side_encryption_configuration" "documents" {
  bucket = aws_s3_bucket.documents.id

  rule {
    apply_server_side_encryption_by_default {
      kms_master_key_id = aws_kms_key.graphrag.arn
      sse_algorithm     = "aws:kms"
    }
  }
}

# Trigger Lambda when documents are uploaded
resource "aws_s3_bucket_notification" "document_upload" {
  bucket = aws_s3_bucket.documents.id

  lambda_function {
    lambda_function_arn = aws_lambda_function.document_processor.arn
    events              = ["s3:ObjectCreated:*"]
    filter_prefix       = "uploads/"
    filter_suffix       = ".pdf"
  }

  depends_on = [aws_lambda_permission.s3_invoke]
}
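
Since this bucket holds enterprise documents, it's worth pairing it with a public access block. This isn't in the module as written, but it's a standard companion resource:

resource "aws_s3_bucket_public_access_block" "documents" {
  bucket = aws_s3_bucket.documents.id

  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}
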
Document Processing Lambda
# modules/graphrag/lambda.tf
resource "aws_lambda_function" "document_processor" {
  function_name = "${var.project_name}-doc-processor-${var.environment}"
  role          = aws_iam_role.lambda_processor.arn
  handler       = "handler.process_document"
  runtime       = "python3.11"
  timeout       = 300
  memory_size   = 1024

  filename         = data.archive_file.lambda_package.output_path
  source_code_hash = data.archive_file.lambda_package.output_base64sha256

  vpc_config {
    subnet_ids         = var.private_subnet_ids
    security_group_ids = [aws_security_group.application.id]
  }

  environment {
    variables = {
      NEPTUNE_ENDPOINT    = aws_neptune_cluster.graph.endpoint
      OPENSEARCH_ENDPOINT = aws_opensearch_domain.vectors.endpoint
      ENVIRONMENT         = var.environment
    }
  }

  tags = local.common_tags
}

resource "aws_lambda_permission" "s3_invoke" {
  statement_id  = "AllowS3Invoke"
  action        = "lambda:InvokeFunction"
  function_name = aws_lambda_function.document_processor.function_name
  principal     = "s3.amazonaws.com"
  source_arn    = aws_s3_bucket.documents.arn
}
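
Two more referenced-but-not-shown pieces: data.archive_file.lambda_package and aws_iam_role.lambda_processor. Minimal sketches follow, assuming the handler source sits in src/ inside the module; both the path and the role's permissions are assumptions, and the real role would also need Neptune, OpenSearch, S3, and KMS access for the processing itself:

# Hypothetical packaging of the handler code
data "archive_file" "lambda_package" {
  type        = "zip"
  source_dir  = "${path.module}/src"
  output_path = "${path.module}/build/document_processor.zip"
}

# modules/graphrag/iam.tf -- execution role for the processor
resource "aws_iam_role" "lambda_processor" {
  name = "${var.project_name}-doc-processor-${var.environment}"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Action    = "sts:AssumeRole"
      Principal = { Service = "lambda.amazonaws.com" }
    }]
  })

  tags = local.common_tags
}

# CloudWatch Logs plus the ENI permissions a VPC-attached Lambda needs
resource "aws_iam_role_policy_attachment" "lambda_vpc" {
  role       = aws_iam_role.lambda_processor.name
  policy_arn = "arn:aws:iam::aws:policy/service-role/AWSLambdaVPCAccessExecutionRole"
}
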
Module Outputs
# modules/graphrag/outputs.tf
output "neptune_endpoint" {
  description = "Neptune cluster endpoint"
  value       = aws_neptune_cluster.graph.endpoint
}

output "neptune_reader_endpoint" {
  description = "Neptune cluster reader endpoint"
  value       = aws_neptune_cluster.graph.reader_endpoint
}

output "opensearch_endpoint" {
  description = "OpenSearch domain endpoint"
  value       = aws_opensearch_domain.vectors.endpoint
}

output "documents_bucket" {
  description = "S3 bucket for document uploads"
  value       = aws_s3_bucket.documents.id
}

output "application_security_group_id" {
  description = "Security group ID for applications needing GraphRAG access"
  value       = aws_security_group.application.id
}
______
Using the Module
# environments/dev/main.tf
module "graphrag" {
  source = "../../modules/graphrag"

  environment        = "dev"
  project_name       = "enterprise-search"
  vpc_id             = module.vpc.vpc_id
  private_subnet_ids = module.vpc.private_subnets

  neptune_instance_class   = "db.r5.large"
  opensearch_instance_type = "r6g.large.search"
  opensearch_volume_size   = 50
}
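
Promoting to prod is the same module call with bigger knobs; the module's environment conditionals take care of multi-AZ and instance counts. The sizing below is hypothetical:

# environments/prod/main.tf
module "graphrag" {
  source = "../../modules/graphrag"

  environment        = "prod"
  project_name       = "enterprise-search"
  vpc_id             = module.vpc.vpc_id
  private_subnet_ids = module.vpc.private_subnets

  neptune_instance_class   = "db.r5.xlarge"
  opensearch_instance_type = "r6g.xlarge.search"
  opensearch_volume_size   = 200
}
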
______
Cost Optimization Tips
- Dev/Test environments - Use smaller instance types and single-AZ deployments. The module handles this via the environment variable.
- Neptune Serverless - For variable workloads, consider Neptune Serverless instead of provisioned instances.
- OpenSearch UltraWarm - For older vector data that's queried less frequently, enable the UltraWarm storage tier.
- S3 Lifecycle policies - Archive processed documents to Glacier after 90 days; a sketch follows this list.
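
That last tip is a one-resource change. A sketch, assuming processed documents land under a processed/ prefix (the module as written only defines an uploads/ prefix, so the prefix here is an assumption):

resource "aws_s3_bucket_lifecycle_configuration" "documents" {
  bucket = aws_s3_bucket.documents.id

  rule {
    id     = "archive-processed-documents"
    status = "Enabled"

    filter {
      prefix = "processed/"
    }

    transition {
      days          = 90
      storage_class = "GLACIER"
    }
  }
}
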
______
Wrapping Up
GraphRAG infrastructure is inherently complex - you're running multiple databases, managing networking between them, handling document processing, and ensuring security across all components. Terraform modules bring sanity to this complexity by giving you repeatable, version-controlled deployments.
Start with dev, validate that your retrieval patterns work as expected, then scale to production confident that the infrastructure is identical. The module approach means you can iterate on improvements and roll them out consistently across environments.
AI infrastructure is evolving rapidly. Having your foundation in Terraform means you can adapt as new services and patterns emerge - without rebuilding from scratch each time.
