How to Use Cert-Manager DNS Challenge With Cloudflare on Kubernetes With Helm

Dec 22nd, 2023 8:04 am

In this tutorial, we will be issuing Let’s Encrypt certificates using cert-manager on Kubernetes and we will be using the DNS Challenge with Cloudflare.

The reason I am using DNS Challenge instead of HTTP Challenge is because the Kubernetes environment is local on my laptop and there isn’t a direct HTTP route into my environment from the internet and I would like to not expose the endpoints to the public internet.

Summary of what we will be doing

We would like to have Let’s Encrypt Certificates on our web application that will be issued by Cert-Manager using the DNS Challenge from CloudFlare.

Our ingress controller will be ingress-nginx and our endpoints will be private, as they will resolve to private IP addresses, hence the reason why we are using DNS validation instead of HTTP.

Pre-Requisites

To follow along in this tutorial you will need the following

https://blog.ruanbekker.com/blog/2022/09/20/kind-for-local-kubernetes-clusters/
Helm
Kubectl
Cloudflare Account
Patience (just kidding, I will try my best to make it easy)

Install a Kubernetes Cluster

If you already have a Kubernetes Cluster, you can skip this step.

Define the kind-config.yaml

kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
  image: kindest/node:v1.26.6@sha256:6e2d8b28a5b601defe327b98bd1c2d1930b49e5d8c512e1895099e4504007adb
  extraPortMappings:
  - containerPort: 80
    hostPort: 80
    protocol: TCP
    listenAddress: "0.0.0.0"
  - containerPort: 443
    hostPort: 443
    protocol: TCP

Then create the cluster with kind:

kind create cluster --name example --config kind-config.yaml

Nginx Ingress Controller

First we need to install a ingress controller and I am opting in to use ingress-nginx, so first we need to add the helm repository to our local repositories:

helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx

Then we need to update our repositories:

helm repo update

Then we can install the helm release:

helm upgrade --install ingress-nginx ingress-nginx/ingress-nginx \
  --namespace ingress-nginx \
  --create-namespace \
  --set controller.kind=DaemonSet \
  --set controller.hostPort.enabled=true \
  --set controller.ingressClass=nginx

You can view all the default values from their GitHub repository where the chart is hosted:

https://github.com/kubernetes/ingress-nginx/blob/main/charts/ingress-nginx/values.yaml

Once the release has been deployed, you should see the ingress-nginx pod running under the ingress-nginx namespace:

kubectl get pods -n ingress-nginx

Cert-Manager

The next step is to install cert-manager using helm, first add the repository:

helm repo add jetstack https://charts.jetstack.io

Update the repositories:

helm repo update

Then install the cert-manager release:

helm upgrade --install cert-manager jetstack/cert-manager \
  --namespace cert-manager \
  --create-namespace \
  --version v1.13.1 \
  --set installCRDs=true

Cloudflare API Token

We need to grant Cert-Manager access to make DNS changes on our Cloudflare account for DNS validation on our behalf, and in order to do that, we need to create a Cloudflare API Token.

As per the cert-manager documentation, from your profile select API Tokens, create an API Token and select Edit Zone DNS template.

Then select the following:

Permissions:
- Zone: DNS -> Edit
- Zone: Zone -> Read
Zone Resources:
- Include -> All Zones

Then create the token and save the value somewhere safe, as we will be using it in the next step.

Cert-Manager ClusterIssuer

First, we need to create a Kubernetes secret with the API Token that we created in the previous step.

kubectl create secret generic cloudflare-api-key-secret \
  --from-literal=api-key=[YOUR_CLOUDFLARE_API_KEY]

Then create the clusterissuer.yaml

apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-dns01-issuer
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: you@example.com  # your email address for updates
    privateKeySecretRef:
      name: letsencrypt-dns01-private-key
    solvers:
    - dns01:
        cloudflare:
          email: you@example.com # your cloudflare account email address
          apiTokenSecretRef:
            name: cloudflare-api-key-secret
            key: api-key

Then create the cluster issuer:

kubectl apply -f clusterissuer.yaml

Request a Certificate

Now that we have our ClusterIssuer created, we can request a certificate. In my scenario, I have a domain example.com which is hosted on CloudFlare and I would like to create a wildcard certificate on the sub-domain *.workshop.example.com

Certificates are scoped on a namespace level, and ClusterIssuer’s are cluster-wide, therefore I am prefixing my certificate with the namespace (just my personal preference).

apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: default-workshop-certificate
  namespace: default
spec:
  secretName: default-workshop-example-tls
  issuerRef:
    name: letsencrypt-dns01-issuer
    kind: ClusterIssuer
  commonName: workshop.example.com
  dnsNames:
  - workshop.example.com
  - '*.workshop.example.com'

Before we create the certificate on CloudFlare, I have created private DNS to the names mentioned in the manifest above like the following:

- workshop.example.com -> A Record -> 10.5.24.254
- *.workshop.example.com -> CNAME -> workshop.example.com

In the DNS configuration mentioned above, to explain why I am creating 2 entries:

10.2.24.254 - This is my LoadBalancer IP Address
I have a static DNS entry to the name workshop.example.com so if my LoadBalancer IP Address ever change, I can just change this address
I am creating a wildcard DNS entry for *.workshop.example.com and I am creating a CNAME record for it to resolve to workshop.example.com so it will essentially respond to the LoadBalancer IP.
So lets say I create test1.workshop.example.com and test2.workshop.example.com then it will resolve to the LoadBalancer IP in workshop.example.com and as mentioned before, if the LoadBalancer IP ever changes, I only have to update the A Record of workshop.example.com

Then after DNS was created, I went ahead and created the certificate:

kubectl apply -f certificate.yaml

You can view the progress by viewing the certificate status by running:

kubectl get certificate -n default

Specify the Certificate in your Ingress

Let’s deploy a nginx web server deployment and I have concatenated the following in one manifest called deployment.yaml:

Deployment
Service
Ingress

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-web
  namespace: default
  labels:
    app: nginx-web
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx-web
  template:
    metadata:
      labels:
        app: nginx-web
    spec:
      containers:
      - name: nginx
        image: nginx:1.19
        ports:
        - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: nginx-web-service
  namespace: default
  labels:
    app: nginx-web
spec:
  type: ClusterIP
  ports:
  - port: 80
    targetPort: 80
  selector:
    app: nginx-web
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: nginx-web-ingress
  namespace: default
  annotations:
    kubernetes.io/ingress.class: "nginx"
spec:
  rules:
  - host: nginx.workshop.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: nginx-web-service
            port:
              number: 80
  tls:
  - hosts:
    - nginx.workshop.example.com
    secretName: default-workshop-example-tls

A few important things to notice on the ingress resource:

host the host needs to match the certificate
secretName the secret needs to match the secret defined in the certificate

Then create the deployment:

kubectl apply -f deployment.yaml

Ensure DNS Challenges are successful

Ensure that cert-manager can set DNS-01 challenge records correctly, if you encounter issues, you can inspect the cert-manager pod logs.

To view the pods for cert-manager:

kubectl get pods -n cert-manager

Then view the logs using:

kubectl logs -f pod <pod-id> -n cert-manager

Test

You can open up a browser and access the ingress on your browser, in my case it would be https://nginx.workshop.example.com and verify that you have a certificate issued from Lets Encrypt.

Thank You

Thanks for reading, if you enjoy my content please feel free to follow me on Twitter - @@ruanbekker or visit me on my website - ruan.dev

How to Deploy Ingress-Nginx Controller on Kubernetes With Helm

Dec 22nd, 2023 7:56 am

In this tutorial we will deploy the ingress-nginx controller on kubernetes.

Pre-Requisites

I will be using kind to run a kubernetes cluster locally, if you want to follow along, have a look at my previous post on how to install kubectl and kind and the basic usage of kind:

https://blog.ruanbekker.com/blog/2022/09/20/kind-for-local-kubernetes-clusters/

You will also need helm to deploy the ingress-nginx release from their helm charts, you can see their documentation on how to install it:

https://helm.sh/docs/intro/install/

Create the Kubernetes Cluster

First we will define the kind configuration which will expose port 80 locally in a file name kind-config.yaml

kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
  image: kindest/node:v1.25.11@sha256:227fa11ce74ea76a0474eeefb84cb75d8dad1b08638371ecf0e86259b35be0c8
  extraPortMappings:
  - containerPort: 80
    hostPort: 80
    protocol: TCP
    listenAddress: "0.0.0.0"
  - containerPort: 443
    hostPort: 443
    protocol: TCP
  kubeadmConfigPatches:
  - |
    kind: InitConfiguration
    nodeRegistration:
      kubeletExtraArgs:
        node-labels: "ingress-ready=true"

Then go ahead and create the kubernetes cluster:

kind create cluster --name workshop --config kind-config.yaml

Install ingress-nginx using Helm

Install the ingress-nginx helm chart, by first adding the repository:

helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx

Then update your local repositories:

helm repo update

Then install the helm release, and set a couple of overrides.

The reason we use NodePort is because our kubernetes cluster runs on docker containers, and from our kind config we have exposed port 80 locally, we are using the NodePort service so that we can make an HTTP request to port 80 to traverse to the port of the service:

helm upgrade --install ingress-nginx ingress-nginx/ingress-nginx \
  --namespace ingress-nginx --create-namespace \
  --set controller.kind=DaemonSet \
  --set controller.hostPort.enabled=true \
  --set controller.ingressClass=nginx

You can view all the default values from their GitHub repository where the chart is hosted:

https://github.com/kubernetes/ingress-nginx/blob/main/charts/ingress-nginx/values.yaml

Once the release has been deployed, you should see the ingress-nginx pod running under the ingress-nginx namespace:

kubectl get pods -n ingress-nginx

Deploy a Web Application

We will create 3 files:

example/deployment.yaml
example/service.yaml
example/ingress.yaml

Create the example directory:

mkdir example

Our example/deployment.yaml

---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: webapp
  name: webapp
  namespace: default
spec:
  replicas: 2
  selector:
    matchLabels:
      app: webapp
  template:
    metadata:
      labels:
        app: webapp
    spec:
      containers:
      - image: ruanbekker/web-center-name-v2
        name: webapp
        ports:
        - name: http
          containerPort: 5000
        env:
        - name: APP_TITLE
          value: "Runs on Kind"
        resources:
          requests:
            memory: "64Mi"
            cpu: "250m"
          limits:
            memory: "256Mi"
            cpu: "1000m"

Our example/service.yaml

---
apiVersion: v1
kind: Service
metadata:
  name: webapp
  namespace: default
spec:
  type: ClusterIP
  selector:
    app: webapp
  ports:
    - name: http
      protocol: TCP
      port: 80
      targetPort: 5000

Our example/ingress.yaml

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: webapp
  namespace: default
spec:
  ingressClassName: nginx
  rules:
    - host: example.127.0.0.1.nip.io
      http:
        paths:
          - pathType: Prefix
            backend:
              service:
                name: webapp
                port:
                  number: 80
            path: /

In summary, we are creating a deployment with a pod that listens on port 5000, and then we are creating a service with port 80 that will forward its connections to the container port of 5000.

Then we define our ingress that will match our hostname and forward its connections to our service on port 80, and also notice that we are defining our ingress class name, which we have set in our helm values.

Deploy this example with kubectl:

kubectly apply -f example/

Now you can access the web application at http://example.127.0.0.1.nip.io

Teardown

You can delete the resources that we’ve created using:

kubectl delete -f example/

Delete the cluster using:

kind delete cluster --name workshop

Thank You

Thanks for reading, if you enjoy my content please feel free to follow me on Twitter - @ruanbekker or visit me on my website - ruan.dev

Creating a Python Lambda Function With Terraform on AWS

Aug 3rd, 2023 11:29 am

In this tutorial I will explain how to deploy a AWS Lambda Function with Terraform using the Python runtime. It will include the permissions it needs to write its logs to AWS CloudWatch as well as to get information from the AWS API’s as a boilerplate for you to expand on it.

We will also use CloudWatch Events to trigger this lambda function every two hours.

Pre-Requisites

First you will need to have Terraform installed as well as authentication for Terraform to interact with your AWS account, I have written a post about it and you can follow that on “How to use the AWS Terraform Provider”.

Project Structure

The following code will be available on my github repository, but if you would like to follow along we will create everything step by step.

First create the project directory:

mkdir -p ~/workspace/aws-lambda-terraform

Then change into the directory:

cd ~/workspace/aws-lambda-terraform

First we want to create our modules directory:

mkdir -p modules/lambda-function

Then our environment directory:

mkdir -p environment/test

We will also create the directory for our function code:

mkdir -p modules/lambda-function/functions

And we can create the file for our python function:

touch modules/lambda-function/functions/demo.py

Now we will create our files inside our modules directory:

touch modules/lambda-function/{main,versions,outputs,variables}.tf

Then create the files inside our environments directory:

touch environment/test/{main,provider,output}.tf

Then in summary our project structure should look more or less like this:

tree .
.
├── environment
│   └── test
│       ├── main.tf
│       ├── output.tf
│       └── provider.tf
└── modules
    └── lambda-function
        ├── functions
        │   └── demo.py
        ├── main.tf
        ├── outputs.tf
        ├── variables.tf
        └── versions.tf

5 directories, 8 files

Terraform Code

We will first start populating the modules bit, and start with modules/lambda-function/main.tf:

data "aws_iam_policy_document" "lambda" {
  statement {
    actions = ["sts:AssumeRole"]

    principals {
      type        = "Service"
      identifiers = ["lambda.amazonaws.com"]
    }
  }
}

data "aws_iam_policy_document" "lambda_execution" {
  count = var.logs_enabled ? 1 : 0

  statement {
    sid     = "GetCallerIdentity"
    effect  = "Allow"

    actions = [
      "sts:GetCallerIdentity"
    ]

    resources = ["*"]

  }

  statement {
    sid     = "DescribeFunctionsInRegion"
    effect  = "Allow"

    actions = [
      "lambda:GetFunction"
    ]

    resources = ["*"]

    condition {
      test     = "StringEquals"
      variable = "aws:RequestedRegion"
      values = [var.aws_region]
    }
  }

}

resource "aws_iam_role_policy" "lambda_execution_policy" {
  count  = var.logs_enabled ? 1 : 0
  name   = "${var.project_name}-lambda-function-execution-policy"
  role   = aws_iam_role.lambda_role[count.index].id
  policy = data.aws_iam_policy_document.lambda_execution[count.index].json
}

data "archive_file" "lambda_zip" {
  type        = "zip"
  source_file = "${path.module}/functions/demo.py"
  output_path = "${path.module}/lambda-archives/package.zip"
}

resource "aws_iam_role" "lambda_role" {
  count              = var.logs_enabled ? 1 : 0
  name               = "${var.project_name}-lambda-function-role"
  assume_role_policy = data.aws_iam_policy_document.lambda.json
}

resource "aws_lambda_function" "lambda" {
  count            = var.logs_enabled ? 1 : 0
  filename         = data.archive_file.lambda_zip.output_path
  function_name    = "${var.project_name}-lambda-function"
  role             = aws_iam_role.lambda_role[count.index].arn
  handler          = "demo.lambda_handler"
  source_code_hash = filebase64sha256(data.archive_file.lambda_zip.output_path)
  runtime          = "python3.8"
  timeout          = 30

  environment {
    variables = {
      PROJECT_NAME  = var.project_name
      FUNCTION_NAME = "${var.project_name}-lambda-function"
    }
  }

  depends_on = [
    data.archive_file.lambda_zip
  ]

}

resource "aws_cloudwatch_event_rule" "every_two_hours" {
  count               = var.logs_enabled ? 1 : 0
  name                = "${var.project_name}-every-two-hours"
  description         = "Fires every 2 hours"
  schedule_expression = "rate(2 hours)"
}

resource "aws_lambda_permission" "allow_cloudwatch" {
  count         = var.logs_enabled ? 1 : 0
  statement_id  = "AllowExecutionFromCloudWatch"
  action        = "lambda:InvokeFunction"
  function_name = aws_lambda_function.lambda[count.index].function_name
  principal     = "events.amazonaws.com"
  source_arn    = aws_cloudwatch_event_rule.every_two_hours[count.index].arn
}

resource "aws_cloudwatch_event_target" "cloudwatch_event" {
  count     = var.logs_enabled ? 1 : 0
  rule      = aws_cloudwatch_event_rule.every_two_hours[count.index].name
  target_id = "${var.project_name}-snapshot-retention-target"
  arn       = aws_lambda_function.lambda[count.index].arn
}

// CloudWatch Logs
resource "aws_cloudwatch_log_group" "cloudwatch_log_group" {
  count     = var.logs_enabled ? 1 : 0
  name      = "/aws/lambda/${aws_lambda_function.lambda[count.index].function_name}"
  retention_in_days = 5
}

resource "aws_iam_role_policy_attachment" "lambda_exec_policy" {
  count      = var.logs_enabled ? 1 : 0
  role       = aws_iam_role.lambda_role[count.index].name
  policy_arn = "arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole"
}

The next one will be the modules/lambda-function/variables.tf:

variable "aws_region" {
  default = "eu-west-1"
  type    = string
}

variable "project_name" {
  default = "example"
  type    = string
}

variable "logs_enabled" {
  default = false
  type    = bool
}

Then define the modules output in modules/lambda-function/outputs.tf:

output "arn_string" {
  value = aws_lambda_function.lambda[*].arn
}

Then we define our python function code in modules/lambda-function/functions/demo.py:

import os
import json
import logging
import boto3

logger = logging.getLogger()
logger.setLevel(logging.INFO)

def lambda_handler(event, context):
    client = boto3.client('lambda')
    logger.info(event)

    response = client.get_function(
        FunctionName=os.environ['FUNCTION_NAME']
    )

    logger.info(response)

    return {
        'statusCode' : 200,
        'body': response
    }

For our environment we want to specify the source as our module in environment/test/main.tf:

module "myfunction" {
  source       = "../../modules/lambda-function"
  project_name = "test"
  logs_enabled = true
}

Our outputs in environment/test/output.tf:

output "arn_string" {
  value = module.myfunction.arn_string
}

And since we are using AWS, we need to define our providers and the profile that we will use to authenticate against AWS, in my case, im using the default profile in environment/test/provider.tf:

terraform {
  required_providers {
    aws = {
      source = "hashicorp/aws"
      version = "4.23.0"
    }
  }
}

provider "aws" {
  region                   = "eu-west-1"
  profile                  = "default"
  shared_credentials_files = ["~/.aws/credentials"]
}

Terraform Plan

Now that we have defined our terraform code we can run:

terraform plan

And it should return something more or less like the following:

Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
  + create

Terraform will perform the following actions:

  # module.myfunction.aws_cloudwatch_event_rule.every_two_hours[0] will be created
  + resource "aws_cloudwatch_event_rule" "every_two_hours" {
      + arn                 = (known after apply)
      + description         = "Fires every 2 hours"
      + event_bus_name      = "default"
      + id                  = (known after apply)
      + is_enabled          = true
      + name                = "test-every-two-hours"
      + name_prefix         = (known after apply)
      + schedule_expression = "rate(2 hours)"
      + tags_all            = (known after apply)
    }

  # module.myfunction.aws_cloudwatch_event_target.cloudwatch_event[0] will be created
  + resource "aws_cloudwatch_event_target" "cloudwatch_event" {
      + arn            = (known after apply)
      + event_bus_name = "default"
      + id             = (known after apply)
      + rule           = "test-every-two-hours"
      + target_id      = "test-snapshot-retention-target"
    }

  # module.myfunction.aws_cloudwatch_log_group.cloudwatch_log_group[0] will be created
  + resource "aws_cloudwatch_log_group" "cloudwatch_log_group" {
      + arn               = (known after apply)
      + id                = (known after apply)
      + name              = "/aws/lambda/test-lambda-function"
      + retention_in_days = 5
      + tags_all          = (known after apply)
    }

  # module.myfunction.aws_iam_role.lambda_role[0] will be created
  + resource "aws_iam_role" "lambda_role" {
      + arn                   = (known after apply)
      + assume_role_policy    = jsonencode(
            {
              + Statement = [
                  + {
                      + Action   = "sts:GetCallerIdentity"
                      + Effect   = "Allow"
                      + Resource = "*"
                      + Sid      = "GetCallerIdentity"
                    },
                  + {
                      + Action    = "lambda:GetFunction"
                      + Condition = {
                          + StringEquals = {
                              + "aws:RequestedRegion" = "eu-west-1"
                            }
                        }
                      + Effect    = "Allow"
                      + Resource  = "*"
                      + Sid       = "DescribeFunctionsInRegion"
                    },
                ]
              + Version   = "2012-10-17"
            }
        )
      + create_date           = (known after apply)
      + force_detach_policies = false
      + id                    = (known after apply)
      + managed_policy_arns   = (known after apply)
      + max_session_duration  = 3600
      + name                  = "test-lambda-function-role"
      + name_prefix           = (known after apply)
      + path                  = "/"
      + tags_all              = (known after apply)
      + unique_id             = (known after apply)
    }

  # module.myfunction.aws_iam_role_policy.lambda_execution_policy[0] will be created
  + resource "aws_iam_role_policy" "lambda_execution_policy" {
      + id     = (known after apply)
      + name   = "test-lambda-function-execution-policy"
      + policy = jsonencode(
            {
              + Statement = [
                  + {
                      + Action   = "sts:GetCallerIdentity"
                      + Effect   = "Allow"
                      + Resource = "*"
                      + Sid      = "GetCallerIdentity"
                    },
                  + {
                      + Action    = "lambda:GetFunction"
                      + Condition = {
                          + StringEquals = {
                              + "aws:RequestedRegion" = "eu-west-1"
                            }
                        }
                      + Effect    = "Allow"
                      + Resource  = "*"
                      + Sid       = "DescribeFunctionsInRegion"
                    },
                ]
              + Version   = "2012-10-17"
            }
        )
      + role   = (known after apply)
    }

  # module.myfunction.aws_iam_role_policy_attachment.lambda_exec_policy[0] will be created
  + resource "aws_iam_role_policy_attachment" "lambda_exec_policy" {
      + id         = (known after apply)
      + policy_arn = "arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole"
      + role       = "test-lambda-function-role"
    }

  # module.myfunction.aws_lambda_function.lambda[0] will be created
  + resource "aws_lambda_function" "lambda" {
      + architectures                  = (known after apply)
      + arn                            = (known after apply)
      + filename                       = "../../modules/lambda-function/lambda-archives/package.zip"
      + function_name                  = "test-lambda-function"
      + handler                        = "demo.lambda_handler"
      + id                             = (known after apply)
      + invoke_arn                     = (known after apply)
      + last_modified                  = (known after apply)
      + memory_size                    = 128
      + package_type                   = "Zip"
      + publish                        = false
      + qualified_arn                  = (known after apply)
      + reserved_concurrent_executions = -1
      + role                           = (known after apply)
      + runtime                        = "python3.8"
      + signing_job_arn                = (known after apply)
      + signing_profile_version_arn    = (known after apply)
      + source_code_hash               = "MI7FD/KHgxRFh7cmPjzxg+w494pmyRGgQIr9Ls8Yups="
      + source_code_size               = (known after apply)
      + tags_all                       = (known after apply)
      + timeout                        = 30
      + version                        = (known after apply)

      + environment {
          + variables = {
              + "FUNCTION_NAME" = "test-lambda-function"
              + "PROJECT_NAME"  = "test"
            }
        }
    }

  # module.myfunction.aws_lambda_permission.allow_cloudwatch[0] will be created
  + resource "aws_lambda_permission" "allow_cloudwatch" {
      + action              = "lambda:InvokeFunction"
      + function_name       = "test-lambda-function"
      + id                  = (known after apply)
      + principal           = "events.amazonaws.com"
      + source_arn          = (known after apply)
      + statement_id        = "AllowExecutionFromCloudWatch"
      + statement_id_prefix = (known after apply)
    }

Plan: 8 to add, 0 to change, 0 to destroy.

Changes to Outputs:
  + arn_string = [
      + (known after apply),
    ]

Create Resources

If you are happy with the plan you can go ahead and run:

terraform apply

Which will create the resources in AWS. Upon creation we should see something like this:

Apply complete! Resources: 0 added, 1 changed, 0 destroyed.

Outputs:

arn_string = [
  "arn:aws:lambda:eu-west-1:000000000000:function:test-lambda-function",
]

Since we have our aws cli configured with a profile we can also test our lambda function:

$ aws --profile default lambda invoke --function-name test-lambda-function --cli-binary-format raw-in-base64-out --payload '{"name": "ruan"}' out.log
{
    "StatusCode": 200,
    "ExecutedVersion": "$LATEST"
}

And the response from the invocation can be seen in the file we defined:

$ cat out.log
{"statusCode": 200, "body": {"ResponseMetadata": {"RequestId": "5171x", "HTTPStatusCode": 200, "HTTPHeaders": {"date": "Thu, 21 Dec 2023 06:34:13 GMT", "content-type": "application/json", "content-length": "3517", "connection": "keep-alive", "x-amzn-requestid": "5171x"}, "RetryAttempts": 0}, "Configuration": {"FunctionName": "test-lambda-function", "FunctionArn": "arn:aws:lambda:eu-west-1:000000000000:function:test-lambda-function", "Runtime": "python3.8", "Role": "arn:aws:iam::000000000000:role/test-lambda-function-role", "Handler": "demo.lambda_handler", "CodeSize": 401, "Description": "", "Timeout": 30, "MemorySize": 128, "LastModified": "2023-12-21T06:26:46.000+0000", "CodeSha256": "x", "Version": "$LATEST", "Environment": {"Variables": {"FUNCTION_NAME": "test-lambda-function", "PROJECT_NAME": "test"}}, "TracingConfig": {"Mode": "PassThrough"}, "RevisionId": "7faex", "State": "Active", "LastUpdateStatus": "Successful", "PackageType": "Zip", "Architectures": ["x86_64"], "EphemeralStorage": {"Size": 512}, "SnapStart": {"ApplyOn": "None", "OptimizationStatus": "Off"}, "RuntimeVersionConfig": {"RuntimeVersionArn": "arn:aws:lambda:eu-west-1::runtime:x"}}, "Code": {"RepositoryType": "S3", "Location": "https://awslambda-eu-west-1-tasks.s3.eu-west-1.amazonaws.com/snapshots/x/test-lambda-function-x?queryparameters"}}}

Updating Lambda Function Code

If we want to redeploy our function with updated code, we can change the content of functions/demo.py and then run:

terraform apply

Since our terraform code defined that if the source has of the function code changes, it will trigger a redeploy, and from the computed plan we can see that it will redeploy our function code:

Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
  ~ update in-place

Terraform will perform the following actions:

  # module.myfunction.aws_lambda_function.lambda[0] will be updated in-place
  ~ resource "aws_lambda_function" "lambda" {
        id                             = "test-lambda-function"
      ~ last_modified                  = "2023-12-21T06:26:46.000+0000" -> (known after apply)
      ~ source_code_hash               = "8TLrm4GmTrfAxwfElmIjws1Vf9UDZ6k2w1+VEONJaCQ=" -> "RIQ62KCcjlcHh5lLCOlrkB7GioBpLY1Y5vN4UZGyN+c="
        tags                           = {}
        # (18 unchanged attributes hidden)

        # (3 unchanged blocks hidden)
    }

Plan: 0 to add, 1 to change, 0 to destroy.

Do you want to perform these actions?
  Terraform will perform the actions described above.
  Only 'yes' will be accepted to approve.

  Enter a value:

After entering “yes” we will update our function code

Discover AWS Console

If we logon to the AWS Console and head to Lambda we can inspect our function code:

If we manually want to trigger the function, select “Test”, then enter the “Event name” with something like “testing” then click “Test”:

If we follow the CloudWatch log link we can view the logs in CloudWatch:

Destroy Infrastructure

If you followed along and would like to destroy the created infrastructure:

terraform destroy

Resources

Terraform Examples

https://github.com/ruanbekker/terraformfiles/tree/master/modules/aws-lambda-function

Thank You

Thanks for reading, feel free to check out my website, feel free to subscribe to my newsletter or follow me at @ruanbekker on Twitter.

Linktree: https://go.ruan.dev/links
Patreon: https://go.ruan.dev/patreon

How to Use the MySQL Terraform Provider

Jul 15th, 2023 8:55 pm

In this tutorial we will provision a MySQL Server with Docker and then use Terraform to provision MySQL Users, Database Schemas and MySQL Grants with the MySQL Terraform Provider.

About

Terraform is super powerful and can do a lot of things. And it shines when it provisions Infrastructure. So in a scenario where we use Terraform to provision RDS MySQL Database Instances, we might still want to provision extra MySQL Users, or Database Schemas and the respective MySQL Grants.

Usually you will logon to the database and create them manually with sql syntax. But in this tutorial we want to make use of Docker to provision our MySQL Server and we would like to make use of Terraform to provision the MySQL Database Schemas, Grants and Users.

Instead of using AWS RDS, I will be provisioning a MySQL Server on Docker so that we can keep the costs free, for those who are following along.

We will also go through the steps on how to rotate the database password that we will be provisioning for our user.

MySQL Server

First we will provision a MySQL Server on Docker Containers, I have a docker-compose.yaml which is available in my quick-starts github repository:

version: "3.8"

services:
  mysql:
    image: mysql:8.0
    container_name: mysql
    ports:
      - 3306:3306
    environment:
      - MYSQL_DATABASE=sample
      - MYSQL_ROOT_PASSWORD=rootpassword

Once you have saved that in your current working directory, you can start the container with docker compose:

docker-compose up -d

You can test the mysql container by logging onto the mysql server with the correct auth:

docker exec -it mysql mysql -u root -prootpassword -e 'show databases;'

This should be more or less the output:

+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| performance_schema |
| sample             |
| sys                |
+--------------------+

Terraform

If you don’t have Terraform installed, you can install it from their documentation.

If you want the source code of this example, its available in my terraform-mysql/petoju-provider repository. Which you can clone and jump into the terraform/mysql/petoju-provider directory.

First we will define the providers.tf:

terraform {
  required_providers {
    mysql = {
      source = "petoju/mysql"
      version = "3.0.37"
    }
  }
}

provider "mysql" {
  alias    = "local"
  endpoint = "127.0.0.1:3306"
  username = "root"
  password = "rootpassword"
}

Then the main.tf:

resource "random_password" "user_password" {
  length           = 24
  special          = true
  min_special      = 2
  override_special = "!#$%^&*()-_=+[]{}<>:?"
  keepers = {
    password_version = var.password_version
  }
}

resource "mysql_database" "user_db" {
  provider = mysql.local
  name = var.database_name
}

resource "mysql_user" "user_id" {
  provider = mysql.local
  user = var.database_username
  plaintext_password = random_password.user_password.result
  host = "%"
  tls_option = "NONE"
}

resource "mysql_grant" "user_id" {
  provider = mysql.local
  user = var.database_username
  host = "%"
  database = var.database_name
  privileges = ["SELECT", "UPDATE"]
  depends_on = [
    mysql_user.user_id
  ]
}

Then the variables.tf:

variable "database_name" {
  description = "The name of the database that you want created."
  type        = string
  default     = null
}

variable "database_username" {
  description = "The name of the database username that you want created."
  type        = string
  default     = null
}

variable "password_version" {
  description = "The password rotates when this value gets updated."
  type        = number
  default     = 0
}

Then our outputs.tf:

output "user" {
  value = mysql_user.user_id.user
}

output "password" {
  sensitive = true
  value = random_password.user_password.result
}

Our terraform.tfvars that defines the values of our variables:

database_name     = "foobar"
database_username = "ruanb"
password_version  = 0

Now we are ready to run our terraform code, which will ultimately create a database, user and grants. Outputs the encrypted string of your password which was encrypted with your keybase_username.

Initialise Terraform:

terraform init

Run the plan to see what terraform wants to provision:

terraform plan

And we can see the following resources will be created:

Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
  + create

Terraform will perform the following actions:

  # mysql_database.user_db will be created
  + resource "mysql_database" "user_db" {
      + default_character_set = "utf8mb4"
      + default_collation     = "utf8mb4_general_ci"
      + id                    = (known after apply)
      + name                  = "foobar"
    }

  # mysql_grant.user_id will be created
  + resource "mysql_grant" "user_id" {
      + database   = "foobar"
      + grant      = false
      + host       = "%"
      + id         = (known after apply)
      + privileges = [
          + "SELECT",
          + "UPDATE",
        ]
      + table      = "*"
      + tls_option = "NONE"
      + user       = "ruanb"
    }

  # mysql_user.user_id will be created
  + resource "mysql_user" "user_id" {
      + host               = "%"
      + id                 = (known after apply)
      + plaintext_password = (sensitive value)
      + tls_option         = "NONE"
      + user               = "ruanb"
    }

  # random_password.user_password will be created
  + resource "random_password" "user_password" {
      + bcrypt_hash      = (sensitive value)
      + id               = (known after apply)
      + keepers          = {
          + "password_version" = "0"
        }
      + length           = 24
      + lower            = true
      + min_lower        = 0
      + min_numeric      = 0
      + min_special      = 2
      + min_upper        = 0
      + number           = true
      + numeric          = true
      + override_special = "!#$%^&*()-_=+[]{}<>:?"
      + result           = (sensitive value)
      + special          = true
      + upper            = true
    }

Plan: 4 to add, 0 to change, 0 to destroy.

Changes to Outputs:
  + password = (sensitive value)
  + user     = "ruanb"

Run the apply which will create the database, the user, sets the password and applies the grants:

terraform apply

Then our returned output should show something like this:

Apply complete! Resources: 4 added, 0 changed, 0 destroyed.

Outputs:

password = <sensitive>
user = "ruanb"

As our password is set as sensitive, we can access the value with terraform output -raw password, let’s assign the password to a variable:

DBPASS=$(terraform output -raw password)

Then we can exec into the mysql container and logon to the mysql server with our new credentials:

docker exec -it mysql mysql -u ruanb -p$DBPASS

And we can see we are logged onto the mysql server:

Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 14
Server version: 8.0.33 MySQL Community Server - GPL

mysql>

If we run show databases; we should see the following:

mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| foobar             |
| information_schema |
| performance_schema |
+--------------------+
3 rows in set (0.03 sec)

If we want to rotate the mysql password for the user, we can update the password_version variable either in our terraform.tfvars or via the cli. Let’s pass the variable in the cli and do a terraform plan to verify the changes:

terraform plan -var password_version=1

And due to our value for the random resource keepers parameter being updated, it will trigger the value of our password to be changed, and that will let terraform update our mysql user’s password:

Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
  ~ update in-place
-/+ destroy and then create replacement

Terraform will perform the following actions:

  # mysql_user.user_id will be updated in-place
  ~ resource "mysql_user" "user_id" {
        id                 = "ruanb@%"
      ~ plaintext_password = (sensitive value)
        # (5 unchanged attributes hidden)
    }

  # random_password.user_password must be replaced
-/+ resource "random_password" "user_password" {
      ~ bcrypt_hash      = (sensitive value)
      ~ id               = "none" -> (known after apply)
      ~ keepers          = { # forces replacement
          ~ "password_version" = "0" -> "1"
        }
      ~ result           = (sensitive value)
        # (11 unchanged attributes hidden)
    }

Plan: 1 to add, 1 to change, 1 to destroy.

Let’s go ahead by updating our password:

terraform apply -var password_version=1 -auto-approve

To validate that the password has changed, we can try to logon to mysql by using the password variable that was created initially:

docker exec -it mysql mysql -u ruanb -p$DBPASS

And as you can see authentication failed:

mysql: [Warning] Using a password on the command line interface can be insecure.
ERROR 1045 (28000): Access denied for user 'ruanb'@'localhost' (using password: YES)

Set the new password to the variable again:

DBPASS=$(terraform output -raw password)

Then try to logon again:

docker exec -it mysql mysql -u ruanb -p$DBPASS

And we can see we are logged on again:

Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 22
Server version: 8.0.33 MySQL Community Server - GPL

mysql>

Resources

The terraform mysql provider: - https://registry.terraform.io/providers/petoju/mysql/latest/docs

The quick-starts repository: - https://github.com/ruanbekker/quick-starts

Thank You

Thanks for reading, feel free to check out my website, feel free to subscribe to my newsletter or follow me at @ruanbekker on Twitter.

Linktree: https://go.ruan.dev/links
Patreon: https://go.ruan.dev/patreon

How to Use the AWS Terraform Provider

Jul 15th, 2023 8:01 pm

In this post we will be using the AWS Terraform provider, from how to install Terraform, create a AWS IAM User, configure the AWS Provider and deploy a EC2 instance using Terraform.

AWS IAM User

In order to authenticate against AWS’s APIs, we need to create a AWS IAM User and create Access Keys for Terraform to use to authenticate.

From https://aws.amazon.com/ logon to your account, then search for IAM:

aws-iam-search-result

Select IAM, then select “Users” on the left hand side and select “Create User”, then provide the username for your AWS IAM User:

aws-iam-user-creation-wizard

Now we need to assign permissions to our new AWS IAM User. For this scenario I will be assigning a IAM Policy directly to the user and I will be selecting the “AdministratorAccess” policy. Keep in mind that this allows admin access to your whole AWS account:

permissions-for-your-aws-iam-user

Once you select the policy, select “Next” and select “Create User”. Once the user has been created, select “Users” on the left hand side, search for your user that we created, in my case “medium-terraform”.

Select the user and click on “Security credentials”. If you scroll down to the “Access keys” section, you will notice we don’t have any access keys for this user:

aws-iam-access-keys

In order to allow Terraform access to our AWS Account, we need to create access keys that Terraform will use, and because we assigned full admin access to the user, Terraform will be able to manage resources in our AWS Account.

Click “Create access key”, then select the “CLI” option and select the confirmation at the bottom:

aws-iam-access-keys-wizard

Select “Next” and then select “Create access key”. I am providing a screenshot of the Access Key and Secret Access Key that has been provided, but by the time this post has been published, the key will be deleted.

retrieve-aws-iam-access-keys

Store your Access Key and Secret Access Key in a secure place and treat this like your passwords. If someone gets access to these keys they can manage your whole AWS Account.

I will be using the AWS CLI to configure my Access Key and Secret Access Key, as I will configure Terraform later to read my Access Keys from the Credential Provider config.

First we need to configure the AWS CLI by passing the profile name, which I have chosen medium for this demonstration:

aws --profile medium configure

We will be asked to provide the access key, secret access key, aws region and the default output:

AWS Access Key ID [None]: AKIATPRT2G4SGXLAC3HJ
AWS Secret Access Key [None]: KODnR[............]nYTYbd
Default region name [None]: eu-west-1
Default output format [None]: json

To verify if everything works as expected we can use the following command to verify:

aws --profile medium sts get-caller-identity

The response should look something similar to the following:

{
    "UserId": "AIDATPRT2G4SOAO5Y7S5Z",
    "Account": "000000000000",
    "Arn": "arn:aws:iam::000000000000:user/medium-terraform"
}

Terraform

Now that we have our AWS IAM User configured, we can install Terraform, if you don’t have Terraform installed yet, you can follow their Installation Documentation.

Once you have Terraform installed, we can setup our workspace where we will ultimately deploy a EC2 instance, but before we get there we need to create our project directory and change to that directory:

mkdir ~/terraform-demo
cd ~/terraform-demo

Then we will create 4 files with .tf extensions:

touch main.tf
touch outputs.tf
touch providers.tf
touch variables.tf

We will define our Terraform definitions on how we want our desired infrastructure to look like. We will get to the content in the files soon.

I personally love Terraform’s documentation as they are rich in examples and really easy to use.

Head over to the Terraform AWS Provider documentation and you scroll a bit down, you can see the Authentication and Configuration section where they outline the order in how Terraform will look for credentials and we will be making use of the shared credentials file as that is where our access key and secret access key is stored.

If you look at the top right corner of the Terraform AWS Provider documentation, they show you how to use the AWS Provider:

terraform-aws-provider-docs

We can copy that code snippet and paste it into our providers.tf file and configure the aws provider section with the medium profile that we’ve created earlier.

This will tell Terraform where to look for credentials in order to authenticate with AWS.

Open providers.tf with your editor of choice:

terraform {
  required_providers {
    aws = {
      source = "hashicorp/aws"
      version = "5.8.0"
    }
  }
}

provider "aws" {
  shared_credentials_files = ["~/.aws/credentials"]
  profile                  = "medium"
  region                   = "eu-west-1"
}

Then we can open main.tf and populate the following to define the EC2 instance that we want to provision:

data "aws_ami" "latest_ubuntu" {
  most_recent = true
  owners = ["099720109477"]

  filter {
    name   = "name"
    values = ["ubuntu/images/hvm-ssd/ubuntu-jammy-22.04-*-server-*"]
  }

  filter {
    name   = "architecture"
    values = ["x86_64"]
  }
}

resource "aws_instance" "ec2" {
  ami           = data.aws_ami.latest_ubuntu.id
  instance_type = var.instance_type
  tags = {
    Name = "${var.instance_name}-ec2-instance"
  }
}

In the above example we are filtering for the latest Ubuntu 22.04 64bit AMI then we are defining a EC2 instance and specifying the AMI ID that we filtered from our data source.

Note that we haven’t specified a SSH Keypair, as we are just focusing on how to provision a EC2 instance.

As you can see we are also referencing variables, which we need to define in variables.tf :

variable "instance_name" {
  description = "Instance Name for EC2."
  type        = string
  default     = "test"
}

variable "instance_type" {
  description = "Instance Type for EC2."
  type        = string
  default     = "t2.micro"
}

And then lastly we need to define our outputs.tf which will be used to output the instance id and ip address:

output "instance_id" {
  value = aws_instance.ec2.id
}

output "ip" {
  value = aws_instance.ec2.public_ip
}

Deploy our EC2 with Terraform

Now that our infrastructure has been defined as code, we can first initialise terraform which will initialise the backend and download all the providers that has been defined:

terraform init

Once that has done we can run a “plan” which will show us what Terraform will deploy:

terraform plan

Now terraform will show us the difference in what we have defined, and what is actually in AWS, as we know its a new account with zero infrastructure, the diff should show us that it needs to create a EC2 instance.

The response from the terraform plan shows us the following:

Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
  + create

Terraform will perform the following actions:

  # aws_instance.ec2 will be created
  + resource "aws_instance" "ec2" {
      + ami                                  = "ami-0f56955469757e5aa"
      + arn                                  = (known after apply)
      + id                                   = (known after apply)
      + instance_type                        = "t2.micro"
      + key_name                             = (known after apply)
      + private_ip                           = (known after apply)
      + public_ip                            = (known after apply)
      + security_groups                      = (known after apply)
      + subnet_id                            = (known after apply)
      + tags                                 = {
          + "Name" = "test-ec2-instance"
        }
      + tags_all                             = {
          + "Name" = "test-ec2-instance"
        }
      + vpc_security_group_ids               = (known after apply)
    }

Plan: 1 to add, 0 to change, 0 to destroy.

Changes to Outputs:
  + instance_id = (known after apply)
  + ip          = (known after apply)

As you can see terraform has looked up the AMI ID using the data source, and we can see that terraform will provision 1 resource which is a EC2 instance. Once we hare happy with the plan, we can run a apply which will show us the same but this time prompt us if we want to proceed:

Do you want to perform these actions?
  Terraform will perform the actions described above.
  Only 'yes' will be accepted to approve.

  Enter a value: yes

aws_instance.ec2: Creating...
aws_instance.ec2: Still creating... [10s elapsed]
aws_instance.ec2: Still creating... [20s elapsed]
aws_instance.ec2: Still creating... [30s elapsed]
aws_instance.ec2: Creation complete after 35s [id=i-005c08b899229fff0]

Apply complete! Resources: 1 added, 0 changed, 0 destroyed.

Outputs:

instance_id = "i-005c08b899229fff0"
ip = "34.253.196.167"

And now we can see our EC2 instance was provisioned and our outputs returned the instance id as well as the public ip address.

We can also confirm this by looking at the AWS EC2 Console:

aws-ec2-instances-in-console

Note that Terraform Configuration is idempotent, so when we run a terraform apply again, terraform will check what we have defined as what we want our desired infrastructure to be like, and what we actually have in our AWS Account, and since we haven’t made any changes there should be no changes.

We can run a terraform apply to validate that:

terraform apply

And we can see the response shows:

data.aws_vpc.selected: Reading...
data.aws_ami.latest_ubuntu: Reading...
data.aws_ami.latest_ubuntu: Read complete after 1s [id=ami-0f56955469757e5aa]
data.aws_vpc.selected: Read complete after 1s [id=vpc-063d7ac3124053dfa]
data.aws_subnet.selected: Reading...
data.aws_subnet.selected: Read complete after 1s [id=subnet-0b7acd7593611c1bb]
aws_instance.ec2: Refreshing state... [id=i-005c08b899229fff0]

Apply complete! Resources: 0 added, 0 changed, 0 destroyed.

Cleanup

Destroy the infrastructure that we provisioned:

terraform destroy

It will show us what terraform will destroy, then upon confirming we should see the following output:

Plan: 0 to add, 0 to change, 1 to destroy.

Changes to Outputs:
  - instance_id = "i-005c08b899229fff0" -> null
  - ip          = "34.253.196.167" -> null

Do you really want to destroy all resources?
  Terraform will destroy all your managed infrastructure, as shown above.
  There is no undo. Only 'yes' will be accepted to confirm.

  Enter a value: yes

aws_instance.ec2: Destroying... [id=i-005c08b899229fff0]
aws_instance.ec2: Still destroying... [id=i-005c08b899229fff0, 10s elapsed]
aws_instance.ec2: Still destroying... [id=i-005c08b899229fff0, 20s elapsed]
aws_instance.ec2: Still destroying... [id=i-005c08b899229fff0, 30s elapsed]
aws_instance.ec2: Destruction complete after 31s

Destroy complete! Resources: 1 destroyed.

If you followed along and you also want to clean up the AWS IAM user, head over to the AWS IAM Console and delete the “medium-terraform” IAM User.

Thank You

I hope you enjoyed this post, I will be posting more terraform related content.

Should you want to reach out to me, you can follow me on Twitter at @ruanbekker or check out my website at https://ruan.dev

Getting Started With FerretDB on Docker

Jun 14th, 2023 10:00 pm

how-to-run-ferretdb-on-docker

In this post we will have a look at FerretDB which is a opensource proxy that translates MongoDB queries to SQL, where PostgreSQL being the database engine.

More about FerretDB

From FerretDB website, they describe FerretDB as:

Initially built as open-source software, MongoDB was a game-changer for many developers, enabling them to build fast and robust applications. Its ease of use and extensive documentation made it a top choice for many developers looking for an open-source database. However, all this changed when they switched to an SSPL license, moving away from their open-source roots.

In light of this, FerretDB was founded to become the true open-source alternative to MongoDB, making it the go-to choice for most MongoDB users looking for an open-source alternative to MongoDB. With FerretDB, users can run the same MongoDB protocol queries without needing to learn a new language or command.

What can you expect from this tutorial

We will be doing the following:

deploying ferretdb and postgres on docker containers using docker compose
then use mongosh as a client to logon to ferretdb using the ferretdb endpoint
explore some example queries to insert and read data from ferretdb
use scripting to generate data into ferretedb
explore the embedded prometheus endpoint for metrics

Deploy FerretDB

The following docker-compose.yaml defines a postgres container which will be used as the database engine for ferretdb, and then we define the ferretdb container, which connects to postgres via the environment variable FERRETDB_POSTGRESQL_URL.

version: "3.9"

services:
  postgres:
    image: postgres:14.8-bullseye
    container_name: postgres
    restart: unless-stopped
    environment:
      - POSTGRES_USER=ferret
      - POSTGRES_PASSWORD=password
      - POSTGRES_DB=ferretdb
    volumes:
      - pgvol:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready", "-d", "db_prod"]
      interval: 30s
      timeout: 15s
      retries: 5
      start_period: 60s
    networks:
      - ferretdb
    logging:
      driver: "json-file"
      options:
        max-size: "1m"
        max-file: "1"

  ferretdb:
    image: ghcr.io/ferretdb/ferretdb:1.1.0
    container_name: ferretdb
    restart: unless-stopped
    ports:
      - 27017:27017
      - 8080:8080
    environment:
      - FERRETDB_POSTGRESQL_URL=postgres://postgres:5432/ferretdb
    depends_on:
      postgres:
        condition: service_healthy
    networks:
      - ferretdb
    logging:
      driver: "json-file"
      options:
        max-size: "1m"
        max-file: "1"

networks:
  ferretdb:
    name: ferretdb

volumes:
  pgvol:

Once you have the content above saved in docker-compose.yaml you can run the following to run the containers in a detached mode:

docker-compose up -d

Connect to FerretDB

Once the containers started, we can connect to our ferretdb server using mongosh, which is a shell utility to connect to the database). I will make use of a container to do this, where I will reference the network which we defined in our docker compose file, and set the endpoint that mongosh need to connect to:

docker run --rm -it --network=ferretdb --entrypoint=mongosh mongo:6.0 "mongodb://ferret:password@ferretdb/ferretdb?authMechanism=PLAIN"

Once it successfully connects to ferretdb, we should see the following prompt:

Current Mongosh Log ID:  64626c5c259916d1a68b7dad
Connecting to:        mongodb://<credentials>@ferretdb/ferretdb?authMechanism=PLAIN&directConnection=true&appName=mongosh+1.8.2
Using MongoDB:        6.0.42
Using Mongosh:        1.8.2

ferretdb>

Run example queries on FerretDB

If you are familiar with MongoDB, you will find the following identical to MongoDB.

First we show the current databases:

ferretdb> show dbs;
public  0 B

The we create and use the database named mydb:

ferretdb> use mydb
switched to db mydb

To see which database are we currently connected to:

mydb> db
mydb

Now we can create a collection named mycol1 and mycol2:

mydb> db.createCollection("mycol1")
{ ok: 1 }
mydb> db.createCollection("mycol2")
{ ok: 1 }

We can view our collections by running the following:

mydb> show collections
mycol1
mycol2

To write one document into our collection named col1 with the following data:

{
  "name": "ruan",
  "age": 32,
  "hobbies": [
    "golf",
    "programming",
    "music"
  ]
}

We can execute:

mydb> db.mycol1.insertOne({"name": "ruan", "age": 32, "hobbies": ["golf", "programming", "music"]})
{
  acknowledged: true,
  insertedIds: { '0': ObjectId("64626cea259916d1a68b7dae") }
}

And we can insert another document:

mydb> db.mycol1.insertOne({"name": "michelle", "age": 28, "hobbies": ["art", "music", "reading"]})
{
  acknowledged: true,
  insertedIds: { '0': ObjectId("64626cf1259916d1a68b7daf") }
}

We can then use countDocuments() to view the number of documents in our collection named mycol1:

ferretdb> db.mycol1.countDocuments()
2

If we want to find all our documents in our mycol1 collection:

mydb> db.mycol1.find()
[
  {
    _id: ObjectId("64626cea259916d1a68b7dae"),
    name: 'ruan',
    age: 32,
    hobbies: [ 'golf', 'programming', 'music' ]
  },
  {
    _id: ObjectId("64626cf1259916d1a68b7daf"),
    name: 'michelle',
    age: 28,
    hobbies: [ 'art', 'music', 'reading' ]
  }
]

If we want to only display specific fields in our response, such as name and age, we can project fields to return from our query:

mydb> db.mycol1.find({}, {"name": 1, "age": 1})
[
  { _id: ObjectId("64626cea259916d1a68b7dae"), name: 'ruan', age: 32 },
  {
    _id: ObjectId("64626cf1259916d1a68b7daf"),
    name: 'michelle',
    age: 28
  }
]

We can also suppress the _id field by setting the value to 0:

mydb> db.mycol1.find({}, {"_id": 0, "name": 1, "age": 1})
[
  { name: 'ruan', age: 32 },
  { name: 'michelle', age: 28 }
]

Next we can return all the fields name and age from our collection where the age field is equals to 32:

mydb> db.mycol1.find({"age": 32}, {"_id": 0, "name": 1, "age": 1})
[ { name: 'ruan', age: 32 } ]

We can also find a specific document by its id as example, and return only the field value, like name:

mydb> db.mycol1.findOne({_id: ObjectId("64626cea259916d1a68b7dae")}).name
ruan

Next we will find all documents where the age is greater than 30:

mydb> db.mycol1.find({"age": {"$gt": 30}})
[
  {
    _id: ObjectId("64626cea259916d1a68b7dae"),
    name: 'ruan',
    age: 32,
    hobbies: [ 'golf', 'programming', 'music' ]
  }
]

Let’s explore how to insert many documents at once using insertMany(), first create a new collection:

ferretdb> db.createCollection("mycol2")
{ ok: 1 }

We can then define the docs variable, and assign a array with 2 json documents:

ferretdb> var docs = [{name: "peter", age: 34, hobbies: ["ski", "programming", "music"]}, {name: "sam", age: 39, hobbies: ["running", "camping", "music"]}]

Now we can insert our documents to ferretdb using insertMany():

ferretdb> db.mycol2.insertMany(docs)
{
  acknowledged: true,
  insertedIds: {
    '0': ObjectId("6464ceb1413cee26e9bf709f"),
    '1': ObjectId("6464ceb1413cee26e9bf70a0")
  }
}

We can count the documents inside our collection using:

ferretdb> db.mycol2.countDocuments()
2

And we can search for all the documents inside the collection:

ferretdb> db.mycol2.find()
[
  {
    _id: ObjectId("6464ceb1413cee26e9bf709f"),
    name: 'peter',
    age: 34,
    hobbies: [ 'ski', 'programming', 'music' ]
  },
  {
    _id: ObjectId("6464ceb1413cee26e9bf70a0"),
    name: 'sam',
    age: 39,
    hobbies: [ 'running', 'camping', 'music' ]
  }
]

And searching for any data using the name peter:

ferretdb> db.mycol2.find({name: "peter"})
[
  {
    _id: ObjectId("6464ceb1413cee26e9bf709f"),
    name: 'peter',
    age: 34,
    hobbies: [ 'ski', 'programming', 'music' ]
  }
]

Scripting

We will create a script so that we can generate data that we want to write into FerretDB.

Create the following script, write.js:

var txs = []
for (var x = 0; x < 1000 ; x++) {
 var transaction_types = ["credit card", "cash", "account"];
 var store_names = ["edgards", "cna", "makro", "picknpay", "checkers"];
 var random_transaction_type = Math.floor(Math.random() * (2 - 0 + 1)) + 0;
 var random_store_name = Math.floor(Math.random() * (4 - 0 + 1)) + 0;
 var random_age = Math.floor(Math.random() * (80 - 18) + 18)
 txs.push({
   transaction: 'tx_' + x,
   transaction_price: Math.round(Math.random()*1000),
   transaction_type: transaction_types[random_transaction_type],
   store_name: store_names[random_store_name],
   age: random_age
   });
}
console.log("drop and recreate the collection")
db.mycollection1.drop()
db.createCollection("mycollection1")
console.log("insert documents into collection")
db.mycollection1.insertMany(txs)

The script will loop a 1000 times and create documents that will include fields of transaction_types, store_names, random_transaction_type, random_store_name and random_age.

Use docker, mount the file inside the container, point the database endpoint to ferretdb and load the file that we want to execute:

docker run --rm -it --network=ferretdb -v $PWD/write.js:/src/write.js --entrypoint=mongosh mongo:6.0 "mongodb://ferret:password@ferretdb/ferretdb?authMechanism=PLAIN" --eval 'load("/src/write.js")'

Now when we run a mongosh client:

docker run --rm -it --network=ferretdb -v $PWD/write.js:/src/write.js --entrypoint=mongosh mongo:6.0 "mongodb://ferret:password@ferretdb/ferretdb?authMechanism=PLAIN"

And we query for the store_name: picknpay and only show the transaction_type and transaction fields:

ferretdb> db.mycollection1.find({"store_name": "picknpay"}, {_id: 0, transaction_type: 1, transaction: 1})
[
  { transaction_type: 'credit card', transaction: 'tx_3' },
  { transaction_type: 'cash', transaction: 'tx_9' },
  { transaction_type: 'account', transaction: 'tx_10' },
  { transaction_type: 'credit card', transaction: 'tx_15' },
  { transaction_type: 'credit card', transaction: 'tx_19' },
  { transaction_type: 'cash', transaction: 'tx_21' },
  { transaction_type: 'cash', transaction: 'tx_28' },
  { transaction_type: 'account', transaction: 'tx_31' },
  { transaction_type: 'cash', transaction: 'tx_37' },
  { transaction_type: 'cash', transaction: 'tx_39' },
  { transaction_type: 'account', transaction: 'tx_40' },
  { transaction_type: 'cash', transaction: 'tx_51' },
  { transaction_type: 'account', transaction: 'tx_52' },
  { transaction_type: 'cash', transaction: 'tx_58' },
  { transaction_type: 'credit card', transaction: 'tx_62' },
  { transaction_type: 'credit card', transaction: 'tx_65' },
  { transaction_type: 'account', transaction: 'tx_69' },
  { transaction_type: 'account', transaction: 'tx_71' },
  { transaction_type: 'cash', transaction: 'tx_72' },
  { transaction_type: 'account', transaction: 'tx_74' }
]

We can also use the --eval flag with the mongosh container to run ad-hoc queries such as counting documents for a collection:

docker run --rm -it --network=ferretdb \
  -v $PWD/write.js:/src/write.js:ro \
  --entrypoint=mongosh mongo:6.0 \
  "mongodb://ferret:password@ferretdb/ferretdb?authMechanism=PLAIN" --eval 'db.mycollection1.countDocuments()'

Prometheus Metrics

FerretDB provides prometheus metrics out of the box, and outputs prometheus metrics on the :8080/debug/metrics endpoint:

curl http://localhost:8080/debug/metrics

Which will output metrics more or less like the following:

ferretdb_client_accepts_total{error="0"} 98
ferretdb_client_connected 0
ferretdb_client_requests_total{command="aggregate",opcode="OP_MSG"} 5
ferretdb_client_requests_total{command="atlasVersion",opcode="OP_MSG"} 27
ferretdb_client_requests_total{command="buildInfo",opcode="OP_MSG"} 27
ferretdb_client_requests_total{command="buildinfo",opcode="OP_MSG"} 2
ferretdb_client_requests_total{command="count",opcode="OP_MSG"} 5
ferretdb_client_requests_total{command="create",opcode="OP_MSG"} 7
ferretdb_client_requests_total{command="drop",opcode="OP_MSG"} 3
ferretdb_client_requests_total{command="dropDatabase",opcode="OP_MSG"} 4
ferretdb_client_requests_total{command="find",opcode="OP_MSG"} 27
ferretdb_client_requests_total{command="getCmdLineOpts",opcode="OP_MSG"} 27
ferretdb_client_requests_total{command="getFreeMonitoringStatus",opcode="OP_MSG"} 20
ferretdb_client_requests_total{command="getLog",opcode="OP_MSG"} 20
ferretdb_client_requests_total{command="getParameter",opcode="OP_MSG"} 27
ferretdb_client_requests_total{command="hello",opcode="OP_MSG"} 20
ferretdb_client_requests_total{command="insert",opcode="OP_MSG"} 15
ferretdb_client_requests_total{command="ismaster",opcode="OP_MSG"} 238
ferretdb_client_requests_total{command="listCollections",opcode="OP_MSG"} 49
ferretdb_client_requests_total{command="listDatabases",opcode="OP_MSG"} 12
ferretdb_client_requests_total{command="ping",opcode="OP_MSG"} 40
ferretdb_client_requests_total{command="saslStart",opcode="OP_MSG"} 70
ferretdb_client_requests_total{command="setFreeMonitoring",opcode="OP_MSG"} 1
ferretdb_client_requests_total{command="unknown",opcode="OP_QUERY"} 96
ferretdb_client_responses_total{argument="unknown",command="aggregate",opcode="OP_MSG",result="ok"} 5
ferretdb_client_responses_total{argument="unknown",command="atlasVersion",opcode="OP_MSG",result="CommandNotFound"} 27
ferretdb_client_responses_total{argument="unknown",command="buildInfo",opcode="OP_MSG",result="ok"} 27
ferretdb_client_responses_total{argument="unknown",command="buildinfo",opcode="OP_MSG",result="ok"} 2
ferretdb_client_responses_total{argument="unknown",command="count",opcode="OP_MSG",result="ok"} 5
ferretdb_client_responses_total{argument="unknown",command="create",opcode="OP_MSG",result="ok"} 7
ferretdb_client_responses_total{argument="unknown",command="drop",opcode="OP_MSG",result="NamespaceNotFound"} 2
ferretdb_client_responses_total{argument="unknown",command="drop",opcode="OP_MSG",result="ok"} 1
ferretdb_client_responses_total{argument="unknown",command="dropDatabase",opcode="OP_MSG",result="ok"} 4
ferretdb_client_responses_total{argument="unknown",command="find",opcode="OP_MSG",result="ok"} 27
ferretdb_client_responses_total{argument="unknown",command="getCmdLineOpts",opcode="OP_MSG",result="ok"} 27
ferretdb_client_responses_total{argument="unknown",command="getFreeMonitoringStatus",opcode="OP_MSG",result="ok"} 20
ferretdb_client_responses_total{argument="unknown",command="getLog",opcode="OP_MSG",result="ok"} 20
ferretdb_client_responses_total{argument="unknown",command="getParameter",opcode="OP_MSG",result="Unset"} 27
ferretdb_client_responses_total{argument="unknown",command="hello",opcode="OP_MSG",result="ok"} 20
ferretdb_client_responses_total{argument="unknown",command="insert",opcode="OP_MSG",result="ok"} 15
ferretdb_client_responses_total{argument="unknown",command="ismaster",opcode="OP_MSG",result="ok"} 238
ferretdb_client_responses_total{argument="unknown",command="listCollections",opcode="OP_MSG",result="ok"} 49
ferretdb_client_responses_total{argument="unknown",command="listDatabases",opcode="OP_MSG",result="ok"} 12
ferretdb_client_responses_total{argument="unknown",command="ping",opcode="OP_MSG",result="ok"} 40
ferretdb_client_responses_total{argument="unknown",command="saslStart",opcode="OP_MSG",result="ok"} 70
ferretdb_client_responses_total{argument="unknown",command="setFreeMonitoring",opcode="OP_MSG",result="ok"} 1
ferretdb_client_responses_total{argument="unknown",command="unknown",opcode="OP_REPLY",result="ok"} 93
ferretdb_client_responses_total{argument="unknown",command="unknown",opcode="OP_REPLY",result="unhandled"} 3
ferretdb_up{branch="unknown",commit="3344cbb98bb744dd044bcf2d51fe9ab65db22f0b",debug="false",dirty="true",package="docker",telemetry="disabled",update_available="false",uuid="08174d33-05fd-45ed-adb9-d2e343e0af83",version="v1.1.0"} 1
process_cpu_seconds_total 16.98
process_max_fds 1.048576e+06
process_open_fds 13
process_resident_memory_bytes 2.5714688e+07
process_start_time_seconds 1.68425346762e+09
process_virtual_memory_bytes 7.52529408e+08
process_virtual_memory_max_bytes 1.8446744073709552e+19
promhttp_metric_handler_errors_total{cause="encoding"} 0
promhttp_metric_handler_errors_total{cause="gathering"} 0
promhttp_metric_handler_requests_in_flight 1
promhttp_metric_handler_requests_total{code="200"} 2
promhttp_metric_handler_requests_total{code="500"} 0
promhttp_metric_handler_requests_total{code="503"} 0

Resources

Please see the follwoing resources for FerretDB:

Thank You

Thanks for reading, feel free to check out my website, feel free to subscribe to my newsletter or follow me at @ruanbekker on Twitter.

Linktree: https://go.ruan.dev/links
Patreon: https://go.ruan.dev/patreon

How to Run a AMD64 Bit Linux VM on a Mac M1

May 26th, 2023 8:35 am

This tutorial will show you how you can run 64bit Ubuntu Linux Virtual Machines on a Apple Mac M1 arm64 architecture macbook using UTM.

Installation

Head over to their documentation and download the UTM.dmg file and install it, once it is installed and you have opened UTM, you should see this screen:

Creating a Virtual Machine

In my case I would like to run a Ubuntu VM, so head over to the Ubuntu Server Download page and download the version of choice, I will be downloading Ubuntu Server 22.04, once you have your ISO image downloaded, you can head over to the next step which is to “Create a New Virtual Machine”:

I will select “Emulate” as I want to run a amd64 bit architecture, then select “Linux”:

In the next step we want to select the Ubuntu ISO image that we downloaded, which we want to use to boot our VM from:

Browse and select the image that you downloaded, once you selected it, it should show something like this:

Select continue, then select the architecture to x86_64, the system I kept on defaults and the memory I have set to 2048MB and cores to 2 but that is just my preference:

The next screen is to configure storage, as this is for testing I am setting mine to 8GB:

The next screen is shared directories, this is purely optional, I have created a directory for this:

mkdir ~/utm

Which I’ve then defined for a shared directory, but this depends if you need to have shared directories from your local workstation.

The next screen is a summary of your choices and you can name your vm here:

Once you are happy select save, and you should see something like this:

You can then select the play button to start your VM.

The console should appear and you can select install or try this vm:

This will start the installation process of a Linux Server:

Here you can select the options that you would like, I would just recommend to ensure that you select Install OpenSSH Server so that you can connect to your VM via SSH.

Once you get to this screen:

The installation process is busy and you will have to wait a couple of minutes for it to complete. Once you see the following screen the installation is complete:

On the right hand side select the circle, then select CD/DVD and select the ubuntu iso and select eject:

Starting your VM

Then power off the guest and power on again, then you should get a console login, then you can proceed to login, and view the ip address:

SSH to your VM

Now from your terminal you should be able to ssh to the VM:

We can also verify that we are running a 64bit vm, by running uname --processor:

Thank You

Thanks for reading, feel free to check out my website, feel free to subscribe to my newsletter or follow me at @ruanbekker on Twitter.

Linktree: https://go.ruan.dev/links
Patreon: https://go.ruan.dev/patreon

Running a Multi-Broker Kafka Cluster on Docker

May 17th, 2023 10:50 am

In this post we will run a Kakfa cluster with 3 kafka brokers on docker compose and using a producer to send messages to our topics and a consumer that will receive the messages from the topics, which we will develop in python and explore the kafka-ui.

What is Kafka?

Kafka is a distributed event store and stream processing platform. Kafka is used to build real-time streaming data pipelines and real-time streaming applications.

This is a fantastic resource if you want to understand the components better in detail: - apache-kafka-architecture-what-you-need-to-know

But on a high level, the components of a typical Kafka setup:

Zookeeper: Kafka relies on Zookeeper to do leadership election of Kafka Brokers and Topic Partitions.
Broker: Kafka server that receives messages from producers, assigns them to offsets and commit the messages to disk storage. A offset is used for data consistency in a event of failure, so that consumers know from where to consume from their last message.
Topic: A topic can be thought of categories to organize messages. Producers writes messages to topics, consumers reads from those topics.
Partitions: A topic is split into multiple partitions. This improves scalability through parallelism (not just one broker). Kafka also does replication

For great in detail information about kafka and its components, I encourage you to visit the mentioned post from above.

Launch Kafka

This is the docker-compose.yaml that we will be using to run a kafka cluster with 3 broker containers, 1 zookeeper container, 1 producer, 1 consumer and a kafka-ui.

All the source code is available on my quick-starts github repository .

version: "3.9"

services:
  zookeeper:
    platform: linux/amd64
    image: confluentinc/cp-zookeeper:${CONFLUENT_PLATFORM_VERSION:-7.4.0}
    container_name: zookeeper
    restart: unless-stopped
    ports:
      - '32181:32181'
      - '2888:2888'
      - '3888:3888'
    environment:
      ZOOKEEPER_SERVER_ID: 1
      ZOOKEEPER_CLIENT_PORT: 32181
      ZOOKEEPER_TICK_TIME: 2000
      ZOOKEEPER_INIT_LIMIT: 5
      ZOOKEEPER_SYNC_LIMIT: 2
      ZOOKEEPER_SERVERS: zookeeper:2888:3888
    healthcheck:
      test: echo stat | nc localhost 32181
      interval: 10s
      timeout: 10s
      retries: 3
    networks:
      - kafka
    logging:
      driver: "json-file"
      options:
        max-size: "1m"

  kafka-ui:
    container_name: kafka-ui
    image: provectuslabs/kafka-ui:latest
    ports:
      - 8080:8080
    depends_on:
      - broker-1
      - broker-2
      - broker-3
    environment:
      KAFKA_CLUSTERS_0_NAME: broker-1
      KAFKA_CLUSTERS_0_BOOTSTRAPSERVERS: broker-1:29091
      KAFKA_CLUSTERS_0_METRICS_PORT: 19101
      KAFKA_CLUSTERS_1_NAME: broker-2
      KAFKA_CLUSTERS_1_BOOTSTRAPSERVERS: broker-2:29092
      KAFKA_CLUSTERS_1_METRICS_PORT: 19102
      KAFKA_CLUSTERS_2_NAME: broker-3
      KAFKA_CLUSTERS_2_BOOTSTRAPSERVERS: broker-3:29093
      KAFKA_CLUSTERS_2_METRICS_PORT: 19103
      DYNAMIC_CONFIG_ENABLED: 'true'
    networks:
      - kafka
    logging:
      driver: "json-file"
      options:
        max-size: "1m"

  broker-1:
    platform: linux/amd64
    image: confluentinc/cp-kafka:${CONFLUENT_PLATFORM_VERSION:-7.4.0}
    container_name: broker-1
    restart: unless-stopped
    ports:
      - '9091:9091'
    depends_on:
      - zookeeper
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:32181
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: INTERNAL:PLAINTEXT,EXTERNAL:PLAINTEXT
      KAFKA_INTER_BROKER_LISTENER_NAME: INTERNAL
      KAFKA_ADVERTISED_LISTENERS: INTERNAL://broker-1:29091,EXTERNAL://localhost:9091
      KAFKA_DEFAULT_REPLICATION_FACTOR: 3
      KAFKA_NUM_PARTITIONS: 3
      KAFKA_JMX_PORT: 19101
      KAFKA_JMX_HOSTNAME: localhost
    healthcheck:
      test: nc -vz localhost 9091
      interval: 10s
      timeout: 10s
      retries: 3
    networks:
      - kafka
    logging:
      driver: "json-file"
      options:
        max-size: "1m"

  broker-2:
    platform: linux/amd64
    image: confluentinc/cp-kafka:${CONFLUENT_PLATFORM_VERSION:-7.4.0}
    container_name: broker-2
    restart: unless-stopped
    ports:
      - '9092:9092'
    depends_on:
      - zookeeper
    environment:
      KAFKA_BROKER_ID: 2
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:32181
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: INTERNAL:PLAINTEXT,EXTERNAL:PLAINTEXT
      KAFKA_INTER_BROKER_LISTENER_NAME: INTERNAL
      KAFKA_ADVERTISED_LISTENERS: INTERNAL://broker-2:29092,EXTERNAL://localhost:9092
      KAFKA_DEFAULT_REPLICATION_FACTOR: 3
      KAFKA_NUM_PARTITIONS: 3
      KAFKA_JMX_PORT: 19102
      KAFKA_JMX_HOSTNAME: localhost
    healthcheck:
      test: nc -vz localhost 9092
      interval: 10s
      timeout: 10s
      retries: 3
    networks:
      - kafka
    logging:
      driver: "json-file"
      options:
        max-size: "1m"

  broker-3:
    platform: linux/amd64
    image: confluentinc/cp-kafka:${CONFLUENT_PLATFORM_VERSION:-7.4.0}
    container_name: broker-3
    restart: unless-stopped
    ports:
      - '9093:9093'
    depends_on:
      - zookeeper
    environment:
      KAFKA_BROKER_ID: 3
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:32181
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: INTERNAL:PLAINTEXT,EXTERNAL:PLAINTEXT
      KAFKA_INTER_BROKER_LISTENER_NAME: INTERNAL
      KAFKA_ADVERTISED_LISTENERS: INTERNAL://broker-3:29093,EXTERNAL://localhost:9093
      KAFKA_DEFAULT_REPLICATION_FACTOR: 3
      KAFKA_NUM_PARTITIONS: 3
      KAFKA_JMX_PORT: 19103
      KAFKA_JMX_HOSTNAME: localhost
    healthcheck:
      test: nc -vz localhost 9093
      interval: 10s
      timeout: 10s
      retries: 3
    networks:
      - kafka
    logging:
      driver: "json-file"
      options:
        max-size: "1m"

  producer:
    platform: linux/amd64
    container_name: producer
    image: ruanbekker/kafka-producer-consumer:2023-05-17
    # source: https://github.com/ruanbekker/quick-starts/tree/main/docker/kafka/python-client
    restart: always
    environment:
      - ACTION=producer
      - BOOTSTRAP_SERVERS=broker-1:29091,broker-2:29092,broker-3:29093
      - TOPIC=my-topic
      - PYTHONUNBUFFERED=1 # https://github.com/docker/compose/issues/4837#issuecomment-302765592
    networks:
      - kafka
    depends_on:
      - zookeeper
      - broker-1
      - broker-2
      - broker-3
    logging:
      driver: "json-file"
      options:
        max-size: "1m"

  consumer:
    platform: linux/amd64
    container_name: consumer
    image: ruanbekker/kafka-producer-consumer:2023-05-17
    # source: https://github.com/ruanbekker/quick-starts/tree/main/docker/kafka/python-client
    restart: always
    environment:
      - ACTION=consumer
      - BOOTSTRAP_SERVERS=broker-1:29091,broker-2:29092,broker-3:29093
      - TOPIC=my-topic
      - CONSUMER_GROUP=cg-group-id
      - PYTHONUNBUFFERED=1 # https://github.com/docker/compose/issues/4837#issuecomment-302765592
    networks:
      - kafka
    depends_on:
      - zookeeper
      - broker-1
      - broker-2
      - broker-3
      - producer
    logging:
      driver: "json-file"
      options:
        max-size: "1m"

networks:
  kafka:
    name: kafka

Note: This docker-compose yaml can be found in my kafka quick-starts repository.

In our compose file we defined our core stack:

1 Zookeeper Container
3 Kafka Broker Containers
1 Kafka UI

Then we have our clients:

1 Producer that will send messages to our topics (source code: https://github.com/ruanbekker/quick-starts/blob/main/docker/kafka/python-client/produce.py )
1 Consumer that will read the messages from our topics (source code: https://github.com/ruanbekker/quick-starts/blob/main/docker/kafka/python-client/consume.py )

We can boot the stack with:

docker-compose up -d

You can verify that the brokers are passing their health checks with:

docker-compose ps

NAME                IMAGE                                           COMMAND                  SERVICE             CREATED             STATUS                   PORTS
broker-1            confluentinc/cp-kafka:7.4.0                     "/etc/confluent/dock…"   broker-1            5 minutes ago       Up 4 minutes (healthy)   0.0.0.0:9091->9091/tcp, :::9091->9091/tcp, 9092/tcp
broker-2            confluentinc/cp-kafka:7.4.0                     "/etc/confluent/dock…"   broker-2            5 minutes ago       Up 4 minutes (healthy)   0.0.0.0:9092->9092/tcp, :::9092->9092/tcp
broker-3            confluentinc/cp-kafka:7.4.0                     "/etc/confluent/dock…"   broker-3            5 minutes ago       Up 4 minutes (healthy)   9092/tcp, 0.0.0.0:9093->9093/tcp, :::9093->9093/tcp
consumer            ruanbekker/kafka-producer-consumer:2023-05-17   "sh /src/run.sh $ACT…"   consumer            5 minutes ago       Up 4 minutes
kafka-ui            provectuslabs/kafka-ui:latest                   "/bin/sh -c 'java --…"   kafka-ui            5 minutes ago       Up 4 minutes             0.0.0.0:8080->8080/tcp, :::8080->8080/tcp
producer            ruanbekker/kafka-producer-consumer:2023-05-17   "sh /src/run.sh $ACT…"   producer            5 minutes ago       Up 4 minutes
zookeeper           confluentinc/cp-zookeeper:7.4.0                 "/etc/confluent/dock…"   zookeeper           5 minutes ago       Up 5 minutes (healthy)   0.0.0.0:2888->2888/tcp, :::2888->2888/tcp, 0.0.0.0:3888->3888/tcp, :::3888->3888/tcp, 2181/tcp, 0.0.0.0:32181->32181/tcp, :::32181->32181/tcp

Producers and Consumers

The producer generates random data and sends it to a topic, where the consumer will listen on the same topic and read messages from that topic.

To view the output of what the producer is doing, you can tail the logs:

docker logs -f producer

setting up producer, checking if brokers are available
brokers not available yet
brokers are available and ready to produce messages
message sent to kafka with squence id of 1
message sent to kafka with squence id of 2
message sent to kafka with squence id of 3

And to view the output of what the consumer is doing, you can tail the logs:

docker logs -f consumer

starting consumer, checks if brokers are availabe
brokers not availbe yet
brokers are available and ready to consume messages
{'sequence_id': 10, 'user_id': '20520', 'transaction_id': '4026fd10-2aca-4d2e-8bd2-8ef0201af2dd', 'product_id': '17974', 'address': '71741 Lopez Throughway | South John | BT', 'signup_at': '2023-05-11 06:54:52', 'platform_id': 'Tablet', 'message': 'transaction made by userid 119740995334901'}
{'sequence_id': 11, 'user_id': '78172', 'transaction_id': '4089cee1-0a58-4d9b-9489-97b6bc4b768f', 'product_id': '21477', 'address': '735 Jasmine Village Apt. 009 | South Deniseland | BN', 'signup_at': '2023-05-17 09:54:10', 'platform_id': 'Tablet', 'message': 'transaction made by userid 159204336307945'}

Kafka UI

The Kafka UI will be available on http://localhost:8080

Where we can view lots of information, but in the below screenshot we can see our topics:

And when we look at the my-topic, we can see a overview dashboard of our topic information:

We can also look at the messages in our topic, and also search for messages:

And we can also look at the current consumers:

Resources

My Quick-Starts Github Repository:

https://github.com/ruanbekker/quick-starts

Thank You

Thanks for reading, feel free to check out my website, feel free to subscribe to my newsletter or follow me at @ruanbekker on Twitter.

Linktree: https://go.ruan.dev/links
Patreon: https://go.ruan.dev/patreon

Manage Helm Releases With Terraform

Mar 9th, 2023 4:15 pm

helm-releases-with-terraform

In this post we will use terraform to deploy a helm release to kubernetes.

Kubernetes

For this demonstration I will be using kind to deploy a local Kubernetes cluster to the operating system that I am running this on, which will be Ubuntu Linux. For a more in-depth tutorial on Kind, you can see my post on Kind for Local Kubernetes Clusters.

Installing the Pre-Requirements

We will be installing terraform, docker, kind and kubectl on Linux.

Install terraform:

wget https://releases.hashicorp.com/terraform/1.3.0/terraform_1.3.0_linux_amd64.zip
unzip terraform_1.3.0_linux_amd64.zip
rm terraform_1.3.0_linux_amd64.zip
mv terraform /usr/bin/terraform

Verify that terraform has been installed:

terraform -version

Which in my case returns:

Terraform v1.3.0
on linux_amd64

Install Docker on Linux (be careful to curl pipe bash - trust the scripts that you are running):

curl https://get.docker.com | bash

Then running docker ps should return:

CONTAINER ID   IMAGE        COMMAND         CREATED          STATUS          PORTS       NAMES

Install kind on Linux:

apt update
curl -Lo ./kind https://kind.sigs.k8s.io/dl/v0.17.0/kind-linux-amd64
chmod +x ./kind
sudo mv ./kind /usr/local/bin/kind

Then to verify that kind was installed with kind --version should return:

kind version 0.17.0

Create a kubernetes cluster using kind:

kind create cluster --name rbkr --image kindest/node:v1.24.0

Now install kubectl:

curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
sudo install -o root -g root -m 0755 kubectl /usr/local/bin/kubectl

Then to verify that kubectl was installed:

kubectl version --client

Which in my case returns:

Client Version: version.Info{Major:"1", Minor:"26", GitVersion:"v1.26.1", GitCommit:"8f94681cd294aa8cfd3407b8191f6c70214973a4", GitTreeState:"clean", BuildDate:"2023-01-18T15:58:16Z", GoVersion:"go1.19.5", Compiler:"gc", Platform:"linux/amd64"}
Kustomize Version: v4.5.7

Now we can test if kubectl can communicate with the kubernetes api server:

kubectl get nodes

In my case it returns:

NAME                 STATUS   ROLES           AGE     VERSION
rbkr-control-plane   Ready    control-plane   6m20s   v1.24.0

Terraform

Now that our pre-requirements are sorted we can configure terraform to communicate with kubernetes. For that to happen, we need to consult the terraform kubernetes provider’s documentation.

As per their documentation they provide us with this snippet:

terraform {
  required_providers {
    kubernetes = {
      source = "hashicorp/kubernetes"
      version = "2.18.0"
    }
  }
}

provider "kubernetes" {
  # Configuration options
}

And from their main page, it gives us a couple of options to configure the provider and the easiest is probably to read the ~/.kube/config configuration file.

But in cases where you have multiple configurations in your kube config file, this might not be ideal, and I like to be precise, so I will extract the client certificate, client key and cluster ca certificate and endpoint from our ~/.kube/config file.

If we run cat ~/.kube/config we will see something like this:

apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: LS0tLS1CRU......FURS0tLS0tCg==
    server: https://127.0.0.1:40305
  name: kind-rbkr
contexts:
- context:
    cluster: kind-rbkr
    user: kind-rbkr
  name: kind-rbkr
current-context: kind-rbkr
kind: Config
preferences: {}
users:
- name: kind-rbkr
  user:
    client-certificate-data: LS0tLS1CRX......FURS0tLS0tCg==
    client-key-data: LS0tLS1CRUejhKWUk2N2.....S0tCg==

First we will create a directory for our certificates:

mkdir ~/certs

I have truncated my kube config for readability, but for our first file certs/client-cert.pem we will copy the value of client-certificate-data:, which will look something like this:

cat certs/client-cert.pem
LS0tLS1CRX......FURS0tLS0tCg==

Then we will copy the contents of client-key-data: into certs/client-key.pem and then lastly the content of certificate-authority-data: into certs/cluster-ca-cert.pem.

So then we should have the following files inside our certs/ directory:

tree certs/
certs/
├── client-cert.pem
├── client-key.pem
└── cluster-ca-cert.pem

0 directories, 3 files

Now make them read only:

chmod 400 ~/certs/*

Now that we have that we can start writing our terraform configuration. In providers.tf:

terraform {
  required_providers {
    kubernetes = {
      source = "hashicorp/kubernetes"
      version = "2.18.0"
    }
  }
}

provider "kubernetes" {
  host                   = "https://127.0.0.1:40305"
  client_certificate     = base64decode(file("~/certs/client-cert.pem"))
  client_key             = base64decode(file("~/certs/client-key.pem"))
  cluster_ca_certificate = base64decode(file("~/certs/cluster-ca-cert.pem"))
}

Your host might look different to mine, but you can find your host endpoint in ~/.kube/config.

For a simple test we can list all our namespaces to ensure that our configuration is working. In a file called namespaces.tf, we can populate the following:

data "kubernetes_all_namespaces" "allns" {}

output "all-ns" {
  value = data.kubernetes_all_namespaces.allns.namespaces
}

Now we need to initialize terraform so that it can download the providers:

terraform init

Then we can run a plan which will reveal our namespaces:

terraform plan

data.kubernetes_all_namespaces.allns: Reading...
data.kubernetes_all_namespaces.allns: Read complete after 0s [id=a0ff7e83ffd7b2d9953abcac9f14370e842bdc8f126db1b65a18fd09faa3347b]

Changes to Outputs:
  + all-ns = [
      + "default",
      + "kube-node-lease",
      + "kube-public",
      + "kube-system",
      + "local-path-storage",
    ]

We can now remove our namespaces.tf as our test worked:

rm namespaces.tf

Helm Releases with Terraform

We will need two things, we need to consult the terraform helm release provider documentation and we also need to consult the helm chart documentation which we are interested in.

In my previous post I wrote about Everything you need to know about Helm and I used the Bitnami Nginx Helm Chart, so we will use that one again.

As we are working with helm releases, we need to configure the helm provider, I will just extend my configuration from my previous provider config in providers.tf:

terraform {
  required_providers {
    kubernetes = {
      source = "hashicorp/kubernetes"
      version = "2.18.0"
    }
    helm = {
      source = "hashicorp/helm"
      version = "2.9.0"
    }
  }
}

provider "kubernetes" {
  host                   = "https://127.0.0.1:40305"
  client_certificate     = base64decode(file("~/certs/client-cert.pem"))
  client_key             = base64decode(file("~/certs/client-key.pem"))
  cluster_ca_certificate = base64decode(file("~/certs/cluster-ca-cert.pem"))
}

provider "helm" {
  kubernetes {
    host                   = "https://127.0.0.1:40305"
    client_certificate     = base64decode(file("~/certs/client-cert.pem"))
    client_key             = base64decode(file("~/certs/client-key.pem"))
    cluster_ca_certificate = base64decode(file("~/certs/cluster-ca-cert.pem"))
  }
}

We will create three terraform files:

touch {main,outputs,variables}.tf

And our values yaml in helm-chart/nginx/values.yaml:

mkdir -p helm-chart/nginx

Then you can copy the values file from https://artifacthub.io/packages/helm/bitnami/nginx?modal=values into helm-chart/nginx/values.yaml.

In our main.tf I will use two ways to override values in our values.yaml using set and templatefile. The reason for the templatefile, is when we want to fetch a value and want to replace the content with our values file, it could be used when we retrieve a value from a data source as an example. In my example im just using a variable.

We will have the following:

resource "helm_release" "nginx" {
  name             = var.release_name
  version          = var.chart_version
  namespace        = var.namespace
  create_namespace = var.create_namespace
  chart            = var.chart_name
  repository       = var.chart_repository_url
  dependency_update = true
  reuse_values      = true
  force_update      = true
  atomic              = var.atomic

  set {
    name  = "image.tag"
    value = "1.23.3-debian-11-r3"
  }

  set {
    name  = "service.type"
    value = "ClusterIP"
  }

  values = [
    templatefile("${path.module}/helm-chart/nginx/values.yaml", {
      NAME_OVERRIDE   = var.release_name
    }
  )]

}

As you can see we are referencing a NAME_OVERRIDE in our values.yaml, I have cleaned up the values file to the following:

nameOverride: "${NAME_OVERRIDE}"

## ref: https://hub.docker.com/r/bitnami/nginx/tags/
image:
  registry: docker.io
  repository: bitnami/nginx
  tag: 1.23.3-debian-11-r3

The NAME_OVERRIDE must be in a ${} format.

In our variables.tf we will have the following:

variable "release_name" {
  type        = string
  default     = "nginx"
  description = "The name of our release."
}

variable "chart_repository_url" {
  type        = string
  default     = "https://charts.bitnami.com/bitnami"
  description = "The chart repository url."
}

variable "chart_name" {
  type        = string
  default     = "nginx"
  description = "The name of of our chart that we want to install from the repository."
}

variable "chart_version" {
  type        = string
  default     = "13.2.20"
  description = "The version of our chart."
}

variable "namespace" {
  type        = string
  default     = "apps"
  description = "The namespace where our release should be deployed into."
}

variable "create_namespace" {
  type        = bool
  default     = true
  description = "If it should create the namespace if it doesnt exist."
}

variable "atomic" {
  type        = bool
  default     = false
  description = "If it should wait until release is deployed."
}

And lastly our outputs.tf:

output "metadata" {
  value = helm_release.nginx.metadata
}

Now that we have all our configuration ready, we can initialize terraform:

terraform init

Then we can run a plan to see what terraform wants to deploy:

terraform plan

The plan output shows the following:

Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
  + create

Terraform will perform the following actions:

  # helm_release.nginx will be created
  + resource "helm_release" "nginx" {
      + atomic                     = false
      + chart                      = "nginx"
      + cleanup_on_fail            = false
      + create_namespace           = true
      + dependency_update          = false
      + disable_crd_hooks          = false
      + disable_openapi_validation = false
      + disable_webhooks           = false
      + force_update               = false
      + id                         = (known after apply)
      + lint                       = false
      + manifest                   = (known after apply)
      + max_history                = 0
      + metadata                   = (known after apply)
      + name                       = "nginx"
      + namespace                  = "apps"
      + pass_credentials           = false
      + recreate_pods              = false
      + render_subchart_notes      = true
      + replace                    = false
      + repository                 = "https://charts.bitnami.com/bitnami"
      + reset_values               = false
      + reuse_values               = false
      + skip_crds                  = false
      + status                     = "deployed"
      + timeout                    = 300
      + values                     = [
          + <<-EOT
                nameOverride: "nginx"

                ## ref: https://hub.docker.com/r/bitnami/nginx/tags/
                image:
                  registry: docker.io
                  repository: bitnami/nginx
                  tag: 1.23.3-debian-11-r3
            EOT,
        ]
      + verify                     = false
      + version                    = "13.2.20"
      + wait                       = false
      + wait_for_jobs              = false

      + set {
          + name  = "image.tag"
          + value = "1.23.3-debian-11-r3"
        }
    }

Plan: 1 to add, 0 to change, 0 to destroy.

Changes to Outputs:
  + metadata = (known after apply)

Once we are happy with our plan, we can run a apply:

terraform apply

Plan: 1 to add, 0 to change, 0 to destroy.

Changes to Outputs:
  + metadata = (known after apply)

Do you want to perform these actions?
  Terraform will perform the actions described above.
  Only 'yes' will be accepted to approve.

  Enter a value: yes

helm_release.nginx: Creating...
helm_release.nginx: Still creating... [10s elapsed]

metadata = tolist([
  {
    "app_version" = "1.23.3"
    "chart" = "nginx"
    "name" = "nginx"
    "namespace" = "apps"
    "revision" = 1
    "values" = "{\"image\":{\"registry\":\"docker.io\",\"repository\":\"bitnami/nginx\",\"tag\":\"1.23.3-debian-11-r3\"},\"nameOverride\":\"nginx\"}"
    "version" = "13.2.20"
  },
])

Then we can verify if the pod is running:

kubectl get pods -n apps
NAME                    READY   STATUS    RESTARTS   AGE
nginx-59bdc6465-xdbfh   1/1     Running   0          2m35s

Importing Helm Releases into Terraform State

If you have an existing helm release that was deployed with helm and you want to transfer the ownership to terraform, you first need to write the terraform code, then import the resources into terraform state using:

terraform import helm_release.nginx apps/nginx

Where the last argument is <namespace>/<release-name>. Once that is imported you can run terraform plan and apply.

If you want to discover all helm releases managed by helm you can use:

kubectl get all -A -l app.kubernetes.io/managed-by=Helm

Thank You

Thanks for reading, feel free to check out my website, feel free to subscribe to my newsletter or follow me at @ruanbekker on Twitter.

Linktree: https://go.ruan.dev/links
Patreon: https://go.ruan.dev/patreon

Persisting Terraform Remote State in Gitlab

Mar 5th, 2023 1:43 am

terraform-state-gitlab

In this tutorial we will demonstrate how to persist your terraform state in gitlab managed terraform state, using the terraform http backend.

For detailed information about this consult their documentation

What are we doing?

We will create a terraform pipeline which will run the plan step automatically and a manual step to run the apply step.

During these steps and different pipelines we need to persist our terraform state remotely so that new pipelines can read from our state what we last stored.

Gitlab offers a remote backend for our terraform state which we can use, and we will use a basic example of using the random resource.

Prerequisites

If you don’t see the “Infrastructure” menu on your left, you need to enable it at “Settings”, “General”, “Visibility”, “Project features”, “Permissions” and under “Operations”, turn on the toggle.

For more information on this see their documentation

Authentication

For this demonstration I created a token which is only scoped for this one project, for this we need a to create a token under, “Settings”, “Access Tokens”:

Select the api under scope:

Store the token name and token value as TF_USERNAME and TF_PASSWORD as a CICD variable under “Settings”, “CI/CD”, “Variables”.

Terraform Code

We will use a basic random_uuid resource for this demonstration, our main.tf:

resource "random_uuid" "uuid" {}

output "uuid" {
  value       = random_uuid.uuid.result
  sensitive   = false
}

Our providers.tf, you will notice the backend "http" {} is what is required for our gitlab remote state:

terraform {
  required_providers {
    random = {
      source = "hashicorp/random"
      version = "3.4.3"
    }
  }
  backend "http" {}
  required_version = "~> 1.3.6"
}

provider "random" {}

Push that up to gitlab for now.

Gitlab Pipeline

Our .gitlab-ci.yml consists of a plan step and a apply step which is a manual step as we first want to review our plan step before we apply.

Our pipeline will only run on the default branch, which in my case is main:

image:
  name: hashicorp/terraform:1.3.6
  entrypoint: [""]

cache:
  paths:
    - .terraform

workflow:
  rules:
    - if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
    - when: never

variables:
  TF_ADDRESS: "https://gitlab.com/api/v4/projects/${CI_PROJECT_ID}/terraform/state/default-terraform.tfstate"

stages:
  - plan
  - apply

.terraform_init: &terraform_init
  - terraform init
      -backend-config=address=${TF_ADDRESS}
      -backend-config=lock_address=${TF_ADDRESS}/lock
      -backend-config=unlock_address=${TF_ADDRESS}/lock
      -backend-config=username=${TF_USERNAME}
      -backend-config=password=${TF_PASSWORD}
      -backend-config=lock_method=POST
      -backend-config=unlock_method=DELETE
      -backend-config=retry_wait_min=5

terraform:plan:
  stage: plan
  artifacts:
    paths:
      - '**/*.tfplan'
      - '**/.terraform.lock.hcl'
  before_script:
    - *terraform_init
  script:
    - terraform validate
    - terraform plan -input=false -out default.tfplan

terraform:apply:
  stage: apply
  artifacts:
    paths:
      - '**/*.tfplan'
      - '**/.terraform.lock.hcl'
  before_script:
    - *terraform_init
  script:
    - terraform apply -input=false -auto-approve default.tfplan
  when: manual

Where the magic happens is in the terraform init step, that is where we will initialize the terraform state in gitlab, and as you can see we are taking the TF_ADDRESS variable to define the path of our state and in this case our state file will be named default-terraform.tfstate.

If it was a case where you are deploying multiple environments, you can use something like ${ENVIRONMENT}-terraform.tfstate.

When we run our pipeline, we can look at our plan step:

Once we are happy with this we can run the manual step and do the apply step, then our pipeline should look like this:

When we inspect our terraform state in the infrastructure menu, we can see the state file was created:

Thank You

Thanks for reading, feel free to check out my website, feel free to subscribe to my newsletter or follow me at @ruanbekker on Twitter.

Linktree: https://go.ruan.dev/links
Patreon: https://go.ruan.dev/patreon

← Older Blog Archives

Carbon

Slack

Join me on Slack

Twitter

Follow me on Twitter: @ruanbekker

Say Hi!

Send me a note using the saythanks.io project.

View my newsletter on digests.ruanbekker.com

Cheetsheet Repository

Have a look at my Cheetsheets Github Repository:

Store

Check out my Store: Have a look at my latest elasticsearch cheatsheet in PDF format.