Ruan Bekker's Blog

From a Curious mind to Posts on Github

Running vs Code in Your Browser With Docker

vscode


Today we will set up a Visual Studio Code instance running on Docker, so that you can access VSCode via your web browser.

VSCode in Docker

The working directory will be under code and the application will store its data under data. Let’s go ahead and create them:

$ mkdir -p demo/{code,data}
$ cd demo

Run the vscode container:

$ docker run --rm --name vscode \
  -it -p 8443:8443 -p 8888:8888 \
  -v $(pwd)/data:/data -v $(pwd)/code:/code \
  ruanbekker/vscode:python-3.7

The password that you will need to log in is displayed in the output:

INFO  code-server v1.1156-vsc1.33.1
INFO  Additional documentation: http://github.com/cdr/code-server
INFO  Initializing {"data-dir":"/data","extensions-dir":"/data/extensions","working-dir":"/code","log-dir":"/root/.cache/code-server/logs/20190914105631217"}
INFO  Starting shared process [1/5]...
INFO  Starting webserver... {"host":"0.0.0.0","port":8443}
INFO
INFO  Password: 4b050c4fa0ef109d53c10d9f
INFO
INFO  Started (click the link below to open):
INFO  https://localhost:8443/
INFO  Connected to shared process

Access VSCode on https://localhost:8443/ and, after you have accepted the self-signed certificate warning, you will be presented with the login page:

image

After you have logged in, an example of creating a Python file will look like this:

image

The source code for this docker image can be found at https://github.com/ruanbekker/dockerfiles/tree/master/vscode.

Different versions

Currently I only have a Python image available on Docker Hub, with the requests and flask packages pre-installed, but you can fork the repository and add the upstream images or packages of your choice.
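
If you want to build your own flavour of the image, a rough sketch of the workflow could look like this (assuming your fork keeps the Dockerfile under the vscode directory, and using hypothetical GitHub and Docker Hub usernames):

$ git clone https://github.com/your-github-user/dockerfiles
$ cd dockerfiles/vscode
$ docker build -t your-dockerhub-user/vscode:custom .
$ docker push your-dockerhub-user/vscode:custom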

Expire Objects in AWS S3 Automatically After 30 Days

In AWS S3 you can make use of lifecycle policies to manage the lifetime of your objects stored in S3.

In this tutorial, I will show you how to delete objects automatically from S3 after 30 days.

Navigate to your Bucket

Head over to your AWS S3 bucket where you want to delete objects after they have been stored for 30 days:

image

Lifecycle Policies

Select “Management” and click on “Add lifecycle rule”:

image

Set a rule name of your choice. You also have the option to provide a prefix if you only want to delete objects with a specific prefix; I will leave this blank, as I want to delete objects at the root level of the bucket. Continue to the next section:

image

Continue past the “Transitions” section, and in the expiration configuration select to expire the current version of the object after 30 days:

image

Review the configuration:

image

When you select “Save”, you should be returned to the following section:

image

Housecleaning on your S3 Bucket

Now, 30 days after you create objects in AWS S3, they will be deleted automatically.
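
If you prefer doing this from the command line instead of the console, a rough equivalent with the aws-cli could look like this (the bucket name and rule ID here are hypothetical):

$ aws s3api put-bucket-lifecycle-configuration \
  --bucket my-bucket-name \
  --lifecycle-configuration '{"Rules": [{"ID": "expire-after-30-days", "Status": "Enabled", "Filter": {"Prefix": ""}, "Expiration": {"Days": 30}}]}'

# verify the rule
$ aws s3api get-bucket-lifecycle-configuration --bucket my-bucket-name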

Reindex Elasticsearch Indices With Logstash

logstash

In this tutorial I will show you how to reindex daily indices to a monthly index on Elasticsearch using Logstash.

Use Case

In this scenario we have filebeat indices with a low document count, and we would like to aggregate the daily indices into a bigger, monthly index. So we will be reindexing from filebeat-2019.08.* to filebeat-monthly-2019.08.

Overview of our Setup

Here we can see all the indices that we would like to read from:

$ curl 10.37.117.130:9200/_cat/indices/filebeat-2019.08.*?v
health status index               uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   filebeat-2019.08.28 qoKiHUjQT5eNVF_wjLi9fA   5   1         17            0    295.4kb        147.7kb
green  open   filebeat-2019.08.27 8PWngqFdRPKLEnrCCiw6xA   5   1        301            0    900.9kb          424kb
green  open   filebeat-2019.08.29 PiG2ma8zSbSt6sSg7soYPA   5   1         24            0    400.2kb          196kb
green  open   filebeat-2019.08.31 XSWZvqQDR0CugD23y6_iaA   5   1         27            0    451.5kb        222.1kb
green  open   filebeat-2019.08.30 u_Hr9fA5RtOtpabNGUmSpw   5   1         18            0    326.1kb          163kb

I have 3 nodes in my elasticsearch cluster:

$ curl 10.37.117.130:9200/_cat/nodes?v
ip            heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
10.37.117.132           56          56   5    0.47    0.87     1.10 mdi       -      elasticsearch-01
10.37.117.130           73          56   4    0.47    0.87     1.10 mdi       -      elasticsearch-03
10.37.117.199           29          56   4    0.47    0.87     1.10 mdi       *      elasticsearch-02

As Elasticsearch creates 5 primary shards by default, I want to override this behavior and create 3 primary shards instead. I will be using a template, so whenever an index gets created that matches the index pattern *-monthly-*, the settings for 3 primary shards and 1 replica shard will be applied:

$ curl -H 'Content-Type: application/json' -XPUT 10.37.117.130:9200/_template/monthly -d '
{"index_patterns": ["*-monthly-*"], "order": -1, "settings": {"number_of_shards": "3", "number_of_replicas": "1"}}
'
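
To verify that the template was created, you can retrieve it again from the same node:

$ curl '10.37.117.130:9200/_template/monthly?pretty'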

Logstash Configuration

The Logstash configuration that we will use reads from Elasticsearch, using the index pattern that we want to read from. The output configuration then instructs Logstash where to write the data to:

$ cat /tmp/logstash/logstash.conf
input {
  elasticsearch {
    hosts => [ "http://10.37.117.132:9200" ]
    index => "filebeat-2019.08.*"
    size => 500
    scroll => "5m"
    docinfo => true
  }
}

output {
  elasticsearch {
    hosts => ["http://10.37.117.199:9200"]
    index => "filebeat-monthly-2019.08"
    document_id => "%{[@metadata][_id]}"
  }
  stdout {
    codec => "dots"
  }
}

Reindex the Data

I will be using Docker to run Logstash, and map the configuration into the pipeline directory inside the container:

$ sudo docker run --rm -it -v /tmp/logstash:/usr/share/logstash/pipeline docker.elastic.co/logstash/logstash-oss:6.2.4
[2019-09-08T10:57:36,170][INFO ][logstash.pipeline        ] Pipeline started successfully {:pipeline_id=>"main", :thread=>"#<Thread:0x7db57d5f run>"}
[2019-09-08T10:57:36,325][INFO ][logstash.agent           ] Pipelines running {:count=>1, :pipelines=>["main"]}
...
[2019-09-08T10:57:39,359][INFO ][logstash.pipeline        ] Pipeline has terminated {:pipeline_id=>"main", :thread=>"#<Thread:0x7db57d5f run>"}

Review that the data was reindexed:

$ curl 10.37.117.130:9200/_cat/indices/*filebeat-*08*?v
health status index                    uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   filebeat-2019.08.28      qoKiHUjQT5eNVF_wjLi9fA   5   1         17            0    295.4kb        147.7kb
green  open   filebeat-2019.08.29      PiG2ma8zSbSt6sSg7soYPA   5   1         24            0    400.2kb          196kb
green  open   filebeat-2019.08.30      u_Hr9fA5RtOtpabNGUmSpw   5   1         18            0    326.1kb          163kb
green  open   filebeat-2019.08.27      8PWngqFdRPKLEnrCCiw6xA   5   1        301            0    900.9kb          424kb
green  open   filebeat-2019.08.31      XSWZvqQDR0CugD23y6_iaA   5   1         27            0    451.5kb        222.1kb
green  open   filebeat-monthly-2019.08 VZD8iDjfTfeyP-SWB9l2Pg   3   1        387            0    577.8kb        274.7kb
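
If you want to double check the numbers before cleaning up, you can compare the document counts of the source and destination indices with the _count API; both totals should add up to the 387 documents shown above:

$ curl '10.37.117.130:9200/filebeat-2019.08.*/_count?pretty'
$ curl '10.37.117.130:9200/filebeat-monthly-2019.08/_count?pretty'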

Once we are happy with what we are seeing, we can delete the source data:

$ curl -XDELETE "10.37.117.130:9200/filebeat-2019.08.*"
{"acknowledged":true}

Deploy a Monitoring Stack on Docker Swarm With Grafana and Prometheus


In this tutorial we will deploy a monitoring stack to Docker Swarm that includes Grafana, Prometheus, Node-Exporter, cAdvisor and Alertmanager.

If you are looking for more information on Prometheus, have a look at my other Prometheus and Monitoring blog posts.

What you will get out of this

Once you have deployed the stacks, you will have the following:

  • Access Grafana through Traefik reverse proxy
  • Node-Exporter to expose node level metrics
  • cAdvisor to expose container level metrics
  • Prometheus to scrape the exposed endpoints and ingest them into Prometheus
  • Prometheus for your Timeseries Database
  • Alertmanager for firing alerts on configured rules

The compose file that I will provide will have pre-populated dashboards.

Deploy Traefik

Get the traefik stack sources:

$ git clone https://github.com/bekkerstacks/traefik
$ pushd traefik

Have a look at HTTPS Mode if you want to deploy traefik on HTTPS, as I will use HTTP in this demonstration.

Set your domain and deploy the stack:

$ DOMAIN=localhost PROTOCOL=http bash deploy.sh

Username for Traefik UI: demo
Password for Traefik UI: 
deploying traefik stack in http mode
Creating network public
Creating config proxy_traefik_htpasswd
Creating service proxy_traefik
Traefik UI is available at:
- http://traefik.localhost

Your traefik service should be running:

$ docker service ls
ID                  NAME                MODE                REPLICAS            IMAGE               PORTS
0wga71zbx1pe        proxy_traefik       replicated          1/1                 traefik:1.7.14      *:80->80/tcp

Switch back to the previous directory:

$ popd

Deploy the Monitoring Stack

Get the sources:

$ git clone https://github.com/bekkerstacks/monitoring-cpang
$ pushd monitoring-cpang

If you want to deploy the stack with no pre-configured dashboards, you would need to use ./docker-compose.yml, but in this case we will deploy the stack with pre-configured dashboards.

Set the domain and deploy the stack:

$ docker stack deploy -c alt_versions/docker-compose_http_with_dashboards.yml mon

Creating network private
Creating config mon_grafana_config_datasource
Creating config mon_grafana_dashboard_prometheus
Creating config mon_grafana_dashboard_docker
Creating config mon_grafana_dashboard_nodes
Creating config mon_grafana_dashboard_blackbox
Creating config mon_alertmanager_config
Creating config mon_prometheus_config
Creating config mon_prometheus_rules
Creating service mon_blackbox-exporter
Creating service mon_alertmanager
Creating service mon_prometheus
Creating service mon_grafana
Creating service mon_cadvisor
Creating service mon_node-exporter

The endpoints are configured as ${service_name}.${DOMAIN}, so you will be able to access Grafana on http://grafana.localhost, as shown in my use case.

Use docker stack services mon to see if all the tasks have reached their desired count, then access Grafana on http://grafana.${DOMAIN}.

Accessing Grafana

Access Grafana on http://grafana.${DOMAIN} and log on with the user admin and the password admin:

image

You will be asked to reset the password:

image

You will then be directed to the ui:

image

From the top, when you list dashboards, you will see the 3 dashboards that were pre-configured:

image

When looking at the Swarm Nodes Dashboard:

image

The Swarm Services Dashboard:

image

Exploring Metrics in Prometheus

Access Prometheus on http://prometheus.${DOMAIN} and, from the search input, you can start exploring all the metrics that are available in Prometheus:

image

If we search for node_load15 and select graph, we can have a quick look at what the 15 minute load average looks like for the node that the stack is running on:

image
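
You can also query the same metric directly against the Prometheus HTTP API, which is handy for scripting (assuming the DOMAIN=localhost setup used earlier, so prometheus.localhost resolves to your swarm):

$ curl -s 'http://prometheus.localhost/api/v1/query?query=node_load15' | jq .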

Having a look at the alerts section:

image

Resources

For more information and configuration on the stack that we use, have a look at the wiki: https://github.com/bekkerstacks/monitoring-cpang/wiki

The GitHub repository: https://github.com/bekkerstacks/monitoring-cpang

Thank You

Let me know what you think. If you liked my content, feel free to check out my content on ruan.dev or follow me on twitter at @ruanbekker

Deploy Traefik Using Bekker Stacks

image


After a year or two of spending quite a lot of time on Docker, and more specifically Docker Swarm, I found it quite tedious to write up docker-compose files for the specific stacks I was working on. I also felt the need for a Docker Swarm compose package manager.

Fair enough, you store them on a central repository and then you can reuse them as you go, and that is exactly what I did, but I felt that perhaps other people have the same problem.

The Main Idea

So the main idea is to have a central repository of Docker Swarm stacks, where you can pick and choose what you want, pull down the repository, use environment variables to override the default configuration, and use the deploy script to deploy the stack that you want.

Future Ideas

In the future I would like to create a cli tool that you can use to list stacks, as example:

$ bstacks list
traefik
monitoring-cpang (cAdvisor, Prometheus, Alertmanager, Node-Exporter, Grafana)
monitoring-tig   (Telegraf, InfluxDB, Grafana)
logging-efk      (Elasticsearch, Filebeat, Kibana)
...

Listing stacks by category:

$ bstacks list --category logging
logging-efk
...

Deploying a stack:

$ bstacks deploy --stack traefik --stack-name proxy --env-file ./stack.env
Username for Traefik UI: ruan
Password for Traefik UI: deploying traefik stack in http mode
Creating network public
Creating config proxy_traefik_htpasswd
Creating service proxy_traefik
Traefik UI is available at:
- http://traefik.localhost

At the time of writing the CLI tool is not available yet, but the list of available templated Docker stack repositories is available at github.com/bekkerstacks

What are we doing today

In this tutorial we will deploy a Traefik proxy on Docker Swarm. I will be demonstrating the deployment on my Mac, and currently I have only docker installed, without a swarm being initialized.

If you already have a swarm initialized and running this on servers, you can skip the local dev section.

Local Dev

We will be initializing a Docker Swarm with 3 worker nodes on a Mac using docker-in-docker. Get the repository:

$ git clone https://github.com/bekkerstacks/docker-swarm

Switch to the directory and deploy the swarm:

$ bash deploy.sh

ID                            HOSTNAME            STATUS              AVAILABILITY        MANAGER STATUS      ENGINE VERSION
lkyjkvuc5uobzgps4m7e98l0u *   docker-desktop      Ready               Active              Leader              19.03.1
6djgz804emj89rs8icd53wfgn     worker-1            Ready               Active                                  18.06.3-ce
gcz6ou0s5p8kxve63ihnky7ai     worker-2            Ready               Active                                  18.06.3-ce
ll8zfvuaek8q4x9nlijib0dfa     worker-3            Ready               Active                                  18.06.3-ce

As you can see, we have a 4 node Docker Swarm running on our local dev environment, so we can continue.

Deploy Traefik

To deploy Traefik in HTTPS mode, we need to set 3 environment variables: EMAIL, DOMAIN, PROTOCOL. We also need to set up our DNS to direct traffic to our Swarm. In my case I will be using 1.2.3.4 as the IP of my manager node and mydomain.com as the domain.

The DNS setup will look like this:

A Record: mydomain.com -> 1.2.3.4
A Record: *.mydomain.com -> 1.2.3.4

And if you are using this locally, you can add an entry such as 127.0.0.1 traefik.mydomain.com to your /etc/hosts.
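
As an example, appending that entry could look like this:

$ echo "127.0.0.1 traefik.mydomain.com" | sudo tee -a /etc/hosts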

Clone the repository:

$ git clone https://github.com/bekkerstacks/traefik

Change into the repository directory and deploy the stack:

$ EMAIL=me@mydomain.com DOMAIN=mydomain.com PROTOCOL=https bash deploy.sh
Username for Traefik UI: ruan
Password for Traefik UI: deploying traefik stack in https mode
Creating network public
Creating config proxy_traefik_htpasswd
Creating service proxy_traefik
Traefik UI is available at:
- https://traefik.mydomain.com

Verify that the Traefik service is running:

$ docker service ls
ID                  NAME                MODE                REPLICAS            IMAGE               PORTS
0wga71zbx1pe        proxy_traefik       replicated          1/1                 traefik:1.7.14      *:80->80/tcp

Navigating to the Traefik Dashboard, after providing your username and password, you should see the Traefik UI:

Note: I don’t own mydomain.com, therefore I am using the Traefik default certificate, which is why it shows as not secure.

Deploy Traefik in HTTP Mode

If you want to deploy Traefik in HTTP mode rather, you would use:

$ DOMAIN=localhost PROTOCOL=http bash deploy.sh
Username for Traefik UI: ruan
Password for Traefik UI: deploying traefik stack in http mode
Creating network public
Creating config proxy_traefik_htpasswd
Creating service proxy_traefik
Traefik UI is available at:
- http://traefik.localhost

Navigating to the Traefik Dashboard, after providing your username and password, you should see the Traefik UI:

More Info

In future posts, I will demonstrate how to deploy other stacks using bekkerstacks.

Have a look at the repositories on github.com/bekkerstacks for more info.

Thank You

Let me know what you think. If you liked my content, feel free to check out my content on ruan.dev or follow me on twitter at @ruanbekker

AWS S3 KMS and Python for Secrets Management

So your application needs to store secrets and you are looking for a home for them. In this tutorial we will see how we can use Python, S3 and KMS to build our own solution for managing secrets.

There are SSM and Secrets Manager, which probably do a better job, but my mind got curious :D

High Level Goal

From a high level, we want to store secrets encrypted on S3 with KMS, namespaced as team/application/environment/value in JSON format, so that our application receives the JSON dictionary of configured key/value pairs.

We can leverage IAM to delegate permissions on the namespacing that we decide on. For my example, the namespace will look like this on S3:

s3://s3bucket/secrets/engineering/app1/production/appconfig.json

We will apply IAM permissions for our user to only Put and Get on secrets/engineering*. So with this idea we can apply IAM permissions on groups for different departments, or even let users manage their own secrets such as:

s3://s3bucket/secrets/personal/user.name/app/appconfig.json

After the object has been downloaded from S3 and decrypted using KMS, the value of the object will look like this:

{u'surname': u'bekker', u'name': u'ruan', u'job_title': u'systems-development-engineer'}

Requirements

We will create the following resources on AWS:

  • KMS Key
  • S3 Bucket
  • IAM User
  • IAM Policy
  • Python Dependencies: Boto3

Provision AWS Resources

First we will create our S3 bucket. Head over to Amazon S3 and create a new S3 bucket, making sure that the bucket is NOT public; by using the default configuration, you should be good.

Once your S3 bucket is provisioned, head over to Amazon IAM and create an IAM user, enable programmatic access, and keep your access key and secret key safe. For now we will not apply any permissions, as we will come back to this step.

Head over to Amazon KMS and create a KMS key. We will define the key administrator, which will be my user (ruan.bekker in this case) with more privileged permissions:

Then we will define the key usage permissions (app.user in this case), which will be the user that we provisioned in the previous step; this is the user that will encrypt and decrypt the data:

Next, review the policy generated from the previous selected sections:

Once you select finish, you will be returned to the section where your KMS key information is displayed. Keep note of your KMS key alias, as we will need it later:

Create an IAM Policy for our App User

Next we will create the IAM policy for the user that will encrypt/decrypt and store data in S3:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "S3PutAndGetAccess",
            "Effect": "Allow",
            "Action": [
                "s3:PutObject",
                "s3:GetObject"
            ],
            "Resource": "arn:aws:s3:::arn:aws:s3:::s3-bucket-name/secrets/engineering*"
        },
        {
            "Sid": "KMSDecryptAndEncryptAccess",
            "Effect": "Allow",
            "Action": [
                "kms:Decrypt",
                "kms:Encrypt"
            ],
            "Resource": "arn:aws:kms:eu-west-1:123456789012:key/xxxx-xxxx-xxxx-xxxx-xxxx"
        }
    ]
}

After the policy has been saved, associate the policy with the IAM user.

Encrypt and Put to S3

Now we will use Python to define the data that we want to store in S3. We will then encrypt the data with KMS, use base64 to encode the ciphertext, and push the encrypted value to S3 with Server-Side Encryption enabled, which will also use our KMS key.

Install boto3 in Python:

$ pip install boto3

Enter the Python REPL and import the required packages. We will also save the access key and secret key as variables so that we can use them with boto3. You can also save them to the credential provider and utilise a profile name:

>>> import boto3
>>> import json
>>> import base64
>>> aws_access_key_id='redacted'
>>> aws_secret_access_key='redacted'

Next define the data that we want to encrypt and store in S3:

>>> mydata = {
    "name": "ruan",
    "surname": "bekker",
    "job_title": "systems-development-engineer"
}

Next we will use KMS to encrypt the data and use base64 to encode the ciphertext:

>>> kms = boto3.Session(
    aws_access_key_id=aws_access_key_id,
    aws_secret_access_key=aws_secret_access_key
).client('kms')
>>> ciphertext = kms.encrypt(
    KeyId='alias/secrets-key',
    Plaintext=json.dumps(mydata)
)
>>> encoded_ciphertext = base64.b64encode(ciphertext["CiphertextBlob"])
# preview the data
>>> encoded_ciphertext
'AQICAHiKOz...42720nCleoI26UW7P89lPdwvV8Q=='

Next we will use S3 to push the encrypted data onto S3 in our name spaced key: secrets/engineering/app1/production/appconfig.json

>>> s3 = boto3.Session(
    aws_access_key_id=aws_access_key_id,
    aws_secret_access_key=aws_secret_access_key,
    region_name='eu-west-1'
).client('s3')
>>> response = s3.put_object(
    Body=encoded_ciphertext,
    Bucket='ruan-secret-store',
    Key='secrets/engineering/app1/production/appconfig.json',
    ServerSideEncryption='aws:kms',
    SSEKMSKeyId='alias/secrets-key'
)

Now our object is stored in S3, encrypted with KMS and ServerSideEncryption Enabled.

You can try to download the object and decode the base64 encoded file, and you will find that it’s complete garbage, as it’s encrypted.
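
As a quick sanity check, assuming you have CLI credentials with s3:GetObject on the bucket, something like this should print unreadable ciphertext rather than your secrets:

$ aws s3 cp s3://ruan-secret-store/secrets/engineering/app1/production/appconfig.json - | base64 --decode | head -c 64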

Next we will use S3 to get the object, decode the base64 encoded body, and then use KMS to decrypt the ciphertext:

>>> response = s3.get_object(
    Bucket='ruan-secret-store',
    Key='secrets/engineering/app1/production/appconfig.json'
)
>>> encoded_ciphertext = response['Body'].read()
>>> encoded_ciphertext
'AQICAHiKOz...42720nCleoI26UW7P89lPdwvV8Q=='

Now let’s decode the result with base64 and decrypt it with KMS:

>>> decoded_ciphertext = base64.b64decode(encoded_ciphertext)
>>> plaintext = kms.decrypt(CiphertextBlob=bytes(decoded_ciphertext))

Now we need to deserialize the JSON as it’s in string format:

>>> json.loads(plaintext["Plaintext"])
{u'surname': u'bekker', u'name': u'ruan', u'job_title': u'systems-development-engineer'}

Using it in a Application

Let’s say you are using Docker and you want to bootstrap the application config that you retrieve from S3 into your environment.

We will use a get_secrets.py Python script that reads the data into memory, decrypts it and writes the values in plaintext to disk. Then we will use the boot.sh script to read the values into the environment, remove the temp file that was written to disk, and start the application, since we now have the values stored in our environment.

Our “application” in this example will just be a line of echo to return the values for demonstration.

The get_secrets.py file:

import boto3
import json
import base64

aws_access_key_id='redacted'
aws_secret_access_key='redacted'

kms = boto3.Session(aws_access_key_id=aws_access_key_id, aws_secret_access_key=aws_secret_access_key).client('kms')
s3 = boto3.Session(aws_access_key_id=aws_access_key_id, aws_secret_access_key=aws_secret_access_key, region_name='eu-west-1').client('s3')

response = s3.get_object(Bucket='ruan-secret-store', Key='secrets/engineering/app1/production/appconfig.json')
encoded_ciphertext = response['Body'].read()

decoded_ciphertext = base64.b64decode(encoded_ciphertext)
plaintext = kms.decrypt(CiphertextBlob=bytes(decoded_ciphertext))
values = json.loads(plaintext["Plaintext"])

with open('envs.tmp', 'w') as f:
    for key in values.keys():
        f.write("{}={}".format(key.upper(), values[key]) + "\n")

And our boot.sh script:

#!/usr/bin/env bash
source ./envs.tmp
rm -rf ./envs.tmp
echo "Hello, my name is ${NAME} ${SURNAME}, and I am a ${JOB_TITLE}"

Running that will produce:

$ bash boot.sh
Hello, my name is ruan bekker, and I am a systems-development-engineer

Thank You

And there we have a simple and effective way of encrypting/decrypting data using S3, KMS and Python at a ridiculously cheap cost; it’s almost free.

If you liked my content, feel free to check out my content on ruan.dev or follow me on twitter at @ruanbekker

Making Deploying Functions Even Easier With Faas-cli Up Using OpenFaaS


I recently discovered that the faas-cli allows you to append your function’s yaml to an existing file when generating a new function. And that faas-cli up does the build, push and deploy for you.

The way I always did it:

Usually, I would go through the create, build, push, deploy flow when creating 2 functions that will be in the same stack:

$ faas-cli new --lang python3 fn-old-foo \
--prefix=ruanbekker \
--gateway https://openfaas.domain.com

$ faas-cli build -f fn-old-foo.yml && \
faas-cli push -f fn-old-foo.yml && \
faas-cli deploy -f fn-old-foo.yml

And for my other function:

$ faas-cli new --lang python3 fn-old-bar \
--prefix=ruanbekker \
--gateway https://openfaas.domain.com

$ faas-cli build -f fn-old-bar.yml && \
faas-cli push -f fn-old-bar.yml && \
faas-cli deploy -f fn-old-bar.yml

And then you are ready to invoke those functions.
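
Invoking one of them is then just a matter of piping a request body to faas-cli invoke, for example:

$ echo "test" | faas-cli invoke fn-old-foo --gateway https://openfaas.domain.com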

The newly discovered way

So recently I discovered that you can append the yaml definition of your function to an existing yaml file, and use faas-cli up to build, push and deploy your functions:

Generating the first function:

$ faas-cli new --lang python3 fn-foo \
--prefix=ruanbekker \
--gateway https://openfaas.domain.com

Stack file written: fn-foo.yml

Now that we have fn-foo.yml in our current working directory, we will append the second function to that file:

$ faas-cli new --lang python3 fn-bar \
--prefix=ruanbekker \
--gateway https://openfaas.domain.com \
--append fn-foo.yml

Stack file updated: fn-foo.yml

Now, faas-cli up expects by default that the filename is stack.yml, which we can change with -f, but to keep this as easy as possible we will rename the file to stack.yml:

$ mv fn-foo.yml stack.yml

At the moment, our stack.yml will look like this:

$ cat stack.yml
provider:
  name: openfaas
  gateway: https://openfaas.domain.com
functions:
  fn-foo:
    lang: python3
    handler: ./fn-foo
    image: ruanbekker/fn-foo:latest
  fn-bar:
    lang: python3
    handler: ./fn-bar
    image: ruanbekker/fn-bar:latest

Deploying our functions is as easy as:

$ faas-cli up
...
Deploying: fn-foo.

Deployed. 202 Accepted.
URL: https://openfaas.domain.com/function/fn-foo

Deploying: fn-bar.

Deployed. 202 Accepted.
URL: https://openfaas.domain.com/function/fn-bar

Simply amazing. OpenFaaS has done a great job in making it as simple and easy as possible to get your functions from zero to deployed in seconds.

Using OpenFaas With Amazon DynamoDB

image



You can use your OpenFaaS functions to store and retrieve data to and from a persistent layer that sits outside the OpenFaaS framework. The database that we will use in this tutorial is Amazon’s DynamoDB.

If you are not familiar with the service, Amazon’s DynamoDB is a fully managed NoSQL database service that provides fast and predictable performance with seamless scalability.

At the end of this tutorial you will be able to invoke your functions to read and write items to DynamoDB with a dedicated IAM User that is only allowed to access DynamoDB, and secrets managed by your OpenFaaS framework.

What we will be doing in this Tutorial

In this tutorial we will cover a couple of things; a summary of the to-do list is:

  • Create an OpenFaaS IAM user and a DynamoDB IAM policy, and associate the policy with the user using the AWS CLI
  • Create an AWS access key, and save the access key and secret key to file
  • Create OpenFaaS secrets from the access key and secret key, then remove the files from disk
  • Create 3 OpenFaaS functions: write, lookup and get
  • Invoke the functions to read and write from DynamoDB

Our 3 functions will do very basic operations for this demonstration, but I believe this is a good starting point.

All the examples from this blog post are available in this github repository

The Use-Case Scenario

In this scenario we want to store user information in DynamoDB, and we will use a hash that we calculate from the user’s ID number + lastname. So when we have thousands or millions of items, we don’t need to search through the entire table; since we can re-calculate the SHA hash, we can do a single GetItem operation to find the entry for the user in question.

  • Lookup Function:

The lookup function will calculate the hash from the user’s ID number and lastname that you pass in; this returns a hash, which is the primary key attribute of our table design. This hash value is required to do a GetItem on the user in question.

  • Get Function:

The get function will interface with DynamoDB. It reads the AWS access key and secret key from the secrets path to authenticate with AWS, and utilizes environment variables for the region and table name. It will do a GetItem on the DynamoDB table and retrieve the item; if the item is not found, it will return a message saying so in the response.

  • Write Function:

The write function will also interface with DynamoDB; the ID, name and payload will be included in the request body of our POST request.

Note on Secrets and Environment Variables

I am treating my environment variables and secrets differently from each other. The secrets, such as my AWS access keys, are stored on the cluster, and the application reads them and stores the values in memory.

Non-secret information, such as my DynamoDB table name and AWS region, is defined in environment variables.

This post and this post go a bit more into detail on why you should not use environment variables for secret data, which I found from this link

Enough info, let’s get to the fun stuff

Pre-Requirements:

You need an AWS account (or you can use dynamodb-local), OpenFaaS and faas-cli. Documentation is available at https://docs.openfaas.com/contributing/get-started/

Provision a DynamoDB Table

I have an admin IAM account configured on my default profile. Using the aws-cli tools, generate the CLI skeleton that is required to provision a DynamoDB table:

$ aws dynamodb create-table --generate-cli-skeleton > ddb.json

My table name will be lookup_table with the primary key hash_value, and I provisioned my throughput at 1 read and 1 write capacity unit, which gives us 4KB/s for reads and 1KB/s for writes.

For demonstration purposes, I am sharing my altered ddb.json file:

{
    "AttributeDefinitions": [
        {
            "AttributeName": "hash_value",
            "AttributeType": "S"
        }
    ],
    "TableName": "lookup_table",
    "KeySchema": [
        {
            "AttributeName": "hash_value",
            "KeyType": "HASH"
        }
    ],
    "ProvisionedThroughput": {
        "ReadCapacityUnits": 1,
        "WriteCapacityUnits": 1
    },
    "Tags": [
        {
            "Key": "Name",
            "Value": "lookup-table"
        }
    ]
}

Now that we have the file saved, create the dynamodb table:

$ aws dynamodb create-table --cli-input-json file://ddb.json

List the tables:

$ aws dynamodb list-tables
{
    "TableNames": [
        "lookup_table"
    ]
}

Check if the table is provisioned:

$ aws dynamodb describe-table --table-name lookup_table | jq -r '.Table.TableStatus'
ACTIVE

Getting the ARN string, as we will need it when we create our IAM Policy:

$ aws dynamodb describe-table --table-name lookup_table | jq -r '.Table.TableArn'
arn:aws:dynamodb:eu-west-1:x-x:table/lookup_table

Create the OpenFaaS IAM User

Create the IAM Policy document which defines the access that we want to grant. You can see that we are only allowing Put and GetItem on the provisioned DynamoDB resource:

$ cat dynamodb-iam-policy.json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "OpenFaasFunctionAceessForDynamoDB",
            "Effect": "Allow",
            "Action": [
                "dynamodb:PutItem",
                "dynamodb:GetItem"
            ],
            "Resource": "arn:aws:dynamodb:eu-west-1:x-accountid-x:table/lookup_table"
        }
    ]
}

Create the IAM Policy and provide the policy document for the given policy name:

$ aws iam create-policy --policy-name openfaas-dynamodb-access --policy-document file://dynamodb-iam-policy.json
{
    "Policy": {
        "PolicyName": "openfaas-dynamodb-access",
        "PolicyId": "ANPATPRT2G4SL4K63SUWQ",
        "Arn": "arn:aws:iam::x-accountid-x:policy/openfaas-dynamodb-access",
        "Path": "/",
        "DefaultVersionId": "v1",
        "AttachmentCount": 0,
        "PermissionsBoundaryUsageCount": 0,
        "IsAttachable": true,
        "CreateDate": "2019-07-06T11:54:26Z",
        "UpdateDate": "2019-07-06T11:54:26Z"
    }
}

Create the IAM User that will be used to authenticate requests against DynamoDB:

$ aws iam create-user --user-name openfaas-user
{
    "User": {
        "Path": "/",
        "UserName": "openfaas-user",
        "UserId": "AIDATPRT2G4SIRYTNHLZK",
        "Arn": "arn:aws:iam::x-accountid-x:user/openfaas-user",
        "CreateDate": "2019-07-06T11:56:53Z"
    }
}

Create the access key, which will be the API keys that our application uses to authenticate requests. Save the AccessKeyId and SecretAccessKey temporarily to 2 separate files, which we will delete after we have created our secrets on our cluster:

$ aws iam create-access-key --user-name openfaas-user
{
    "AccessKey": {
        "UserName": "openfaas-user",
        "AccessKeyId": "AKIAT..redacted.x",
        "Status": "Active",
        "SecretAccessKey": "b..redacted.x",
        "CreateDate": "2019-07-06T11:57:37Z"
    }
}

Associate the IAM Policy to the IAM User:

$ aws iam attach-user-policy --user-name openfaas-user --policy-arn arn:aws:iam::x-x:policy/openfaas-dynamodb-access

To test if the access keys work, save them to a new profile using the aws-cli tools:

$ aws configure --profile openfaas
AWS Access Key ID [None]: AKIAT..
AWS Secret Access Key [None]: b..x
Default region name [None]: eu-west-1
Default output format [None]: json

Write an Item to DynamoDB:

$ aws --profile openfaas dynamodb put-item \
--table-name lookup_table \
--item '{"hash_value": {"S": "aGVsbG8td29ybGQK"}, "message": {"S": "hello-world"}}'

Read the Item from DynamoDB:

$ aws --profile openfaas dynamodb get-item \
--table-name lookup_table \
--key '{"hash_value": {"S": "aGVsbG8td29ybGQK"}}'
{
    "Item": {
        "hash_value": {
            "S": "aGVsbG8td29ybGQK"
        },
        "message": {
            "S": "hello-world"
        }
    }
}

We can now confirm our permissions are in place to continue.

Create OpenFaaS Secrets

The AccessKeyId and SecretKey have been saved to disk, and we will use those files to create secrets from:

$ faas-cli secret create openfaas-aws-access-key --from-file=openfaas_aws_access_key.txt
Creating secret: openfaas-aws-access-key
Created: 201 Created
$ faas-cli secret create openfaas-aws-secret-key --from-file=openfaas_aws_secret_key.txt
Creating secret: openfaas-aws-secret-key
Created: 201 Created

Now that the secrets are securely stored in our cluster, we can delete the temporary files:

$ rm -f ./openfaas_aws_*_key.txt

Login to OpenFaaS

Log in to OpenFaaS using faas-cli:

$ faas-cli login \
--gateway https://openfaas.domain.com \
--username ${OPENFAAS_USER} \
--password ${OPENFAAS_PASSWORD}

Export the OPENFAAS_URL:

$ export OPENFAAS_URL=https://openfaas.domain.com
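
At this point you can also confirm that the secrets we created earlier are registered with the gateway:

$ faas-cli secret list --gateway https://openfaas.domain.com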

One Stack File for All 3 Functions:

We will create our first function to generate the YAML definition, then rename the generated file to stack.yml. For the next 2 functions we will use the append flag to append the function’s YAML to our stack.yml file, so that we can simply use faas-cli up.

Create the Lookup Function:

Create a Python3 Function, and prefix it with your dockerhub user:

$ faas-cli new \
--lang python3 fn-dynamodb-lookup \
--prefix=ruanbekker \
--gateway https://openfaas.domain.com

Function created in folder: fn-dynamodb-lookup
Stack file written: fn-dynamodb-lookup.yml

As we will be using one stack file, rename the generated stack file:

$ mv fn-dynamodb-lookup.yml stack.yml

Open the stack file and set the environment variables:

$ cat stack.yml
provider:
  name: openfaas
  gateway: https://openfaas.domain.com
functions:
  fn-dynamodb-lookup:
    lang: python3
    handler: ./fn-dynamodb-lookup
    image: ruanbekker/fn-dynamodb-lookup:latest
    environment:
      dynamodb_region: eu-west-1
      dynamodb_table: lookup_table

The python code for our function:

$ cat fn-dynamodb-lookup/handler.py
import json
import hashlib

def calc_sha(id_number, lastname):
    string = json.dumps({"id": id_number, "lastname": lastname}, sort_keys=True)
    hash_value = hashlib.sha1(string.encode("utf-8")).hexdigest()
    return hash_value

def handle(req):
    event = json.loads(req)
    hash_value = calc_sha(event['id'], event['lastname'])
    return hash_value

Create the Write Function:

Create a Python3 Function, and prefix it with your dockerhub user, and use the append flag to update our stack file:

$ faas-cli new \
--lang python3 fn-dynamodb-write \
--prefix=ruanbekker \
--gateway https://openfaas.domain.com \
--append stack.yml

Function created in folder: fn-dynamodb-write
Stack file updated: stack.yml

Open the stack file, set the environment variables, and include the secrets that were created:

$ cat stack.yml
provider:
  name: openfaas
  gateway: https://openfaas.domain.com
functions:
  fn-dynamodb-lookup:
  # ...
  fn-dynamodb-write:
    lang: python3
    handler: ./fn-dynamodb-write
    image: ruanbekker/fn-dynamodb-write:latest
    environment:
      dynamodb_region: eu-west-1
      dynamodb_table: lookup_table
    secrets:
      - openfaas-aws-access-key
      - openfaas-aws-secret-key

Our function relies on an external dependency, which we need to install to interact with AWS:

$ cat fn-dynamodb-write/requirements.txt
boto3

Our python code for our function:

$ cat fn-dynamodb-write/handler.py
import boto3
import os
import json
import hashlib
import datetime

aws_key = open('/var/openfaas/secrets/openfaas-aws-access-key', 'r').read()
aws_secret = open('/var/openfaas/secrets/openfaas-aws-secret-key', 'r').read()
dynamodb_region = os.environ['dynamodb_region']
dynamodb_table  = os.environ['dynamodb_table']

client = boto3.Session(region_name=dynamodb_region).resource('dynamodb', aws_access_key_id=aws_key, aws_secret_access_key=aws_secret)
table = client.Table(dynamodb_table)

def calc_sha(id_number, lastname):
    string = json.dumps({"id": id_number, "lastname": lastname}, sort_keys=True)
    hash_value = hashlib.sha1(string.encode("utf-8")).hexdigest()
    return hash_value

def create_timestamp():
    response = datetime.datetime.now().strftime("%Y-%m-%dT%H:%M")
    return response

def handle(req):
    event = json.loads(req)
    unique_id = calc_sha(event['id'], event['lastname'])
    response = table.put_item(
        Item={
            'hash_value': unique_id,
            'timestamp': create_timestamp(),
            'payload': event['payload']
        }
    )
    return response

Create the Get Function:

Create a Python3 Function, and prefix it with your dockerhub user, and use the append flag to specify the stack file:

$ faas-cli new \
--lang python3 fn-dynamodb-get \
--prefix=ruanbekker \
--gateway https://openfaas.domain.com \
--append stack.yml

Function created in folder: fn-dynamodb-get
Stack file updated: stack.yml

Open the stack file, set the environment variables, and include the secrets that were created:

$ cat stack.yml
provider:
  name: openfaas
  gateway: https://openfaas.domain.com
functions:
  fn-dynamodb-lookup:
  # .. 
  fn-dynamodb-write:
  # ..
  fn-dynamodb-get:
    lang: python3
    handler: ./fn-dynamodb-get
    image: ruanbekker/fn-dynamodb-get:latest
    environment:
      dynamodb_region: eu-west-1
      dynamodb_table: lookup_table
    secrets:
      - openfaas-aws-access-key
      - openfaas-aws-secret-key

Include the external dependency for aws:

$ cat fn-dynamodb-get/requirements.txt
boto3

Our python code for our function:

$ cat fn-dynamodb-get/handler.py
import boto3
import os
import json

aws_key = open('/var/openfaas/secrets/openfaas-aws-access-key', 'r').read()
aws_secret = open('/var/openfaas/secrets/openfaas-aws-secret-key', 'r').read()
dynamodb_region = os.environ['dynamodb_region']
dynamodb_table  = os.environ['dynamodb_table']

client = boto3.Session(region_name=dynamodb_region).resource('dynamodb', aws_access_key_id=aws_key, aws_secret_access_key=aws_secret)
table = client.Table(dynamodb_table)

def handle(req):
    event = json.loads(req)
    response = table.get_item(
        Key={
            'hash_value': event['hash_value']
        }
    )

    if 'Item' not in response:
        item_data = 'Item not found'
    else:
        item_data = response['Item']

    return item_data

Build, Push and Deploy:

It’s time to deploy our functions and since we have all our stack info in one file, we can use faas-cli up which will build, push and deploy our functions.

By default it expects the filename to be stack.yml therefore we don’t need to specify the filename, but if you had a different filename, you can overwrite the default behaviour with -f:

$ faas-cli up

Deploying: fn-dynamodb-lookup.
Deployed. 202 Accepted.
URL: https://openfaas.domain.com/function/fn-dynamodb-lookup

Deploying: fn-dynamodb-write.
Deployed. 202 Accepted.
URL: https://openfaas.domain.com/function/fn-dynamodb-write

Deploying: fn-dynamodb-get.
Deployed. 202 Accepted.
URL: https://openfaas.domain.com/function/fn-dynamodb-get

Time for our Functions to interact with DynamoDB:

Write an Item to DynamoDB:

$ curl -XPOST https://openfaas.domain.com/function/fn-dynamodb-write -d '{"id": 8700000000001, "lastname": "smith", "payload": {"name": "james", "role": "reader"}}'
{'ResponseMetadata': {'RequestId': 'CNHEFHMSL4KGRDE0HRVQ69D5H7VV4KQNSO5AEMVJF66Q9ASUAAJG', 'HTTPStatusCode': 200, 'HTTPHeaders': {'server': 'Server', 'date': 'Sat, 06 Jul 2019 20:47:00 GMT', 'content-type': 'application/x-amz-json-1.0', 'content-length': '2', 'connection': 'keep-alive', 'x-amzn-requestid': 'CNHEFHMSL4KGRDE0HRVQ69D5H7VV4KQNSO5AEMVJF66Q9ASUAAJG', 'x-amz-crc32': '2745614147'}, 'RetryAttempts': 0}}

Write another Item to DynamoDB:

$ curl -XPOST https://openfaas.domain.com/function/fn-dynamodb-write -d '{"id": 8700000000002, "lastname": "adams", "payload": {"name": "samantha", "role": "admin"}}'
{'ResponseMetadata': {'RequestId': 'KRQL838BVGC9LIUSCOUB7MOEQ7VV4KQNSO5AEMVJF66Q9ASUAAJG', 'HTTPStatusCode': 200, 'HTTPHeaders': {'server': 'Server', 'date': 'Sat, 06 Jul 2019 20:48:09 GMT', 'content-type': 'application/x-amz-json-1.0', 'content-length': '2', 'connection': 'keep-alive', 'x-amzn-requestid': 'KRQL838BVGC9LIUSCOUB7MOEQ7VV4KQNSO5AEMVJF66Q9ASUAAJG', 'x-amz-crc32': '2745614147'}, 'RetryAttempts': 0}}

Now recalculate the hash by passing the ID Number and Lastname to get the hash value for the primary key:

$ curl -XPOST https://openfaas.domain.com/function/fn-dynamodb-lookup -d '{"id": 8700000000002, "lastname": "adams"}'
bd0a248aff2b50b288ba504bd7142ef11b164901

Now that we have the hash value, do a GetItem by using the hash value in the request body:

$ curl -XPOST https://openfaas.domain.com/function/fn-dynamodb-get -d '{"hash_value": "bd0a248aff2b50b288ba504bd7142ef11b164901"}'
{'payload': {'name': 'samantha', 'role': 'admin'}, 'hash_value': 'bd0a248aff2b50b288ba504bd7142ef11b164901', 'timestamp': '2019-07-06T20:48'}

Note that the lookup function calculates a hash based on the input that you provide it, for example calculating a hash with userdata that does not exist in our table:

$ curl -XPOST https://openfaas.domain.com/function/fn-dynamodb-lookup -d '{"id": 8700000000003, "lastname": "williams"}'
c68dc272873140f4ae93bb3a3317772a6bdd9aa1

Using that hash value in our request body to read from DynamoDB will show us that the item was not found:

$ curl -XPOST https://openfaas.domain.com/function/fn-dynamodb-get -d '{"hash_value": "c68dc272873140f4ae93bb3a3317772a6bdd9aa1"}'
Item not found

You might want to change this behavior but this is just for the demonstration of this post.

When you head over to DynamoDB’s console you will see this in your table:

image

Thanks

This was a basic example of using OpenFaaS with Amazon DynamoDB with Python, and secrets managed with OpenFaaS. I really like the way OpenFaaS lets you work with secrets; it works great and you don’t need an additional resource to manage your sensitive data.

Although this was basic usage with OpenFaaS and DynamoDB, the sky is the limit what you can do with it.

Resources:

Play With Kinesis Data Streams for Free

image

Misleading title?? Perhaps, it depends on how you look at it. Amazon Kinesis is a fully managed, cloud-based service for real-time processing of distributed data streams. So if you’re a curious mad person like me, you want to test stuff out, and when you can test it out for free, why not.

So before paying for that, why not spin something up locally, such as Kinesalite, which is an implementation of Amazon Kinesis built on top of LevelDB.

Kinesis overview:

image

What will we be doing?

In this tutorial we will setup a local kinesis instance using docker then do the following:

  • Create a Kinesis Stream, List, Describe, PutRecord, GetRecords using Python’s Boto3 Interface
  • Write a Python Producer and Consumer
  • Write and Read Records from our Local Kinesis Stream

Building Kinesis Local on Docker

If you would like to skip this step, you can use my docker image: ruanbekker/kinesis-local:latest

Our Dockerfile:

FROM node:8.16.0-stretch-slim

RUN apt update && apt install build-essential python-minimal -y
RUN npm install --unsafe-perm -g kinesalite
RUN apt-get clean

CMD ["kinesalite", "--port", "4567", "--createStreaMs", "5"]

Build:

$ docker build -t kinesis-local .

Run and expose port 4567:

$ docker run -it -p 4567:4567 kinesis-local:latest
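
If you have the aws-cli installed, a quick way to check that the local endpoint responds is to point it at the container; Kinesalite accepts any dummy credentials:

$ AWS_ACCESS_KEY_ID=x AWS_SECRET_ACCESS_KEY=x aws kinesis list-streams \
  --endpoint-url http://localhost:4567 --region eu-west-1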

Interact with Kinesis Local:

In the next steps we will set up our environment, which will only require Python and boto3. To keep things isolated, I will do this in a Docker container:

$ docker run -it python:3.7-alpine sh

Now we need to install boto3 and enter the python repl:

$ pip3 install boto3
$ python3
Python 3.7.3 (default, May 11 2019, 02:00:41)
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>>

Import boto3 (and json, which we will need shortly) and create the connection to our local Kinesis instance:

>>> import boto3
>>> import json
>>> client = boto3.Session(
    region_name='eu-west-1').client('kinesis', aws_access_key_id='', aws_secret_access_key='', endpoint_url='http://localhost:4567'
)

Let’s list our streams and as expected, we should have zero streams available:

>>> client.list_streams()
{u'StreamNames': [], u'HasMoreStreams': False, 'ResponseMetadata': {'RetryAttempts': 0, 'HTTPStatusCode': 200, 'RequestId': '637xx', 'HTTPHeaders': {'x-amzn-requestid': '6xx', 'content-length': '41', 'x-amz-id-2': 'xx', 'connection': 'keep-alive', 'date': 'Sat, 22 Jun 2019 19:17:34 GMT', 'content-type': 'application/x-amz-json-1.1'}}}

Let’s create a stream named mystream with 1 primary shard:

>>> client.create_stream(StreamName='mystream', ShardCount=1)

Let’s list our streams again:

>>> client.list_streams()
{u'StreamNames': [u'mystream'], u'HasMoreStreams': False, 'ResponseMetadata': ...

Let’s put some data in our Kinesis stream. We will push a payload with the body {"name": "ruan"} to our stream, with partition key a01, which is used for sharding:

>>> response = client.put_record(StreamName='mystream', Data=json.dumps({"name": "ruan"}), PartitionKey='a01')
>>> response
{u'ShardId': u'shardId-000000000000', 'ResponseMetadata': {'RetryAttempts': 0, 'HTTPStatusCode': 200, 'RequestId': 'cb0xx', 'HTTPHeaders': {'x-amzn-requestid': 'xx', 'content-length': '110', 'x-amz-id-2': 'xx', 'connection': 'keep-alive', 'date': 'Sat, 22 Jun 2019 19:20:27 GMT', 'content-type': 'application/x-amz-json-1.1'}}, u'SequenceNumber': u'490xx'}

Now that we have data in our stream, we need to read it back. Before data can be read from the stream, we need to obtain the shard iterator for the shard we are interested in. A shard iterator represents the position in the stream and shard from which the consumer will read; in this case we will call the get_shard_iterator method, passing the stream name, shard id and shard iterator type.

There are 2 common iterator types:

  • TRIM_HORIZON: Points to the last untrimmed record in the shard
  • LATEST: Reads the most recent data in the shard

We will use TRIM_HORIZON in this case, get the shard iterator id:

>>> shard_id = response['ShardId']
>>> response = client.get_shard_iterator(StreamName='mystream', ShardId=shard_id, ShardIteratorType='TRIM_HORIZON')
>>> response
{u'ShardIterator': u'AAAxx=', 'ResponseMetadata': {'RetryAttempts': 0, 'HTTPStatusCode': 200, 'RequestId': '22dxx', 'HTTPHeaders': {'x-amzn-requestid': '22dxx', 'content-length': '224', 'x-amz-id-2': 'xx', 'connection': 'keep-alive', 'date': 'Sat, 22 Jun 2019 19:22:55 GMT', 'content-type': 'application/x-amz-json-1.1'}}}

Now that we have the shard iterator id, we can call the get_records method with the shard iterator id, to read the data from the stream:

>>> shard_iterator = response['ShardIterator']
>>> response = client.get_records(ShardIterator=shard_iterator)
>>> response
{u'Records': [{u'Data': '{"name": "ruan"}', u'PartitionKey': u'a01', u'ApproximateArrivalTimestamp': datetime.datetime(2019, 6, 22, 21, 20, 27, 937000, tzinfo=tzlocal()), u'SequenceNumber': u'495xx'}], 'ResponseMetadata': {'RetryAttempts': 0, 'HTTPStatusCode': 200, 'RequestId': '2b6xx', 'HTTPHeaders': {'x-amzn-requestid': '2b6xx', 'content-length': '441', 'x-amz-id-2': 'xx', 'connection': 'keep-alive', 'date': 'Sat, 22 Jun 2019 19:30:19 GMT', 'content-type': 'application/x-amz-json-1.1'}}, u'NextShardIterator': u'AAAxx=', u'MillisBehindLatest': 0}

To loop and parse through the response to make it more readable:

>>> for record in response['Records']:
...     if 'Data' in record:
...         json.loads(record['Data'])
...
{u'name': u'ruan'}

Once we are done, we can delete our stream:

>>> client.delete_stream(StreamName='mystream')

Now that we have the basics, let’s create our producer and consumer to demonstrate pushing data to a Kinesis stream from one process and consuming it from another process. For this demonstration we will be producing and consuming data from the same laptop; in real use-cases you would do this from separate servers, using Amazon Kinesis.

Our Kinesis Producer

The following will create a Kinesis Local Stream and Write 25 JSON Documents to our stream:

import boto3
import random
import json
import time

names = ['james', 'stefan', 'pete', 'tom', 'frank', 'peter', 'ruan']

session = boto3.Session(region_name='eu-west-1')
client = session.client(
    'kinesis',
    aws_access_key_id='',
    aws_secret_access_key='',
    endpoint_url='http://localhost:4567'
)

list_streams = client.list_streams()

if 'mystream' not in list_streams['StreamNames']:
    client.create_stream(StreamName='mystream', ShardCount=1)
    time.sleep(1)

count = 0
print("Starting at {}".format(time.strftime("%H:%m:%S")))

while count != 25:
    count += 1
    response = client.put_record(
        StreamName='mystream',
        Data=json.dumps({
            "number": count,
            "name": random.choice(names),
            "age": random.randint(20,50)}
        ),
        PartitionKey='a01'
    )
    time.sleep(1)

print("Finished at {}".format(time.strftime("%H:%m:%S")))

Our Kinesis Local Consumer:

This will read 5 records at a time from our stream. You will notice that if you run them at the same time, the consumer will only read one record at a time, as the producer only writes one per second.

import boto3
import json
import time
import os

session = boto3.Session(region_name='eu-west-1')
client = session.client(
    'kinesis',
    aws_access_key_id='',
    aws_secret_access_key='',
    endpoint_url='http://localhost:4567'
)

stream_details = client.describe_stream(StreamName='mystream')
shard_id = stream_details['StreamDescription']['Shards'][0]['ShardId']

response = client.get_shard_iterator(
    StreamName='mystream',
    ShardId=shard_id,
    ShardIteratorType='TRIM_HORIZON'
)

shard_iterator = response['ShardIterator']

while True:
    response = client.get_records(ShardIterator=shard_iterator, Limit=5)
    shard_iterator = response['NextShardIterator']
    for record in response['Records']:
        if 'Data' in record and len(record['Data']) > 0:
            print(json.loads(record['Data']))
    time.sleep(0.75)

Demo Time!

Now that we have our producer.py and consumer.py, let’s test this out.

Start the server:

$ docker run -it -p 4567:4567 ruanbekker/kinesis-local:latest
Listening at http://:::4567

Run the Producer from your Python Environment:

$ python producer.py
Starting at 00:06:16
Finished at 00:06:42

Run the Consumer from your Python Environment:

$ python consumer.py
Starting Consuming at 00:06:31
{u'age': 30, u'number': 1, u'name': u'pete'}
{u'age': 23, u'number': 2, u'name': u'ruan'}
{u'age': 22, u'number': 3, u'name': u'peter'}
{u'age': 45, u'number': 4, u'name': u'stefan'}
{u'age': 49, u'number': 5, u'name': u'tom'}
{u'age': 47, u'number': 6, u'name': u'pete'}
{u'age': 35, u'number': 7, u'name': u'stefan'}
{u'age': 45, u'number': 8, u'name': u'ruan'}
{u'age': 38, u'number': 9, u'name': u'frank'}
{u'age': 20, u'number': 10, u'name': u'tom'}
{u'age': 38, u'number': 11, u'name': u'james'}
{u'age': 20, u'number': 12, u'name': u'james'}
{u'age': 38, u'number': 13, u'name': u'tom'}
{u'age': 25, u'number': 14, u'name': u'tom'}
{u'age': 20, u'number': 15, u'name': u'peter'}
{u'age': 50, u'number': 16, u'name': u'james'}
{u'age': 29, u'number': 17, u'name': u'james'}
{u'age': 42, u'number': 18, u'name': u'pete'}
{u'age': 25, u'number': 19, u'name': u'pete'}
{u'age': 36, u'number': 20, u'name': u'tom'}
{u'age': 45, u'number': 21, u'name': u'peter'}
{u'age': 39, u'number': 22, u'name': u'ruan'}
{u'age': 43, u'number': 23, u'name': u'tom'}
{u'age': 38, u'number': 24, u'name': u'pete'}
{u'age': 40, u'number': 25, u'name': u'frank'}
Finished Consuming at 00:06:35

Thanks


Hope that was useful. Feel free to check out Amazon Kinesis if you are planning to run this in any non-testing environment.
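
As a rough sketch of what that change would look like: against the real Kinesis service you would drop the endpoint_url and the empty credentials, and let boto3 pick them up from its default credential chain (the stream name mystream is just the one used in this demo):

import boto3

# against real AWS Kinesis: no endpoint_url, credentials come from the
# default chain (environment variables, shared credentials file, IAM role)
session = boto3.Session(region_name='eu-west-1')
client = session.client('kinesis')

response = client.describe_stream(StreamName='mystream')
print(response['StreamDescription']['StreamStatus'])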

Setup Traefik as an Ingress Controller on Kubernetes

image

If you have not provisioned a Kubernetes cluster yet, you can see this tutorial on how to provision a Kubernetes cluster on Scaleway.

What will we be doing

In this tutorial we will set up Traefik as an Ingress Controller on Kubernetes and deploy a logos web app to our Kubernetes cluster, using frontend rules to map subdomains to specific services.

We will have 3 subdomains, each mapped to containers from the following Docker images:

FQDN                     Image Name
- python.domain.com   -> ruanbekker/logos:python
- openfaas.domain.com -> ruanbekker/logos:openfaas
- rancher.domain.com  -> ruanbekker/logos:rancher

Get the sources

If you would like to get the source code for this demonstration you can checkout this repository: https://github.com/ruanbekker/traefik-kubernetes-scaleway-demo

$ git clone https://github.com/ruanbekker/traefik-kubernetes-scaleway-demo
$ cd traefik-kubernetes-scaleway-demo

Provision Traefik as an Ingress Controller

Apply role-based access control to authorize Traefik to use the Kubernetes API:

$ kubectl apply -f traefik/01-traefik-rbac.yaml
clusterrole.rbac.authorization.k8s.io/traefik-ingress-controller created
clusterrolebinding.rbac.authorization.k8s.io/traefik-ingress-controller created

According to Traefik’s documentation, you can deploy Traefik using either a Deployment or a DaemonSet, but not both. Their documentation has more details on why.

I will go ahead and apply the DaemonSet:

$ kubectl apply -f traefik/03-traefik-ds.yaml
serviceaccount/traefik-ingress-controller created
daemonset.extensions/traefik-ingress-controller created
service/traefik-ingress-service created

The Traefik UI service will be associated with an FQDN, so remember to set the FQDN for the endpoint in the manifest, for example:

$ cat traefik/04-traefik-ui.yaml
...
spec:
  rules:
  - host: traefik-ui.x-x-x-x-x.nodes.k8s.fr-par.scw.cloud
    http:
      paths:
      - path: /
...

Create the Traefik UI Service:

$ kubectl apply -f traefik/04-traefik-ui.yaml
service/traefik-web-ui created

Create the Traefik UI ingress:

$ kubectl apply -f traefik/05-traefik-ui-ingress.yaml
ingress.extensions/traefik-web-ui created

View the services:

$ kubectl get services --namespace=kube-system
NAME                      TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                  AGE
coredns                   ClusterIP   x.x.x.x         <none>        53/UDP,53/TCP,9153/TCP   11h
heapster                  ClusterIP   x.x.x.x         <none>        80/TCP                   11h
kubernetes-dashboard      ClusterIP   x.x.x.x         <none>        443/TCP                  11h
metrics-server            ClusterIP   x.x.x.x         <none>        443/TCP                  11h
monitoring-influxdb       ClusterIP   x.x.x.x         <none>        8086/TCP                 11h
traefik-ingress-service   ClusterIP   x.x.x.x         <none>        80/TCP,8080/TCP          24m
traefik-web-ui            ClusterIP   x.x.x.x         <none>        80/TCP                   24m

Deploy the Logo App to the Cluster

We will deploy the logo app to our cluster, starting with the services:

$ kubectl apply -f logos-app/logos-services.yaml
service/openfaas created
service/rancher created
service/python created

Create the deployments:

$ kubectl apply -f logos-app/logos-deployments.yaml
deployment.extensions/openfaas created
deployment.extensions/rancher created
deployment.extensions/python created

Before creating the ingress for the logo applications, we need to set the FQDN endpoints that we want to route traffic to, for example:

$ cat logos-app/logos-ingress.yaml
...
spec:
  rules:
  - host: openfaas.x-x-x-x-x.nodes.k8s.fr-par.scw.cloud
    http:
      paths:
      - path: /
        backend:
          serviceName: openfaas
          servicePort: http
...

Create the ingress:

$ kubectl apply -f logos-app/logos-ingress.yaml
ingress.extensions/logo created

After some time, have a look at the pods to get the status:

$ kubectl get pods
NAME                                     READY   STATUS    RESTARTS   AGE
openfaas-cffdddc4-lvn5w                  1/1     Running   0          4m6s
openfaas-cffdddc4-wbcl6                  1/1     Running   0          4m6s
python-65ccf9c74b-8kmgp                  1/1     Running   0          4m6s
python-65ccf9c74b-dgnqb                  1/1     Running   0          4m6s
rancher-597b6b8554-mgcjr                 1/1     Running   0          4m6s
rancher-597b6b8554-mpk62                 1/1     Running   0          4m6s

Navigating with Kubectl

Show nodes:

$ kubectl get nodes
NAME                                             STATUS   ROLES    AGE   VERSION
scw-k8s-mystifying-torvald-jovial-mclar-25a942   Ready    node     20h   v1.14.1
scw-k8s-mystifying-torvald-jovial-mclar-eaf1a2   Ready    node     20h   v1.14.1
scw-k8s-mystifying-torvalds-default-7f263aabab   Ready    master   20h   v1.14.1

Show services:

$ kubectl get services
NAME                    TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)           AGE
kubernetes              ClusterIP   10.32.0.1      <none>        443/TCP           20h
openfaas                ClusterIP   10.41.47.185   <none>        80/TCP            9h
python                  ClusterIP   10.42.56.141   <none>        80/TCP            9h
rancher                 ClusterIP   10.32.41.218   <none>        80/TCP            9h

Show Pods:

To see pods from the kube-system namespace, add -n kube-system to the command.

$ kubectl get pods
NAME                                     READY   STATUS    RESTARTS   AGE
openfaas-cffdddc4-lvn5w                  1/1     Running   0          9h
openfaas-cffdddc4-wbcl6                  1/1     Running   0          9h
python-65ccf9c74b-8kmgp                  1/1     Running   0          9h
python-65ccf9c74b-dgnqb                  1/1     Running   0          9h
rancher-597b6b8554-mgcjr                 1/1     Running   0          9h
rancher-597b6b8554-mpk62                 1/1     Running   0          9h

Show deployments:

$ kubectl get deployments -o wide
NAME                    READY   UP-TO-DATE   AVAILABLE   AGE   CONTAINERS  IMAGES                      SELECTOR
openfaas                2/2     2            2           9h    logo        ruanbekker/logos:openfaas   app=logo,task=openfaas
python                  2/2     2            2           9h    logo        ruanbekker/logos:python     app=logo,task=python
rancher                 2/2     2            2           9h    logo        ruanbekker/logos:rancher    app=logo,task=rancher

Show ingress:

$ kubectl get ingress -o wide
NAME      HOSTS                                                          ADDRESS   PORTS   AGE
logo      openfaas.domain.com,rancher.domain.com,python.domain.com       80      9h

Show system ingress:

$ kubectl get ingress -o wide -n kube-system
NAME             HOSTS                     ADDRESS   PORTS   AGE
traefik-web-ui   traefik-ui.domain.com               80      9h

Access your Applications

Access the Traefik UI and filter for one of the applications. Let’s take OpenFaaS as an example:

image

Access the OpenFaaS Page via the URL:

image
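
If DNS for the subdomains is not set up yet, you can still verify the Host-based routing by sending a request to Traefik with an explicit Host header. Here is a minimal sketch in Python, assuming a placeholder NODE_IP for one of your nodes where Traefik listens on port 80, and the openfaas FQDN from the ingress example above:

import requests

# placeholder: replace with one of your node IPs where Traefik is listening
NODE_IP = 'x.x.x.x'

# Traefik matches on the Host header, so we can test the frontend rule
# without DNS by setting the header explicitly
response = requests.get(
    'http://{}/'.format(NODE_IP),
    headers={'Host': 'openfaas.x-x-x-x-x.nodes.k8s.fr-par.scw.cloud'}
)
print(response.status_code)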

Resources