Ruan Bekker's Blog

From a Curious mind to Posts on Github

Concourse Pipeline With Resources Tutorial

In Concourse, resources refer to external resource types such as s3, git, etc.

For example, we can run a pipeline which pulls data from GitHub by cloning a repository; in other words, the data that was cloned from the GitHub repository is available inside the container where your tasks will be executed.

Concourse GitHub Resource Example

In this tutorial we will use the git resource type in conjunction with a task that executes a script stored in the GitHub repository.

Our pipeline as pipeline.yml:

resources:
- name: concourse-tutorial
  type: git
  source:
    uri: https://github.com/ruanbekker/concourse-tutorial.git
    branch: master

jobs:
- name: job-hello-world
  public: true
  plan:
  - get: concourse-tutorial
  - task: hello-world
    file: concourse-tutorial/00-basic-tasks/task_hello_world.yml

You can head over to the hello-world task on GitHub to see the task, but all it does is run uname -a.

So our job has a task that calls the action defined in task_hello_world.yml, which it retrieves from the get step: the concourse-tutorial resource, defined under the resources section as a git resource type.

Set the pipeline:

$ fly -t ci sp -c pipeline.yml -p 04-hello-world

apply configuration? [yN]: y
pipeline created!

Unpause the pipeline:

$ fly -t ci up -p 04-hello-world
unpaused '04-hello-world'

Trigger the job (automatic triggering is off by default):

$ fly -t ci tj -j 04-hello-world/job-hello-world --watch
started 04-hello-world/job-hello-world #4

initializing
running uname -a
Linux 6a91b808-c488-4e3c-7b51-404f73405c31 4.9.0-8-amd64 #1 SMP Debian 4.9.110-3+deb9u4 (2018-08-21) x86_64 GNU/Linux
succeeded

So this job cloned the GitHub repository and called the task file, which calls the shell script from the GitHub repository to run uname -a.

For my other content on concourse, have a look at the concourse category.

Thank You

Please feel free to show support by sharing this post, making a donation, or subscribing, or reach out to me if you would like me to demo and write up any specific tech topic.


How to Cache Data With Python Flask

If you depend on an external source to return static data, you can implement cachetools to cache the data, preventing the overhead of fetching it every time a request is made to Flask.

This is useful when your upstream data does not change often. The cache is configurable with maxsize and ttl, so whenever either threshold is reached, the application will fetch fresh data on the next request.

Example

Let’s build a basic flask application that will return the data from our data.txt file to the client:

from flask import Flask
from cachetools import cached, TTLCache

app = Flask(__name__)
# cache up to 100 items, each entry expiring 60 seconds after it was stored
cache = TTLCache(maxsize=100, ttl=60)

@cached(cache)
def read_data():
    # the file is only read again once the cached entry has expired
    with open('data.txt', 'r') as f:
        data = f.read()
    return data

@app.route('/')
def main():
    get_data = read_data()
    return get_data

if __name__ == '__main__':
    app.run()

Create the local file with some data:

$ touch data.txt
$ echo "version1" > data.txt

Start the server:

$ python app.py

Make the request:

$ curl http://localhost:5000/
version1

Change the data inside the file:

$ echo "version2" > data.txt

Make the request again:

$ curl http://localhost:5000/
version1

As the ttl is set to 60, wait 60 seconds so that the item can expire from the cache and try again:

$ curl http://localhost:5000/
version2

As you can see, the cache expired, the file was read again and loaded into the cache, and the new value was returned to the client.

Thank You

Please feel free to show support by sharing this post, making a donation, or subscribing, or reach out to me if you would like me to demo and write up any specific tech topic.


Concourse Tasks and Inputs Tutorial

In this tutorial I will show you how to execute task scripts and how to use task inputs to pass data to Concourse for processing.

For my other content on concourse, have a look at the concourse category.

Task Inputs

First, let’s run a task on concourse that does not rely on any inputs.

no_inputs.yml
platform: linux

image_resource:
  type: docker-image
  source: {repository: busybox}

run:
  path: uname
  args: ["-a"]

Running execute with the configuration from above:

$ fly -t ci e -c no_inputs.yml

executing build 37 at http://10.20.30.40/builds/37
initializing
running uname -a
Linux 2fd4e261-a708-4e15-4a4a-2bc50221a664 4.9.0-8-amd64 #1 SMP Debian 4.9.110-3+deb9u4 (2018-08-21) x86_64 GNU/Linux
succeeded

As you can see we have executed the command uname -a on one of the containers in Concourse.

Task Inputs: Specify Path

Now let's say we have data that needs to be transferred to the container where we are running our tasks. For that we use inputs.

In this example, we will set the input parameter in our task definition, and override the path with the cli. We will create a couple of files in a folder, then list them in the container where the task is running.

Creating the data:

$ mkdir my-data-folder
$ touch my-data-folder/test1.txt my-data-folder/test2.txt

Our task definition:

inputs_required.yml
platform: linux

image_resource:
  type: docker-image
  source: {repository: busybox}

inputs:
- name: my-input

run:
  path: ls
  args: ['-alR']

As you can see, our input name is my-input, and we will use the CLI to map the local folder my-data-folder to that input name:

$ fly -t ci e -c inputs_required.yml -i my-input=./my-data-folder/

executing build 32 at http://10.20.30.40/builds/32
initializing
my-input: 262.13 KiB/s 0s
running ls -alR
.:
total 0
drwxr-xr-x    1 root     root            16 Feb 10 08:53 .
drwxr-xr-x    1 root     root            16 Feb 10 08:53 ..
drwxr-xr-x    1 root     root            36 Feb 10 08:53 my-input

./my-input:
total 0
drwxr-xr-x    1 root     root            36 Feb 10 08:53 .
drwxr-xr-x    1 root     root            16 Feb 10 08:53 ..
-rw-r--r--    1 501      staff            0 Feb 10 08:52 test1.txt
-rw-r--r--    1 501      staff            0 Feb 10 08:52 test2.txt
succeeded

As you can see from the above output, the folder was uploaded and placed inside the container where we ran our task.

Task Inputs: Parent Directory

We can also use parent directories: running a task that relies on an input path which is our current working directory. Note: the input name should be the same as the name of the current working directory.

The input name will be the only thing that differs, which will look like:

inputs:
- name: my-data-folder

Running the task:

$ cd my-data-folder
$ fly -t ci e -c ../input_parent_dir.yml

executing build 35 at http://10.20.30.40/builds/35
initializing
my-data-folder: 395.85 KiB/s 0s
running ls -alR
.:
total 0
drwxr-xr-x    1 root     root            38 Feb 10 09:17 .
drwxr-xr-x    1 root     root            16 Feb 10 09:17 ..
drwxr-xr-x    1 root     root            18 Feb 10 09:17 my-data-folder

./my-data-folder:
total 0
drwxr-xr-x    1 root     root            18 Feb 10 09:17 .
drwxr-xr-x    1 root     root            38 Feb 10 09:17 ..
-rw-r--r--    1 501      staff            0 Feb 10 09:15 test1.txt
-rw-r--r--    1 501      staff            0 Feb 10 09:15 test2.txt
succeeded

The source code for this can be found at https://github.com/ruanbekker/concourse-tutorial/tree/master/02-task-inputs

Task Scripts:

In conjunction with inputs, we can let our task configuration reference a script that we want to execute. By uploading the current working directory to Concourse as an input, the container has the data it needs.

Our task configuration task_show_hostname.yml

platform: linux

image_resource:
  type: docker-image
  source: {repository: busybox}

inputs:
- name: 03-task-scripts

run:
  path: ./03-task-scripts/task_show_hostname.sh

Our executable script 03-task-scripts/task_show_hostname.sh

#!/bin/sh

get_hostname=$(hostname)
echo "Hostname is: ${get_hostname}"

Make sure to apply the executable permissions to the script:

$ chmod +x 03-task-scripts/task_show_hostname.sh

With this configuration, fly uploads the current working directory to Concourse, and the data inside the directory gets placed in the container under 03-task-scripts, which is the name of the input.

$ fly -t ci e -c 03-task-scripts/task_show_hostname.yml

executing build 39 at http://10.20.30.40/builds/39
initializing
03-task-scripts: 347.15 KiB/s 0s
running ./03-task-scripts/task_show_hostname.sh
Hostname is: 3ccb3c28-d452-4068-5ea1-101153803d93
succeeded

The source code for this example can be found at https://github.com/ruanbekker/concourse-tutorial/tree/master/03-task-scripts

That's it for Task Inputs and Task Scripts on Concourse. Please feel free to have a look at my other content about Concourse.

Thank You

Please feel free to show support by sharing this post, making a donation, or subscribing, or reach out to me if you would like me to demo and write up any specific tech topic.


Sysadmin Linux Troubleshooting Cheatsheet

This is a one pager of all the commands I use when I have to troubleshoot problems. This post will be updated as time goes by.

Curl / Web Response Times

Template file:

$ cat curl-format.txt
time_namelookup:  %{time_namelookup}\n
time_connect:  %{time_connect}\n
time_appconnect:  %{time_appconnect}\n
time_pretransfer:  %{time_pretransfer}\n
time_redirect:  %{time_redirect}\n
time_starttransfer:  %{time_starttransfer}\n
----------\n
time_total:  %{time_total}\n

The host header, source address, destination address:

$ curl -sk -w "@curl-format.txt" -o /dev/null -H "Host: remote-host.mydomain.com" 10.0.2.10 https://10.244.0.240:443 -L

time_namelookup:  0.012178
time_connect:  0.012225
time_appconnect:  0.062149
time_pretransfer:  0.062175
time_redirect:  0.000172
time_starttransfer:  0.125631
----------
time_total:  0.125849

MTR / Network Latencies / Packetloss

No dns, TCP, counts, port, source address, destination address:

$ mtr -n -T -c 10 --port 443 10.2.0.2 10.244.10.5 --report
Start: Sun Feb 10 19:04:50 2019
HOST: my-internet-gatewewy         Loss%   Snt   Last   Avg  Best  Wrst StDev
  1.|-- 172.18.110.22              0.0%    10    0.3   0.3   0.3   0.3   0.0
  2.|-- 172.18.110.22              0.0%    10    0.3   0.3   0.3   0.3   0.0
  3.|-- 172.18.110.22              0.0%    10    0.3   0.3   0.3   0.3   0.0

TCPTraceroute

No dns, TCP, port, source address, destination address:

$ traceroute -T -n -p 443 -s 10.80.4.7 10.2.129.4
traceroute to 10.2.129.4 (10.2.129.4), 30 hops max, 60 byte packets
 1  10.80.4.1   0.322 ms  0.291 ms  0.224 ms
 2  10.2.129.4  179.090 ms  179.022 ms  179.023 ms

Connection Related:

Connection flow (with thanks to the original source of this explanation):

Consider two programs attempting a socket connection (call them a and b). Both set up sockets and transition to the LISTEN state. Then one program (say a) tries to connect to the other (b). a sends a request and enters the SYN_SENT state, and b receives the request and enters the SYN_RECV state. When b acknowledges the request, they enter the ESTABLISHED state, and do their business. Now a couple of things can happen:

    a wishes to close the connection, and enters FIN_WAIT1. b receives the FIN request, sends an ACK (then a enters FIN_WAIT2), enters CLOSE_WAIT, tells a it is closing down and then enters LAST_ACK. Once a acknowledges this (and enters TIME_WAIT), b enters CLOSED. a waits a bit to see if anything is left, then enters CLOSED.
    a and b have finished their business and decide to close the connection (simultaneous closing). When a is in FIN_WAIT, and instead of receiving an ACK from b, it receives a FIN (as b wishes to close it as well), a enters CLOSING. But there are still some messages to send (the ACK that a is supposed to get for its original FIN), and once this ACK arrives, a enters TIME_WAIT as usual.
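
A quick way to see how many connections are currently sitting in each of these states (assuming netstat is available; the first two lines of its output are headers, so they are skipped) is:

$ netstat -ant | awk 'NR > 2 {print $6}' | sort | uniq -c | sort -rn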

Active Connections:

$ netstat -n -A  inet | grep -v "127.0.0.1"

Established Connections:

$ netstat -nputw | grep ESTABLISHED
$ netstat -antp | grep :3306 | grep ESTABLISHED

Time Wait Connections:

$ netstat -antp | grep TIME_WAIT

How many connections:

$ wc -l /proc/net/tcp

Listing Open files per Port:

$ lsof -i:3306

Listing Open files per User:

$ lsof -u glassfish

Network Throughput

You can test the network throughput between two linux hosts with iperf:

On side-a we will start the server in TCP mode:

$ iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size:  128 KByte (default)
------------------------------------------------------------

On side-b we will start the client, which connects to the server:

$ iperf -c 192.168.1.213
------------------------------------------------------------
Client connecting to 192.168.1.213, TCP port 5001
TCP window size: 43.8 KByte (default)
------------------------------------------------------------
[  3] local 192.168.0.114 port 43870 connected with 192.168.1.213 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec  11.4 MBytes  9.54 Mbits/sec

We can also run this in UDP mode where the server will run iperf -s -u and the client will run iperf -c host-address -u
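
As a quick reference, the same throughput test in UDP mode (assuming the same server address as above) would look like this:

On side-a, start the server in UDP mode:

$ iperf -s -u

On side-b, run the client against the server in UDP mode:

$ iperf -c 192.168.1.213 -u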


Thank You

Please feel free to show support by sharing this post, making a donation, or subscribing, or reach out to me if you would like me to demo and write up any specific tech topic.


Port Status Checker Script in C Language

This is a simple program in the C programming language to test the port status of a remote address.

Requirements:

You will need the gcc package to compile the program:

For RHEL-based distros:

$ yum install gcc -y

For Debian-based distros:

$ apt install gcc -y

Check TCP Port Status in C Language:

Our file: test.c

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netdb.h>
#include <strings.h> /* bzero, bcopy */


int main(int argc, char *argv[]) {

    int portno     = 443;
    char *hostname = "google.com";

    int sockfd;
    struct sockaddr_in serv_addr;
    struct hostent *server;

    sockfd = socket(AF_INET, SOCK_STREAM, 0);
    if (sockfd < 0) {
        perror("Error opening socket");
        exit(1);
    }

    server = gethostbyname(hostname);

    if (server == NULL) {
        fprintf(stderr,"ERROR, no such host\n");
        exit(0);
    }

    bzero((char *) &serv_addr, sizeof(serv_addr));
    serv_addr.sin_family = AF_INET;
    bcopy((char *)server->h_addr,
         (char *)&serv_addr.sin_addr.s_addr,
         server->h_length);

    serv_addr.sin_port = htons(portno);
    if (connect(sockfd,(struct sockaddr *) &serv_addr, sizeof(serv_addr)) < 0) {
        printf("Port is Closed\n");
    } else {
        printf("Port is Open\n");
    }

    close(sockfd);
    return 0;
}

Compile:

Compile using gcc:

$ gcc -o test test.c

Execute:

Execute the program:

$ ./test
Port is Open

Thank You

Please feel free to show support by sharing this post, making a donation, or subscribing, or reach out to me if you would like me to demo and write up any specific tech topic.


The AWS CLI Cheatsheet for Bash

This is a post for all the AWS CLI oneliners that I stumble upon. Note that they will be updated over time.

RDS

Describe All RDS DB Instances:

$ aws --profile prod rds describe-db-instances --query 'DBInstances[*].[DBInstanceArn,DBInstanceIdentifier,DBInstanceClass,Endpoint]'

Describe an RDS DB Instance by DB name:

$ aws --profile prod rds describe-db-instances --query 'DBInstances[?DBInstanceIdentifier==`db-staging`].[DBInstanceArn,DBInstanceIdentifier,DBInstanceClass,Endpoint]'
[
    [
        "arn:aws:rds:eu-west-1:<customer_id>:db:db-staging",
        "db-staging",
        "db.t2.micro",
        {
            "HostedZoneId": "ASKDJSAKDJBA",
            "Port": 5432,
            "Address": "db-staging.asdkjahsd.eu-west-1.rds.amazonaws.com"
        }
    ]
]

List all RDS DB Instances and limit output:

$ aws --profile prod rds describe-db-instances --query 'DBInstances[*].[DBInstanceArn,DBInstanceIdentifier,DBInstanceClass,Endpoint]'
[
    [
        "arn:aws:rds:eu-west-1:<customer_id>:db:db-name",
        "db-name",
        "db.t2.micro",
        {
            "HostedZoneId": "ABCDEFGHILKL",
            "Port": 5432,
            "Address": "db-name.abcdefg.eu-west-1.rds.amazonaws.com"
        }
    ],

List all RDS DB Instances that have backups enabled, and limit output:

$ aws --profile prod rds describe-db-instances --query 'DBInstances[?BackupRetentionPeriod>`0`].[DBInstanceArn,DBInstanceIdentifier,DBInstanceClass,Endpoint]'
[
    [
        "arn:aws:rds:eu-west-1:<customer_id>:db:db-name",
        "db-name",
        "db.t2.micro",
        {
            "HostedZoneId": "ABCDEFGHILKL",
            "Port": 5432,
            "Address": "db-name.abcdefg.eu-west-1.rds.amazonaws.com"
        }
    ],

Describe DB Snapshots for DB Instance Name:

$ aws --profile prod rds describe-db-snapshots --db-instance-identifier db --query 'DBSnapshots[?DBInstanceIdentifier==`db`].[DBInstanceIdentifier,DBSnapshotIdentifier,SnapshotCreateTime,Status]'
[
    [
        "db",
        "rds:db-2018-05-16-04-08",
        "2018-05-16T04:08:53.696Z",
        "available"
    ],

Events for the last 24 Hours:

$ aws --profile prod rds describe-events --source-identifier "rds:db-2018-05-16-04-08" --source-type db-snapshot --duration 1440 --query 'Events[*]'
[
    {
        "EventCategories": [
            "creation"
        ],
        "SourceType": "db-snapshot",
        "SourceArn": "arn:aws:rds:eu-west-1:<customer_id>:snapshot:rds:db-2018-05-16-04-08",
        "Date": "2018-05-16T04:08:40.264Z",
        "Message": "Creating automated snapshot",
        "SourceIdentifier": "rds:db-2018-05-16-04-08"
    },
    {
        "EventCategories": [
            "creation"
        ],
        "SourceType": "db-snapshot",
        "SourceArn": "arn:aws:rds:eu-west-1:<customer_id>:snapshot:rds:db-2018-05-16-04-08",
        "Date": "2018-05-16T04:32:04.047Z",
        "Message": "Automated snapshot created",
        "SourceIdentifier": "rds:db-2018-05-16-04-08"
    }
]

List Public RDS Instances:

$ aws --profile prod rds describe-db-instances --query 'DBInstances[?PubliclyAccessible==`true`].[DBInstanceIdentifier,Endpoint.Address]'

[
  [
    "name",
    "name.abcdef.eu-west-1.rds.amazonaws.com"
  ]
]

SSM Parameter Store:

List all parameters by path:

$ aws --profile prod ssm get-parameters-by-path --path '/service-a/team-a/my-app-name/' | jq '.Parameters[]' | jq -r '.Name'
/service-a/team-a/my-app-name/db_hostname
/service-a/team-a/my-app-name/db_username
/service-a/team-a/my-app-name/db_password

Get a value from a parameter:

$ aws --profile prod ssm get-parameters --names '/service-a/team-a/my-app-name/db_username' --with-decryption | jq '.Parameters[]' | jq -r '.Value'
my_db_user

Thank You

Please feel free to show support by sharing this post, making a donation, or subscribing, or reach out to me if you would like me to demo and write up any specific tech topic.


Python Multiprocessing Tutorial

I stumbled upon a great Python multiprocessing tutorial while I was looking into spawning multiple processes in parallel on a Lambda function.

In this example I'm getting latencies between regions using tcpping, but instead of running them one at a time, I was looking into spawning them in parallel:

(code made static for demonstration)

import boto3
import os
import json
import multiprocessing as mp
from decimal import Decimal

region_maps = {
    'eu-west-1': {
        'dynamodb': 'dynamodb.eu-west-1.amazonaws.com'
    },
    'us-east-1': {
        'dynamodb': 'dynamodb.us-east-1.amazonaws.com'
    },
    'us-west-1': {
        'dynamodb': 'dynamodb.us-west-1.amazonaws.com'
    },
    'us-west-2': {
        'dynamodb': 'dynamodb.us-west-2.amazonaws.com'
    }
}

def get_results(target_region, target_service, target_endpoint):
    static_results = {
        "address": target_endpoint,
        "attempts": 5,
        "avg": 481.80199999999996,
        "max": 816.25,
        "min": 312.46,
        "port": 443,
        "region": "eu-west-1_{}_{}".format(target_service, target_region),
        "regionTo": target_region,
        "results": [
            {"seq": 1,"time": "816.25"},
            {"seq": 2,"time": "331.50"},
            {"seq": 3,"time": "597.22"},
            {"seq": 4,"time": "312.46"},
            {"seq": 5,"time": "351.58"}
        ],
        "timestamp": "2019-02-05T17:10:32"
    }
    return static_results

def dynamodb_write(data):
    ddb = boto3.Session(profile_name='test', region_name='eu-west-1').resource('dynamodb').Table('mydynamotable')
    ddb_parsed = json.loads(json.dumps(data), parse_float=Decimal)
    response = ddb.put_item(Item=ddb_parsed)
    return response

def spawn_work(region):
    target_region = region
    target_service = 'dynamodb'
    target_endpoint = region_maps[target_region][target_service]
    data = get_results(region, target_service, target_endpoint)
    print("pid: {}, data: {}".format(os.getpid(), data))
    response = dynamodb_write(data)

if __name__ == "__main__":
    # spawn a pool with one worker process per cpu core
    pool = mp.Pool(mp.cpu_count())
    # run spawn_work for each region in parallel; map blocks until all workers are done
    pool.map(spawn_work, ['eu-west-1', 'us-east-1', 'us-west-1', 'us-west-2'])
    pool.close()
    pool.join()

When running it locally, I can see that each job ran in its own pid:

$ python foo.py
pid: 31224, data: {'attempts': 5, 'min': 312.46, 'timestamp': '2019-02-05T17:10:32', 'address': 'dynamodb.eu-west-1.amazonaws.com', 'max': 816.25, 'region': 'eu-west-1_dynamodb_eu-west-1', 'avg': 481.80199999999996, 'port': 443, 'regionTo': 'eu-west-1', 'results': [{'seq': 1, 'time': '816.25'}, {'seq': 2, 'time': '331.50'}, {'seq': 3, 'time': '597.22'}, {'seq': 4, 'time': '312.46'}, {'seq': 5, 'time': '351.58'}]}

pid: 31225, data: {'attempts': 5, 'min': 312.46, 'timestamp': '2019-02-05T17:10:32', 'address': 'dynamodb.us-east-1.amazonaws.com', 'max': 816.25, 'region': 'eu-west-1_dynamodb_us-east-1', 'avg': 481.80199999999996, 'port': 443, 'regionTo': 'us-east-1', 'results': [{'seq': 1, 'time': '816.25'}, {'seq': 2, 'time': '331.50'}, {'seq': 3, 'time': '597.22'}, {'seq': 4, 'time': '312.46'}, {'seq': 5, 'time': '351.58'}]}

pid: 31226, data: {'attempts': 5, 'min': 312.46, 'timestamp': '2019-02-05T17:10:32', 'address': 'dynamodb.us-west-1.amazonaws.com', 'max': 816.25, 'region': 'eu-west-1_dynamodb_us-west-1', 'avg': 481.80199999999996, 'port': 443, 'regionTo': 'us-west-1', 'results': [{'seq': 1, 'time': '816.25'}, {'seq': 2, 'time': '331.50'}, {'seq': 3, 'time': '597.22'}, {'seq': 4, 'time': '312.46'}, {'seq': 5, 'time': '351.58'}]}

pid: 31227, data: {'attempts': 5, 'min': 312.46, 'timestamp': '2019-02-05T17:10:32', 'address': 'dynamodb.us-west-2.amazonaws.com', 'max': 816.25, 'region': 'eu-west-1_dynamodb_us-west-2', 'avg': 481.80199999999996, 'port': 443, 'regionTo': 'us-west-2', 'results': [{'seq': 1, 'time': '816.25'}, {'seq': 2, 'time': '331.50'}, {'seq': 3, 'time': '597.22'}, {'seq': 4, 'time': '312.46'}, {'seq': 5, 'time': '351.58'}]}

Quite useful! Have a look at the tutorial mentioned above for more examples.

Thank You

Please feel free to show support by sharing this post, making a donation, or subscribing, or reach out to me if you would like me to demo and write up any specific tech topic.


Convert Float to Decimal Data Types for Boto3 DynamoDB Using Python

A quick post on a workaround when you need to convert float to decimal types.


One thing I really don't like about the AWS SDK for Python, specifically regarding DynamoDB, is that float types are not supported and that you should use Decimal types instead.

For example, my payload below:

>>> data
{'attempts': 5, 'min': 180.87, 'timestamp': '2019-02-05T15:48:27', 'address': 'dynamodb.us-east-1.amazonaws.com', 'max': 747.17, 'region': 'eu-west-1_dynamodb', 'avg': 311.32599999999996, 'port': 443, 'regionTo': 'us-east-1', 'results': [{'seq': 1, 'time': '747.17'}, {'seq': 2, 'time': '215.60'}, {'seq': 3, 'time': '230.67'}, {'seq': 4, 'time': '180.87'}, {'seq': 5, 'time': '182.32'}]}

Trying to write that as an item to my DynamoDB table, you will be faced with the exception below:

>>> ddb.put_item(Item=data)
TypeError: Float types are not supported. Use Decimal types instead.

One way around this is to use parse_float in json.loads():

>>> from decimal import Decimal
>>> import json
>>> ddb_data = json.loads(json.dumps(data), parse_float=Decimal)
>>> ddb_data
{u'max': Decimal('747.17'), u'min': Decimal('180.87'), u'timestamp': u'2019-02-05T15:48:27', u'region': u'eu-west-1_dynamodb', u'regionTo': u'us-east-1', u'results': [{u'seq': 1, u'time': u'747.17'}, {u'seq': 2, u'time': u'215.60'}, {u'seq': 3, u'time': u'230.67'}, {u'seq': 4, u'time': u'180.87'}, {u'seq': 5, u'time': u'182.32'}], u'attempts': 5, u'address': u'dynamodb.us-east-1.amazonaws.com', u'avg': Decimal('311.32599999999996'), u'port': 443}

Thank You

Please feel free to show support by sharing this post, making a donation, or subscribing, or reach out to me if you would like me to demo and write up any specific tech topic.


Paginate Through IAM Users on AWS Using Python and Boto3

When listing AWS IAM Users in Boto3, you will find that not all the users are retrieved. This is because they are paginated.

To do a normal list_users api call:

>>> import boto3
>>> iam = boto3.Session(region_name='eu-west-1', profile_name='default').client('iam')
>>> len(iam.list_users()['Users'])
100

However, I know there are more than 200 users, therefore we need to paginate through our users:

>>> import boto3
>>> iam = boto3.Session(region_name='eu-west-1', profile_name='default').client('iam')
>>> paginator = iam.get_paginator('list_users')
>>> users = []
>>> all_users = []
>>> for response in paginator.paginate():
...     users.append(response['Users'])
...
>>> len(users)
3

>>> for iteration in xrange(len(users)):
...     for userobj in xrange(len(users[iteration])):
...         all_users.append((users[iteration][userobj]['UserName']))
...
>>> len(all_users)
210

For more information on this, have a look at the AWS documentation about pagination.

Thank You

Please feel free to show support by sharing this post, making a donation, or subscribing, or reach out to me if you would like me to demo and write up any specific tech topic.


Setup a 3 Node Docker Swarm Cluster on Ubuntu 16.04

Docker Swarm is a Clustering and Orchestration Framework for the Docker ecosystem. Have a look at their official documentation for detailed information.

In this Tutorial we will Setup a 3 Node Docker Swarm Cluster and Demonstrate How Easy it is to Deploy a Web Application with 2 Replicas from a Docker Image.



Overview of What we will be Doing

  • Install Docker on 3 Servers with Ubuntu 16.04
  • Initialize the Swarm and Join the Worker Nodes
  • Create a Nginx Service with 2 Replicas
  • Do some Inspection: View some info on the Service

Prerequisites

3 Freshly Deployed Ubuntu 16.04 Servers (1GB Memory Servers will be good for development)

What is Docker

Docker is an Open Source technology that allows you to create lightweight, isolated, reproducible application instances, which are called Containers. Docker was originally built on top of LXC, so it uses Linux containers and, as mentioned, it's lightweight compared to a traditional VM.

A Container is isolated and uses the Kernel of the Docker host; it utilizes Kernel features such as cgroups and namespaces to provide that isolation.

Installing Docker Community Edition

Remove any older versions of Docker that might be present and install the dependencies:

$ sudo apt remove docker docker-engine -y
$ sudo apt install linux-image-extra-$(uname -r) linux-image-extra-virtual python-setuptools -y
$ sudo apt install apt-transport-https ca-certificates curl software-properties-common -y

Get the needed repository to setup Docker Community Edition:

$ curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
$ sudo apt-key fingerprint 0EBFCD88
$ sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"

Update the repository index and Install Docker Community Edition:

$ sudo apt update
$ sudo apt install docker-ce -y
$ sudo easy_install pip
$ sudo pip install docker-compose

Enable Docker on Startup and Start the Docker Engine:

$ sudo systemctl enable docker
$ sudo systemctl restart docker

If you would like to execute your docker commands without sudo, add your user to the docker group:

$ sudo usermod -aG docker $(whoami)

Test your Setup by Running a Hello World Container. You will see that if the image is not in the local docker image cache, it will pull the image from Docker Hub (or the respective docker registry); once the image is saved locally, docker will instantiate the container from that image:

$ docker run hello-world
Unable to find image 'hello-world:latest' locally
latest: Pulling from library/hello-world
78445dd45222: Pull complete
Digest: sha256:c5515758d4c5e1e838e9cd307f6c6a0d620b5e07e6f927b07d05f6d12a1ac8d7
Status: Downloaded newer image for hello-world:latest

Hello from Docker!
This message shows that your installation appears to be working correctly.

DNS Configuration

If you have a DNS Server you can configure the A Records for these hosts on DNS, but for simplicity, I will add the noted IP Addresses from the previous step into my /etc/hosts file so we can resolve names to IPs.

Open up the hosts file:

$ sudo vim /etc/hosts

In my example, my IP Addresses:

192.0.2.41  manager
192.0.2.42  worker-1
192.0.2.43  worker-2

Repeat the above steps on the other 2 Servers and make note of the IP Addresses of each node. You should be able to ping and reach the nodes that were configured (a quick check is shown below). Make sure to allow all traffic between these nodes.
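
A quick connectivity check from the manager, using the hostnames from the /etc/hosts entries above, could be:

$ ping -c 2 worker-1
$ ping -c 2 worker-2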

Initialize the Swarm:

Now we will initialize the swarm on the manager node and as we have more than one network interface, we will specify the --advertise-addr option:

$ docker swarm init --advertise-addr 192.0.2.41
Swarm initialized: current node (siqyf3yricsvjkzvej00a9b8h) is now a manager.

    To add a worker to this swarm, run the following command:

    docker swarm join \
    --token SWMTKN-1-0eith07xkcg93lzftuhjmxaxwfa6mbkjsmjzb3d3sx9cobc2zp-97s6xzdt27y2gk3kpm0cgo6y2 \
    192.0.2.41:2377

    To add a manager to this swarm, run 'docker swarm join-token manager' and follow the instructions.

From the response above, we received the join token that allows the workers to register with the manager node. If you want to have more than one manager node, you can run docker swarm join-token manager to receive the join token for an additional manager.

Let’s add the two worker nodes to the manager:

$ [worker-1] docker swarm join --token SWMTKN-1-0eith07xkcg93lzftuhjmxaxwfa6mbkjsmjzb3d3sx9cobc2zp-97s6xzdt27y2gk3kpm0cgo6y2 192.0.2.41:2377
This node joined a swarm as a worker.
$ [worker-2] docker swarm join --token SWMTKN-1-0eith07xkcg93lzftuhjmxaxwfa6mbkjsmjzb3d3sx9cobc2zp-97s6xzdt27y2gk3kpm0cgo6y2 192.0.2.41:2377
This node joined a swarm as a worker.

To see the node status, so that we can determine if the nodes are active/available etc., list all the nodes in the swarm from the manager node:

[manager] $ docker node ls
ID                           HOSTNAME  STATUS  AVAILABILITY  MANAGER STATUS
j14mte3v1jhtbm3pb2qrpgwp6    worker-1  Ready   Active
siqyf3yricsvjkzvej00a9b8h *  master    Ready   Active        Leader
srl5yzme5hxnzxal2t1efmwje    worker-2  Ready   Active

Reobtaining the Join Tokens

If at any time you lose your join token, it can be retrieved by running the following for the manager token:

$ docker swarm join-token manager -q
SWMTKN-1-67chzvi4epx28ii18gizcia8idfar5hokojz660igeavnrltf0-09ijujbnnh4v960b8xel58pmj

And the following to retrieve the worker token:

$ docker swarm join-token worker -q
SWMTKN-1-67chzvi4epx28ii18gizcia8idfar5hokojz660igeavnrltf0-acs21nn28v17uwhw0oqg5ibwx

Swarm Services in Docker use a declarative model, which means that you define the desired state of the service and rely on Docker to maintain this state. More information on this can be found in their Documentation.

At this moment, we will see that we have no services running in our swarm:

[manager] $ docker service ls
ID  NAME  MODE  REPLICAS  IMAGE

Deploying our First Service

Now onto the creation of a standard nginx service with 2 replicas, which means that there will be 2 containers of nginx running in our swarm.

But first, we need to create an overlay network, which is a network driver that creates a distributed network among multiple Docker daemon hosts. Swarm takes care of the routing automatically via the published port mappings. So even if your container sits on worker-2, when you hit your manager node on the published port, the request will be routed to the application that resides on the respective container.

To create an overlay network called mynet:

[manager] $ docker network create --driver overlay mynet

Now onto creating the Service. If any of these containers fail, they will be handled by the manager node and will be respawned to maintain the desired number that we set with the replicas option:

[manager] $ docker service create --name my-web --publish 8080:80 --replicas 2 --network mynet nginx

Let’s have a look at our nginx service:

[manager] $ docker service ls
ID            NAME    MODE        REPLICAS  IMAGE
1okycpshfusq  my-web  replicated  2/2       nginx:latest

After we see that the replica count is 2/2, our service is ready.

To see on which nodes the containers that make up our service are running:

[manager] $ docker service ps my-web
ID            NAME      IMAGE         NODE      DESIRED STATE  CURRENT STATE           ERROR  PORTS
k0qqrh8s0c2d  my-web.1  nginx:latest  worker-1  Running        Running 30 seconds ago
nku9wer6tmll  my-web.2  nginx:latest  worker-2  Running        Running 30 seconds ago

From the above output, we can see that worker-1 and worker-2 are serving the containers for our service. We can also retrieve more information about our service by using the inspect option, which will give you a detailed response in JSON format:

[manager] $ docker service inspect my-web

We can get the Endpoint Port info by using inspect with the --format parameter, for example '{{json .Endpoint.Ports}}', to filter the output:

[manager] $ docker service inspect --format='{{json .Endpoint.Ports}}' my-web | python -m json.tool

From the output we will find that the PublishedPort is the port that we expose, which will be the listener, and the TargetPort is the port that the container listens on:

[
    {
        "Protocol": "tcp",
        "PublishMode": "ingress",
        "PublishedPort": 8080,
        "TargetPort": 80
    }
]

Now that we went through the inspection of our service, it's time to test our base nginx service.

Testing Nginx in our Swarm

Make a request against your docker node manager address on the port that was exposed, in this case 8080:

$ curl -I http://docker-node-manager-ip:8080

HTTP/1.1 200 OK
Server: nginx/1.15.5
Date: Thu, 10 Jan 2019 14:48:40 GMT
Content-Type: text/html
Content-Length: 612
Last-Modified: Tue, 02 Oct 2018 14:49:27 GMT
Connection: keep-alive
ETag: "5bb38577-264"
Accept-Ranges: bytes
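
Because of the routing mesh described earlier, the same request against any other node in the swarm on the published port should return the same response, regardless of which nodes are actually running the containers. For example, against one of the workers (using the /etc/hosts entries from earlier):

$ curl -I http://worker-1:8080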

Now we have successfully set up a 3 node Docker Swarm cluster and deployed a basic nginx service to our swarm. Please have a look at my other Docker Swarm Tutorials for other content.

Thank You

Please feel free to show support by sharing this post, making a donation, or subscribing, or reach out to me if you would like me to demo and write up any specific tech topic.

Thanks for reading!