Ruan Bekker's Blog

From a Curious mind to Posts on Github

Using Paramiko Module in Python to Execute Remote Bash Commands

Paramiko is a Python implementation of the SSHv2 protocol.

Using Paramiko to Execute Remote Commands:

We will use the paramiko module in Python to execute a command on our remote server.

The client side will be referenced as (side-a), and the server side will be referenced as (side-b).

Getting the Dependencies:

Install Paramiko via pip on side-a:

$ pip install paramiko --user

Using Paramiko in our Code:

Our Python Code:

import paramiko

ssh = paramiko.SSHClient()
ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
ssh.connect(hostname='192.168.10.10', username='ubuntu', key_filename='/home/ubuntu/.ssh/mykey.pem')

stdin, stdout, stderr = ssh.exec_command('lsb_release -a')

for line in stdout.read().splitlines():
    print(line)

ssh.close()

Execute our Command Remotely:

Now we will attempt to establish the SSH connection from side-a, and then run lsb_release -a on our remote server, side-b:

$ python execute.py

Distributor ID:   Ubuntu
Description:  Ubuntu 16.04.4 LTS
Release:  16.04
Codename: xenial
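One caveat worth noting: on Python 3, stdout.read() returns bytes, so decoding before splitting keeps the printed lines clean. A minimal sketch of that post-processing, using sample bytes in place of a live connection:

```python
# stdout.read() from paramiko returns bytes on Python 3; decode to text
# before splitting into lines so print() shows clean strings instead of
# b'...' representations.
raw = b"Distributor ID:\tUbuntu\nRelease:\t16.04\n"  # sample bytes standing in for stdout.read()
for line in raw.decode('utf-8').splitlines():
    print(line)
```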

Setup a SSH Tunnel With the Sshtunnel Module in Python

Sometimes we need to restrict access to a port, where a port should listen on localhost only, but you want to access that port from a remote source. One secure way of doing that is to establish an SSH tunnel to the remote side and forward the port via the SSH tunnel.

Today we will setup a Flask Web Service on our Remote Server (Side B) which will be listening on 127.0.0.1:5000 and setup the SSH Tunnel with the sshtunnel module in Python from our client side (Side A). Then we will make a GET request on our client side to the port that we are forwarding via the tunnel to our remote side.

Remote Side:

Our Demo Python Flask Application:

from flask import Flask

app = Flask(__name__)

@app.route('/')
def index():
    return 'OK'

if __name__ == '__main__':
    app.run(host='127.0.0.1', port=5000)

Run the server:

$ python app.py
Listening on 127.0.0.1:5000

Client Side:

From our client side we first need to install sshtunnel and requests via pip:

$ pip install sshtunnel requests --user

Our code for our client that will establish the tunnel and do the GET request:

from sshtunnel import SSHTunnelForwarder
import requests

remote_user = 'ubuntu'
remote_host = '192.168.10.10'
remote_port = 22
local_host = '127.0.0.1'
local_port = 5000

server = SSHTunnelForwarder(
   (remote_host, remote_port),
   ssh_username=remote_user,
   ssh_private_key='/home/ubuntu/.ssh/mykey.pem',
   remote_bind_address=(local_host, local_port),
   local_bind_address=(local_host, local_port),
   )

server.start()

headers = { 'User-Agent': 'Mozilla/5.0 (Windows NT 6.0; WOW64; rv:24.0) Gecko/20100101 Firefox/24.0'}
r = requests.get('http://127.0.0.1:5000', headers=headers).content
print(r)
server.stop()

Running our app:

$ python ssh_tunnel.py
OK

So we have successfully established our SSH tunnel to our remote side, and we are able to access the network-restricted port via the tunnel.
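Before making requests through the tunnel, it can be useful to check that the local bind port is actually accepting connections. A small stdlib sketch of such a check (the throwaway listener below stands in for the tunnel's local bind address, e.g. 127.0.0.1:5000 from the example):

```python
import socket

def port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# demonstrate against a throwaway listener, standing in for the
# tunnel's local bind address (e.g. 127.0.0.1:5000 in the post)
listener = socket.socket()
listener.bind(('127.0.0.1', 0))   # pick a free ephemeral port
listener.listen(1)
port = listener.getsockname()[1]
print(port_open('127.0.0.1', port))   # True: something is listening
listener.close()
```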


Basic RESTFul API Server With Python Flask

A basic RESTful API service with Python Flask. We will be using the Flask, jsonify and request classes to build our API service.

Description of this demonstration:

Our API will be able to do the following:

  • Create, Read, Update, Delete

In this demonstration, we will add some information about people to our API, then go through each method that is mentioned above.

Getting the Dependencies:

Setup the virtualenv and install the dependencies:

$ virtualenv .venv
$ source .venv/bin/activate
$ pip install flask

The API Server Code:

Here’s the complete code. As you can see, I have a couple of decorators for the url endpoints, and an id_generator function that generates an id for each document. The id will be used for retrieving a user’s information, updates and deletes:

from flask import Flask, jsonify, request
from multiprocessing import Value

counter = Value('i', 0)
app = Flask(__name__)

a = []
help_message = """
API Usage:
 
- GET    /api/list
- POST   /api/add data={"key": "value"}
- GET    /api/get/<id>
- PUT    /api/update/<id> data={"key": "value_to_replace"}
- DELETE /api/delete/<id> 

"""

def id_generator():
    with counter.get_lock():
        counter.value += 1
        return counter.value

@app.route('/api', methods=['GET'])
def help():
    return help_message

@app.route('/api/list', methods=['GET'])
def list():
    return jsonify(a)

@app.route('/api/add', methods=['POST'])
def index():
    payload = request.json
    payload['id'] = id_generator()
    a.append(payload)
    return "Created: {} \n".format(payload)

@app.route('/api/get', methods=['GET'])
def get_none():
    return 'ID Required: /api/get/<id> \n'

@app.route('/api/get/<int:_id>', methods=['GET'])
def get(_id):
    selected_user = None
    for user in a:
        if _id == user['id']:
            selected_user = user
    return jsonify(selected_user)

@app.route('/api/update', methods=['PUT'])
def update_none():
    return 'ID and Desired K/V in Payload required: /api/update/<id> -d \'{"name": "john"}\' \n'

@app.route('/api/update/<int:_id>', methods=['PUT'])
def update(_id):
    update_req = request.json
    user = next(item for item in a if item['id'] == _id)
    user.update(update_req)
    return "Updated: {} \n".format(user)

@app.route('/api/delete/<int:_id>', methods=['DELETE'])
def delete(_id):
    deleted_user = next(item for item in a if item['id'] == _id)
    a.remove(deleted_user)
    return "Deleted: {} \n".format(deleted_user)

if __name__ == '__main__':
    app.run()
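The id_generator relies on multiprocessing.Value, so the counter stays correct even when Flask serves requests from multiple worker processes. A standalone sketch of just that pattern:

```python
from multiprocessing import Value

counter = Value('i', 0)  # shared integer, starts at 0

def id_generator():
    # the lock makes increment-and-read atomic across processes
    with counter.get_lock():
        counter.value += 1
        return counter.value

print(id_generator())  # -> 1
print(id_generator())  # -> 2
```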

Demo Time:

Retrieving the Help output:

$ curl -XGET -H 'Content-Type: application/json' http://localhost:5000/api

API Usage:

- GET    /api/list
- POST   /api/add data={"key": "value"}
- GET    /api/get/<id>
- PUT    /api/update/<id> data={"key": "value_to_replace"}
- DELETE /api/delete/<id>

Doing a list, to list all the users; it’s expected to be empty, as we have not added any info to our API:

$ curl -XGET -H 'Content-Type: application/json' http://localhost:5000/api/list
[]

Adding our first user:

$ curl -XPOST -H 'Content-Type: application/json' http://localhost:5000/api/add -d '{"name": "ruan", "country": "south africa", "age": 30}'
Created: {u'country': u'south africa', u'age': 30, u'name': u'ruan', 'id': 1}

Adding our second user:

$ curl -XPOST -H 'Content-Type: application/json' http://localhost:5000/api/add -d '{"name": "stefan", "country": "south africa", "age": 29}'
Created: {u'country': u'south africa', u'age': 29, u'name': u'stefan', 'id': 2}

Doing a list again, will retrieve all our users:

$ curl -XGET -H 'Content-Type: application/json' http://localhost:5000/api/list
[
  {
    "age": 30,
    "country": "south africa",
    "id": 1,
    "name": "ruan"
  },
  {
    "age": 29,
    "country": "south africa",
    "id": 2,
    "name": "stefan"
  }
]

Doing a GET on the user id, to display only that user’s info:

$ curl -XGET -H 'Content-Type: application/json' http://localhost:5000/api/get/2
{
  "age": 29,
  "country": "south africa",
  "id": 2,
  "name": "stefan"
}

Now, let’s update some details. Let’s say that Stefan relocated to New Zealand. We will need to provide his id and also the key/value that we want to update:

$ curl -XPUT -H 'Content-Type: application/json' http://localhost:5000/api/update/2 -d '{"country": "new zealand"}'
Updated: {u'country': u'new zealand', u'age': 29, u'name': u'stefan', 'id': 2}

As you can see the response confirmed that the value was updated, but let’s verify the output, by doing a get on his id:

$ curl -XGET -H 'Content-Type: application/json' http://localhost:5000/api/get/2
{
  "age": 29,
  "country": "new zealand",
  "id": 2,
  "name": "stefan"
}

And lastly, let’s delete our user, which only requires the user id:

$ curl -XDELETE -H 'Content-Type: application/json' http://localhost:5000/api/delete/2
Deleted: {u'country': u'new zealand', u'age': 29, u'name': u'stefan', 'id': 2}

To verify this, list all the users:

$ curl -XGET -H 'Content-Type: application/json' http://localhost:5000/api/list
[
  {
    "age": 30,
    "country": "south africa",
    "id": 1,
    "name": "ruan"
  }
]

Using Python Requests:

We can also use Python’s requests module to do the same. To demonstrate, I will create a new user:

$ pip install requests
$ python
>>> import requests
>>> import json

>>> base_url = 'http://localhost:5000/api/add'
>>> headers = {"Content-Type": "application/json"}
>>> payload = json.dumps({"name": "shaun", "country": "australia", "age": 24})

>>> r = requests.post(base_url, headers=headers, data=payload)
>>> r.content
Created: {u'country': u'australia', u'age': 24, u'name': u'shaun', 'id': 4}

That’s it. I’ve stumbled upon Flask-RESTful, which I still want to check out, and as soon as I do, I will do a post on it, maybe backed with a NoSQL db or something like that.

Cheers!


Basic Introduction to Use Arguments With Argparse on Python

I used to work a lot with sys.argv for using arguments in my applications, until I stumbled upon the argparse module! (Thanks Donovan!)

What I like about argparse is that it builds up the help menu for you, and you also have a lot of options: you can set an argument to be required, set the datatypes, add additional help context, etc.

The Basic Demonstration:

Today we will just run through a very basic example on how to use argparse:

  • Return the generated help menu
  • Return the required value
  • Return the additional arguments
  • Compare arguments with an IF statement

The Python Argparse Tutorial Code:

import argparse

parser = argparse.ArgumentParser(description='argparse demo')
parser.add_argument('-w', '--word', help='a word (required)', required=True)
parser.add_argument('-s', '--sentence', help='a sentence (not required)', required=False)
parser.add_argument('-c', '--comparison', help='a word to compare (not required)', required=False)
args = parser.parse_args()

print("Word: {}".format(args.word))

if args.sentence:
  print("Sentence: :{}".format(args.sentence))

if args.comparison:
  if args.comparison == args.word:
      print("Comparison: the provided word argument and provided comparison argument is the same")
  else:
      print("Comparison: the provided word argument and provided comparison argument is NOT the same")

Seeing it in action:

To return a usage/help info, run it with the -h or --help argument:

$ python foo.py -h
usage: foo.py [-h] -w WORD [-s SENTENCE] [-c COMPARISON]

argparse demo

optional arguments:
  -h, --help            show this help message and exit
  -w WORD, --word WORD  a word (required)
  -s SENTENCE, --sentence SENTENCE
                        a sentence (not required)
  -c COMPARISON, --comparison COMPARISON
                        a word to compare (not required)

For this to work, the application expects the word argument, as we declared it with required=True:

$ python foo.py -w hello
Word: hello

Now to use the arguments that are not required, which makes them optional:

$ python foo.py -w hello -s "hello, world"
Word: hello
Sentence: :hello, world

We can also implement some if statements in our application to compare whether arguments are the same (as a basic example):

$ python foo.py -w hello -s "hello, world" -c goodbye
Word: hello
Sentence: :hello, world
Comparison: the provided word argument and provided comparison argument is NOT the same

We can see that the word and comparison arguments are not the same. When they match up:

$ python foo.py -w hello -s "hello, world" -c hello
Word: hello
Sentence: :hello, world
Comparison: the provided word argument and provided comparison argument is the same

This was a very basic demonstration on the argparse module.
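argparse has more options than shown here; for example, type= converts the argument for you and choices= restricts the allowed values. A small sketch (the flags below are hypothetical, not part of the demo script above):

```python
import argparse

# a sketch of two extra argparse features: type= converts the value
# for you, and choices= restricts what the user may pass.
parser = argparse.ArgumentParser(description='argparse extras demo')
parser.add_argument('-n', '--number', type=int, required=True, help='an integer')
parser.add_argument('-l', '--level', choices=['low', 'high'], default='low')

# parse_args accepts an explicit list, which is handy for testing
args = parser.parse_args(['-n', '3', '-l', 'high'])
print(args.number + 1)  # -> 4 (argparse converted '3' to an int)
print(args.level)       # -> high
```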


How to Monitor a Amazon Elasticsearch Service Cluster Update Process

When you make a configuration change on Amazon’s Elasticsearch, it does a blue/green deployment. New nodes are allocated to the cluster (which you will notice in CloudWatch when looking at the node metrics). Once these nodes are deployed, data gets copied across to the new nodes and traffic gets directed to them, and once that is done, the old nodes get terminated.

Note: While there will be more nodes in the cluster, you will not get billed for the extra nodes.

While this process is running, you can monitor your cluster to see the progress:

The Shards API:

Using the /_cat/shards API, you will find that the shards are in a RELOCATING state (keeping in mind, this is while the change is still in progress):

curl -s -XGET 'https://search-example-elasticsearch-cluster-6-abc123defghijkl5airxticzvjaqy.eu-west-1.es.amazonaws.com/_cat/shards?v' | grep -v 'STARTED'
index                                   shard prirep state         docs    store ip            node
example-app1-2018.02.23                 4     r      RELOCATING  323498 1018.3mb x.x.x.x x2mKoe_ -> x.x.x.x GyNiRJyeSTifN_9JZisGuQ GyNiRJy
example-app1-2018.02.28                 2     p      RELOCATING  477609    1.5gb x.x.x.x x2mKoe_ -> x.x.x.x sOihejw1SrKtag_LO1RGIA sOihejw
example-app1-2018.03.01                 3     r      RELOCATING  463143    1.5gb x.x.x.x  ZZfv-Ha -> x.x.x.x jOchdCZWQq-TAPZNTadNoA jOchdCZ
fortinet-syslog-2018.02                 0     p      RELOCATING 1218556  462.2mb x.x.x.x  moQA57Y -> x.x.x.x sOihejw1SrKtag_LO1RGIA sOihejw
example-app1-2018.03.23                 3     r      RELOCATING  821254    2.4gb x.x.x.x  moQA57Y -> x.x.x.x GyNiRJyeSTifN_9JZisGuQ GyNiRJy
example-app1-2018.04.02                 2     p      RELOCATING 1085279    3.4gb x.x.x.x x2mKoe_ -> x.x.x.x jOchdCZWQq-TAPZNTadNoA jOchdCZ
example-app1-2018.02.08                 3     p      RELOCATING  136321    125mb x.x.x.x ZUZSFWu -> x.x.x.x tyU_V_KLS5mZXEwnF-YEAQ tyU_V_K
fortinet-syslog-2018.04                 4     r      RELOCATING 7513842    2.8gb x.x.x.x  ZZfv-Ha -> x.x.x.x il1WsroNSgGmXJugds_aMQ il1Wsro
example-app1-2018.04.09                 1     r      RELOCATING 1074581    3.5gb x.x.x.x  ZRzKGe5 -> x.x.x.x il1WsroNSgGmXJugds_aMQ il1Wsro
example-app1-2018.04.09                 0     p      RELOCATING 1074565    3.5gb x.x.x.x  moQA57Y -> x.x.x.x tyU_V_KLS5mZXEwnF-YEAQ tyU_V_K

The Recovery API:

We can then use the /_cat/recovery API, which shows the progress of the shards being transferred to the other nodes. You will find the following columns:

  • index, shard, time, type, stage, source_host, target_host, files, files_percent, bytes, bytes_percent

As Amazon masks their node ip addresses, we will find that the ips are not shown. To make the output more readable, we will request only the columns that we are interested in, and exclude the shards that are already in the done stage:

$ curl -s -XGET 'https://search-example-elasticsearch-cluster-6-abc123defghijkl5airxticzvjaqy.eu-west-1.es.amazonaws.com/_cat/recovery?v&h=i,s,t,ty,st,shost,thost,f,fp,b,bp' | grep -v 'done'
i                                       s t     ty          st       shost         thost         f   fp     b          bp
example-app1-2018.04.11                 1 2m    peer        index    x.x.x.x x.x.x.x  139 97.1%  3435483673 65.9%
web-syslog-2018.04                 4 7.6m  peer        finalize x.x.x.x x.x.x.x  109 100.0% 2854310892 100.0%
example-app1-2018.04.16                 3 2.9m  peer        translog x.x.x.x x.x.x.x  130 100.0% 446180036  100.0%
example-app1-2018.03.30                 3 2.1m  peer        index    x.x.x.x  x.x.x.x  127 97.6%  3862498583 62.5%
example-app1-2018.04.01                 0 4.4m  peer        index    x.x.x.x  x.x.x.x  140 99.3%  3410543270 87.9%
example-app1-2018.04.06                 0 5.1m  peer        index    x.x.x.x x.x.x.x  128 97.7%  4291421948 66.3%
example-app1-2018.04.07                 0 52.2s peer        index    x.x.x.x x.x.x.x 149 91.9%  3969581277 27.4%
network-capture-2018.04.01               2 11.4s peer        index    x.x.x.x  x.x.x.x 107 95.3%  359987163  55.0%
example-app1-2018.03.17                 1 1.7m  peer        index    x.x.x.x  x.x.x.x 117 98.3%  2104196548 74.5%
example-app1-2018.02.25                 3 58.4s peer        index    x.x.x.x  x.x.x.x 102 98.0%  945437614  74.7%
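Since the _cat APIs return whitespace-delimited text, the output is also easy to post-process in Python by zipping each row against the header. A sketch using shortened sample lines in the same shape as the output above:

```python
# parse _cat-style output: the first line is the header, the rest are
# rows; split() absorbs the variable-width whitespace padding.
sample = """i s t ty st shost thost f fp b bp
example-app1-2018.04.11 1 2m peer index x.x.x.x x.x.x.x 139 97.1% 3435483673 65.9%
example-app1-2018.04.16 3 2.9m peer translog x.x.x.x x.x.x.x 130 100.0% 446180036 100.0%"""

lines = sample.splitlines()
header = lines[0].split()
rows = [dict(zip(header, line.split())) for line in lines[1:]]

for row in rows:
    print(row['i'], row['st'], row['bp'])
```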

We can also see the per-index recovery output in JSON format, which shows much more detail:

$ curl -s -XGET 'https://search-example-elasticsearch-cluster-6-abc123defghijkl5airxticzvjaqy.eu-west-1.es.amazonaws.com/example-app1-2018.04.03/_recovery?human' | python -m json.tool
{
    "example-app1-2018.04.03": {
        "shards": [
            {
                "id": 0,
                "index": {
                    "files": {
                        "percent": "100.0%",
                        "recovered": 103,
                        "reused": 0,
                        "total": 103
                    },
                    "size": {
                        "percent": "100.0%",
                        "recovered": "3.6gb",
                        "recovered_in_bytes": 3926167091,
                        "reused": "0b",
                        "reused_in_bytes": 0,
                        "total": "3.6gb",
                        "total_in_bytes": 3926167091
                    },
                    "source_throttle_time": "2m",
                    "source_throttle_time_in_millis": 121713,
                    "target_throttle_time": "2.1m",
                    "target_throttle_time_in_millis": 126170,
                    "total_time": "7.2m",
                    "total_time_in_millis": 434142
                },
                "primary": true,
                "source": {
                    "host": "x.x.x.x",
                    "id": "ZRzKGe5WSg2SzilZGb3RbA",
                    "ip": "x.x.x.x",
                    "name": "ZRzKGe5",
                    "transport_address": "x.x.x.x:9300"
                },
                "stage": "DONE",
                "start_time": "2018-04-10T19:26:48.668Z",
                "start_time_in_millis": 1523388408668,
                "stop_time": "2018-04-10T19:34:04.980Z",
                "stop_time_in_millis": 1523388844980,
                "target": {
                    "host": "x.x.x.x",
                    "id": "x2mKoe_GTpe3b1CnXOKisA",
                    "ip": "x.x.x.x",
                    "name": "x2mKoe_",
                    "transport_address": "x.x.x.x:9300"
                },
                "total_time": "7.2m",
                "total_time_in_millis": 436311,
                "translog": {
                    "percent": "100.0%",
                    "recovered": 0,
                    "total": 0,
                    "total_on_start": 0,
                    "total_time": "1.1s",
                    "total_time_in_millis": 1154
                },
                "type": "PEER",
                "verify_index": {
                    "check_index_time": "0s",
                    "check_index_time_in_millis": 0,
                    "total_time": "0s",
                    "total_time_in_millis": 0
                }
            },

The Cluster Health API:

Amazon restricts most of the /_cluster API actions, but we can however use the health endpoint, where we can see the number of nodes, active_shards, relocating_shards, number_of_pending_tasks, etc.:

$ curl -XGET https://search-example-elasticsearch-cluster-6-abc123defghijkl5airxticzvjaqy.eu-west-1.es.amazonaws.com/_cluster/health?pretty
{
  "cluster_name" : "0123456789012:example-elasticsearch-cluster-6",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 16,
  "number_of_data_nodes" : 10,
  "active_primary_shards" : 803,
  "active_shards" : 1606,
  "relocating_shards" : 10,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 100.0
}
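If you want to poll this endpoint from a script, a small helper can decide when the relocation has finished. A sketch of just the decision logic, fed with the values from the example response above (the field names come from the /_cluster/health output):

```python
def relocation_done(health):
    """Decide from a /_cluster/health response whether the blue/green
    deployment has finished moving shards."""
    return (
        health['status'] == 'green'
        and health['relocating_shards'] == 0
        and health['initializing_shards'] == 0
        and health['unassigned_shards'] == 0
    )

# the values from the example response above: still relocating
health = {
    'status': 'green',
    'relocating_shards': 10,
    'initializing_shards': 0,
    'unassigned_shards': 0,
}
print(relocation_done(health))  # -> False while shards are relocating
```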

The Pending Tasks API:

We also have some insights into the /_cat/pending_tasks API:

$ curl -s -XGET 'https://search-example-elasticsearch-cluster-6-abc123defghijkl5airxticzvjaqy.eu-west-1.es.amazonaws.com/_cat/pending_tasks?v'
insertOrder timeInQueue priority source
1757        53ms URGENT   shard-started shard id [[network-metrics-2018.04.13][0]], allocation id [Qh91o_OGRX-lFnY8KxYgQw], primary term [0], message [after peer recovery]


Experimenting With Python and TinyMongo a MongoDB Wrapper for TinyDB

TinyMongo is a MongoDB-like wrapper on top of TinyDB.

This is awesome for testing, where you need a local document orientated database which is backed by a flat file. It feels just like using MongoDB, except that it’s local and lightweight, with TinyDB as the backend.

Installing Dependencies:

$ pip install tinymongo

Usage Examples:

Initialize tinymongo and create the database and collection:

>>> from tinymongo import TinyMongoClient
>>> connection = TinyMongoClient('foo')
>>> db_init = connection.mydb
>>> db = db_init.users

Insert a document, capture the document id, and search for that document:

>>> record_id = db.insert_one({'username': 'ruanb', 'name': 'ruan', 'age': 31, 'gender': 'male', 'location': 'south africa'}).inserted_id
>>> user_info = db.find_one({"_id": record_id})
>>> print(user_info)
{u'username': u'ruanb', u'name': u'ruan', u'gender': u'male', u'age': 31, u'_id': u'8d2ce01140ec11e888110242ac110004', u'location': u'south africa'}

Update a document: Update the age attribute from 31 to 32

>>> db.update_one({'_id': '8d2ce01140ec11e888110242ac110004'}, {'$set': {'age': 32 }})
>>> user_info = db.find_one({'_id': '8d2ce01140ec11e888110242ac110004'})
>>> print(user_info)
{u'username': u'ruanb', u'name': u'ruan', u'gender': u'male', u'age': 32, u'_id': u'8d2ce01140ec11e888110242ac110004', u'location': u'south africa'}

Insert some more data:

>>> record_id = db.insert_one({'username': 'stefanb', 'name': 'stefan', 'age': 30, 'gender': 'male', 'location': 'south africa'}).inserted_id
>>> record_id = db.insert_one({'username': 'alexa', 'name': 'alex', 'age': 34, 'gender': 'male', 'location': 'south africa'}).inserted_id

Find all the users, sorted by descending age, oldest to youngest:

>>> response = db.find(sort=[('age', -1)])
>>> for doc in response:
...     print(doc)
...
{u'username': u'alexa', u'name': u'alex', u'gender': u'male', u'age': 34, u'_id': u'66b1cc3d40ee11e892980242ac110004', u'location': u'south africa'}
{u'username': u'ruanb', u'name': u'ruan', u'gender': u'male', u'age': 32, u'_id': u'8d2ce01140ec11e888110242ac110004', u'location': u'south africa'}
{u'username': u'stefanb', u'name': u'stefan', u'gender': u'male', u'age': 30, u'_id': u'fbe9da8540ed11e88c5e0242ac110004', u'location': u'south africa'}

Find the number of documents in the collection:

>>> db.find().count()
3
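Since TinyMongo hands back plain dicts, results can be post-processed with ordinary Python. As a small sketch, computing the average age over the three sample documents from above:

```python
# the documents come back as plain dicts, so ordinary Python works
# on them; these are the three sample users from this post.
users = [
    {'username': 'alexa', 'age': 34},
    {'username': 'ruanb', 'age': 32},
    {'username': 'stefanb', 'age': 30},
]

average_age = sum(u['age'] for u in users) / len(users)
print(average_age)  # -> 32.0
```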


Experimenting With Python and Flata the Lightweight Document Orientated Database

Flata is a lightweight document orientated database, which was inspired by TinyDB and LowDB.

Why Flata?

Most of the time my mind gets into its curious state and I think about alternative ways of doing things, especially for testing lightweight apps. Today I wondered if there’s any NoSQL-like software out there that is easy to spin up and is backed by a flat file, something like SQLite for SQL-like databases, but for the NoSQL-like world.

So I stumbled upon TinyDB and Flata, which are really easy to use and awesome!

What will we be doing today:

  • Create Database / Table
  • Write to the Table
  • Update Documents from the Table
  • Scan the Table
  • Query the Table
  • Delete Documents from the Table
  • Purge the Table

Getting the Dependencies:

Flata is written in Python, so no external dependencies are needed. To install it:

$ pip install flata

Usage Examples:

My home working directory:

$ pwd
/home/ruan

This will be the directory where we will save our database in .json format.

Import the Dependencies:

>>> from flata import Flata, Query, where
>>> from flata.storages import JSONStorage

Create the Database file where all the data will be persisted:

>>> db_init = Flata('mydb.json', storage=JSONStorage)

Create the collection / table with a custom id field. If the resource already exists, it will be retrieved:

>>> db_init.table('collection1', id_field = 'uid')

List the tables:

>>> db_init.all()
{u'collection1': {}}

A get can only be done if the resource exists; we will assign the result to the db object:

>>> db = db_init.get('collection1')

Insert some data into our table:

>>> db.insert({'username': 'ruanb', 'name': 'ruan', 'age': 31, 'gender': 'male', 'location': 'south africa'})
{'username': 'ruanb', 'uid': 1, 'gender': 'male', 'age': 31, 'location': 'south africa', 'name': 'ruan'}

>>> db.insert({'username': 'stefanb', 'name': 'stefan', 'age': 30, 'gender': 'male', 'location': 'south africa'})
{'username': 'stefanb', 'uid': 2, 'gender': 'male', 'age': 30, 'location': 'south africa', 'name': 'stefan'}

>>> db.insert({'username': 'mikec', 'name': 'mike', 'age': 28, 'gender': 'male', 'location': 'south africa'})
{'username': 'mikec', 'uid': 3, 'gender': 'male', 'age': 28, 'location': 'south africa', 'name': 'mike'}

>>> db.insert({'username': 'sam', 'name': 'samantha', 'age': 24, 'gender': 'female', 'location': 'south africa'})
{'username': 'sam', 'uid': 4, 'gender': 'female', 'age': 24, 'location': 'south africa', 'name': 'samantha'}

>>> db.insert({'username': 'michellek', 'name': 'michelle', 'age': 32, 'gender': 'female', 'location': 'south africa'})
{'username': 'michellek', 'uid': 5, 'gender': 'female', 'age': 32, 'location': 'south africa', 'name': 'michelle'}

Scan the whole table:

>>> db.all()
[{u'username': u'ruanb', u'uid': 1, u'name': u'ruan', u'gender': u'male', u'age': 31, u'location': u'south africa'}, {u'username': u'stefanb', u'uid': 2, u'name': u'stefan', u'gender': u'male', u'age': 30, u'location': u'south africa'}, {u'username': u'mikec', u'uid': 3, u'name': u'mike', u'gender': u'male', u'age': 28, u'location': u'south africa'}, {u'username': u'sam', u'uid': 4, u'name': u'samantha', u'gender': u'female', u'age': 24, u'location': u'south africa'}, {u'username': u'michellek', u'uid': 5, u'name': u'michelle', u'gender': u'female', u'age': 32, u'location': u'south africa'}]

Query data from the table.

Query the table for the username => ruanb:

>>> import json
>>> q = Query()

>>> response = db.search(q.username == 'ruanb')
>>> print(json.dumps(response, indent=2))
[
  {
    u'username': u'ruanb',
    u'uid': 1,
    u'name': u'ruan',
    u'gender': u'male',
    u'age': 31,
    u'location': u'south africa'
  }
]

Query the table for everyone that is older than 29 and only male genders:

>>> db.search(( q.gender == 'male' ) & (q.age >= 29 ))
[
  {
    u'username': u'ruanb',
    u'uid': 1,
    u'name': u'ruan',
    u'gender': u'male',
    u'age': 31,
    u'location': u'south africa'
  },
  {
    u'username': u'stefanb',
    u'uid': 2,
    u'name': u'stefan',
    u'gender': u'male',
    u'age': 30,
    u'location': u'south africa'
  }
]

Query the table for everyone that is younger than 25 or males:

>>> db.search(( q.age < 25 ) | (q.gender == 'male' ) )
[
  {
    "username": "ruanb",
    "uid": 1,
    "name": "ruan",
    "gender": "male",
    "age": 31,
    "location": "south africa"
  },
  {
    "username": "stefanb",
    "uid": 2,
    "name": "stefan",
    "gender": "male",
    "age": 30,
    "location": "south africa"
  },
  {
    "username": "mikec",
    "uid": 3,
    "name": "mike",
    "gender": "male",
    "age": 28,
    "location": "south africa"
  },
  {
    "username": "sam",
    "uid": 4,
    "name": "samantha",
    "gender": "female",
    "age": 24,
    "location": "south africa"
  }
]

Update the location value: let’s say Samantha relocated to New Zealand, and we need to update her location from South Africa to New Zealand:

>>> db.update({'location': 'new zealand'}, where('username') == 'sam' )
([4], [{u'username': u'sam', u'uid': 4, u'name': u'samantha', u'gender': u'female', u'age': 24, u'location': 'new zealand'}])

>>> db.search(q.username == 'sam')
[{u'username': u'sam', u'uid': 4, u'name': u'samantha', u'gender': u'female', u'age': 24, u'location': u'new zealand'}]

Delete a document by its id:

>>> db.remove(ids=[4])
([4], [])

Delete all documents matching a query; for this example, all people with the gender male:

>>> db.remove(q.gender == 'male')
([1, 2, 3], [])

Delete all the data in the table:

>>> db.purge()

When we exit, you will find the database file that we created:

$ ls
mydb.json
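Because Flata persists everything as plain JSON, the database file can be inspected with nothing but the json module. A self-contained sketch (it writes a tiny sample in the same shape first, since the real mydb.json is created by Flata itself):

```python
import json
import os
import tempfile

# Flata stores tables as top-level keys, with documents keyed by id;
# we write a tiny sample in that shape so the example is self-contained.
sample = {"collection1": {"1": {"uid": 1, "username": "ruanb"}}}

path = os.path.join(tempfile.gettempdir(), 'mydb.json')
with open(path, 'w') as f:
    json.dump(sample, f)

with open(path) as f:
    data = json.load(f)

print(list(data.keys()))         # the tables in the database
print(data['collection1']['1'])  # a single document
```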


Set Docker Environment Variables During Build Time

When using the ARG instruction in your Dockerfile, you can specify the --build-arg option to define the value for the key declared in your Dockerfile, which can then be used for an environment variable, as an example.

Today we will use ARG and ENV to set environment variables at build time.

The Dockerfile:

Our Dockerfile

FROM alpine:edge
ARG NAME
ENV OWNER=${NAME:-NOT_DEFINED}
CMD ["sh", "-c", "echo env var: ${OWNER}"]

When building our image, we will pass a value for our NAME argument:

$ docker build --build-arg NAME=james -t ruan:test .

Now when we run our container, we will notice that the build-time argument has been passed through to the environment variable in the running container:

$ docker run -it ruan:test
env var: james

When we build the image without specifying build arguments and run the container:

$ docker build -t ruan:test .
$ docker run -it ruan:test
env var: NOT_DEFINED

Docker Environment Substitution With Dockerfile

The 12 Factor methodology is a general guideline that provides best practices for building applications. One of them is using environment variables to store application configuration.

What will we be doing:

In this post we will build a simple docker application that returns the environment variable’s value to standard out. We are using environment substitution, so if the environment variable is not provided, we will set a default value of NOT_DEFINED.

We will have the environment variable OWNER and when no value is set for that Environment Variable, the NOT_DEFINED value will be returned.
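The ${OWNER:-NOT_DEFINED} syntax used in the CMD is plain POSIX shell parameter expansion, so the fallback behaviour can be tried out without Docker at all:

```shell
# POSIX default-value expansion, the same mechanism the Dockerfile CMD uses
unset OWNER
echo "env var: ${OWNER:-NOT_DEFINED}"   # -> env var: NOT_DEFINED

OWNER=ruan
echo "env var: ${OWNER:-NOT_DEFINED}"   # -> env var: ruan
```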

The Dockerfile

Our Dockerfile:

FROM alpine:edge
ENV OWNER=${OWNER:-NOT_DEFINED}
CMD ["sh", "-c", "echo env var: ${OWNER}"]

Building the image:

$ docker build -t test:envs .

Putting it to action:

Now we will run a container and pass the OWNER environment variable as an option:

$ docker run -it -e OWNER=ruan test:envs
env var: ruan

When we run a container without specifying the environment variable:

$ docker run -it test:envs
env var: NOT_DEFINED


Using AWS SSM Parameter Store to Retrieve Secrets Encrypted by KMS Using Python

Today we will use the Amazon Web Services SSM service to store secrets in its Parameter Store, which we will encrypt using KMS.

Then we will read the data from SSM and decrypt it using our KMS key. We will end off by writing a Python script that reads the AWS credentials, authenticates with SSM, and reads the secret values that we stored.

The Do List:

We will break up this post in the following topics:

  • Create a KMS Key which we will use to Encrypt/Decrypt the Parameter in SSM
  • Create the IAM Policy which will be used to authorize the Encrypt/Decrypt by the KMS ID
  • Create the KMS Alias
  • Create the Parameter using PutParameter as a SecureString to use Encryption with KMS
  • Describe the Parameters
  • Read the Parameter with and without Decryption to determine the difference using GetParameter
  • Read the Parameters using GetParameters
  • Environment Variable Example

Create the KMS Key:

As the administrator, or root account, create the KMS Key:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
>>> import boto3
>>> session = boto3.Session(region_name='eu-west-1', profile_name='personal')
>>> iam = session.client('iam')
>>> kms = session.client('kms')
>>> response = kms.create_key(
    Description='Ruan Test Key',
    KeyUsage='ENCRYPT_DECRYPT',
    Origin='AWS_KMS',
    BypassPolicyLockoutSafetyCheck=False,
    Tags=[{'TagKey': 'Name', 'TagValue': 'RuanTestKey'}]
)

>>> print(response['KeyMetadata']['KeyId'])
foobar-2162-4363-ba02-a953729e5ce6

Create the IAM Policy:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
>>> response = iam.create_policy(
    PolicyName='ruan-kms-test-policy',
    PolicyDocument='''{
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "Stmt1517212478199",
            "Action": [
                "kms:Decrypt",
                "kms:Encrypt"
            ],
            "Effect": "Allow",
            "Resource": "arn:aws:kms:eu-west-1:0123456789012:key/foobar-2162-4363-ba02-a953729e5ce6"
        }]
    }''',
    Description='Ruan KMS Test Policy'
)
>>> print(response['Policy']['Arn'])
arn:aws:iam::0123456789012:policy/ruan-kms-test-policy

Create the KMS Alias:

1
>>> response = kms.create_alias(AliasName='alias/ruan-test-kms', TargetKeyId='foobar-2162-4363-ba02-a953729e5ce6')

Publish the Secrets to SSM:

As the administrator, write the secret values to the parameter store in SSM. We will publish a secret with the Parameter: /test/ruan/mysql/db01/mysql_hostname and the Value: db01.eu-west-1.mycompany.com:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
>>> from getpass import getpass
>>> secretvalue = getpass()
Password:

>>> print(secretvalue)
db01.eu-west-1.mycompany.com

>>> ssm = session.client('ssm')
>>> response = ssm.put_parameter(
    Name='/test/ruan/mysql/db01/mysql_hostname',
    Description='RuanTest MySQL Hostname',
    Value=secretvalue,
    Type='SecureString',
    KeyId='foobar-2162-4363-ba02-a953729e5ce6',
    Overwrite=False
)

Describe Parameters

Describe the Parameter that we wrote to SSM:

1
2
3
4
5
>>> response = ssm.describe_parameters(
    Filters=[{'Key': 'Name', 'Values': ['/test/ruan/mysql/db01/mysql_hostname']}]
)
>>> print(response['Parameters'][0]['Name'])
/test/ruan/mysql/db01/mysql_hostname

Reading from SSM:

Read the Parameter value from SSM without using decryption via KMS:

1
2
3
>>> response = ssm.get_parameter(Name='/test/ruan/mysql/db01/mysql_hostname')
>>> print(response['Parameter']['Value'])
AQICAHh7jazUUBgNxMQbYFeve2/p+UWTuyAd5F3ZJkZkf9+hwgF+H+kSABfPCTEarjXqYBaJAAAAejB4BgkqhkiG9w0BBwagazBpAgEAMGQGCSqGSIb3DQEHATAeBglghkgBZQMEAS4wEQQMJUEuT8wDGCQ3zRBmAgEQgDc8LhLgFe+Rutgi0hOKnjTEVQa2lKTy3MmTDZEeLy3Tlr5VUl6AVJNBpd4IWJTbj5YuqrrAAWWJ

As you can see, the value is encrypted. This time, read the parameter value while specifying decryption via KMS:

1
2
3
>>> response = ssm.get_parameter(Name='/test/ruan/mysql/db01/mysql_hostname', WithDecryption=True)
>>> print(response['Parameter']['Value'])
db01.eu-west-1.mycompany.com

Grant Permissions to Instance Profile:

Now we will create a policy that can only decrypt and read values from SSM that match the path: /test/ruan/mysql/db01/mysql_*. This policy will be associated with an instance profile role, which will be used by EC2, where our application will read the values from.

Our policy will look like this:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "Stmt1517398919242",
      "Action": [
        "kms:Decrypt"
      ],
      "Effect": "Allow",
      "Resource": "arn:aws:kms:eu-west-1:0123456789012:key/foobar-2162-4363-ba02-a953729e5ce6"
    },
    {
      "Sid": "Stmt1517399021096",
      "Action": [
        "ssm:GetParameter"
      ],
      "Effect": "Allow",
      "Resource": "arn:aws:ssm:eu-west-1:0123456789012:parameter/test/ruan/mysql/db01/mysql_*"
    }
  ]
}

Create the Policy:

1
2
>>> pol = '{"Version": "2012-10-17","Statement": [{"Sid": "Stmt1517398919242","Action": ["kms:Decrypt"],"Effect": "Allow","Resource": "arn:aws:kms:eu-west-1:0123456789012:key/foobar-2162-4363-ba02-a953729e5ce6"},{"Sid": "Stmt1517399021096","Action": ["ssm:GetParameter"],"Effect": "Allow","Resource": "arn:aws:ssm:eu-west-1:0123456789012:parameter/test/ruan/mysql/db01/mysql_*"}]}'
>>> response = iam.create_policy(PolicyName='RuanGetSSM-Policy', PolicyDocument=pol, Description='Test Policy to Get SSM Parameters')
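Hand-minifying the JSON into a one-line string is error prone; the same policy document can instead be built as a Python dict and serialized with `json.dumps` before being passed to `create_policy`. A sketch:

```python
import json

# Build the policy as a dict and serialize it, avoiding quoting mistakes
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "Stmt1517398919242",
            "Action": ["kms:Decrypt"],
            "Effect": "Allow",
            "Resource": "arn:aws:kms:eu-west-1:0123456789012:key/foobar-2162-4363-ba02-a953729e5ce6",
        },
        {
            "Sid": "Stmt1517399021096",
            "Action": ["ssm:GetParameter"],
            "Effect": "Allow",
            "Resource": "arn:aws:ssm:eu-west-1:0123456789012:parameter/test/ruan/mysql/db01/mysql_*",
        },
    ],
}
pol = json.dumps(policy)
```

The resulting `pol` string can be passed straight to `iam.create_policy` as the `PolicyDocument` argument.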

Create the instance profile:

1
>>> response = iam.create_instance_profile(InstanceProfileName='RuanTestSSMInstanceProfile')

Create the Role:

1
>>> response = iam.create_role(RoleName='RuanTestGetSSM-Role', AssumeRolePolicyDocument='{"Version": "2012-10-17","Statement": [{"Sid": "","Effect": "Allow","Principal": {"Service": "ec2.amazonaws.com"},"Action": "sts:AssumeRole"}]}')

Associate the Role and Instance Profile:

1
>>> response = iam.add_role_to_instance_profile(InstanceProfileName='RuanTestSSMInstanceProfile', RoleName='RuanTestGetSSM-Role')

Attach the Policy to the Role:

1
>>> response = iam.put_role_policy(RoleName='RuanTestGetSSM-Role', PolicyName='RuanTestGetSSMPolicy1', PolicyDocument=pol)

Launch the EC2 instance with the above-mentioned role. Create get_ssm.py and run it to decrypt and read the value from SSM:

get_ssm.py
1
2
3
4
5
import boto3
session = boto3.Session(region_name='eu-west-1')
ssm = session.client('ssm')
hostname = ssm.get_parameter(Name='/test/ruan/mysql/db01/mysql_hostname', WithDecryption=True)
print(hostname['Parameter']['Value'])

Run it:

1
2
$ python get_ssm.py
db01.eu-west-1.mycompany.com

Reading with GetParameters:

So say that we created more than one parameter under the path that we allowed; let's use GetParameters to read more than one parameter:

get_parameters.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
import boto3
session = boto3.Session(region_name='eu-west-1')
ssm = session.client('ssm')
response = ssm.get_parameters(
    Names=[
        '/test/ruan/mysql/db01/mysql_hostname',
        '/test/ruan/mysql/db01/mysql_user'
    ],
    WithDecryption=True
)

for secrets in response['Parameters']:
    if secrets['Name'] == '/test/ruan/mysql/db01/mysql_hostname':
        print("Hostname: {}".format(secrets['Value']))
    if secrets['Name'] == '/test/ruan/mysql/db01/mysql_user':
        print("Username: {}".format(secrets['Value']))

Run it:

1
2
3
$ python get_parameters.py
Hostname: db01.eu-west-1.mycompany.com
Username: super_dba
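As more parameters are added under the allowed path, the chain of `if` statements above grows; the `Parameters` list can instead be flattened into a dictionary keyed on parameter name. The `response` dict below is a stand-in that mirrors the shape of the GetParameters output above:

```python
# Stand-in for a GetParameters response, mirroring the output above
response = {
    "Parameters": [
        {"Name": "/test/ruan/mysql/db01/mysql_hostname",
         "Value": "db01.eu-west-1.mycompany.com"},
        {"Name": "/test/ruan/mysql/db01/mysql_user",
         "Value": "super_dba"},
    ]
}

# Flatten the list of parameters into a {name: value} dict
secrets = {p["Name"]: p["Value"] for p in response["Parameters"]}

print("Hostname: {}".format(secrets["/test/ruan/mysql/db01/mysql_hostname"]))
print("Username: {}".format(secrets["/test/ruan/mysql/db01/mysql_user"]))
```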

Environment Variable Example from an Application:

Set the Environment Variable value to the SSM key:

1
2
$ export MYSQL_HOSTNAME="/test/ruan/mysql/db01/mysql_hostname"
$ export MYSQL_USERNAME="/test/ruan/mysql/db01/mysql_user"

The application code:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
import os
import boto3

session = boto3.Session(region_name='eu-west-1')
ssm = session.client('ssm')

MYSQL_HOSTNAME = os.environ.get('MYSQL_HOSTNAME')
MYSQL_USERNAME = os.environ.get('MYSQL_USERNAME')

hostname = ssm.get_parameter(Name=MYSQL_HOSTNAME, WithDecryption=True)
username = ssm.get_parameter(Name=MYSQL_USERNAME, WithDecryption=True)

print("Hostname: {}".format(hostname['Parameter']['Value']))
print("Username: {}".format(username['Parameter']['Value']))

Let the application resolve the keys to their SSM values:

1
2
3
$ python app.py
Hostname: db01.eu-west-1.mycompany.com
Username: super_dba

Resources:

Great thanks to the following resources: