Ruan Bekker's Blog

From a Curious mind to Posts on Github

Encryption and Decryption With the PyCrypto Module Using the AES Cipher in Python

While I’m learning a lot about encryption at the moment, I wanted to test out encryption with the PyCrypto module in Python using the Advanced Encryption Standard (AES) Symmetric Block Cipher.

Installing PyCrypto:

1
$ pip install pycrypto --user

PyCrypto Example:

Our AES Key needs to be either 16, 24 or 32 bytes long and our Initialization Vector needs to be 16 Bytes long. That will be generated using the random and string modules.

Encrypting:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
>>> from Crypto.Cipher import AES
>>> import random, string, base64

>>> key = ''.join(random.choice(string.ascii_uppercase + string.ascii_lowercase + string.digits) for x in range(32))
>>> iv = ''.join(random.choice(string.ascii_uppercase + string.ascii_lowercase + string.digits) for x in range(16))

>>> print(key, len(key))
('BLhgpCL81fdLBk23HkZp8BgbT913cqt0', 32)
>>> print(iv, len(iv))
('OWFJATh1Zowac2xr', 16)

>>> enc_s = AES.new(key, AES.MODE_CFB, iv)
>>> cipher_text = enc_s.encrypt('this is a super important message')
>>> encoded_cipher_text = base64.b64encode(cipher_text)
>>> print(encoded_cipher_text)
'AtBa6zVB0UQ3U/50ogOb6g09FlyPdpmJB7UzoCqxhsQ6'

Decrypting:

1
2
3
4
5
6
7
8
9
>>> from Crypto.Cipher import AES
>>> import base64
>>> key = 'BLhgpCL81fdLBk23HkZp8BgbT913cqt0'
>>> iv = 'OWFJATh1Zowac2xr'

>>> decryption_suite = AES.new(key, AES.MODE_CFB, iv)
>>> plain_text = decryption_suite.decrypt(base64.b64decode(encoded_cipher_text))
>>> print(plain_text)
this is a super important message

It’s not needed to use base64, but to have the ability to stay away from strange characters I decided to encode them with base64 :D

References:

Running a 3 Node Elasticsearch Cluster With Docker Compose on Your Laptop for Testing

Having a Elasticsearch cluster on your laptop with Docker for testing is great. And in this post I will show you how quick and easy it is, to have a 3 node elasticsearch cluster running on docker for testing.

Pre-Requisites

We need to set the vm.max_map_count kernel parameter:

1
$ sudo sysctl -w vm.max_map_count=262144

To set this permanently, add it to /etc/sysctl.conf and reload with sudo sysctl -p

Docker Compose:

The docker compose file that we will reference:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
version: '2.2'
services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:6.2.4
    container_name: elasticsearch
    environment:
      - cluster.name=docker-cluster
      - bootstrap.memory_lock=true
      - http.cors.enabled=true
      - http.cors.allow-origin=*
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
    ulimits:
      memlock:
        soft: -1
        hard: -1
    volumes:
      - esdata1:/home/ruan/workspace/docker/elasticsearch/data
    ports:
      - 9200:9200
    networks:
      - esnet
  elasticsearch2:
    image: docker.elastic.co/elasticsearch/elasticsearch:6.2.4
    container_name: elasticsearch2
    environment:
      - cluster.name=docker-cluster
      - bootstrap.memory_lock=true
      - http.cors.enabled=true
      - http.cors.allow-origin=*
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
      - "discovery.zen.ping.unicast.hosts=elasticsearch"
    ulimits:
      memlock:
        soft: -1
        hard: -1
    volumes:
      - esdata2:/home/ruan/workspace/docker/elasticsearch/data
    networks:
      - esnet
  elasticsearch3:
    image: docker.elastic.co/elasticsearch/elasticsearch:6.2.4
    container_name: elasticsearch3
    environment:
      - cluster.name=docker-cluster
      - bootstrap.memory_lock=true
      - http.cors.enabled=true
      - http.cors.allow-origin=*
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
      - "discovery.zen.ping.unicast.hosts=elasticsearch"
    ulimits:
      memlock:
        soft: -1
        hard: -1
    volumes:
      - esdata3:/home/ruan/workspace/docker/elasticsearch/data
    networks:
      - esnet

  kibana:
    image: 'docker.elastic.co/kibana/kibana:6.3.2'
    container_name: kibana
    environment:
      SERVER_NAME: kibana.local
      ELASTICSEARCH_URL: http://elasticsearch:9200
    ports:
      - '5601:5601'
    networks:
      - esnet

  headPlugin:
    image: 'mobz/elasticsearch-head:5'
    container_name: head
    ports:
      - '9100:9100'
    networks:
      - esnet

volumes:
  esdata1:
    driver: local
  esdata2:
    driver: local
  esdata3:
    driver: local

networks:
  esnet:

Now make sure the paths exist that we referenced in the compose file, in my case /home/ruan/workspace/docker/elasticsearch/data

Deploy

Deploy your elasticsearch cluster with docker compose:

1
$ docker-compose up

This will run in the foreground, and you should see console output.

Testing Elasticsearch

Let’s run a couple of queries, first up, check the cluster health api:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
$ curl http://127.0.0.1:9200/_cluster/health?pretty
{
  "cluster_name" : "docker-cluster",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 3,
  "number_of_data_nodes" : 3,
  "active_primary_shards" : 1,
  "active_shards" : 2,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 100.0
}

Create a index with replication count of 2:

1
$ curl -H "Content-Type: application/json" -XPUT http://127.0.0.1:9200/test -d '{"number_of_replicas": 2}'

Ingest a document to elasticsearch:

1
2
$ curl -H "Content-Type: application/json" -XPUT http://127.0.0.1:9200/test/docs/1 -d '{"name": "ruan"}'
{"_index":"test","_type":"docs","_id":"1","_version":1,"result":"created","_shards":{"total":3,"successful":3,"failed":0},"_seq_no":0,"_primary_term":1}

View the indices:

1
2
3
4
$ curl http://127.0.0.1:9200/_cat/indices?v
health status index                       uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   test                        w4p2Q3fTR4uMSYBfpNVPqw   5   2          1            0      3.3kb          1.1kb
green  open   .monitoring-es-6-2018.04.29 W69lql-rSbORVfHZrj4vug   1   1       1601           38        4mb            2mb

Kibana

Kibana is also included in the stack and is accessible via http://localhost:5601/ and you it should look more or less like:

Elasticsearch Head UI

I always prefer working directly with the RESTFul API, but if you would like to use a UI to interact with Elasticsearch, you can access it via http://localhost:9100/ and should look like this:

Deleting the Cluster:

As its running in the foreground, you can just hit ctrl + c and as we persisted data in our compose, when you spin up the cluster again, the data will still be there.

Resources:

Using the Bulk API With Elasticsearch

This tutorial will guide you how to use the Bulk API with Elasticsearch, this is great for when having a dataset that contains a lot of documents, where you want to insert them into elasticsearch in bulk uploads.

The Dataset

We will be using a dataset from elastic that contains 1000 documents that holds account data.

Getting the Dataset:

1
$ wget -O accounts.json https://github.com/elastic/elasticsearch/blob/master/docs/src/test/resources/accounts.json?raw=true

Preview the data:

1
2
3
4
5
6
7
8
9
10
11
$ head -10  accounts.json
{"index":{"_id":"1"}}
{"account_number":1,"balance":39225,"firstname":"Amber","lastname":"Duke","age":32,"gender":"M","address":"880 Holmes Lane","employer":"Pyrami","email":"amberduke@pyrami.com","city":"Brogan","state":"IL"}
{"index":{"_id":"6"}}
{"account_number":6,"balance":5686,"firstname":"Hattie","lastname":"Bond","age":36,"gender":"M","address":"671 Bristol Street","employer":"Netagy","email":"hattiebond@netagy.com","city":"Dante","state":"TN"}
{"index":{"_id":"13"}}
{"account_number":13,"balance":32838,"firstname":"Nanette","lastname":"Bates","age":28,"gender":"F","address":"789 Madison Street","employer":"Quility","email":"nanettebates@quility.com","city":"Nogal","state":"VA"}
{"index":{"_id":"18"}}
{"account_number":18,"balance":4180,"firstname":"Dale","lastname":"Adams","age":33,"gender":"M","address":"467 Hutchinson Court","employer":"Boink","email":"daleadams@boink.com","city":"Orick","state":"MD"}
{"index":{"_id":"20"}}
{"account_number":20,"balance":16418,"firstname":"Elinor","lastname":"Ratliff","age":36,"gender":"M","address":"282 Kings Place","employer":"Scentric","email":"elinorratliff@scentric.com","city":"Ribera","state":"WA"}

Using the Bulk API:

We will ingest the data to our bank_accounts index, and to the account type:

1
$ curl -s -H "Content-Type: application/json" -XPOST localhost:9200/accounts/docs/_bulk --data-binary "@accounts.json"

When it’s done, have a look at the indices:

1
2
3
$ curl http://127.0.0.1:9200/_cat/indices/bank_accounts?v
health status index         uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   bank_accounts BK_OJYOFTD67tqsQBUWSuQ   5   1       1000            0    950.3kb        475.1kb

Doing a search and display one document:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
$ curl -XGET 'http://127.0.0.1:9200/bank_accounts/_search?pretty&size=1'
{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 1000,
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "bank_accounts",
        "_type" : "account",
        "_id" : "25",
        "_score" : 1.0,
        "_source" : {
          "account_number" : 25,
          "balance" : 40540,
          "firstname" : "Virginia",
          "lastname" : "Ayala",
          "age" : 39,
          "gender" : "F",
          "address" : "171 Putnam Avenue",
          "employer" : "Filodyne",
          "email" : "virginiaayala@filodyne.com",
          "city" : "Nicholson",
          "state" : "PA"
        }
      }
    ]
  }
}

Demo Recording:

This has also been reccored, which can be viewed here:

Using Bulk with Auto Generated ID’s

As you might know when you do a POST request to the type, the _id field gets auto populated. Timo, one of my friends had the requirement to use the Bulk API to post auto generated Id’s and not the static id’s that is given in the example dataset.

I have answered this on Elastic’s discuss page: https://discuss.elastic.co/t/looking-for-working-example-data-set-to-bulk-index-into-es6/128678/3

I will provide the steps below as well:

convert.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
#!/usr/bin/env python

src_file = 'src_file.json'
dest_file = 'dest_file.json'
metadata = '{"index": {"_index": "bank_accounts", "_type": "account"}}'

with open(src_file) as open_file:
    lines = open_file.readlines()

lines = [line.replace(' ', '') for line in lines]

with open(dest_file, 'w') as f:
    for each_line in lines:
        f.write(metadata + '\n')
        f.writelines(each_line)

The original file:

1
2
3
4
5
$ head -4 file.json
{"index":{"_id":"1"}}
{"account_number":1,"balance":39225,"firstname":"Amber","lastname":"Duke","age":32,"gender":"M","address":"880 Holmes Lane","employer":"Pyrami","email":"amberduke@pyrami.com","city":"Brogan","state":"IL"}
{"index":{"_id":"6"}}
{"account_number":6,"balance":5686,"firstname":"Hattie","lastname":"Bond","age":36,"gender":"M","address":"671 Bristol Street","employer":"Netagy","email":"hattiebond@netagy.com","city":"Dante","state":"TN"}

Removing the initial metadata:

1
2
$ cat file.json | grep account_number >> src_file.json
$ ./convert.py

Previewing the destination file:

1
2
3
4
5
$ head -4 dest_file.json
{"index": {"_index": "bank_accounts", "_type": "account"}}
{"account_number":1,"balance":39225,"firstname":"Amber","lastname":"Duke","age":32,"gender":"M","address":"880HolmesLane","employer":"Pyrami","email":"amberduke@pyrami.com","city":"Brogan","state":"IL"}
{"index": {"_index": "bank_accounts", "_type": "account"}}
{"account_number":6,"balance":5686,"firstname":"Hattie","lastname":"Bond","age":36,"gender":"M","address":"671BristolStreet","employer":"Netagy","email":"hattiebond@netagy.com","city":"Dante","state":"TN"}

Looking at my current indices:

1
2
3
$ curl http://localhost:9200/_cat/indices?v
health status index                       uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   .monitoring-es-6-2018.05.06 3OgdIbDWQWCR8WJlQTXr9Q   1   1     114715            6      104mb           50mb

Ingesting the data via Bulk API:

1
$ curl -s -H 'Content-Type: application/json' -XPOST localhost:9200/_bulk --data-binary @dest_file.json

Looking at my indices to verify that the index exist:

1
2
3
4
$ curl http://localhost:9200/_cat/indices?v
health status index                       uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   bank_accounts               u37MQvzhSPe97BJzp1u49Q   5   1       1000            0    296.4kb           690b
green  open   .monitoring-es-6-2018.05.06 3OgdIbDWQWCR8WJlQTXr9Q   1   1     114750            6    103.9mb         49.9mb

Looking at one document: :smiley:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
$ curl 'http://localhost:9200/bank_accounts/_search?pretty&size=1'
{
  "took" : 641,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 1000,
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "bank_accounts",
        "_type" : "account",
        "_id" : "cohJN2MBCa89A-FEmiJs",
        "_score" : 1.0,
        "_source" : {
          "account_number" : 6,
          "balance" : 5686,
          "firstname" : "Hattie",
          "lastname" : "Bond",
          "age" : 36,
          "gender" : "M",
          "address" : "671BristolStreet",
          "employer" : "Netagy",
          "email" : "hattiebond@netagy.com",
          "city" : "Dante",
          "state" : "TN"
        }
      }
    ]
  }
}

Resources:

Encryption and Decryption With Simple Crypt Using Python

Today I wanted to encrypt sensitive information to not expose passwords, hostnames etc. I wanted to have a way to encrypt my strings with a master password and stumbled upon Simple Crypt.

Simple Crypt

Why simple-crypt? Referenced from their docs:

  • Simple Crypt uses standard, well-known algorithms following the recommendations from this link.
  • The PyCrypto library provides the algorithm implementation, where AES256 cipher is used.
  • It includes a check (an HMAC with SHA256) to warn when ciphertext data are modified.
  • It tries to make things as secure as possible when poor quality passwords are used (PBKDF2 with SHA256, a 256 bit random salt, and 100,000 rounds).
  • Using a library, rather than writing your own code, means that we have less solutions to the same problem.

Installing Simple-Crypt:

From a base alpine image:

1
2
3
4
$ apk update
$ apk add python python-dev py2-pip
$ apk add gcc g++ make libffi-dev openssl-dev
$ pip install simple-crypt

Simple Examples:

Two simple examples to encrypt and decrypt data with simple-crypt. We will use a password sekret and we will encrypt the string: this is a secure message:

1
2
3
4
5
6
7
>>> from simplecrypt import encrypt, decrypt
>>> password = 'sekret'
>>> message = 'this is a secret message'
>>> ciphertext = encrypt(password, message)
>>>
>>> print(ciphertext)
sc#$%^&*(..........

Now that we have our encrypted string, lets decrypt it. First we will use the wrong password, so that you will see how the expected output should look when using a different password, than was used when it was encrypted:

1
2
3
4
5
6
7
8
>>> print(decrypt('badpass', ciphertext))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/site-packages/simplecrypt/__init__.py", line 72, in decrypt
    _assert_hmac(hmac_key, hmac, hmac2)
  File "/usr/lib/python2.7/site-packages/simplecrypt/__init__.py", line 116, in _assert_hmac
    raise DecryptionException('Bad password or corrupt / modified data.')
simplecrypt.DecryptionException: Bad password or corrupt / modified data.

Now using the correct password to decrypt:

1
2
>>> print(decrypt('sekret', ciphertext))
this is a secret message

SimpleCrypt Base64 and Getpass

I wanted to store the encrypted string in a database, but the ciphertext has a combination of random special characters, so I decided to encode the ciphertext with base64. And the password input will be used with the getpass module.

Our encryption app:

encrypt.py
1
2
3
4
5
6
7
8
9
10
11
import sys
from simplecrypt import encrypt, decrypt
from base64 import b64encode, b64decode
from getpass import getpass

password = getpass()
message = sys.argv[1]

cipher = encrypt(password, message)
encoded_cipher = b64encode(cipher)
print(encoded_cipher)

Our decryption app:

1
2
3
4
5
6
7
8
9
10
11
import sys
from simplecrypt import encrypt, decrypt
from base64 import b64encode, b64decode
from getpass import getpass

password = getpass()
encoded_cipher = sys.argv[1]

cipher = b64decode(encoded_cipher)
plaintext = decrypt(password, cipher)
print(plaintext)

Encrypt and Decrypting Data using our Scripts:

Encrypting the string this is a secret message:

1
2
3
$ python encrypt.py "this is a secret message"
Password:
c2MAAnyfWIfOBV43vxo3sVCEYMG4C6hx69hv2Ii1JKlVHJUgBAlADJPOsD5cJO6MMI9faTDm1As/VfesvBzIe5S16mNyg2q7xfnP5iJ0RlK92vMNRbKOvNibg3M=

Now that we have our encoded ciphertext, lets decrypt it with the password that we encrypted it with:

1
2
3
$ python decrypt.py 'c2MAAnyfWIfOBV43vxo3sVCEYMG4C6hx69hv2Ii1JKlVHJUgBAlADJPOsD5cJO6MMI9faTDm1As/VfesvBzIe5S16mNyg2q7xfnP5iJ0RlK92vMNRbKOvNibg3M='
Password:
this is a secret message

This is one way of working with sensitive info that you would like to encrypt/decrypt.

Using Paramiko Module in Python to Execute Remote Bash Commands

Paramiko is a python implementation of the sshv2 protocol.

Paramiko to execute Remote Commands:

We will use paramiko module in python to execute a command on our remote server.

Client side will be referenced as (side-a) and Server side will be referenced as (side-b)

Getting the Dependencies:

Install Paramiko via pip on side-a:

1
$ pip install paramiko --user

Using Paramiko in our Code:

Our Python Code:

1
2
3
4
5
6
7
8
9
10
11
12
import paramiko

ssh = paramiko.SSHClient()
ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
ssh.connect(hostname='192.168.10.10', username='ubuntu', key_filename='/home/ubuntu/.ssh/mykey.pem')

stdin, stdout, stderr = ssh.exec_command('lsb_release -a')

for line in stdout.read().splitlines():
    print(line)

ssh.close()

Execute our Command Remotely:

Now we will attempt to establish the ssh connection from side-a, then run lsb_release -a on our remote server, side-b:

1
2
3
4
5
6
$ python execute.py

Distributor ID:   Ubuntu
Description:  Ubuntu 16.04.4 LTS
Release:  16.04
Codename: xenial

Setup a SSH Tunnel With the Sshtunnel Module in Python

Sometimes we need to restrict access to a port, where a port should listen on localhost, but you want to access that port from a remote source. One secure way of doing that, is to establish a SSH Tunnel to the remote side, and forward to port via the SSH Tunnel.

Today we will setup a Flask Web Service on our Remote Server (Side B) which will be listening on 127.0.0.1:5000 and setup the SSH Tunnel with the sshtunnel module in Python from our client side (Side A). Then we will make a GET request on our client side to the port that we are forwarding via the tunnel to our remote side.

Remote Side:

Our Demo Python Flask Application:

1
2
3
4
5
6
7
8
9
10
from flask import Flask

app = Flask(__name__)

@app.route('/')
def index():
    return 'OK'

if __name__ == '__main__':
    app.run(host='127.0.0.1', port=5000)

Run the server:

1
2
$ python app.py
Listening on 127.0.0.1:5000

Client Side:

From our client side we first need to install sshtunnel via pip:

1
$ pip install sshtunnel requests --user

Our code for our client that will establish the tunnel and do the GET request:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
from sshtunnel import SSHTunnelForwarder
import requests

remote_user = 'ubuntu'
remote_host = '192.168.10.10'
remote_port = 22
local_host = '127.0.0.1'
local_port = 5000

server = SSHTunnelForwarder(
   (remote_host, remote_port),
   ssh_username=remote_user,
   ssh_private_key='/home/ubuntu/.ssh/mykey.pem',
   remote_bind_address=(local_host, local_port),
   local_bind_address=(local_host, local_port),
   )

server.start()

headers = { 'User-Agent': 'Mozilla/5.0 (Windows NT 6.0; WOW64; rv:24.0) Gecko/20100101 Firefox/24.0'}
r = requests.get('http://127.0.0.1:5000', headers=headers).content
print(r)
server.stop()

Running our app:

1
2
$ python ssh_tunnel.py
OK

So we have sucessfully established our ssh tunnel to our remote side, and able to access the network restricted port via the tunnel.

Resources:

Basic RESTFul API Server With Python Flask

A Basic RESTFul API Service with Python Flask. We will be using the Flask, jsonify and request classes to build our API service.

Description of this demonstration:

Our API will be able to do the following:

  • Create, Read, Update, Delete

In this demonstration, we will add some information about people to our API, then go through each method that is mentioned above.

Getting the Dependencies:

Setup the virtualenv and install the dependencies:

1
2
3
$ virtualenv .venv
$ source .venv/bin/activate
$ pip install flask

The API Server Code:

Here’s the complete code, as you can see I have a couple of decorators for each url endpoint, and a id_generator function, that will generate id’s for each document. The id will be used for getting users information, updates and deletes:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
from flask import Flask, jsonify, request
from multiprocessing import Value

counter = Value('i', 0)
app = Flask(__name__)

a = []
help_message = """
API Usage:
 
- GET    /api/list
- POST   /api/add data={"key": "value"}
- GET    /api/get/<id>
- PUT    /api/update/<id> data={"key": "value_to_replace"}
- DELETE /api/delete/<id> 

"""

def id_generator():
    with counter.get_lock():
        counter.value += 1
        return counter.value

@app.route('/api', methods=['GET'])
def help():
    return help_message

@app.route('/api/list', methods=['GET'])
def list():
    return jsonify(a)

@app.route('/api/add', methods=['POST'])
def index():
    payload = request.json
    payload['id'] = id_generator()
    a.append(payload)
    return "Created: {} \n".format(payload)

@app.route('/api/get', methods=['GET'])
def get_none():
    return 'ID Required: /api/get/<id> \n'

@app.route('/api/get/<int:_id>', methods=['GET'])
def get(_id):
    for user in a:
        if _id == user['id']:
            selected_user = user
    return jsonify(selected_user)

@app.route('/api/update', methods=['PUT'])
def update_none():
    return 'ID and Desired K/V in Payload required: /api/update/<id> -d \'{"name": "john"}\' \n'

@app.route('/api/update/<int:_id>', methods=['PUT'])
def update(_id):
    update_req = request.json
    key_to_update = update_req.keys()[0]
    update_val = (item for item in a if item['id'] == _id).next()[key_to_update] = update_req.values()[0]
    update_resp = (item for item in a if item['id'] == _id).next()
    return "Updated: {} \n".format(update_resp)

@app.route('/api/delete/<int:_id>', methods=['DELETE'])
def delete(_id):
    deleted_user = (item for item in a if item['id'] == _id).next()
    a.remove(deleted_user)
    return "Deleted: {} \n".format(deleted_user)

if __name__ == '__main__':
    app.run()

Demo Time:

Retrieving the Help output:

1
2
3
4
5
6
7
8
9
$ curl -XGET -H 'Content-Type: application/json' http://localhost:5000/api

API Usage:

- GET    /api/list
- POST   /api/add data={"key": "value"}
- GET    /api/get/<id>
- PUT    /api/update/<id> data={"key": "value_to_replace"}
- DELETE /api/delete/<id>

Doing a list, to list all the users, its expected for it to be empty as we have not added any info to our API:

1
2
$ curl -XGET -H 'Content-Type: application/json' http://localhost:5000/api/list
[]

Adding our first user:

1
2
$ curl -XPOST -H 'Content-Type: application/json' http://localhost:5000/api/add -d '{"name": "ruan", "country": "south africa", "age": 30}'
Created: {u'country': u'south africa', u'age': 30, u'name': u'ruan', 'id': 1}

Adding our second user:

1
2
$ curl -XPOST -H 'Content-Type: application/json' http://localhost:5000/api/add -d '{"name": "stefan", "country": "south africa", "age": 29}'
Created: {u'country': u'south africa', u'age': 29, u'name': u'stefan', 'id': 2}

Doing a list again, will retrieve all our users:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
$ curl -XGET -H 'Content-Type: application/json' http://localhost:5000/api/list
[
  {
    "age": 30,
    "country": "south africa",
    "id": 1,
    "name": "ruan"
  },
  {
    "age": 29,
    "country": "south africa",
    "id": 2,
    "name": "stefan"
  }
]

Doing a GET on the userid, to only display the users info:

1
2
3
4
5
6
7
$ curl -XGET -H 'Content-Type: application/json' http://localhost:5000/api/get/2
{
  "age": 29,
  "country": "south africa",
  "id": 2,
  "name": "stefan"
}

Now, let’s update some details. Let’s say that Stefan relocated to New Zealand. We will need to provide his id and also the key/value that we want to update:

1
2
$ curl -XPUT -H 'Content-Type: application/json' http://localhost:5000/api/update/2 -d '{"country": "new zealand"}'
Updated: {u'country': u'new zealand', u'age': 29, u'name': u'stefan', 'id': 2}

As you can see the response confirmed that the value was updated, but let’s verify the output, by doing a get on his id:

1
2
3
4
5
6
7
$ curl -XGET -H 'Content-Type: application/json' http://localhost:5000/api/get/2
{
  "age": 29,
  "country": "new zealand",
  "id": 2,
  "name": "stefan"
}

And lastly, lets delete our user, which will only require the userid:

1
2
$ curl -XDELETE -H 'Content-Type: application/json' http://localhost:5000/api/delete/2
Deleted: {u'country': u'new zealand', u'age': 29, u'name': u'stefan', 'id': 2}

To verify this, list all the users:

1
2
3
4
5
6
7
8
9
$ curl -XGET -H 'Content-Type: application/json' http://localhost:5000/api/list
[
  {
    "age": 30,
    "country": "south africa",
    "id": 1,
    "name": "ruan"
  }
]

Using Python Requests:

We can also use python’s requests module to do the same, to give a demonstration I will create a new user:

1
2
$ pip install requests
$ python
1
2
3
4
5
6
7
8
9
10
>>> import requests
>>> import json

>>> base_url = 'http://localhost:5000/api/add'
>>> headers = {"Content-Type": "application/json"}
>>> payload = json.dumps({"name": "shaun", "country": "australia", "age": 24})

>>> r = requests.post(base_url, headers=headers, data=payload)
>>> r.content
Created: {u'country': u'australia', u'age': 24, u'name': u'shaun', 'id': 4}

Thats it. I’ve stumbled upon Flask-Restful which I still want to check out, and as soon as I do, I will do a post on it, maybe baked with a NoSQL db or something like that.

Cheers!

Resources:

Basic Introduction to Use Arguments With Argparse on Python

I used to work a lot with sys.argv for using arguments in my applications, until I stumbled upon the argparse module! (Thanks Donovan!)

What I like about argparse, is that it builds up the help menu for you, and you also have a lot of options, as you can set the argument to be required, set the datatypes, addtional help context etc.

The Basic Demonstration:

Today we will just run through a very basic example on how to use argparse:

  • Return the generated help menu
  • Return the required value
  • Return the additional arguments
  • Compare arguments with a IF statement

The Python Argparse Tutorial Code:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
import argparse

parser = argparse.ArgumentParser(description='argparse demo')
parser.add_argument('-w', '--word', help='a word (required)', required=True)
parser.add_argument('-s', '--sentence', help='a sentence (not required)', required=False)
parser.add_argument('-c', '--comparison', help='a word to compare (not required)', required=False)
args = parser.parse_args()

print("Word: {}".format(args.word))

if args.sentence:
  print("Sentence: :{}".format(args.sentence))

if args.comparison:
  if args.comparison == args.word:
      print("Comparison: the provided word argument and provided comparison argument is the same")
  else:
      print("Comparison: the provided word argument and provided comparison argument is NOT the same")

Seeing it in action:

To return a usage/help info, run it with the -h or --help argument:

1
2
3
4
5
6
7
8
9
10
11
12
$ python foo.py -h
usage: foo.py [-h] -w WORD [-s SENTENCE] [-c COMPARISON]

argparse demo

optional arguments:
  -h, --help            show this help message and exit
  -w WORD, --word WORD  a word (required)
  -s SENTENCE, --sentence SENTENCE
                        a sentence (not required)
  -c COMPARISON, --comparison COMPARISON
                        a word to compare (not required)

For this to work, the application is expecting the word argument to run, as we declared it as required=True:

1
2
$ python foo.py -w hello
Word: hello

Now to use the arguments that is not required, which makes it optional:

1
2
3
$ python foo.py -w hello -s "hello, world"
Word: hello
Sentence: :hello, world

We can also implement some if statements into our application to compare if arguments are the same (as a basic example):

1
2
3
4
$ python foo.py -w hello -s "hello, world" -c goodbye
Word: hello
Sentence: :hello, world
Comparison: the provided word argument and provided comparison argument is NOT the same

We can see that the word and comparison arguments are not the same. When they match up:

1
2
3
4
$ python foo.py -w hello -s "hello, world" -c hello
Word: hello
Sentence: :hello, world
Comparison: the provided word argument and provided comparison argument is the same

This was a very basic demonstration on the argparse module.

Resource:

How to Monitor a Amazon Elasticsearch Service Cluster Update Process

When you make a configuration change on Amazon’s Elasticsearch, it does a blue/green deployment. So new nodes will be allocated to the cluster (which you will notice from CloudWatch when looking at the nodes metrics). Once these nodes are deployed, data gets copied accross to the new nodes, and traffic gets directed to the new nodes, and once its done, the old nodes gets terminated.

Note: While there will be more nodes in the cluster, you will not get billed for the extra nodes.

While this process is going, you can monitor your cluster to see the progress:

The Shards API:

Using the /_cat/shards API, you will find that the shards are in a RELOCATING state (keeping in mind, this is when the change is still busy)

1
2
3
4
5
6
7
8
9
10
11
12
curl -s -XGET 'https://search-example-elasticsearch-cluster-6-abc123defghijkl5airxticzvjaqy.eu-west-1.es.amazonaws.com/_cat/shards?v' | grep -v 'STARTED'
index                                   shard prirep state         docs    store ip            node
example-app1-2018.02.23                 4     r      RELOCATING  323498 1018.3mb x.x.x.x x2mKoe_ -> x.x.x.x GyNiRJyeSTifN_9JZisGuQ GyNiRJy
example-app1-2018.02.28                 2     p      RELOCATING  477609    1.5gb x.x.x.x x2mKoe_ -> x.x.x.x sOihejw1SrKtag_LO1RGIA sOihejw
example-app1-2018.03.01                 3     r      RELOCATING  463143    1.5gb x.x.x.x  ZZfv-Ha -> x.x.x.x jOchdCZWQq-TAPZNTadNoA jOchdCZ
fortinet-syslog-2018.02                 0     p      RELOCATING 1218556  462.2mb x.x.x.x  moQA57Y -> x.x.x.x sOihejw1SrKtag_LO1RGIA sOihejw
example-app1-2018.03.23                 3     r      RELOCATING  821254    2.4gb x.x.x.x  moQA57Y -> x.x.x.x GyNiRJyeSTifN_9JZisGuQ GyNiRJy
example-app1-2018.04.02                 2     p      RELOCATING 1085279    3.4gb x.x.x.x x2mKoe_ -> x.x.x.x jOchdCZWQq-TAPZNTadNoA jOchdCZ
example-app1-2018.02.08                 3     p      RELOCATING  136321    125mb x.x.x.x ZUZSFWu -> x.x.x.x tyU_V_KLS5mZXEwnF-YEAQ tyU_V_K
fortinet-syslog-2018.04                 4     r      RELOCATING 7513842    2.8gb x.x.x.x  ZZfv-Ha -> x.x.x.x il1WsroNSgGmXJugds_aMQ il1Wsro
example-app1-2018.04.09                 1     r      RELOCATING 1074581    3.5gb x.x.x.x  ZRzKGe5 -> x.x.x.x il1WsroNSgGmXJugds_aMQ il1Wsro
example-app1-2018.04.09                 0     p      RELOCATING 1074565    3.5gb x.x.x.x  moQA57Y -> x.x.x.x tyU_V_KLS5mZXEwnF-YEAQ tyU_V_K

The Recovery API:

We can then use the /_cat/recovery API, which will show the progress of the shards transferring to the other nodes, you will find the following:

  • index, shard, time, type, stage, source_host, target_host, files, files_percent, bytes, bytes_percent

As Amazon masks their node ip addresses, we will find that the ips are not available. To make it more human readable, we will only pass the columns that we are interested in and not to show the shards that has been set to done:

1
2
3
4
5
6
7
8
9
10
11
12
$ curl -s -XGET 'https://search-example-elasticsearch-cluster-6-abc123defghijkl5airxticzvjaqy.eu-west-1.es.amazonaws.com/_cat/recovery?v&h=i,s,t,ty,st,shost,thost,f,fp,b,bp' | grep -v 'done'
i                                       s t     ty          st       shost         thost         f   fp     b          bp
example-app1-2018.04.11                 1 2m    peer        index    x.x.x.x x.x.x.x  139 97.1%  3435483673 65.9%
web-syslog-2018.04                 4 7.6m  peer        finalize x.x.x.x x.x.x.x  109 100.0% 2854310892 100.0%
example-app1-2018.04.16                 3 2.9m  peer        translog x.x.x.x x.x.x.x  130 100.0% 446180036  100.0%
example-app1-2018.03.30                 3 2.1m  peer        index    x.x.x.x  x.x.x.x  127 97.6%  3862498583 62.5%
example-app1-2018.04.01                 0 4.4m  peer        index    x.x.x.x  x.x.x.x  140 99.3%  3410543270 87.9%
example-app1-2018.04.06                 0 5.1m  peer        index    x.x.x.x x.x.x.x  128 97.7%  4291421948 66.3%
example-app1-2018.04.07                 0 52.2s peer        index    x.x.x.x x.x.x.x 149 91.9%  3969581277 27.4%
network-capture-2018.04.01               2 11.4s peer        index    x.x.x.x  x.x.x.x 107 95.3%  359987163  55.0%
example-app1-2018.03.17                 1 1.7m  peer        index    x.x.x.x  x.x.x.x 117 98.3%  2104196548 74.5%
example-app1-2018.02.25                 3 58.4s peer        index    x.x.x.x  x.x.x.x 102 98.0%  945437614  74.7%

We can also see the human readable output, which is displayed in json format, with much more detail:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
$ curl -s -XGET 'https://search-example-elasticsearch-cluster-6-abc123defghijkl5airxticzvjaqy.eu-west-1.es.amazonaws.com/example-app1-2018.04.03/_recovery?human' | python -m json.tool
{
    "example-app1-2018.04.03": {
        "shards": [
            {
                "id": 0,
                "index": {
                    "files": {
                        "percent": "100.0%",
                        "recovered": 103,
                        "reused": 0,
                        "total": 103
                    },
                    "size": {
                        "percent": "100.0%",
                        "recovered": "3.6gb",
                        "recovered_in_bytes": 3926167091,
                        "reused": "0b",
                        "reused_in_bytes": 0,
                        "total": "3.6gb",
                        "total_in_bytes": 3926167091
                    },
                    "source_throttle_time": "2m",
                    "source_throttle_time_in_millis": 121713,
                    "target_throttle_time": "2.1m",
                    "target_throttle_time_in_millis": 126170,
                    "total_time": "7.2m",
                    "total_time_in_millis": 434142
                },
                "primary": true,
                "source": {
                    "host": "x.x.x.x",
                    "id": "ZRzKGe5WSg2SzilZGb3RbA",
                    "ip": "x.x.x.x",
                    "name": "ZRzKGe5",
                    "transport_address": "x.x.x.x:9300"
                },
                "stage": "DONE",
                "start_time": "2018-04-10T19:26:48.668Z",
                "start_time_in_millis": 1523388408668,
                "stop_time": "2018-04-10T19:34:04.980Z",
                "stop_time_in_millis": 1523388844980,
                "target": {
                    "host": "x.x.x.x",
                    "id": "x2mKoe_GTpe3b1CnXOKisA",
                    "ip": "x.x.x.x",
                    "name": "x2mKoe_",
                    "transport_address": "x.x.x.x:9300"
                },
                "total_time": "7.2m",
                "total_time_in_millis": 436311,
                "translog": {
                    "percent": "100.0%",
                    "recovered": 0,
                    "total": 0,
                    "total_on_start": 0,
                    "total_time": "1.1s",
                    "total_time_in_millis": 1154
                },
                "type": "PEER",
                "verify_index": {
                    "check_index_time": "0s",
                    "check_index_time_in_millis": 0,
                    "total_time": "0s",
                    "total_time_in_millis": 0
                }
            },

The Cluster Health API:

Amazon restricts most of the /_cluster API actions, but we can however see the health endpoint, where we can see the number of nodes, active_shards, relocating_shards, number_of_pending_tasks etc:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
$ curl -XGET https://search-example-elasticsearch-cluster-6-abc123defghijkl5airxticzvjaqy.eu-west-1.es.amazonaws.com/_cluster/health?pretty
{
  "cluster_name" : "0123456789012:example-elasticsearch-cluster-6",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 16,
  "number_of_data_nodes" : 10,
  "active_primary_shards" : 803,
  "active_shards" : 1606,
  "relocating_shards" : 10,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 100.0
}

The Pending Tasks API:

We also have some insights into the /_cat/pending_tasks API:

1
2
3
$ curl -s -XGET 'https://search-example-elasticsearch-cluster-6-abc123defghijkl5airxticzvjaqy.eu-west-1.es.amazonaws.com/_cat/pending_tasks?v'
insertOrder timeInQueue priority source
1757        53ms URGENT   shard-started shard id [[network-metrics-2018.04.13][0]], allocation id [Qh91o_OGRX-lFnY8KxYgQw], primary term [0], message [after peer recovery]

Resources:

Experimenting With Python and TinyMongo a MongoDB Wrapper for TinyDB

TinyMongo is a wrapper for MongoDB on top of TinyDB.

This is awesome for testing, where you need a local document orientated database which is backed by a flat file. It feels just like using MongoDB, except that its local, lightweight and using TinyDB in the backend.

Installing Dependencies:

1
$ pip install tinymongo

Usage Examples:

Initialize tinymongo and create the database and collection:

1
2
3
4
>>> from tinymongo import TinyMongoClient
>>> connection = TinyMongoClient('foo')
>>> db_init = connection.mydb
>>> db = db_init.users

Insert a Document, catch the document id and search for that document:

1
2
3
4
>>> record_id = db .insert_one({'username': 'ruanb', 'name': 'ruan', 'age': 31, 'gender': 'male', 'location': 'south africa'}).inserted_id
>>> user_info = db.find_one({"_id": record_id})
>>> print(user_info)
{u'username': u'ruanb', u'name': u'ruan', u'gender': u'male', u'age': 31, u'_id': u'8d2ce01140ec11e888110242ac110004', u'location': u'south africa'}

Update a document: Update the age attribute from 31 to 32

1
2
3
>>> db.users.update_one({'_id': '8d2ce01140ec11e888110242ac110004'}, {'$set': {'age': 32 }})
>>> print(user_info)
{u'username': u'ruanb', u'name': u'ruan', u'gender': u'male', u'age': 32, u'_id': u'8d2ce01140ec11e888110242ac110004', u'location': u'south africa'}

Insert some more data:

1
2
>>> record_id = db .insert_one({'username': 'stefanb', 'name': 'stefan', 'age': 30, 'gender': 'male', 'location': 'south africa'}).inserted_id
>>> record_id = db .insert_one({'username': 'alexa', 'name': 'alex', 'age': 34, 'gender': 'male', 'location': 'south africa'}).inserted_id

Find all the users, sorted by descending age, oldest to youngest:

1
2
3
4
5
6
7
>>> response = db.users.find(sort=[('age', -1)])
>>> for doc in response:
...     print(doc)
...
{u'username': u'alexa', u'name': u'alex', u'gender': u'male', u'age': 34, u'_id': u'66b1cc3d40ee11e892980242ac110004', u'location': u'south africa'}
{u'username': u'ruanb', u'name': u'ruan', u'gender': u'male', u'age': 32, u'_id': u'8d2ce01140ec11e888110242ac110004', u'location': u'south africa'}
{u'username': u'stefanb', u'name': u'stefan', u'gender': u'male', u'age': 30, u'_id': u'fbe9da8540ed11e88c5e0242ac110004', u'location': u'south africa'}

Find the number of documents in the collection:

1
2
>>> db.users.find().count()
3

Resources: