Ruan Bekker's Blog

From a Curious mind to Posts on Github

Build Small Golang Docker Containers

In this tutorial I will show you how to build really small docker containers for golang applications. And I mean the difference between 310MB and 2MB.

But Alpine..

So we're thinking: let's go with alpine, right? Yeah sure, let's build a small Go app running on alpine.

Our application code:

app.go
package main

import (
  "fmt"
  "math/rand"
  "time"
)

func main() {
  lekkewords := []string{
    "dog", "cat", "fish", "giraffe",
    "moo", "spider", "lion", "apple",
    "tree", "moon", "snake", "mountain lion",
    "trooper", "burger", "nasa", "yes",
  }

  rand.Seed(time.Now().UnixNano())
  var zelength int = len(lekkewords)
  var indexnum int = rand.Intn(zelength) // rand.Intn(n) returns an int in [0, n), so every word can be selected
  word := lekkewords[indexnum]

  fmt.Println("Number of words:", zelength)
  fmt.Println("Selected index number:", indexnum)
  fmt.Println("Selected word is:", word)
}

Our Dockerfile:

Dockerfile
FROM golang:alpine

WORKDIR $GOPATH/src/mylekkepackage/mylekkeapp/
COPY app.go .
RUN go build -o /go/app

CMD ["/go/app"]

Let's package our app into an image:

❯ docker build -t mygolangapp:using-alpine .

Inspect the size of our image; as you can see it is 310MB:

❯ docker images "mygolangapp:*"
REPOSITORY          TAG                 IMAGE ID            CREATED              SIZE
mygolangapp         using-alpine        eea1d7bde218        About a minute ago   310MB

Just make sure it actually works:

❯ docker run mygolangapp:using-alpine
Number of words: 16
Selected index number: 11
Selected word is: mountain lion

But for something that just returns a randomly selected word, 310MB is a bit crazy.

Multi Stage Builds

As Go binaries are self-contained, we can make use of Docker's multi-stage builds: we build our application on alpine and run the binary from a scratch image:

Our multi stage Dockerfile:

Dockerfile.multi
FROM golang:alpine AS builder

WORKDIR $GOPATH/src/mylekkepackage/mylekkeapp/
COPY app.go .
RUN go build -o /go/app

FROM scratch

COPY --from=builder /go/app /go/app

CMD ["/go/app"]

Build it:

❯ docker build -t mygolangapp:using-multistage -f Dockerfile.multi .

Notice that the image is only 2.01MB, say w000t!

❯ docker images "mygolangapp:*"
REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE
mygolangapp         using-multistage    31474c61ba5b        15 seconds ago      2.01MB
mygolangapp         using-alpine        eea1d7bde218        2 minutes ago       310MB

Run the app:

❯ docker run mygolangapp:using-multistage
Number of words: 16
Selected index number: 5
Selected word is: spider
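
One more size-related note: if your application ever pulls in packages that depend on cgo, the resulting binary may not run on scratch. A minimal sketch (not part of the build above) is to disable cgo and strip debug symbols, which keeps the binary static and usually shaves off a bit more size:

❯ CGO_ENABLED=0 GOOS=linux go build -ldflags="-s -w" -o app app.go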

Resources

Source code for this demonstration can be found at github.com/ruanbekker/golang-build-small-images

Secure Your Elasticsearch Cluster With Basic Auth Using Nginx and SSL From Letsencrypt

In this tutorial we will set up a reverse proxy using nginx to proxy and load balance traffic through to our elasticsearch nodes. We will also protect our elasticsearch cluster with basic auth and use letsencrypt to retrieve free ssl certificates.

We want certain requests, such as getting the status of the cluster, to bypass authentication, and we want to enforce authentication on other requests, such as indexing and deleting data.

Install Nginx:

Install nginx and the dependency package to create basic auth:

$ apt install nginx apache2-utils -y

Configure Nginx for Reverse Proxy

We want to access our nginx proxy on port 80: 0.0.0.0:80 and the requests should be proxied through to elasticsearch private addresses: 10.0.0.10:9200 and 10.0.0.11:9200. Traffic will be load balanced between our 2 nodes.

Edit the main nginx configuration:

$ vim /etc/nginx/nginx.conf

and populate the information as shown below:

user www-data;
worker_processes auto;
pid /run/nginx.pid;

events {
  worker_connections 1024;
  # multi_accept on;
}

http {

  # basic Settings
  sendfile on;
  tcp_nopush on;
  tcp_nodelay on;
  keepalive_timeout 65;
  types_hash_max_size 2048;
  include /etc/nginx/mime.types;
  default_type application/octet-stream;

  # ssl settings
  ssl_protocols TLSv1 TLSv1.1 TLSv1.2;
  ssl_prefer_server_ciphers on;

  # logging settings
  access_log /var/log/nginx/access.log;
  error_log /var/log/nginx/error.log;

  # gzip settings
  gzip on;
  gzip_disable "msie6";

  # virtual host configs
  include /etc/nginx/conf.d/*.conf;
}

Next, edit the virtual host config:

$ vim /etc/nginx/conf.d/elasticsearch.conf

And populate the following config:

# https://gist.github.com/sahilsk/b16cb51387847e6c3329

upstream elasticsearch {
    # define your es nodes
    server 10.0.0.10:9200;
    server 10.0.0.11:9200;
    # persistent http connections
    # https://www.elastic.co/blog/playing-http-tricks-nginx
    keepalive 15;
}

server {
  listen 80;
  server_name elasticsearch.domain.com;

  auth_basic "server auth";
  auth_basic_user_file /etc/nginx/passwords;

  location / {

    # deny node shutdown api
    if ($request_filename ~ "_shutdown") {
      return 403;
      break;
    }

    proxy_pass http://elasticsearch;
    proxy_http_version 1.1;
    proxy_set_header Connection "Keep-Alive";
    proxy_set_header Proxy-Connection "Keep-Alive";
    proxy_redirect off;
  }

  location = / {
    proxy_pass http://elasticsearch;
    proxy_http_version 1.1;
    proxy_set_header Connection "Keep-Alive";
    proxy_set_header Proxy-Connection "Keep-Alive";
    proxy_redirect off;
    auth_basic "off";
  }

  location ~* ^(/_cluster/health|/_cat/health) {
    proxy_pass http://elasticsearch;
    proxy_http_version 1.1;
    proxy_set_header Connection "Keep-Alive";
    proxy_set_header Proxy-Connection "Keep-Alive";
    proxy_redirect off;
    auth_basic "off";
  }
}

Set your username and password to protect your endpoint:

$ htpasswd -c /etc/nginx/passwords admin
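
Keep in mind that the -c flag creates (or overwrites) the password file, so only use it for the first user. To add more users later, a sketch would be to omit -c (the username below is just an example):

$ htpasswd /etc/nginx/passwords anotheruser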

Enable nginx on boot and restart the process:

$ systemctl enable nginx
$ systemctl restart nginx

Test it

Now make requests to elasticsearch via your nginx reverse proxy:

$ curl -H 'Content-Type: application/json' -u 'admin:admin' http://myproxy.domain.com/_cat/indices?v
health status index       uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   first-index 1o6yM7tCSqagqoeihKM7_g   5   1          3            0     40.6kb         20.3kb
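
As a quick sanity check you can also repeat the request without credentials: the protected paths should return a 401, while the health endpoints that bypass auth should still return data:

$ curl -i http://myproxy.domain.com/_cat/indices?v         # expect HTTP 401 without credentials
$ curl -i http://myproxy.domain.com/_cluster/health?pretty # bypasses auth, expect HTTP 200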

Letsencrypt SSL Certificates

Add free SSL Certificates to your reverse proxy. Install certbot:

$ apt-get update
$ apt-get install software-properties-common -y
$ add-apt-repository universe
$ add-apt-repository ppa:certbot/certbot
$ apt-get update
$ apt-get install certbot python-certbot-nginx -y

Request a Certificate for your domain:

$ certbot --manual certonly -d myproxy.domain.com -m my@email.com --preferred-challenges dns --agree-tos

Obtaining a new certificate
Performing the following challenges:
dns-01 challenge for myproxy.domain.com

You will be prompted to make a dns change, since we requested the dns challenge. While this screen is waiting, we can go to our dns provider and create the TXT record as shown below:

Please deploy a DNS TXT record under the name
_acme-challenge.myproxy.domain.com with the following value:

xLP4y_YJvdAK7_aZMJ50gkudTDeIC3rX0x83aNJctGw

Before continuing, verify the record is deployed.
Press Enter to Continue
Waiting for verification...
Cleaning up challenges

IMPORTANT NOTES:
 - Congratulations! Your certificate and chain have been saved at:
   /etc/letsencrypt/live/myproxy.domain.com/fullchain.pem
   Your key file has been saved at:
   /etc/letsencrypt/live/myproxy.domain.com/privkey.pem
   Your cert will expire on 2019-07-01. To obtain a new or tweaked
   version of this certificate in the future, simply run certbot
   again. To non-interactively renew *all* of your certificates, run
   "certbot renew"
 - If you like Certbot, please consider supporting our work by:

   Donating to ISRG / Let's Encrypt:   https://letsencrypt.org/donate
   Donating to EFF:                    https://eff.org/donate-le

Update Nginx Config

Now that we have our ssl certificates, we need to update our nginx config to enable ssl, redirect http to https, and point the ssl_certificate and ssl_certificate_key directives to the certificates that we retrieved from letsencrypt.

Open up the virtual host nginx configuration:

$ vim /etc/nginx/conf.d/elasticsearch.conf

Update the config like the one below:

upstream elasticsearch {
    server 10.0.0.10:9200;
    server 10.0.0.11:9200;
    keepalive 15;
}

server {
  listen 80;
  server_name myproxy.domain.com;
  return 301 https://$host$request_uri;
}

server {
  listen 443 ssl;
  server_name myproxy.domain.com;

  ssl_certificate /etc/letsencrypt/live/myproxy.domain.com/fullchain.pem;
  ssl_certificate_key /etc/letsencrypt/live/myproxy.domain.com/privkey.pem;

  auth_basic "server auth";
  auth_basic_user_file /etc/nginx/passwords;

  location ^~ /.well-known/acme-challenge/ {
    auth_basic off;
  }

  location / {

    # deny node shutdown api
    if ($request_filename ~ "_shutdown") {
      return 403;
      break;
    }

    proxy_pass http://elasticsearch;
    proxy_http_version 1.1;
    proxy_set_header Connection "Keep-Alive";
    proxy_set_header Proxy-Connection "Keep-Alive";
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto $scheme;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header Host $http_host;
    proxy_redirect off;
  }

  location = / {
    proxy_pass http://elasticsearch;
    proxy_http_version 1.1;
    proxy_set_header Connection "Keep-Alive";
    proxy_set_header Proxy-Connection "Keep-Alive";
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto $scheme;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header Host $http_host;
    proxy_redirect off;
    auth_basic "off";
  }

  location ~* ^(/_cluster/health|/_cat/health) {
    proxy_pass http://elasticsearch;
    proxy_http_version 1.1;
    proxy_set_header Connection "Keep-Alive";
    proxy_set_header Proxy-Connection "Keep-Alive";
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto $scheme;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header Host $http_host;
    proxy_redirect off;
    auth_basic "off";
  }
}

Restart the nginx process:

$ systemctl restart nginx

Test the Nginx Proxy with SSL

Test the proxy with HTTP so that we can see that our nginx config redirects us to HTTPS:

$ curl -iL -u 'admin:admin' http://myproxy.domain.com/_cat/nodes?v
HTTP/1.1 301 Moved Permanently
Server: nginx/1.14.0 (Ubuntu)
Date: Tue, 02 Apr 2019 21:40:09 GMT
Content-Type: text/html
Content-Length: 194
Connection: keep-alive
Location: https://myproxy.domain.com/_cat/nodes?v

HTTP/1.1 200 OK
Server: nginx/1.14.0 (Ubuntu)
Date: Tue, 02 Apr 2019 21:40:10 GMT
Content-Type: text/plain; charset=UTF-8
Content-Length: 276
Connection: keep-alive

ip            heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
10.0.0.10               40          97   3    0.15    0.10     0.08 mdi       -      Lq9P7eP
10.0.0.11               44          96   3    0.21    0.10     0.09 mdi       *      F5edOwK

Test the proxy with HTTPS:

$ curl -i -u 'admin:admin' https://myproxy.domain.com/_cat/nodes?v
HTTP/1.1 200 OK
Server: nginx/1.14.0 (Ubuntu)
Date: Tue, 02 Apr 2019 21:40:22 GMT
Content-Type: text/plain; charset=UTF-8
Content-Length: 276
Connection: keep-alive

ip            heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
10.0.0.10               44          96   4    0.18    0.10     0.09 mdi       *      F5edOwK
10.0.0.11               39          97   5    0.13    0.09     0.08 mdi       -      Lq9P7eP

Setup a cronjob to auto renew the certificates:

$ crontab -e

Populate the following line:

6 1,13 * * * /usr/bin/certbot renew --post-hook "systemctl restart nginx" --quiet
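
You can test beforehand whether the renewal will succeed without touching your live certificates (note that certificates obtained with --manual and a dns challenge may need an authentication hook before they renew non-interactively):

$ certbot renew --dry-run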


Setup Kibana Dashboards for Nginx Log Data to Understand the Behavior


In this tutorial we will setup a Basic Kibana Dashboard for a Web Server that is running a Blog on Nginx.

What do we want to achieve?

We will set up common visualizations to give us an idea of how our blog/website is doing.

In some situations we need to create visualizations to understand the behavior of our log data in order to answer these types of questions:

1. Geographical map to see where people are connecting from
2. Piechart that represents the percentage of cities accessing my blog
3. Top 10 Most Accessed Pages
4. Top 5 HTTP Status Codes
5. Top 10 Pages that returned 404 Responses
6. The Top 10 User Agents
7. Timeseries: Status Codes Over Time
8. Timeseries: Successful Website Hits over time
9. Counter with Website Hits
10. Average Bytes Returned
11. Tag Cloud with the City Names that Accessed my Blog

Pre-Requirements

I am consuming my nginx access logs with filebeat and shipping them to elasticsearch. You can check out this blogpost to set that up.

The GeoIP Processor plugin is installed on elasticsearch to enrich our data with geographical information. You can check out this blogpost to setup geoip.

You can setup Kibana and Elasticsearch on Docker or setup a 5 Node Elasticsearch Cluster

Setup Kibana Visualizations

Head over to Kibana and make sure that you have added the filebeat-* index pattern. If not, head over to Management -> Index Patterns -> Create Index Pattern -> Enter filebeat-* as your Index Pattern, select Next, select @timestamp as your timestamp field and select Create.
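
If you prefer the command line, recent Kibana versions also expose a saved objects API that can create the index pattern for you. The endpoint and payload below are a rough sketch and may differ between Kibana versions:

$ curl -X POST "http://localhost:5601/api/saved_objects/index-pattern/filebeat-*" \
    -H 'kbn-xsrf: true' -H 'Content-Type: application/json' \
    -d '{"attributes": {"title": "filebeat-*", "timeFieldName": "@timestamp"}}'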

Now from the visualization section we will add 11 Visualizations. Every time that you create a visualization, make sure that you select filebeat-* as your index pattern (that's if you are using filebeat).

When configuring your visualization it will look like the configuration box from below:


Geomap: Map to see where people are connecting from


Select New Visualization: Coordinate Map

  -> Metrics, Value: Count. 
     Buckets, Geo Coordinates, Aggregation: Geohash, 
     Field: nginx.access.geoip.location. 

Save the visualization, in my case Nginx:GeoMap:Filebeat

Piechart: Cities

This can give us a quick overview on the percentage of people interested in our website grouped per city.


Select New Visualization, Pie

 -> Metrics: Slice Size, Aggregation: Count
 -> Buckets: Split Slices, 
    Aggregation: Terms, Field: nginx.access.geoip.city_name, 
    Order by: metric: count, 
    Order: Descending, Size: 20

Save Visualization.

Top 10 Accessed Pages

Great for seeing which pages are popular, and Kibana makes it easy to see how a page is doing over a specific time range.


New Visualization: Vertical Bar

  -> Metrics: Y-Axis, Aggregation: Count
  -> Buckets: X-Axis, Aggregation: Terms, 
     Field: nginx.access.url, 
     Order by: Metric count, 
     Order: Descending, Size 10

I would like to remove /rss and / from my results, so in the search box:

NOT (nginx.access.url:"/" OR nginx.access.url:"/rss/" OR nginx.access.url:"/subscribe/" OR nginx.access.url:*.txt)

Save Visualization.

Top 5 HTTP Status Codes

A grouping of status codes (you should mostly see 200s), but it's quick to identify when 404s spike, etc.


Select new visualization: Vertical Bar

  -> Metrics: Y-Axis, Aggregation: Count
  -> Buckets: X-Axis, Aggregation: Terms, 
     Field: nginx.access.response_code, 
     Order by: Metric count, 
     Order: Descending, Size 5

Save Visualization

Top 404 Pages

When people request pages that do not exist, it could well be bots trying to attack your site or trying to gain access. This is a great view to see which URLs they are trying, and from there you can handle it.


  -> Metrics: Y-Axis, Aggregation: Count
  -> Buckets: X-Axis, Aggregation: Terms, 
     Terms, Field: nginx.access.url, 
     Order by: Metric count, 
     Order: Descending, Size 20

Top 10 User Agents

Some insights to see the top 10 browsers.


New Visualization: Data Table

  -> Metrics: Y-Axis, Aggregation: Count
  -> Buckets: Split Rows, 
     Aggregation: Terms, 
     Field: nginx.access.user_agent.name, 
     Order by: Metric count, 
     Order: Descending, Size 10

Save Visualization

Timeseries: Status Codes over Time

With timeseries data it's great to see when there was a spike in status codes; once you have identified the time, you can further investigate why it happened.

New Visualization: Timelion

.es(index=filebeat*, timefield='@timestamp', q=nginx.access.response_code:200).label('OK'), .es(index=filebeat*, timefield='@timestamp', q=nginx.access.response_code:404).label('Page Not Found')

Timeseries: Successful Website Hits over Time

This is a good view to see how your website is serving traffic over time.


New Visualization: Timelion

.es(index=filebeat*, timefield='@timestamp', q=nginx.access.response_code:200).label('200')

Count Metric: Website Hits

A counter to see the number of website hits over time.


New Visualization: Metric

  -> Search Query: fields.blog_name:sysadmins AND nginx.access.response_code:200
  -> Metrics: Y-Axis, Aggregation: Count

Average Bytes Transferred

Line chart with the amount of bandwidth being transferred.


New Visualization: Line

-> Metrics: Y-Axis, Aggregation: Average, Field: nginx.access.body_sent.bytes
-> Buckets: X-Axis, Aggregation: Date Histogram, Field: @timestamp

Tag Cloud with Most Popular Cities

I've used cities here, but it's a nice looking visualization to group the most accessed values of a field. With server logs you can use this for the usernames of failed ssh attempts, for example.


  -> Metrics: Tag size, Aggregation: Count
  -> Buckets: Tags, 
     Aggregation: Terms, 
     Field: nginx.access.geoip.city_name, 
     Order by: Metric count, 
     Order: Descending, Size 10

Create the Dashboard

Now that we have all our visualizations, let's build the dashboard that hosts them.

Select Dashboard -> Create New Dashboard -> Add -> Select your visualizations -> Reorder and Save

The visualizations in my dashboard look like this:


This is a basic dashboard, but it's just enough so that you can get your hands dirty and build some awesome visualizations.

Setup a 5 Node Highly Available Elasticsearch Cluster


This is post 1 of my big collection of elasticsearch-tutorials, which includes setup, indexing, management, searching, etc. More details at the bottom.

In this tutorial we will setup a 5 node highly available elasticsearch cluster that will consist of 3 Elasticsearch Master Nodes and 2 Elasticsearch Data Nodes.

“Three master nodes is the way to start, but only if you’re building a full cluster, which at minimum is 3 master nodes plus at least 2 data nodes.” - https://discuss.elastic.co/t/should-dedicated-master-nodes-and-data-nodes-be-considered-separately/75093/14

The Overview:

In short, the responsibilities of the node types:

Master Nodes: Master nodes are responsible for cluster related tasks such as creating / deleting indexes, tracking of nodes and allocating shards to nodes.

Data Nodes: Data nodes are responsible for hosting the actual shards that hold the indexed data, and they handle data related operations like CRUD, search and aggregations.

For more concepts of Elasticsearch, have a look at their basic-concepts documentation.

Our Inventory will consist of:

Master Nodes:

Hostname: es-master-1, Private IP: 172.31.0.77
Hostname: es-master-2, Private IP: 172.31.0.45
Hostname: es-master-3, Private IP: 172.31.1.31

Data Nodes:

Hostname: es-data-1, Private IP:172.31.2.30
Hostname: es-data-2, Private IP:172.31.0.83

Reserved Volumes for Data Nodes:

es-data-1: 10GB assigned to /dev/vdb
es-data-2: 10GB assigned to /dev/vdb

Authentication:

Note that I have configured the bind address for elasticsearch to 0.0.0.0 using network.host: 0.0.0.0 for this demonstration, but this means that if your server has a public ip address with no firewall rules and no auth, anyone will be able to interact with your cluster.

This also allows all the nodes to reach each other.

It's advisable to protect your endpoint, either with basic auth using nginx, which can be found in the embedded link, or with firewall rules to protect communication from the outside (depending on your setup).

Setup the Elasticsearch Master Nodes

The setup below shows how to provision an elasticsearch master node. Repeat this on the nodes: es-master-1, es-master-2, es-master-3.

Set your hosts file for name resolution (if you don’t have private dns in place):

$ cat > /etc/hosts << EOF
127.0.0.1 localhost
172.31.0.77 es-master-1
172.31.0.45 es-master-2
172.31.1.31 es-master-3
172.31.2.30 es-data-1
172.31.0.83 es-data-2
EOF

Get the elasticsearch repositories, install the java development kit dependency and install elasticsearch:

$ apt update && apt upgrade -y
$ apt install software-properties-common python-software-properties apt-transport-https -y
$ wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo apt-key add -
$ echo "deb https://artifacts.elastic.co/packages/6.x/apt stable main" | sudo tee -a /etc/apt/sources.list.d/elastic-6.x.list
$ apt update
$ apt install default-jdk -y
$ apt install elasticsearch -y

Before we get to the full example config, I just want to show a snippet of how you could split up logs and data.

Note that you can separate your logs and data paths like this:

# example of log splitting:
...
path:
  logs: /var/log/elasticsearch
  data: /var/data/elasticsearch
...

Also, your data can be divided between paths:

# example of data paths:
...
path:
  data:
    - /mnt/elasticsearch_1
    - /mnt/elasticsearch_2
    - /mnt/elasticsearch_3
...

Bootstrap the elasticsearch config with a cluster name (all the nodes should have the same cluster name), set the nodes as master nodes with node.master: true, disable node.data, and specify that the cluster needs to see a minimum of 2 master-eligible nodes before it operates. This is used to prevent split brain.

To avoid a split brain, this setting should be set to a quorum of master-eligible nodes: (master_eligible_nodes / 2) + 1, which in our case is (3 / 2) + 1 = 2.

The full example config:

$ cat > /etc/elasticsearch/elasticsearch.yml << EOF
cluster.name: es-cluster
node.name: \${HOSTNAME}
node.master: true
node.data: false
path.logs: /var/log/elasticsearch
bootstrap.memory_lock: true
network.host: 0.0.0.0
discovery.zen.minimum_master_nodes: 2
discovery.zen.ping.unicast.hosts: ["es-master-1", "es-master-2", "es-master-3"]
EOF

Important settings for your elasticsearch cluster are described in their docs:

$ cat > /etc/default/elasticsearch << EOF
ES_STARTUP_SLEEP_TIME=5
MAX_OPEN_FILES=65536
MAX_LOCKED_MEMORY=unlimited
EOF

Ensure that pages are not swapped out to disk by requesting the JVM to lock the heap in memory with LimitMEMLOCK=infinity. Set the maximum file descriptor number for this process with LimitNOFILE and increase the number of threads with LimitNPROC:

$ vim /usr/lib/systemd/system/elasticsearch.service

[Service]
LimitMEMLOCK=infinity
LimitNOFILE=65535
LimitNPROC=4096
...

Allow the elasticsearch user to lock unlimited memory via limits.conf (this is also where you would raise the open file descriptor limit to 65536 or higher if needed):

$ cat > /etc/security/limits.conf << EOF
elasticsearch soft memlock unlimited
elasticsearch hard memlock unlimited
EOF

Increase the value of the mmap counts as elasticsearch uses mmapfs directory to store its indices:

$ sysctl -w vm.max_map_count=262144

For a permanent setting, update vm.max_map_count in /etc/sysctl.conf and run

$ sysctl -p /etc/sysctl.conf 
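
In other words, /etc/sysctl.conf should end up containing the vm.max_map_count setting, for example by appending it (assuming it is not already present in the file):

$ echo 'vm.max_map_count=262144' >> /etc/sysctl.conf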

Prepare the directories and set the ownership to elasticsearch:

$ mkdir /usr/share/elasticsearch/data
$ chown -R elasticsearch:elasticsearch /usr/share/elasticsearch/data

Reload the systemd daemon, enable and start elasticsearch

$ systemctl daemon-reload
$ systemctl enable elasticsearch
$ systemctl restart elasticsearch

Once all 3 elasticsearch masters have been started, verify that they are listening: netstat -tulpn | grep 9200, then look at the cluster health:

$ curl http://127.0.0.1:9200/_cluster/health?pretty
{
  "cluster_name" : "es-cluster",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 3,
  "number_of_data_nodes" : 0,
  "active_primary_shards" : 0,
  "active_shards" : 0,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 100.0
}

Have a look at the nodes, you will see that the node.role for now shows mi:

$ curl http://127.0.0.1:9200/_cat/nodes?v
ip          heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
10.163.68.8           11          80  18    0.28    0.14     0.09 mi        -      es-master-2
10.163.68.5           14          80  14    0.27    0.18     0.11 mi        *      es-master-1
10.163.68.4           15          79   6    0.62    0.47     0.18 mi        -      es-master-3

Setup the Elasticsearch Data Nodes

Now that we have our 3 elasticsearch master nodes running, it's time to provision the 2 elasticsearch data nodes. This setup needs to be repeated on both es-data-1 and es-data-2.

Configure the hosts file for name resolution:

$ cat > /etc/hosts << EOF
127.0.0.1 localhost
172.31.0.77 es-master-1
172.31.0.45 es-master-2
172.31.1.31 es-master-3
172.31.2.30 es-data-1
172.31.0.83 es-data-2
EOF

Get the elasticsearch repositories, install the java development kit dependency and install elasticsearch:

$ apt update && apt upgrade -y
$ apt install software-properties-common python-software-properties apt-transport-https -y
$ wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo apt-key add -
$ echo "deb https://artifacts.elastic.co/packages/6.x/apt stable main" | sudo tee -a /etc/apt/sources.list.d/elastic-6.x.list
$ apt update
$ apt install default-jdk -y
$ apt install elasticsearch -y

Since we attached an extra disk to our data nodes, verify that you can see the disk:

$ lsblk
NAME   MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
vda    253:0    0  25G  0 disk
└─vda1 253:1    0  25G  0 part /
vdb    253:16   0  10G  0 disk             <----

Format the block device with xfs (or anything else that you prefer), create the directory where the elasticsearch data will reside, change the ownership so that elasticsearch has permission to read and write, set the device to mount on startup and mount the disk:

$ mkfs.xfs /dev/vdb
$ mkdir /data
$ mkdir /data/nodes
$ chown -R elasticsearch:elasticsearch /data
$ chown -R elasticsearch:elasticsearch /data/nodes
$ echo '/dev/vdb /data xfs defaults 0 0' >> /etc/fstab
$ mount -a

Verify that the disk is mounted:

$ df -h
Filesystem      Size  Used Avail Use% Mounted on
udev            994M     0  994M   0% /dev
tmpfs           201M  3.1M  197M   2% /run
/dev/vda1        25G  1.8G   23G   8% /
/dev/vdb         10G   33M   10G   1% /data

Bootstrap the elasticsearch config with a cluster name, set node.name to an identifier (in this case I will use the server's hostname), set node.master to false as these will be data nodes, enable them as data nodes with node.data: true, and point path.data: /data to the path that we prepared, etc:

$ cat > /etc/elasticsearch/elasticsearch.yml << EOF
cluster.name: es-cluster
node.name: \${HOSTNAME}
node.master: false
node.data: true
path.data: /data
path.logs: /var/log/elasticsearch
bootstrap.memory_lock: true
network.host: 0.0.0.0
discovery.zen.minimum_master_nodes: 2
discovery.zen.ping.unicast.hosts: ["es-master-1", "es-master-2", "es-master-3"]
EOF

Set a couple of important settings for your elasticsearch cluster, as described in their docs:

$ cat > /etc/default/elasticsearch << EOF
ES_STARTUP_SLEEP_TIME=5
MAX_OPEN_FILES=65536
MAX_LOCKED_MEMORY=unlimited
EOF

Disable swapping, increase the file descriptors and increase the maximum number of threads:

$ vim /usr/lib/systemd/system/elasticsearch.service
[Service]
LimitMEMLOCK=infinity
LimitNOFILE=65535
LimitNPROC=4096

Also update them via limits.conf:

$ cat > /etc/security/limits.conf << EOF
elasticsearch soft memlock unlimited
elasticsearch hard memlock unlimited
EOF

Reload the systemd daemon, enable and start elasticsearch. Allow it to start and check if the ports are listening with netstat -tulpn | grep 9200, then:

$ systemctl daemon-reload
$ systemctl enable elasticsearch
$ systemctl restart elasticsearch

Verify that everything works as expected, look at the cluster health and look at the status and number of nodes:

$ curl http://127.0.0.1:9200/_cluster/health?pretty
{
  "cluster_name" : "es-cluster",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 5,
  "number_of_data_nodes" : 2,
  "active_primary_shards" : 0,
  "active_shards" : 0,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 100.0
}

Look at the nodes api and you will see that we now have the extra 2 nodes showing up on node.role as di:

$ curl http://127.0.0.1:9200/_cat/nodes?v
ip           heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
10.163.68.7             9          96   6    0.12    0.11     0.03 di        -      es-data-2
10.163.68.5            10          80   2    0.20    0.09     0.08 mi        *      es-master-1
10.163.68.11           12          96   9    0.12    0.09     0.03 di        -      es-data-1
10.163.68.4            10          79   0    0.00    0.12     0.11 mi        -      es-master-3
10.163.68.8            12          79   1    0.05    0.06     0.07 mi        -      es-master-2

Interact with Elasticsearch

Let’s interact with elasticsearch, the overview:

$ curl http://127.0.0.1:9200
{
  "name" : "es-data-1",
  "cluster_name" : "es-cluster",
  "cluster_uuid" : "5BLs4sxsSEK-4OxlGnmlmw",
  "version" : {
    "number" : "6.7.0",
    "build_flavor" : "default",
    "build_type" : "deb",
    "build_hash" : "8453f77",
    "build_date" : "2019-03-21T15:32:29.844721Z",
    "build_snapshot" : false,
    "lucene_version" : "7.7.0",
    "minimum_wire_compatibility_version" : "5.6.0",
    "minimum_index_compatibility_version" : "5.0.0"
  },
  "tagline" : "You Know, for Search"
}

Let’s look at the Health API:

$ curl http://127.0.0.1:9200/_cat/health?v
epoch      timestamp cluster    status node.total node.data shards pri relo init unassign pending_tasks max_task_wait_time active_shards_percent
1554154652 21:37:32  es-cluster green           5         2     10   5    0    0        0             0                  -                100.0%

Let's ingest some data into elasticsearch. We will create an index named first-index with some dummy data about people: username, name, surname, location and hobbies:

$ curl -H 'Content-Type: application/json' -XPOST http://127.0.0.1:9200/first-index/docs/ -d '{"username": "mikes", "name": "mike", "surname": "steyn", "location": {"country": "south africa", "city": "cape town"}, "hobbies": ["sport", "coffee"]}'

$ curl -H 'Content-Type: application/json' -XPOST http://127.0.0.1:9200/first-index/docs/ -d '{"username": "clarissas", "name": "clarissa", "surname": "smith", "location": {"country": "ireland", "city": "dublin"}, "hobbies": ["shopping", "reading", "chess"]}'

$ curl -H 'Content-Type: application/json' -XPOST http://127.0.0.1:9200/first-index/docs/ -d '{"username": "franka", "name": "frank", "surname": "adams", "location": {"country": "new zealand", "city": "auckland"}, "hobbies": ["programming", "swimming", "rugby"]}'

Now that we have ingested our data into elasticsearch, let's have a look at the Indices API, where the number of documents, size etc should reflect:

$ curl http://127.0.0.1:9200/_cat/indices?v
health status index       uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   first-index 1o6yM7tCSqagqoeihKM7_g   5   1          3            0     40.6kb         20.3kb

Now let's make a search request, which by default returns 10 documents:

$ curl http://127.0.0.1:9200/first-index/_search?pretty
{
  "took" : 116,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 3,
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "first-index",
        "_type" : "docs",
        "_id" : "-NTO2mkB8pugP4aC2jtZ",
        "_score" : 1.0,
        "_source" : {
          "username" : "mikes",
          "name" : "mike",
          "surname" : "steyn",
          "location" : {
            "country" : "south africa",
            "city" : "cape town"
          },
          "hobbies" : [
            "sport",
            "coffee"
          ]
        }
      },
      {
        "_index" : "first-index",
        "_type" : "docs",
        "_id" : "-tTR2mkB8pugP4aCAzvG",
        "_score" : 1.0,
        "_source" : {
          "username" : "franka",
          "name" : "frank",
          "surname" : "adams",
          "location" : {
            "country" : "new zealand",
            "city" : "auckland"
          },
          "hobbies" : [
            "programming",
            "swimming",
            "rugby"
          ]
        }
      },
      {
        "_index" : "first-index",
        "_type" : "docs",
        "_id" : "-dTP2mkB8pugP4aC1ztI",
        "_score" : 1.0,
        "_source" : {
          "username" : "clarissas",
          "name" : "clarissa",
          "surname" : "smith",
          "location" : {
            "country" : "ireland",
            "city" : "dublin"
          },
          "hobbies" : [
            "shopping",
            "reading",
            "chess"
          ]
        }
      }
    ]
  }
}

Let's have a look at our shards using the Shards API; you will also see to which shard each document was assigned, and whether it's a primary or replica shard:

$ curl http://127.0.0.1:9200/_cat/shards?v
index       shard prirep state   docs store ip           node
first-index 4     p      STARTED    0  230b 10.163.68.7  es-data-2
first-index 4     r      STARTED    0  230b 10.163.68.11 es-data-1
first-index 2     p      STARTED    0  230b 10.163.68.7  es-data-2
first-index 2     r      STARTED    0  230b 10.163.68.11 es-data-1
first-index 3     r      STARTED    1 6.6kb 10.163.68.7  es-data-2
first-index 3     p      STARTED    1 6.6kb 10.163.68.11 es-data-1
first-index 1     r      STARTED    2  13kb 10.163.68.7  es-data-2
first-index 1     p      STARTED    2  13kb 10.163.68.11 es-data-1
first-index 0     p      STARTED    0  230b 10.163.68.7  es-data-2
first-index 0     r      STARTED    0  230b 10.163.68.11 es-data-1

Then we can also use the Allocation API to see the size of our indices, disk space per node:

$ curl http://127.0.0.1:9200/_cat/allocation?v
shards disk.indices disk.used disk.avail disk.total disk.percent host         ip           node
     5       20.3kb    32.4mb      9.9gb      9.9gb            0 10.163.68.11 10.163.68.11 es-data-1
     5       20.3kb    32.4mb      9.9gb      9.9gb            0 10.163.68.7  10.163.68.7  es-data-2

Let’s search for anyone with the surname smith:

$ curl -s http://127.0.0.1:9200/first-index/_search?q=surname=smith | jq .
{
  "took": 22,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.2876821,
    "hits": [
      {
        "_index": "first-index",
        "_type": "docs",
        "_id": "-dTP2mkB8pugP4aC1ztI",
        "_score": 0.2876821,
        "_source": {
          "username": "clarissas",
          "name": "clarissa",
          "surname": "smith",
          "location": {
            "country": "ireland",
            "city": "dublin"
          },
          "hobbies": [
            "shopping",
            "reading",
            "chess"
          ]
        }
      }
    ]
  }
}

Let’s search for anyone with rugby as one of their hobbies:

$ curl -s http://127.0.0.1:9200/first-index/_search?q=hobbies=rugby | jq .
{
  "took": 23,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.64072424,
    "hits": [
      {
        "_index": "first-index",
        "_type": "docs",
        "_id": "-tTR2mkB8pugP4aCAzvG",
        "_score": 0.64072424,
        "_source": {
          "username": "franka",
          "name": "frank",
          "surname": "adams",
          "location": {
            "country": "new zealand",
            "city": "auckland"
          },
          "hobbies": [
            "programming",
            "swimming",
            "rugby"
          ]
        }
      }
    ]
  }
}
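
The same searches can also be written with the query DSL in the request body instead of the q= query string parameter; a minimal sketch of the rugby search from above:

$ curl -s -H 'Content-Type: application/json' -XPOST 'http://127.0.0.1:9200/first-index/_search?pretty' -d '{"query": {"match": {"hobbies": "rugby"}}}'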

More on Elasticsearch

I am planning to write up elasticsearch articles on the following topics:

  • Setting up a 5 Node HA Elasticsearch Cluster
  • Indexes / Replicas
  • Search Queries
  • Delete Queries
  • Elasticsearch Snapshots and Restores on S3
  • Mapping Templates
  • Resizing Index Shards
  • Dealing with Old Timeseries Data
  • Elasticsearch Percentiles
  • Managing Yellow and Red Status Clusters
  • Managing High JVM Memory Pressure
  • and more

As I finish up the writing of these posts they will be published under the #elasticsearch-tutorials category on my blog and for any other elasticsearch tutorials, you can find them under the #elasticsearch category.

Oke byyyyyye :D


Snippet: Create Custom CloudWatch Metrics With Python

A quick post on how to create custom CloudWatch Metrics using Python on AWS.

After you have published the metrics to CloudWatch, you will be able to see them when navigating to:

  • CloudWatch / Metrics / Custom Namespaces / statusdash/ec2client

When selecting:

Select Metric: SomeKey1, SomeKey2
Select MetricName HttpResponseTime

And it should look like this:

The Script:

The python script that will be using boto3 to talk to AWS:

import boto3
import random

# cloudwatch client in the region where the metrics should be published
cloudwatch = boto3.Session(region_name='eu-west-1').client('cloudwatch')

# publish one custom metric value with two dimensions into our namespace
response = cloudwatch.put_metric_data(
    MetricData=[
        {
            'MetricName': 'HttpResponseTime',
            'Dimensions': [
                {
                    'Name': 'Server',
                    'Value': 'app.example.com'
                },
                {
                    'Name': 'Client',
                    'Value': 'Client-ABC'
                },
            ],
            'Unit': 'Milliseconds',
            'Value': random.randint(20, 50)
        },
    ],
    Namespace='statusdash/ec2client'
)
print(response)
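
To double check that the datapoints arrived, you could list the metrics in that namespace with the AWS CLI (assuming the CLI is installed and configured for the same account and region):

$ aws cloudwatch list-metrics --namespace "statusdash/ec2client" --region eu-west-1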

Resources:

  • https://stackify.com/custom-metrics-aws-lambda/
  • https://www.syntouch.nl/custom-cloudwatch-metrics-in-python-yes-we-can/ (psutil)
  • https://aws.amazon.com/blogs/devops/new-how-to-better-monitor-your-custom-application-metrics-using-amazon-cloudwatch-agent/
  • https://medium.com/@mrdoro/aws-lambda-as-the-website-monitoring-tool-184b09202ae2

Concourse Pipeline to Build a Docker Image Automatically on Git Commit

In this tutorial we will build a ci pipeline using concourse to build and push an image to dockerhub automatically, whenever a new git commit is made to the master branch.

Our Project Setup

Our Directory Tree:

$ find .
./Dockerfile
./ci
./ci/pipeline.yml
./README.md
./docker-tunnel

The project used in this example is not important, but you can check it out at https://github.com/ruanbekker/docker-remote-tunnel

Our Pipeline

A visual of how the pipeline will look in concourse:

Our pipeline definition will consist of 3 resources: a github repo, a dockerhub image and a slack resource to inform us whether a build has completed.

Then we specify that the job should be triggered on a git commit to the master branch, after which the image is built and pushed to our dockerhub repo.

Our pipeline definition ci/pipeline.yml:

resources:
- name: git-repo
  type: git
  source:
    uri: git@github.com:ruanbekker/docker-remote-tunnel.git
    branch: master
    private_key: ((github_private_key))

- name: docker-remote-tunnel-image
  type: docker-image
  source:
    repository: ruanbekker/docker-remote-tunnel
    tag: test
    username: ((dockerhub_user))
    password: ((dockerhub_password))

- name: slack-alert
  type: slack-notification
  source:
    url: ((slack_notification_url))

resource_types:
  - name: slack-notification
    type: docker-image
    source:
      repository: cfcommunity/slack-notification-resource
      tag: v1.3.0

jobs:
- name: build-cached-image
  plan:
  - get: git-repo
    trigger: true
  - task: build-cached-image-workspace
    config:
      platform: linux
      image_resource:
        type: docker-image
        source:
          repository: rbekker87/build-tools

      outputs:
      - name: workspace
      inputs:
      - name: git-repo

      run:
        path: /bin/sh
        args:
        - -c
        - |
          output_dir=workspace

          cat << EOF > "${output_dir}/Dockerfile"
          FROM alpine

          ADD git-repo /tmp/git-repo
          RUN mv /tmp/git-repo/docker-tunnel /usr/bin/docker-tunnel
          RUN apk --no-cache add screen docker openssl openssh-client apache2-utils
          RUN /usr/bin/docker-tunnel -h
          RUN rm -rf /tmp/git-repo
          EOF

          cp -R ./git-repo "${output_dir}/git-repo"

  - put: docker-remote-tunnel-image
    params:
      build: workspace

    on_failure:
      put: slack-alert
      params:
        channel: '#system_events'
        username: 'concourse'
        icon_emoji: ':concourse:'
        silent: true
        text: |
            *$BUILD_PIPELINE_NAME/$BUILD_JOB_NAME* ($BUILD_NAME) FAILED to build image
            https://ci.domain.com/teams/$BUILD_TEAM_NAME/pipelines/$BUILD_PIPELINE_NAME/jobs/$BUILD_JOB_NAME/builds/$BUILD_NAME
    on_success:
      put: slack-alert
      params:
        channel: '#system_events'
        username: 'concourse'
        icon_emoji: ':concourse:'
        silent: true
        text: |
            *$BUILD_PIPELINE_NAME/$BUILD_JOB_NAME* ($BUILD_NAME) SUCCESS - Image has been published
            https://ci.domain.com/teams/$BUILD_TEAM_NAME/pipelines/$BUILD_PIPELINE_NAME/jobs/$BUILD_JOB_NAME/builds/$BUILD_NAME

- name: test
  plan:
  - get: docker-remote-tunnel-image
    passed: [build-cached-image]
    trigger: true
  - get: git-repo
    passed: [build-cached-image]
  - task: run-tests
    image: docker-remote-tunnel-image
    config:
      platform: linux
      inputs:
      - name: git-repo
      run:
        dir: git-repo
        path: sh
        args:
        - /usr/bin/docker-tunnel
        - --help

    on_failure:
      put: slack-alert
      params:
        channel: '#system_events'
        username: 'concourse'
        icon_emoji: ':concourse:'
        silent: true
        text: |
            *$BUILD_PIPELINE_NAME/$BUILD_JOB_NAME* ($BUILD_NAME) FAILED - Testing image failure
            https://ci.domain.com/teams/$BUILD_TEAM_NAME/pipelines/$BUILD_PIPELINE_NAME/jobs/$BUILD_JOB_NAME/builds/$BUILD_NAME
    on_success:
      put: slack-alert
      params:
        channel: '#system_events'
        username: 'concourse'
        icon_emoji: ':concourse:'
        silent: true
        text: |
            *$BUILD_PIPELINE_NAME/$BUILD_JOB_NAME* ($BUILD_NAME) SUCCESS - Testing image Succeeded
            https://ci.domain.com/teams/$BUILD_TEAM_NAME/pipelines/$BUILD_PIPELINE_NAME/jobs/$BUILD_JOB_NAME/builds/$BUILD_NAME

Note that our secret information is templatized and saved in our local credentials.yml which should never be stored in version control:

slack_notification_url: https://api.slack.com/aaa/bbb/ccc
dockerhub_user: myuser
dockerhub_password: mypasswd
github_private_key: |-
        -----BEGIN RSA PRIVATE KEY-----
        some-secret-data
        -----END RSA PRIVATE KEY------

Set the Pipeline:

Now that we have our pipeline definition, credentials and application code (stored in version control), go ahead and set the pipeline, which will save the pipeline configuration in concourse:

# pipeline name: my-docker-app-pipeline
$ fly -t scw sp -n main -c pipeline.yml -p my-docker-app-pipeline -l credentials.yml

Now the pipeline is saved on concourse but in a paused state, go ahead and unpause the pipeline:

$ fly -t scw up -p my-docker-app-pipeline
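
Because the job has trigger: true on the git resource, a new commit will kick it off by itself, but while testing you can also trigger the job manually with fly (using the pipeline name from above):

$ fly -t scw trigger-job -j my-docker-app-pipeline/build-cached-image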

Test your Pipeline

Make a commit to master and head over to concourse and look at it go:

Thanks for reading, make sure to check out my other posts on #concourse

Ship Your Logs to Elasticsearch With Filebeat

Filebeat by Elastic is a lightweight log shipper that ships your logs to Elastic products such as Elasticsearch and Logstash. Filebeat monitors the logfiles from the given configuration and ships them to the locations that are specified.

Filebeat Overview

Filebeat runs as an agent, monitors your logs and ships them in response to events, or whenever the logfile receives new data.

Below is an overview (credit: elastic.co) of how Filebeat works.

Installing Filebeat

Let's go ahead and install Filebeat. I will be using version 6.7 as that is the same version as my Elasticsearch cluster. To check the version of your elasticsearch cluster:

$ curl http://127.0.0.1:9200/_cluster/health?pretty # i have es running locally

Install the dependencies:

$ apt install wget apt-transport-https -y

Get the public signing key:

$ wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo apt-key add -

Get the repository definition:

$ echo "deb https://artifacts.elastic.co/packages/6.x/apt stable main" | tee -a /etc/apt/sources.list.d/elastic-6.x.list

Update the repositories:

$ apt update && apt upgrade -y

Install Filebeat and enable the service on boot:

$ apt install filebeat -y
$ systemctl enable filebeat

Configure Filebeat

Let's edit the main filebeat configuration to specify where the data should be shipped to (in this case elasticsearch), and I would also like to set some extra fields that will apply to this specific server.

Open up /etc/filebeat/filebeat.yml and edit the following:

filebeat.inputs:

- type: log
  enabled: false
  paths:
    - /var/log/nginx/*.log

filebeat.config.modules:
  path: ${path.config}/modules.d/*.yml
  reload.enabled: false

setup.template.settings:
  index.number_of_shards: 3

fields:
  blog_name: sysadmins
  service_type: webserver
  cloud_provider: aws

setup.kibana:
  host: "http://localhost:5601"
  username: "elastic"
  password: "changeme"

output.elasticsearch:
  hosts: ["localhost:9200"]
  protocol: "http"
  username: "elastic"
  password: "changeme"

Above, we set the path to the nginx access logs, some extra fields, the kibana endpoint (so that filebeat can seed kibana with example dashboards) and the elasticsearch output configuration.

Filebeat Modules

Filebeat comes with modules that have context on specific applications like nginx, mysql, etc. Let's enable the system (syslog, auth, etc) and nginx modules for our web server:

$ filebeat modules enable system
$ filebeat modules enable nginx

Example of my /etc/filebeat/modules.d/system.yml configuration:

- module: system
  syslog:
    enabled: true
    var.paths: ["/var/log/syslog"]

  auth:
    enabled: true
    var.paths: ["/var/log/auth.log"]

Example of my /etc/filebeat/modules.d/nginx.yml configuration:

- module: nginx
  access:
    enabled: true
    var.paths: ["/var/log/nginx/access.log"]

  error:
    enabled: true
    var.paths: ["/var/log/nginx/error.log"]

Now set up the templates:

$ filebeat setup

Then restart filebeat:

$ /etc/init.d/filebeat restart

You can have a look at the logs, should you need to debug:

tail -f /var/log/filebeat/filebeat
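
If nothing is being shipped, it can also help to validate the configuration and the connection to your outputs first; these subcommands are available in recent filebeat versions:

$ filebeat test config -c /etc/filebeat/filebeat.yml
$ filebeat test output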

Your data should now be shipped to elasticsearch, by default into an index named filebeat-<version>-YYYY.MM.dd.

$ curl 'http://127.0.0.1:9200/_cat/indices/filebeat*?v'
health status index                     uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   filebeat-6.7.1-2019.03.27 CBdV7adjRKypN1wguwuHDA   3   1     453220            0    230.2mb        115.9mb

Kibana

You can head over to Kibana at http://localhost:5601 (in this case) to visualize the data that is ingested into your filebeat index. I will write a tutorial on how to build the most common dashboards later this week.

That's it for now :D


How to Deploy a Docker Swarm Cluster on Scaleway With Terraform

We will deploy a 3 node docker swarm cluster with terraform on scaleway. I have used the base source code from this repository but tweaked the configuration to my needs.

Pre-Requisites

Ensure terraform and jq are installed:

$ brew install terraform
$ brew install jq
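
Once the project files shown below are in place, the working directory needs to be initialised so terraform can download the providers referenced in the configuration (scaleway, template and external in this case):

$ terraform init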

Terraform

You can have a look at the linked source at the top for the source code, but below I will provide each file that will make up our terraform deployment.

Our main.tf

provider "scaleway" {
  region = "${var.region}"
}

data "scaleway_bootscript" "debian" {
  architecture = "x86_64"
  name = "x86_64 mainline 4.15.11 rev1"
}

data "scaleway_image" "debian_stretch" {
  architecture = "x86_64"
  name         = "Debian Stretch"
}

data "template_file" "docker_conf" {
  template = "${file("conf/docker.tpl")}"

  vars {
    ip = "${var.docker_api_ip}"
  }
}

The outputs.tf

output "swarm_manager_public_ip" {
  value = "${scaleway_ip.swarm_manager_ip.0.ip}"
}

output "swarm_manager_private_ip" {
  value = "${scaleway_server.swarm_manager.0.private_ip}"
}

output "swarm_workers_public_ip" {
  value = "${concat(scaleway_server.swarm_worker.*.name, scaleway_server.swarm_worker.*.public_ip)}"
}

output "swarm_workers_private_ip" {
  value = "${concat(scaleway_server.swarm_worker.*.name, scaleway_server.swarm_worker.*.private_ip)}"
}

output "workspace" {
  value = "${terraform.workspace}"
}

Our security-groups.tf

resource "scaleway_security_group" "swarm_managers" {
  name        = "swarm_managers"
  description = "Allow HTTP/S and SSH traffic"
}

resource "scaleway_security_group_rule" "ssh_accept" {
  security_group = "${scaleway_security_group.swarm_managers.id}"

  action    = "accept"
  direction = "inbound"
  ip_range  = "0.0.0.0/0"
  protocol  = "TCP"
  port      = 22
}

resource "scaleway_security_group_rule" "http_accept" {
  security_group = "${scaleway_security_group.swarm_managers.id}"

  action    = "accept"
  direction = "inbound"
  ip_range  = "0.0.0.0/0"
  protocol  = "TCP"
  port      = 80
}

resource "scaleway_security_group_rule" "https_accept" {
  security_group = "${scaleway_security_group.swarm_managers.id}"

  action    = "accept"
  direction = "inbound"
  ip_range  = "0.0.0.0/0"
  protocol  = "TCP"
  port      = 443
}

resource "scaleway_security_group" "swarm_workers" {
  name        = "swarm_workers"
  description = "Allow SSH traffic"
}

resource "scaleway_security_group_rule" "ssh_accept_workers" {
  security_group = "${scaleway_security_group.swarm_workers.id}"

  action    = "accept"
  direction = "inbound"
  ip_range  = "0.0.0.0/0"
  protocol  = "TCP"
  port      = 22
}

Our variables.tf

variable "docker_version" {
  default = "18.06.3~ce~3-0~debian"
}

variable "region" {
  default = "ams1"
}

variable "manager_instance_type" {
  default = "START1-M"
}

variable "worker_instance_type" {
  default = "START1-M"
}

variable "worker_instance_count" {
  default = 2
}

variable "docker_api_ip" {
  default = "127.0.0.1"
}

Our managers.tf

resource "scaleway_ip" "swarm_manager_ip" {
  count = 1
}

resource "scaleway_server" "swarm_manager" {
  count          = 1
  name           = "${terraform.workspace}-manager-${count.index + 1}"
  image          = "${data.scaleway_image.debian_stretch.id}"
  type           = "${var.manager_instance_type}"
  bootscript     = "${data.scaleway_bootscript.debian.id}"
  security_group = "${scaleway_security_group.swarm_managers.id}"
  public_ip      = "${element(scaleway_ip.swarm_manager_ip.*.ip, count.index)}"

  volume {
    size_in_gb = 50
    type       = "l_ssd"
  }

  provisioner "remote-exec" {
    script = "scripts/mount-disk.sh"
  }

  connection {
    type = "ssh"
    user = "root"
    private_key = "${file("~/.ssh/id_rsa")}"
  }

  provisioner "remote-exec" {
    inline = [
      "mkdir -p /etc/systemd/system/docker.service.d",
    ]
  }

  provisioner "file" {
    content     = "${data.template_file.docker_conf.rendered}"
    destination = "/etc/systemd/system/docker.service.d/docker.conf"
  }

  provisioner "file" {
    source      = "scripts/install-docker-ce.sh"
    destination = "/tmp/install-docker-ce.sh"
  }

  provisioner "file" {
    source      = "scripts/local-persist-plugin.sh"
    destination = "/tmp/local-persist-plugin.sh"
  }

  provisioner "remote-exec" {
    inline = [
      "chmod +x /tmp/install-docker-ce.sh",
      "/tmp/install-docker-ce.sh ${var.docker_version}",
      "docker swarm init --advertise-addr ${self.private_ip}",
      "chmod +x /tmp/local-persist-plugin.sh",
      "/tmp/local-persist-plugin.sh"
    ]
  }
}

Our workers.tf

resource "scaleway_ip" "swarm_worker_ip" {
  count = "${var.worker_instance_count}"
}

resource "scaleway_server" "swarm_worker" {
  count          = "${var.worker_instance_count}"
  name           = "${terraform.workspace}-worker-${count.index + 1}"
  image          = "${data.scaleway_image.debian_stretch.id}"
  type           = "${var.worker_instance_type}"
  bootscript     = "${data.scaleway_bootscript.debian.id}"
  security_group = "${scaleway_security_group.swarm_workers.id}"
  public_ip      = "${element(scaleway_ip.swarm_worker_ip.*.ip, count.index)}"

  volume {
    size_in_gb = 50
    type       = "l_ssd"
  }

  provisioner "remote-exec" {
    script = "scripts/mount-disk.sh"
  }

  connection {
    type = "ssh"
    user = "root"
    private_key = "${file("~/.ssh/id_rsa")}"
  }

  provisioner "remote-exec" {
    inline = [
      "mkdir -p /etc/systemd/system/docker.service.d",
    ]
  }

  provisioner "file" {
    content     = "${data.template_file.docker_conf.rendered}"
    destination = "/etc/systemd/system/docker.service.d/docker.conf"
  }

  provisioner "file" {
    source      = "scripts/install-docker-ce.sh"
    destination = "/tmp/install-docker-ce.sh"
  }

  provisioner "file" {
    source      = "scripts/local-persist-plugin.sh"
    destination = "/tmp/local-persist-plugin.sh"
  }

  provisioner "remote-exec" {
    inline = [
      "chmod +x /tmp/install-docker-ce.sh",
      "/tmp/install-docker-ce.sh ${var.docker_version}",
      "docker swarm join --token ${data.external.swarm_tokens.result.worker} ${scaleway_server.swarm_manager.0.private_ip}:2377",
      "chmod +x /tmp/local-persist-plugin.sh",
      "/tmp/local-persist-plugin.sh",
    ]
  }

  provisioner "remote-exec" {
    when = "destroy"

    inline = [
      "docker node update --availability drain ${self.name}",
    ]

    on_failure = "continue"

    connection {
      type = "ssh"
      user = "root"
      host = "${scaleway_ip.swarm_manager_ip.0.ip}"
    }
  }

  provisioner "remote-exec" {
    when = "destroy"

    inline = [
      "docker swarm leave",
    ]

    on_failure = "continue"
  }

  provisioner "remote-exec" {
    when = "destroy"

    inline = [
      "docker node rm --force ${self.name}",
    ]

    on_failure = "continue"

    connection {
      type = "ssh"
      user = "root"
      host = "${scaleway_ip.swarm_manager_ip.0.ip}"
    }
  }
}

data "external" "swarm_tokens" {
  program = ["./scripts/fetch-tokens.sh"]

  query = {
    host = "${scaleway_ip.swarm_manager_ip.0.ip}"
  }

  depends_on = ["scaleway_server.swarm_manager"]
}

Our config for the docker daemon: conf/docker.tpl

[Service]
ExecStart=
ExecStart=/usr/bin/dockerd -H fd:// \
  -H tcp://${ip}:2375 \
  --storage-driver=overlay2 \
  --dns 8.8.4.4 --dns 8.8.8.8 \
  --log-driver json-file \
  --log-opt max-size=50m --log-opt max-file=10 \
  --experimental=true \
  --metrics-addr 172.17.0.1:9323
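
Once a node has been provisioned you can quickly confirm that the drop-in was rendered and that the daemon is listening where it should be (a minimal check; the API address comes from var.docker_api_ip, which defaults to 127.0.0.1, and the metrics address comes from the template above):

$ docker -H tcp://127.0.0.1:2375 version
$ curl -s http://172.17.0.1:9323/metrics | head -n 5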

Our script to mount our additional disk: scripts/mount-disk.sh

#!/bin/bash
apt update
apt install xfsprogs attr -y
mkfs -t xfs /dev/vdb
echo "/dev/vdb /mnt xfs defaults 0 0" >> /etc/fstab
mount -a
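
After this script has run you can confirm that the extra volume is formatted as XFS and mounted on /mnt (a quick sanity check on the node itself):

$ df -hT /mnt
$ grep vdb /etc/fstab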

Our script to install docker: scripts/install-docker-ce.sh

#!/usr/bin/env bash

DOCKER_VERSION=$1
DEBIAN_FRONTEND=noninteractive apt-get -qq update
apt-get -qq install apt-transport-https ca-certificates curl software-properties-common
curl -fsSL https://download.docker.com/linux/debian/gpg | sudo apt-key add -
add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/debian $(lsb_release -cs) stable"

apt-get -q update -y
apt-get -q install -y docker-ce=$DOCKER_VERSION containerd.io
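
Terraform calls this script with the pinned version from variables.tf, but you can also run it by hand on a fresh Debian Stretch host to test it (the version string below is the default from var.docker_version):

$ chmod +x scripts/install-docker-ce.sh
$ ./scripts/install-docker-ce.sh "18.06.3~ce~3-0~debian"
$ docker --version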

Our script that retrieves the swarm tokens: scripts/fetch-tokens.sh

#!/usr/bin/env bash

# Processing JSON in shell scripts
# https://www.terraform.io/docs/providers/external/data_source.html#processing-json-in-shell-scripts

set -e

# Extract "host" argument from the input into HOST shell variable
eval "$(jq -r '@sh "HOST=\(.host)"')"

MANAGER=$(ssh -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null root@$HOST docker swarm join-token manager -q)
WORKER=$(ssh -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null root@$HOST docker swarm join-token worker -q)

# produce a json object containing the tokens
jq -n --arg manager "$MANAGER" --arg worker "$WORKER" '{"manager":$manager,"worker":$worker}'
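
Since this is a standard Terraform external data source, you can test the script on its own by feeding it the same JSON query Terraform would pass on stdin (assuming your SSH key has access to the manager; the IP below is a placeholder):

$ echo '{"host": "51.xx.xx.xx"}' | ./scripts/fetch-tokens.sh
{
  "manager": "SWMTKN-1-...",
  "worker": "SWMTKN-1-..."
}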

Our script to install the local-persist docker volume plugin: scripts/local-persist-plugin.sh

#!/usr/bin/env bash
set -e
curl -fsSL https://raw.githubusercontent.com/CWSpear/local-persist/master/scripts/install.sh | bash
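
The local-persist driver lets you pin a named volume to a path on the host, which is why we mount the extra disk under /mnt first. A hedged example of how you could use it once the cluster is up (the volume name and mountpoint are placeholders):

$ docker volume create -d local-persist -o mountpoint=/mnt/data/myapp --name myapp-data
$ docker volume inspect myapp-data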

Deploy your Swarm

Note that we will be deploying 3x START1-M servers with Debian Stretch. At the moment the image id resolves to Debian Stretch, but that may change in the future. If you want to change the distro, update the install script and the Terraform files accordingly.

Generate API Token on Scaleway then export it to your current shell:

export SCALEWAY_ORGANIZATION="<organization-id>"
export SCALEWAY_TOKEN="<secret>"

Make sure that the SSH private key referenced in the config (in my example ~/.ssh/id_rsa) is the intended one, and that its public key is present in your servers' authorized_keys file.

Create a new workspace:

$ terraform workspace new swarm

Pull down the providers and initialize:

$ terraform init

Deploy!

$ terraform apply
...
...
scaleway_server.swarm_worker[0]: Creation complete after 4m55s (ID: xx-xx-xx-xx-xx)

Apply complete! Resources: 14 added, 0 changed, 0 destroyed.
Outputs:

swarm_manager_private_ip = 10.21.x.x
swarm_manager_public_ip = 51.xx.xx.xx
swarm_workers_private_ip = [
    swarm-worker-1,
    swarm-worker-2,
    10.20.xx.xx,
    10.20.xx.xx,
]
swarm_workers_public_ip = [
    swarm-worker-1,
    swarm-worker-2,
    51.xx.xx.xx,
    51.xx.xx.xx,
]
workspace = swarm

Once your deployment is done, the public/private IP addresses of your nodes will be shown as outputs, as seen above. You can also retrieve them manually:

$ terraform output

Or for a specific output value, such as the manager's public IP:

$ terraform output swarm_manager_public_ip
51.xx.xx.xx

Go ahead and SSH to your manager node and list the swarm nodes. Boom, easy right?

$ docker node ls
ID                            HOSTNAME            STATUS              AVAILABILITY        MANAGER STATUS      ENGINE VERSION
2696o0vrt93x8qf2gblbfc8pf *   swarm-manager       Ready               Active              Leader              18.09.3
72ava7rrp2acnyadisg52n7ym     swarm-worker-1      Ready               Active                                  18.09.3
sy2otqn20qe9jc2v9io3a21jm     swarm-worker-2      Ready               Active                                  18.09.3
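
As a quick smoke test you can schedule a service across the cluster and watch the tasks land on the workers (a minimal sketch using the stock nginx image; not part of the original provisioning):

$ docker service create --name web --replicas 3 --publish 80:80 nginx
$ docker service ps web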

When you want to destroy the environment:

$ terraform destroy -force

References:

Big thanks goes to @stefanprodan

Deploy Scaleway Servers via the API in Python

A quick post on how to deploy Scaleway Servers via their API using Python.

API Documentation

Scaleway has great API Documentation available, so for deeper info have a look at the link provided.
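
If you want to poke at the API before writing any code, a simple authenticated request against the compute endpoint looks like this (a hedged sketch; the region, base URL and X-Auth-Token header match what the Python script below uses):

$ curl -s -H "X-Auth-Token: <your-api-key>" https://cp-ams1.scaleway.com/servers | python -m json.tool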

Python

Our Python script has a function create_server that expects a server name, an instance type, a tag and the Linux distribution:

import requests
import json
import time

SCW_API_KEY = "<your-api-key>"
SCW_ORGA_ID = "<your-organization-id>"
SCW_REGION = "ams1"
SCW_COMPUTE_API_URL = "https://cp-{region}.scaleway.com/{resource}".format(region=SCW_REGION, resource='servers')
SCW_VOLUME_API_URL = "https://cp-{region}.scaleway.com/{resource}".format(region=SCW_REGION, resource='volumes')
SCW_HEADERS = {"X-Auth-Token": SCW_API_KEY, "Content-Type": "application/json"}
SCW_IMAGES = {"ubuntu/18": "6a601340-19c1-4ca7-9c1c-0704bcc9f5fe", "debian/stretch": "710ff1fa-0d16-4f8f-93ac-0647c44fa21d"}

def get_status(server_id):
  response = requests.get(SCW_COMPUTE_API_URL + "/" + server_id, headers=SCW_HEADERS)
  state = response.json()
  return state

def create_server(instance_name, instance_type, instance_tag, os_distro):
  count = 0
  compute_payload = {
      "name": instance_name,
      "image": SCW_IMAGES[os_distro],
      "commercial_type": instance_type,
      "tags": [instance_tag],
      "organization": SCW_OGRA_ID
  }

  print("creating server")
  r_create = requests.post(SCW_COMPUTE_API_URL, json=compute_payload, headers=SCW_HEADERS)
  server_id = r_create.json()["server"]["id"]
  action_payload = {"action": "poweron"}
  r_start = requests.post(SCW_COMPUTE_API_URL + "/" + server_id + "/action", json=action_payload, headers=SCW_HEADERS)
  r_describe = requests.get(SCW_COMPUTE_API_URL + "/" + server_id, headers=SCW_HEADERS)

  server_state = get_status(server_id)['server']['state']
  while server_state != "running":

    if count > 90:
      r_delete = requests.delete(SCW_COMPUTE_API_URL + "/" + server_id, headers=SCW_HEADERS)
      return {"message": "error", "description": "task timed out while waiting for server to boot"}

    count += 1
    print("waiting for server to become ready")
    time.sleep(10)
    server_state = get_status(server_id)['server']['state']

  time.sleep(5)
  resp = get_status(server_id)["server"]
  output = {
      "id": resp["id"],
      "hostname": resp["hostname"],
      "instance_type": resp["commercial_type"],
      "public_ip": resp["public_ip"]["address"],
      "private_ip": resp["private_ip"],
      "status": resp["state"]
  }
  return output


response = create_server("swarm-manager", "START1-M", "swarm", "ubuntu/18")
print(response)

Deploying a server with the hostname: swarm-manager, instance-size: START1-M, tag: swarm and os distribution: ubuntu/18:

$ python scw.py
creating server
waiting for server to become ready
waiting for server to become ready
waiting for server to become ready
{'status': u'running', 'hostname': u'swarm-manager', 'public_ip': u'51.x.x.x', 'instance_type': u'START1-M', 'private_ip': u'10.x.x.x', 'id': u'xx-xx-xx-xx-xx'}
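
When you are done testing you can clean up via the same API. This is a hedged sketch using curl that mirrors the action and DELETE endpoints the script already calls; the server id is a placeholder and the server generally needs to be powered off before it can be deleted:

$ curl -s -X POST -H "X-Auth-Token: <your-api-key>" -H "Content-Type: application/json" -d '{"action": "poweroff"}' https://cp-ams1.scaleway.com/servers/<server-id>/action
$ curl -s -X DELETE -H "X-Auth-Token: <your-api-key>" https://cp-ams1.scaleway.com/servers/<server-id>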

For more info on Scaleway please do check them out: https://www.scaleway.com

Setup NRPE Client and Server for Monitoring Remote Services in Nagios

If you have not set up the Nagios server yet, have a look at the linked post to set up the Nagios server first.

Nagios NRPE

Nagios Remote Plugin Executor (NRPE) allows you to remotely execute Nagios plugins on other linux systems. This allows you to monitor remote machine metrics (disk usage, CPU, local listening services, etc.).

NRPE has 2 sections:

  • The nagios server side.
  • The client side.

For Nagios to execute remote plugins, the client's NRPE configuration needs to allow connections from the NRPE check originator, which in this case is the Nagios server.

Download, extract, configure and install NRPE on the Nagios server:

$ wget 'https://github.com/NagiosEnterprises/nrpe/releases/download/nrpe-3.2.1/nrpe-3.2.1.tar.gz'
$ tar -xvf nrpe-3.2.1.tar.gz
$ cd nrpe-3.2.1
$ ./configure --enable-command-args --with-nagios-user=nagios --with-nagios-group=nagcmd --with-ssl=/usr/bin/openssl --with-ssl-lib=/usr/lib/x86_64-linux-gnu
$ make all
$ make install
$ make install-init
$ make install-config
$ systemctl enable nrpe.service

Installing NRPE on the client side:

$ apt update && apt install nagios-nrpe-server -y
$ systemctl enable nagios-nrpe-server
$ systemctl start nagios-nrpe-server

Allow your nagios server ip in /etc/nagios/nrpe.cfg:

allowed_hosts=nagios.ip.in.here

Restart NRPE on the client:

$ systemctl restart nagios-nrpe-server

Ensure that the check_nrpe command is defined and available in the commands.cfg configuration on the Nagios server:

$ vi /usr/local/nagios/etc/objects/commands.cfg

define command {
    command_name check_nrpe
    command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
}
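
Before wiring this into any service definitions, it is worth verifying connectivity from the Nagios server to the client by hand (the client IP is a placeholder; the version reported will be whatever NRPE version the client is running):

$ /usr/local/nagios/libexec/check_nrpe -H <client-ip>
NRPE v3.2.1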

Check out this post on how to create a Python NRPE Nagios plugin to check disk space on the client host.