In this tutorial we will deploy a monitoring stack to Docker Swarm that includes Grafana, Prometheus, Node-Exporter, cAdvisor and Alertmanager.
If you are looking for more information on Prometheus, have a look at my other Prometheus and Monitoring blog posts.
What you will get out of this
Once you have deployed the stacks, you will have the following:
- Access to Grafana through the Traefik reverse proxy
- Node-Exporter to expose node level metrics
- cAdvisor to expose container level metrics
- Prometheus to scrape the exposed endpoints and ingest their metrics
- Prometheus as your time series database
- Alertmanager to handle the alerts fired by the configured rules
The compose file that I will provide comes with pre-populated dashboards.
Deploy Traefik
Get the traefik stack sources:
If you want to deploy Traefik on HTTPS, have a look at HTTPS Mode. I will use HTTP in this demonstration.
Set your domain and deploy the stack:
Your traefik service should be running:
Switch back to the previous directory:
Deploy the Monitoring Stack
Get the sources:
If you want to deploy the stack without pre-configured dashboards, you would need to use ./docker-compose.html, but in this case we will deploy the stack with pre-configured dashboards.
Set the domain and deploy the stack:
The endpoints are configured as ${service_name}.${DOMAIN}, so you will be able to access Grafana on http://grafana.localhost, as shown in my use case.
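To make the naming scheme concrete, here is a minimal Python sketch of how the ${service_name}.${DOMAIN} pattern expands into a URL. The helper name endpoint_url is my own illustration and not part of the stack, and it assumes plain HTTP as used in this demonstration.

```python
def endpoint_url(service_name: str, domain: str, scheme: str = "http") -> str:
    """Build the URL for a service exposed as <service_name>.<domain>."""
    return f"{scheme}://{service_name}.{domain}"


if __name__ == "__main__":
    # "localhost" is the domain used in this demonstration.
    for service in ("grafana", "prometheus"):
        print(endpoint_url(service, "localhost"))
```

If you set a different domain, only the second argument changes.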
Use docker stack services mon to check that all the services have reached their desired replica count, then access Grafana on http://grafana.${DOMAIN}.
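If you would rather check this from a script than by eye, the following is a small sketch under a few assumptions. It expects one service per line in the form "name running/desired", which is what you should get from docker stack services mon --format '{{.Name}} {{.Replicas}}'. The function name pending_services is hypothetical, and the sample service names and counts are made up for illustration, not captured from this stack.

```python
def pending_services(listing: str) -> list[str]:
    """Return names of services whose running count differs from the desired count.

    Expects one service per line in the form "<name> <running>/<desired>".
    """
    pending = []
    for line in listing.splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        name, replicas = parts[0], parts[1]
        running, desired = (int(n) for n in replicas.split("/", 1))
        if running != desired:
            pending.append(name)
    return pending


if __name__ == "__main__":
    # Hypothetical sample, not captured from a real deployment.
    sample = """\
mon_grafana 1/1
mon_prometheus 0/1
mon_node-exporter 3/3
"""
    print(pending_services(sample))
```

An empty list means every listed service has reached its desired count.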
Accessing Grafana
Access Grafana on http://grafana.${DOMAIN} and log in with the user admin and the password admin:
You will be asked to reset the password:
You will then be directed to the UI:
From the top, when you list dashboards, you will see the 3 dashboards that were pre-configured:
When looking at the Swarm Nodes Dashboard:
The Swarm Services Dashboard:
Exploring Metrics in Prometheus
Access Prometheus on http://prometheus.${DOMAIN} and, from the search input, you can start exploring all the metrics that are available in Prometheus:
If we search for node_load15 and select graph, we can have a quick look at the 15 minute load average of the node that the stack is running on:
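The same query can also be run outside the UI through the Prometheus HTTP API, which serves instant queries at /api/v1/query. The sketch below builds the query URL and extracts the values from a response in the API's instant-vector shape. The helper names are my own, http://prometheus.localhost assumes that your domain is localhost, and the sample payload is hand-written for illustration, not real data from this stack.

```python
import json
from urllib.parse import urlencode


def instant_query_url(base_url: str, expr: str) -> str:
    """Build an instant-query URL for the Prometheus HTTP API."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': expr})}"


def vector_values(payload: str) -> dict[str, float]:
    """Map each series' instance label to its sample value.

    Expects the JSON body of a successful instant query returning a vector.
    """
    body = json.loads(payload)
    if body.get("status") != "success":
        raise ValueError("query did not succeed")
    values = {}
    for series in body["data"]["result"]:
        instance = series["metric"].get("instance", "")
        _timestamp, value = series["value"]  # the value arrives as a string
        values[instance] = float(value)
    return values


if __name__ == "__main__":
    print(instant_query_url("http://prometheus.localhost", "node_load15"))

    # Hand-written sample in the API's response shape, not real data.
    sample = json.dumps({
        "status": "success",
        "data": {
            "resultType": "vector",
            "result": [
                {
                    "metric": {"__name__": "node_load15", "instance": "node-1:9100"},
                    "value": [1700000000, "0.42"],
                }
            ],
        },
    })
    print(vector_values(sample))
```

To use it against a live stack you would fetch the URL with any HTTP client and pass the response body to vector_values.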
Having a look at the alerts section:
Resources
For more information and configuration on the stack that we use, have a look at the wiki:
- https://github.com/bekkerstacks/monitoring-cpang/wiki
The GitHub repository:
- https://github.com/bekkerstacks/monitoring-cpang
Thank You
Let me know what you think. If you liked my content, feel free to check out more of it on ruan.dev or follow me on Twitter at @ruanbekker.