Let's take this scenario: your database is getting hammered with requests and load builds up over time. We would like to place a caching layer in front of the database so that repeated reads are served from the cache, which reduces traffic to the database and improves the performance of our application.
The Scenario:
Our scenario will be very simple for this demonstration:
- The database will be SQLite, holding product information (product_name, product_description)
- The caching layer will be Memcached
- Our client will be written in Python. It checks whether the product name is in the cache; if not, a GET_MISS is returned, the data is fetched from the database, returned to the client and saved to the cache
- The next time the item is read, a GET_HIT is received and the item is delivered to the client directly from the cache
SQL Database:
As mentioned, we will be using SQLite for this demonstration.
Create the table and populate it with some very basic data:
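A minimal sketch of this step using Python's built-in sqlite3 module. The column names come from the scenario above; the file name products.db, the table name products and the sample rows are assumptions made for illustration.

```python
import sqlite3

# Hypothetical sample rows; only "guitar" is named in the demo later on.
SAMPLE_PRODUCTS = [
    ("guitar", "a six-string acoustic guitar"),
    ("drums", "a five-piece drum kit"),
]


def init_db(path="products.db"):
    """Create the products table and load the sample rows (assumed names)."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS products ("
        "product_name TEXT PRIMARY KEY, "
        "product_description TEXT NOT NULL)"
    )
    # INSERT OR REPLACE keeps the function safe to run more than once.
    conn.executemany(
        "INSERT OR REPLACE INTO products VALUES (?, ?)", SAMPLE_PRODUCTS
    )
    conn.commit()
    return conn
```

Calling init_db() writes products.db in the current directory; init_db(":memory:") gives a throwaway database that is handy for experiments.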
Read all the data from the table:
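Reading the table back could look roughly like this. It is a sketch under the same assumptions (a table called products with the two columns from the scenario), and it builds a throwaway in-memory database so that it runs on its own.

```python
import sqlite3


def read_all_products(conn):
    """Return every (product_name, product_description) row, sorted by name."""
    cursor = conn.execute(
        "SELECT product_name, product_description "
        "FROM products ORDER BY product_name"
    )
    return cursor.fetchall()


# Throwaway in-memory database with one assumed sample row.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products (product_name TEXT, product_description TEXT)"
)
conn.execute(
    "INSERT INTO products VALUES ('guitar', 'a six-string acoustic guitar')"
)

for name, description in read_all_products(conn):
    print(name, "|", description)
```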
Run a Memcached Container:
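We will use Docker to run a Memcached container on our workstation, for example with a command along the lines of docker run -d -p 11211:11211 memcached (the official image, publishing Memcached's default port 11211 to the host).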
Our Application Code:
I will use pymemcache as our client library. Install it with pip, for example: pip install pymemcache
The application code itself is written in Python.
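A minimal sketch of what such application code might look like, not a definitive implementation. The function names, the products table and the (description, source) return value are assumptions for illustration. The cache object only needs get and set, which pymemcache's Client provides; the client is imported lazily inside connect_memcached so the rest of the code can be tried without a running Memcached.

```python
import sqlite3


def get_description_from_db(conn, product_name):
    """Query SQLite for the description; returns None for unknown products."""
    row = conn.execute(
        "SELECT product_description FROM products WHERE product_name = ?",
        (product_name,),
    ).fetchone()
    return row[0] if row else None


def get_product(cache, conn, product_name):
    """Cache-aside lookup: try the cache first, fall back to the database."""
    cached = cache.get(product_name)
    if cached is not None:
        # Cache hit: the value was stored as bytes, so decode it.
        return cached.decode("utf-8"), "cache"
    description = get_description_from_db(conn, product_name)
    if description is not None:
        # Cache miss: store the value so the next request is a hit.
        cache.set(product_name, description.encode("utf-8"))
    return description, "database"


def connect_memcached(host="localhost", port=11211):
    """Build a pymemcache client for the local container (assumed address)."""
    from pymemcache.client.base import Client

    return Client((host, port))
```

With the container running, get_product(connect_memcached(), sqlite3.connect("products.db"), "guitar") would perform one lookup. Memcached keys may not contain whitespace, so product names with spaces would need to be normalized before being used as keys.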
Explanation:
- We have a function that takes the product name as an argument, queries the database and returns the description of that product
- We make a get operation to Memcached; if nothing is returned, we know the item does not exist in our cache
- We then call our function to get the data from the database and return it directly to our client, and
- Save it to the cache in Memcached so that the next time the same product is queried, it is delivered directly from the cache
The Demo:
Our product name is guitar. Let's request the product for the first time, when Memcached won't have the item in its cache yet.
On that first request the item was delivered from the database and saved to the cache. Calling the same product again returns it directly from the cache.
When our cache instance gets rebooted, we lose the data that is in the cache, but since the source of truth is our database, data is re-added to the cache as it is requested. That is one good reason not to rely on a cache service as your primary data source.
What if the product we request is in neither our cache nor our database, say the product tree?
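The cases from this demo (first request, repeat request, a cache restart and an unknown product such as tree) can be reproduced without a running Memcached. The sketch below uses DictCache, a hypothetical in-memory stand-in that is not part of pymemcache, together with an assumed products table. How a missing product is handled is also an assumption: here the lookup returns nothing and does not cache the miss.

```python
import sqlite3


class DictCache:
    """In-memory stand-in with the get/set subset used here (not Memcached)."""

    def __init__(self):
        self._items = {}

    def get(self, key):
        return self._items.get(key)

    def set(self, key, value):
        self._items[key] = value

    def flush_all(self):
        self._items.clear()


def lookup(cache, conn, product_name):
    """Return (description, source) using the cache-aside flow."""
    cached = cache.get(product_name)
    if cached is not None:
        return cached, "cache"
    row = conn.execute(
        "SELECT product_description FROM products WHERE product_name = ?",
        (product_name,),
    ).fetchone()
    if row is None:
        return None, "not found"  # nothing is cached for unknown products
    cache.set(product_name, row[0])
    return row[0], "database"


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products (product_name TEXT, product_description TEXT)"
)
conn.execute(
    "INSERT INTO products VALUES ('guitar', 'a six-string acoustic guitar')"
)
cache = DictCache()

results = [
    lookup(cache, conn, "guitar")[1],  # first request: miss, read from database
    lookup(cache, conn, "guitar")[1],  # second request: hit
]
cache.flush_all()  # simulate the cache instance rebooting
results.append(lookup(cache, conn, "guitar")[1])  # repopulated from database
results.append(lookup(cache, conn, "tree")[1])  # in neither store
print(results)  # ['database', 'cache', 'database', 'not found']
```

Not caching the miss keeps the cache free of entries for products that do not exist, at the cost of a database query on every request for an unknown name.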
This was a really simple scenario, but when working with massive amounts of data, caching can bring a significant performance benefit.