When using Lambda and DynamoDB, you can use global variables to gain performance when your data in DynamoDB does not get updated that often and you would like to use caching to avoid an API call to DynamoDB every time your Lambda function gets invoked.
If every invocation needs to be as accurate as your source of truth, which is DynamoDB, you can use an external service like Redis or Memcached and let your application logic handle the caching.
But in this case we just want a simple piece of code that keeps its state for as long as the function keeps running on the same underlying container. I am not 100% sure, but I have seen data stay cached for up to 60 minutes. This can be a total mess when your data gets updated regularly; in that case I would make all my calls inside functions, as global variables keep their state for some time.
Example Function:
This function gets data from DynamoDB, iterates through a small dataset (10 items), and appends each group name to a list, which is the value of the groups key inside my dictionary.
Because of my global variable mydata, the first invocation results in an API call to DynamoDB, as the length of mydata["groups"] is 0. On the second invocation the data already exists inside the global variable, so I return it directly from the variable.
import boto3, json

client = boto3.resource('dynamodb', region_name='eu-west-1')
tbl = client.Table('my-dynamo-table')

mydata = {}
mydata["groups"] = []

def lambda_handler(event, context):
    if len(mydata["groups"]) == 0:
        # data is not cached, make call to dynamo
        data = tbl.scan()
        group_data = data['Items']
        for group in group_data:
            mydata["groups"].append(group['name'])
        return mydata
    else:
        # return cached content
        return mydata
Results of my Invocations:
The first call that I made:
The second call that I made:
If you need a small layer of caching to improve your latency, this approach can be used. But if your data needs to be accurate on every call, rather look into a different approach, such as an external caching service.
Resources:
Take advantage of Execution Context reuse to improve the performance of your function:
“Make sure any externalized configuration or dependencies that your code retrieves are stored and referenced locally after initial execution. Limit the re-initialization of variables/objects on every invocation. Instead use static initialization/constructor, global/static variables and singletons. Keep alive and reuse connections (HTTP, database, etc.) that were established during a previous invocation.”
import smtplib
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

mail_from = 'Ruan Bekker <ruan@ruanbekker.com>'
mail_to = 'Ruan Bekker <xxxx@gmail.com>'

msg = MIMEMultipart()
msg['From'] = mail_from
msg['To'] = mail_to
msg['Subject'] = 'Sending mails with Python'

mail_body = """
Hey,

This is a test.

Regards,\nRuan
"""
msg.attach(MIMEText(mail_body))

try:
    # connect over SSL and authenticate with the SendGrid API key
    server = smtplib.SMTP_SSL('smtp.sendgrid.net', 465)
    server.ehlo()
    server.login('apikey', 'your-api-key')
    server.sendmail(mail_from, mail_to, msg.as_string())
    server.close()
    print("mail sent")
except Exception as e:
    # report the reason instead of silently swallowing it
    print("issue: {}".format(e))
When I ran the code, I received the mail, and when you inspect the headers you can see that the mail came via sendgrid:
After the Access Policy has been updated, the Elasticsearch Domain Status will show Active
Testing from EC2 using IAM Instance Profile:
Launch an EC2 instance with the IAM role, e.g. es-role. Then, using Python, we will make a request to our Elasticsearch domain with boto3, aws4auth and the native Elasticsearch client for Python, authenticating via our IAM role, for which we get the temporary credentials from boto3.Session.
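A minimal sketch of such a request could look like the following. It assumes the `boto3`, `requests-aws4auth` and `elasticsearch` (7.x or earlier, where `RequestsHttpConnection` is available) packages are installed. The function name `build_es_client` is hypothetical, and the host and region are placeholders that you would replace with your own domain endpoint.

```python
def build_es_client(host, region):
    """Return an Elasticsearch client that signs requests with the instance role."""
    # Imports live inside the function so the module loads without the packages.
    import boto3
    from elasticsearch import Elasticsearch, RequestsHttpConnection
    from requests_aws4auth import AWS4Auth

    # On EC2, boto3 resolves temporary credentials from the instance profile.
    credentials = boto3.Session().get_credentials()
    awsauth = AWS4Auth(
        credentials.access_key,
        credentials.secret_key,
        region,
        "es",  # service name used for request signing
        session_token=credentials.token,
    )
    return Elasticsearch(
        hosts=[{"host": host, "port": 443}],
        http_auth=awsauth,
        use_ssl=True,
        verify_certs=True,
        connection_class=RequestsHttpConnection,
    )


# Usage on the EC2 instance (placeholder values):
#   es = build_es_client("<your-domain-endpoint>", "eu-west-1")
#   print(es.info())
```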
I wanted to get metadata from my other blog, sysadmins.co.za, such as each post’s title, link and tags, using the RSS link. I stumbled upon feedparser, which I will use to scrape the details of all the posts from the link and append them to a list, which I can then ingest into a database or something like that.
Installing Dependencies:
Install feedparser and requests:
$ pip install feedparser requests
The Python Code:
I’m not too sure at this point how to get pagination going, so I’ve set a range of pages to check. If a status code of 200 is received, the script checks whether the title is already in the list that I defined, and if not, appends it to the list.
At the end of the loop, the script will return the list that was defined, which will provide the info mentioned earlier:
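The loop described above can be sketched roughly as follows. This is not the original script: `collect_posts`, `fake_fetch_page` and the sample posts are hypothetical, and the fetch step is passed in as a function so that the example runs without network access. A set is used next to the list to track which titles were already seen.

```python
def collect_posts(fetch_page, pages):
    """Walk a range of pages and collect each post once, keyed on its title."""
    posts = []
    seen_titles = set()
    for page in pages:
        status_code, entries = fetch_page(page)
        if status_code != 200:
            continue  # page does not exist, skip it
        for entry in entries:
            if entry["title"] in seen_titles:
                continue  # already collected from an earlier page
            seen_titles.add(entry["title"])
            posts.append({
                "title": entry["title"],
                "link": entry["link"],
                "tags": list(entry["tags"]),
            })
    return posts


# Made-up sample data standing in for the parsed feed pages.
SAMPLE_PAGES = {
    1: [
        {"title": "Post A", "link": "https://example.com/a", "tags": ["aws"]},
        {"title": "Post B", "link": "https://example.com/b", "tags": ["python"]},
    ],
    2: [
        {"title": "Post B", "link": "https://example.com/b", "tags": ["python"]},
        {"title": "Post C", "link": "https://example.com/c", "tags": ["docker"]},
    ],
}


def fake_fetch_page(page):
    """Return (status_code, entries) the way a real fetch step would."""
    if page not in SAMPLE_PAGES:
        return 404, []
    return 200, SAMPLE_PAGES[page]


posts = collect_posts(fake_fetch_page, range(1, 5))
for post in posts:
    print(post["title"], post["link"], post["tags"])
```

Against the real feed, the fetch function would presumably wrap `requests.get` for the page URL and hand the response text to `feedparser.parse`, mapping each entry's title, link and tags into the same dictionary shape.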
In this tutorial we will be using Amazon's DynamoDB (DynamoDB Local) to host a sample dataset consisting of music data that I retrieved from the iTunes API, and we will use the AWS CLI tools to interact with the data.
We will be doing the following:
Use Docker to provision a Local DynamoDB Server
Create a DynamoDB Table with a Hash and Range Key
List the Table
Create an Item in DynamoDB
Read an Item from DynamoDB
Read an Item from DynamoDB by specifying the details you would like to read
Batch Write multiple items to DynamoDB
Scan all your Items from DynamoDB
Query by Artist
Query by Artist and Song
Query all the Songs from an Artist starting with a specific letter
If you have an AWS account you can provision your table from there, but if you want to test it locally, you can provision a local DynamoDB server using Docker:
Now let's use the iTunes API to get a collection of some songs, which I will dump into a JSON file on GitHub. So now that we have a JSON file with a collection of songs from multiple artists, we can go ahead and write it into our table using the BatchWriteItem call:
This can be a very expensive call, as a Scan returns all the items from your table, and depending on the size of your table you could be throttled. But since we are using DynamoDB Local and only have 16 items in our table, we can do a scan to return all the items in our table:
$ aws dynamodb --endpoint-url http://localhost:8000 query --select ALL_ATTRIBUTES \
  --table-name MusicCollection \
  --key-condition-expression "Artist = :a and begins_with(SongTitle, :t)" \
  --expression-attribute-values '{":a":{"S":"The Beatles"}, ":t": {"S": "h"}}'
{
    "Count": 2,
    "Items": [
        {
            "Artist": {"S": "The Beatles"},
            "SongTitle": {"S": "Happy Day"},
            "AlbumTitle": {"S": "The Beatles 1967-1970 (The Blue Album)"}
        },
        {
            "Artist": {"S": "The Beatles"},
            "SongTitle": {"S": "Help!"},
            "AlbumTitle": {"S": "The Beatles Box Set"}
        }
    ],
    "ScannedCount": 2,
    "ConsumedCapacity": null
}
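For comparison, the same kind of query can be expressed from Python. The sketch below only builds the request parameters, so it runs without boto3 or a running DynamoDB Local. The helper name `build_song_prefix_query` is hypothetical, and the commented boto3 call assumes the local endpoint used above. Note that `begins_with` compares case-sensitively, so the prefix should be passed exactly as the titles are stored.

```python
import json


def build_song_prefix_query(artist, title_prefix, table_name="MusicCollection"):
    """Build Query parameters for all songs of an artist starting with a prefix."""
    return {
        "TableName": table_name,
        "Select": "ALL_ATTRIBUTES",
        "KeyConditionExpression": "Artist = :a and begins_with(SongTitle, :t)",
        "ExpressionAttributeValues": {
            ":a": {"S": artist},
            ":t": {"S": title_prefix},
        },
    }


params = build_song_prefix_query("The Beatles", "H")
print(json.dumps(params, indent=2))

# With boto3 installed and DynamoDB Local running, the call would look like:
#   import boto3
#   client = boto3.client("dynamodb", endpoint_url="http://localhost:8000",
#                         region_name="eu-west-1")
#   response = client.query(**params)
#   print(response["Items"])
```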
Our table's key schema consists of Artist (HASH) and SongTitle (RANGE), so we can only query based on those attributes. When you try to query on an attribute that is not part of the KeySchema, you will receive an exception:
$ aws dynamodb --endpoint-url http://localhost:8000 query --select ALL_ATTRIBUTES --table-name MusicCollection --key-condition-expression "Artist = :a and AlbumTitle = :t" --expression-attribute-values '{":a":{"S":"AC/DC"}, ":t": {"S": "Back in Black"}}'

An error occurred (ValidationException) when calling the Query operation: Query condition missed key schema element
So how do we query on an attribute that is not part of the KeySchema? Let’s say you want to query all the songs from an Artist on a specific Album.
Global Secondary Indexes:
Add Global Secondary Index, with the Attributes: Artist and AlbumTitle.
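As a rough sketch of what adding and then querying such an index could look like from Python, the helpers below build the parameters for an UpdateTable and a Query request. The index name `ArtistAlbumIndex`, the helper names and the throughput values are assumptions for illustration, not values from this post. The commented calls assume boto3 and the local endpoint.

```python
INDEX_NAME = "ArtistAlbumIndex"  # hypothetical index name


def build_gsi_update(table_name="MusicCollection"):
    """Build UpdateTable parameters that create a GSI on Artist and AlbumTitle."""
    return {
        "TableName": table_name,
        # Every attribute used in the index key schema must be defined here.
        "AttributeDefinitions": [
            {"AttributeName": "Artist", "AttributeType": "S"},
            {"AttributeName": "AlbumTitle", "AttributeType": "S"},
        ],
        "GlobalSecondaryIndexUpdates": [
            {
                "Create": {
                    "IndexName": INDEX_NAME,
                    "KeySchema": [
                        {"AttributeName": "Artist", "KeyType": "HASH"},
                        {"AttributeName": "AlbumTitle", "KeyType": "RANGE"},
                    ],
                    "Projection": {"ProjectionType": "ALL"},
                    # Assumed capacity values for a provisioned table.
                    "ProvisionedThroughput": {
                        "ReadCapacityUnits": 1,
                        "WriteCapacityUnits": 1,
                    },
                }
            }
        ],
    }


def build_album_query(artist, album, table_name="MusicCollection"):
    """Build Query parameters that target the index instead of the table keys."""
    return {
        "TableName": table_name,
        "IndexName": INDEX_NAME,
        "KeyConditionExpression": "Artist = :a and AlbumTitle = :t",
        "ExpressionAttributeValues": {
            ":a": {"S": artist},
            ":t": {"S": album},
        },
    }


# With boto3 installed and DynamoDB Local running:
#   import boto3
#   client = boto3.client("dynamodb", endpoint_url="http://localhost:8000",
#                         region_name="eu-west-1")
#   client.update_table(**build_gsi_update())
#   client.query(**build_album_query("AC/DC", "Back in Black"))
```

The query uses the same key condition that failed earlier. The difference is the `IndexName` parameter, which makes DynamoDB evaluate the condition against the index's key schema.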
If you have the irb output, you should be good to go.
Strings and Integers
A string is represented as one or more characters enclosed within quotation marks. When you enter a string, irb returns it to the shell:
irb(main):001:0> "hello"
=> "hello"
Integers are written without quotation marks; anything within quotation marks is read by Ruby as a string. So for an integer, let's provide Ruby with a number, and the number will be returned to the shell:
irb(main):002:0> 1
=> 1
Using mathematical symbols like the + will either sum the two values when they are integers, or concatenate when they are strings.
Let’s start with strings: we will add the strings hello and world
irb(main):003:0> "hello" + "world"
=> "helloworld"
Now let’s add two numbers together, 10 and 20:
irb(main):004:0> 10 + 20
=> 30
As you can see, it did a calculation on the two numbers, as they were treated as integers. But what happens when we add them as strings?
irb(main):005:0> "10" + "20"
=> "1020"
Adding them as strings will concatenate them.
String Methods
Ruby’s strings have many built-in methods, which makes manipulating data convenient. Let me go through a couple that I am working with:
This is a command line approach to creating a Java web app for Payara, which takes war files, and we will be using it in conjunction with Spring Boot and Apache Maven.
In this post we will set up a Java Hello World web app, using Maven and Spring Boot on Ubuntu 16. I will create all the needed files in this tutorial, but you can head to start.spring.io to generate the zip for you.
$ java -version
java version "1.8.0_181"
Java(TM) SE Runtime Environment (build 1.8.0_181-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.181-b13, mixed mode)
Install Apache Maven:
Maven is a build automation tool used primarily for Java projects. Let’s set up Maven:
package hello;

import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class HelloController {

    @RequestMapping("/")
    public String index() {
        return "This is the index!\n";
    }

    @RequestMapping("/hello")
    public String index2() {
        return "Hello, World!\n";
    }
}
Build and Compile:
This will download all the dependencies and build the jar file:
In this tutorial, we will set up a basic Ruby on Rails web app that consists of a /hello_world and a /status controller. The hello_world controller will return Hello, World and our /status controller will return an HTTP 204 No Content response code.
$ rails new first-app
$ cd first-app
$ rails server
Route Config
Our routes config:
$ cat config/routes.rb
Rails.application.routes.draw do
  # For details on the DSL available within this file, see http://guides.rubyonrails.org/routing.html
  get 'hello_world', to: 'hello_world#call'
  get 'status', to: 'status#call'
end
Controllers
Configure the hello_world controller:
$ cat app/controllers/hello_world_controller.rb
class HelloWorldController < ApplicationController
  def call
    render body: "Hello, World"
  end
end
Configure the status controller:
$ cat app/controllers/status_controller.rb
class StatusController < ApplicationController
  def call
    # respond with 204 No Content explicitly; a returned Rack-style array is ignored by Rails
    head :no_content
  end
end