Tutorial on DynamoDB Using Bash and the AWS CLI Tools to Interact With a Music Dataset
In this tutorial we will use Amazon's DynamoDB (DynamoDB Local) to host a sample dataset of music data that I retrieved from the iTunes API, and we will use the AWS CLI tools to interact with that data.
We will be doing the following:
Use Docker to provision a Local DynamoDB Server
Create a DynamoDB Table with a Hash and Range Key
List the Table
Create an Item in DynamoDB
Read an Item from DynamoDB
Read an Item from DynamoDB by specifying the details you would like to read
Batch Write multiple items to DynamoDB
Scan all your Items from DynamoDB
Query by Artist
Query by Artist and Song
Query all the Songs from an Artist starting with a specific letter
If you have an AWS account you can provision your table there, but if you want to test locally, you can provision a local DynamoDB server using Docker:
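As a minimal sketch of this step, the block below wraps the Docker and table setup commands in small shell functions. The image name `amazon/dynamodb-local`, the container name, the function names, and the throughput values are assumptions for illustration; the table name and key schema (Artist as HASH, SongTitle as RANGE) follow the rest of this tutorial.

```shell
# Assumed values: endpoint port, container name, and throughput are illustrative.
ENDPOINT="http://localhost:8000"
TABLE="MusicCollection"

# Start DynamoDB Local in the background and publish it on port 8000.
start_dynamodb_local() {
  docker run -d --name dynamodb-local -p 8000:8000 amazon/dynamodb-local
}

# Create the table with Artist as the HASH key and SongTitle as the RANGE key.
create_music_table() {
  aws dynamodb create-table \
    --endpoint-url "$ENDPOINT" \
    --table-name "$TABLE" \
    --attribute-definitions \
        AttributeName=Artist,AttributeType=S \
        AttributeName=SongTitle,AttributeType=S \
    --key-schema \
        AttributeName=Artist,KeyType=HASH \
        AttributeName=SongTitle,KeyType=RANGE \
    --provisioned-throughput ReadCapacityUnits=5,WriteCapacityUnits=5
}

# List the tables known to the local server.
list_tables() {
  aws dynamodb list-tables --endpoint-url "$ENDPOINT"
}
```

Running `start_dynamodb_local`, then `create_music_table`, then `list_tables` should show the new table. The AWS CLI still expects a region and credentials to be configured; DynamoDB Local does not validate them, so placeholder values are generally enough.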
Now let's use the iTunes API to get a collection of songs, which I will dump into a JSON file on GitHub. Now that we have a JSON file with a collection of songs from multiple artists, we can write it into our table using the BatchWriteItem call:
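The shape of a BatchWriteItem request can be sketched as follows. This is not the full dataset: the file name `batch-write-sample.json` and the function name are hypothetical, and the two items are the Beatles entries that appear in the query output later in this tutorial.

```shell
ENDPOINT="http://localhost:8000"
REQUEST_FILE="batch-write-sample.json"   # hypothetical file name

# The request maps the table name to a list of PutRequest entries.
cat > "$REQUEST_FILE" <<'EOF'
{
  "MusicCollection": [
    {"PutRequest": {"Item": {"Artist": {"S": "The Beatles"}, "SongTitle": {"S": "Help!"}, "AlbumTitle": {"S": "The Beatles Box Set"}}}},
    {"PutRequest": {"Item": {"Artist": {"S": "The Beatles"}, "SongTitle": {"S": "Happy Day"}, "AlbumTitle": {"S": "The Beatles 1967-1970 (The Blue Album)"}}}}
  ]
}
EOF

# Send the request file to the local server.
batch_write() {
  aws dynamodb batch-write-item \
    --endpoint-url "$ENDPOINT" \
    --request-items "file://$REQUEST_FILE"
}
```

A single BatchWriteItem call accepts at most 25 put or delete requests, so a 16-item dataset fits in one request; larger files would need to be split.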
A Scan can be a very expensive call: it returns all the items from your table, and depending on the size of your table, you could be throttled. But since we are using DynamoDB Local and only have 16 items in our table, we can do a scan to return all the items in our table:
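A scan against the local table might look like the sketch below. The function names are illustrative; the second function uses `--select COUNT`, which returns only the item count rather than the items themselves.

```shell
ENDPOINT="http://localhost:8000"
TABLE="MusicCollection"

# Return every item in the table (fine for 16 items, costly on large tables).
scan_all() {
  aws dynamodb scan \
    --endpoint-url "$ENDPOINT" \
    --table-name "$TABLE"
}

# Return only the number of items instead of the items themselves.
count_items() {
  aws dynamodb scan \
    --endpoint-url "$ENDPOINT" \
    --table-name "$TABLE" \
    --select COUNT
}
```

With the dataset loaded, `count_items` is a quick way to confirm that all items were written before running the queries that follow.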
To query all the songs from an artist that start with a specific letter, combine an equality condition on the hash key with begins_with on the range key. The comparison is case-sensitive:

$ aws dynamodb --endpoint-url http://localhost:8000 query --select ALL_ATTRIBUTES \
    --table-name MusicCollection \
    --key-condition-expression "Artist = :a and begins_with(SongTitle, :t)" \
    --expression-attribute-values '{":a":{"S":"The Beatles"}, ":t": {"S": "H"}}'
{
    "Count": 2,
    "Items": [
        {
            "Artist": {"S": "The Beatles"},
            "SongTitle": {"S": "Happy Day"},
            "AlbumTitle": {"S": "The Beatles 1967-1970 (The Blue Album)"}
        },
        {
            "Artist": {"S": "The Beatles"},
            "SongTitle": {"S": "Help!"},
            "AlbumTitle": {"S": "The Beatles Box Set"}
        }
    ],
    "ScannedCount": 2,
    "ConsumedCapacity": null
}
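The two simpler queries from the outline, by artist and by artist and song, follow the same pattern. The sketch below is one way to write them; the function names are illustrative and the values are taken from the output above.

```shell
ENDPOINT="http://localhost:8000"
TABLE="MusicCollection"

# Query by the hash key only: every song by one artist.
query_by_artist() {
  aws dynamodb query \
    --endpoint-url "$ENDPOINT" \
    --table-name "$TABLE" \
    --key-condition-expression "Artist = :a" \
    --expression-attribute-values '{":a":{"S":"The Beatles"}}'
}

# Query by hash and range key: one specific song by one artist.
query_by_artist_and_song() {
  aws dynamodb query \
    --endpoint-url "$ENDPOINT" \
    --table-name "$TABLE" \
    --key-condition-expression "Artist = :a and SongTitle = :t" \
    --expression-attribute-values '{":a":{"S":"The Beatles"}, ":t":{"S":"Help!"}}'
}
```

Both expressions reference only attributes from the table's key schema, which is what the Query operation requires.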
Our table's key schema consists of Artist (HASH) and SongTitle (RANGE), so we can only query on those attributes. If you try to query on an attribute that is not part of the KeySchema, you will receive an exception:
$ aws dynamodb --endpoint-url http://localhost:8000 query --select ALL_ATTRIBUTES \
    --table-name MusicCollection \
    --key-condition-expression "Artist = :a and AlbumTitle = :t" \
    --expression-attribute-values '{":a":{"S":"AC/DC"}, ":t": {"S": "Back in Black"}}'

An error occurred (ValidationException) when calling the Query operation: Query condition missed key schema element
So how do we query on an attribute that is not part of the KeySchema? Let’s say you want to query all the songs from an artist on a specific album.
Global Secondary Indexes:
Add a Global Secondary Index with the attributes Artist and AlbumTitle.
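One way to add such an index is with `update-table`, sketched below. The index name `ArtistAlbumIndex`, the file name `gsi-create.json`, the function names, the ALL projection, and the throughput values are all assumptions for illustration, not values from the original dataset.

```shell
ENDPOINT="http://localhost:8000"
TABLE="MusicCollection"
INDEX="ArtistAlbumIndex"      # hypothetical index name
GSI_FILE="gsi-create.json"    # hypothetical file name

# Index definition: Artist as HASH key, AlbumTitle as RANGE key.
cat > "$GSI_FILE" <<'EOF'
[
  {"Create": {
    "IndexName": "ArtistAlbumIndex",
    "KeySchema": [
      {"AttributeName": "Artist", "KeyType": "HASH"},
      {"AttributeName": "AlbumTitle", "KeyType": "RANGE"}
    ],
    "Projection": {"ProjectionType": "ALL"},
    "ProvisionedThroughput": {"ReadCapacityUnits": 5, "WriteCapacityUnits": 5}
  }}
]
EOF

# Add the index to the existing table; its key attributes must be declared.
add_album_index() {
  aws dynamodb update-table \
    --endpoint-url "$ENDPOINT" \
    --table-name "$TABLE" \
    --attribute-definitions \
        AttributeName=Artist,AttributeType=S \
        AttributeName=AlbumTitle,AttributeType=S \
    --global-secondary-index-updates "file://$GSI_FILE"
}

# Repeat the query that failed earlier, this time against the index.
query_by_album() {
  aws dynamodb query \
    --endpoint-url "$ENDPOINT" \
    --table-name "$TABLE" \
    --index-name "$INDEX" \
    --key-condition-expression "Artist = :a and AlbumTitle = :t" \
    --expression-attribute-values '{":a":{"S":"AC/DC"}, ":t":{"S":"Back in Black"}}'
}
```

After `add_album_index` completes, `query_by_album` uses `--index-name` so that the key condition is checked against the index's key schema instead of the table's.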