How to use DynamoDB with Transposit

DynamoDB is a powerful data store for your Transposit application.

Dan Moore
Oct 11th, 2019
Share

What is DynamoDB

DynamoDB is a NoSQL database with great performance and near-zero administration. It’s great for applications that need speed and have simple relations between the entities in the system.

You can create tables in DynamoDB, which have a primary key and any number of other attributes. You can search by primary key. If you need to search by any other attribute, you can either scan the entire table (if it is small) or create a secondary index. Attributes in DynamoDB can have data types.

There is plenty more DynamoDB documentation at the AWS site.

Why use it with Transposit

Transposit applications don’t have a way to store application data, beyond a small temporary data store (the stash). So if your application needs to store state, you need a third party storage solution. The right solution depends on the application characteristics, your skills and your existing application infrastructure. DynamoDB is a good fit when you have an obvious primary key and want to be able to store and query large amounts of data quickly.

Since Transposit lets you easily create multiple connectors to the same service, you can use Transposit to move data between different DynamoDB tables in different accounts or regions.

Tips and tricks

In the course of building Transposit applications, I learned some tips and tricks for interacting with DynamoDB that I wanted to share.

Datatypes

Everything is a string in DynamoDB requests (the value at rest can have a data type enforced). However, when you are creating an operation in Transposit, you can specify parameters. These can have a type (string, number, boolean, etc). If you have number or boolean that you want to store in DynamoDB, it is tempting to specify the parameter type. Don’t succumb to the temptation. All Transposit parameters that will end up in DynamoDB should be specified as strings. Instead, use DynamoDB datatypes. Here’s an example Transposit SQL query, updating an order:

SELECT * FROM aws_dynamodb.update_item
WHERE $body=(SELECT {
 'TableName': 'orders',
 'UpdateExpression': 'set orderstatus = :os, city = :city',
 'ExpressionAttributeValues': {
 ':os': {'N': '1'},
 ':city': {'S': 'Little Rock'}
 },
 'Key': {
 'OrderID': { 'N': '200' },
 }
})

You can see that order status is stored in DynamoDB as a number (the N key), but that it is passed in as a string '1'.

Because all data returned from DynamoDB is a string (even that with a different type), you’ll have to do some dereferencing to access values. For instance, here’s an SQL operation:

SELECT OrderStatus as orderstatus
FROM aws_dynamodb.query
 WHERE $body=(SELECT {
 'ProjectionExpression' : 'userid,orderstatus',
 'ConsistentRead' : true,
 'TableName' : 'orders',
 'FilterExpression': 'orderstatus <> :os',
 'ExpressionAttributeValues': {
 ':os': {'N': '999'},
 ':userid' : {'N' : '111'}
 },
 'KeyConditionExpression': 'userid = :userid'
 })

And here’s the results:

[
 {
 "orderstatus": {
 "N": "1"
 }
 }
]

So we want to rewrite our SQL to be:

SELECT OrderStatus.N as orderstatus
FROM aws_dynamodb.query
 WHERE $body=(SELECT {
 'ProjectionExpression' : 'userid,orderstatus',
 'ConsistentRead' : true,
 'TableName' : 'orders',
 'FilterExpression': 'orderstatus <> :os',
 'ExpressionAttributeValues': {
 ':os': {'N': '999'},
 ':userid' : {'N' : '111'}
 },
 'KeyConditionExpression': 'userid = :userid'
 })

which gives us the much nicer result of:

[
 {
 "orderstatus": "1"
 }
]

Blank strings not allowed

Consider this update statement. We have three lines of an address that we want to store for an order. We call this like so: api.query([sql as below],{addressline1: '123 Stout', addressline2: 'Suite 127', city: 'Jackson'});.

SELECT * FROM aws_dynamodb.update_item
WHERE $body=(SELECT {
 'TableName' : 'orders',
 'UpdateExpression': 'set addressline1 = :addressline1, addressline2 = :addressline2, city = :city',
 'ExpressionAttributeValues': {
 ':addressline1': {'S': @addressline1},
 ':addressline2': {'S': @addressline2},
 ':city': {'S': @city}
 },
 'Key': {
 'OrderID': {'N': '200' },
 }
})

@addressline1, @addressline2, and @city are all parameters that are passed into this operation. (This buys us maintainability, handles single quotes in values, and protection against SQL injection.) This update succeeds. However, the next order doesn’t have a suite number, and so the addressline2 parameter is the empty string: api.query([sql as below],{addressline1: '123 Stout', addressline2: '', city: 'Jackson'});.

SELECT * FROM aws_dynamodb.update_item
WHERE $body=(SELECT {
 'TableName': 'orders',
 'UpdateExpression': 'set addressline1 = :addressline1, addressline2 = :addressline2, city = :city',
 'ExpressionAttributeValues': {
 ':addressline1': {'S': @addressline1},
 ':addressline2': {'S': @addressline2},
 ':city': {'S': @city}
 },
 'Key': {
 'OrderID': { 'N': '201' },
 }
})

This fails, because DynamoDB doesn’t allow empty strings for attribute values. You can have omitted attributes or you can remove an attribute, but you can’t have an empty string.

However, SQL operations in Transposit don’t have the ability to do empty string checks for parameters that are passed into them. You have two options:

  • provide a default value (perhaps ’null’ or ’n/a’ in this case) and then check for it whenever you are querying the table. This is suboptimal.
  • dynamically create the SQL and exclude any attributes that are empty.

To use the second solution, I build up an object and then removed empty or null strings. This code works as long as the keys of your sql_params object are valid JavaScript identifiers.

// built the object with the attribute names and values
const sql_params = {
 OrderId: 201
 addressline1: '100 S Stout St',
 addressline2: '',
 city: 'Jackson'
}

// remove keys with empty values
Object.keys(sql_params).forEach((key) => ((sql_params[key] === null) || (sql_params[key] === "")) && delete sql_params[key]);

// build up the put statement, including data types
let value_params = {}
let sql = "SELECT * FROM aws_dynamodb.put_item WHERE $body=(SELECT { Item: {";
Object.keys(sql_params).forEach(function(key) {
 value_params.key = sql_params[key];
 let val_type = "S";
 if (typeof sql_params[key] === "boolean") {
 val_type = 'BOOL';
 } else if (!isNaN(sql_params[key])) {
 val_type = 'N';
 }
 sql = sql + " " + key + ": { " + val_type + ": @" + key + "},"
});
sql = sql + "}, TableName : 'orders' })";

// execute the statement
const result = api.query(sql, value_params);

Time to live

Time to live values (TTLs) are a neat feature of DynamoDB. If you need to have rows in a DynamoDB table just disappear, you specify an attribute as the TTL attribute. Then, put a UTC timestamp into that attribute, and DynamoDB will automatically remove that data once the timestamp is in the past. Here’s how to add that timestamp:

const ten_minutes_in_seconds = 600;
const exptime = Math.round(Date.now() / 1000) + ten_minutes_in_seconds;
const exptime_s = exptime+"";
api.run("this.put_sessions",{id: 123, email: 'test@example.com', exptime: exptime});

In put_sessions:

SELECT * FROM aws_dynamodb.put_item
 WHERE $body=(SELECT {
 'TableName': 'sessions',
 'Item' : {
 'SessionID' : { 'N' : @id },
 'Email' : { 'S' : @email },
 'SessionTTL' : { 'N': @exptime }
 }
 })

If someone didn’t have a row in the sessions table, they needed to login again. Sweet. Except, TTLs take up to 48 hours to expire. My experience is that they expire more quickly than that. However, if you need to be absolutely sure, you can check the value of the SessionTTL and make sure it is in the future, rather than just rely on existence of the session.

DynamoDB DDL

Transposit shines for data manipulation. While you can create tables and indexes or do other DynamoDB configuration (setting up TTL, streams) in Transposit using the APIs, it’s probably easier and/or more maintainable to use the AWS console (for one off situations) or whatever infrastructure as code solution (Terraform or Cloudformation) is used in your organization.

However, if you want to create a DynamoDB table every time a new client was added (perhaps for a multi tenant SaaS solution) you could have use a Transposit operation (make sure to protect that endpoint, however!).

Conditional requests

Conditional requests allow you to fail if the state of your data has changed. This is often the case when multiple clients access the same data.

This update will fail if the order status isn’t 0 when the update runs. This makes sure that if there is another client editing the order status, we don’t overwrite their changes.

SELECT * FROM aws_dynamodb.update_item
WHERE $body=(SELECT {
 'TableName': 'orders',
 'UpdateExpression': 'set orderstatus = :os, AddressCity = :city',
 'ExpressionAttributeValues': {
 ':os': {'N': '1'},
 ':prevos': {'N': '0'},
 ':city': {'S': 'Little Rock'}
 },
 'Key': {
 'userid': { 'N': '111' },
 },
 'ConditionExpression': 'orderstatus = :prevos'
})

Note that if you do this, you’ll want to make sure to wrap the SQL calls in an try/catch block and handle the error as you see fit.

try {
 // run the query
} catch (e) {
 // pull the latest data and try to re-process, or emit an error
}

Conclusion

Any application of a certain complexity will require storing state. DynamoDB is a flexible powerful data storage solution that you can use to build complex Transposit applications. Let Transposit power the logic and DynamoDB provide the storage.

Share