Friday, 25 November 2016

Spring Data JPA with a Hash & Range Key DynamoDB Table

DynamoDB is an attractive option for a quick and simple NoSQL database: it stores non-relational items consistently and safely without the need to set up hardware, software or clustering configuration. It scales to cope with large volumes of data, and it is quick and easy to set up a table and to read and write items from other AWS components.

Spring Data for DynamoDB


From an application point of view the obvious choice for access is the AWS SDK. For CRUD operations behind a front-end gateway such as a REST API, we want to abstract away the boilerplate of mapping a POJO to a table, which is the perfect use case for JPA. I was unaware of any JPA framework with built-in extensions for DynamoDB until I looked at Spring's implementation, Spring Data. There's a nice framework on GitHub, forked from Michael La'Velle's original work, which provides a DynamoDB extension with all the annotations I'd expect for mapping Java domain classes.

Mapping Items to the Java Domain Model


As with all database design, the first step is to define a key that uniquely identifies each item in the table. In DynamoDB this is known as the Partition or Hash Key. Often the items we are dealing with have no single natural identifier, so we need to specify a second identifier, the Sort or Range Key, which is unique within the table when combined with the Partition Key.

When annotating a domain type we are expected to mark a field as an ID. With Spring Data we only have the option of using the org.springframework.data.annotation.Id annotation. This must be at field level and only one field can be annotated. Therefore, with our compound key model, we are forced to create an identifier class which is embedded into the domain type. This is a common pattern with JPA and relational data models. However, it has knock-on effects on the DynamoDB persistence model and on how the domain object is serialized to and deserialized from a JSON document.

In this example I will map a Widget domain type with both Partition and Sort keys. In order to ensure uniqueness by compound key I will create a WidgetId class which is embedded into Widget. However, I don't want this persisted into the store as an object. The keys should be defined as simple number and String types within the JSON schema. The JSON schema I will map is shown below.

Rather than map other domain model attributes, I simply want the Widget class to contain a String of JSON data which is dumped directly into a single DynamoDB attribute. In reality I would probably map each attribute and tightly constrain the Java model to the JSON schema, but this approach gives increased flexibility to store whatever data I want. Even so, the lack of constraint increases the risk of persisting bad or invalid data. When persisted into DynamoDB a Widget item looks like this



I need to tell the underlying Jackson mapper to treat the data String as raw JSON rather than deserialize it as a String. Here are the Widget and WidgetId classes annotated with everything Spring Data and Jackson need to serialize and persist single instances.

The annotated getters and setters for the Hash and Range key values essentially act as proxies to the encapsulated WidgetId instance, which is ignored by both DynamoDB and the Jackson serializer. The @Id annotation ensures Spring Data still uses this class as the embedded identifier, allowing us to utilize the compound key.
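The proxy arrangement described above can be sketched in plain Java. This is only an illustration under assumptions, not the original listing: the field names partitionKey, sortKey and data are hypothetical, the key types (a number and a String) are guessed from the description, and the annotations are shown as comments so that the sketch compiles without the Spring Data, AWS SDK or Jackson libraries on the classpath.

```java
import java.util.Objects;

// Compound identifier: Partition (Hash) key plus Sort (Range) key.
class WidgetId {
    private Long partitionKey;   // assumed to be a number in the table
    private String sortKey;      // assumed to be a String in the table

    public WidgetId() { }

    public WidgetId(Long partitionKey, String sortKey) {
        this.partitionKey = partitionKey;
        this.sortKey = sortKey;
    }

    // @DynamoDBHashKey would annotate this getter
    public Long getPartitionKey() { return partitionKey; }
    public void setPartitionKey(Long partitionKey) { this.partitionKey = partitionKey; }

    // @DynamoDBRangeKey would annotate this getter
    public String getSortKey() { return sortKey; }
    public void setSortKey(String sortKey) { this.sortKey = sortKey; }

    // Two ids are equal only when both parts of the compound key match.
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof WidgetId)) return false;
        WidgetId other = (WidgetId) o;
        return Objects.equals(partitionKey, other.partitionKey)
                && Objects.equals(sortKey, other.sortKey);
    }

    @Override
    public int hashCode() { return Objects.hash(partitionKey, sortKey); }
}

class Widget {
    // @Id (org.springframework.data.annotation.Id) would annotate this field
    private WidgetId widgetId;
    private String data;         // raw JSON held as a String

    // @DynamoDBIgnore and @JsonIgnore would annotate this getter, so the
    // embedded id is neither persisted nor serialized as a nested object.
    public WidgetId getWidgetId() { return widgetId; }
    public void setWidgetId(WidgetId widgetId) { this.widgetId = widgetId; }

    // @DynamoDBHashKey would annotate this getter; it delegates to the id.
    public Long getPartitionKey() {
        return widgetId == null ? null : widgetId.getPartitionKey();
    }
    public void setPartitionKey(Long partitionKey) { id().setPartitionKey(partitionKey); }

    // @DynamoDBRangeKey would annotate this getter; it delegates to the id.
    public String getSortKey() {
        return widgetId == null ? null : widgetId.getSortKey();
    }
    public void setSortKey(String sortKey) { id().setSortKey(sortKey); }

    // @JsonRawValue would annotate this getter so the String is written as JSON.
    public String getData() { return data; }
    public void setData(String data) { this.data = data; }

    // Lazily creates the embedded id the first time a key part is set.
    private WidgetId id() {
        if (widgetId == null) {
            widgetId = new WidgetId();
        }
        return widgetId;
    }
}
```

Setting the two key values through the Widget setters populates the embedded WidgetId, so the keys appear as flat attributes on the item while the framework still sees a single identifier object.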

CRUD Repository


I need to set up the configuration to access the DynamoDB Widgets table. Access control is via IAM, so I can either add a key pair from my IAM account to the application or grant the required role to the EC2 instance I'm running this app from. In this example I'm going to give it the access and secret key which, this being a Spring Boot application, I add as properties to the application.properties file on my classpath, along with the endpoint URL of the DynamoDB service.
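As a rough sketch of those three settings, the snippet below parses a properties fragment with the plain JDK and checks that each value is present. The property names (amazon.dynamodb.endpoint, amazon.aws.accesskey, amazon.aws.secretkey) and the DynamoDbSettings class are assumptions made for illustration; in the Spring Boot application the values would more likely be injected with @Value than loaded by hand.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical holder for the three values the application needs.
class DynamoDbSettings {
    final String endpoint;
    final String accessKey;
    final String secretKey;

    private DynamoDbSettings(String endpoint, String accessKey, String secretKey) {
        this.endpoint = endpoint;
        this.accessKey = accessKey;
        this.secretKey = secretKey;
    }

    // Parses text in application.properties format.
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // Property names are assumptions; use whatever names the application defines.
    static DynamoDbSettings from(Properties props) {
        return new DynamoDbSettings(
                required(props, "amazon.dynamodb.endpoint"),
                required(props, "amazon.aws.accesskey"),
                required(props, "amazon.aws.secretkey"));
    }

    // Fails fast with a clear message when a setting is absent or blank.
    private static String required(Properties props, String name) {
        String value = props.getProperty(name);
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalStateException("Missing property: " + name);
        }
        return value.trim();
    }
}
```

Failing at start-up when a value is missing is a design choice that avoids a less obvious authentication error on the first request. Real keys should not be committed to source control.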

Now to define a repository to handle the CRUD operations. The Spring Data framework provides the basic CRUD functionality straight out of the box, to the extent that no implementation code is required. In this example I also want to read all the Widgets with a given Partition key value; there may be multiple items because that key alone isn't unique. The interface, which extends the Spring Data org.springframework.data.repository.CrudRepository, simply needs another method defining. Spring Data will interpret this from its name, parameters and return type and formulate the query at runtime.
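To make the expected behaviour of that extra method concrete, here is a plain-Java stand-in. It is not Spring Data code: in the real application only the interface would be written and the framework would supply the implementation at runtime. The names StoredWidget, WidgetLookup, InMemoryWidgetLookup and findByPartitionKey are hypothetical, and the in-memory list merely imitates the table.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified item: compound key plus a blob of JSON data.
final class StoredWidget {
    final long partitionKey;
    final String sortKey;
    final String data;

    StoredWidget(long partitionKey, String sortKey, String data) {
        this.partitionKey = partitionKey;
        this.sortKey = sortKey;
        this.data = data;
    }
}

// Shape of the query method: name, parameter and return type describe
// "all widgets that share this Partition key".
interface WidgetLookup {
    List<StoredWidget> findByPartitionKey(long partitionKey);
}

// In-memory stand-in for the implementation Spring Data would generate.
class InMemoryWidgetLookup implements WidgetLookup {
    private final List<StoredWidget> items = new ArrayList<>();

    // Saving an item with an existing compound key replaces the old item.
    public void save(StoredWidget widget) {
        items.removeIf(w -> w.partitionKey == widget.partitionKey
                && w.sortKey.equals(widget.sortKey));
        items.add(widget);
    }

    // Several items can match because the Partition key alone is not unique.
    @Override
    public List<StoredWidget> findByPartitionKey(long partitionKey) {
        List<StoredWidget> matches = new ArrayList<>();
        for (StoredWidget w : items) {
            if (w.partitionKey == partitionKey) {
                matches.add(w);
            }
        }
        return matches;
    }
}
```

The return type is a List rather than a single item precisely because uniqueness comes from the pair of keys, not from the Partition key on its own.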


Use MVC to Create a REST API


It's pretty straightforward to add the dependency for Spring MVC and create a simple REST controller that implements GET and PUT methods for reading and writing.

Securing the End Points


As with any web application I need to secure the REST end points. I'll do this with Basic authentication, a simple base64-encoded user and password on the Authorization header of the request. This is again very simple to achieve with Spring Security and the convention over configuration of Spring Boot. Simply extend the WebSecurityConfigurerAdapter class, override the configure method and set the required authority for the specific paths. In my case I just want to force any request to either method exposed by the controller to be authenticated. Spring Security will create a default user and password and print them to the system log on boot up. I can override this by adding the credentials to the application.properties file. In a production system I'd add my own UserDetailsService and load those credentials from a store of some kind.
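For reference, this is roughly what a client has to put on the Authorization header for Basic authentication. The sketch uses only the JDK; the BasicAuth class and its method names are hypothetical. Base64 is an encoding, not encryption, so the header is only as private as the connection carrying it.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper showing the Basic Authorization header format.
class BasicAuth {

    // Builds the header value: "Basic " + base64("user:password").
    static String headerValue(String user, String password) {
        String token = user + ":" + password;
        String encoded = Base64.getEncoder()
                .encodeToString(token.getBytes(StandardCharsets.UTF_8));
        return "Basic " + encoded;
    }

    // Reverses the encoding, returning {user, password}.
    static String[] decode(String headerValue) {
        if (!headerValue.startsWith("Basic ")) {
            throw new IllegalArgumentException("Not a Basic Authorization header");
        }
        byte[] raw = Base64.getDecoder()
                .decode(headerValue.substring("Basic ".length()));
        // Split on the first colon only, since a password may contain colons.
        return new String(raw, StandardCharsets.UTF_8).split(":", 2);
    }
}
```

For example, BasicAuth.headerValue("user", "pass") yields "Basic dXNlcjpwYXNz", which is the value a tool such as Postman sends when Basic Auth is selected.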

End-to-End Testing

Postman is my favourite tool for testing basic REST APIs. Here's an example running the GET to return the data I wrote into the widgets table earlier.



Complete code for the widget example can be forked from this repository
