Monday 12 December 2016

Jackson Type ID Mapping

Mapping JSON to a Domain Model

Mapping class types directly to a JSON schema with annotations is usually the preferable solution for (de)serialization because it is so straightforward. Jackson makes this even simpler: for simple domain types it happens automatically, without any annotations at all, since the parser uses class, method and field names to generate suitable JSON. Where we don't wish to bind the JSON schema directly to concrete classes, or where some legacy code or schema must be mapped to a new domain structure, we are forced to customize the mapping. This can be done in many ways; we could even override the toString method of our domain classes and hard-code the JSON schema. Separating the JSON schema from concrete domain objects has obvious benefits, especially when using UI frameworks such as Angular which depend heavily on JSON from the server side. However, mapping becomes more difficult because we need to tell the parser what type of object to map to when it receives some data. Unless we include the concrete class name within the data, which would again bind the data to the model, we must tell Jackson how to find a suitable type to use.

In this example a REST controller consumes and produces JSON which represents a user. The user is defined as an interface which is implemented in different domain model packages. This separates the controller from the underlying implementation and has clear advantages. However, Jackson must still know which concrete domain model object it is mapping the JSON to and this should be configurable so that the controller is not bound in any way to the model.

Defining the User interface is straightforward, but we must annotate it to tell the parser where to find the identifier within the data structure and what functionality to use to determine the mapping.
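
Here's a minimal sketch of how that annotated interface might look. The Jackson annotations are the real ones for a custom resolver; the getter methods and package are illustrative assumptions.

    package com.johnhunsley.user;

    import com.fasterxml.jackson.annotation.JsonTypeInfo;
    import com.fasterxml.jackson.databind.annotation.JsonTypeIdResolver;

    // Tell Jackson to read/write an '@type' property and delegate the
    // resolution of its value to our custom UserTypeIdResolver
    @JsonTypeInfo(use = JsonTypeInfo.Id.CUSTOM,
                  include = JsonTypeInfo.As.PROPERTY,
                  property = "@type")
    @JsonTypeIdResolver(UserTypeIdResolver.class)
    public interface User {

        String getUsername();

        String getEmail();
    }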

Now we need to define the functionality which tells the Jackson parser how to resolve the class type. A possible solution would be to hard-code the mapping somewhere within the domain model package, or within the concrete type itself. But what if we have multiple implementations of the model and wish to configure which type is mapped at runtime? Jackson provides the com.fasterxml.jackson.databind.jsontype.TypeIdResolver interface, and its abstract base class TypeIdResolverBase, specifically for this use case. In our case the UserTypeIdResolver extends TypeIdResolverBase. There are two methods which must be implemented to allow Jackson to map the ID within the data to a specific class type.

Serialization

The TypeIdResolver requires a method which generates the JSON ID property from the underlying concrete domain model type. The JSON property @type contains this value, as defined by the property attribute in the JsonTypeInfo annotation. The obvious format for this value would be the class name. However, it could be defined by some legacy schema, or be something generic which doesn't relate to the Java domain model at all. In this example let's assume the property value is just 'User', and we'll formulate that value by trimming the package name and any suffix off the concrete class name, assuming the usual class naming convention. In this example the User interface is implemented by com.johnhunsley.user.jpa.UserJpaImpl.

The package name and class name suffix - 'JpaImpl' - are predefined and injected into the resolver using the usual Spring property placeholder. Out of the box the resolver is not a managed bean; it is instantiated by Jackson at runtime when the parser is invoked. Therefore, we must annotate the class with @Configuration to tell Spring that instances of this class should be managed within the default context, allowing the configuration to be injected from the properties file.

The function to generate the ID value from the domain model class name boils down to simple String manipulation, with some sanity checking to ensure the class name format fits the usual naming convention.
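
A sketch of the resolver and its serialization method is shown below. The property keys injected with @Value are hypothetical names; they simply need to match whatever you put in application.properties.

    package com.johnhunsley.user;

    import com.fasterxml.jackson.annotation.JsonTypeInfo;
    import com.fasterxml.jackson.databind.jsontype.impl.TypeIdResolverBase;
    import org.springframework.beans.factory.annotation.Value;
    import org.springframework.context.annotation.Configuration;

    @Configuration
    public class UserTypeIdResolver extends TypeIdResolverBase {

        // e.g. com.johnhunsley.user.jpa - injected from application.properties
        @Value("${user.domain.package}")
        private String domainPackage;

        // e.g. JpaImpl
        @Value("${user.domain.suffix}")
        private String domainSuffix;

        @Override
        public JsonTypeInfo.Id getMechanism() {
            return JsonTypeInfo.Id.CUSTOM;
        }

        // Serialization - derive the id from the concrete class name,
        // e.g. com.johnhunsley.user.jpa.UserJpaImpl becomes 'User'
        @Override
        public String idFromValue(Object value) {
            String className = value.getClass().getSimpleName();

            if (!className.endsWith(domainSuffix)) {
                throw new IllegalStateException(className + " does not end with suffix " + domainSuffix);
            }

            return className.substring(0, className.length() - domainSuffix.length());
        }

        @Override
        public String idFromValueAndType(Object value, Class<?> suggestedType) {
            return idFromValue(value);
        }

        // typeFromId - the deserialization half - is shown in the next section
    }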

Deserialization

The reverse operation is also required by the TypeIdResolver and is again a simple matter of String manipulation. In this case we construct a class name from the injected package name, the suffix, and the ID value taken from the JSON property.
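
Continuing the resolver sketch above, a possible implementation of the deserialization method, assuming Jackson 2.5 or later where typeFromId takes a DatabindContext (imports com.fasterxml.jackson.databind.DatabindContext and com.fasterxml.jackson.databind.JavaType):

    // Deserialization - this method completes the UserTypeIdResolver class above.
    // 'User' becomes com.johnhunsley.user.jpa.UserJpaImpl
    @Override
    public JavaType typeFromId(DatabindContext context, String id) {
        String className = domainPackage + "." + id + domainSuffix;

        try {
            return context.constructType(Class.forName(className));
        } catch (ClassNotFoundException e) {
            throw new IllegalArgumentException("Cannot resolve type id " + id + " to " + className, e);
        }
    }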


Configuration

Spring Boot automatically provides a Property Placeholder Configurator which uses the default application.properties file from the class path. Therefore, configuration of the TypeIdResolver is a case of adding the domain package and suffix properties to this file. 
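
For example, assuming the hypothetical property keys used in the resolver sketch above:

    # application.properties
    user.domain.package=com.johnhunsley.user.jpa
    user.domain.suffix=JpaImpl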

So long as the com.johnhunsley.user.jpa.UserJpaImpl concrete domain type is on the class path, Jackson will (de)serialize it from/to JSON containing the @type:User value. There's no need to add any specific Jackson annotations or explicit mapping for this class.

A complete example which uses this function for abstracting the mapping between an interface and concrete implementation can be found within the following repositories -

simple-user-account  - Defines the User model interfaces
simple-user-account-jpa - Contains the concrete implementations of the model specifically for JPA
simple-user-account-api - Defines the REST Controller and configuration for the application

Friday 25 November 2016

Spring Data JPA with a Hash & Range Key DynamoDB Table

DynamoDB is an attractive option for a quick and simple NoSQL database for storing non-relational items consistently and safely without the need for setting up hardware, software and configuration for clustering. DynamoDB will scale up perfectly well to cope with large volumes of data. It is extremely quick and easy to set up a table and read and write items from other AWS components.

Spring Data for DynamoDB


From an application point of view the obvious choice for access is to use the AWS SDK. For CRUD operations from some front end gateway, such as a REST API, we would want to abstract away the boilerplate code of mapping a POJO to a table, which is the perfect use case for JPA. I was unaware of any JPA framework which had built-in extensions for DynamoDB until I looked at Spring's implementation - Spring Data. There's a nice framework on GitHub, forked from Michael Lavelle's original work, which provides a DynamoDB extension with all the annotations for mapping Java domain classes I'd expect.

Mapping Items to the Java Domain Model


As with all database design, the first step is to define a key with which to uniquely identify each item in the table. In DynamoDB this is known as the Partition or Hash Key. Often, the items we are dealing with have no unique natural identifier and we need to specify a second identifier, the Sort or Range Key, which, when coupled with the Partition Key, is unique within the table.

When annotating a domain type we are expected to mark a field as an ID. With Spring Data we only have the option of using the org.springframework.data.annotation.Id annotation. This must be at field level and only one field can be annotated. Therefore, with our compound key model, we are forced to create an identifier class which is embedded into the domain type. This is a common pattern with JPA and relational data models. However, it has knock-on effects with the DynamoDB persistence model and how the domain object is serialized from a JSON document.

In this example I will map a Widget domain type with both Partition and Sort keys. In order to ensure uniqueness by compound key I will create a WidgetId class which is embedded into Widget. However, I don't want this persisted into the store as an object. The keys should be defined as simple number and String types within the JSON schema. The JSON schema I will map is shown below.
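
A possible shape for that schema is sketched below; the field names are illustrative, the point being that the two key values appear as a plain String and number alongside the free-form data.

    {
      "id": "9c2f7dce-1d2b-4a5e-9f64-8f2a0a3f1b77",
      "dateCreated": 1480075200,
      "data": {
        "name": "my widget",
        "colour": "blue"
      }
    }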

Rather than map other domain model attributes, I simply want the Widget class to contain a String of JSON data which is dumped directly into a single DynamoDB attribute. In reality I would probably map each attribute and tightly constrain the Java model to the JSON schema. But this approach gives increased flexibility to store whatever data I want. Even so, the lack of constraint increases the risk of persisting bad or invalid data. When persisted into DynamoDB a Widget item looks like this.



I need to tell the underlying Jackson mapper to treat the data String as raw JSON rather than serializing it as an escaped String value. Here are the Widget and WidgetId classes annotated with everything Spring Data and Jackson need to serialize and persist single instances.
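
A sketch of the two classes is shown below. The annotations are the standard DynamoDBMapper, Spring Data and Jackson ones; the attribute names and types are assumptions based on the schema above.

    // WidgetId.java - the embedded compound key
    import com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBHashKey;
    import com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBRangeKey;

    import java.io.Serializable;

    public class WidgetId implements Serializable {

        private String id;
        private long dateCreated;

        @DynamoDBHashKey
        public String getId() { return id; }
        public void setId(String id) { this.id = id; }

        @DynamoDBRangeKey
        public long getDateCreated() { return dateCreated; }
        public void setDateCreated(long dateCreated) { this.dateCreated = dateCreated; }
    }

    // Widget.java - the domain type persisted to the Widgets table
    import com.amazonaws.services.dynamodbv2.datamodeling.*;
    import com.fasterxml.jackson.annotation.JsonIgnore;
    import com.fasterxml.jackson.annotation.JsonRawValue;
    import org.springframework.data.annotation.Id;

    @DynamoDBTable(tableName = "Widgets")
    public class Widget {

        // the compound key - ignored by both DynamoDB and Jackson
        @Id
        private WidgetId widgetId = new WidgetId();

        private String data;

        @DynamoDBIgnore
        @JsonIgnore
        public WidgetId getWidgetId() { return widgetId; }

        // proxy the hash key through to the embedded id
        @DynamoDBHashKey(attributeName = "id")
        public String getId() { return widgetId.getId(); }
        public void setId(String id) { widgetId.setId(id); }

        // proxy the range key through to the embedded id
        @DynamoDBRangeKey(attributeName = "dateCreated")
        public long getDateCreated() { return widgetId.getDateCreated(); }
        public void setDateCreated(long dateCreated) { widgetId.setDateCreated(dateCreated); }

        // write the data String out as raw JSON rather than an escaped String
        @JsonRawValue
        @DynamoDBAttribute(attributeName = "data")
        public String getData() { return data; }
        public void setData(String data) { this.data = data; }
    }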

The annotations on the get/set Hash and Range key values essentially act as proxies to the encapsulated WidgetId instance which is ignored by both DynamoDB and the Jackson serializer. The @Id annotation ensures JPA still uses this class as the embedded identifier allowing us to utilize the compound key.

CRUD Repository


I need to set up the configuration to access the DynamoDB Widgets table. Access control is via IAM, so I can either add the key pair of an IAM account to the application or grant the required role to the EC2 instance I'm running this app from. In this example I'm going to give it the access and secret key which, this being a Spring Boot application, I add as properties to the application.properties file on my class path, along with the end point URL of the DynamoDB service.
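
Something along these lines, with placeholder values; the property names themselves are only meaningful to the application's own DynamoDB configuration class, so treat them as illustrative.

    # application.properties
    amazon.aws.accesskey=YOUR_ACCESS_KEY
    amazon.aws.secretkey=YOUR_SECRET_KEY
    amazon.dynamodb.endpoint=https://dynamodb.eu-west-1.amazonaws.com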

Now to define a repository to handle the CRUD operations. The Spring Data framework provides all the functionality needed straight out of the box, to such an extent that no further code is required. In this example I want to read all the Widgets with a given Partition key value; there may be multiple items because that key alone isn't unique. The interface which extends the Spring Data org.springframework.data.repository.CrudRepository simply needs one additional method defined. Spring Data will interpret this from its name, parameters and return type and formulate the query at runtime.
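
A sketch of the repository is shown below. The derived method name assumes the hash key is exposed as the 'id' property on Widget; the exact name Spring Data expects will depend on how you have mapped the compound key.

    import org.springframework.data.repository.CrudRepository;

    import java.util.List;

    public interface WidgetRepository extends CrudRepository<Widget, WidgetId> {

        // Spring Data derives the query from the method name at runtime
        List<Widget> findById(String id);
    }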


Use MVC to Create a REST API


It's pretty straightforward to add the dependency for Spring MVC and create a simple REST Controller to implement GET and PUT (read and write) methods.
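
A minimal sketch of such a controller, assuming the WidgetRepository above and illustrative request paths:

    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.web.bind.annotation.*;

    import java.util.List;

    @RestController
    @RequestMapping("/widgets")
    public class WidgetController {

        @Autowired
        private WidgetRepository repository;

        // read all Widgets with the given Partition key value
        @RequestMapping(value = "/{id}", method = RequestMethod.GET)
        public List<Widget> getWidgets(@PathVariable("id") String id) {
            return repository.findById(id);
        }

        // write a single Widget
        @RequestMapping(method = RequestMethod.PUT)
        public Widget putWidget(@RequestBody Widget widget) {
            return repository.save(widget);
        }
    }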

Securing the End Points


As with any web application I need to secure the REST end points. I'll do this with Basic authentication - a simple base64 encoded user and pass on the Authorization header of the request. This is again very simple to achieve with Spring Security and the convention over configuration of Spring Boot. Simply extend the WebSecurityConfigurerAdapter class, override the configure method and set the required authority for the specific paths. In my case I just want to force any request to either method exposed by the controller to be authenticated. Spring Security will create a default user/pass and print it out to the system log on boot up. I can override this by adding the credentials to the application.properties file. In a production system I'd add my own UserDetailsService and load those credentials from a store of some kind.
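
A minimal sketch of that security configuration (CSRF is disabled here purely to keep the stateless PUT example simple):

    import org.springframework.context.annotation.Configuration;
    import org.springframework.security.config.annotation.web.builders.HttpSecurity;
    import org.springframework.security.config.annotation.web.configuration.EnableWebSecurity;
    import org.springframework.security.config.annotation.web.configuration.WebSecurityConfigurerAdapter;

    @Configuration
    @EnableWebSecurity
    public class SecurityConfig extends WebSecurityConfigurerAdapter {

        @Override
        protected void configure(HttpSecurity http) throws Exception {
            http.csrf().disable()
                .authorizeRequests()
                    .anyRequest().authenticated()
                    .and()
                .httpBasic();
        }
    }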

End-to-End Testing

Postman is my favourite tool for testing basic REST APIs; here's an example running the GET to return the data I wrote into the widgets table earlier.



Complete code for the widget example can be forked from this repository

Monday 4 April 2016

Micro Services - Spring Boot, Eureka and Feign - Part 2

In the last post I set up a very basic 'Hello World' Spring Boot REST service running on an EC2 instance. One of the cornerstones of Micro Services and Spring Boot is auto discovery of services from a client. In this post I will demonstrate the use of Eureka and Feign to register the Hello Service and automatically discover and call it from a client. Eureka, created by Netflix, runs in the same environment as the micro services. You are likely to want multiple environments - Dev, Test, Production - each with a different set of services and end points. In a cloud environment those services and their end points are transient. You may well be spinning up and tearing down instances many times a day in Dev. In production, you may well be autoscaling services across multiple availability zones. The importance of self-registering services and auto discovery is evident.

  1. The Spring Boot Hello Service is modified to register itself with another Spring Boot application configured as a Eureka server.
  2. Another Spring Boot app, which implements the Feign client to call the Hello Service, queries the Eureka server asking for the end point of the Hello Service. 
  3. Eureka returns the details of the Hello Service to the client.
  4. The client makes requests to the Hello Service without any prior knowledge as to its whereabouts.

Create the Eureka Server


First of all we'll create and launch the Eureka server in the same environment our Hello Service is running in. In a Dev environment we'll probably just want one single Eureka instance, but in Production we will want some kind of resilience. Eureka handles this with the ability to replicate the service across availability zones. For the purposes of simplicity I won't be covering that in this blog and will just set up a single Eureka instance in a single availability zone.

Eureka is itself run as a simple Spring Boot application. Again, we'll start with the pom.xml and import it into an IDE. As with the Hello Service, the key is the parent; this gives us everything we need. This time I will also pull in the Spring Cloud pom, which manages the dependencies for the Eureka server I will implement within the application.
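
The relevant parts of the pom might look like the following; the versions shown are illustrative and should be aligned with whatever Spring Boot and Spring Cloud release train you are using.

    <parent>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-parent</artifactId>
        <version>1.3.3.RELEASE</version>
    </parent>

    <dependencyManagement>
        <dependencies>
            <!-- Spring Cloud BOM manages the Eureka server version -->
            <dependency>
                <groupId>org.springframework.cloud</groupId>
                <artifactId>spring-cloud-dependencies</artifactId>
                <version>Brixton.RELEASE</version>
                <type>pom</type>
                <scope>import</scope>
            </dependency>
        </dependencies>
    </dependencyManagement>

    <dependencies>
        <dependency>
            <groupId>org.springframework.cloud</groupId>
            <artifactId>spring-cloud-starter-eureka-server</artifactId>
        </dependency>
    </dependencies>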

Import this into the IDE, set up a simple package structure and create a class with a main method. This is pretty much exactly the same as our Application class from the Hello service although this time we add the EnableEurekaServer annotation.
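
A sketch of that class (package and class name are illustrative):

    package com.johnhunsley.eureka;

    import org.springframework.boot.SpringApplication;
    import org.springframework.boot.autoconfigure.SpringBootApplication;
    import org.springframework.cloud.netflix.eureka.server.EnableEurekaServer;

    @SpringBootApplication
    @EnableEurekaServer
    public class EurekaApplication {

        public static void main(String[] args) {
            SpringApplication.run(EurekaApplication.class, args);
        }
    }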

As with our previous Spring Boot example I can add an application.properties file to the classpath which contains any custom configuration I require. By default the Eureka server will attempt to register itself as a service. It's a simple case to turn that off. I've also added some logging configuration to quiet things down a bit and specified a point through which services and clients will 'talk' to the Eureka service.
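
A minimal application.properties along those lines might be:

    server.port=8761

    # don't try to register the Eureka server with itself
    eureka.client.register-with-eureka=false
    eureka.client.fetch-registry=false

    # quieten the logging down a little
    logging.level.com.netflix.eureka=WARN
    logging.level.com.netflix.discovery=WARN
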
And that is it! Spin up another micro instance, deploy the build and execute it as before. One thing to note is that you'll need to ensure your security groups and VPC subnet configuration allow the Hello Service and Eureka instances to communicate. Obviously, there are many ways to set this up and you'll probably have your own security configuration. As this is just a Dev example, I launched my instances into the same subnet and opened up 8761 to any other instance with the same security group association.

Register the Hello Service with Eureka


Now we'll edit the Hello Service from the previous article and make it register itself with Eureka so that clients can discover its location. To do this we must enable the application as a Eureka Discovery Client with another annotation. Currently, our Hello Service app only has a dependency on Spring Boot. We need to reimport the build file and add a dependency on Spring Cloud and Eureka.
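
That amounts to adding the Spring Cloud BOM shown in the Eureka server pom above, plus the Eureka client starter:

    <dependency>
        <groupId>org.springframework.cloud</groupId>
        <artifactId>spring-cloud-starter-eureka</artifactId>
    </dependency>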

We can add the annotation to the Hello Service Application class telling it to register with Eureka.
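
For example:

    import org.springframework.boot.SpringApplication;
    import org.springframework.boot.autoconfigure.SpringBootApplication;
    import org.springframework.cloud.netflix.eureka.EnableEurekaClient;

    @SpringBootApplication
    @EnableEurekaClient
    public class Application {

        public static void main(String[] args) {
            SpringApplication.run(Application.class, args);
        }
    }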

By default, the Eureka client will attempt to locate Eureka on localhost. We need to tell it to find Eureka on the instance we are running it on. Depending on your AWS set up you may add the private IP, Elastic IP or domain as a property to the application.properties file on the Hello Service classpath.
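
For example (the Eureka host below is a placeholder for your own instance address):

    spring.application.name=hello-service

    # where to find the Eureka server
    eureka.client.serviceUrl.defaultZone=http://10.0.0.10:8761/eureka/

    # register with the instance's IP address rather than its host name
    eureka.instance.prefer-ip-address=true
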
As you can see, there's an additional configuration to tell Eureka to register the Hello Service by IP address. By default, Eureka will register the machine name. On AWS that's the instance ID. We could force Eureka to register the service with a configured value such as a load balancer or Route 53 domain. As this is only a basic example and the service, client and Eureka server are all in the same subnet, the private IP address of the Hello Service will do.

Start Eureka, then redeploy and reboot the Hello Service. Watch the logs and you'll see the Hello Service start; after a few seconds it will report a successful registration with Eureka. We can now open a browser and point it at Eureka. This will present us with the management UI and show our Hello Service as the only registered service, along with the IP address where the end point can be found.

Eureka UI showing the Hello Service with registered IP address (blurred)

Create a Hello Service Client 

Now the Hello Service is registered with Eureka we can create a client which will call the Hello Service without having any prior knowledge of its whereabouts. To do this we will again use Spring Boot, Spring Cloud and also Feign which will discover the Hello Service end point from Eureka. Once again we'll create a simple Spring Boot project starting with the pom.xml below
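
The parent and the Spring Cloud BOM are the same as in the earlier poms; the client additionally pulls in the Eureka and Feign starters, along the lines of:

    <dependencies>
        <dependency>
            <groupId>org.springframework.cloud</groupId>
            <artifactId>spring-cloud-starter-eureka</artifactId>
        </dependency>
        <dependency>
            <groupId>org.springframework.cloud</groupId>
            <artifactId>spring-cloud-starter-feign</artifactId>
        </dependency>
    </dependencies>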

Once imported, create another Application class, annotated as a SpringBootApplication. We also annotate this class to tell Spring it's a Eureka and Feign client.
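
A sketch of the client's Application class (package and class name illustrative):

    package com.johnhunsley.client;

    import org.springframework.boot.SpringApplication;
    import org.springframework.boot.autoconfigure.SpringBootApplication;
    import org.springframework.cloud.netflix.eureka.EnableEurekaClient;
    import org.springframework.cloud.netflix.feign.EnableFeignClients;

    @SpringBootApplication
    @EnableEurekaClient
    @EnableFeignClients
    public class ClientApplication {

        public static void main(String[] args) {
            SpringApplication.run(ClientApplication.class, args);
        }
    }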

We now need to create two more Java types to enable the application to make a call to the Hello Service. First of all, create an interface annotated as a FeignClient. This interface has one method for each service end point or operation, each annotated with a RequestMapping. This might not seem obvious, because a RequestMapping annotation is more commonly found on a controller; here it does not serve a mapping but simply binds the request path to the method. Feign understands this and routes calls to the method on to the remote Hello Service, at the end point discovered from Eureka, with the path and mapping described in the annotation. The FeignClient annotation's value is the name of the service we want to call. Our Hello Service application is named hello-service and we can see that name in the Eureka registry. Using this name, Feign will contact Eureka and discover the end point IP address.
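
A sketch of the Feign interface; the /hello path and method name are assumptions standing in for whatever mapping the Hello Service controller actually exposes.

    import org.springframework.cloud.netflix.feign.FeignClient;
    import org.springframework.web.bind.annotation.RequestMapping;
    import org.springframework.web.bind.annotation.RequestMethod;

    // 'hello-service' is the name the Hello Service registered with in Eureka
    @FeignClient("hello-service")
    public interface HelloClient {

        // bound to GET /hello on the end point discovered from Eureka
        @RequestMapping(value = "/hello", method = RequestMethod.GET)
        String sayHello();
    }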

We now need a runnable class. Spring Boot supplies the CommandLineRunner interface, which enforces implementation of a run method. This is called following successful initialization of the Spring Boot application. Our CommandLineRunner implementation simply calls the HelloClient method so that, in this example, the web service call to the Hello Service is made once on boot up.
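
For example:

    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.boot.CommandLineRunner;
    import org.springframework.stereotype.Component;

    @Component
    public class HelloRunner implements CommandLineRunner {

        @Autowired
        private HelloClient helloClient;

        // invoked once the Spring Boot application has initialized
        @Override
        public void run(String... args) throws Exception {
            System.out.println(helloClient.sayHello());
        }
    }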

Lastly, we just need to add the application.properties to the classpath. The only configuration we need is to tell the application the location of the Eureka server, which the EnableEurekaClient annotation will pick up and use.
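
Again the host is a placeholder for your own Eureka instance:

    eureka.client.serviceUrl.defaultZone=http://10.0.0.10:8761/eureka/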

Nice and simple! If you like, you can build and deploy this and run it in the same way as the other two Spring Boot apps, or just run the main method from your IDE. You should see the client application boot up, contact Eureka, discover the Hello Service end point and call it, which will return the Hello World message. All accomplished without any preconfigured address information for the Hello Service itself.

Micro Services - Spring Boot, Eureka and Feign - Part 1

Micro Services


The term 'Micro Service' is the current flavour of the month within software circles. However, this approach to architecture - splitting applications up into small units of functionality - is nothing new. As developers and architects we always look to decouple our systems into single, simple units of functionality. This is a basic principle of OO. The philosophy of Micro Services is to decouple units of functionality entirely. Each unit should effectively stand alone in its own environment with as little, or no, dependency on others. Again, this isn't a new approach. Since the advent of web services, in particular REST, we have decoupled our systems into simple, stateless, independently scalable applications. The advantages of this are clear, but as our systems grow into transient environments it becomes difficult for each application to keep track of the service end points of other systems. The big win of the Micro Service approach is automatic discovery, which helps DevOps solve the headache of configuration across the ecosystem.

In part 1 of this blog I will use Spring Boot to set up a Micro Service very quickly and easily and run it in the cloud. In part 2 I will configure this service to register itself for auto discovery and create a client which has no configuration dependency on the service. This is accomplished with Spring Cloud, Netflix Eureka and Feign; three important frameworks which make up the basis of Spring's Micro Service implementation.

Creating a Micro Service with Spring Boot


Spring Boot allows us to create a deployable application at the drop of a hat. There's no need for an external web server or servlet container; Spring Boot takes care of all that and allows us to just execute a jar file to run a web service. Spring Boot employs the common Spring philosophy of convention over configuration, often to the extreme. This results in an annotation rich framework with little, or no, XML or properties to configure.

To demonstrate this we'll start with a simple REST service which will return a 'hello world' string to a GET request. As with all applications, start with the Maven build file - pom.xml and import into your favorite IDE.

The key here is the parent - spring-boot-starter-parent. This brings in all the dependencies and environment configuration you need to run a Spring Boot app. The only additional dependencies I've added are the web and test packages because this application will be a web app for which I want to write some unit tests. Once imported, create a package structure and add a class named 'Application' with a main method.
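
A sketch of that class (the package is illustrative):

    package com.johnhunsley.hello;

    import org.springframework.boot.SpringApplication;
    import org.springframework.boot.autoconfigure.SpringBootApplication;

    @SpringBootApplication
    public class Application {

        public static void main(String[] args) {
            SpringApplication.run(Application.class, args);
        }
    }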

By default, the SpringBootApplication annotation enables component scan for its own and sub packages. Therefore, it is convention to add our other components to packages below this one. Create a web and service package for our MVC controller and service layer classes.




The MVC controller and service are both very simple, as you'd imagine. Amongst other things, Spring Boot includes automatic configuration of Spring MVC, which is again enabled by the SpringBootApplication annotation. This means it's a simple case of just annotating the controller as a RestController and adding the mappings. The HelloService simply returns a string saying hello.
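
Sketches of the two classes; the /hello path, method names and message are illustrative.

    // HelloController.java
    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.web.bind.annotation.RequestMapping;
    import org.springframework.web.bind.annotation.RequestMethod;
    import org.springframework.web.bind.annotation.RestController;

    @RestController
    public class HelloController {

        @Autowired
        private HelloService helloService;

        @RequestMapping(value = "/hello", method = RequestMethod.GET)
        public String hello() {
            return helloService.sayHello();
        }
    }

    // HelloService.java
    import org.springframework.stereotype.Service;

    @Service
    public class HelloService {

        public String sayHello() {
            return "Hello World";
        }
    }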

That pretty much concludes the service. The only other addition is a properties file. By default, Spring Boot will look for a file named application.properties on the root classpath. This file will hold any other configurations which are required in subsequent parts of this blog. For now, we just add a name for the service, like so.
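
For example, using the name the client will later look up in Eureka:

    spring.application.name=hello-service
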
To run the application as a web service we need an environment hosting JDK 1.8. No application server is required as Spring Boot has an embedded, pre-configured Tomcat. If you have an AWS account, spin up a micro instance, ensure you are running Java 1.8 and add the following line as user data to run the HelloService.
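
Something along these lines, where the jar name and path are assumptions based on your own build and deployment:

    #!/bin/bash
    # EC2 user data - start the Hello Service when the instance boots
    java -jar /home/ec2-user/hello-service-0.0.1-SNAPSHOT.jar > /home/ec2-user/hello-service.log 2>&1 &
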
Open a browser and navigate to the instance on the default 8080 Tomcat port. The browser, by default, will issue a GET request and display the message returned from the service.

Thursday 10 March 2016

Custom CloudWatch Tomcat JVM Metrics

If you use Tomcat as your usual Servlet container for your Java web applications on EC2 then you may wish to monitor Java Virtual Machine stats and apply notification alarms which can be triggered should the application run into trouble. There are many JVM monitoring tools out there which range from free to expensive, simple to complex. I decided to try and create a simple and free solution using CloudWatch and the AWS Command Line Interface (AWS CLI) to send JVM stats to CloudWatch as custom metrics. Once the custom metrics have been configured CloudWatch does the rest and provides a monitoring platform on which alarms and notifications can be easily created and managed.

To set this up you will need to do the following:

  • Have an AWS account
  • Set up an IAM Role to be applied to a running instance which you want to monitor
  • Start an EC2 instance hosting Tomcat which has outbound access to the internet
  • Download the AWS CLI package
  • Create a script to extract the JVM stats and publish to CloudWatch using the CLI
  • Set execution of the script in cron to publish the metrics at your desired frequency

Assuming you already have an AWS account and know your way around, create an IAM role which will be assigned to the EC2 instance you want to monitor. This should allow applications on the instance to publish metric data to CloudWatch.
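
A minimal policy attached to that role might look like this; cloudwatch:PutMetricData is the only action the script below actually needs.

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Action": ["cloudwatch:PutMetricData"],
          "Resource": "*"
        }
      ]
    }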

Next, ensure you have a VPC with a subnet allowing access to the outside internet. You will need a subnet which assigns a public IP to launched instances, an internet gateway and a route table which directs all outbound traffic from instances within the subnet to that gateway - 0.0.0.0/0 - igw-my-gateway-id.

Launch an instance into the subnet as usual and assign the new IAM role. If you already have a base image containing a Tomcat installation then use that; otherwise, install Tomcat as you wish. I won't cover this in this post. Once running, open a session and install the AWS CLI package.
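
On Amazon Linux the CLI is often already present; otherwise it can be installed with pip, for example:

    sudo pip install awscli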


You can confirm the installation is good by running aws help, which will show the manual. Now create a script to get the JVM stats from your running Tomcat instance and use the CLI to send them to CloudWatch. You may well want to run this on multiple instances, so it makes sense to ascertain the instance id at runtime and use it when you define the metric name in the CLI command. To discover the instance id you'll need to make an HTTP request to the instance metadata service; assign the result to a variable in your script which will be used in the CLI command later.
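
The instance id is available from the instance metadata service:

    INSTANCE_ID=$(curl -s http://169.254.169.254/latest/meta-data/instance-id)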


Locate the catalina PID file which contains the process ID of your running Tomcat instance. Depending on your version and installation this can be in various places. There are various other ways you could ascertain the process ID, such as using the jps or ps commands.


Use the jstat command to print a snapshot of the JVM memory performance statistics. These are pretty printed so there has to be a bit of String manipulation to pick them out. Use the AWS CLI command to invoke the CloudWatch service which makes a PUT request with the substituted data. In the example below I've created a metric name for each jstat value from this EC2 instance, identified by the instance id. Here's the full script -
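
A sketch of such a script is shown below. The PID file location, CloudWatch namespace, region and the choice of jstat columns are all assumptions to adapt to your own installation.

    #!/bin/bash
    # publish Tomcat JVM heap stats to CloudWatch as custom metrics

    # identify this instance from the metadata service
    INSTANCE_ID=$(curl -s http://169.254.169.254/latest/meta-data/instance-id)

    # process id of the running Tomcat, from the catalina PID file
    PID=$(cat /var/run/tomcat/catalina.pid)

    # snapshot of the JVM memory statistics (values in KB)
    STATS=$(jstat -gc "$PID" | tail -1)

    # pick out eden and old generation utilisation (columns 6 and 8 of jstat -gc on Java 8)
    EDEN_USED=$(echo "$STATS" | awk '{print $6}')
    OLD_USED=$(echo "$STATS" | awk '{print $8}')

    # publish a metric name for each value, identified by the instance id
    aws cloudwatch put-metric-data --region eu-west-1 --namespace "TomcatJVM" \
        --metric-name "EdenUsed-$INSTANCE_ID" --unit Kilobytes --value "$EDEN_USED"

    aws cloudwatch put-metric-data --region eu-west-1 --namespace "TomcatJVM" \
        --metric-name "OldGenUsed-$INSTANCE_ID" --unit Kilobytes --value "$OLD_USED"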


It's then a simple case of adding this script to a cron schedule. You'll then be able to login to the AWS console, navigate to CloudWatch and select the metrics. From here, you can easily create the alarms and notifications for different threshold values of the metric data.

Saturday 13 February 2016

Elasticsearch EC2 Autoscaled Implementation

Elasticsearch has now superseded Solr as the de facto search server. Both systems have Lucene at their heart; Elasticsearch has various benefits and features over Solr, but this post will focus on implementing a scalable search service cluster on AWS. In a previous post I outlined a similar solution for a SolrCloud cluster. Although that is robust, it isn't truly scalable because each instance needs to be preconfigured with a reference to the others. It may be possible to get around this, inject configuration at runtime and autoscale the cluster. However, Elasticsearch makes this very easy to do directly out of the box, creating a stateless, highly scalable search service cluster in a fraction of the time.

Amazon offers two search systems as Platform as a Service: CloudSearch and Elasticsearch Service. Unsurprisingly, these are essentially SolrCloud and Elasticsearch implemented on EC2 behind the scenes and offered as a service. Of course there are advantages to using these but if you want complete control over your search cluster implementation and functionality then you need to build your own from the ground up. The good news is that this is very simple but there are some stumbling blocks which are not obvious in any documentation I've come across.

Setting up the Environment


Before getting into the Elasticsearch configuration we need to focus on the environment. You will probably already have your own VPC and network configuration for Dev, Test and Production. The cluster is a simple set up: Elasticsearch instances running in each availability zone behind a load balancer serving HTTP requests. Each instance needs to know about the others in the cluster, which is done by automatic discovery. Each instance must be associated with the same security group, which is used to control access and, importantly, to filter discovered instances on the same network. This will be covered later.

Autoscaled cross availability zone Elasticsearch EC2 instances with associated volumes 


Elasticsearch instances will, by default, be capable of automatically discovering each other and forming a dynamic cluster. This is accomplished using network multicast. However, multicast is disabled in AWS and most other cloud hosts, which prevents Elasticsearch from clustering out of the box. To create the cluster we must install the cloud-aws plugin, which uses the AWS API for unicast discovery of other Elasticsearch instances on the network. In order to use this we must provide IAM account credentials in our Elasticsearch configuration file. The account must have permission to use the EC2 service by granting ec2:DescribeInstances.

Installing Elasticsearch from the Package Manager


Once happy with the network setup we can begin building out the temporary instance which will be used to form the baseline image. Many of the tutorials and examples I have come across build out on Ubuntu, but I decided to go with the standard Amazon Linux baseline. I started a t2.medium instance and added an additional volume of 25 GB. This will be mounted on a path where we will store the indexed data, so size it depending on what you intend to index and how you want to shard it.

The easiest way to install Elasticsearch is from an rpm, which can be downloaded from the Elastic.co web site and will need to be added to the package manager.
Edit the file and add the Elastic repository
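
For example, for the 2.x series (take the exact baseurl from the current Elastic documentation):

    # /etc/yum.repos.d/elasticsearch.repo
    [elasticsearch-2.x]
    name=Elasticsearch repository for 2.x packages
    baseurl=https://packages.elastic.co/elasticsearch/2.x/centos
    gpgcheck=1
    gpgkey=https://packages.elastic.co/GPG-KEY-elasticsearch
    enabled=1
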
Now install Elastic from the package manager, run it as a service and set the JVM heap size as an environment variable. Add the following line, set the heap size to whatever you wish, within the constraints of your desired EC2 instance type -
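
Something along these lines; on Amazon Linux the heap size for this generation of Elasticsearch is set through the ES_HEAP_SIZE variable in /etc/sysconfig/elasticsearch.

    sudo yum install elasticsearch
    sudo chkconfig --add elasticsearch

    # in /etc/sysconfig/elasticsearch - adjust to your instance type
    ES_HEAP_SIZE=1g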

Installing The AWS Plugin and Configuring Elasticsearch


Now configure Elasticsearch itself. First, install the cloud-aws plugin which is required to enable auto discovery.
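
For the 2.x packages the plugin script lives in the Elasticsearch home directory:

    cd /usr/share/elasticsearch
    sudo bin/plugin install cloud-aws
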
Now edit the main Elasticsearch YAML config file found in /etc/elasticsearch/elasticsearch.yml. With the exception of logging, all settings are configured in this file. The intention is to avoid any predefined state, such as hostnames or IP addresses, which would require manual intervention at runtime. There are many properties which could be configured here, however the configuration outlined below is enough to get a basic cluster up and running on EC2.
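
A minimal sketch of that configuration; the cluster name, security group, region and data path are placeholders for your own values.

    # /etc/elasticsearch/elasticsearch.yml
    cluster.name: elastic-cluster

    # bind to the instance's private address rather than localhost
    network.host: _ec2:privateIpv4_

    # IAM credentials granted ec2:DescribeInstances
    cloud.aws.access_key: YOUR_ACCESS_KEY
    cloud.aws.secret_key: YOUR_SECRET_KEY
    cloud.aws.region: eu-west-1

    # discover the other nodes through the EC2 API, filtered by security group
    discovery.type: ec2
    discovery.ec2.groups: my-elasticsearch-sg

    # the mounted volume holding the index data
    path.data: /data/elasticsearch
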
Now start the Elasticsearch service to ensure it boots up and runs on the configuration we have made. By default the logs are set to level INFO; if you edit the logging.yml file prior to starting and set the default level to DEBUG you'll be able to get more meaningful information out should you come across any problems. When running, the server will log to /var/log/elasticsearch/elastic-cluster.log.
You should now be able to make client HTTP requests to the running service either from your local browser or the command line and get information about the cluster and nodes. This will report the state of the cluster with a JSON response and should look similar to this
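
For example, the cluster health end point (here for the single node we have so far):

    curl http://localhost:9200/_cluster/health?pretty

    {
      "cluster_name" : "elastic-cluster",
      "status" : "green",
      "timed_out" : false,
      "number_of_nodes" : 1,
      "number_of_data_nodes" : 1,
      "active_primary_shards" : 0,
      "active_shards" : 0,
      ...
    }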

Create the Autoscaled Cluster


Assuming the service starts without a problem you will now have a single instance from where you can create the image which will be used in the launch configuration. Simply image the running instance and name the AMI appropriately.

Create an Elastic Load Balancer and set the ping check to root on port 9200. Add a listener to receive on port 80 and forward to 9200 on the instances. Either create a new security group or use the same one associated with the instance you created earlier. If creating a new one, ensure the instance security group has reference to it so that traffic from the load balancer can reach the instances. Enable cross Availability Zone balancing to ensure the cluster is highly available.

Create the Launch Configuration and associated Autoscaling Group. These are straightforward: the Launch Config should simply instantiate the image we created earlier on a suitable instance type. I chose t2.medium, which gives me ample memory for the 1 GB heap size I defined in my environment variable. Associate the instance security group and create the Autoscaling Group, which you can tune to scale as you require. I just set it to maintain a cluster of 3, which should ensure we have at least one instance in each eu-west availability zone. If you wish to launch your instances with an associated volume on which to store the search index files then you must specify the volume in the launch configuration and pass the mount command as user data to the launching instance, as sketched below. You should also specify the path.data value in the elasticsearch.yml file with reference to the directory where the volume is mounted.
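
The user data might look something like the following; the device name and mount point are assumptions which must match your launch configuration and the path.data setting.

    #!/bin/bash
    # format and mount the additional EBS volume for the index data
    mkfs -t ext4 /dev/xvdb
    mkdir -p /data/elasticsearch
    mount /dev/xvdb /data/elasticsearch
    chown -R elasticsearch:elasticsearch /data/elasticsearch
    service elasticsearch restart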

Once the cluster has initialized you should be able to make another HTTP request, this time to the load balancer, and confirm the number of running nodes. This will report the state of the cluster with a JSON response similar to the one above, but now showing all three nodes.