Saturday 9 July 2016

Scala Slick - dealing with large tables of more than 22 columns

Slick is a tool for managing access to a database in Scala. So far I have mostly liked it, but I have also found the documentation can be a bit lacking in some areas, especially around managing tables with more than 22 columns, or mapping your columns into a nested case class structure.

This post attempts to stitch together suggestions from a few different sources so you can see what the options are and which may be best suited to your use case. This post is supported by code examples that all compile and form part of this project:
https://github.com/timgent/spray-slick-template

The simple case - mapping a small, flat case class

As with all good products, Slick's tutorial starts off with an example that makes everything seem straightforward and easy - mapping a small number of columns to an unnested case class.


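For illustration, here is a minimal sketch of that simple case (assuming Slick 3.x and the H2 driver; the Person case class and "people" table are made-up examples rather than code from the project above):

import slick.driver.H2Driver.api._

case class Person(id: Long, name: String, age: Int)

class People(tag: Tag) extends Table[Person](tag, "people") {
  def id   = column[Long]("id", O.PrimaryKey, O.AutoInc)
  def name = column[String]("name")
  def age  = column[Int]("age")

  // Map the tuple of columns to the case class, and the case class back to a tuple
  def * = (id, name, age) <> (Person.tupled, Person.unapply)
}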

As you can see, we just define our case class, create a table based on it with its respective columns, and finally define a * projection.

The * projection is there to provide a mapping from the columns, to a tuple, to your case class. It first sets out the columns you are mapping from, and then takes 2 functions - one to map from the tuple to your case class (hence the use of the tupled function on the case class here), and one to map from the case class back to a tuple (which is exactly what unapply does).

Oh no! More than 22 columns!

This approach stops working as soon as you have more than 22 columns. Case classes stop having the tupled and unapply methods once they have more than 22 fields, scuppering our earlier simple approach.

But fear not, there are 2 fundamental approaches to dealing with this - using nested tuples and nested case classes (which come in 2 flavours), or using HLists.

Using Nested Tuples and Case Classes

Our challenge comes from having too many fields in our tuple and case class, so one simple solution is to nest them. This means we can still use the unapply and tupled methods on each nested case class, though it does add a little boilerplate for us.

 

Using projections for each case class

This is my preferred method. We have to:
  1. Create our nested case class
  2. Group our columns to match the nesting of our case class
  3. Create projections for each nested case class
  4. Use these projections in our * projection
As you can see in the sketch below, the mapping functions remain fairly simple.
The gist is here
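
A minimal sketch of these steps (assuming Slick 3.x and the H2 driver; the User, Name and Address classes and their column names are hypothetical, not the project's actual code):

import slick.driver.H2Driver.api._

case class Name(first: String, last: String)
case class Address(street: String, city: String)
case class User(id: Long, name: Name, address: Address)

class Users(tag: Tag) extends Table[User](tag, "users") {
  def id     = column[Long]("id", O.PrimaryKey, O.AutoInc)
  def first  = column[String]("first_name")
  def last   = column[String]("last_name")
  def street = column[String]("street")
  def city   = column[String]("city")

  // A projection for each nested case class
  def name    = (first, last) <> (Name.tupled, Name.unapply)
  def address = (street, city) <> (Address.tupled, Address.unapply)

  // The * projection just composes the nested projections, so the mapping stays simple
  def * = (id, name, address) <> (User.tupled, User.unapply)
}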

 

Using custom mappings

You can just use your own custom mapping functions, though personally I find this can get messy quite quickly. In particular, when you get a compile error listing over 22 types it is rather confusing.

The steps are the same as above except we don't create projections for each nested case class, which means our mapping functions have to do all the unapplying and tupling themselves.

Gist here
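
A minimal sketch of the custom-mapping version (again assuming Slick 3.x, and reusing the hypothetical User, Name and Address classes and imports from the previous sketch). The mapping functions now have to unpack and repack the nested tuple themselves:

class UsersCustomMapping(tag: Tag) extends Table[User](tag, "users") {
  def id     = column[Long]("id", O.PrimaryKey, O.AutoInc)
  def first  = column[String]("first_name")
  def last   = column[String]("last_name")
  def street = column[String]("street")
  def city   = column[String]("city")

  // Build the case class from the nested tuple of column values...
  private def construct(row: (Long, (String, String), (String, String))): User = row match {
    case (id, (first, last), (street, city)) => User(id, Name(first, last), Address(street, city))
  }

  // ...and turn the case class back into the nested tuple
  private def extract(user: User): Option[(Long, (String, String), (String, String))] =
    Some((user.id, (user.name.first, user.name.last), (user.address.street, user.address.city)))

  def * = (id, (first, last), (street, city)) <> (construct, extract)
}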

Using HLists

An alternative to all this is to use HLists (essentially lists that keep a type for each element).

 

Plain HLists (no mapping to case class)

Another method I favour is using plain HLists - fairly simple with minimal boilerplate.

This is as simple as having the right imports and just defining a * projection with your columns all stuffed into an HList.

Gist here
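
A minimal sketch of a plain HList projection (assuming Slick 3.x; the hypothetical table below only has three columns to keep it short, but the same shape works well past the 22-column limit):

import slick.driver.H2Driver.api._
import slick.collection.heterogeneous.{HList, HCons, HNil}
import slick.collection.heterogeneous.syntax._

// Rows come back as HLists rather than as a case class
class Accounts(tag: Tag) extends Table[Long :: String :: String :: HNil](tag, "accounts") {
  def id    = column[Long]("id", O.PrimaryKey, O.AutoInc)
  def email = column[String]("email")
  def name  = column[String]("name")

  def * = id :: email :: name :: HNil
}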

 

HLists with a mapping to a large case class

This example takes things a step further: it uses an HList instead of a tuple, but then uses a custom mapping to transform it into a case class.

The advantage is you can use a case class with more than 22 fields, but it does require a little more boilerplate.

The steps are:
  1. Create our large unnested case class
  2. Create our columns as usual (no nesting required)
  3. Define a * projection, with custom mappings to go from the HList to our case class, and vice versa
Gist here
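
A minimal sketch of this approach (assuming Slick 3.x; a hypothetical three-field Account class stands in for a 22+ field one so the example stays readable, and head/tail are used to walk the HList of values):

import slick.driver.H2Driver.api._
import slick.collection.heterogeneous.{HList, HCons, HNil}
import slick.collection.heterogeneous.syntax._

case class Account(id: Long, email: String, name: String)

class MappedAccounts(tag: Tag) extends Table[Account](tag, "accounts") {
  def id    = column[Long]("id", O.PrimaryKey, O.AutoInc)
  def email = column[String]("email")
  def name  = column[String]("name")

  def * = (id :: email :: name :: HNil).shaped <> (
    // build the case class from the HList of column values
    (row: Long :: String :: String :: HNil) => Account(row.head, row.tail.head, row.tail.tail.head),
    // turn the case class back into an HList
    (account: Account) => Some(account.id :: account.email :: account.name :: HNil)
  )
}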

Monday 2 May 2016

Documenting APIs

I've been using Swagger for documenting a REST API, but I've found it has a few shortcomings. I was inspired by this talk to look at other methods for generating API documentation. We've ended up using API Blueprint and I wanted to share some thoughts.

What I look for in a tool for documenting an API

The things that are important to me are that any tool:
  1. Allows me to create clear documentation for users
  2. Ensures the documentation is kept up to date (for example by generating docs from code or tests, or from the docs themselves being testable)
  3. Allows generation of a mock server to make it easy for my consumers to test against

Swagger

Swagger generates a swagger spec of your API from your code. This spec can then be used to produce pretty decent docs, generate mock servers, and more.

The docs generated by Swagger are pretty good, and the format is now a fairly common standard. There are also a number of tools that read and do useful things with a Swagger specification.

The good:
  • Swagger specs are (mainly) generated from code, so pretty easy to keep up to date
  • The swagger endpoint for viewing documentation is particularly good given it takes almost no work to set up
  • There are tools available to generate a mock server from Swagger docs
The bad:
  • Swagger doc generation relies heavily on annotations, and it's easy for these annotations to get out of date. No tests will fail if this happens, so there's no way to detect it except someone noticing when trying to use your docs. This is a major downside in my view
  • Swagger docs are quite inflexible. For example, if you want to document a service with more prose to explain context, or organise your docs by something other than route, it isn't possible. That said, you could write a tool that extracted examples from a Swagger specification and put them in Asciidoctor files to embed in a more readable document...
  • Generated mock servers only allow for one response per endpoint, limiting their usefulness in testing
Conclusion on Swagger
It's a really good tool, but many of the shortcomings here stem from the approach of generating the spec from code. I still recommend it, but I think in the long run other tools will win out.

 

API Blueprint

API Blueprint lets you write API documentation in a set format as a markdown document. This gives you some, but not loads of, flexibility in how the docs are laid out.

As with Swagger the spec can then be used for other things, including running tests, generating prettier html docs, and running a mock server.

The good:
  • Having made the tests part of the CI pipeline, we now always know if the spec is out of date
  • Dredd (the testing tool) is fairly flexible as it allows you to run custom hooks before the tests - for example, I used these to do an OAuth login required for accessing our API in dev
  • As with Swagger, there are some very good tools built around the specification, including converters from a Swagger spec to an API Blueprint spec if you just want to try it out
The bad:
  • The next thing on their roadmap is being able to have multiple requests and responses per endpoint. As with Swagger this is much needed, as currently it limits the tests you can run and the usefulness of the mock server
  • Docs are a little more flexible than Swagger's, but could do with even more flexibility. You tend to need to split the spec up into separate markdown files if you have large responses

Spring RESTDocs

I do agree with the talk posted above - having tests generate documentation snippets to include in a larger document for your API seems a great way to produce documentation. Unfortunately I'm not using Spring, so that tool is unavailable. I would love to see a testing tool that isn't coupled to a particular framework and can generate documentation snippets. However, API Blueprint and Swagger could potentially both generate documentation snippets, so perhaps this advantage is overplayed.

Spring RESTDocs does also give much greater flexibility with testing.

On the downside, I can see it being harder to generate a mock server using this approach, though this isn't a high priority for many people.

Wednesday 14 October 2015

Vaultconf - managing vault logins for your kubernetes applications

In my last post I gave an overview of how you can use vault and kubernetes together. In this post I want to show how to use vaultconf to configure credentials for your kubernetes applications so they can read secrets from vault.

 

Before you start you will need:

  • Vault v0.3
  • A kubernetes cluster, with local configuration allowing you to talk to the API
  • Docker

 

Configuration documents

vaultconf is designed to allow you to reconcile users and policies between configuration files and vault. Let's look at how to set these up:

 

Policy configuration

Policies set out rules for what can be accessed in vault. Policies should all be contained in a policies folder, whose structure should look like this (replacing the names with whatever you like):
  • policies
    • mynamespace1
      • policy1.yaml
      • policy2.yaml
    • mynamespace2
      • policy3.yaml
Please see the vaultconf test/resources folder for examples of vault policies in yaml.

 

User configuration

Users should be contained in a users.yaml file. It defines which users you want in which namespaces, and what policies those users should have. Again, you will find an example in the GitHub repo.

 

Vault server

If you haven't got a vault server set up you can start one easily by running:
$ vault server -dev

 

Vault setup

Before you are able to use vaultconf you will first need a few things set up in vault:
  • Set your VAULT_ADDR and authenticate yourself. If using the dev vault server simply do:
    • $ export VAULT_ADDR=http://127.0.0.1:8200
  • Enable the userpass auth backend:
    • $ vault auth-enable userpass
  • Create a user for yourself with root access:
    • $ vault write auth/userpass/users/myusername password=mypassword policies=root

 

Kubernetes setup

Before using vaultconf, ensure your kubernetes context is set to use the cluster you want the username and password secrets added to. You will also need to ensure any namespaces are already created within this cluster.

 

Creating policies

The following command will add your policies to vault. Note the --net=host is only needed if you're connecting to a vault server running on your local machine.

$ docker run --net=host -v test/resources/policies:/policies quay.io/timgent/vaultconf:v0.1 policies -c /policies -u myusername -p mypassword -a http://localhost:8200

You should now be able to see your policies in vault:
$ vault policies
dev_myproject_reader
dev_myproject_writer
uat_anotherproject_apolicy
uat_myproject_reader
uat_myproject_writer
root

 

Creating users

IMPORTANT NOTE!
vaultconf will add kubernetes secrets for the vault usernames and passwords on whichever kubernetes cluster your .kube/config file is currently set to use.
 
$ docker run --net=host -v ~/.kube:/root/.kube -v ~/WORK/vaultconf/test/resources/users:/users quay.io/timgent/vaultconf:v0.1 users -c /users/users.yaml -u myusername -p mypassword -a http://localhost:8200

You should now be able to see the secrets in kubernetes:

kubectl --namespace=dev-myproject get secrets
NAME            TYPE      DATA
mrread-vault    Opaque    1
mrwrite-vault   Opaque    1

Using the things you've created

Your applications can now mount in these secrets to gain their credentials for accessing vault. The file will look something like:
{"username":"myNamespace_testUser","password":"testPassword","method":"userpass"}

I hope to follow up with an example of how the vault-sidekick container can then use these credentials to talk to vault and make secrets available to your applications.

Sunday 4 October 2015

Secret management with Vault and Kubernetes

Some introductions...

First a little introduction to Kubernetes, which may require a little introduction to Docker. Skip ahead if you are already familiar with these :)

 

Docker

Docker is a containerisation engine, similar in concept to a virtual machine, but more efficient as it allows all containers on a single machine to share the host's resources by all making use of the host's kernel. To take advantage of this, applications built with Docker will typically have multiple containers running on each host.

To deploy Docker applications you first package them up into a Docker image, which contains everything needed to run your application with Docker, massively reducing configuration management headaches. I definitely recommend reading more about Docker :) Just note it needs to run on a Linux machine, which if you are on Mac or Windows means you will need a Linux VM first. There are lots of tools to help you get set up though.

 

Kubernetes

In order to deploy and manage large groups of Docker containers, Google have developed Kubernetes. The ultimate aim is that once your Kubernetes cluster is set up you can ask it to deploy one of your Docker images and it will choose which host(s) to put it on, handle failure of an instance of the application, provide tools for rolling updates, and lots more goodies, all designed to take away the hassle of deploying and managing applications. It is, however, still pretty new technology and has a way to go to mature.

 

Secret management

One question with Kubernetes, as with many other systems for managing applications, is how to manage the secrets that your applications need. For example certificates for terminating TLS encryption. Or credentials for accessing things from your cloud provider, such as an Amazon S3 bucket. Or database credentials. The list goes on.

The ideal is that these secrets are all short-lived, so that if someone manages to compromise them they will expire in short order anyway.

 

Vault

Vault gives you a good set of tools for managing secrets. Through its API you can configure vault users with policies to allow them access only to certain secrets. It can issue new, short-lived certificates signed by a CA which you have set up in vault. It can issue short-lived credentials for AWS. It can store generic secrets of several other types. So it sounds like it solves our secret management worries, but...

 

The challenge

The challenge is how to get your secrets from vault into your applications and, because they are short-lived, how to replace them at regular intervals. There are 2 ways we considered doing this:
  • Write your applications to talk directly to the vault API to request new secrets when needed
  • Have a "helper" container that manages secrets and makes them available to your application
The set of tools we're talking about is aimed at the latter option. It has some advantages - your applications are decoupled from your secret management system, making it easy to change in the future, and you only need to write one helper container, rather than having to maintain libraries to manage vault secrets in all the different languages we use.

 

The solution

There are a few parts to the solution we're looking at:
  • A vault configuration tool to setup users and policies in vault
  • A "vault sidekick" container that pulls secrets from vault and makes them available to your application
Vaultconf allows you to keep configuration for vault in a version controlled repository as yaml files. It allows you to reconcile your vault server to these configuration files, making sure you know exactly what users and policies are in vault. It also generates strong passwords for each of your vault users and makes these available as Kubernetes secrets. This is important as your vault sidekick will need vault credentials to read secrets with.

Vault sidekick can read vault credentials from a kubernetes secret, and then make your chosen secrets available to your application, automatically getting new credentials before the old ones expire.

I hope to do another post at some point with a practical example in it.

Sunday 11 January 2015

Play and ScalaTest - testing controllers

I've started a pet project to play with different aspects of Play and Scala and today tried to decide which test library to use. The 2 main contenders are Specs2 and ScalaTest. After reading a great number of articles about both I concluded they both offer similar functionality, so either would be fine as a starting point.

To get ScalaTest running I followed the guide here:

However, some tweaks were needed to get what I wanted, so I'll step through it quickly here.

Add the appropriate dependency to build.sbt

"org.scalatestplus" %% "play" % "1.2.0" % "test"

Latest version numbers here: http://www.scalatest.org/plus/play/versions

Change my controller to be more testable

Instead of using a straight object for the controller, having it as a trait means you can instantiate versions of it in your test, which gives you the ability to override some methods when testing. I can see this being useful when you want to test some things in isolation.

However really for now I think I could just as easily just use the object directly (I've checked and the tests still work fine this way). I will follow the trait-object pattern as it seems to be recommended and hopefully will come in handy later.

Controller is:
trait QuestionsController {  
   this: Controller =>  
   def createQuestion = Action {  
      Ok(views.html.addQuestion())
   }
}  

object QuestionsController extends Controller with QuestionsController

Test the controller

The example given on the play site for unit testing shows the Ok call above just outputting text. As soon as you try to replace that with one that actually calls a view it will stop working, complaining that there is no started application.

The next page on the play site gives the answer to this, which is to mix in a fake application.

Complete test is:
import scala.concurrent.Future  
import org.scalatestplus.play._
import play.api.mvc._  
import play.api.test._  
import play.api.test.Helpers._

class QuestionsControllerSpec extends PlaySpec with OneAppPerSuite {
   class TestQuestionsController() extends Controller with QuestionsController
   "createQuestion" must {
      "direct to the create question page" in {  
         val controller = new TestQuestionsController()  
         val result: Future[Result] = controller.createQuestion().apply(FakeRequest())  
         val bodyText: String = contentAsString(result)
         bodyText must include("Add Question and Answer")
      }
   }
}


 So that's it, a simple controller test.

Introduction

Hi,

I have recently started a major career change - moving from generalist jobs to become a software developer. I hope to share some of the things I learn along the way through this blog.

A little background about me. I graduated with a physics degree and worked for 2 years as a support analyst - application support, training, SQL, and a taster of VBA and C#. I then signed up for a graduate scheme aimed at producing technically minded generalists. There I spent 4 years in a variety of jobs from project management, to commercial work, to requirements analysis. Mostly not very technical.

About 4 months ago I started working as a developer. There has been a huge learning curve so far and I still think I am only scratching the surface. At present the main technologies I've been using include:
- Scala and the Play framework
- NodeJS and Express
- Cucumber (both with Ruby and Cucumber-js)
- MongoDB

Hope this blog ends up being some use to someone!

Tim