Saturday 9 July 2016

Scala Slick - dealing with large tables of more than 22 columns

Slick is a tool for managing database access in Scala. So far I have quite liked it, but I have also found that the documentation can be a bit lacking in some areas, especially around managing tables with more than 22 columns, or mapping your columns into a nested case class structure.

This post attempts to stitch together suggestions from a few different sources so you can see what the options are and which may be best suited to your use case. It is supported by code examples that all compile and form part of this project:
https://github.com/timgent/spray-slick-template

The simple case - mapping a small, flat case class

As with all good products, Slick's tutorial starts off with an example that makes everything seem straightforward and easy: mapping a small number of columns to an unnested case class.
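
It looks something like this (a minimal sketch, assuming a hypothetical users table and the Postgres driver; the names here are made up rather than taken from the template project):

import slick.driver.PostgresDriver.api._

// A hypothetical row type, comfortably under 22 fields
case class User(id: Long, name: String, email: String)

class Users(tag: Tag) extends Table[User](tag, "users") {
  def id    = column[Long]("id", O.PrimaryKey, O.AutoInc)
  def name  = column[String]("name")
  def email = column[String]("email")

  // Map the (Long, String, String) tuple to and from User
  def * = (id, name, email) <> (User.tupled, User.unapply)
}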



As you can see, we just define our case class, create a table based on it with its respective columns, and finally define a * projection.

The * projection is there to provide a mapping from the columns, to a tuple, to your case class. It first sets out the columns you are mapping from, and then takes two functions: one to map from the tuple to your case class (hence the use of the tupled function on the case class here), and one to map from the case class back to a tuple (which is essentially what unapply does, albeit wrapped in an Option).

Oh no! More than 22 columns!

This approach stops working as soon as you have more than 22 columns: Scala tuples top out at 22 elements, and case classes stop having the tupled and unapply methods once they have more than 22 fields, scuppering our earlier simple approach.

But fear not, there are two fundamental approaches to dealing with this: using nested tuples and nested case classes (which come in two flavours), or using HLists.

Using Nested Tuples and Case Classes

Our challenge comes from having too many fields in our tuple and case class, so one simple solution is to nest them. This means we can still use the unapply and tupled methods on each nested case class, though it does add a little boilerplate for us.

 

Using projections for each case class

This is my preferred method. We have to:
  1. Create our nested case class
  2. Group our columns to match the nesting of our case class
  3. Create projections for each nested case class
  4. Use these projections in our * projection
  5. The mapping functions, as you can see in the sketch below, remain fairly simple
The gist is here
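
A sketch of the approach, using a hypothetical Person with a nested Address (in practice the nesting would be splitting up a 22+ column table; the names are made up):

import slick.driver.PostgresDriver.api._

case class Address(street: String, city: String, postcode: String)
case class Person(id: Long, name: String, address: Address)

class People(tag: Tag) extends Table[Person](tag, "people") {
  def id       = column[Long]("id", O.PrimaryKey, O.AutoInc)
  def name     = column[String]("name")
  def street   = column[String]("street")
  def city     = column[String]("city")
  def postcode = column[String]("postcode")

  // A projection just for the nested Address case class
  def addressProjection = (street, city, postcode) <> (Address.tupled, Address.unapply)

  // The * projection reuses that projection, so the top-level mapping
  // still only deals with a small tuple
  def * = (id, name, addressProjection) <> (Person.tupled, Person.unapply)
}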

 

Using custom mappings

You can just use your own custom mapping functions, though personally I find this can get messy quite quickly. In particular, when you get a compile error listing over 22 types it is rather confusing.

The steps are the same as above, except we don't create projections for each nested case class, which means our mapping functions have to do all the unapplying and tupling themselves.

Gist here
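
A sketch of the same hypothetical table, reusing the Person and Address case classes from above, but with hand-written mapping functions instead of per-case-class projections:

import slick.driver.PostgresDriver.api._

class PeopleCustomMapping(tag: Tag) extends Table[Person](tag, "people") {
  def id       = column[Long]("id", O.PrimaryKey, O.AutoInc)
  def name     = column[String]("name")
  def street   = column[String]("street")
  def city     = column[String]("city")
  def postcode = column[String]("postcode")

  // We do the nesting and un-nesting ourselves, so the mapping functions
  // have to spell out the full flat tuple of columns
  private def construct(row: (Long, String, String, String, String)): Person = row match {
    case (id, name, street, city, postcode) =>
      Person(id, name, Address(street, city, postcode))
  }

  private def deconstruct(p: Person): Option[(Long, String, String, String, String)] =
    Some((p.id, p.name, p.address.street, p.address.city, p.address.postcode))

  def * = (id, name, street, city, postcode) <> (construct, deconstruct)
}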

Using HLists

An alternative to all this is using HLists (essentially lists where each element has its own type).

 

Plain HLists (no mapping to case class)

Another method I favour is using plain HLists - fairly simple with minimal boilerplate.

This is as simple as having the right imports and just defining a * projection with your columns all stuffed into an HList.

Gist here
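
A sketch with only a handful of columns to keep it readable (a real table would have the 22+ columns that make this worthwhile; the table and column names are made up):

import slick.driver.PostgresDriver.api._
import slick.collection.heterogeneous.{ HList, HCons, HNil }
import slick.collection.heterogeneous.syntax._

// The row type is an HList rather than a tuple or a case class
class WideTable(tag: Tag) extends Table[Long :: String :: Int :: HNil](tag, "wide_table") {
  def id   = column[Long]("id", O.PrimaryKey, O.AutoInc)
  def name = column[String]("name")
  def age  = column[Int]("age")
  // ...and in reality many more columns here

  def * = id :: name :: age :: HNil
}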

 

HLists with a mapping to a large case class

This example takes things to the extreme: using an HList instead of a tuple, and then a custom mapping to transform it into a case class.

The advantage is that you can use a case class with more than 22 fields, but it does require a little more boilerplate.

The steps are:
  1. Create our large unnested case class
  2. Create our columns as usual (no nesting required)
  3. Define a * projection, with custom mappings to go from the HList to our case class and vice versa (sketched below)
Gist here
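
Roughly what that looks like (a sketch: the hypothetical BigRow stands in for a genuinely 22+ field case class):

import slick.driver.PostgresDriver.api._
import slick.collection.heterogeneous.{ HList, HCons, HNil }
import slick.collection.heterogeneous.syntax._

// Imagine this with well over 22 fields
case class BigRow(id: Long, name: String, age: Int)

class BigTable(tag: Tag) extends Table[BigRow](tag, "big_table") {
  def id   = column[Long]("id", O.PrimaryKey, O.AutoInc)
  def name = column[String]("name")
  def age  = column[Int]("age")

  def * = (id :: name :: age :: HNil).shaped <> (
    // from the HList to the case class...
    { case id :: name :: age :: HNil => BigRow(id, name, age) },
    // ...and from the case class back to an HList
    (r: BigRow) => Some(r.id :: r.name :: r.age :: HNil)
  )
}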

Monday 2 May 2016

Documenting APIs

I've been using Swagger for documenting a REST API, but I've found it has a few shortcomings. I was inspired by this talk to look at other methods for generating API documentation. We've ended up using API Blueprint and I wanted to share some thoughts.

What I look for in a tool for documenting an API

The things that are important to me are that any tool:
  1. Allows me to create clear documentation for users
  2. Ensures the documentation is kept up to date (for example by generating docs from code or tests, or from the docs themselves being testable)
  3. Allows generation of a mock server to make it easy for my consumers to test against

Swagger

Swagger generates a Swagger spec of your API from your code. This spec can then be used to produce pretty decent docs, generate mock servers, and more.

The docs generated by Swagger are pretty good, and the format is a common standard now. There are also a number of tools that read and do useful things with a Swagger specification.

The good:
  • Swagger specs are (mainly) generated from code, so pretty easy to keep up to date
  • The swagger endpoint for viewing documentation is particularly good given it takes almost no work to set up
  • There are tools available to generate a mock server from Swagger docs
The bad:
  • Swagger doc generation relies heavily on annotations, which easily fall out of date. No tests will fail if this happens, so there's no way to detect it except someone noticing when they try to use your docs. This is a major downside in my view
  • Swagger docs are quite inflexible. For example, if you want to document a service with more prose to explain context, or organise your docs by something other than route, it isn't possible. That said, you could write a tool that extracted examples from a Swagger specification and put them in Asciidoctor files to embed in a more readable document...
  • Generated mock servers only allow for one response per endpoint, limiting their usefulness in testing
Conclusion on Swagger
It's a really good tool, but many of the shortcomings here stem from the fundamental approach of generating the spec from code. I still recommend it, but I think in the long run other tools will win out.

 

API Blueprint

API Blueprint lets you write API documentation in a set format as a markdown document. This gives you some, but not loads of, flexibility in the format of the docs.

As with Swagger the spec can then be used for other things, including running tests, generating prettier html docs, and running a mock server.

The good:
  • Having made the tests part of the CI pipeline, we now always know if the spec is out of date
  • Dredd (the testing tool) is fairly flexible, as it allows you to run custom hooks before the tests; for example, I used these to do an OAuth login required for accessing our API in dev
  • As with Swagger, there are some very good tools built around the specification, including converters from a Swagger spec to an API Blueprint spec if you just want to try it out
The bad:
  • The next thing on their roadmap is support for multiple requests and responses per endpoint. As with Swagger, this is much needed: at the moment it limits both the tests you can run and the usefulness of the mock server
  • Docs are a little more flexible than Swagger's, but could do with even more flexibility. You tend to need to split them into separate markdown files if you have large responses

Spring RESTDocs

I do agree with the talk posted above: having tests generate documentation snippets to include in a larger document for your API seems a great way to produce documentation. Unfortunately I'm not using Spring, so that tool is unavailable to me. I would love to see a testing tool that isn't coupled to a particular framework and can generate documentation snippets. However, API Blueprint and Swagger could both potentially generate documentation snippets, so perhaps this advantage is overplayed.

Spring RESTDocs does also give much greater flexibility with testing.

On the downside, I can see it being harder to generate a mock server using this approach, though this isn't a high priority for many people.