How I extracted core functionality from Compoxure into two reusable libraries - Parxer for HTML parsing and transformation, and Reliable Get for fault-tolerant service calls.

Modularizing Compoxure: Parxer and Reliable Get

When things don’t go to plan … Copyright: https://www.flickr.com/photos/clintjcl/7583651542/

I’ve been talking at a couple of conferences lately about the approach we are taking at TES Global around declarative service composition (this will explain it).

What I’ve noticed is that people tend to feel a little uncomfortable about putting a proxy like this in front of their application, and let’s be honest, I completely understand that fear: this isn’t just a library, it sits between you and your customer and interacts with every piece of HTML your application produces! Dejan outlined some ideas and concerns in a post here.

I won’t talk about how you can try it in small slices (e.g. maybe on just a small part of your application), but rather about some of the work I’ve been doing recently to take some of the ideas and concepts from Compoxure and modularise them, so that both we and others can use them without having to buy the whole farm.

So, the core of Compoxure has been extracted into two new libraries:

Parxer

GitHub: https://github.com/tes/parxer

Parxer extracts the core HTML parsing logic; think of it as a non-streaming version of Trumpet.

Realistically I don’t expect many people to find a use for this outside of our use case, as there are probably libraries with ‘better’ APIs out there if you just want to do HTML transformation. However, if what you want is HTML transformation where the transforms involve asynchronous functions (e.g. calls to other services) and you want those to all run in parallel, then this might just be useful for you.

It’s non-streaming for a number of reasons, not the least that I can’t yet get my head around how to do the parsing and interleaving of parallel requests in a streams mindset, so this model is less complex for me to reason about and hence get working. I also find it very fast, despite the fact that it is technically buffering everything vs just sending it as soon as it gets it.
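To make the buffered model concrete, here is a rough sketch of the idea, not Parxer's actual implementation. It assumes a made-up `<!--fragment:NAME-->` marker syntax (Parxer itself works on `cx-` attributes), and `fetchFragment` and `renderFragments` are hypothetical names standing in for real service calls and the real parser. The whole document is held in memory, every asynchronous lookup is started at once, and the results are stitched back in only when all of them have returned.

```javascript
'use strict';

// Hypothetical stand-in for a call to another service.
// A real version would make an HTTP request instead.
function fetchFragment(name) {
  return new Promise(function (resolve) {
    // Resolve on a later tick to mimic network latency.
    setTimeout(function () {
      resolve('<p>' + name + ' content</p>');
    }, 10);
  });
}

// Replace every <!--fragment:NAME--> marker in a fully buffered document.
// The marker syntax is an assumption made for this sketch only.
function renderFragments(html, fetcher) {
  var marker = /<!--fragment:([\w-]+)-->/g;
  var names = [];
  var match;

  // Pass 1: collect fragment names in document order.
  while ((match = marker.exec(html)) !== null) {
    names.push(match[1]);
  }

  // Start every lookup immediately so they run in parallel.
  var pending = names.map(function (name) {
    return fetcher(name);
  });

  // Pass 2: Promise.all preserves order, so the nth result
  // belongs to the nth marker in the document.
  return Promise.all(pending).then(function (bodies) {
    var index = 0;
    return html.replace(marker, function () {
      return bodies[index++];
    });
  });
}
```

Because nothing is written out until every lookup has finished, the total wait is roughly that of the slowest service rather than the sum of all of them, which is the trade-off against streaming described above.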

It’s been refactored with a simple plugin architecture, so you can easily add your own handlers, with the defaults that Compoxure uses (url, bundles and test) all built in as core.

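// Assumes `parxer` has already been required; '../Plugins' is a path
// relative to the Parxer source tree, so adjust it for your own project.
var cheerio = require('cheerio');

var input = "<html><div id='test' cx-test='{{environment:name}}'></div></html>";
parxer({
  plugins: [ require('../Plugins').Test ],
  variables: { 'environment:name': 'test' }
}, input, function(err, data) {
  if (err) { throw err; }
  var $ = cheerio.load(data);
  console.log($('#test').text() === 'test'); // true
});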

There are a lot more examples in the tests, and of course it is used within Compoxure so you can also look at that source code.

Reliable Get

GitHub: https://github.com/tes/reliable-get

The second part is the code that interacts with the micro-services themselves, ‘reliable-get’.

Think of this as a wrapper around request that ensures that if the service at the other end isn’t reliable, your application won’t die along with it.

var ReliableGet = require('reliable-get');
var config = {
  cache: {
    engine: 'redis'
  },
  circuitbreaker: {
    includePath: true
  }
};
var rg = new ReliableGet(config);
rg.get({ url: 'http://www.google.com', cacheKey: 'google', cacheTTL: 60000, timeout: 500 }, function(err, response) {
  console.log(response.content);
});

So once you have created an instance of reliable get, you can fire off GET requests to as many dodgy services as you like, and this will give you a reasonable amount of protection:

If your service handles more load than the 3rd party it depends on, reliable-get will cache responses from that 3rd party (this is completely configurable, and it will honour cache response headers from the service as well).
If the service has ever responded with a successful response, and then later dies, reliable-get will serve the last successful (‘stale’) response to ensure your app continues to appear like it is functioning as normal.
If the 3rd party service dies or starts having trouble, a circuit breaker can open to reduce the load on it. In this model reliable-get will continue to serve the ‘stale’ content if it has it. This avoids hammering a struggling service into the ground while you try to fix it.
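The three protections above can be sketched in a few dozen lines. This is an illustration of the pattern only, not reliable-get's code: `createReliableGetter`, the `fetcher` callback and the option names `ttl`, `maxFailures`, `coolOff` and `now` are all hypothetical, and the cache is a plain in-memory object rather than Redis.

```javascript
'use strict';

// Sketch only: every name and option here is hypothetical,
// chosen for illustration rather than taken from reliable-get.
function createReliableGetter(fetcher, options) {
  var ttl = options.ttl;                 // ms a cached response counts as fresh
  var maxFailures = options.maxFailures; // consecutive failures before opening
  var coolOff = options.coolOff;         // ms the circuit stays open
  var now = options.now || Date.now;     // injectable clock, handy for tests
  var cache = {};                        // key -> { content, storedAt }
  var failures = 0;
  var openedAt = null;

  // Serve the last good response if there is one, otherwise pass the error on.
  function serveStale(key, err, callback) {
    var entry = cache[key];
    if (entry) {
      return callback(null, { content: entry.content, stale: true });
    }
    callback(err);
  }

  return function get(key, callback) {
    var entry = cache[key];

    // 1. Fresh cache hit: the backend is not contacted at all.
    if (entry && now() - entry.storedAt < ttl) {
      return callback(null, { content: entry.content, stale: false });
    }

    // 2. Circuit open: skip the backend until the cool-off has passed.
    if (openedAt !== null && now() - openedAt < coolOff) {
      return serveStale(key, new Error('circuit open'), callback);
    }

    // 3. Try the backend; fall back to stale content if it fails.
    fetcher(key, function (err, content) {
      if (err) {
        failures += 1;
        if (failures >= maxFailures) { openedAt = now(); }
        return serveStale(key, err, callback);
      }
      failures = 0;
      openedAt = null;
      cache[key] = { content: content, storedAt: now() };
      callback(null, { content: content, stale: false });
    });
  };
}
```

In this sketch the caller can tell a stale answer from a fresh one via the `stale` flag, and an error only surfaces when there has never been a successful response for that key.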

Modularity

The other useful thing about this process is that it has made the Compoxure code base much smaller and simpler to understand. I’ve found and fixed a few bugs in all three projects, overall test coverage is much higher, and I think that getting others to add features to any one of the three will likely be easier (let’s see!).

I think this method actually worked really well in this instance:

Write code to get it working.
Refactor that code to make it easier to work with, increasing test coverage as you go.
Once refactored, extract out modular code into separate libraries. Increase coverage on those libraries.

The danger of starting with lots of modules is that you don’t really understand the requirements for those smaller modules until you compose them together into something more complex.

This way of thinking also has a lot of similarities to the Micro Services - How to Start? question, e.g. can you only properly identify the right granularity of services if you have previously built something larger that you are now teasing apart?
