From Runa to Zolo Labs in 54 months

Runa


I’ve been at Runa for about four and a half years: I was there pretty much from the beginning, and for the past couple of years, I ran the tech team as the CTO and VP of Engineering. We’ve built an amazing product with quite a small team of engineers. We picked tools that gave us leverage, and I think that, in large part, these choices let us stay small and lean.

We were among the first companies in the world to push a Clojure service into production… it was the fall of 2008, and Clojure was still pre-version-1.0. In those days, it still used Subversion for version control… and I remember having to tag commit #783 or something… so we’d be sure some unknown incompatibility wouldn’t derail us. We finally did an official release of that in Jan of 2009. Fun times :-)

Since then, we’ve grown our code-base to tens of thousands of lines of Clojure code, and have worked with HBase + Hive + MapReduce, Cascalog, RabbitMQ, Redis, MonetDB, Hazelcast, and loads of AWS services. We coded several state-of-the-art machine-learning algorithms for predictive modeling. Rapunzel, our Clojure DSL, is used by several non-technical folks every day to support our merchants. It’s really neat stuff.

And the code is blazingly fast. For instance, we’re one of the only 3rd-party services deployed inside eBay’s data-centers, and are called in real-time for their live pages. To be specific, we’re called about 100 to 200 million times a day, with spikes of over a billion, and have an average response time of 640 microseconds. This service makes statistical predictions about [redacted], and runs over 800,000 models concurrently. Like I said – fun times…

More than anything, I worked with some incredibly talented people over these past few years… not just here in the Bay Area, but also in Bangalore, India. I think the work we pulled together as a team, and the relationships I’ve made with these people are what I’m most proud of…

Zolo Labs
This past month, though, was my last as Runa’s CTO. I’m still engaged as the chief technologist, but I’m no longer a full-time employee. Why did I leave? Well… it’s been a long time, and I’ve been itching to do this Zolodeck thing for a while. And I see good things in the future for Runa, and I’m certain these things will happen without me as well… the team is rock solid… which gives me confidence to start my own adventure!

Some of you know what we’re building – our first product is called Zolodeck – but there’s a deeper philosophy behind that choice. I’ll delve into that in another post soon, but for now I’ll just say we want to improve how people collaborate and converse with each other, whether with their friends and family, or when they’re at work.

And for the techie readers of this blog, yes, we’re using a lot of cutting-edge technology to build out this product including Clojure, Datomic, Storm, and ClojureScript. Here’s a link to a talk I did at Strange Loop last year, discussing some elements of our stack.

It’s an exciting time for me… I’ve been waiting to be an entrepreneur for nearly 12 years… and it’s finally here! Boy, do I feel anxious and excited. Anxcited?

Why Datomic?

Many of you know we’re using Datomic for all our storage needs for Zolodeck. It’s an extremely new database (not even version 1.0 yet), and is not open-source. So why would we want to base our startup on something like it, especially when we have to pay for it? I’ve been asked this question a number of times, so I figured I’d blog about my reasons:

  • I’m an unabashed fan of Clojure and Rich Hickey
  • I’ve always believed that databases (and the insane number of optimization options) could be simpler
  • We get basically unlimited read scalability (by upping read throughput in Amazon DynamoDB)
  • Automatic built-in caching (no more code to use memcached; the cache makes the DB effectively local)
  • Datalog as the query language (declarative logic programming, with no explicit joins)
  • Datalog is extensible through user-defined functions
  • Full-text search (via Lucene) is built right in
  • Query engine on client-side, so no danger from long-running or computation-heavy queries
  • Immutable data – audits all versions of everything automatically
  • “As of” queries and “time-window” queries are possible
  • Minimal schema (think RDF triples, except Datomic tuples also include the notion of time)
  • Supports cardinality out of the box (has-many or has-one)
  • These reference relationships are bi-directional, so you can traverse the relationship graph in either direction
  • Transactions are first-class: they can be queried or “subscribed to” (for db-event-driven designs)
  • Transactions can be annotated (with custom meta-data)
  • Elastic
  • Write scaling without sharding (hundreds of thousands of facts (tuples) per second)
  • Supports “speculative” transactions that don’t actually persist to datastore
  • Out of the box support for in-memory version (great for unit-testing)
  • All this, and not even v1.0
  • It’s a particularly good fit with Clojure (and with Storm)

This is a long list, but perhaps it begins to explain why Datomic is such an amazing step forward. Ping me with questions if you have ‘em! And as far as the last point goes, I’ve talked about our technology choices and how they fit in with each other at the Strange Loop conference last year. Here’s a video of that talk.
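To make a few of these points concrete (the in-memory version, Datalog queries, and speculative transactions), here is a minimal sketch. It assumes the Datomic peer library is on the classpath; the database name and the sample names are made up for illustration, and the schema is just the :user/first-name attribute. Treat it as a rough illustration, not as how Zolodeck is actually set up.

```clojure
;; Assumption: the Datomic peer library is available on the classpath.
(require '[datomic.api :as d])

;; Hypothetical in-memory database name, handy for experiments and unit tests.
(def uri "datomic:mem://zolodeck-sketch")
(d/create-database uri)
(def conn (d/connect uri))

;; Minimal schema: one string attribute, cardinality one (has-one).
@(d/transact conn [{:db/id                 (d/tempid :db.part/db)
                    :db/ident              :user/first-name
                    :db/valueType          :db.type/string
                    :db/cardinality        :db.cardinality/one
                    :db.install/_attribute :db.part/db}])

;; A real transaction: this fact is persisted.
@(d/transact conn [{:db/id           (d/tempid :db.part/user)
                    :user/first-name "Ada"}])

;; A Datalog query is plain data; there are no explicit joins to write.
(def names-query '[:find ?name :where [_ :user/first-name ?name]])

(def db (d/db conn))
(def persisted-names (d/q names-query db))

;; A speculative transaction: d/with returns a database value that includes
;; the new fact, but nothing is written to the datastore.
(def speculative-db
  (:db-after (d/with db [{:db/id           (d/tempid :db.part/user)
                          :user/first-name "Grace"}])))
(def speculative-names (d/q names-query speculative-db))
```

Querying speculative-db sees both names, while a fresh (d/db conn) still sees only the persisted one.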

Make it right, then make it fast

 

Alan Perlis once said: A Lisp programmer knows the value of everything, but the cost of nothing.

I re-discovered this maxim this past week. 

As many of you may know, we’re using Clojure, Datomic, and Storm to build Zolodeck. (I’ve described my ideal tech stack here). I’m quite excited about the leverage these technologies can provide. And I’m a big believer in getting something to work whichever way I can, as fast as I can, and then worrying about performance and so on. I never want to fall under the evil of premature optimization and all that… In fact, on this project, I keep telling my colleague (and everyone else who listens) how awesome (and fast) Datomic is, and how its built-in cache will make us stop worrying about database calls. 

A function I wrote (that does some fairly involved computation involving relationship graphs and so on) was taking 910 seconds to complete. Yes, more than 15 minutes. Of course, I immediately suspected the database calls, thinking my enthusiasm was somehow misplaced or that I didn’t really understand the costs. As it turned out, Datomic is plenty fast. And my algorithm was naive and basically sucked… I had knowingly glossed over a lot of functions that weren’t exactly performant, and when called within an intensive set of tight loops, they added up fast.

After profiling with YourKit, I was able to bring down the time to about 900 ms. At nearly a second, this is still quite an expensive call, but certainly less so than when it was ~1000x slower earlier.

I relearnt that tools are great and can help in many ways, just not in making up for my stupidity :-)

datomic – demonic transforms – to do or not to do?

While working with the new Datomic database, we’re having to get used to using ns-qualified keywords in our maps. To make day-1 go faster, I wrote these versions of the demonic insert and load functions:

and also:

The idea was to use them as follows (for instance):

where datomic-key->regular-key is something like:

The idea behind all this was that a map like:

Could become something like:

This initially seemed to be more usable and familiar. However, we’ve since moved away from it, seeing some value in having keys that are indeed namespaced to the entity they belong to (for example, the user entity has keys called :user/first-name and :user/last-name).
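For readers who want a feel for the kind of key transformation discussed here, this is a minimal sketch in plain Clojure. The function names strip-ns-sketch and add-ns-sketch are hypothetical and are not part of demonic; the sample map is made up too.

```clojure
;; Hypothetical helpers for illustration only; not demonic's functions.

(defn strip-ns-sketch
  "Turns ns-qualified keys such as :user/first-name into plain keys
  such as :first-name. Values are left untouched."
  [m]
  (into {} (map (fn [[k v]] [(keyword (name k)) v]) m)))

(defn add-ns-sketch
  "Qualifies every key in m with the given entity namespace,
  e.g. :first-name becomes :user/first-name for entity-ns :user."
  [entity-ns m]
  (into {} (map (fn [[k v]] [(keyword (name entity-ns) (name k)) v]) m)))

(def datomic-style {:user/first-name "Ada" :user/last-name "Lovelace"})
(def regular-style (strip-ns-sketch datomic-style))
(def round-trip    (add-ns-sketch :user regular-style))
```

Note that stripping is lossy: once the namespace is gone, keys from different entities can collide, which is part of why keeping the namespaced keys is attractive.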

Just thought we’d share.

demonic v0.1 – utilities for Datomic

I’ve been writing some code to work with Datomic, and thought some of this might be useful to others. So I’ve put it into a small utility project called demonic. There are a few concepts in demonic:

The first is that of a demarcation. A demarcation is a kind of dynamic binding within which all datomic transactions are held until the end. When the scope exits, all the datomic transactions are committed at once. This is useful in a scenario where you’re updating multiple entities in a single request (say), and you want them all to happen or you want them all to roll back. Here’s how you’d use it:

The good thing is that this makes it easy to write cleaner tests. There are two macros that help, namely demonictest and demonic-testing, which are respectively like deftest and testing. You could use them like this:

or:

As you can see, there are several CRUD operations provided by demonic, and in order to get the above benefits (of demarcations and testability), you need to go only through these functions (and not call the datomic functions directly). Here are these basic functions:

  • demonic/insert – accepts a map; if it doesn’t contain a :db/id key, an insertion occurs, else an update occurs
  • demonic/load-entity – accepts a datomic entity id, and loads the associated entity
  • demonic/delete – accepts a datomic entity id, and deletes the associated entity (and all references to it)
  • demonic/run-query – accepts a datomic query, and any other data-sources, and executes it against the current snapshot of the db (within the demarcation)
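The demarcation idea itself (hold transactions in a dynamic binding, then commit them all at once when the scope exits) can be sketched in plain Clojure. This is only an illustration of the concept under my own assumptions: the names *pending-txs*, queue-tx! and with-demarcation-sketch are hypothetical, it is not demonic’s implementation, and the commit function here just records what it was given instead of talking to Datomic.

```clojure
;; Hypothetical names throughout; a concept sketch, not demonic's code.

(def ^:dynamic *pending-txs* nil)

(defn queue-tx!
  "Holds a transaction until the enclosing demarcation ends."
  [tx]
  (swap! *pending-txs* conj tx))

(defmacro with-demarcation-sketch
  "Runs body with a fresh queue. If body completes normally, all queued
  transactions are passed to commit-fn in a single call. If body throws,
  commit-fn is never called, so nothing is committed."
  [commit-fn & body]
  `(binding [*pending-txs* (atom [])]
     (let [result# (do ~@body)]
       (~commit-fn @*pending-txs*)
       result#)))

;; Stand-in for a real commit: just remember each batch we were given.
(def committed (atom []))

(with-demarcation-sketch #(swap! committed conj %)
  (queue-tx! {:user/first-name "Ada"})
  (queue-tx! {:user/first-name "Grace"}))
```

Because the commit happens in one place at the end of the scope, a test helper could swap the commit function for one that discards the batch, which is one way to get the all-or-nothing and testability benefits described above.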

By the way, there’s another helper function for when you’re building web-apps using Compojure:

  • demonic/wrap-demarcation – sets up a demonic demarcation for the web-request

So these are a few things I’ve got in there right now. I’m also working on making it easy to create and maintain datomic schemas. I’ll write about that another time, once it is a bit more baked.