Elastic Git

Elastic Git is a library we've developed to provide a single interface when working with data that's distributed with Git and indexed in Elasticsearch. It is built around Mozilla's elasticutils library.

Elastic Git (EG from now on) is built around the concept of declarative models.

Here is what a model looks like.

We declare a model as a thing we want to work with and then define the attributes it has. In this case, those are a name as a piece of text and an age as an integer.
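As a rough illustration of the declarative idea, here is a minimal standard-library sketch. It is not EG's own code: the `Field`, `TextField`, `IntegerField` and `Model` classes below are stand-ins defined in the snippet itself, and the type checking they do is an assumption about what a declarative field might reasonably enforce.

```python
class Field:
    """Stand-in field: stores a description and type-checks assigned values."""
    python_type = object

    def __init__(self, doc):
        self.doc = doc

    def __set_name__(self, owner, name):
        # Remember the attribute name this field was declared under.
        self.name = name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        return instance.__dict__.get(self.name)

    def __set__(self, instance, value):
        if not isinstance(value, self.python_type):
            raise TypeError(
                f"{self.name} expects {self.python_type.__name__}, "
                f"got {type(value).__name__}"
            )
        instance.__dict__[self.name] = value


class TextField(Field):
    python_type = str


class IntegerField(Field):
    python_type = int


class Model:
    """Stand-in base class: accepts a dict and only sets declared fields."""

    def __init__(self, data):
        for key, value in data.items():
            if not isinstance(getattr(type(self), key, None), Field):
                raise AttributeError(f"unknown field: {key}")
            setattr(self, key, value)

    def to_dict(self):
        # Collect the fields declared directly on this model class.
        names = [n for n, v in vars(type(self)).items() if isinstance(v, Field)]
        return {n: getattr(self, n) for n in names}


class Person(Model):
    name = TextField("The person's name")
    age = IntegerField("The person's age")


person = Person({"name": "Simon", "age": 10})
print(person.to_dict())
```

The point of the sketch is that the class body only states what a `Person` is made of; validation and serialisation live in the shared base classes.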

The code for all this is available online, and the samples in this document all come from there.

To get started with EG, we first need an environment up and running. The following code will set that up for you:

Note that you also need to have Elasticsearch running.

The workspace

The workspace is an object provided by EG that gives you a single point of access to your data in Git and Elasticsearch.

API documentation for it is available online, but here is a simple example of creating a repository called the_repository and saving an instance of a model in it:

What happens here is that, under the hood, we generate a JSON file that represents the model and its data. Here is what it looks like in the repository we've created:

The UUID is automatically generated and is the identifier with which we refer to individual bits of data, both in applications and in Git and Elasticsearch.

You will notice that it also has some extra metadata under the _version key, which provides some hints about which package, and which version of that package, generated the JSON data.
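To make the shape of such a file concrete, here is a small standard-library sketch of serialising a record with a generated UUID and a `_version` block. It is an illustration only: the `serialize` helper, the key names inside `_version` (`package`, `package_version`) and the values passed in are hypothetical and are not taken from EG's actual output.

```python
import json
import uuid


def serialize(data, package, package_version):
    """Return (identifier, JSON text) for a plain dict of model data."""
    record = dict(data)
    # Generate an identifier only if the record does not already carry one.
    record.setdefault("uuid", uuid.uuid4().hex)
    # Hypothetical metadata layout: which package wrote the file, and its version.
    record["_version"] = {
        "package": package,
        "package_version": package_version,
    }
    return record["uuid"], json.dumps(record, indent=2, sort_keys=True)


identifier, text = serialize(
    {"name": "Simon", "age": 10}, "example-package", "0.0.1"
)
print(text)
```

Because the identifier is stored inside the record, the same value can be used as the file name in the repository and as the document id in the search index.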

Querying the workspace

Now that we have the ability to save things to the workspace, we can query it as well. We've built a wrapper around elasticutils that provides the same interface and query API.

Running that will print the following:

Similarly we can generate & save more people entries:

And then query them:

Which would print out the following:

Elasticutils queries return MappingTypes, which are objects reflecting the data that was indexed in Elasticsearch. EG maintains a mapping between the objects in Elasticsearch and the objects in Git, so you can ask a result from Elasticsearch to return the corresponding object from Git.

When running this you will see the following:

What get_object() returns is the instance of the Person class that was saved with workspace.save().
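The idea of going from a search hit back to the stored object can be sketched with plain Python. This is a simplified stand-in, not how EG implements it: the `SearchHit` class is hypothetical, a dictionary keyed by UUID plays the role of the Git repository, and a list of hits plays the role of the Elasticsearch index.

```python
from dataclasses import dataclass


@dataclass
class Person:
    uuid: str
    name: str
    age: int


class SearchHit:
    """Hypothetical search result: indexed fields plus a way back to storage."""

    def __init__(self, fields, store):
        self.fields = fields
        self._store = store

    def get_object(self):
        # The shared UUID is what links the indexed copy to the stored object.
        return self._store[self.fields["uuid"]]


# The dict stands in for the Git repository, the list for the search index.
simon = Person(uuid="1234", name="Simon", age=10)
store = {simon.uuid: simon}
index = [SearchHit({"uuid": simon.uuid, "name": simon.name, "age": simon.age}, store)]

hit = index[0]
print(hit.fields["name"], hit.get_object())
```

The hit only needs to carry the identifier; everything else can be looked up from the authoritative copy when it is asked for.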

More complex querying

Elasticutils allows for fairly complex query construction, and, if all else fails, allows you to provide raw queries as input.

Here is an example of an OR query using the F() filter helper:

When run, that prints out the people whose age is 10 or whose name is Simon:
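For readers who want to see the OR semantics in isolation, here is a self-contained sketch that composes two conditions with the `|` operator and applies them to an in-memory list. The `Match` class is a hypothetical stand-in written for this example, not elasticutils' F(), and the three people are made-up sample data.

```python
class Match:
    """Hypothetical filter: true when every given field equals its value."""

    def __init__(self, **conditions):
        self._test = lambda record: all(
            record.get(key) == value for key, value in conditions.items()
        )

    def __or__(self, other):
        # Combine two filters: a record passes if either side accepts it.
        combined = Match()
        combined._test = lambda record: self(record) or other(record)
        return combined

    def __call__(self, record):
        return self._test(record)


# Made-up sample data for the illustration.
people = [
    {"name": "Simon", "age": 40},
    {"name": "Alice", "age": 10},
    {"name": "Bob", "age": 30},
]

age_or_name = Match(age=10) | Match(name="Simon")
matches = [person["name"] for person in people if age_or_name(person)]
print(matches)
```

Here Simon matches on name and Alice on age, while Bob satisfies neither condition and is left out.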

At this point I would recommend the following reading: