Decentralised Data with CouchDB

Photo by “My Life Through A Lens” on Unsplash

This is another post related to using CouchDB and PouchDB.

TLDR: PouchDB stores data in your browser's local storage (IndexedDB) on your machine. You can then share that data with others via CouchDB, which can run as an application on another machine on your local network. Each device then gets a copy of the data, and the real-time updating and sync are handled by the CouchDB app. If you go offline you still have all the data, as it is all copied to your machine. If you delete data, sync will remove it from all other devices. The data is on your machine, local first, and you have complete ownership of how it is shared with other devices.

This extends an earlier post on Why PouchDB and CouchDB.

Most systems, be they single player or multiplayer, need a way to synchronise data between machines.

The default way to undertake this is to create a server as the middle layer.

This is a centralised approach.

The issue with this is that the cloud sits at the centre.

Many open source software providers that have this multi-device option will allow you to self-host the server infrastructure so you can maintain control of your data. However, this requires a server and is often quite complex to set up and maintain.

That isn't very delightful at all. The server could also be attacked, meaning you now need to manage security as well.

This is why people often outsource this part to a third-party sync service or cloud service, and why open source projects will often offer a paid hosted version of this component as a way to generate revenue.

So if I want the freedom to control my data across multiple devices or with other people, I either have to become a sysadmin or pay a hosting provider, while also making sure whoever I pay is not going to sell my data down the river at a later point.

None of which sounds like a great solution to me at all.

This is why I was always looking for a way to have local data that could sync with other machines directly, decentralised.

Something like a mesh network or peer to peer (P2P). I also knew this would need to happen in real-time.

I looked briefly at blockchain, P2P technology and decentralised apps (dApps), all of which were rather complex, or at least appeared to be.

I recall thinking about and discussing this at Moz Fest 2017, and you can see my initial exploration explained a little in the blog post about prototypes back in Nov 2017.

At that time I had landed on deepstream, an open source alternative to Google Firebase, which I thought was a winner.

deepstream had a hosted version called hub, and you could run your own self-hosted version if need be.

I was wrong. deepstream hub suddenly stopped, and it turned out the deepstream server appeared to be behind the hub in feature set and ease of use. I just couldn't get it to work with my current code, which had worked perfectly with hub.

I then had a conversation with Matthew Parker, a Perl developer, about what I needed to do and the experiments I had undertaken.

From a search of my blog, we must have had this conversation in mid 2018. I suspect the first chat was over the Signal app, and after various phone and Mac migrations the history can no longer be found.

I just recall Matthew coming back to me at some point after that conversation and saying, "Have you looked at CouchDB?" I hadn't.

So why is this so powerful, and why aren't more people using it?

I don’t know.

What makes this setup so simple is this.

I have a Vue app that stores data in the Vue store and then passes it into PouchDB. PouchDB is a browser-based implementation of CouchDB that keeps its data in the browser's local storage (IndexedDB), so the data is stored on your machine first. You can quit and restart your browser and your data is maintained.

You can then set a remote CouchDB for this data to copy to. You can also use replication to have CouchDB replicate to another CouchDB for resilience.

This remote CouchDB, or remotes, can either be on a server-type machine or just running locally as an app. Either way it would be network accessible.

So if you only want to work with a close group of individuals, you would ensure you are all on the same network, which could also be private / local.

You would then set your remote to the address of the machine running the CouchDB app, which would allow you to sync your data together. This data is then copied to the local storage of each machine via PouchDB. If you decide to delete data, the delete is synced across all machines.
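As a rough sketch of that step, this is roughly what pointing a local database at a remote looks like. PouchDB's `sync` method takes the remote address plus options, and `live: true` and `retry: true` are the options that keep it running in real time and reconnecting after going offline. The `LocalDb` and `SyncHandle` interfaces, the `startSync` helper and the address in the usage note are my own illustrative names, not part of PouchDB. The interfaces only mirror the small part of the API used here, so the sketch stands on its own without the library installed.

```typescript
// Hypothetical minimal shapes mirroring the part of PouchDB's sync API used below.
interface SyncHandle {
  on(event: string, listener: (info: unknown) => void): SyncHandle;
  cancel(): void;
}

interface LocalDb {
  sync(remote: string, options: { live: boolean; retry: boolean }): SyncHandle;
}

// Start a continuous two-way sync between the local database and a remote CouchDB.
// live:  keep syncing as changes happen, rather than a one-off copy
// retry: keep trying to reconnect if the remote is unreachable (e.g. offline)
function startSync(
  db: LocalDb,
  remoteUrl: string,
  onChange: (info: unknown) => void
): SyncHandle {
  return db
    .sync(remoteUrl, { live: true, retry: true })
    .on("change", onChange) // fires when documents are pushed or pulled
    .on("error", (err) => console.error("sync failed", err));
}
```

With the real library you would pass in something like `new PouchDB("notes")` and an address such as `http://192.168.1.20:5984/notes` (both made up for illustration), and call `cancel()` on the returned handle to stop syncing.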

CouchDB also has really good conflict handling and keeps track of changes. Although I initially thought this could be a form of version control, the way CouchDB does versioning is not designed to keep an infinite history. However, where the same data has been edited offline on different devices, you can easily resolve which one is the truth, or even keep both versions.
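To make the conflict handling a little more concrete, here is a simplified model of how a winner can be picked between conflicting revisions. It assumes the rule as I understand it from the CouchDB documentation: a live revision beats a deleted one, then the longer revision history wins, and a tie is broken by comparing the revision ids so every machine arrives at the same answer without talking to the others. The `Revision` type and the `pickWinner` helper are hypothetical names for this sketch, and the real database works on a full revision tree rather than a flat list.

```typescript
// Hypothetical shape: a CouchDB-style revision id looks like "3-9f2c...",
// i.e. a generation number, a dash, then a hash.
interface Revision {
  rev: string;
  deleted?: boolean;
}

function parseRev(rev: string): { generation: number; hash: string } {
  const dash = rev.indexOf("-");
  return { generation: Number(rev.slice(0, dash)), hash: rev.slice(dash + 1) };
}

// True when `a` should win over `b` under the simplified rule above.
function beats(a: Revision, b: Revision): boolean {
  const aDeleted = a.deleted === true;
  const bDeleted = b.deleted === true;
  if (aDeleted !== bDeleted) return !aDeleted; // live beats deleted
  const ra = parseRev(a.rev);
  const rb = parseRev(b.rev);
  if (ra.generation !== rb.generation) return ra.generation > rb.generation;
  return ra.hash > rb.hash; // deterministic tie-break on the hash
}

// Pick the winning revision; the losers are not thrown away here,
// which is what makes "keep both versions" possible.
function pickWinner(revisions: Revision[]): Revision {
  if (revisions.length === 0) throw new Error("no revisions to compare");
  return revisions.reduce((best, current) => (beats(current, best) ? current : best));
}
```

The point of a rule like this is that it needs no coordination: two machines that were both offline can each apply it and agree on the same winner, and your app can still look at the losing revisions and merge or keep them.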

What I found rather fun was that at a similar time I had started reading the Ink and Switch research site, which discussed offline-first work. They had ruled CouchDB out for collaboration, and yet as I was reading it I was literally running CouchDB as a real-time collaborative editor. See the April and May 2019 YouTube clips.

This implementation is simple. If you also encrypt the data between the local PouchDB storage and the remote CouchDB, then even if you decide to use servers with replication, the key that unlocks the actual data stays on your machine. Even if someone were to snapshot the CouchDB data, there would be no way of knowing what the data was.
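A minimal sketch of that idea, assuming each document body is encrypted before it is handed over for syncing and decrypted after it comes back. This is not how any particular PouchDB plugin does it. The `EncryptedDoc` shape and the `deriveKey`, `encryptDoc` and `decryptDoc` helpers are hypothetical, and the sketch uses Node's built-in `crypto` module to stay self-contained, where a browser app would use the Web Crypto API for the same job. Note that the `_id` stays readable so the database can still store and replicate the document.

```typescript
import { createCipheriv, createDecipheriv, randomBytes, scryptSync } from "node:crypto";

// Hypothetical shape of what would actually be stored and synced.
interface EncryptedDoc {
  _id: string; // left in the clear so the database can address the document
  iv: string; // base64, unique per encryption
  tag: string; // base64 authentication tag from AES-GCM
  payload: string; // base64 ciphertext of the JSON body
}

// Derive a 32-byte key from a passphrase; the key never leaves this machine.
function deriveKey(passphrase: string, salt: Buffer): Buffer {
  return scryptSync(passphrase, salt, 32);
}

function encryptDoc(id: string, body: object, key: Buffer): EncryptedDoc {
  const iv = randomBytes(12); // fresh nonce for every document version
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const payload = Buffer.concat([
    cipher.update(JSON.stringify(body), "utf8"),
    cipher.final(),
  ]);
  return {
    _id: id,
    iv: iv.toString("base64"),
    tag: cipher.getAuthTag().toString("base64"),
    payload: payload.toString("base64"),
  };
}

// Throws if the key is wrong or the stored data has been tampered with.
function decryptDoc(doc: EncryptedDoc, key: Buffer): unknown {
  const decipher = createDecipheriv("aes-256-gcm", key, Buffer.from(doc.iv, "base64"));
  decipher.setAuthTag(Buffer.from(doc.tag, "base64"));
  const plain = Buffer.concat([
    decipher.update(Buffer.from(doc.payload, "base64")),
    decipher.final(),
  ]);
  return JSON.parse(plain.toString("utf8"));
}
```

Anyone who snapshots the remote database only sees the id and the base64 fields. The trade-off is that the server can no longer query or index the contents, so any searching has to happen on your own machine after decrypting.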