Dolt - A Version Controlled Database
Written by Nikos Vaggalis   
Monday, 29 January 2024

A database that you can fork and clone, branch and merge, push and pull, just like a Git repository. What is the use case for it?

First of all, Dolt, written in Go, is not a fork of MySQL, but it acts as a drop-in replacement that adheres to the MySQL protocol, with versioning on top.

The gist is that a user who didn't use any versioning features wouldn't be able to tell that they aren't using a MySQL database. And while it started out with MySQL, the Dolt team has recently also released Doltgres, which does the same for Postgres.

So what sets Dolt apart from the rest?

Versioning. If you are familiar with Git then you'll be right at home. Just as Git does branching, merging, diffing and so on, you can do the same with your Dolt database. You can, for instance, fork the database and hand it over to someone else for development, debugging or running analytics; they can make changes, even add new tables or data, and give it back for you to merge!
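
As a rough illustration of that workflow from SQL, here is a minimal sketch. It uses a branch rather than a fork, and it assumes Dolt's DOLT_CHECKOUT, DOLT_COMMIT and DOLT_MERGE stored procedures, a default branch named 'main', and a cursor from any Python DB-API MySQL driver connected to a Dolt server. The helper name develop_on_branch is hypothetical.

```python
def develop_on_branch(cursor, branch, change_sql, message):
    """Apply one change on a new branch, commit it, and merge it into main."""
    # Create the branch and switch this SQL session to it.
    cursor.execute("CALL DOLT_CHECKOUT('-b', %s)", (branch,))
    # The change (schema or data) lands on the new branch only.
    cursor.execute(change_sql)
    # '-A' stages every change, new tables included, before committing.
    cursor.execute("CALL DOLT_COMMIT('-A', '-m', %s)", (message,))
    # Switch back to main (assumed default branch name) and merge the work in.
    cursor.execute("CALL DOLT_CHECKOUT('main')")
    cursor.execute("CALL DOLT_MERGE(%s)", (branch,))
```

Until the final merge, sessions on main keep seeing the old data, which is what makes handing a branch to someone else safe.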

Made a mistake and dropped a table? No problem, roll it back; there's no need to retrieve it from tape backups.
The mind-blowing capability, though, is that since all the data is versioned, you can run queries on a snapshot or commit (as they say Git-wise), or join it with the current working set. This is useful, for example, if you want to compare the state of the data last Monday with what it is today. In essence Dolt does both schema AND data versioning.
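
A minimal sketch of such a comparison, assuming Dolt's AS OF clause, an 'employees' table, and a DB-API driver that substitutes parameters client-side (as PyMySQL does). The helper changed_since is hypothetical, and the comparison is done in Python rather than with a SQL join to keep the example short.

```python
def changed_since(cursor, ref):
    """Compare 'employees' at an older revision with the current working set."""
    # Rows as they were at the given revision, e.g. 'HEAD~1' or a commit hash.
    cursor.execute("SELECT * FROM employees AS OF %s", (ref,))
    before = set(cursor.fetchall())
    # Rows as they are right now.
    cursor.execute("SELECT * FROM employees")
    now = set(cursor.fetchall())
    # Rows only in the old snapshot were removed or changed since;
    # rows only in the current set were added or changed since.
    return {"removed": before - now, "added": now - before}
```

A modified row shows up in both sets, once with its old values and once with its new ones.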

That's two use cases, but there are more. In PostgresML - Bring Your ML Workload To The Database, I described how that extension can be used to train and apply ML models on your live OLTP data. That might not be the most efficient approach, however, so the recommendation is to offload training (or inference) to secondary PostgreSQL servers. Dolt has this covered too. As a matter of fact, Dolt's main use is as an OLTP database backing an existing application. By leveraging binlog replication, changes are transferred from the primary to the replica as close to real time as possible, so you don't exhaust your production workhorse, and you get all the versioning benefits on top.

Another use case, which I find very intriguing, is auditing. Say a user inserts a row and then another user deletes it. Who deleted it? Since revisions of the data are always available and every change gets recorded, you can easily automate auditing without having to code explicitly for it, for example by writing triggers and keeping a separate table for monitoring the changes.
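
A minimal sketch of such an audit query, assuming Dolt's dolt_log and dolt_diff_<table> system tables and an 'employees' table whose primary key column is named 'id' (so the diff table exposes it as from_id). The helper who_deleted is hypothetical. Attribution here is per commit, so it identifies the deleting user only if changes are committed under that user's name.

```python
# Join the row-level diff of 'employees' with the commit log to find
# the commit (and committer) that removed a given row.
WHO_DELETED = """
SELECT l.committer, l.date, l.message
FROM dolt_diff_employees AS d
JOIN dolt_log AS l ON l.commit_hash = d.to_commit
WHERE d.diff_type = 'removed' AND d.from_id = %s
"""

def who_deleted(cursor, employee_id):
    """Return (committer, date, message) for each commit that deleted the row."""
    cursor.execute(WHO_DELETED, (employee_id,))
    return cursor.fetchall()
```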

In summary, there are four key reasons to use database versioning:

  • Minimize downtime
  • Improve developer productivity
  • Reproducibility
  • Expose Versioning to Your Application

With Dolt you have all that effortlessly and at your fingertips. In fact, if you check Dolt's CLI you'll discover that it imitates Git commands:

  • init - Create an empty Dolt data repository.
  • status - Show the working tree status.
  • add - Add table changes to the list of staged table changes.
  • diff - Diff a table.
  • reset - Remove table changes from the list of staged table changes.
  • clean - Remove untracked tables from working set.
  • commit - Record changes to the repository.
  • branch - Create, list, edit, delete branches.
  • checkout - Checkout a branch or overwrite a table from HEAD.
  • merge - Merge a branch.
  • conflicts - Commands for viewing and resolving merge conflicts.
  • cherry-pick - Apply the changes introduced by an existing commit.
  • revert - Undo the changes introduced in a commit.
  • clone - Clone from a remote data repository.
  • fetch - Update the database from a remote data repository.
  • pull - Fetch from a dolt remote data repository and merge.
  • push - Push to a dolt remote.


Dolt for MySQL is the production-ready product. However, last November the team also launched Doltgres, which is still in Alpha.
The reasoning is that in 2019, when Dolt was conceived, MySQL was the most popular SQL flavor. But over the past 5 years the tide has shifted towards Postgres, especially among young companies, which are Dolt's target market, so customers have been clamoring for a Postgres version of Dolt.

DoltgreSQL strips out some of the Git for Data pieces like the CLI and builds directly for the version controlled database use case. With Doltgres, you can do everything with SQL, a familiar experience for Postgres users.

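Enough talk, let's see some code. To illustrate how you would make a commit, after you create the tables 'teams', 'employees' and 'employees_teams', you stage them with the DOLT_ADD stored procedure and record them with DOLT_COMMIT, both called from plain SQL.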
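
A minimal sketch of that step, assuming Dolt's DOLT_ADD and DOLT_COMMIT stored procedures and a cursor from any Python DB-API MySQL driver connected to a Dolt server. The helper name commit_tables and the commit message are hypothetical.

```python
def commit_tables(cursor, tables, message):
    """Stage the named tables and record them in a new Dolt commit."""
    # One placeholder per table name: CALL DOLT_ADD(%s, %s, %s)
    placeholders = ", ".join(["%s"] * len(tables))
    cursor.execute(f"CALL DOLT_ADD({placeholders})", tuple(tables))
    # Equivalent in spirit to `git commit -m "<message>"`.
    cursor.execute("CALL DOLT_COMMIT('-m', %s)", (message,))

# Example call, given an open cursor:
# commit_tables(cursor, ["teams", "employees", "employees_teams"],
#               "Created initial schema")
```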

That simple.

In conclusion, Dolt's mission is visionary. It has done something that no one else has: it streamlines many actions that once required a lot of effort to perform, and leaves room for some that are still to be invented.

More Information

DoltHub
Dolt GitHub
Doltgres

Related Articles

PostgresML - Bring Your ML Workload To The Database


Last Updated ( Monday, 29 January 2024 )