One of the key differences between a Blockchain and a database is that a Blockchain is an immutable, append-only datastore. In a Blockchain, data items are never overwritten: every change is appended as a new block in the chain. Of course, a database that supported only insert operations would be hard to work with. ProvenDB is itself an append-only database, but it uses versions to emulate a traditional database that supports updates and deletes.

In ProvenDB, a delete operation does not remove a document. Instead it creates a sort of “tombstone” marker, hiding apparently deleted documents from view. Likewise, an update operation does not replace an existing document: rather, it hides the old document and adds a new document in its place.
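The tombstone idea can be illustrated with a small in-memory model. This is only a sketch of the concept described above, not ProvenDB's actual storage format: the class ToyVersionedStore, the Record fields and the method names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Record:
    key: str
    value: Any
    first_version: int                  # version that created this record
    last_version: Optional[int] = None  # last version that can see it; None = still current


class ToyVersionedStore:
    """Hypothetical toy model: records are never removed, only marked as ended."""

    def __init__(self) -> None:
        self.version = 0
        self.records: list = []

    def _current(self, key: str) -> Record:
        for record in self.records:
            if record.key == key and record.last_version is None:
                return record
        raise KeyError(key)

    def insert(self, key: str, value: Any) -> int:
        self.version += 1
        self.records.append(Record(key, value, self.version))
        return self.version

    def delete(self, key: str) -> int:
        old = self._current(key)
        self.version += 1
        old.last_version = self.version - 1  # "tombstone": hidden from now on
        return self.version

    def update(self, key: str, value: Any) -> int:
        old = self._current(key)
        self.version += 1
        old.last_version = self.version - 1  # hide the old document...
        self.records.append(Record(key, value, self.version))  # ...and add a new one
        return self.version

    def find(self, version: Optional[int] = None) -> dict:
        v = self.version if version is None else version
        return {
            r.key: r.value
            for r in self.records
            if r.first_version <= v and (r.last_version is None or v <= r.last_version)
        }


store = ToyVersionedStore()
store.insert("acct", 100)   # creates version 1
store.update("acct", 150)   # creates version 2
store.delete("acct")        # creates version 3
print(store.find())         # current version: the document appears deleted
print(store.find(1))        # version 1 still shows the original document
```

In this model the delete leaves both earlier records in place; only their visibility range changes, which is why older versions remain readable.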

All of these operations create new versions of the database. The old versions can still be viewed, though by default you only ever see the current (most recent) version.

Setting the active version

By default, a database user always sees the documents that exist in the current version. However, a user can choose to view a previous version using the setVersion command. When viewing a historical version, no updates, deletes or inserts are permitted. However, the full range of query capabilities is available.
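The behaviour described above (queries allowed, modifications refused while a historical version is active) can be sketched as a toy session object. Everything here is an assumption made for illustration: ToySession, set_version and ReadOnlyVersionError are hypothetical names, and the sketch does not show the real syntax of the setVersion command.

```python
from typing import Any, Optional


class ReadOnlyVersionError(Exception):
    """Raised when a write is attempted while a historical version is active."""


class ToySession:
    """Hypothetical model: one full snapshot per version, kept only for clarity."""

    def __init__(self) -> None:
        self.current_version = 0
        self.active_version: Optional[int] = None  # None = follow the current version
        self.snapshots = {0: {}}

    def set_version(self, version: Optional[int]) -> None:
        # Pass None to return to the current version.
        if version is not None and version not in self.snapshots:
            raise ValueError(f"unknown version {version}")
        self.active_version = version

    def insert(self, key: str, value: Any) -> int:
        # In this toy, any pinned version is treated as read-only.
        if self.active_version is not None:
            raise ReadOnlyVersionError("cannot modify data while viewing a past version")
        docs = dict(self.snapshots[self.current_version])
        docs[key] = value
        self.current_version += 1
        self.snapshots[self.current_version] = docs
        return self.current_version

    def find(self) -> dict:
        v = self.current_version if self.active_version is None else self.active_version
        return dict(self.snapshots[v])


session = ToySession()
session.insert("a", 1)          # version 1
session.insert("b", 2)          # version 2
session.set_version(1)          # look at the past
print(session.find())           # only "a" is visible
try:
    session.insert("c", 3)      # writes are refused here
except ReadOnlyVersionError as err:
    print("refused:", err)
session.set_version(None)       # back to the current version
print(session.find())
```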

Version compaction

The proliferation of versions can be a concern for applications with high modification rates. While it may be desirable to keep specific versions of documents, it is not always necessary to keep all of them. For this reason, ProvenDB allows for version compaction. Version compaction allows specific versions to be removed from the database without affecting versions before or after the selected range. For instance, an accounting application might compact all but the end-of-day versions for previous months, and might keep only end-of-month versions for previous years. For more information, see Compact.
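One way to picture compaction is as dropping the document records whose entire lifetime falls inside the compacted range, while leaving every record that is visible before or after that range. The sketch below uses that interpretation; it is an assumption for illustration, and the Record layout and the compact and visible functions are hypothetical rather than ProvenDB's implementation.

```python
from typing import Any, NamedTuple, Optional


class Record(NamedTuple):
    key: str
    value: Any
    first_version: int            # version that created the record
    last_version: Optional[int]   # last version that can see it; None = current


def visible(records: list, version: int) -> dict:
    """Documents as seen from the given version."""
    return {
        r.key: r.value
        for r in records
        if r.first_version <= version
        and (r.last_version is None or version <= r.last_version)
    }


def compact(records: list, start: int, end: int) -> list:
    """Drop records that live entirely inside versions start..end (inclusive)."""
    return [
        r for r in records
        if not (
            r.first_version >= start
            and r.last_version is not None
            and r.last_version <= end
        )
    ]


# An account balance that changed in each of five versions.
history = [
    Record("balance", 100, 1, 1),
    Record("balance", 110, 2, 2),
    Record("balance", 120, 3, 3),
    Record("balance", 130, 4, 4),
    Record("balance", 140, 5, None),
]

compacted = compact(history, start=2, end=3)
print(visible(compacted, 1))  # unchanged: version 1 is before the range
print(visible(compacted, 4))  # unchanged: version 4 is after the range
print(visible(compacted, 2))  # empty: version 2 was compacted away
```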

Serializable consistency of versions

Versions in ProvenDB are serializably consistent. Each version of the database is sequentially consistent with the previous and subsequent versions. In practice, this means that each version is logically dependent upon the previous version and that versions cannot be skipped. This has some implications for concurrency, which are discussed in a later section.
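To see why a strict, gap-free version sequence affects concurrency, consider the following sketch. It assumes a single lock guarding the version counter, so that concurrent writers are forced to commit one at a time. ToyVersionClock is a hypothetical name, and this is not a description of how ProvenDB coordinates writers internally.

```python
import threading


class ToyVersionClock:
    """Hypothetical model: every commit takes the next version, with no gaps."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self.version = 0
        self.log = []  # (version, change) pairs in commit order

    def commit(self, change: str) -> int:
        # Holding the lock makes "read previous version, write next version"
        # a single step, so two writers can never claim the same version
        # and no version number is skipped.
        with self._lock:
            self.version += 1
            self.log.append((self.version, change))
            return self.version


clock = ToyVersionClock()
threads = [
    threading.Thread(target=clock.commit, args=(f"write-{i}",))
    for i in range(50)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

versions = [v for v, _ in clock.log]
print(versions == list(range(1, 51)))  # every version from 1 to 50, in order
```

The cost of this ordering is that writers queue behind one another, which is the kind of trade-off a looser consistency model would relax.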

A future version of ProvenDB may implement a looser consistency model that has increased concurrency. However, in this version the emphasis has been on maintaining strict data integrity.

