Technical · February 13, 2026 · Steven Melendez

Deep Dive: CRDTs in PrivStack for Bulletproof Sync

Conflict-free Replicated Data Types (CRDTs) are the secret sauce behind modern collaborative applications. They allow multiple users to edit the same data concurrently without messy conflicts, ensuring that all users eventually see the same result. In PrivStack, we've invested heavily in a robust, memory-safe, and rigorously tested CRDT implementation in Rust to power our sync engine. This post takes a technical deep dive into our approach.


Core CRDT Implementations

Our CRDT library, privstack-crdt, provides a suite of foundational data types, each designed for a specific purpose.

VectorClock

At the heart of our causality tracking is the VectorClock. It’s a map of PeerId to a logical timestamp (u64). This allows us to determine if one operation "happened-before," "happened-after," or was "concurrent" with another. This is crucial for ordering operations and resolving conflicts in more complex CRDTs.

LWWRegister (Last-Writer-Wins Register)

For single-value properties like a document's title or a configuration flag, we use the LWWRegister. When two users edit the same value concurrently, the register needs a deterministic way to pick a winner. Our implementation uses a HybridTimestamp (a combination of wall-clock time and a monotonic counter) to resolve conflicts. If timestamps are identical, a comparison of PeerIds serves as a final, deterministic tie-breaker.

PNCounter (Positive-Negative Counter)

For distributed counters, such as view counts or likes, we use a PNCounter. It's composed of two internal hashmaps: one for increments (positive) and one for decrements (negative), both keyed by PeerId. The final value is simply the sum of all increments minus the sum of all decrements. This design ensures that increments and decrements from different peers can be applied in any order without affecting the final result.

ORSet (Observed-Remove Set)

For collections like tags on a document or a list of files, we use an ORSet. This set implementation provides "add-wins" semantics. When an element is added, it's assigned a unique Tag (a Uuid). To remove an element, we don't delete it directly. Instead, we add its associated tags to a "tombstone" set. An element is considered part of the set if it has at least one tag that is not in the tombstone set. This elegantly handles the classic scenario where one user removes an item while another concurrently re-adds it – the re-added item's new tag won't be in the tombstone set, so it "wins" and remains in the set.

RGA (Replicated Growable Array)

For ordered sequences like the text in a document or a list of tasks, we use an RGA. This is our most complex CRDT, and it's designed to handle concurrent insertions and deletions in a way that preserves the intention of all users.

Each element in the RGA is assigned a unique ElementId, which is a struct containing a HybridTimestamp, a PeerId, and a sequence number. When a user inserts a new element, it is linked to the ElementId of the element that preceded it (its "origin"). Deletions are handled by creating tombstones, similar to the ORSet.

The final, ordered sequence is generated by traversing the graph of elements, starting from a root element. This deterministic traversal ensures that all peers will reconstruct the exact same sequence, regardless of the order in which they received the insert and delete operations.

A Multi-Layered Testing Strategy

CRDTs are notoriously difficult to get right. A simple bug can lead to subtle data corruption or divergence that is hard to detect. That's why we've implemented a comprehensive, multi-layered testing strategy that goes far beyond simple unit tests.

1. Unit Tests

Each CRDT has a full suite of unit tests covering its basic API and functionality. These are the first line of defense and ensure that each component works as expected in isolation.

2. Property-Based Tests

Using the proptest crate, we've written extensive property-based tests for all our CRDTs. These tests don't check for specific outcomes, but rather for fundamental mathematical properties that all CRDTs must uphold:

  • Commutativity: merge(A, B) must equal merge(B, A).
  • Associativity: merge(merge(A, B), C) must equal merge(A, merge(B, C)).
  • Idempotence: merge(A, A) must equal A.

By generating hundreds or thousands of random operation sequences and ensuring these properties always hold, we gain a high degree of confidence in the correctness of our merge logic.

3. Adversarial and N-Peer Convergence Tests

This is where our testing strategy truly shines. We've created two dedicated test suites, adversarial_tests.rs and n_peer_convergence_tests.rs, designed to simulate the most challenging real-world scenarios:

  • Split-Brain Recovery: We simulate network partitions where two or more groups of peers diverge significantly. For example, in rga_split_brain_delete_and_append, one peer deletes a large chunk of text and inserts new content, while another peer appends to the original text. After the partition heals, we assert that the final merged document is consistent and contains both users' intended changes.
  • High-Churn and Tombstone Stress: In tests like orset_1000_add_remove_cycles_two_peers, we subject our CRDTs to thousands of rapid add/remove cycles. This is designed to stress our tombstone implementation and ensure that it doesn't lead to memory leaks or incorrect state.
  • Resurrection Bug Prevention: We have specific tests like orset_resurrection_classic_scenario that target classic CRDT bugs. In this test, one peer removes an item, and another peer (who was offline and didn't see the removal) re-adds it. Our "add-wins" semantics ensure that the item correctly "resurrects," as it should.
  • Realistic Sync Topologies: We don't assume a perfect world where every peer can talk to every other peer. Our n_peer_convergence_tests simulate more realistic topologies:
      ◦ Gossip: Peers only sync with a random subset of other peers.
      ◦ Chain Sync: Convergence is achieved by passing state down a line of peers (A→B→C).
      ◦ Hub-and-Spoke: A common enterprise pattern where all peers sync through a central relay.
  • Large-Scale Simulations: Tests like enterprise_50_peers_shared_document simulate a large team of 50 concurrent users all editing the same document. This helps us identify and fix performance bottlenecks and ensure that our CRDTs can scale to meet the demands of large organizations.

4. Integration Tests

Finally, our integration tests verify that our different CRDTs work together seamlessly. A test like full_entity_split_brain_scenario simulates a complete data entity (e.g., a note) with a title (LWWRegister), content (RGA), and tags (ORSet), and subjects it to a split-brain partition. This ensures that the entire entity remains consistent, not just its individual components.

Conclusion

Building a reliable sync engine is a monumental task, and at its core lies a robust and correct CRDT implementation. By combining a suite of well-designed, memory-safe Rust CRDTs with a multi-layered testing strategy that includes property-based, adversarial, and large-scale simulation testing, we've built a foundation for PrivStack's sync that is both powerful and trustworthy. Our investment in this area ensures that our users can collaborate with confidence, knowing that their data will always be safe and consistent.