Distributed - but when do we over-react
[DRAFT!]
TL;DR: Distributing a Database helps with Sharding and Resilience. But do I really want my data spread out over 10s or 100s of "nodes" ?
Background.
Been investigating Distributed Databases.
Finding: Scale out means Fragmentation of Data.
Adding nodes to a cluster can help improve both:
Resilience, by having at least 2 spare copies and a voting mechanism, and
Scalability, by adding more workers, more CPU, more memory.
But the downside that I see (or imagine) is that my data gets too fragmented, spread out too thinly.
Solution: Shared Disks ?
For Resilience, I would prefer to keep 3, 5, or maybe even 7 copies of my data. But not more. Too many copies mean too much overhead on writing and voting (a quick arithmetic sketch follows below).
For Sharding, I want the data spread, but not fragmented over separate storage components any more than necessary.
For some "scale out" situations, I may want more CPU and more Memory, but maybe not a further fragmentation of my data on (non-shared) disks.
Under a sudden increase of traffic (load), I may want to scale up (quickly!) to 37 nodes, to add CPU and memory to my system, but I don't want to have to look after 37 storage-partitions. And I don't want to invest effort in re-distributing shards over 37 disks, especially not under load.
Alternatively, when scaling back down from 37 nodes to, say, 9 (3 regions, 3 zones), I don't want the system to have to re-consolidate my data from 37 back to 9 storage-partitions.
Ideally, Scaling up is a matter of CPU and Memory, not of Storage, at least not in the sense of moving data around to occupy new storage-partitions. And scaling down is actually quite difficult if your data has been spread out over multiple non-shared storage partitions.
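A back-of-the-envelope sketch of how much data that shuffling involves (plain Python; the only assumption is that the database keeps data spread evenly over all nodes, which is roughly what the shared-nothing systems I have looked at aim for):

```python
# Fraction of the data set that has to move when the node count changes,
# assuming data is kept evenly spread over all (non-shared) storage-partitions.
# Rough arithmetic, not a model of any specific rebalancer.

def fraction_moved(old_nodes: int, new_nodes: int) -> float:
    """Fraction of the data that must be copied to reach an even spread again."""
    if new_nodes > old_nodes:
        # Newly added nodes start empty and each need 1/new_nodes of the data.
        return (new_nodes - old_nodes) / new_nodes
    # Departing nodes hold data that must be re-homed on the survivors.
    return (old_nodes - new_nodes) / old_nodes

print(f"scale up   9 -> 37 nodes: ~{fraction_moved(9, 37):.0%} of the data moves")
print(f"scale down 37 -> 9 nodes: ~{fraction_moved(37, 9):.0%} of the data moves")
```

Both directions end up shuffling roughly three quarters of the data, and on scale-down that shuffling competes with the very load that prompted the resizing. With shared (or at least node-independent) storage, adding or removing compute would leave the data layout untouched.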
Note: On scale-down, I can also imagine re-mounting the 37 storage-partitions onto just 9 nodes, but that would be complicated in the databases I have observed so far.
In Short:
Distributed Databases use multiple nodes (instances, shards) to achieve both Resilience and Sharding. But I hesitate to spread my data out over too many pieces of storage.
Additional thoughts
Scale-down in both CRDB (CockroachDB) and YB (YugabyteDB) seems to involve manually moving data back to the remaining nodes.