Posts

Showing posts with the label cluster

Checking Cockroach : Distributed, Replicated, Sharded.

TL;DR: Some simple testing of data in a table to see how it is "replicated" and "sharded". Spoiler: CockroachDB replicates from the start, but it only shards (splits) when necessary. Hence, on default settings you need to add about 512M to a table before any sharding (splitting) occurs. It seems to work fine! Background: Distributed Databases, Replication and Sharding. The two different aspects covered by a Distributed Database are 1) Replication and 2) Distribution/Sharding. Replication: Offering resilience against outage (storage, server or network failure). Replication means there are multiple copies of the data in different storage areas, or even in different physical locations. Sharding: Offering scalability to more users and/or parallel processing of large sets. By having data in more manageable chunks (shards, partitions), it can be handled by multiple processes in parallel. Distribution or Sharding (can) offer both higher multi-user capacity...

yugabyte IO monitoring and load-balance verification.

TL;DR: I've experimented with running a "Distributed Database", and concluded that the load indeed gets distributed over several nodes. I'm also discovering how to use the tools. And I "Double Check" as much of the information as I can. There are always some Surprises, and Questions... Background: The nature of Distributed is to Spread the load. I'm going to try and observe some of that distributed load. Currently I am still mainly in the Explore and Verify stage of my discovery. I'm deliberately going to cause what is probably an unbalanced load, and see if I can find + fix that. Note that my setup is still primitive: my nodes are docker-containers running the template-downloadable image on a macbook (described here). I am currently looking to find and test the Concepts and Principles behind yugabyte, rather than doing real-world testing. Tooling and monitoring: yb-admin, yugatool, dsar: I built a primitive inserter to connect and store 1 re...

yugabyte : Finding the data, in a truly Distributed Database

TL;DR: I dig deeper into yugabyte to find/verify how the various components work, and where the data is kept. I find, with a little digging, that the data is Indeed Distributed and Replicated. Nice! Background: In a previous post, I did some RTFM and identified the various components that make up a running yugabyte "universe" or cluster. Now it is time to find the actual data, and explore some of the tools. Hidden agenda, wishlist: I would really like to be able to "query" this data straight from psql or cql, but that would require additions by yugabyte similar to how postgres exposes its catalog. In a distributed environment, this is more of a challenge than in a "monolith" where everything is nicely in 1 place. Setup and Tools: My "cluster" consists of 4 to 6 nodes running in docker. The setup is described in an earlier blogpost, and was a good starting point. But by now I begin to see I might want to use specific nodes for master and for tserve...

Distributed databases, how many shards and where are they

[+/- Draft! Beware: Work In Progress] TL;DR: Trying to find out how to shard data on Yugabyte (link). I find a lot of "moving parts" in the YB-Universe, and try to explain and simplify. For Deep-Down-Engineers, YB-ers: check the questions at the bottom. Background: The Future of Databases is ... Serverless, possibly sharded. But Sharding is something for Very Large sets. An average (serverless) database that comes from "the real world" doesn't need 1000s of shards... IIMHO, it needs "ACID" and Relational Capabilities First. Yugabyte does this, and adds potentially "serverless" operation to make the database more Resilient, more Scalable (on-demand scaling?) and overall Easier to Operate. By experimenting with a 6-node database, I try to observe the sharding, and might try to draw some conclusions or "good practices" from what I see. My "cluster" is running in docker-containers, hence K8s or other container-systems will also work. After the first e...

Yugabyte Distributed database (in docker on macbook...)

TL;DR: I managed to create a 6-node cluster and run a distributed database on it. Got me a nice Playground, and some interesting things to investigate. Background: Serverless, but I want multiple nodes (mostly kidding: because there is always a server running somewhere). But, like some of my friends say: Serverless is The Future. As a database-person, I've now started to explore a distributed database. And because I already knew approximately how to use docker-containers as "little servers", that was a logical way to start. I also had already experimented with yugabyte both in docker (1 node, 1 container, following these examples) and on the free cloud-offering (cloud.yugabyte.com). But those were all still "single node". I could do postgres-commands (nice, all of my scripts + demos worked...), but RF=1 is not really what Yugabyte is designed for. I needed more nodes... Creating and Running multiple nodes. Luckily, there was this example by Franck: https://dev.to/yugab...