Posts

Showing posts from August, 2023

Yugabyte - Testing: a 7-node cluster can survive on 2 nodes.

TL;DR: Just playing: I reduced a 7-node cluster back to 2 nodes. It Still runs! Background: For previous experiments, I was setting up my 7-node test-system, and did some careful checking while bringing down nodes (link). At the end of that test I had 4 of the 7 nodes still running. At that point, yugatool cluster_info looked like this, with nodes 3, 6, and 5 no longer alive: After writing up the previous test, I had "forgotten" that my other 4 nodes were still running. When I noticed the terminal window with the yugatool display, I just wanted to see how far I could get... Just Playing: removing nodes... So I had a cluster with 4 remaining nodes. And I also knew that my critical test-table was held in a single tablet, replicated over node4, node8 and node7. Re-confirmed by this screenshot from the previous blog, showing where the (single) Tablet of Table t is replicated: From what I suspected, nodes 2 and 4 had to stay up to keep a majority (quorum) of yb-Master-processes,
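A minimal sketch of that kind of experiment, assuming dockerized nodes (the container names below are made-up examples, standing in for nodes 7 and 8) and an example master address; the exact yugatool flag spelling may differ per version:

  # stop two more of the surviving nodes, leaving only nodes 2 and 4 (container names are hypothetical)
  docker stop yb-node7 yb-node8

  # re-check what the cluster thinks is still alive
  yugatool --master-addresses 172.20.0.3:7100 cluster_info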

Yugabyte distributed database: Some Resilience testing

TL;DR: Yugabyte is a Distributed Database. I destroy some nodes to see what happens. The Database Survives. Later: Master-nodes are particularly Important. Background: Failure of nodes, or failure of connectivity (network to nodes), is probably the most common problem in a distributed system. I'm going to try a few simple things like taking out a node, and trying to replace a node. Note that my setup is still primitive: my nodes are docker-containers running the template-downloadable image on a macbook (described here). I am currently looking to find and test the Concepts and Principles behind yugabyte, rather than doing real-world testing. So what happens if I remove 1 or 2 nodes...? Overview and Setup: The creation of my cluster is now smoothly scripted (link to script?), and by using yugatool, I can see that my cluster looks like this: We see 3 Masters (because RF=3), and a total of 7 Tservers. On fresh-create of the cluster, the top node is the leader of the masters, IP=172.20.0.3. In dock
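A hedged sketch of how that cluster state can also be listed with the stock yb-admin tool (the master address comes from the text; the container name in the docker command is a made-up example):

  # list the 3 masters and the 7 tservers as the cluster currently sees them
  yb-admin --master_addresses 172.20.0.3:7100 list_all_masters
  yb-admin --master_addresses 172.20.0.3:7100 list_all_tablet_servers

  # simulate a node failure by stopping one container, then list again
  docker stop yb-node7
  yb-admin --master_addresses 172.20.0.3:7100 list_all_tablet_servers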

yugabyte IO monitoring and load-balance verification.

TL;DR: I've experimented with running a "Distributed Database", and concluded that the load indeed gets distributed over several nodes. I'm also discovering how to use the tools. And I "Double Check" as much of the information as I can. There are always some Surprises, and Questions... Background: The nature of Distributed is to Spread the load. I'm going to try and observe some of that distributed load. I am currently still mainly in the Explore and Verify stage of my discovery. I'm deliberately going to cause what is probably an unbalanced load, and see if I can find + fix that. Note that my setup is still primitive: my nodes are docker-containers running the template-downloadable image on a macbook (described here). I am currently looking to find and test the Concepts and Principles behind yugabyte, rather than doing real-world testing. Tooling and monitoring: yb-admin, yugatool, dsar: I built a primitive inserter to connect and store 1 re
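A primitive inserter along those lines could look roughly like the sketch below; it assumes ysqlsh is on the path, the default yugabyte database on port 5433, an example node IP, and a hypothetical table t(id, payload):

  # crude load generator: one insert (and one round-trip) per loop iteration
  for i in $(seq 1 10000); do
    ysqlsh -h 172.20.0.4 -p 5433 -U yugabyte -d yugabyte \
      -c "insert into t (id, payload) values ($i, 'row number '||$i);"
  done

Watching the containers while this runs (docker stats, or a tool like dsar) then shows whether the writes land on one node or get spread over all of them.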

yugabyte : Finding the data, in a truly Distributed Database

TL;DR: I dig deeper into yugabyte to find/verify how the various components work, and where the data is kept. I find, with a little digging, that the data is Indeed Distributed and Replicated. Nice! Background: In a previous post, I did some RTFM and identified the various components that make up a running yugabyte "universe", or cluster. Now it is time to find the actual data, and explore some of the tools. Hidden agenda, wishlist: I would really like to be able to "query" this data straight from psql or cql, but that would require additions by yugabyte, similar to how postgres exposes its catalog. In a distributed environment, this is more of a challenge than in a "monolith" where everything is nicely in 1 place. Setup and Tools: My "cluster" consists of 4 to 6 nodes running in docker. The setup is described in an earlier blogpost, and was a good starting point. But by now I begin to see I might want to use specific nodes for master and for tserve
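To locate the replicas of a single table, something like the sketch below works against a running master; the table name t, the ysql.yugabyte keyspace and the tablet UUID placeholder are assumptions for illustration:

  # list the tablets of table t: tablet UUID, key range, and current leader
  yb-admin --master_addresses 172.20.0.3:7100 list_tablets ysql.yugabyte t

  # then show which tservers hold replicas of one specific tablet
  yb-admin --master_addresses 172.20.0.3:7100 list_tablet_servers <tablet-uuid>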

yugabyte terminology and related information

[Work in Progress - notably needs an ER-diagram to help visualize] TL;DR: This text will Identify the parts (the objects, entities) that make up a YB-universe and try to clarify their function. In a later post, I will try to use the web-consoles at ports 7000 and 9000, and experiment with yb-admin and the yugatool to find some of the available information about these objects. For the moment this post will collect my notes, and hopefully result in a readable article at some point. Background: When thinking about a system, it helps me to have an ERD-view of the components. In this post I will try to find the yugabyte terminology, and try to interpret it from what I see and what I can peek out of the yb-tooling. I'm keeping the xCluster replication out of scope for the moment, but that should be included ASAP, as I think it is important. Although the web-exposed pages contain some performance data, I will also keep that out of scope for the moment (check ports 7000, 9000 and 15433,
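Those consoles can also be probed from the shell; a minimal sketch, assuming a master on 172.20.0.3:7000 and a tserver on 172.20.0.4:9000 (both IPs are examples, and the exact pages vary per version):

  # master web console (port 7000) and tserver web console (port 9000)
  curl -s http://172.20.0.3:7000/ | head
  curl -s http://172.20.0.4:9000/ | head

  # both processes also expose their metrics over http
  curl -s http://172.20.0.3:7000/metrics | head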

Distributed databases, how many shards and where are they

[+/- Draft! Beware: Work In Progress] TL;DR: Trying to find out how to shard data on Yugabyte (link). I find a lot of "moving parts" in the YB-Universe, and try to explain and simplify. For Deep-Down-Engineers, YB-ers: check the questions at the bottom. Background: The Future of Databases is ... Serverless, possibly sharded. But Sharding is something for Very Large sets. An average (serverless) database that comes from "the real world" doesn't need 1000s of shards... IMHO, it needs "ACID" and Relational Capabilities First. Yugabyte does this, and is potentially "serverless", to make the database more Resilient, more Scalable (on-demand scaling?) and overall Easier to Operate. By experimenting with a 6-node database, I try to observe the sharding, and might try to draw some conclusions or "good practices" from what I see. My "cluster" is running in docker-containers, hence K8s or other container-systems will also work. After the first e
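As one concrete handle on those "moving parts", the tablet count of a table can be set explicitly at create time; a hedged sketch in YSQL via ysqlsh (the node IP, table name and split count are arbitrary examples):

  # create a hash-sharded table with an explicit number of tablets
  ysqlsh -h 172.20.0.4 -U yugabyte -d yugabyte \
    -c "create table t3 (id bigint primary key, val text) split into 3 tablets;"

  # then ask the master where those tablets (and their leaders) ended up
  yb-admin --master_addresses 172.20.0.3:7100 list_tablets ysql.yugabyte t3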

Distributed data(base), some simple experiments.

TL;DR: in distributed databases (Example: Yugabyte), it helps to know how to define your tables. The default behaviour is Optimised (sharded, distributed) for Very Large Tables. But Small tables also need attention. Too Much of a Good Thing... Background: Distributed Databases are The Future. That is why I began to experiment with Yugabyte. I managed to create a 6-node (yes, Six Nodes) cluster in no time. And because Yugabyte is fully Postgres Compatible, my good-old pg-scripts work straight away. From the install-story, I found that by default all my tables seem sharded over 6 tablets, and that was something I wanted to investigate further. So, Let's Play.... The Demo: I needed some demo-tables first. With YB comes the "northwind" demo (link). This demo was promptly installed from the command-prompt on any of the nodes (regardless of which node: in my case they are all equal). I shell-ed into the container and typed "yugabyted demo connect", and there it was. I also us
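For a small table, one way to avoid the default spread over 6 tablets is to give it a range-sharded primary key, which starts out as a single tablet; a sketch with hypothetical table names, run from inside one of the containers where the ysqlsh defaults suffice:

  # default hash sharding: on this 6-node cluster that ended up as 6 tablets
  ysqlsh -c "create table t_big (id bigint primary key, val text);"

  # a range-sharded primary key (ASC) starts out as a single tablet instead
  ysqlsh -c "create table t_small (id int, val text, primary key (id asc));"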

Yugabyte Distributed database (in docker on macbook...)

TL;DR: I managed to create a 6-node cluster and run a distributed database on it. Got me a nice Playground, and some interesting things to investigate. Background: Serverless, but I want multiple nodes (mostly kidding: because there is always a server running somewhere). But, like some of my friends say: Serverless is The Future. As a database-person, I've now started to explore a distributed database. And because I already knew approximately how to use docker-containers as "little servers", that was a logical way to start. I had also already experimented with yugabyte, both in docker (1 node, 1 container, following these examples) and on the free cloud-offering (cloud.yugabyte.com). But those were all still "single node". I could do postgres-commands (nice, all of my scripts + demos worked...), but RF=1 is not really what Yugabyte is designed for. I needed more nodes... Creating and Running multiple nodes: Luckily, there was this example by Franck: https://dev.to/yugab
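The multi-node setup follows roughly this pattern; a minimal sketch, assuming the yugabytedb/yugabyte image and a user-defined docker network. The container names and network name are examples, and the yugabyted flags (notably the daemon/background switch) differ between versions:

  # one docker network so the nodes can reach each other by name
  docker network create yb-net

  # first node starts a new universe
  docker run -d --name yb1 --hostname yb1 --net yb-net \
    yugabytedb/yugabyte:latest bin/yugabyted start --daemon=false

  # every further node joins the first one (repeat for yb3 .. yb6)
  docker run -d --name yb2 --hostname yb2 --net yb-net \
    yugabytedb/yugabyte:latest bin/yugabyted start --daemon=false --join=yb1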