yugabyte terminology and related information

[Work in Progress - notably needs an ER-diagram to help visualize] 

TL;DR: This text will identify the parts (the objects, entities) that make up a YB-universe and try to clarify their function.


In a later post, I will try to use the web-consoles at ports 7000 and 9000, and experiment with yb-admin and the yugatool to find some of the available information about these objects.

For the moment this post will collect my notes, and hopefully result in a readable article at some point.


Background

When thinking about a system, it helps me to have an ERD-view of the components. In this post I will try to find the yugabyte terminology, and interpret it from what I see and what I can glean from the yb-tooling.

I'm keeping the xCluster replication out of scope for the moment, but that should be included ASAP, as I think it is important. 

Although the web-exposed pages contain some performance-data, I will also keep that out of scope for the moment (check ports 7000, 9000 and 15433, and possibly some others). But in the future, every one of the components defined below should probably get a "metrics" sub-entity where we can store measured/extracted/derived information about its "performance". Measured data would include things like: Status, Usage-Counters, Sizes (memory, disk), and Timings. All for later.

I want to start by just defining the components, and "think about them" first.


Info from the documentation

When reading the intro and concept documentation from yugabyte, I find the following concepts (I'd call them entities, I might even create tables to hold info on them...):


Universe: a group of clusters containing nodes, where data is organised in namespaces, and inside namespaces in tables.

https://docs.yugabyte.com/preview/architecture/concepts/universe/
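
As a first try at the "tables to hold info" mentioned above, a minimal YSQL sketch for the universe entity could look like this (table- and column-names are my own, not yugabyte terms):

-- sketch: one row per universe (my own naming)
CREATE TABLE universe (
    universe_uuid  uuid PRIMARY KEY,
    universe_name  text NOT NULL,
    notes          text
);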


Cluster: A universe contains 1 primary cluster and 0 or more read-replica clusters. In the article below, I will assume a universe with just a single R/W cluster (i.e. only the primary). The case with read-replica clusters is not much different, but it obfuscates a lot of the explanations (IMHO).
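
A possible sketch for the cluster entity, again with my own names; the cluster_type column would distinguish the primary cluster from read-replica clusters:

-- sketch: one row per cluster, belonging to a universe
CREATE TABLE cluster (
    cluster_uuid   uuid PRIMARY KEY,
    universe_uuid  uuid NOT NULL,  -- link to universe (ref-constraint to follow later)
    cluster_type   text NOT NULL CHECK (cluster_type IN ('PRIMARY', 'READ_REPLICA'))
);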


Namespace: A Universe contains namespaces which hold the data, and these can be of two types: 

YSQL: a namespace corresponds to a (postgres) database

YCQL: a namespace corresponds to a (cassandra) keyspace
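
A namespace sketch could record which of the two APIs it belongs to; names are my own guesses:

-- sketch: one row per namespace (YSQL database or YCQL keyspace)
CREATE TABLE namespace (
    namespace_uuid  uuid PRIMARY KEY,
    universe_uuid   uuid NOT NULL,  -- link to universe
    namespace_name  text NOT NULL,
    api_type        text NOT NULL CHECK (api_type IN ('YSQL', 'YCQL'))
);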


Host or Node: Physically, a Universe contains nodes (or hosts, machines, containers) which serve to run the software-components of yugabyte. At the moment, I'll designate nodes by IP or nodename. They can have capacity-properties like: CPU, Memory, Diskspace. Nodes will be located in clouds, regions and zones, which may come into play when resilience and geo-distribution become important.
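
For the node entity, a sketch would hold the address, the capacity-properties, and the placement (cloud/region/zone); all names are my own guesses:

-- sketch: one row per node (host, vm, container)
CREATE TABLE node (
    node_name  text PRIMARY KEY,  -- or the IP-address
    num_cpus   int,
    memory_mb  bigint,
    disk_gb    bigint,
    cloud      text,
    region     text,
    zone       text
);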


Master: A yugabyte software process responsible for keeping metadata and coordinating work. One of the master-processes acts as "leader", and if needed (e.g. on failure) the raft-protocol is used to elect a new leader.
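
A sketch for the master entity; the is_leader flag would reflect the raft-elected leader at the moment of observation (column names are mine; 7100 is the default master RPC port, the web-ui sits on 7000):

-- sketch: one row per master process
CREATE TABLE master (
    master_uuid  uuid PRIMARY KEY,
    node_name    text NOT NULL,         -- the node it runs on
    rpc_port     int DEFAULT 7100,      -- default master rpc port (web-ui on 7000)
    is_leader    boolean DEFAULT false  -- raft leader at time of observation
);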


TServer: A yugabyte software process responsible for writing and reading the user-data. Most (all?) yugabyte work is directed to, and executed by, the tservers. The spread of work over the tservers is coordinated and kept in balance by the master-process(es).

Note: a node (host, machine, vm, container) can run both master and tserver processes, and in all the demos I've seen, a node runs both master + tserver.
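
And a similar sketch for the tserver; on a node that runs both, this table and the master table would point at the same node_name (again, my own naming):

-- sketch: one row per tserver process
CREATE TABLE tserver (
    tserver_uuid  uuid PRIMARY KEY,
    node_name     text NOT NULL,    -- the node it runs on
    rpc_port      int DEFAULT 9100  -- default tserver rpc port (web-ui on 9000)
);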


Table: The "user object" that holds the user-data, the rows and columns. The Table serves as the source + target for SQL-statements (ins/upd/del and selects). A table is logically held in a Namespace, and will be physically held in 1 or more tablets. Colocated tables share a tablet to hold data from multiple tables.
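
For a sketch of the table entity I'll use the name user_table, to avoid the SQL reserved word; the is_colocated flag is my own addition:

-- sketch: one row per user-table, held in a namespace
CREATE TABLE user_table (
    table_uuid      uuid PRIMARY KEY,
    namespace_uuid  uuid NOT NULL,       -- the namespace it lives in
    table_name      text NOT NULL,
    is_colocated    boolean DEFAULT false
);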


Tablet: The underlying storage-object, sometimes referred to as "shard". For resilience, yugabyte will keep multiple copies of every tablet. In case of RF=3 (Replication Factor Three, the default), there will be 3 copies of each tablet, divided over 3 tserver-nodes.
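
A sketch for the (virtual) tablet; note that the tablet itself does not say where its copies live, that comes with the replicas below. The partition-range columns are my own guess at what would be useful to keep:

-- sketch: one row per (virtual) tablet
CREATE TABLE tablet (
    tablet_uuid      uuid PRIMARY KEY,
    partition_start  bytea,  -- my guess: lower bound of the key-range this tablet holds
    partition_end    bytea   -- my guess: upper bound of the key-range
);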


Tablet_TServer_Replicas: The replicas of each tablet will be on 3 tservers (RF=3). One of these will have the role of "leader", the others are listed as "followers". In the tablets listed via the web-interface (at tserver port 9000), you will find a column "RaftConfig", indicating whether a tablet-replica is a leader or a follower, and where the peers are located. In the yb-admin tool, use "list_tablet_servers <tablet_id>" to find the list of tservers that hold a copy of the Tablet.
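
In sketch-form, the replica becomes a linking-table between tablet and tserver, with the raft-role as an attribute (names are mine):

-- sketch: one row per copy (replica) of a tablet on a tserver
CREATE TABLE tablet_tserver_replica (
    tablet_uuid   uuid NOT NULL,
    tserver_uuid  uuid NOT NULL,
    raft_role     text NOT NULL CHECK (raft_role IN ('LEADER', 'FOLLOWER')),
    PRIMARY KEY (tablet_uuid, tserver_uuid)
);

With RF=3, each tablet_uuid would appear in 3 rows: 1 leader and 2 followers.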


In a Diagram (I used DBeaver) it looks like this:

[ER-diagram image]

Some notes

In the replicas of tablets over tservers (the Tablet_TServer_Replica combinations, or the tablet-replica-groups), one replica will be leader and the others (RF-1) will be followers. On failure of the tserver (or node, or network to the tserver...), the remaining replicas will use the raft protocol to elect a new leader.

Note: I didn't find a good name for the "tablet group", but from talking to a friend I got the concept: the Tablet is virtual. The actual copies can be referred to as "replicas", of which there should always be 3 (at RF=3). The Tablet_Replicas are the physical objects. One of these will be the leader + 2 will be followers. The web-interface on port 9000 will tell you where they are located. Alternatively: yb-admin list_tablet_servers...

Note: the relation between tables and tablets is n:m. 1 table can be "split" into many tablets (1:m), or a table can be colocated with others in the same tablet (m:1). Hence, a linking-entity will probably be needed: Table_Tablet.
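
That linking-entity could be as simple as the sketch below (my own names); with colocated tables, several table_uuid values would point to the same tablet_uuid:

-- sketch: n:m link between user-tables and tablets
CREATE TABLE table_tablet (
    table_uuid   uuid NOT NULL,
    tablet_uuid  uuid NOT NULL,
    PRIMARY KEY (table_uuid, tablet_uuid)
);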


There is Always More...

This isn't finished. I hope to re-visit this post with an ERD-drawing, and possibly some create-table examples, some hints of where to find the info (google: yb-admin and yugatool) and maybe even define some ref-constraints...
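
To give an idea of the ref-constraints I have in mind, one example, assuming the sketch-tables from above exist:

-- example ref-constraint: every cluster must belong to an existing universe
ALTER TABLE cluster
    ADD CONSTRAINT cluster_universe_fk
    FOREIGN KEY (universe_uuid) REFERENCES universe (universe_uuid);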

Also important, conceptually... Once you have these entities (objects? classes?) defined, they are the basis for your next step: Find Metrics per object. For example, a table would have counters and timers to keep track of the activity and time spent by operations. A tablet-replica would have counters + timers on how much read- and write-activity occurred on the replica, possibly signalling a hotspot...
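
As a very first guess at what such a metrics sub-entity could look like (all names and columns are mine, to be refined later):

-- sketch: generic metrics, sampled per object at intervals
CREATE TABLE object_metrics (
    object_uuid   uuid NOT NULL,          -- the table, tablet, replica, ... being measured
    metric_name   text NOT NULL,          -- e.g. rows_read, rows_written, disk_bytes
    metric_value  bigint,
    sampled_at    timestamptz NOT NULL DEFAULT now(),
    PRIMARY KEY (object_uuid, metric_name, sampled_at)
);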

All that: Later.


-- -- -- - Questions... always so many -- -- -- -- 

q1: Masters: How many masters in a universe? Always RF=3? Can and should there be more masters?

q2: Masters and Tservers: should the cpu- and storage-capacity of masters and tservers differ? E.g. more storage, possibly more cache, with tservers?

q3: what about using k8s operators, do they have similar files to yugabyted.conf?

q4: what kind of challenges do customers experience with these systems?




