yugabyte: Finding the data in a truly Distributed Database

 TL;DR: I dig deeper into yugabyte to find/verify how the various components work, and where the data is kept. I find, with a little digging, that the data is Indeed Distributed and Replicated. Nice!


Background

In a previous post, I did some RTFM and identified various components that make up a running yugabyte "universe" or a cluster. Now it is time to find the actual data, and explore some of the tools. 

hidden-agenda, wishlist: I would really like to be able to "query" this data straight from psql or cql, but that would require additions by yugabyte, similar to how postgres exposes its catalog. In a distributed environment, this is more of a challenge than in a "monolith" where everything is nicely in one place.


Setup and Tools

My "cluster" consists of 4 to 6 nodes running in docker. The setup is described in an earlier blogpost, and was a good starting point. But by now I begin to see I might want to use specific nodes for master and for tserver. And I have learned that 6 nodes, or "many nodes" may un wantedly lead to too-many tablets for small tables. All interesting stuff.

My preferred tool to "look at data", or to look at anything, is SQL. Hence I queried:

select * from yb_servers() ;

And I managed to find some other functions of the form 'yb%' by looking at pg_proc. But I didn't really get "under the hood" of yugabyte with just SQL, as I would with the "catalog views" in Postgres-proper or in other databases.
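For the record, a simple catalog-query like this lists those functions (nothing yugabyte-specific about the query itself, it is plain pg_proc):

select proname from pg_proc where proname like 'yb%' order by 1 ;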

There were also the web-pages at ports 7000, 9000 and 15433 of the yugabyte-nodes. I did learn from those, but they didn't really give me a look inside, and the pages require a lot of "Control-F" and scrolling to find information.

The tools of choice had to be:

yb-admin : This seems to be the do-it-all tool for yugabyte. Run it and read the help...

yugatool.gz : a utility provided by yugabyte. Google it, download and unzip, and see what it can do. Notably useful to inspect "clusters".

Both tools require a "list of master nodes", hence I picked that up from yugabyted.conf and set an env-variable. A few aliases also came in handy:
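For example, a sketch of my own setup (the addresses and alias names here are just examples; the IPs are those of my docker containers, yours will differ):

# example only: master-addresses copied from yugabyted.conf of my containers
export MASTERS=172.22.0.3:7100,172.22.0.4:7100,172.22.0.5:7100
alias yba='yb-admin -master_addresses $MASTERS'
alias ygt='yugatool -m $MASTERS'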

So far my setup.


What am I looking for: Entities.

Firstly, I wanted to be able to find, and "catch", the entities which I had spotted while reading the RTFM and poking around in the catalog after trying my postgres-demos and postgres trick-scripts on yugabyte (all demos and most tricks worked right out of the box! this yuga-thing is "compatible").

I am not (yet) looking for "metrics" or "performance data", but once I get the backbone of entities correctly identified, the "metrics" will not be far away.

This is the ERD I created after first inspections:

I realise this is only my first-look, and I will probably miss a few entities (Items like Database or Namespace, with property "colocated", for example). But this is what I could start from.

Let's find some information.


Universe, Cluster and some infrastructure:

To find any universe info, I didn't get much further than yb-admin, which gave me json-output:

yb-admin -master_addresses $MASTERS get_universe_config 
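Or, to pretty-print the json directly, pipe it through jq (a trivial sketch; jq '.' simply formats the whole document):

yb-admin -master_addresses $MASTERS get_universe_config | jq '.'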

With jq, the output looks like this:


I'm sure there will be a uuid for "universe" somewhere, but it will have to wait.

Next is Cluster. The yugatool gave me a nice overview using:

yugatool -m $MASTERS cluster_info 

It comes back with cluster-info, master-lists, and tserver-information:


This helped: I now had a cluster-uuid, plus an overview of masters (uuids and IPs) and tablet-servers (uuids and IPs) that seemed to belong to that Cluster. I also realised I had temporarily shut down nodes 5 and 6. Hence, only 4 tservers in my cluster at that time.


The Estate: Nodes, Masters and TServers:

I could already determine those from the output of the previous yugatool command. But there are also some yb-admin options to pick up that data:

yb-admin -master_addresses $MASTERS list_all_masters 

yb-admin -master_addresses $MASTERS list_all_tablet_servers 

After I re-started nodes 5 and 6 to make my cluster complete again, that looks like this:

These two commands give you an overview of your "estate". The masters and tservers run on hosts, identified by the various IP-addresses. In my case, three of the nodes run both master and tserver processes, while three more nodes only run a tserver (i.e. act as processing- and storage-nodes).


Now look for Tables and Tablets.

The actual user-data will be held in postgres or cassandra tables (databases, keyspaces...). This comes down to "Tables" for the developer, and "Tablets" for the internal yugabyte storage.

The pg-catalog doesn't give me any yugabyte-specifics, and the yb-functions I found so far are still limited (link). A few days ago, I created a view to expose some of yb_table_properties(), but that was about as far as I could get (link).
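A minimal sketch of such a view could look like this (the view-name and the restriction to ordinary tables in schema public are just my example; the columns are the ones I found in yb_table_properties()):

-- example view-name; limited to ordinary tables in schema public
create or replace view vw_yb_tables as
select c.relname as table_name
     , p.num_tablets
     , p.num_hash_key_columns
     , p.is_colocated
  from pg_class c
     , yb_table_properties(c.oid) p
 where c.relkind = 'r'
   and c.relnamespace = 'public'::regnamespace ;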

yb-admin had a solution; try this command:

yb-admin -master_addresses $MASTERS list_tables include_db_type include_table_id include_table_type | tail -n20

The result was a list of yugabyte-info concerning ycql (cassandra) and ysql (postgres) tables, some 500 lines in total. In this near-empty database, most tables were part of the internals: the catalogs of cql and postgres. But the bottom 15 were created by me: my tables and indexes:


This list could help to link up the pg-tables and ycql (cassandra-) tables to the yugabyte system: the uuid listed in the yb-admin output can also be found on ports 7000 and 9000, which is a nice Confirmation.

It also shows that the catalog-info is stored in a similar manner to the user-tables, which I consider a good thing. Everything is a Table/View/Relation, Everything is SQL.

Also note: the tables are not linked to any node, tserver or master. The "logical" table layer is at this point still detached from the "physical" (tablet-)layer. That is next.

Time to look for some Tablets. The info at ports 7000 and 9000 gives hints, but the actual data seems to be available from yb-admin. To show the tablets that make up table t2, you ask for:

yb-admin -master_addresses $MASTERS list_tablets ysql yugabyte t2
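I ran the same command for t1 and t4; a small loop does all three in one go (just a sketch, t1, t2 and t4 are simply my test tables):

for t in t1 t2 t4 ; do yb-admin -master_addresses $MASTERS list_tablets ysql yugabyte $t ; done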

Here is the output for tables t1, t2 and t4:


I had created those tables with "split into" 1, 2 or 4 tablets. You can see each of them having 1, 2, or 4 Tablets, with a range-split-clause, and the IP of the node where the leader for that tablet is held. Now we are beginning to see Inside...
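The create-statements looked roughly like this (the column-definitions are only an example; the split-clause is the relevant part):

-- column-definitions are just an example; the split-clause is what matters
create table t4 ( id bigint primary key, payload text ) split into 4 tablets ;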

But there is more: every tablet should have 3 replica-copies... Some of that is visible in the "RaftConfig" via the web-console at node:9000/tablets. Again, yb-admin has the details:

yb-admin -master_addresses $MASTERS list_tablet_servers <tablet_uuid>

This command takes a tablet_uuid, which can be taken from the web-gui at port 9000 or from yb-admin with "list_tablets".
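To avoid copy-pasting uuids by hand, the two commands can also be chained; a rough sketch, assuming the first output-line of list_tablets is a header and the tablet-uuid is the first column:

# assumption: first line is a header, tablet-uuid is the first column
tablet_uuid=$( yb-admin -master_addresses $MASTERS list_tablets ysql yugabyte t4 | awk 'NR==2 {print $1}' )
yb-admin -master_addresses $MASTERS list_tablet_servers $tablet_uuid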

I've tried it with two of the tablets from table t4:


This was my Aha-Moment. 

We know t4 is split (sharded) into 4 Tablets.

And we now see that the first tablet of table t4 is replicated over 3 of the tservers, and that the middle one, located at IP 172.22.0.7:9100, is the leading tserver for this tablet.

The 2nd tablet is similarly copied over 3 replicas, but those are located at other tservers, other IPs, and with a different leader. Different components. Distributed.

Hence, Table t4 is Split over four Tablets, and it is both Sharded and Replicated.

1) "Sharded" over the tservers, for load-balancing, and 

2) "Replicated", each tablet (=shard) is replicated three times (RF=3) for resilience.

Here we See the Essence of Distributed Data.


Summary:

Using the web-gui and the command-line tools, I was able to find the components that make up a yugabyte universe or cluster. I could verify that the data is actually Distributed over my nodes, and I can see some of the mechanisms involved.

This is a Distributed Database.

The tools I used were: yb-admin, yugatool, SQL, RTFM, and some "common sense".


Suggestions, notably to Yugabyte or Customers:

I could now go on and hack some code to pick the data from stdout and put it in various tables (more work than you think!). I also think this data doesn't have to go via these tools or "over stdout" to be presented in the database: there must be shorter paths.

And I don't see that as my main job: rather, I'm looking to yugabyte to provide this information in some yb_catalog in the near future. I am hoping some yb-customer sees the need for this data (and the metrics that would be attached to it) and will ask for, or even pay for, having this information available in the database.


----- end of Blog - some notes and questions below ----


Questions:

- This data is more or less easily available from the CLI-tools, but it is hard to "combine" (to join) the various outputs. An SQL interface where you can query views instead of running yb-admin would be convenient.

- Would yugabyte consider setting some env-variables, such as $MASTERS? The defaults for the address lists (localhost) don't seem to work on the containers I pulled.

- I know everything is more expensive (latency + complexity) in a Distributed + Replicated System. But how expensive (RPC-wise) is it to pull the yb-admin data regularly from the system by some viewing or polling mechanism? I know of the tool called "ybwr", which already does some of that.

Exposing the information more directly (and faster) would help with inspection, monitoring and troubleshooting. Even starting to scrape it off the yb-admin output would demonstrate the case for metrics-as-data.

And exposing it as SQL-query-able data will make it much more flexible for investigations and more efficient to retrieve (an SQL-result is often much smaller than an XML or even a JSON-document).


