Some notes on pg Yugabyte and Obervability

TL;DR: Use SQL to expose information. And use Catalog and similar info to do it. Some pro/con arguments, and some Examples below.

Note: I'm not tweeting or otherwise promoting this blog. It is just info for those who have the link or can search cleverly.  I'm using this post to send a few notes to some  friends, who may or may not use the info further. 


Aim: Find + Expose data via SQL

To create some "catalog" information (views) about my Yugabyte objects.

My overall-aim: use SQL to find info related to the catalog, and to the back-end. Ideally, I get to use SQL and combine the info.  Yugabyte exposes some of this at ports 7000 (tserver) and 9000 (master)


Background

Yugabyte stores data in Tablets, which are stored at Nodes in clusters. A Tablet is replicated over multiple nodes (3 nodes if the RF=3: Replication Factor).

The "Front-end" of Yugabyte, the part that the Developer sees, is using the Postgres code, hence YB is a perfect fit-in if you ever want to replace your PG database. Re-using Postgres-code also brings us the pg_catalog and all the useful information therein.

I Installed yugabyte (v219) in docker-desktop-mac environment, and used the cloud-dot-yugabyte-dot com to verify some findings.

Current database is RF=1 (will expand soon if I got time+useful).


Entities to look for:

First items to look for would be Entities like: 

Node, 
    Master, 
    Tserver, 
        Tablet

Table 
Index (which supposedly has its own Tablets - it does, see below),

Tbl-Tbt: the relation between Tables and Tablet
Idx-TbT: idem, Index - Tablet.

The last ones would be an n:m relations: A table or index can be colocated, can occupy a single tablet, or can be split over multiple tablets. 

Furthermore, the View used to inspect Tablets would have relations with the Node(s) where it is located (several), a Masternode_id, a Table_id (the pg_iod and/or a yb-uuid), and a lot of properties that I don't know about yet, but that may be relevant to YB. Some of those are visible in the GUIs at ports 7000 and 9000.

The view to examine Tablets could like aprox like this:
yugabyte=# 
yugabyte=# \d ybx_tablet
             View "public.ybx_tablet"
  Column   | Type | Collation | Nullable | Default 
-----------+------+-----------+----------+---------
 id        | uuid |           |          | 
 node_id   | uuid |           |          | 
 master_id | uuid |           |          | 
 table_oid | oid  |           |          | 
 moreinfo  | text |           |          | 

yugabyte=# 

The Search...

Here is what I looked into..

With some hints on Slack from Jim Kniceley and some digging into pg_catalog, I found some Yugabyte functions:
yugabyte=# select  proname || '() ; ' as fcntcall
yugabyte-# from pg_catalog.pg_proc 
yugabyte-# where proname like '%yb%'
yugabyte-# order by 1 desc ;   
                     fcntcall                      
---------------------------------------------------
 ybginhandler() ; 
 ybcinhandler() ; 
 yb_table_properties() ; 
 yb_servers() ; 
 yb_server_zone() ; 
 yb_server_region() ; 
 yb_server_cloud() ; 
 yb_pg_stat_get_queries() ; 
 yb_pg_stat_get_backend_rss_mem_bytes() ; 
 yb_pg_stat_get_backend_catalog_version() ; 
 yb_pg_stat_get_backend_allocated_mem_bytes() ; 
 yb_mem_usage_sql_kb() ; 
 yb_mem_usage_sql_b() ; 
 yb_mem_usage_sql() ; 
 yb_mem_usage_kb() ; 
 yb_mem_usage() ; 
 yb_is_local_table() ; 
 yb_is_database_colocated() ; 
 yb_increment_all_db_catalog_versions() ; 
 yb_heap_stats() ; 
 yb_hash_code() ; 
 yb_getrusage() ; 
 yb_get_range_split_clause() ; 
 yb_get_effective_transaction_isolation_level() ; 
 yb_get_current_transaction_priority() ; 
 yb_catalog_version() ; 
(26 rows)

yugabyte=#  

This set of functions exposed some of the info I was looking for. I did some experimenting to see what calls would be needed and what data was returned. 


A Primitive Start: my first View.

And I then used some of this to create a view to expose some of the table- and index-data (in future I would probably do separate views for Tbls and Idxs): 

create or replace view ybx_tblinfo as (
select c.oid, c.relname
, tp.num_tablets
, tp.num_hash_key_columns
, tp.is_colocated
, tp.tablegroup_oid
, tp.colocation_id
, yb_is_local_table ( c.oid) islocal
from pg_class c
, yb_table_properties(c.oid) tp
where 1=1
and c.relkind in ( 'r', 'i' )
);

This view is far from perfect, and should probably be done much more efficient. But... I can now use this view to "inspect" my tables:

yugabyte=# 
yugabyte=# 
yugabyte=# select oid, relname, num_tablets, is_colocated from ybx_tblinfo where relname like 'pt%' ;
  oid  |            relname             | num_tablets | is_colocated 
-------+--------------------------------+-------------+--------------
 16398 | pt_1                           |           1 | f
 16429 | pt_1_payload_filler_amount_idx |           1 | f
 16401 | pt_1_pkey                      |             | 
 16403 | pt_2                           |           1 | f
 16430 | pt_2_payload_filler_amount_idx |           1 | f
 16406 | pt_2_pkey                      |             | 
 16408 | pt_3                           |           1 | f
 16431 | pt_3_payload_filler_amount_idx |           1 | f
 16411 | pt_3_pkey                      |             | 
 16413 | pt_4                           |           1 | f
 16432 | pt_4_payload_filler_amount_idx |           1 | f
 16416 | pt_4_pkey                      |             | 
(12 rows)

yugabyte=# 

At first, views like this will solely be "descriptive". But in the long run, those views, or derivatives from these views, would start containing counters and timers. This is how most other "catalogs" seem to evolve (the discussion of counters versus timers is topic for later, suffice to say you need both, but: Later...)

A good Start would be to have at least some view on the Key-Items: Nodes, Tablets, Tables and their relations (tbl-tbt, idx-tbt) etc..

Some of it is already there: While looking around, I stumbled across some interesting functions, for example for memory-usage. And this interesting one: yb_getrusage(), which seems to be a able to return a record-set as well:


yugabyte=> 
yugabyte=> select yb_getrusage() ;  
                  yb_getrusage                   
-------------------------------------------------
 (,,71536,0,0,0,6641,0,0,0,504,0,0,0,256281,105)
(1 row)

yugabyte=> select * from yb_getrusage() ; 
 user_cpu | system_cpu | maxrss | ixrss | idrss | isrss | minflt | majflt | nswap | inblock | oublock | msgsnd | msgrcv | nsignals | nvcsw  | nivcsw 
----------+------------+--------+-------+-------+-------+--------+--------+-------+---------+---------+--------+--------+----------+--------+--------
          |            |  71536 |     0 |     0 |     0 |   6641 |      0 |     0 |       0 |     512 |      0 |      0 |        0 | 256550 |    107
(1 row)

yugabyte=> 

Now, those data look like metrics to me, and they are probably related to the Node.

Clearly, there is already some intention for Monitoring or Observability there.


So Why all this...


My question to Yugabyte and others:

Can you Expose Views with relevant data about Nodes, Tablets etc ?


Note: I know the "way forward" is currently to use logfiles and Prometheus to do monitoring and inspection, which is pretty much Industry Standard at the moment. 

But for a Database, the most natural interface to look at data is SQL.

If views are going to span multiple nodes and incur overhead: I can understand the developers would Hold Off! This isnt going to happen overnight. 

But I'll point out that existing databases, MS-SQL, Oracle, PG, all have similar mechanisms to Expose information: Views on Catalog and Internals.

From history, I know it took Oracle 15+ years, 1990-2005, and a lot of lobbying (begging) from "troubleshooters" to get to where they are on Observability. It started with just static info, and over time they added counters and/or timers. Most of this happened under pressure of either Persistent Problems or Benchmarking-races. And it happened partly bcse some ppl had sufficient "vision". But the resulting "Open-ness" has been worth the Effort: An Oracle Database can be inspected (and fixed) to great detail and at great efficiency. A lot of the the information is available "live", and not with minutes of delay from writing, collecting, reading, merging logfiles etc.. (this is one of my niggles with AWS-Cloudwatch: the data is generally delayed, and in case of Real-Trouble, the data is sometimes just missing).

From PG I understand there still is resistance (and tech-difficulty!) to implement similar mechanism but EDB seems to be exploring it actively now. They have been interviewing/interrogating Ex-Oracle-folks.

This kind of "Direct inspection" with a familiar, SQL-queries, interface greatly helps Resolution-Speed of issues.


Future: Everything is SQL, and SQL is Everything.

I'm probably in good company if I state that  "distributed" and "serverless" systems are The Future of Databases (link to  Stonebraker, a Turing Award professor, check remarks on Serverless at 0:54:00 and Yugabyte is metioned at 1:01:00). 

"Relational" is the emerged, evolved, way to handle data and information. And SQL is the most enduring "language" to manipulate Data. 

In Getting There, the new systems can learn a few things from "legacy" that is out there, and achieve Maturity a little faster. And an SQL-interface to Everything is probably part of that Maturiry (Codd: Rule4?)

I'm watching with Interest.


  

Comments

Popular posts from this blog

yugabyte : Finding the data, in a truly Distributed Database

Testing Resilience of CockroachDB - Replacing 1 node.

yugabyte terminology and related information