[DRAFT] Yugabyte and Postgres - Finding information.

 TL;DR: [DRAFT] Could not resist creating an ERD of tables to collect metrics. I have put "Session" central to my information. Considering it is the application-connect that initiates the session and is the main "customer" causing all activity in a database.


Regarding Session: a Datamodel...


Aim is currently to find heavy processes and heavy queries. Spotting underlying tabblets / tables may/should follow from that. But other may be more interested to dive directly into the storage-layer (as thew metrics on port 9000 seem to focus on rocksdb)


Central, I think, is the "Session". as identified from from pg_stat_activity and from. the ps -ef list on the host.


I then built a number of entities (tables) around the Session with the aim of collecting data. Relevant Data (I hope). 

Metrics... I did get lost a bit in the huge amout of text coming from ports 7000 and 9000. They seem to provide a lot of rocks-db and program/process info. But I could not (yet) relate that back to user-initiated activity, e.g. queries and sessions. Another reason why I put my focus on the data from views.


And when storing time-series (all the ybx_abcd_log -tables below) it is not difficult to use Grafana with the pg or yb connector to produce nice graphs.   


Grafana on table saved from yb_active_session_history


But even the graphs mostly lack the focus on "Session". I'm still looking into that. (Added the grafana image bcse ppl like graphs, cloud-systems should produce good looking graphs, to give an illusion of control....) Later.




The Entities, Tables, short descriptions.


Session, a postgres-process with a pid on a host, belongs to a tserver, on a host. Session also links to an "application" or some component that is initiating work on the system. This leads me to define a table: 


table: ybx_sess_mst: try to catch the master record for every session.


My main source for session-data is pg_stat_activity.



The Session is then linked to ASH, where the work is "polled" from the view yb_active_session_history, with information local to a tserver:


table: ybx_ashy_log : data logged from yb_active_session_history



ASH records are local to a tserver, and belong to a session or parent_process, possibly via root-request.  But that RR is hard to "spot": does it belong to the Session, to the tsrver, or to the query? Possibly the link of the three. TBD.



On the Sessions, We can regularly "poll" logging data out of a session from pg_stat_activity:


table : ybx_sess_log, regular polled data from pg_stat_activity.


Check the first ERD below for session, ash, and some related entities.



Looking upwards from session, the session is initiated at a t-server, and we can regularly capture (scrape) data from tserver via yb-admin and the function yb_server_metrics(). Also via port 9000/prometeus. leading to tables:


table: ybx_tsrv_mst : 1 record per tserver

table: ybx_tsrv_log : regular polled data, e.g. memory, threads, cpu...


Similarly, we can capture data for universe and yugabyte-master


table: ybx_mast_mst : 1 record per master

table: ybx_mast_log : polled (scraped) data from yb-admin or port 7000.



Back to Sessions, and queries. Both the session pg_stat_activity) and the ASH-record contain a query_id. We can capture query-data from pg_stat_statements, resulting in tables:


ybx_qury_mst : signal existence of the qry, and act as masater, linking-pin

ybx_qury_log : capture data per t-server from pg_stat_statements.


Note that the cumulative data in pg_stat_stmt is not relatable to a session (yet).

Possibly there is a query-plan (explain plan) in there, and a query can have muliple of those plans, which hints to future table(s):


table: ybx_qury_pln : some as yet unknown way to store a plan.



For storage details, we know ASH acts on tablets, local to the node where the ash-view is queried, which prompts two tables:


ybx_tblt_mst : 1 record per tablet (which will have at least 3replicas... )

ybx_tblt_rep : a replica (leader or follower), local to a t-server


Note that a tablet_replicat (the rep) will have "role" and probably a "state" (tombstoned?). When tablets move or are deleted, we need to signal this via some "gone_dt", and over time, a tablet(replica) can return to a given tserver. Hence the ybx_tblt_rep may have "found_dt" and "gone_dt" as identifying fields.



Tablets are linked to tables, and because of possible "colocation" this will also require an linking table to model the n:n relationship. Modelling this out we get


ybx_tabl_mst : master record per table, key is table_uuid.

ybx_tbl_log  : info from pg_stat_tables, per tsrve3r, per log_dt ?

ybx_tata_lnk : linking table to tablet: just two keys ?


The master-record should contain the postgres OID and other info. 

The table-log is to capture data from pg_stats Per Tsever and per time (log_dt).

The lnk-table is to link tables to tablets using the two UUID fields. Not sure if there is more data in that entity.


Tables will belong to a database, and from pg_database views we can collect info on those databases (not linked to a tserver!):


ybx_datb_mst : 1 record per database, key is datid, the oid inside postgres.

ybx_datb_log : polled data, per t-server, key is datid + tsrverk + log_dt


This whole reasoning leads to a fairly busy ERD, and to a primitive script to create the tables. I'll print a few partial ERDs to illustrate.



Scripts and Diagrams (from DBeaver)


Since some time, I'm trying to script the capture of data. Notably from yb_session_history, yb_local_tablets, and the pg_stat views. Those scripts are still messy, but they help the "thinking process". I might ask a few questions on the yb-slack about this and it helps to have some basic explanations from my thinking.


Scripting... I did manage to generate a running script to create all the tables... (link below).  The sripts and functions to collect (some of ) the data are too messy to publish atm. Later.


Note: In real-life, the constraints had to be left-out because of the order in which we can collect the data: pg + ybdb often provide details first, and the "master" or meta-data is then to be derived by via a script or function.


script: link.


two partial ERDs: attached..



The Session and some surrounding entities (click or download to enlarge),



T-server, master, and how session, query and ash relate.. (click to enlarge)



Caveat Note: This blog Currently For Discussion Only. This whole blog was very much a rush-job. But I might get some ppl at postgres or ybdb to think in this direction. 




Comments

Popular posts from this blog

yugabyte : Finding the data, in a truly Distributed Database

Testing Resilience of CockroachDB - Replacing 1 node.

yugabyte terminology and related information