Using Explain with Yugabyte Distributed Database

TL;DR: Explain should provide some insight into the efficiency of an SQL query. We start by looking at the very basics on a "distributed" database. Immediately I run into some Questions.

Note: The query-planner and query-optimizer are a large topic. There is a lot to inspect and to write about on this subject, a whole lot more than what I can cover in a few paragraphs.


Background.

In most SQL processors, there is an "Explain" facility to show some of the internal workings of a query. Trouble-queries (slow ones) can be subjected to an Explain to see what is happening, why they are slow, and possibly to suggest a Fix.

Distributed databases pose a whole new, interesting field for Explain, because the data is spread out over tablets/shards/systems. Effects of caching can suddenly be "local" or "remote", and "chattiness", or the number of RPC calls, may come into play.

Told you: There is A Lot to explore here.


Start Simple, tables in a distributed world.

We start with a table that has just a PK defined and contains 100 records: one "index", namely the PK, and some fields: a number, a date, and some character fields. I'll include a "filler" field containing some random data to avoid effects of compression. A record in this table should be close to 1K in size, and difficult to compress ;-).

To allow for "distributed", I'll create two versions:

t01: sharded over a single tablet (hence 3 replicas when RF=3).

t16: sharded over 16 tablets, nicely distributed over the cluster.

From earlier work with other RDBMSes, I know this kind of table has most of the properties relevant to "test with", notably various data-types, and data-content which can be indexed, aggregated, or searched through.

The tables look like this:
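(The exact DDL isn't reproduced here; the sketch below is my illustrative reconstruction. Column names, types and sizes are assumptions, but "SPLIT INTO n TABLETS" is the actual YSQL clause that fixes the tablet count at creation time.)

```sql
-- Illustrative sketch: columns are assumptions, SPLIT INTO is real YSQL syntax.
CREATE TABLE t01 (
    id      bigint,
    amount  numeric(10,2),
    dt      timestamp,
    payload varchar(100),
    filler  varchar(1000),
    PRIMARY KEY (id HASH)
) SPLIT INTO 1 TABLETS;

CREATE TABLE t16 (
    id      bigint,
    amount  numeric(10,2),
    dt      timestamp,
    payload varchar(100),
    filler  varchar(1000),
    PRIMARY KEY (id HASH)
) SPLIT INTO 16 TABLETS;
```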


I'll insert 100 records in each, and here is a sample of the data:

Notice the various components: ID, Date, payload, filler... 
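(The actual load script isn't shown; a sketch of how such a load could be generated, using the component names from the sample above, might look like this.)

```sql
-- Hypothetical load of 100 records; repeat for t01.
-- The correlated subquery builds ~960 bytes of varied, hard-to-compress filler.
INSERT INTO t16 (id, dt, payload, filler)
SELECT n,
       now() - (n || ' hours')::interval,
       'payload_' || n,
       (SELECT string_agg(md5((n * 1000 + g)::text), '')
        FROM generate_series(1, 30) AS g)
FROM generate_series(1, 100) AS s(n);
```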


Explain some Queries

Now let's select some records and ask for an "Explain". 

First, we ask for a single record, using the PK: When selecting 1 record, we expect that both t01 and t16 need to read from just 1 tablet, a single storage call: 
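(The queries below are my reconstruction of the test. In YSQL, the per-tablet storage-request counters appear when you add the DIST option to EXPLAIN ANALYZE.)

```sql
-- Single-record lookup via the PK; DIST shows the storage read-request counts.
EXPLAIN (ANALYZE, DIST) SELECT * FROM t01 WHERE id = 42;
EXPLAIN (ANALYZE, DIST) SELECT * FROM t16 WHERE id = 42;
```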

QED, the explain shows what we expected: The SQL engine decides to do an "Index Scan" on the pkey.

For t01: It required 1 index/storage call, possibly because that table only has 1 tablet.

For t16, which has 16 tablets, it also requires just 1 index/storage call, because the condition (id=42) is able to use the hash-index on ID. So far so good.

Incidentally: This is what Yugabyte is Good At: Single-Record lookups, OLTP-style, using hash-indexes (equal-condition) to process data.


Next, we ask for "all records" of the table. When selecting without a where-condition, we expect a different explain. Because t01 is split into a single tablet, we expect 1 storage-call. But for t16.. ?
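(Again a reconstruction of the query used; the same EXPLAIN options as before.)

```sql
-- Full scan, no where-condition: the engine must consider every tablet.
EXPLAIN (ANALYZE, DIST) SELECT * FROM t01;
EXPLAIN (ANALYZE, DIST) SELECT * FROM t16;
```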

We notice both plans now use a Seq Scan to look for the data. 

For t01: we see one single table/storage read-request, and that is expected because this table is "split" into just 1 tablet.

For t16, we notice 16 table/storage read requests, because this table is split into 16 tablets. The SQL-engine has to check every tablet for data, and there are 16 tablets, spread over all the nodes.

So far, everything looks "as expected".

Note that I don't want to discuss the "timing" of these queries. The 16 storage requests were divided over 7 nodes in the cluster, with some nodes receiving more than 1 call. But when testing on a laptop running 7+ containers, timing is not a reliable measure...


Now for something unexpected...

Let's do some counting, to count the number of records in a table.. would we expect to visit all tablets ?
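(My reconstruction of the count-queries:)

```sql
-- Row counts: does the engine need to touch every tablet?
EXPLAIN (ANALYZE, DIST) SELECT count(*) FROM t01;
EXPLAIN (ANALYZE, DIST) SELECT count(*) FROM t16;
```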

For t01, we see 1 storage-call, as expected. There is only 1 tablet under that table anyway.

For t16, we see Two storage-calls. Not 16 for visiting all tablets. Maybe the number of (distinct-)keys is stored somewhere separate?

Note: same results for t01 and t16 when we do select count(payload), or a count of any other column. For table t16, the system seems to need 2 calls to determine the count.

Why then 2 calls, not just 1 call? or 16 calls to visit all tablets ? 

Another test: select count-distinct, for both ID (the unique-PK) and for Payload (not guaranteed unique) columns:
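(Reconstructed, in the same style as the earlier queries:)

```sql
-- Distinct-counts on the unique PK and on a non-unique column.
EXPLAIN (ANALYZE, DIST) SELECT count(DISTINCT id)      FROM t16;
EXPLAIN (ANALYZE, DIST) SELECT count(DISTINCT payload) FROM t16;
```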

That is Funny (or: Expected?): when we specifically ask for "Distinct" counting, the SQL-engine seems to need to visit All Tablets again.


Lessons so far: Explain Looks OK, but maybe There is More.

Some of the behaviour is exactly "as expected". For some operations, the SQL-engine needs to visit a single tablet, and for others it needs to touch all tablets to "inspect all data". That was to be expected.

But we found at least one "anomaly": for (some) count-queries, it is not necessary to visit all tablets. Is there some "aggregate" stored somewhere? 

Something to ask Yugabyte about ? 


------ End of Blogpost ---  Some Questions below ---- 


Q: Where can we find more info about the explain-information, notably the meaning of "storage table read request", "storage index read request" and "storage read request" ? Are those all The Same Thing? Or: How do they differ ?


Q: How can the system "know" a count from only 2 storage-calls ?

