CockroachDB as database. First Look.

 TL;DR: CockroachDB was Easy to get up and running. It has a lot of the "features" that a mature RDBMS should have, and it seems quite Resilient, and smooth at that. But it doesn't quite feel like PostgreSQL.


Background

Because I'm exploring database options for "serverless" and "cloud native" environments, I've done some playing around with CockroachDB (link). To keep things simple, I deployed in docker containers. Connecting was tried via the cockroach-sql client and via psql and DBeaver from my macbook. Resilience testing so far only consisted of killing nodes. Inspection of data and system was mainly via SQL and CLI. 


Three Nodes, running in Docker

I used this set of instructions to start a 3-node cluster using my docker-desktop installation. 

What strikes me immediately: The CR container is quite small, at 500M it is about the size of the official pg container. The containers of yugabyte and orale are much bigger: 2G and 10G respectively.


Following the example, I ran a 3-node cluster, and did SQL- and some "destructive testing". That worked easily: in about 30 min (during a conf-call with another large, closed-source vendor...) I was connected to an SQL-prompt and could run SQL queries. 

That looked Cool and promising. An SQL prompt in short time,  let me try..

Oops. I guess "wire compatible" doesnt mean my psql (v15.4) works seamlessly. 

I also quickly found out that a lot of my own Postgres tooling would not work: I use plpgsql a lot, mainly for small utility-script like setting clever prompts, generate test-sets, running tests, generating CPU-activity, generating IO, measuring hiccups... I found myself struggling to un-tangle my plpgsql back to straight-SQL. 

Cockroach has this more or less documented (link to compatibility), but to me it was an bit of a surprise... Definitely something to keep in mind if you want to migrate from pure PostgreSQL to cockroach. 

DBeaver, being the Multi-Tool it is, connected easily though, so I could start walking around in the SQL-interface, and there was a lot to find there.

Node-kills, e.g. stopping a container, didnt seem to have much effect on a running "insert-into". No (eyball-)measurable hiccups at all. But Good Measuring has to wait until I get my plpgsql scripts re-written... Need Time.

I also did a brief "shell into the containers", but there was little to see there. Those nicely-lightweight images are Truly done "The Container Way": No Ballast...

The container-images contain hardly any tools to allow looking around: no ps, no file, no which, no top, no iostat, no vi(m). And of course no yum/apt/dnf to install things.  I guess this should be considered "a good thing", but as a tinkerer, I like to have those toolsets available. Inspecting those containers will have to wait a little.


SQL is Everything, and Everything is SQL.

The Nice Surprise was soon to come: cockroach has a lot of SQL-query-able tables andviews to "inspect" the internals.

Surprise: I spot 77 internal tables. And there the familiar known schemas: pg_catalog and information_schema. To me, as an old-fashioned DBA (Data-Beast Animal), that looks promising. I may be able to "Inspect" this system from the SQL-prompt. 

For meta-data and config-data, I much prefer an SQL interface. Rather than having to wander around the linux server(containers)-side or scrolling through web-gui interfaces or trying to make sense of tons of json peeled off some tcp-port. I've often spoken and have blogged : SQL is the Best Way to present data for inspection. (Will re-blog on that again, you bet: Metrics may belong in Prometeus, but Meta-data should be SQL-query-able).

These "internal' tables and views look quite information-rich. 

I'll point out a few Things to Explore:

There are tables with "node" information, including some with metrics. This may be interesting to examine configurations, maybe cluster-layout, and data-spread, and possibly investigate bottlenecks.

There is proof of a job-scheduling mechanism, internal to the database. Any good Mature RDBMS probably has such a mechanism (note: using such a scheduler requires Discipline, and the absence of a native job-scheduler in pg could be considered a Risk-Lowering-Factor, but that is another blog-topic).

There is even something resembling pg_stat_statements (aka v$sqlarea, for the oracle-tainted ppl):

This view shows the SQL-statements that the system may deem "slow" or problematic. The ambition to measure to the nanosecond seems quite forward-looking, imho. And in my case it made me smile a little: some of the the top-heavy statements were mine (known!) and the others came from DBeaver, which tends to do seq-scans on internal tables. 

After my initial euphoria of finding all these SQL "interfaces", I realised I need more time to study, to "discover" and to learn to use them.. Some documentation can be found here (link). My first impression is those views were created from 1) info that just  happens to be available, and 2) the need to find/troubleshoot certain items. Whichever the reasons, I am grateful to have them.

This concurs with what I know of the history of similar views and tooling in the oracle-system: Tooling is never high on the priority-list of product-managers. Good Tooling will only get created when a paying customer signals the Need and provides the Budget to pay for it (yet another topic for a blog one day...)

The old DBA/sysadmin adage is: NP (no problem....) actually: "Nobody Pays for Pro-active or Prevention". 

But cockroach seems to be ahead of PostgresSQL and some of the competitors in this area.


Remark after First Impression: 

(Caveat: Early notes, there is a lot to discover still...)

Cockroach may not be as pg-compatible as the marketing wants you to believe. "Wire-compatible" doesn't mean much to me if first thing I see is an error from a trivial "\d" and when my trusted old plpgsql scripts require a lot of typing to re-construct. Guess I have myself to blame for my "lock-in into pg".

But on the Positive side: cockroach seems resilient (no hiccups yet on node-kills, not even on an overloaded laptop). And this RDBMS seems to embrace "the SQL way"  (google: rule 4 from Codd...). It allows you to "look inside", which can be useful for exploring and very useful for adhoc-investigations and permanent monitoring.

There is also proof of some rich functionality inside, like the metrics on nodes, the sql-history, and a job-scheduler. In some ways, cockroach goes "over and beyond" what PostgreSQL can do today (Sept 2023, PostgreSQL is at v15). Consider that a Good thing: because postgres-extentions (like pg_cron?) are limited to a list provided by cockroach.

Remains the fact that CockroachDB is Not Open Source, and on first impression did not feel very PostgreSQL compatible. As a future-customer I would read that "compatibility page" very carefully, and possibly discuss with cockroach about their roadmap. 

For "greenfield" deployments, I would put cockroach on my shortlist. But if you prefer to stay Postgres Compatible and hesitate about vendor-lock-in: Be Very Careful.

I hope to be able to do some more Testing and Exploring in the near Future.


----- End of Blogpost -- some notes and questions below ----- 


Test: Grow cluster, how to join.. should I try with 7 nodes ?  (looks do-able)

Test: The internal tables/views are Very Interesting. Explore.

Test/Q: linux tooling. Need server-installation, observabilty is different when container-nodes are meagerly equiped. no linux tools. But do I really need to see "the server" ? Also Need to Investigate internal views/tables.

Q: can I find something like a "leadernode" what if I kill it, possibly under load.. test the same hiccup.. (not easy bcse no pl/pgsql. Need to re-work plpssql tooling...

Q: try explain-plans same as yb: indexes are conventional, should work better bcse indexes are dflt ordered, not hash ? 


Comments

Popular posts from this blog

yugabyte : Finding the data, in a truly Distributed Database

Testing Resilience of CockroachDB - Replacing 1 node.

yugabyte terminology and related information