Autonomous Data Warehouse – the new kid on the block. But does it really live up to the hype – will it make your analytics platform run faster or is it just more sales hype?
The most common question I get asked is “should we look at ADW”.
The second most common question (once I’ve said yes to the first one) then is “why”?
Let’s see how it really performs with some simple, reproducable tests that you can compare to your on premises databases, too.
What’s the target outcome we’re trying to achieve?
- We’d like to understand in what circumstances ADW is faster, when, where, how?
- We’ll also compare it to some other databases to give us some direct comparisons
What type of test will we run?
- We’re not testing an OLTP system.
- Neither do I think it’s relevant to test how good I am at writing a SQL query – let’s keep it simple and reproducible
- Most of my customers are running analytics systems – so large table scans and aggregations are the order of the day – so that’s what we’ve designed this test to validate.
- In every case, (except one), I find customers databases are disk bound – so this test mostly focuses looking at that.
What are the database specifications?
We’re running this test on 4x different databases. All “new” versions of Oracle.
- Oracle instance are running on the latest Oracle Cloud “OCI” infrastructure.
- All versions are “out of the box”, except the XE instance has archive log mode disabled (which improves write performance).
- The XE instance is also capped at 2GB of memory as per the license it comes with.
What are the test details?
- We’re going to create a pair of sample test tables with 2 million rows respectively.
- I choose these since they’re manageable sizes and quick and easy to produce.
- We could of course do multi-billion row queries – but then I’d also expect to be looking at the datamodelling (for efficient storage), partitions, indexing strategies and so forth. Easy to do – but this test is intended to be an entry point – should we look at it, not how fast can you make it.
- We’re then going to run a full table scan queries on the tables, 3 times (just for consistency).
- Finally we’re going to run a pure CPU query to compare performance. Is there any difference?
- You can download the exact script here.
Create Table Performance (2 Millions Rows)
- Fastest performance is the XE database (not entirely a surprise with archive log mode turned off).
- Write performance across the other databases looks comparable
Select SQL Performance
The XE database performs well (average of 2.64s), but the ADW blows everything else out of the water – this is the result we were looking for!
No particular surprises here – once we take disk IO out of the equation, the respective databases (on a single threaded query) are comparable.
- For ADW, our primary goal is looking at reporting performance – this defines the user experience on any analytics platform and is therefore the key metric we’re interested in.
- ADW wins hands down in this arena.
- 34x faster than the slowes DBCS & 4.7x times faster than the next fastest database.
- We can also see that the DBCS instance select performance is generally low – the XE database out performs it well here.
- So to answer the original question – should we look at ADW for a analytics platform – the answer is a resounding yes.