Oracle Vs. Snowflake


I first started working with Oracle in 1987, on Oracle Version 5. That was not only before PL/SQL and row-level locking; the Cloud, and even the Internet itself, had yet to be invented.

Back then, Oracle was an amazing product. The competition consisted of hierarchical and network databases or Indexed Sequential Access Method (ISAM) files on big-iron IBM mainframes. Using Oracle on a DEC VAX was like going from a Model T Ford to a Porsche 911 at half the cost.

Fast forward 30 years, and Oracle is just clinging to the top spot in the database popularity rankings, ahead of MySQL, while Snowflake, MongoDB and PostgreSQL are all growing.

In fact, Snowflake made it into the top 10 most popular databases in the past year, and is quickly climbing up the ranking.

Maybe it was because of Oracle's aggressive sales tactics, or because of a selfish and arrogant attitude in which customers always come last.

The massive growth in popularity of Hadoop and open-source alternatives, including MySQL and PostgreSQL, has been eating into Oracle's market, while niche alternatives from NoSQL vendors, including MongoDB, Elasticsearch and Redis, have become increasingly popular as businesses deal with web-scale data volumes and demand millisecond performance.

Finally, as I have indicated before, Oracle is no longer a particularly sought-after skill. It doesn’t even make the top 10 most sought-after database skills in 2019, as illustrated below.

Graph of most popular database skills.  Source: https://www.statista.com/statistics/793854/worldwide-developer-survey-most-wanted-database/

But enterprise-grade products like Oracle don’t just fall out of fashion. Either they fail to innovate or, like the dinosaurs, they die because they refuse to accept, let alone adapt to, a rapidly changing world: one in which the Cloud already dominates the IT landscape.


Where did it all go wrong?

As early as 2005, Turing award winner Professor Michael Stonebraker of MIT was predicting the end of the dominance of Oracle, Microsoft and IBM in his seminal paper “One Size Fits All”: An Idea Whose Time Has Come and Gone. In it he accurately predicted that the database market would fragment, and that the data warehouse market would be dominated by new players following the emergence of column-oriented database solutions.

Three years later he published OLTP Through the Looking Glass, which demonstrated the H-Store database, a stripped-back, open-source database with a pure OLTP focus. This was an astonishing achievement: the TPC-C world record at the time was around 1,000 transactions per second per CPU core, yet H-Store managed over 35,000 on a 2.8GHz Intel desktop.

In total the prototype delivered an incredible 70,000 transactions per second, and he went on to build a highly successful commercial venture, VoltDB, for which recent benchmarks demonstrate single-digit millisecond latency while maintaining a serializable isolation level and a degree of transactional consistency that is simply not possible on Oracle.

“The architecture of most DBMSs is essentially identical to that of System R [designed in the 1970s]”

— Dr Michael Stonebraker, MIT.

The underlying reason the team at MIT was able to achieve such an astonishing result is illustrated in the chart below: the Oracle architecture is based upon a design conceived in the 1970s for the “System R” database.

In summary, they found that the leading commercial database products (i.e. Oracle, SQL Server and DB2) spent around 93% of their time on overhead tasks and just 7% doing useful work. The majority of that time went on coordinating the in-memory buffer cache, writing to log files, and in-memory latching and locking.

Essentially, the architecture was formulated at a time when memory was about 13 million times more expensive than it is today, and the entire design was built around the careful management of that historically expensive resource.

Even the release of Oracle Database In-Memory, on which I personally demonstrated a 27-times performance improvement, is a sticking plaster over the problem, and I was unable to deploy it for a customer because of the massive licence cost per CPU core.

What don’t you need with Snowflake?

In discussing the differences between Snowflake and Oracle, rather than criticising Oracle, let’s consider what you don’t need with Snowflake:

  • Installation: There is no hardware to install, no operating system or database software to deploy. There are no patches to apply or database upgrades to schedule. In fact, switching to Snowflake may well be the last upgrade you ever need to do, and you can run with zero down-time.

  • Be Locked In: Snowflake currently runs on Amazon AWS, Microsoft Azure and Google Cloud Platform, so you are not tied to a single cloud vendor. You can even seamlessly share data with customers, partners and third parties across all three cloud platforms using the Snowflake Data Exchange.

  • Database Management: There are no indexes to manage, no physical data partitioning needed, no statistics to capture or fear of a query performance cliff edge if they are not captured correctly. In fact, Snowflake requires near-zero database administration effort.

  • Tug of War: Snowflake has abolished the tug of war for machine resources through workload isolation. Instead of a single, massive multi-core database server, you can deploy an unlimited number of completely independent virtual warehouses. This means you can be loading terabytes of data on one warehouse while transforming the data on another and analysing the results on yet another (see the SQL sketches after this list).

  • Disk Upgrades: Never run out of disk space again. Snowflake supports effectively unlimited data volumes on Amazon AWS, Microsoft Azure and Google Cloud Platform; some customers hold as much as a petabyte in a single table.

  • Data Compression: There is no need to pay the licence cost of the OLTP compression option, or to carefully load data using insert append to maximise compression, as on Oracle. With Snowflake, all your data is automatically compressed using columnar compression, typically by a factor of between 3 and 6 times.

  • Migrate to a Bigger Server: As the chart below illustrates, Snowflake is incrementally scalable, with a simple set of T-shirt sizes, and a warehouse can be resized from an extra-small to a 4X-Large within milliseconds. To put this in context, that is increasing capacity from one to 128 servers. The chart demonstrates the reduction in elapsed time of a query to copy 1.3TB of data; this simple CTAS (create table as select) is the mainstay of most ELT processes (illustrated in the first sketch after this list).

  • Deploy for High Availability: There is no requirement to deploy an expensive hot-standby data centre, with data replication and fail-over, for high availability. Snowflake transparently writes data to three Availability Zones within a region and can automatically survive the loss of any two. Indeed, they recently announced the launch of cross-region and cross-cloud replication and automatic fail-over. This means a customer on the US West Coast will be able to fail over automatically to the East Coast, and even fail over from Amazon AWS to Microsoft Azure or Google Cloud (sketched after this list).

  • Spend Hours Producing Backups: Snowflake provides up to 90 days of time travel, including the ability to undrop a table, schema or even an entire database within seconds. This means you can query the data as it was up to 90 days ago and, using zero-copy clones, take a transactionally consistent backup of a terabyte of data within five seconds, with recovery equally fast (sketched after this list).

  • Worry About Security: Snowflake includes end-to-end encryption with automatic key rotation, multi-factor authentication, and even the option of dedicated cloud hardware with the Virtual Private Snowflake edition.
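
To make the virtual warehouse and resizing points above concrete, here is a minimal Snowflake SQL sketch. The warehouse and table names (load_wh, transform_wh, sales_raw and so on) are hypothetical examples rather than objects from the article.

```sql
-- Independent virtual warehouses: loading, transformation and analysis each
-- get their own compute, so they never compete for machine resources.
CREATE WAREHOUSE load_wh      WITH WAREHOUSE_SIZE = 'MEDIUM' AUTO_SUSPEND = 60 AUTO_RESUME = TRUE;
CREATE WAREHOUSE transform_wh WITH WAREHOUSE_SIZE = 'LARGE'  AUTO_SUSPEND = 60 AUTO_RESUME = TRUE;
CREATE WAREHOUSE analyse_wh   WITH WAREHOUSE_SIZE = 'XSMALL' AUTO_SUSPEND = 60 AUTO_RESUME = TRUE;

-- Resize a warehouse on demand, from extra-small up to 4X-Large.
ALTER WAREHOUSE transform_wh SET WAREHOUSE_SIZE = 'X4LARGE';

-- The CTAS (create table as select) pattern mentioned above, run on the
-- resized warehouse.
USE WAREHOUSE transform_wh;
CREATE TABLE sales_copy AS
SELECT * FROM sales_raw;

-- Scale back down once the heavy lifting is done.
ALTER WAREHOUSE transform_wh SET WAREHOUSE_SIZE = 'LARGE';
```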
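
Cross-region and cross-cloud replication is likewise driven by a handful of statements. This is an outline only, using hypothetical organisation, account and database names; ENABLE FAILOVER requires the Business Critical edition.

```sql
-- On the primary account: allow the database to be replicated (and, on
-- Business Critical edition, to fail over) to a secondary account.
ALTER DATABASE sales_db ENABLE REPLICATION TO ACCOUNTS myorg.us_east_account;
ALTER DATABASE sales_db ENABLE FAILOVER    TO ACCOUNTS myorg.us_east_account;

-- On the secondary account: create a local replica and keep it refreshed.
CREATE DATABASE sales_db AS REPLICA OF myorg.us_west_account.sales_db;
ALTER DATABASE sales_db REFRESH;

-- In a disaster, promote the replica to become the new primary.
ALTER DATABASE sales_db PRIMARY;
```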
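
And the backup and time-travel capabilities described above reduce to one-line commands. Again, a minimal sketch with a hypothetical sales table:

```sql
-- Query a table as it looked one hour ago (time travel by offset in seconds).
SELECT *
FROM sales AT (OFFSET => -3600);

-- Take a transactionally consistent, zero-copy "backup" in seconds.
CREATE TABLE sales_backup CLONE sales;

-- Recover a table dropped by mistake.
DROP TABLE sales;
UNDROP TABLE sales;
```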

Impact of Cloud Computing

“What the hell is cloud computing? … I’ve no idea what everyone’s talking about.” — Larry Ellison

Of course, despite Larry Ellison’s protests back in 2008 that cloud computing was “complete gibberish”, Oracle now appears to have fully embraced a future in the cloud. That is just as well since, according to RightScale, the battle of cloud versus on-premises is already won; the battle now is over serverless, stream processing and containers.

Conclusion

In conclusion, while the rest of the world was looking towards Hadoop to resolve the challenges of “Big Data”, Snowflake quietly redefined what’s possible. The solution can be deployed within minutes, with nothing more complex than a credit card, and an extra-small warehouse running for an hour costs less than a cup of coffee. As a platform it is remarkably flexible and will scale up to execute massive, terabyte-sized queries, reducing elapsed time from 5.5 hours down to just two minutes. Equally, as the UK-based food delivery service Deliveroo found, it can automatically scale out to cope with a huge number of concurrent users, then silently scale back when no longer needed, as sketched below.
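
That scale-out behaviour is exposed through multi-cluster warehouses. A minimal sketch, with a hypothetical warehouse name, showing a warehouse configured to add clusters automatically as concurrency rises and remove them when demand drops (multi-cluster warehouses require the Enterprise edition):

```sql
-- A multi-cluster warehouse: Snowflake adds clusters (up to MAX_CLUSTER_COUNT)
-- when queries start to queue, and removes them again when demand falls.
CREATE WAREHOUSE reporting_wh WITH
  WAREHOUSE_SIZE    = 'MEDIUM'
  MIN_CLUSTER_COUNT = 1
  MAX_CLUSTER_COUNT = 10
  SCALING_POLICY    = 'STANDARD'
  AUTO_SUSPEND      = 60
  AUTO_RESUME       = TRUE;
```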

Unlike Oracle, it is incredibly simple to manage, with no statistics to capture, no indexes to maintain, and data partitioning, compression and encryption that are automatic and completely transparent.

If you’d like to try Snowflake for yourself, you can start a free trial with $400 of credit. There’s no hardware to configure or software to install, and the online documentation is equally easy to follow with lots of getting started videos.

