Comparing SQL databases and Hadoop

Given that Hadoop is a framework for processing data, what makes it better than standard relational databases? One reason is that SQL (Structured Query Language) is by design targeted at structured data, whereas many of Hadoop’s initial applications deal with unstructured data such as text. From this perspective, Hadoop provides a more general paradigm than SQL.

When working only with structured data, the comparison is more nuanced. In principle, SQL and Hadoop can be complementary: SQL is a query language, and it can be implemented on top of Hadoop as the execution engine.
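To make the idea of SQL running on a MapReduce-style engine more concrete, here is a minimal sketch in plain Python. It does not use Hadoop itself; it only imitates the map, shuffle, and reduce phases in a single process. The `ORDERS` data and the function names are hypothetical, chosen for illustration, and the query being mimicked is roughly `SELECT customer, SUM(amount) FROM orders GROUP BY customer`.

```python
from collections import defaultdict
from itertools import chain

# Hypothetical structured records: (customer, amount).
ORDERS = [
    ("alice", 30),
    ("bob", 20),
    ("alice", 15),
    ("carol", 50),
    ("bob", 5),
]


def map_phase(record):
    """Emit (key, value) pairs; the GROUP BY column becomes the key."""
    customer, amount = record
    yield customer, amount


def shuffle(pairs):
    """Collect all values that share a key, as a real engine would do across nodes."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Apply the aggregate (here SUM) to one group."""
    return key, sum(values)


def run_query(records):
    """Run the three phases in sequence within one process."""
    mapped = chain.from_iterable(map_phase(r) for r in records)
    grouped = shuffle(mapped)
    return dict(reduce_phase(k, v) for k, v in grouped.items())


totals = run_query(ORDERS)
print(totals)
```

In this sketch the grouping column plays the role of the map output key and the aggregate function plays the role of the reducer, which is one simple way to see how a declarative query can be translated into map and reduce steps.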

But in practice, the term “SQL databases” tends to refer to a whole set of legacy technologies, with several dominant vendors, optimized for a historical set of applications. Many of these existing commercial databases are a mismatch for the requirements that Hadoop targets.

With that in mind, let’s make a more detailed comparison of Hadoop with typical SQL databases on specific dimensions.

Scale-out instead of scale-up

Scaling commercial relational databases is expensive. Their design favors scaling up: to run a bigger database, you need to buy a bigger machine. In fact, it’s not unusual to see server vendors market their expensive high-end machines as “database-class servers.”

The post Comparing SQL databases and Hadoop appeared first on Big Data Made Simple – One source. Many perspectives..
