Comparing Big Data Performance with Different Data Lake Storages

Every big data user I talk to has a data lake and data warehouse use case. It typically starts with Hadoop, using HDFS as the data lake and Apache Spark for distributed processing. A data warehouse is always there because everyone likes SQL. The big trend in this area is embracing an as-a-service model and architecture. This is probably influenced by the cloud, but it does not stop at…

Yifeng Jiang

Software & solutions engineer, big data and machine learning, jogger, hiker, traveler, gamer.