Databricks users can now work with a network made up of Fivetran, Qlik, Infoworks, StreamSets, and Syncsort to automatically load data into the lakehouse. “Lakehouse” is a new term coined by Databricks for an architecture that combines the best aspects of data warehouses and data lakes. This can be a significant value-add for an organization looking to support both business intelligence (BI) and machine learning (ML) use cases.
Source: Databricks, 2020
Databricks’ lakehouse provides the following key features:
The data lakehouse, and the idea of providing a single unified data platform, is not new. Offerings such as Azure Synapse, Snowflake, and Amazon Redshift also try to innovate on the traditional data storage and processing platform. However, many of them are not yet fully functional, and some are a mix of strengths and weaknesses. An organization must carefully evaluate its core mandatory requirements before adopting such a broad platform, as some critical functions needed in ETL or SQL may be missing in these technologies for some time to come.