Managed data lakehouse company Onehouse said today it’s adding a new feature called Onetable to its platform that advances its vision of easier, cheaper and faster data lakes.
The news came as it announced that it had raised $25 million in a Series A round of funding led by Addition and Greylock Partners, which previously co-led its seed round in February 2022. To date, Onehouse has raised $33 million.
Onehouse’s data lakehouse is a relatively new kind of offering that enterprises can use to extract insights from their data more efficiently. It’s said to combine the best features of data warehouses and data lakes within a single platform.
The company’s lakehouse service is a cloud-based offering that’s designed for ease of use. Traditionally, setting up a lakehouse environment takes months of implementation work and technical expertise, but Onehouse says it can eliminate all that and get its platform up and running within minutes.
The platform is based on the open-source Apache Hudi offering, which was created by Onehouse founder and Chief Executive Vinoth Chandar while working as a data architect at Uber Technologies Inc. Uber uses Apache Hudi to process around 500 billion data points each day. Other users of Apache Hudi include Amazon.com Inc., Walmart Inc. and General Electric Co.’s aviation business.
One of the key advantages of Onehouse’s lakehouse is that it does away with the need for companies to manage their own extract, transform and load (ETL) processes, allowing users to start analyzing data as soon as it’s generated. It can access information from almost any kind of data warehouse, including AWS Redshift, Google BigQuery and Snowflake, as well as data lake engines like AWS EMR and Databricks.
The problem companies face with ETL is that it’s a slow process that often takes hours, meaning that by the time the data is analyzed, it is less useful. With Onehouse, data can be ingested almost as soon as it’s created, allowing companies to analyze their data in close to real time.
With the addition of its new Onetable feature, Onehouse users can now leverage native performance accelerations in platforms such as Databricks and Snowflake by interoperating with their respective open metadata layers, Delta Lake and Apache Iceberg. In this way, Onetable is said to avoid data fragmentation by eliminating the need to copy data. The result is that companies can offload data management from high-cost, proprietary cloud warehouse services, while streamlining their data architecture with interoperability that supports more diverse data use cases, such as traditional analytics, stream processing, data science, machine learning and artificial intelligence.
“Over the past year we have built a first-of-its-kind cloud product to get data lakes up and running with just a few clicks,” Chandar said. “With Onetable, we are addressing a huge gap in the market around data interoperability, while enabling our customers to use Onehouse seamlessly with any major query engine.”
The new feature is likely just the first of more to come, as the company promised that the funds from today’s round will be used to continue advancing Onehouse and also grow its team to meet market demand for its product.
Greylock Partners’ Jerry Chen said Onehouse is playing a major role in shaping the data lakehouse market. “It’s delivering core infrastructure needs like data management, ingestion, performance tuning and interoperability with the ease of a cloud data warehouse,” he said.