“Everything delivered in 10 minutes” is a tough promise to keep. But with the power of data, the impossible becomes possible, and India’s instant delivery service Blinkit is meeting and even sometimes exceeding its goal.
Building a delivery system that can handle peak time demands of over 7,000 orders per minute requires a reliable, secure but also agile and scalable architectural foundation. Traditional data warehousing didn’t make the cut, so Blinkit chose a more revolutionary data management model – the open data lakehouse.
“It has been the primary driver of how, as a company, we have shifted our delivery model from us delivering in one day to someone who is delivering in 10 minutes,” said Satyam Krishna (pictured, center), engineering manager of Blinkit (Grofers India Pvt. Ltd.).
Krishna was joined by Blinkit’s Akshay Agarwal (pictured, right), lead engineer of data platforms; and Wen Phan (pictured, left), director of product marketing at Ahana Cloud Inc., for a conversation with theCUBE industry analyst John Furrier during the AWS Startup Showcase: “Data as Code — The Future of Enterprise Data and Analytics” event, an exclusive broadcast on theCUBE, SiliconANGLE Media’s livestreaming studio. They discussed how Ahana’s SaaS managed service for Presto gives Blinkit the advanced data management capabilities it needs to meet its instant delivery promise. (* Disclosure below.)
Open data lakehouse combines performance, openness and cost efficiency
Over the past couple of decades, the data warehouse has been the default choice for SQL-based data processing and reporting. But the model encounters problems when it comes to managing cloud-scale data.
Expensive and difficult to manage, relational database warehousing has limited capacity and restricted access. This has led to a trend toward cloud data lakes that offer inexpensive, scalable storage. Reliability and performance are obtained by adopting open frameworks that can operate on top of the cloud data lake, and a distributed query engine enables data accessibility.
In short, traditional data warehouses offer a closed stack that runs from data to SQL query processing to reporting and dashboarding. The open data lakehouse instead builds on the cloud data lake to allow data processing through open-source tools, such as Presto, TensorFlow and Apache Spark SQL, and layers reporting and dashboarding, data science operations, and in-data-lake transformation on top.
“It’s really a paradigm on being able to run a gamut of analytics on top of the open data lake,” Phan said.
Not being locked into a specific data format was important for Blinkit, which has multiple overlapping use-case scenarios. Using open table formats allows the company to choose the most appropriate format for each use case without rearchitecting, and to reevaluate those choices as open table formats mature and expand.
“When you have an open architecture, they can kind of really put together best-of-breed. And as these projects evolve, they can continue to monitor it and then make decisions and continue to remain agile based on the landscape and how it’s evolving,” Phan stated.
The open data lakehouse model also smooths and speeds the path from having data to gaining value from that data, which brings benefits to the business.
“Having the ability to plug and play a different format based on the use case helps you to grow faster, helps you to take decisions faster,” Krishna stated.
Presto provides fast answers
This is where Presto’s ability to perform fast analytic queries on cloud-scale data comes to the fore. The open-source, distributed SQL query engine “helps you stitch that data together with multiple data formats, gives you the performance that you need, and it works out the best step,” Krishna stated.
Ahana has chosen Presto as its default data query engine because it is distributed and scalable, and because it can federate queries, meaning it can connect to and query across different data sources. This makes it flexible for use-case scenarios beyond just building on data lakes, according to Phan.
“You have to think about how are these organizations maturing with this technology. It’s not necessarily an all or nothing,” he said, explaining that companies may choose to augment their data lake with other analytical cloud data stores, such as Snowflake Inc.’s Data Cloud or Amazon Web Services Inc.’s Redshift.
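To make the idea of federation concrete, here is a minimal sketch of one SQL statement joining tables that live in two separate data stores. It does not use Presto itself: Python's built-in sqlite3 module and SQLite's ATTACH DATABASE stand in for Presto catalogs so the example runs anywhere. The names lake, warehouse, orders and stores, and all of the sample rows, are invented for illustration and do not describe Blinkit's data.

```python
import sqlite3


def build_demo_connection() -> sqlite3.Connection:
    """Open one connection with two attached databases.

    Assumption: each attached SQLite database plays the role of a
    separate data source (for example, a data lake and a cloud warehouse).
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("ATTACH DATABASE ':memory:' AS lake")
    conn.execute("ATTACH DATABASE ':memory:' AS warehouse")
    conn.execute(
        "CREATE TABLE lake.orders "
        "(order_id INTEGER, store_id INTEGER, minutes_to_deliver REAL)"
    )
    conn.execute("CREATE TABLE warehouse.stores (store_id INTEGER, zone TEXT)")
    # Made-up sample rows, purely illustrative.
    conn.executemany(
        "INSERT INTO lake.orders VALUES (?, ?, ?)",
        [(1, 10, 8.0), (2, 10, 12.0), (3, 20, 9.0)],
    )
    conn.executemany(
        "INSERT INTO warehouse.stores VALUES (?, ?)",
        [(10, "zone_a"), (20, "zone_b")],
    )
    return conn


# One query spans both sources; the caller never moves data between them.
FEDERATED_QUERY = """
SELECT s.zone,
       COUNT(*) AS order_count,
       AVG(o.minutes_to_deliver) AS avg_minutes
FROM lake.orders AS o
JOIN warehouse.stores AS s ON s.store_id = o.store_id
GROUP BY s.zone
ORDER BY s.zone
"""


def orders_by_zone(conn: sqlite3.Connection) -> list:
    """Run the cross-source join and return (zone, count, average) rows."""
    return conn.execute(FEDERATED_QUERY).fetchall()


if __name__ == "__main__":
    for row in orders_by_zone(build_demo_connection()):
        print(row)
```

In a real Presto deployment the two sources would be configured as catalogs and addressed with catalog.schema.table names, but the shape of the query, a single SQL join across stores, is the same idea.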
Presto decouples storage from compute so that users can choose any kind of storage, according to Agarwal. It also simplifies access: because every source is queried through one SQL interface, users do not need to learn a different query language or paradigm for each source.
“[Consumers] just need to learn a single tool, and they get a single place to consume from and a single destination to write on,” he stated.
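The separation of storage from compute that Agarwal describes can be sketched in a few lines. This is a toy illustration under stated assumptions, not how Presto is implemented: the Storage protocol, the two backends and the count_by function are hypothetical names, and the rows are invented. The point is only that the same compute logic runs unchanged against whichever storage backend is plugged in.

```python
import csv
import io
from typing import Dict, Iterable, Iterator, Protocol

Row = Dict[str, str]


class Storage(Protocol):
    """Hypothetical storage interface: anything that can yield rows."""

    def scan(self) -> Iterator[Row]: ...


class InMemoryStorage:
    """Backend 1: rows held in a Python list."""

    def __init__(self, rows: Iterable[Row]) -> None:
        self._rows = list(rows)

    def scan(self) -> Iterator[Row]:
        return iter(self._rows)


class CsvTextStorage:
    """Backend 2: rows parsed from CSV text (a stand-in for files in a lake)."""

    def __init__(self, text: str) -> None:
        self._text = text

    def scan(self) -> Iterator[Row]:
        return iter(csv.DictReader(io.StringIO(self._text)))


def count_by(storage: Storage, column: str) -> Dict[str, int]:
    """Compute layer: group-and-count that knows nothing about the backend."""
    counts: Dict[str, int] = {}
    for row in storage.scan():
        key = row[column]
        counts[key] = counts.get(key, 0) + 1
    return counts


if __name__ == "__main__":
    # Made-up sample data, purely illustrative.
    memory = InMemoryStorage(
        [
            {"order_id": "1", "status": "delivered"},
            {"order_id": "2", "status": "delivered"},
            {"order_id": "3", "status": "pending"},
        ]
    )
    lake = CsvTextStorage("order_id,status\n1,delivered\n2,delivered\n3,pending\n")
    print(count_by(memory, "status"))
    print(count_by(lake, "status"))
```

Swapping the backend changes nothing in count_by, which mirrors the "single tool, single place to consume from" experience described in the quote.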
Open data lakehouse removes the boundaries between technological and business goals
“In terms of a managed service, the guys at Blinkit are a great high-performing team, but they have to be very efficient with their time and what they manage,” Phan said.
Ahana helps by removing the heavy lifting of managing data while still giving them the ability to fine-tune resources.
“That’s been kind of the balancing point,” Phan said. He added that total cost of ownership is calculated not only from actual service charges, but also from the gains of freeing up engineers to focus on primary business objectives.
Business objectives are of equal importance to the technology that drives them, even for engineers, according to Krishna.
“When your [business is] delivering in 10 minutes, your leaders want to take decisions in near real time, which means you need to move with them. You need to move with the business,” he said.
Stay tuned for the complete video interview, part of SiliconANGLE’s and theCUBE’s coverage of the AWS Startup Showcase: “Data as Code — The Future of Enterprise Data and Analytics” event.
(* Disclosure: TheCUBE is a paid media partner for the AWS Startup Showcase: “Data as Code — The Future of Enterprise Data and Analytics” event. Neither Ahana Cloud Inc., the sponsor for theCUBE’s event coverage, nor other sponsors have editorial control over content on theCUBE or SiliconANGLE.)