Building on its growing momentum in the market for hybrid transactional/analytical database management systems, Oracle Corp. today is adding machine learning capabilities, new automation features and improved performance on the Amazon Web Services Inc. cloud to its MySQL HeatWave product.
Introduced in late 2020, HeatWave is an in-memory analytic accelerator that scales to thousands of cores, supports both online transaction processing and real-time analytics and has turned heads with its impressive price performance. The embedded machine learning features mean developers don’t have to extract data from another database to populate training models. Training also incurs no additional cost to customers in contrast to AWS’s SageMaker managed machine learning service, which is priced separately from AWS’s Redshift cloud data warehouse. Oracle claims MySQL HeatWave is 25 times faster than Redshift and costs up to 99% less.
Automated machine learning
Oracle is bundling some popular machine learning applications into the engine, including support for anomaly detection, recommendation and multivariate time-series forecasting. Anomaly detection is popular in financial services for uses such as determining why a credit card is declined as well as for monitoring machinery and manufacturing environments. Recommendation systems are used to provide personalized suggestions to shoppers or video viewers. Multivariate time-series forecasting predicts future values from several interrelated variables at once.
HeatWave AutoML doesn’t require a machine learning framework and can import models based on the Open Neural Network Exchange standard. Oracle also automated much of the training process, said Nipun Agarwal, senior vice president for MySQL database and HeatWave at Oracle.
“Not only do we now offer machine learning processing inside the database, but the training process, which is the more complicated part of machine learning, is fully automated as well,” he said. That includes an automated selection of algorithms, features and parameters. Oracle said that other cloud services only recommend algorithms, requiring users to select the most appropriate one and manually tune it.
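To make the idea of automated algorithm selection concrete, here is a minimal Python sketch. It is not HeatWave AutoML's implementation. The two candidate models, the toy data and the names fit_mean, fit_line and auto_select are all assumptions made for illustration. The loop fits every candidate on a training split and keeps whichever one has the lowest error on a held-out split.

```python
# Illustrative only: a toy "pick the best model automatically" loop.
# None of these names come from HeatWave; they are hypothetical.

def fit_mean(xs, ys):
    """Baseline candidate: always predict the mean of the training targets."""
    mean = sum(ys) / len(ys)
    return lambda x: mean

def fit_line(xs, ys):
    """Candidate: least-squares straight line y = a*x + b."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    var = sum((x - mx) ** 2 for x in xs)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / var
    b = my - a * mx
    return lambda x: a * x + b

def mse(model, xs, ys):
    """Mean squared error of a model on a data split."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def auto_select(candidates, train, valid):
    """Fit every candidate; return (name, model) with the lowest validation error."""
    best = None
    for name, fit in candidates.items():
        model = fit(*train)
        err = mse(model, *valid)
        if best is None or err < best[0]:
            best = (err, name, model)
    return best[1], best[2]

# Toy data that follows y = 2x + 1 (an assumption for the demo).
train = ([0, 1, 2, 3, 4, 5], [1, 3, 5, 7, 9, 11])
valid = ([6, 7], [13, 15])
CANDIDATES = {"mean": fit_mean, "line": fit_line}

best_name, best_model = auto_select(CANDIDATES, train, valid)
print(best_name)
```

A production system would search over many more algorithms, features and parameters, but the principle of scoring candidates on held-out data and keeping the winner is the same.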
“All models can be explained because we use model diagnostic techniques,” Agarwal said. “Users can run explanations directly from the console, which means no ML expertise is required. We have also introduced this notion of what-if scenarios where the user can toggle various attributes to see if it changes the outcome of the machine learning model.”
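A what-if scenario of the kind Agarwal describes can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the HeatWave console's behavior: the approve_loan rule stands in for a trained model, and the attribute names and thresholds are invented for the example. The what_if helper scores a record, changes the chosen attributes and scores it again so the two outcomes can be compared.

```python
# Illustrative only: a hand-written rule standing in for a trained model.
# The attributes and thresholds below are hypothetical.

def approve_loan(applicant):
    """Hypothetical model: approve when a simple score reaches 2."""
    score = 0
    score += 2 if applicant["income"] >= 50_000 else 0
    score += 1 if applicant["years_employed"] >= 2 else 0
    score -= 2 if applicant["missed_payments"] > 0 else 0
    return score >= 2

def what_if(model, record, **changes):
    """Return (prediction for the record, prediction after applying the changes)."""
    modified = {**record, **changes}  # copy, so the original record is untouched
    return model(record), model(modified)

applicant = {"income": 60_000, "years_employed": 3, "missed_payments": 1}

# Toggle one attribute and see whether the outcome flips.
before, after = what_if(approve_loan, applicant, missed_payments=0)
print(before, after)
```

Here clearing the missed payment flips the decision, which is the sort of signal a user would look for when toggling attributes.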
Machine learning is also being applied to MySQL Autopilot for better workload planning and monitoring. Auto Shape Prediction provides observability into OLTP workloads that “makes it very easy to see why Autopilot is making the decisions it does,” Agarwal said. It can provide recommendations such as which tables to unload or purge from memory to reduce costs. Recommendations are supported by visual analyses of historical performance trends, including throughput and buffer pool hit rate.
Oracle is also fine-tuning HeatWave to run on the AWS cloud by tightening integration with S3 object storage, CloudWatch and PrivateLink. HeatWave now provides an optimized storage layer built on S3 object storage for hybrid columnar representation. When data is loaded from MySQL into HeatWave, a copy is written to the scale-out data management layer built on S3. If reloading is required, the data can be loaded without the need for transformation, resulting in significantly faster recovery and improved availability, Oracle said. Data never leaves the AWS cloud, eliminating egress fees.
Agarwal said HeatWave on AWS is 20 times faster than Redshift and 10 times faster than Snowflake Inc.’s namesake data warehouse. Among the reasons, he said, is that “HeatWave is designed from the ground up for distributed scale-out query processing. It has been designed for commodity hardware and since we use ML-based automation the system learns and improves on the fly.”
Finally, Oracle is expanding the range of HeatWave node shapes and changing their pricing. A smaller 32-gigabyte shape is being added to the existing 512-gigabyte offering at a cost of $16 per month. In addition, the amount of data that can be processed by a standard 512-gigabyte node has been increased from 800 gigabytes to one terabyte. That adds up to a 15% overall price-performance improvement, Oracle said.
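The effect of the higher per-node capacity on cluster size can be worked through with simple arithmetic. The Python sketch below assumes a hypothetical 10-terabyte data set, decimal units (1 terabyte = 1,000 gigabytes) and capacity that scales linearly with node count. None of those assumptions comes from Oracle, and the sketch does not attempt to reproduce the 15% figure.

```python
import math

def nodes_needed(data_gb, capacity_per_node_gb):
    """Smallest whole number of nodes whose combined capacity covers the data."""
    return math.ceil(data_gb / capacity_per_node_gb)

DATA_GB = 10_000  # hypothetical 10 TB data set, decimal units assumed

old_nodes = nodes_needed(DATA_GB, 800)    # previous per-node limit
new_nodes = nodes_needed(DATA_GB, 1_000)  # new per-node limit
print(old_nodes, new_nodes)
```

Under these assumptions the same data set fits on 10 nodes instead of 13.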