The foundational nature of time series data across the Universe


Time series data comes from monitoring changes over time. It’s not a new idea. Changes in rainfall patterns and stock performance figures have been tracked for hundreds of years. Before computers, time series data was plotted on graphs manually. Then, with the advent of the computing age, relational databases were modified by adding a column to capture time.

It was a good enough solution for decades. But the proliferation of connected devices has thrown data generation into the realm of exabytes per day. As an example of scale, consider that retail giant Walmart Inc. manages 1.5 billion messages from more than 7 million unique internet of things data points every day. And that’s just for its U.S. operations.

As IoT expands, organizations need specialized databases that can ingest metrics at scale and analyze data in real time. The solution is purpose-built time series databases that provide higher performance at lower costs.

“When you think about IoT and edge scale, where things are happening super-fast, ingestion is coming from many different sources and analysis often needs to be done in real time or near real time,” said theCUBE industry analyst Dave Vellante. “At scale time series databases are simply a better fit for the job.”

During a special broadcast from theCUBE and InfluxData Inc., “The Future Is Built on InfluxDB,” Vellante and fellow analyst John Furrier spoke with Evan Kaplan (pictured), chief executive officer of InfluxData; Brian Gilmore, director of internet of things and emerging technologies at InfluxData; Angelo Fausti, software engineer at the Vera C. Rubin Observatory; and Caleb MacLachlan, senior spacecraft operations software engineer with Loft Orbital Solutions Inc. In a series of sessions, they discussed the growing demand for time series databases and why organizations are choosing InfluxDB to gain insights from their timestamped data. (* Disclosure below.)

IoT sensors speak time series data

Walmart Labs is just one of the hundreds of companies, big and small, that rely on InfluxData to help manage their time series data. The reason is simple: When it comes to advanced analytics, time matters.

“Everybody wants to build increasing intelligent systems. And in order to build these increasing intelligent systems, you have to instrument the system well and you have to instrument it over time [to get] better and better,” Kaplan said.

In his session on “How Time Series Data Moves the World,” Kaplan discussed the factors driving the adoption of time series databases. He believes InfluxData came to be in the right place at the right time thanks to the foresight of founder Paul Dix.

“From my point of view, he invented modern time series,” Kaplan stated.

Before Dix’s innovation, developers had to customize general-purpose relational databases to optimize them for time series data. When open-source InfluxDB was launched, it eliminated this step, providing developers with a purpose-built time series platform that they could use to get to work immediately.
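As a rough illustration of the kind of customization described above, the sketch below adds a timestamp column and a hand-tuned index to an ordinary relational table, using Python's built-in sqlite3 module. The table name, column names and sample readings are hypothetical and are not drawn from any system mentioned in this article.

```python
import sqlite3

# Hypothetical schema: a general-purpose table with a timestamp column added.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE readings (
        sensor_id TEXT    NOT NULL,
        ts        INTEGER NOT NULL,  -- Unix epoch seconds
        value     REAL    NOT NULL
    )
    """
)
# An index on (sensor_id, ts) is typical hand-tuning for time-range queries.
conn.execute("CREATE INDEX idx_readings_sensor_ts ON readings (sensor_id, ts)")

# Made-up sample data: three readings from one sensor, one from another.
rows = [
    ("pump-1", 1_700_000_000, 20.5),
    ("pump-1", 1_700_000_060, 21.0),
    ("pump-1", 1_700_000_120, 21.5),
    ("pump-2", 1_700_000_000, 18.0),
]
conn.executemany("INSERT INTO readings VALUES (?, ?, ?)", rows)


def average_between(conn, sensor_id, start_ts, end_ts):
    """Mean value for one sensor over the half-open range [start_ts, end_ts)."""
    cur = conn.execute(
        "SELECT AVG(value) FROM readings "
        "WHERE sensor_id = ? AND ts >= ? AND ts < ?",
        (sensor_id, start_ts, end_ts),
    )
    return cur.fetchone()[0]  # None when no rows fall in the range


print(average_between(conn, "pump-1", 1_700_000_000, 1_700_000_120))  # 20.75
```

Retention, downsampling and high-rate ingestion would all have to be built by hand on top of a table like this, which is the step a purpose-built platform is meant to remove.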

InfluxDB’s influence lies in the relationship between time series and IoT, according to Kaplan.

“If you think about it, what sensors do is they speak time series,” he said. “Pressure, temperature, volume, humidity, light, they’re measuring, they’re instrumenting something over time.”

And when it comes to ingesting metrics at scale and analyzing data in real time, purpose-built time series databases provide higher performance at a lower cost than their general-purpose counterparts.

Consumer IoT, the industrial IoT, and renewable and sustainability initiatives are the three “sweet spot” use cases for InfluxDB, according to Kaplan. The connection is that off-the-shelf systems can’t manage the complex interactions required between physical devices and software in the IoT. This puts the onus on developers to create a system that meets the core needs of the business. Whether it’s predicting gene expression when sequencing the human genome or anticipating order dates in a supply chain, “starting with the developer is where all of the good stuff happens,” Kaplan added.

Open-source code enables assembly line application development

Kaplan compared the way developers can quickly build custom time series data systems to the way Boeing builds an aircraft, where the majority of the process is sourcing pre-built parts. Just as Boeing sources physical components to build an airplane, relying on its engineers to build only the technologically complex and critical wings, software engineers can access open-source code to form the basis of their custom applications.

A developer could pull time series capability from InfluxData and assemble it with a Kafka interface to be able to stream in data, Kaplan said, giving an example of the process.

“[Developers] become very good integrators and assemblers, but they become masters of that bespoke application,” he said.

Any business more than a decade or two old that relies on time series data already runs applications on a traditional database that it has customized, according to Kaplan.

“There are probably hundreds of millions of time series applications out there today,” he said.

When it comes time for those companies to modernize their infrastructure (and for many businesses, that time is now), a purpose-built time series database should be the first purchase, Kaplan stated.

“There’s no system I can think of where time series shouldn’t be the foundational base layer,” he said. “If you just want to store your data and just leave it there and then maybe look it up every five years, that’s fine. That’s not time series. Time series is when you’re building a smarter more intelligent, more real-time system.”

Here is theCUBE’s complete video interview with Evan Kaplan:

Edge computing on Earth …

Edge computing has hit the mainstream, according to Brian Gilmore, InfluxData’s director of IoT and emerging technologies. In his session, titled “Life on the Edge,” he discussed how “data-driven” everything is becoming reality as physical and digital converge in the IoT.

The common link between consumer, commercial and scientific modernization is their hybrid nature, with data being generated in the cloud and out on the edge, according to Gilmore.

“It’s the ability for a developer to tie those two systems together and work with that data in a very unified, uniform way that’s giving them the opportunity to build solutions that really deliver value to whatever it is they’re trying to do, whether it’s the outer reaches of outer space or whether it’s optimizing the factory floor,” he stated.

InfluxDB enables this because the product itself is built on application programming interfaces, commonly known as APIs.

“People like to think of it as having this nice front end, but the front end is built on our public APIs,” Gilmore said.

This allows the developer to build hooks for data creation, processing, analytics, and also extraction to make the data accessible to other platforms, applications or microservices.
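As one hedged example of such a hook, the sketch below prepares a write against the InfluxDB 2.x HTTP API using only Python's standard library. It assumes the `/api/v2/write` endpoint with token authentication; the host, organization, bucket and token are placeholders, and the request is only constructed here, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_write_request(base_url, org, bucket, token, lines, precision="ns"):
    """Prepare a POST carrying line-protocol text; the caller decides when to send it."""
    query = urlencode({"org": org, "bucket": bucket, "precision": precision})
    return Request(
        url=f"{base_url.rstrip('/')}/api/v2/write?{query}",
        data="\n".join(lines).encode("utf-8"),  # one line-protocol point per line
        headers={
            "Authorization": f"Token {token}",
            "Content-Type": "text/plain; charset=utf-8",
        },
        method="POST",
    )


# Placeholder connection details, for illustration only.
request = build_write_request(
    "http://localhost:8086",
    org="example-org",
    bucket="telemetry",
    token="example-token",
    lines=["environment,sensor=th-01 temperature=21.5 1700000000000000000"],
)
print(request.get_method(), request.full_url)
```

Passing the prepared request to `urllib.request.urlopen` would perform the write against a running instance. Keeping construction separate from sending makes the hook easy to test or to queue at the edge before forwarding to the cloud.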

“You can use InfluxDB as sort of an input to automation control, the autonomy that people are trying to drive at the edge. But when you link it up with everything that’s in the cloud, that’s when you get all of the sort of cloud-scale capabilities of parallelized AI and machine learning and all of that,” Gilmore stated.

Here is theCUBE’s complete video interview with Brian Gilmore:

… and in outer space

The future frontier for edge is outer space, according to Gilmore, who believes companies need to prepare for a time when data is gathered across the Universe. During a special panel, two InfluxData customers said that future is already here.

Watching space from Earth is the mission of the Vera C. Rubin Observatory, which was represented in the panel discussion by software engineer Angelo Fausti. The observatory is home to the Legacy Survey of Space and Time, or LSST, which has the goal of producing the “deepest widest image of the Universe.”

“We are going to observe the sky with [the] eight-meter optical telescope and take 1,000 pictures every night with a 3.2 gigapixel camera. And we’re going to do that for 10 years,” Fausti stated.

This will produce an “unprecedented data set,” which Fausti estimates will consist of around 5 exabytes of image data. While the Rubin Observatory isn’t using InfluxDB for the image data itself, it is using it to record engineering data and metadata about the observations, such as telemetry events and the commands from the telescope.

The InfluxDB time series database is just one component of the InfluxData time series platform, which comprises the TICK stack — the Telegraf open-source server-based agent for collecting and sending metrics and events; InfluxDB; the Chronograf administrative user interface; and the Kapacitor native data processing agent. Tools such as Chronograf have been a game-changer for the LSST team, according to Fausti.

“Astronomers typically are not software engineers, but they are the ones that know better than anyone what needs to be monitored. And so they use Chronograf and they can create the dashboards and the visualizations that they need,” he said.

Edge computing has already reached outer space for InfluxData customer Loft Orbital Solutions Inc., which provides space infrastructure as a service, enabling companies to access satellites on a pay-as-you-go basis.

“Our mission is to provide rapid access to space,” said Caleb MacLachlan, Loft Orbital senior spacecraft operations software engineer, during the customer panel portion of the event. The company’s model removes the cost barriers to deploying a mission in orbit, making it “almost as simple as deploying a VM in AWS or GCP,” MacLachlan said.

When Loft Orbital started, the company used a relational database to store the massive amounts of time series data generated by the cameras, sensors, radios and other devices on its satellites.

“We found that it was so slow, and the size of our data would balloon over the course of a couple of days to the point where we weren’t able to even store all of the data that we were getting,” MacLachlan said.

The company solved this problem completely by migrating to InfluxDB. Now, it can “easily store the entire volume of data for the mission life so far, without having to worry about the size,” according to MacLachlan.

Like Fausti, MacLachlan believes that InfluxData’s potential to give non-developers simple tools to access the data they need, in the way they want it, is “the most powerful thing about it.”

“It gets us out of the way, us software engineers, who may not know quite as much as the scientists and engineers that are closer to the interesting math. And they build these crazy dashboards that I’m just like, wow, I had no idea you could do that. I had no idea that is something that you would want to see,” MacLachlan concluded.

Here is theCUBE’s complete video interview with Angelo Fausti and Caleb MacLachlan:

(* Disclosure: TheCUBE is a paid media partner for “The Future Is Built on InfluxDB” event. Neither InfluxData Inc., the sponsor for theCUBE’s event coverage, nor other sponsors have editorial control over content on theCUBE or SiliconANGLE.)

Image: SiliconANGLE
