The impact of generative artificial intelligence models on the internet, on applications and on people is still in its infancy, but it's already clear that the technology is a major "game-changer."
“I call it the fifth wave in the industry,” said theCUBE industry analyst John Furrier, during an exclusive broadcast on theCUBE, SiliconANGLE Media’s livestreaming studio. “This generative AI thing is real. And you’re starting to see startups come out in droves.”
What AI is doing right now is opening eyes in the mainstream, and the applications are almost “mind-blowing,” according to Furrier.
To learn more, theCUBE connected with some of the top tech startups seeking to build generative AI on AWS during the Startup Showcase Season 3 premiere.
Here are three key insights you may have missed:
1) If software ‘eats the world,’ computer vision is a big deal.
When it comes to the new wave being highlighted by Furrier involving large language models and computer vision, things are likely just getting started.
Software is everywhere, and anything can have software related to it, according to Joseph Nelson, co-founder and chief executive officer of Roboflow Inc. — but the limiting reactant is how to enable computers and machines to understand things as people can.
“Computer vision is that missing element that enables anything that you see to become software,” Nelson said. “In the virtue of if software is eating the world, computer vision kind of makes the aperture infinitely wide. The capabilities are there, the open-source models are there, the amount of data is there, the computer capabilities are only improving annually.”
But there's still a pretty big shortage of tooling — an early but promising sign of the coming explosion of use cases, models and data sets that practitioners will need in order to bring these capabilities to bear.
What is the vision for the technology five or 10 years from now? If one were to picture a bell curve, the normal distribution of the types of things in the center of the bell curve would be identifying objects that are very common, or common objects in context, according to Nelson.
“Deep into the tail of this imagined visual normal distribution, you’re going to have a problem like one of our customers, Rivian — in tandem with AWS — is tackling to do visual quality assurance and manufacturing in production processes,” Nelson said.
Only Rivian knows what a Rivian is supposed to look like, Nelson explained. Then, between those long tails of proprietary data about highly specific things and the common objects at the center of the curve, there is a whole "messy middle" of problems.
“Over time, you’ll get more and more of these larger models that kind of eat outwards from that center of the distribution,” Nelson stated. “And so the question becomes for companies, when can you rely on maybe a model that just already exists? How do you use your data to get what may be capable off the shelf, so to speak, into something that is usable for you?”
Here’s theCUBE’s complete video interview with Joseph Nelson:
2) Optimizing deep learning performance will be a big focus.
Part of the big wave taking place right now will be a retooling of business with AI, according to Furrier.
“Companies that aren’t retooling their business right now with AI first will be out of business, in my opinion,” he said. “This really, truly is the beginning of the next-gen machine learning AI trend.”
That, of course, started with ChatGPT, but it is just the beginning. The models are great, but enterprises want to apply them to their own data, on their own infrastructure, at scale and at the edge, according to Jay Marshall, head of global business development at Neural Magic Inc.
“We’re helping enterprises accelerate that through optimizing models and then delivering them at scale in a more cost-effective fashion,” he said.
The company's optimization tools and runtime are built around most of the common computer vision and natural language processing models, according to Marshall.
“Your YOLOs, your BERTs, you know, your DistilBERTs and what have you, so we work to help optimize those, again, who’ve gotten great performance and great value for customers trying to get those into production,” he said.
But when customers get into the large language models, Neural Magic's research teams have been right in the trenches with those.
"Being able to actually take a multi-hundred-billion-parameter model and sparsify that or optimize that down, shaving away a ton of parameters and being able to run it on smaller infrastructure," Marshall said. "All this stuff came out in the last six months in terms of being turned loose into the wild. But we're staying in the trenches with folks so that we can help optimize those, as well as not require, again, the heavy compute, the heavy cost, the heavy power consumption as those models evolve."
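The sparsification Marshall describes can be illustrated with magnitude pruning, one common technique for "shaving away" parameters: the smallest-magnitude weights are zeroed out so the model can run on smaller infrastructure. This is a minimal, toy sketch of the general idea, not Neural Magic's actual method; the function name and the toy 4x4 "layer" are illustrative.

```python
import numpy as np

def magnitude_prune(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the smallest-magnitude fraction of weights (unstructured pruning)."""
    k = int(weights.size * sparsity)
    if k == 0:
        return weights.copy()
    # Find the magnitude threshold below which weights are shaved away.
    threshold = np.partition(np.abs(weights).ravel(), k - 1)[k - 1]
    return np.where(np.abs(weights) <= threshold, 0.0, weights)

# A toy 4x4 "layer" pruned to 75% sparsity keeps only the 4 largest weights.
rng = np.random.default_rng(0)
layer = rng.normal(size=(4, 4))
pruned = magnitude_prune(layer, 0.75)
print(np.count_nonzero(pruned))  # 4
```

In practice, production systems combine pruning with retraining and sparse-aware runtimes so that accuracy is preserved and the zeroed weights actually translate into compute savings.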
Here’s theCUBE’s complete video interview with Jay Marshall:
3) There’s a mission to make AI sustainable and accessible for everyone.
There are, of course, questions about how platforms such as ChatGPT will and won’t change business.
“Every senior leader I talk to is rethinking about how to rebuild their business with AI, because now the large language models have come in, these foundational models are here, they can see value in their data,” Furrier said. “This is a 10-year journey in the big data world. Now it’s impacting that, and everyone’s rebuilding their company around this idea of being AI-first, because they see ways to eliminate things and make things more efficient.”
The goal of OctoML Inc. is to give customers an efficient path to production by automating the process of taking a model, optimizing it for a variety of hardware and making it cost-effective, according to Luis Ceze (pictured, left), co-founder and CEO.
“What people have the opportunity to do today is to either train their own model that adds value to their business or find open models out there that can do very valuable things to them,” Ceze said. “The next step really is how do you take that model and put it into production in a cost-effective way so that the business can actually get value out of it, right?”
So why are the production costs such a concern, and how did they reach this point? Training costs often get a lot of attention, because they are normally a large number, according to Ceze. But that training cost is typically a large, one-time, upfront payment, he added.
"But when the model is put into production, the cost grows directly with model usage. And you actually want your model to be used, because it's adding value," he said. "So, the question that a customer faces is: They have a model, they have a trained model, and now what? How much would it cost to run in production?"
It is important to remember that generative AI models like ChatGPT use lots of energy and cost a lot to run, Ceze pointed out. Given that model cost grows directly with usage, it's essential to make sure that once models are put into production, the best cost structure possible is in place. Consider, for example, a model that costs $1 million or $2 million to train but then about one to two cents per session to run.
“If you have a million active users, even if they use it just once a day, it’s $10,000 to $20,000 a day to operate that model in production. And that very, very quickly gets beyond what you paid to train it,” he stated.
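Ceze's back-of-the-envelope math can be checked in a few lines. The figures below are the illustrative numbers from the discussion, not OctoML pricing: at two cents per session and a million daily users, inference spend overtakes a $2 million training bill in about 100 days.

```python
# Illustrative figures from the discussion, not actual pricing.
TRAINING_COST = 2_000_000          # one-time, upfront ($)
COST_PER_SESSION = 0.02            # recurring inference cost ($)
DAILY_ACTIVE_USERS = 1_000_000
SESSIONS_PER_USER_PER_DAY = 1

daily_inference_cost = DAILY_ACTIVE_USERS * SESSIONS_PER_USER_PER_DAY * COST_PER_SESSION
days_to_exceed_training = TRAINING_COST / daily_inference_cost

print(f"Daily inference cost: ${daily_inference_cost:,.0f}")              # $20,000
print(f"Inference passes training cost in {days_to_exceed_training:.0f} days")  # 100 days
```

At the lower one-cent-per-session figure, the daily cost is $10,000 and the crossover takes twice as long — still a matter of months, which is why Ceze argues the production cost structure deserves as much attention as the training bill.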
Here’s theCUBE’s complete video interview with Luis Ceze and Anna Connolly (pictured), vice president of customer success and experience with OctoML:
To watch more of theCUBE’s coverage of the AWS Startup Showcase: “Top Startups Building Generative AI on AWS” event, here’s our complete event video playlist:
(* Disclosure: TheCUBE is a paid media partner for the AWS Startup Showcase: “Top Startups Building Generative AI on AWS” event. Neither AWS, the main sponsor for theCUBE’s event coverage, nor other sponsors have editorial control over content on theCUBE or SiliconANGLE.)