AI Today (Aired 03-04-26) Beyond GPUs: The Next Computing Breakthrough Powering the Future of AI

March 05, 2026 00:50:05

Show Notes

In this episode of AI Today, host Dr. Allen Badeau dives into a critical conversation happening inside top AI labs, government agencies, and venture capital circles: the reality that GPUs powered the AI revolution but they may not be able to take us much further.

Dr. Badeau explains how graphics processing units became the engine behind modern AI, enabling massive parallel computation needed to train neural networks. But as AI models grow exponentially, the compute and energy requirements are rising even faster.

Episode Transcript

[00:00:31] Welcome to AI Today. I'm Dr. Allen Badeau, and today's show is one I've been wanting to do for a long time, because there's a conversation happening behind closed doors at some of the most powerful AI labs, government agencies, and some of the deepest parts of venture capital. [00:00:51] And it goes something like this. [00:00:54] The GPU got us here, but it can't take us where we need to go. There are shortages all over the place, and we've just about hit a window where we can't go any farther. [00:01:06] Today we're going to talk about what's coming next. We're going to talk a little bit about neuromorphic computing, we're going to talk some about quantum computing, real quantum computing, and we're also going to talk a little bit about FPGAs, and why the organizations that figure this out first are really going to own the next decade of AI. So buckle up, because it's going to be an interesting ride. [00:01:33] Now let's start with why GPUs became the engine of the AI revolution. [00:01:42] A GPU is a graphics processing unit, and it was designed to render video games. [00:01:49] Remember going from Pong and those eight-bit games to 16-bit and 32-bit and beyond. And then all of a sudden Doom came out, and it was like, look at the graphics in that. What a GPU really does is thousands and thousands of small, simple calculations happening simultaneously. When we finally figured out how to use that for general computing, that was pretty amazing. And then we realized that's exactly what training a neural net requires. [00:02:26] You've got parallel math taking place at massive scale, and the GPU was the perfect conduit for it. Now the numbers, if you think about it, are staggering. With Nvidia leading the way, their H100 systems can perform nearly 4,000 trillion operations a second. [00:02:53] I'm going to say that one more time. [00:02:55] Four thousand trillion operations per second. [00:03:00] That's four quadrillion, and that is not a typo. [00:03:09] And it's still not enough. [00:03:12] Here's the problem. AI models are growing faster than our ability to compute them. [00:03:21] Every time you double the size of a model, the compute required doesn't double. It roughly quadruples. [00:03:33] And sometimes it's even more than that. [00:03:36] It's simple math, and we're on a collision course where AI ambition surpasses physical reality. [00:03:47] We're not even going to get into GPU shortages or some of those other driving factors today. But you hear people cite Moore's Law, the old rule of thumb that said, in essence, processor power doubles every two years. [00:04:09] It can't keep pace anymore. We keep borrowing against a future built on silicon, and that industry just can't deliver fast enough.
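Since the show leans on this doubling-quadruples relationship, here is a rough back-of-the-envelope sketch in Python. The 6 × parameters × tokens rule of thumb and the tokens-scale-with-parameters assumption come from the scaling-law literature, not from the episode; every constant here is illustrative only.

```python
# Back-of-the-envelope: why doubling model size roughly quadruples training compute.
# Rule of thumb: training FLOPs ~ 6 * parameters * training tokens, and
# compute-optimal token counts tend to grow in proportion to parameter count
# (a Chinchilla-style assumption, used here purely for illustration).

H100_OPS_PER_SEC = 4e15  # ~4,000 trillion ops/sec, the figure quoted in the show

def training_flops(params: float, tokens_per_param: float = 20.0) -> float:
    """Approximate total training FLOPs for a dense model."""
    tokens = tokens_per_param * params
    return 6.0 * params * tokens  # grows with params squared

for params in (1e9, 2e9, 4e9):  # 1B -> 2B -> 4B parameters
    flops = training_flops(params)
    gpu_days = flops / H100_OPS_PER_SEC / 86_400
    print(f"{params/1e9:>4.0f}B params: {flops:.2e} FLOPs "
          f"(~{gpu_days:,.1f} ideal single-GPU days)")

# Each doubling of parameters multiplies total FLOPs by ~4x, which is the
# collision course with physical limits described above.
```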
[00:04:24] We've got energy crises taking place, with water and power and these large data centers going up all over the place. [00:04:34] And then you realize why some of the largest builders of AI models are investing in nuclear power. They're buying up power plants next to rivers that have been abandoned for a while, all so they can have their own power. [00:04:56] Now, let's put this in simple language really quick. Go back to a GPT-4-class model: one training run used roughly the same amount of electricity as about 500 American homes use in an entire year. That's one model, one training run. That's significant. [00:05:21] And that's just training. [00:05:27] Now think about inference: every time somebody types a prompt and hits enter, that's an API call. Every embedded AI feature in every single app. [00:05:40] That's a tremendous amount of energy being used. You hear people say the math doesn't add up when a company pledges to go green by a certain date with all these credits, and then there's AI, and AI is not going to allow them to do that. [00:05:59] So fundamentally, we have big companies making very bold sustainability pledges, and carbon neutrality at this pace is just not going to be sustainable. [00:06:19] Of course there are manufacturing issues and other technology challenges we have to overcome. But the reality remains that only so many of these chips can be produced in a year, only so many new ones brought to market. I did a show earlier in the year where I showed you what a DGX Spark looks like. [00:06:49] It's pretty impressive, but there are still a lot of other challenges standing in the way of everything we're trying to do with models. So where is that breakdown going to hit, and how are we going to fix the problem? Number one is real-time edge inference. That is the biggest driver, whether it's a robot, a car, or a battlefield sensor. There are processes and number crunching going on every second, decisions every millisecond, all out at the edge. And that means no data center. A GPU out there is going to keep drawing power, keep computing, and it's going to fundamentally limit how much we can do, where we can do it, and how. [00:07:42] Now there are other research areas, and that's what we're going to talk about today so folks can be prepared. But here's what I want you to understand. [00:07:52] The pivot has already started. [00:07:55] You've heard me talk about it and some of the things that I'm doing, but it's been going on pretty quietly. DARPA has been funding neuromorphic computing for a long time. [00:08:07] The DOE has been funding the quantum computing initiative, about $2 billion over five years, I think, was the last figure I saw. And folks are starting to move.
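Going back to that training-energy comparison for a second, here is the rough arithmetic in Python. The average-household figure (~10,600 kWh/year, an EIA-style estimate) and the per-query inference cost are assumptions for illustration, not numbers from the episode.

```python
# Rough energy math behind the "500 American homes for a year" comparison.
# Assumptions (illustrative, not from the show): an average US home uses
# ~10,600 kWh/year, and a single LLM query costs on the order of ~1 Wh.

HOME_KWH_PER_YEAR = 10_600   # assumed average US household consumption
HOMES = 500                  # the comparison quoted in the show

training_gwh = HOMES * HOME_KWH_PER_YEAR / 1e6
print(f"One GPT-4-class training run: ~{training_gwh:.1f} GWh")

# Inference side: order-of-magnitude assumptions only.
WH_PER_QUERY = 1.0
queries_per_day = 1e9        # hypothetical global query volume
inference_gwh_per_year = queries_per_day * 365 * WH_PER_QUERY / 1e9
print(f"At {queries_per_day:.0e} queries/day: ~{inference_gwh_per_year:.0f} GWh/year")

# Even with rough numbers, steady-state inference can dwarf a one-time
# training run, which is the point made above about prompts and API calls.
```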
And we're getting to a point now where you're seeing these things pop up on the market commercially, and that's when you have to start really paying attention. [00:08:30] Coming up, we're going to ask: what if your AI chip worked like your brain? [00:08:36] What if you could build a system that learns, adapts, and responds in real time, and uses less power than a light bulb? [00:08:46] Neuromorphic computing could be one of the biggest AI breakthroughs we've seen in a very long time, and you probably haven't heard of it. That's going to change when we come back. So stay with us. We'll be back after a couple of messages from our sponsors. [00:09:35] Welcome back to AI Today. I'm your host, Dr. Allen Badeau, and we are talking about some new computing techniques. Well, not new for some of us, but new for many of you. [00:09:45] And I want to get the message out so you can understand some of the technology breakthroughs that are coming. [00:09:51] Think about your human brain: roughly 86 billion neurons, which equates to about 100 trillion synaptic connections. [00:10:06] You can recognize faces with no trouble, process language, navigate different environments, handle emotions, all simultaneously. That's why you get upset when you're driving and you're lost. But it's all about power. [00:10:24] People don't talk about that, but your brain consumes about 20 watts, the same as a dim light bulb. [00:10:33] Think about how much compute power you have sitting on your neck, and it only consumes the power of a light bulb. [00:10:44] So now you see the challenge we're facing trying to use AI to do all of these things and have it make decisions for you. [00:10:54] To approximate what a brain does in a modern data center, you'd need something like 200 racks of GPUs drawing about a megawatt of power, plus cooling and HVAC and all the rest. That's why you can't match the brain's efficiency. That's also why I have told you for many, many years that the whole singularity discussion is not imminent. [00:11:33] We've got some time, but we're starting to see what the impact of these new compute capabilities is going to be. [00:11:41] Now the question is actually pretty simple. If we stopped trying to force AI into silicon and hoping it works like the brain, where would we be? [00:12:04] Is there going to be a cutoff? [00:12:06] Can we peel back that onion a little and say, you know what, we're going to strategically manage how much we're using from a GPU perspective and migrate to a neuromorphic-type capability? [00:12:20] Let's break it down. Today, if you're running on a GPU, you're probably using something like an artificial neural net. Data flows through layers of nodes in structured batches, and everything is processed whether it needs to be or not. That's powerful, but think about it.
It's like running your car engine at full throttle whether you're going 90 miles an hour on the highway or idling in a parking lot. [00:12:57] That puts it into perspective. [00:13:00] Neuromorphic computing uses a completely different model: spiking neural networks. I've talked about those a little in the past, but we're going to dive deeper into them here. A spiking neuron behaves like a biological one. [00:13:18] It only fires when there's a signal to process. [00:13:22] It's sophisticated in that if it doesn't have to do something, it doesn't do it, and the chip will literally power down unused nodes. [00:13:35] It's event-driven, it's massively parallel, it acts like your brain, and that also means it's energy efficient. [00:13:45] Sustainability is achievable, but we've got to start bringing in these other types of compute. [00:13:53] The brain doesn't fire all its neurons at the same time. Can you imagine what that would be like? [00:14:02] When you're reading a screen, your visual cortex is active and your motor cortex isn't firing, unless you're like me and you like to use your hands a lot. [00:14:15] Neuromorphic chips work the same way. [00:14:18] And that selectivity is where the efficiency comes from. Now, who's building this? That's the bigger question, and how are we going to get there? There are a lot of smaller companies that have taken the lead, just like in AI a few years ago; it's being driven from the ground up. But there are also some big players you don't think of when you hear neuromorphic computing. [00:14:46] Intel has a chip called Loihi, and it's probably one of the most advanced commercial neuromorphic chips out there. [00:14:57] IBM's got one called NorthPole, [00:15:01] which is also working to eliminate some of those bottlenecks. And then there's BrainScaleS, a much more focused team out of Heidelberg University. [00:15:17] They've got some exciting things taking place as well. [00:15:23] The key is that it's a global race, and it's accelerating fast, just like AI did when it started. [00:15:30] What we're doing is making sure it complements everything we're doing today, because you can't make an immediate switchover. It's just not going to work that way. [00:15:44] So neuromorphic computing won't so much compete with traditional AI as potentially dominate parts of it; maybe completely, we're not sure. [00:15:58] At the edge, though, think of any system that needs real-time intelligence, whether it's in the cloud, a drone, or a prosthetic limb. [00:16:09] You need to be able to interpret those signals. Hearing aids are a great example, where neuromorphics could more efficiently separate speech from background noise. [00:16:23] Autonomous robots, too. I think many people saw Atlas from Boston Dynamics and some of the amazing things it's been able to do.
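To make "only fires when there's a signal" concrete, here is a minimal leaky integrate-and-fire neuron in Python, the textbook building block of a spiking neural network. This is a teaching sketch, not Loihi or NorthPole code; the threshold and leak constants are arbitrary.

```python
# Minimal leaky integrate-and-fire (LIF) neuron: the basic unit of a
# spiking neural network. Work happens only when input arrives or a spike
# is emitted; silent inputs cost nothing, which is where the event-driven
# efficiency described above comes from. Constants are illustrative.

class LIFNeuron:
    def __init__(self, threshold=1.0, leak=0.9):
        self.potential = 0.0       # membrane potential
        self.threshold = threshold
        self.leak = leak           # per-step decay toward rest

    def step(self, input_current: float) -> bool:
        """Advance one timestep; return True if the neuron spikes."""
        self.potential = self.potential * self.leak + input_current
        if self.potential >= self.threshold:
            self.potential = 0.0   # reset after firing
            return True
        return False

neuron = LIFNeuron()
# Sparse, event-driven input: mostly zeros, occasional bursts.
inputs = [0, 0, 0.6, 0.6, 0, 0, 0, 1.2, 0, 0]
for t, current in enumerate(inputs):
    if neuron.step(current):
        print(f"t={t}: spike!")    # fires only when input accumulates
```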
[00:16:31] But how much power is it drawing currently, and how much could neuromorphics reduce that? [00:16:39] Now, don't let me fool you into thinking these are all ready to go at commercial scale, that the world is going to be unicorns and rainbows, everything event-driven, and these systems will just work out of the box. It doesn't work that way. We all know it doesn't work that way for regular AI today. [00:17:04] So there's no way it's going to work that way for neuromorphics. However, capabilities like on-chip learning really are going to reshape AI and how we use and apply it in the future. [00:17:21] Traditional neural nets are trained once and then deployed. You can have some continual learning, but it often requires additional retraining. [00:17:31] They don't adapt dynamically. Neuromorphic chips, however, can update their own connections when they see new data coming in. [00:17:43] That means continuous learning without all those additional retraining cycles, which saves even more energy. [00:17:53] It also increases the security of the system: threat patterns can be handled with real-time actions on the device, so you're not returning signals to the cloud, and they can't be intercepted in transit. [00:18:09] And those breakthroughs are happening right now. We're using some neuromorphic computing ourselves in the actions we take as we build our cognitive agents. [00:18:20] Even beyond what we're doing, it's not theoretical anymore. [00:18:25] Intel has had its Loihi 2 system out since 2021, and what they're doing now, around 120 million synapses on a chip, drawing less than a watt of power. [00:18:39] That is phenomenal, really phenomenal. And these chips can solve types of problems that AI on a GPU alone cannot solve, or takes too long to solve. [00:18:53] That's not an incremental improvement either, especially when you start putting these solutions together to attack much harder problems. Neuromorphics on one side, GPUs on another, and a lot of different opportunities arise that you can really start to conquer. [00:19:11] Now, just like with quantum, and we're going to talk about that soon, [00:19:18] a lot of folks are paying attention to this. DARPA has its own neuromorphic cognition program; I think it's neuromorphic cognition for urban surveillance. What they're looking for is systems that can work continuously on very little power. Everything is about power now. [00:19:42] It's one of the drivers, but it's also one of the limitations. Now let me be 100% straight, because that's what this show is about. [00:19:53] This is not going to replace the GPU cluster next year, or probably the year after that. [00:19:59] There are real constraints around knowledge and the ability to refine what these models are doing. Programming models for neuromorphic chips are still maturing. There are some you can integrate with traditional methods, but writing the software is not trivial.
I can attest to that, because like I said, we're using spiking neural networks in our software now. Research tools, plug-and-play interactions with GPU signals, we're still not there yet, but we are constantly looking at those kinds of things. And then there's workload. [00:20:41] There are a lot of unknowns around how much a neuromorphic chip can really support, especially on sparse, event-driven, low-latency tasks. We're just not sure. [00:20:54] But I want you to know that, just like with AI, when more money goes in, private equity funding increases, and a lot more research takes place, it's only a matter of time. [00:21:10] I want you to be prepared. Think about it: if this does arrive in the next five years, what can we do? How will it impact our AI strategy? How will we get better, and how will we keep winning the battle we're constantly facing? Neuromorphic computing is like your brain. It processes the world the way biology does. [00:21:36] It's efficient, it's adaptive, and it can be incredibly precise if you're solving the right type of problem. [00:21:44] Again, it's always: pick the right problem, apply the right methodology, and get the answer you need so you can solve harder and harder problems. [00:21:57] Now we're going to stack quantum computing on top in the next segment. So stay with us. We'll be right back. [00:22:39] Welcome back to AI Today. [00:22:41] For 20 years, quantum computing has been 10 years away. [00:22:47] Let that sink in. [00:22:49] For two full decades, every time someone asked when quantum computing would be real, the answer was always a decade. [00:22:58] It became a running joke: the winter just kept going, the groundhog kept seeing its shadow in quantum computing. We were in that winter a long, long time, and we fought and fought, and it seemed like we were never going to get there. But that excuse is not going to hold up anymore. [00:23:24] What has happened since about 2023, up to where we are today, is not incremental progress. It's an inflection. [00:23:40] The theoretical breakthroughs researchers have been chasing for 30 years are now being demonstrated in physical hardware. Quantum mechanics is by far the most successful theory in physics. It runs our GPS, it runs our clocks, it runs so many different things. [00:24:02] All of that is the result of the theory that's been laid down, and now we're trying to harness that power for computing. [00:24:16] We've talked about it on previous shows, and I'm not going to walk back through the entire theory like I did in that one show. What we are going to talk about is error: error-correction thresholds and how they're being crossed, and logical qubits, the kind you can actually compute with reliably. Those are becoming real. [00:24:42] And in the middle of all this, whether it's the noise you hear from the stock market or the journal papers coming out,
[00:24:54] there's really one company that has quietly just kept improving [00:25:02] and put itself in the lead of the quantum race. So I'm going to talk about IQM for this next segment. We do this every so often; we'll pick companies out in other segments too. This is one I can't officially do a product review on, because I can't exactly wear a quantum computer. But I'll tell you, these folks have been doing a lot of great things. [00:25:32] Now look at it this way. We know the classical computer is a switch. [00:25:41] It's either off, zero, or on, one. [00:25:45] That's it. [00:25:46] Everything in your computer has to be done in sequences; that's why you've got zeros and ones. With quantum, as we always say, it's not the same. It's not the same at all, because you can have a 0, a 1, or a combination, all simultaneously. That's the superposition we've talked about before. It's not magic, okay? It's quantum mechanics: [00:26:06] the physical behavior of particles at the subatomic scale. That's the big driver. [00:26:13] Now, when you throw in entanglement, that's when it starts to get fun, because qubits become correlated; the state of one relates to the state of another. [00:26:27] And as the system explores that solution space, [00:26:32] certain problems become exponentially better to solve on a quantum computer than on a classical one. [00:26:41] Now, what's fun is that I get to look at a lot of different technologies and a lot of different types of quantum computers, because there's more than just one. [00:26:54] But looking at what's taking place, at scale, at the ability to deliver a number of these systems in a short amount of time, [00:27:09] that's where I want to talk specifically about IQM. [00:27:15] They're headquartered in Finland, and they're doing something no other major quantum player is doing right now: building full-stack, on-premise quantum hardware for customers, including governments, hardware the customer can control, operate, and put behind their own security perimeter. [00:27:38] They've delivered one to the DOE, they've delivered a number of them to Korean companies, they've delivered them to other US companies, and more are set to be delivered around the world. I think the number I heard was 20 quantum computing systems that you can use. [00:28:00] We've integrated one into our technology stack with our AI. Why? Because it's available, it's real. I don't have to explain that it's a quantum simulation or something else. It's real. [00:28:15] That also means you can solve commercial applications. We're not only looking at nuclear-type simulations or cryptography or those activities quantum computing has traditionally been used for. [00:28:34] We're looking at the possibility of multiple qubits in multiple types of commercially available systems. [00:28:42] And that's what IQM is doing. [00:28:45] That's so powerful, because it enables a truly sovereign quantum computing capability.
[00:28:58] So who do they work with? IQM works directly with researchers, and with folks like me trying to develop AI solutions. [00:29:08] But they're also working with a whole bunch of others: research agencies, defense agencies, strategic industries, all the users of quantum that go way beyond what we're used to thinking about. Whether it's Finland, Germany, Poland, or the U.S., they have systems across Europe as well as in the States. [00:29:28] And they recently announced, it was during the Supercomputing conference, that they're going to continue to expand and grow in the US market. [00:29:37] That's a natural next chapter, considering we are the research leader in an awful lot of fields. [00:29:45] Now, from a technical perspective, it's great that they can scale, and it's great that they can manufacture. But technically, how they design their systems is not only impressive, it's the fundamental thing that lets them do this, because their architecture is what I would call a direct collaboration with the application requirements. If you're solving a specific class of optimization problems, their approach is to tune the qubit connectivity, the gate set, and the control electronics, all of it, from the ground up. [00:30:25] The result gives you an increase in performance per qubit on those workloads that you won't see even on other quantum computing hardware. It's phenomenal. [00:30:39] And that kind of out-of-the-box thinking, looking at the problem and asking how to individualize it and build it so every customer can benefit, is huge. [00:30:55] And when you start looking at regulation and the other compliance issues folks are facing, their on-premise model naturally folds into a lot of the federal frameworks US customers have to adhere to, whether it's FISMA, FedRAMP, or CMMC. Those frameworks exist to control data, and IQM's model is synergistic with that instead of fighting it. [00:31:29] And that is what makes it great to use, great to integrate, great to leverage in your code. [00:31:40] Looking out five to seven years, they've got a great vision, and it's going to continue to evolve. [00:31:47] But other quantum utility capabilities are coming, too. [00:31:53] Cryptography, again: quantum is going to replace a lot of what we're doing today, and that's important. It's going to drive our ability to tackle different problem sets and different applications, and it's going to change some of the classes of problems we solve, problems where we didn't think quantum computing could handle it, or didn't think it was appropriate. That's not the case anymore. We've got more flexibility, not less. [00:32:22] Now, when you combine quantum with our friend the neuromorphic chip and the hybrid infrastructure we're using today, the conversation about hybrid computing is no longer CPU and GPU. [00:32:37] It can't be. It has to be CPU, GPU, neuromorphics, quantum, and FPGA, which is what we're going to talk about in our next segment. So stay with us. We'll be right back.
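Before the next segment, here is a tiny numpy sketch of the superposition and entanglement ideas from this segment: a two-qubit Bell state built from a Hadamard gate and a CNOT. It's a classical simulation for intuition only, not IQM's stack or API.

```python
# Superposition and entanglement in a few lines: build a two-qubit Bell
# state with a Hadamard then a CNOT, and read off the probabilities.
# A classical simulation for intuition; real hardware works very differently.
import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)   # Hadamard: creates superposition
I = np.eye(2)
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]])                 # entangles the two qubits

state = np.array([1, 0, 0, 0], dtype=float)    # start in |00>
state = np.kron(H, I) @ state                  # qubit 0 into superposition
state = CNOT @ state                           # Bell state (|00> + |11>)/sqrt(2)

probs = np.abs(state) ** 2
for label, p in zip(["00", "01", "10", "11"], probs):
    print(f"|{label}>: {p:.2f}")
# Only |00> and |11> appear, each with probability 0.5: measure one qubit
# and the other is fixed, the correlation described in the segment.
```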
[00:33:26] So think about a master carpenter. They don't use a sledgehammer to carve dovetail joints. [00:33:36] The most skilled tradespeople in the world don't necessarily have the biggest tools, do they? [00:33:43] They're the ones who know which tool to reach for, when to reach for it, and why. [00:33:49] They match the instrument to the problem. [00:33:53] And that gives them the precision to do things others can't. [00:34:00] Then they move on to the next problem. [00:34:04] Well, the AI industry is starting to learn that lesson, and it's been kind of painful. [00:34:10] We've spent about a decade, boy, about a decade, grabbing a hammer to solve every single problem. [00:34:22] And that hammer was the GPU. [00:34:25] And then we've been wondering why some of those nails aren't going in as cleanly as we hoped. [00:34:33] In this last segment, I want to bring everything together. [00:34:39] I want to discuss how neuromorphics can fit with quantum, and how all of this technology ties together with the FPGA. [00:34:50] So welcome back to AI Today. [00:34:52] Let's figure out how we can do some things a little bit better, [00:34:56] and when we talk about sustainability, actually mean it. [00:35:01] Now, a really quick explainer for those who aren't familiar with FPGAs. They are field-programmable gate arrays. [00:35:13] Very fancy, right? [00:35:15] Unlike a CPU or GPU, which has a fixed architecture baked in at manufacturing, [00:35:22] an FPGA is essentially a blank canvas of programmable logic gates. [00:35:29] You can define a circuit. You can define the shape of the hardware. [00:35:36] And the great thing is that you can reprogram it as your needs change. [00:35:42] That means it's hardware you can configure like software. [00:35:49] That's amazing. [00:35:50] It gives you the ability to adapt. [00:35:54] It puts you in a unique position in the compute ecosystem that no other processor or piece of hardware really can. [00:36:05] It's a universal adapter. [00:36:08] You've got an interface layer, you've got an orchestration fabric. FPGAs have been used in a lot of different systems precisely because of that flexibility. [00:36:21] And as you reconfigure them for new missions or new problems, they build on the other technologies we want to use, whether it's signals intelligence, radar, or other electronic-type activities; they can be used just about anywhere. [00:36:45] We've seen how they've been used a little in the robotics industry. [00:36:53] But in the context of a heterogeneous AI system, the FPGA plays a critical role. First, look at pre-processing and post-processing for quantum circuits: when you're preparing the data that gets sent into a quantum processor, and interpreting the probabilistic results that come back, you need deterministic, low-latency classical compute. [00:37:25] An FPGA is perfect for that. [00:37:28] Second, the interface layers on neuromorphic chips: the spike trains coming from a neuromorphic processor need to be interpreted and routed. [00:37:39] Again, FPGAs are great at that. [00:37:44] So the enterprise options for serious hardware folks are tremendous.
[00:37:53] Now again, you're talking about some of the same players. Intel has the Agilex FPGA, purpose-designed for AI acceleration. [00:38:05] This is one of the areas where Intel is actually close to the lead. AMD has its Versal adaptive compute acceleration platform. [00:38:20] Yeah, don't make me say that again. Those are FPGAs whose fabric has AI engines built in, with Arm processors added on top, all on a single device. [00:38:33] So now we have a heterogeneous compute platform on one chip. That is impressive. [00:38:41] And these are not hobby components somebody is soldering together. We are talking about enterprise-grade infrastructure that changes how the cloud is used, how you buy your compute, and the resources you use to solve problems. [00:39:00] In financial institutions, some of these are already running in production. [00:39:07] Now, I said at the close of the last segment that we are redefining what a hybrid compute environment is. [00:39:16] That's because we have to. But that also means we have to redefine the math that goes along with it. [00:39:24] How are we going to ensemble all these solutions together so we can solve one cohesive problem? [00:39:33] There's a lot of optimism, but there's also a lot of confusion. [00:39:38] Hybrid computing has generally meant that CPUs handle the logic and control flow and GPUs handle the math. [00:39:50] That's what powered us into the deep learning revolution. Very simple, very clean, very effective. [00:40:02] But the post-2023 environment has to be much more sophisticated. [00:40:12] It has to have orchestration between the CPU, the GPU, the FPGA, the neuromorphic processors, and the quantum processing units, all receiving their workloads in sync with each other, all coordinated. [00:40:32] Different problem classes, latency requirements, power budgets, and security constraints all have to be handled simultaneously. [00:40:44] Kind of sounds like an AI problem, doesn't it? [00:40:48] Now, what's fun is that when you get to play with these, you can start to solve problems you never dreamt would be possible. [00:40:59] Whether it's a multi-dimensional logistics problem, or something harder still, like the mechanics of brains and decisions and how those are made and what ramifications follow, or maybe game theory with some additional layers. [00:41:25] The more diverse you can make your system, the more problems you can solve, as long as the pieces work together. That connective tissue is what's really going to drive the next generation of AI infrastructure. [00:41:44] Cognitive orchestration layers: you've heard me talk about these. [00:41:50] This is where the system receives a workload and understands its characteristics, [00:41:56] all from a processor or a cognitive agent driving it. [00:42:03] Cognitive agents, right? You've heard me talk about those. That's what we're building.
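Here is a toy sketch of what a cognitive orchestration layer like that might look like: route each workload to a backend based on its characteristics. Every backend name and routing rule below is hypothetical, a sketch of the pattern rather than any real scheduler.

```python
# Toy "cognitive orchestration layer": match workloads to compute backends
# by their characteristics. Backends and rules are hypothetical; the point
# is the pattern of CPU + GPU + FPGA + neuromorphic + quantum coordination.
from dataclasses import dataclass

@dataclass
class Workload:
    name: str
    kind: str            # "dense_training", "sparse_event", "optimization", ...
    latency_ms: float    # latency requirement
    power_budget_w: float

def route(w: Workload) -> str:
    """Pick a backend; a real scheduler would weigh cost and security too."""
    if w.kind == "sparse_event" and w.power_budget_w < 5:
        return "neuromorphic"            # event-driven, always-on edge inference
    if w.kind == "optimization":
        return "quantum (FPGA pre/post)" # combinatorial search, classical I/O
    if w.kind == "dense_training":
        return "gpu"                     # batched parallel math still wins here
    if w.latency_ms < 1:
        return "fpga"                    # deterministic low-latency pipeline
    return "cpu"                         # control flow and everything else

jobs = [
    Workload("battlefield sensor", "sparse_event", 10, 1),
    Workload("route planning", "optimization", 5000, 300),
    Workload("model training", "dense_training", 1e6, 10_000),
    Workload("radar frontend", "signal", 0.5, 40),
]
for job in jobs:
    print(f"{job.name:>18} -> {route(job)}")
```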
[00:42:09] And that's why, when we talk about these things, I don't talk about just a cognitive agent with a GPU and a CPU. [00:42:17] We talk about cognitive agents with true cognitive capabilities coming from different systems. And those systems are real, they're here, they're being used in other problem sets, and we're bringing them to AI. [00:42:32] It's not science fiction anymore. [00:42:35] Early adopters in other fields have been doing this for a little while. [00:42:41] We talked about the government, we talked about DARPA, we talked about the chip manufacturers trying to put all of this down on one chip. [00:42:52] But you know who else? Pharmaceuticals. They are really some of the leaders here, because they're taking hybrid classical-quantum computing pipelines, adjusting them, and doing molecular simulations we never thought would be possible. [00:43:14] Now, what's really the driving function for all of this? [00:43:19] I mentioned it early in the show: sustainability. [00:43:26] We have a breaking point coming. In certain areas, with certain resources, we may have already hit it. And when you put energy and sustainability together, the math doesn't work. [00:43:40] There is an inconvenient truth in AI that the industry just can't ignore: [00:43:47] AI's carbon footprint is becoming a regulatory liability. Look at the EU and how it's trying to handle this, the SEC and what it's trying to do, and every international and national government agency in between. [00:44:07] Power consumption is the driver, and the regulatory pressure AI is putting on the energy industry is not going to reverse. [00:44:19] We're not putting that genie back in the bottle. [00:44:22] I know some folks think they can regulate their way out of this, but it's not going to happen. Again, the math just doesn't add up. [00:44:33] So what I want folks to think about is this: you need to right-size the problems you're trying to solve with AI, and the types of AI, but you also need to right-size what your compute environment is going to look like. [00:44:54] We cannot just throw GPUs at everything anymore. It's not efficient, it's costly, [00:45:01] and it has limited us in some ways, I think, by holding us back from embracing these other technologies, technologies built to do better things and solve harder problems than the GPU can by itself, or ever could. [00:45:21] That makes a lot of business sense, but it also makes math sense. [00:45:27] Just as an example: if you deploy a neuromorphic chip for persistent edge inference, a task a lot of organizations are running on a GPU server today, [00:45:41] you might cut the energy consumption by a factor of 1,000. Not 10%. I didn't say 10%. I said a factor of 1,000. [00:45:53] And that is not an exaggeration. [00:45:56] That's in line with some of Intel's published benchmarks for certain types of workloads. [00:46:03] That's game-changing: cost, speed to development, weight, everything.
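As a sanity check on that factor-of-1,000 claim, here is the arithmetic in Python. The baseline server wattage, fleet size, and electricity price are illustrative assumptions, not measurements or vendor figures.

```python
# Sanity-check arithmetic for the factor-of-1,000 energy claim on persistent
# edge inference. Baseline power, fleet size, and electricity price are
# illustrative assumptions, not measurements or vendor numbers.

GPU_SERVER_W = 700                     # assumed draw of an inference GPU server
NEUROMORPHIC_W = GPU_SERVER_W / 1000   # the ~1,000x reduction cited above
FLEET = 100                            # hypothetical number of always-on edge nodes
HOURS_PER_YEAR = 24 * 365
PRICE_PER_KWH = 0.12                   # assumed $/kWh

def annual_cost(watts: float) -> float:
    """Yearly electricity cost for the whole fleet at a given per-node draw."""
    return watts / 1000 * HOURS_PER_YEAR * FLEET * PRICE_PER_KWH

gpu_cost = annual_cost(GPU_SERVER_W)
neuro_cost = annual_cost(NEUROMORPHIC_W)
mwh_saved = (GPU_SERVER_W - NEUROMORPHIC_W) / 1e6 * HOURS_PER_YEAR * FLEET

print(f"GPU fleet:          ${gpu_cost:,.0f}/year")
print(f"Neuromorphic fleet: ${neuro_cost:,.0f}/year")
print(f"Energy avoided:     {mwh_saved:,.1f} MWh/year")
```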
[00:46:09] If you use a quantum processor to solve an optimization problem that would take a GPU cluster three or four days to approximate, and the quantum system solves it at higher accuracy in maybe three hours, you've eliminated days of GPU cluster runtime. [00:46:29] And that's per optimization cycle. [00:46:32] Multiply that across enterprise-scale operations and you're talking about megawatt-hours of energy saved. That's millions of dollars in compute cost. [00:46:46] That is a genuine, material carbon reduction. And let's talk about FPGAs, because they may be the most immediate, actionable measure we can take. [00:47:05] If your organization is running full GPU clusters for workloads that could be handled by FPGA acceleration, you are burning money, throwing it right out the window, and using up energy unnecessarily. And that is right now, that is today. [00:47:24] So think about that as you look at these architectures. Don't just go to a GPU because that's all you're familiar with. There are actionable things you can do to save energy and really advance your compute capability. And I want to close with a direct message to business leaders, CTOs, program managers, AI officers, whoever that is: the organizations that win AI over the next decade are not going to be the ones using only GPUs. [00:48:06] They're going to understand the problems they're trying to solve, and they're going to come up with the best compute strategy and the best architecture strategy. [00:48:15] It's a strategy built on solving multiple problem classes, matching workloads to architectures, and building out that orchestration capability. [00:48:25] If your compute approach a year from now is the same as it is today, you're already behind. We say that about AI, but now these other technologies, which can make AI more accurate and more efficient, are starting to impose their will. [00:48:46] So you need to pay attention. You have to look at it from the perspective of hybrid, multi-architecture environments. It's going to take a little time to get there. But remember, we've talked about quantum being 10 years away for 20 years. We've talked about problems that were supposedly impossible to solve. We've talked about FPGAs as connective tissue. [00:49:14] Let's use the right tool, because this is the right time, and let's pick the right problems and apply it that way. [00:49:21] And I would encourage you to look at companies that are doing this, because it comes down to your bottom line. You're going to save money with companies that are more forward-leaning, more forward-thinking, and applying these technologies to their AI to help you solve your harder problems. So I'm Dr. Allen Badeau, and this has been AI Today. [00:49:41] Thank you for being here. [00:49:43] And if this has sparked any questions or different conversations, you can always reach out to me, send me an email. I'd love to hear from you.
