Red Hat Research Quarterly

AI DIY: How research is making custom language models work with more of us


about the author

Heidi Dempsey

Heidi Picher Dempsey is the US Research Director for Red Hat. She seeks and cultivates research and open source projects with academic and commercial partners in operating systems, hybrid clouds, performance optimization, networking, security, and distributed system operations.


“How many lives am I impacting?” That’s the question that set Akash Srivastava, Founding Manager of the Red Hat AI Innovation Team, on a path to developing the end-to-end open source LLM customization project known as InstructLab. A principal investigator (PI) at the MIT-IBM Watson AI Lab since 2019, Akash has a long professional history as a researcher, which makes him a great person to shed some light on the often obscure pathways from research to product to product adoption. Red Hat US Research Director Heidi Dempsey interviewed Akash about his own path from curiosity to research to real-world impact, how he sees the democratization of AI evolving, and what cool things he and his team are working on next (hint: it’s bleeding-edge research in Generative AI, and it involves Red Hat’s recent acquisition of Neural Magic). Akash and Heidi also discuss balancing creativity with business demands and keeping the life-of-the-mind excitement and risk-taking spirit of research alive in industry settings.

Shaun Strohmer, Ed.

Heidi Dempsey: You and I have a common interest: the importance of research as something that feeds development and eventually product. Let’s start by talking about that in relation to InstructLab and democratizing AI development. Do you see that happening in multiple steps along the way, or by creating something that goes out to millions of people and then it’s democratized?

Akash Srivastava:  I like to think about democratization happening in different tiers. We started with the ChatGPTs of the world: proprietary models where the science itself was not democratized. Then some labs and companies broke through that tier by creating models and writing about them in detail, sharing the science behind them. To some, that was democratization. “To some” in this instance means “people who could afford the hardware.” But it was a step forward. 

Then someone said: it seems like language models only work when you have a lot of GPUs. Can we make smaller models? That was the next tier. In both the software and the hardware used for training models, we made tremendous progress that allowed people to use commodity GPUs for training. A lot of open source effort focused on squeezing out every ounce of compute this hardware offers. Now you can actually use your gaming PC. You don’t have to get a $50,000 card; you can get a $4,000 card and start playing with this technology from your desk. To me, that is democratization.

Now we’ve reached a point where AI researchers and engineers can basically do everything they want to on these models, but what about people not trained in machine learning? That’s where we’re trying to make an impact now. Say I have domain expertise, I don’t have any knowledge about how Generative AI works, but I know it’s very useful for the things I do. My productivity goes up if I can use it, but I’m stuck relying on other companies to create something for me. With something like InstructLab, I can use one of these models, and the tools to use these models, in my task. I think that’s the ultimate level of democratization, when LLMs become useful for everybody.

Akash loves the Labrador breed of dogs, so much so that he initially named the InstructLab project simply “Labrador.” To celebrate the successful launch of InstructLab, Akash got a Labrador puppy and named it Ladoo—the same name, by popular vote, given to the InstructLab cartoon mascot.

Heidi Dempsey: So the first stage was democratization of the math and the programming, and the second stage is making it easier to use. Is there more to it?

I think that’s the ultimate level of democratization, when LLMs become useful for everybody.

Akash Srivastava:  First the science got democratized, then the engineering got democratized, and now the application layer is coming. And when I say engineering, I mean low-level engineering, hardware- and kernel-level engineering. Now that we have the science and engineering democratized, application developers and domain experts can come in and start building these beautiful apps for a much wider audience. 

Heidi Dempsey: This reminds me of the invention of HTTP: first there were distributed systems, but they were expensive, and the hardware and software to run them was all research lab stuff. Then eventually because of the protocol, we increased its availability. Now it’s on your phone, and all that stuff from the early days doesn’t matter to users anymore—it’s all behind the scenes. When do you think we’ll see the same thing with AI, and people only using AI on their phones?

Akash Srivastava:  I think a lot of people are already using it on their phones!

Heidi Dempsey: They definitely are. People want AI tools like Perplexity or ChatGPT or Apple Intelligence without having to understand what’s behind them.

Akash Srivastava: I use AI tools very often. The way I program has changed. GitHub Copilot was the first tool that actually impacted my daily life, and now Cursor with its fancy composer makes it seem almost like I’m programming in English.

Heidi Dempsey: You’re almost becoming a scientist instead of a programmer, right? 

Akash Srivastava:  I will admit, I was never much of a programmer [laughs]. For me, programming was a tool that helped me do my research. I was never incredibly good at it, but with these tools I’m empowered to take my ideas to prototyping remarkably fast. The current generation of language models is democratizing natural-language-related work because a lot of our business processes are basically a transaction in language, whether it’s financial forms or legal processes or business processes. Kids these days can create apps using these things. They don’t need to go to a university or wait to get advanced knowledge. With these tools, software engineering democratization is happening right now. My gut feeling, at least on the research side, is we’re going to see a lot more democratization of engineering in general.

Here’s an example: we have some active projects at MIT with the mechanical engineering department, where we’re looking at the equivalent of IDEs for them, like CAD tools and other engineering design simulators. It takes ages to develop expertise there, but what if there were tools like Copilot for engineering design, chip design, or circuit design? We’re going to see a flurry of startups and research labs producing Copilot- or Agent-like products for different branches of engineering, which is true democratization. 

Any place where you would naturally use computer programs, it’s just a matter of time until a piece of generative or other form of AI will come and help you as an assistant. One issue right now is how critical the domain is, and what the safety standards are. As you move towards domains like mechanical engineering, civil engineering, or electrical engineering, you don’t have much slack. A tiny error in a silicon chip design is going to cost you millions if not billions.

Heidi Dempsey: And speaking as a former civil engineer, if your bridge doesn’t line up, that’s even worse.

Akash Srivastava: Absolutely. That’s why the human expert in the loop is going to remain a dominant paradigm in those fields in the near future. The requirement for precision in these mission-critical domains is not something current Generative AI can match. One of our grants for the MIT-IBM lab is for precision generative modeling, which is aimed at pushing the boundary of precision in generative modeling to create engineering designs that are so precise, they can be sent directly to the machine shop or a 3D printer.

Excitement makes an impact

Heidi Dempsey: Going backwards a little bit, how did you get into AI and computer engineering? Were you a math nerd, or did you take your toys apart?

Akash Srivastava: In elementary school I became fascinated with the idea of connecting human brains. What would happen? My dad got me this book by Ray Kurzweil, The Singularity is Near. I didn’t understand 90% of it because I was very young, but my dad and my sisters helped me make sense of it. After reading the book, I knew I needed to study this.

At the University of Sheffield, where I ended up, this particular degree (BSc in AI and Computer Science) was new, and people were still figuring out what it should include, so I did computer science, math courses, psychology courses, a bit of probability theory, chaos theory. For my Master’s and PhD, I was at the University of Edinburgh, which is amazingly good for machine learning. Thomas Bayes, who came up with Bayesian theory, and Geoffrey Hinton, who pioneered deep learning, both went there. By far the best time of my life was doing my PhD.

Heidi Dempsey: It’s the life of the mind, right? When you’re walking around thinking about your problem five or six levels deep while you’re just eating a sandwich. So you used up all the knowledge in the UK, then you came to the United States and MIT. What was the difference?

Akash Srivastava: When I came over to the US and the MIT-IBM lab, they kept that university environment very intact. The best part was working with people—especially students—who are better than you. You question yourself: “Why are you working with me again?” Now, at Red Hat, my team actively engages with PhD students who are excited to work with us on the cutting edge to help the mission of democratizing AI.

Heidi Dempsey: Eventually, you have to move into industry because you decide not to be a professor. And you want to preserve some of that excitement and curiosity, but industry is trying to make money. So how do you walk that line of maintaining the adventuring spirit of research while delivering something for the bottom line?

Akash Srivastava: I was lucky because my PhD was in generative modeling, which at the time meant you could throw a stone and hit a job offer, and that job often was pure research. When I joined the MIT-IBM lab, I don’t think I ever felt like I had a real job because there was no pressure other than the normal conference and paper deadlines. 

How do we get to a point where what we do helps more people than just experts in our domain?

But at some point I started questioning my impact. Okay, I’m producing papers, which makes an incremental difference, but how many lives am I actually impacting? The question for me and a lot of people on my team was, how do we get to a point where what we do helps more people than just experts in our domain? The solution was very natural. This technology we just happened to have studied is transforming people’s lives. That’s how we pivoted into language modeling and figuring out how to make a good language model. Everybody was struggling: typically research is all out in the open, but in the field of LLMs people would not publish details, especially right after ChatGPT came out. So that became our mission.

My team was at NeurIPS in New Orleans, and I was at a workshop on multilingual models. It was a completely unrelated topic, but there was a picture of a taxonomy, and it just clicked. I ran and found the guys on my team, and we made a bet that this was how you could synthetically generate the data for the alignment problem in language models. You generate data using a taxonomy, and you can define what goes in your model. These guys, I am not kidding, in three days they were able to prototype this thing—during a conference where half the time you’re looking at posters and half the time you’re tipsy from all the parties they have.

Heidi Dempsey: So that turned into the taxonomy for Granite?

Akash Srivastava: Yes, that’s the basis for LAB (Large-scale Alignment for chatBots), which is the basis for InstructLab and how Granite models are aligned. After an intense frenzy of work, we showed it to my then-boss and we were blown away. We were beating Meta’s Llama, and we went from being super underdogs to beating the best models at that time. 

Heidi Dempsey: That’s really cool. But still: when you get to industry and you have a roadmap and things you have to deliver on a certain schedule, how do you retain and encourage creativity and the willingness to take risks? It sounds like at the MIT-IBM lab you could do that because the emphasis is on the research part, but when you came to Red Hat the emphasis was more on the product. 

Akash Srivastava:  I think it’s a two-part thing. First, the team really matters. I always say whenever I’m hiring, I can compromise on your degree or your expertise, but I will not compromise on your excitement. If discussing ideas, implementing them, and doing research doesn’t give you the joy it gives the rest of the team, you’re just going to feel left out. 

Part two is articulating very clearly how everybody is making an impact. The simplest way to understand my team is that when they wake up, they need to beat something to feel like they won the day. And they need a clear line of sight as to how the company will benefit from it.  If they feel like a little cog in some big machinery, they get bored. In fact, they come to me and tell me off—there’s no filter. “You explain to me right now, how is this thing helping the business? We joined this thing because we wanted to make an impact, and I don’t know how I’m making an impact.” I think for researchers in most places, the line of sight as to why you’re doing something is never made clear. Our team doesn’t have this problem, and it’s such a refreshing change.

By the time you’re reading this, we should have put out our work detailing the new state-of-the-art inference scaling technique.

Heidi Dempsey:  We see the same thing in Red Hat Research, with the excitement about measuring and analyzing stuff.  I had a team that changed from using small memory maps to big memory maps and they were really excited to see the flame graph of the performance of the CPU and GPU when they’re running certain programs for memory access. It wasn’t a significant impact on the product at that point; it was just, “Wow, look how much of a measurable change we made with this one thing.”

So what are you and your team fired up about now?

Being at Red Hat, we can not only do this kind of work and publish papers about it but also put our code and model out there and allow the entire community of makers, coders, and students to iterate upon it and make it better.

Akash Srivastava: Right now InstructLab is the only example in the market of an end-to-end LLM customization solution, so we want to continue our efforts on the research and development side to keep it in the number one spot this year too. Every day with our collaborators at MIT, other academic partners, and IBM, we’re working on the third generation of synthetic data generation and model alignment techniques. Being at Red Hat, we can not only do this kind of work and publish papers about it but also put our code and model out there and allow the entire community of makers, coders, and students to iterate upon it and make it better.

Every year in my team we set a grand challenge. Last year it was scaling small models via data, and we invented InstructLab as a result. This year we’ve taken up the challenge of adding inference/test time scaling tools to our offering. By the time you’re reading this, we should have put out our work detailing the new state-of-the-art inference scaling technique. This is bleeding-edge research in the Generative AI domain, and it’s very strategic to our business. With Neural Magic joining us this quarter, we have an opportunity to establish ourselves as the leader in inference scaling techniques for small language models.

Breaking into AI

Heidi Dempsey: That’s very cool. So let’s talk about the flipside of that dynamic. There’s somebody sitting in a group somewhere who wants to do something not LLM-related. They can’t get any funding and their creativity is being suppressed because industry is so focused on solving everything with LLMs right now.

Akash Srivastava:  This is a problem everywhere, not just in industry but in academia or in getting funding for a startup. I always have to look for resources, and it’s so much easier if there is an LLM or Generative AI use case attached. This is a game that in academia we play very well. Understanding the broader impact of a particular technology is always very helpful when it comes to writing grants or pitching ideas.

It’s also that 80/20 thing, right? I’m happy to spend 20 percent of my own time to prove value for that other thing I’m convinced will have a benefit. We’re researchers, and we should be looking two, three, or more years down the road and preparing for that. In the meantime, make yourself productive and learn the business, learn the tools. Pivoting into Generative AI is not hard. Imagine this is the first year of your PhD. How do you learn? Work with senior people as the fourth or fifth person on their paper and learn the techniques.

Heidi Dempsey:  So you have to do the same thing with your new idea: find your group of collaborators and do your proof of concept. We used to do that upstream—Upstream First, right? But Upstream First with AI models is just not feasible.

Akash Srivastava: Upstream First requires fairly rigorous software engineering practices, but chances are, an average researcher, like some of my PhD students and some of the researchers on the team, might not have done a single pull request in their lives. It’s like asking a software engineer who’s trying to get into AI research to start by composing a well-written research paper. These are different workflows and skill sets from different domains. 

Working together requires some adjustments on both sides, and Red Hat is a great place for that.

Neither should change their workflow; it’s what makes them productive at their respective jobs. Research code will be dirty code, or at least not at production level. But—this is your way in. If you have a cool idea, go offer help. Working together requires some adjustments on both sides, and Red Hat is a great place for that. 


Heidi Dempsey: Let’s take that issue of engineers working with non-specialists back to InstructLab. At first, there was a lot of talk about synthetic data and having a teacher model and a critic model. It doesn’t seem like that’s caught on as much as the rest of InstructLab. Do you have theories about why?

Akash Srivastava: This is a very interesting question. InstructLab, at a high level, is a tool for customization of language models. It’s a prescriptive method, where instead of Red Hat deciding what goes in your model, you make a list of things you want your model to know (knowledge) and a list of things you want your model to be able to do (skills). The machinery takes your prescription and converts it into some data, then we take that data and train the model. In many ways, the secret sauce is synthetic data generation. If you look across the industry, nobody else trying to offer a customization toolkit gives you any way of generating the data. And buying it is super expensive. So one of the biggest problems InstructLab is solving for many users is generating data.
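The prescription-to-data flow Akash describes can be sketched at a toy level in Python. This is purely illustrative and not the real InstructLab API or taxonomy format: the function names, the taxonomy dictionary, and the stand-in “teacher” below are all invented for the sketch. The idea is only that a taxonomy of knowledge and skills entries, each seeded with a few examples, is walked by a teacher model that expands each seed into synthetic training pairs.

```python
# Illustrative sketch (NOT the real InstructLab API): walk a taxonomy of
# "knowledge" and "skills" leaves and ask a teacher model to expand each
# seed example into synthetic question/answer training pairs.

def generate_synthetic_data(taxonomy, teacher, samples_per_seed=2):
    dataset = []
    for branch, leaves in taxonomy.items():          # "knowledge" / "skills"
        for leaf in leaves:                          # one topic per leaf
            for seed in leaf["seed_examples"]:
                for _ in range(samples_per_seed):
                    # The teacher model rephrases/extends each seed example.
                    question, answer = teacher(branch, leaf["topic"], seed)
                    dataset.append({
                        "branch": branch,
                        "topic": leaf["topic"],
                        "question": question,
                        "answer": answer,
                    })
    return dataset

# Stand-in teacher: a real pipeline would call a large teacher model here
# and filter its outputs with a critic model before training.
def toy_teacher(branch, topic, seed):
    return (f"[{branch}/{topic}] {seed['question']} (variant)", seed["answer"])

taxonomy = {
    "knowledge": [
        {"topic": "company_faq",
         "seed_examples": [{"question": "What is our return policy?",
                            "answer": "Returns are accepted within 30 days."}]},
    ],
    "skills": [
        {"topic": "summarization",
         "seed_examples": [{"question": "Summarize: the meeting covered Q3 goals.",
                            "answer": "Q3 goals were discussed."}]},
    ],
}

data = generate_synthetic_data(taxonomy, toy_teacher)
print(len(data))  # 2 leaves x 1 seed each x 2 samples = 4
```

The point of the design is that the user contributes only the taxonomy (their domain expertise); the expansion, filtering, and training steps are handled by the toolkit.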

Heidi Dempsey: They also have the domain knowledge—that’s why they’re coming to you. 

Akash Srivastava: And this is why I liked your previous question, because it gets to what this tool is. It’s a customization toolkit. That’s important because there’s nothing comparable available right now from any other company. It doesn’t exist outside of InstructLab, RHEL AI, or OpenShift AI.

Heidi Dempsey:  Very cool. That’s a lot like us in research. We’re working with pathologists and biologists and other researchers in the same way. And I love that outlook because it’s concentrating on the things your users do know and what they can contribute to the eventual model that’s going to solve the problem.

Akash Srivastava: I like that, because to me that’s the Red Hat way. And tell them the technical things so they will come back and contribute.

Heidi Dempsey: Thank you very much for such a lively conversation.

Akash Srivastava: Thank you—that was really fun!
