Written by James Treneman, ex-Indie Game Developer and Services Delivery Manager at TrackIt

“It’s going to be so cool when NPCs are hooked up to LLMs!” Ever heard that before? Because I have – every time a gamer enters a GenAI conversation. To a typical player, it’s an obvious progression, and why wouldn’t it be? Chatbots are everywhere, and conversational AI is becoming mainstream. But the layman’s understanding is not deep enough to foresee some of the major issues with implementing anything resembling their vision of characters with interactive AI capability.

Generative AI has been THE buzzword for a while now, but it’s reached a fever pitch within the last year. Companies have been freezing hiring, shrinking workforces, and pausing new work all because of the air of uncertainty around the technology. Executive interest in GenAI has already increased 7x over the last 6 months, and many business leaders worry they are going to miss out on the revolution.

With so much buzz around AI in gaming, much of which is based on potential, it’s easy to understand why gamers might be expecting this evolution at any moment. Although they may not be aware of the blockers, their collective interest (and potential revenue) is driving demand for integrating powerful LLM technology into gaming experiences.

Here are some of the issues: 

  • COST COST COST
  • Latency
  • Knowledge base management
  • Hallucinations and inappropriate responses
  • Consistency with narrative style / voice of the writer

Managing a corpus of game lore that characters can access based on specific privileges is one of the more complex challenges. However, agentic AI could address this issue before it becomes a significant obstacle. By adopting the “agent” model for NPCs, these characters could be efficiently managed and seamlessly integrated into existing technology to individualize each one. Alternatively, all relevant lore could be included with every prompt.

That said, there are potential risks, such as hallucinations, inconsistent style, or inappropriate responses. Until the associated costs are clearly understood, however, these issues are secondary: a feature that is too expensive to ship never reaches the point where they matter. As costs decrease, the incentive to address these problems will increase, driving rapid solution development.

Prime Use Case

A theoretical use case is a Skyrim-style game where NPC interactions are powered by an LLM. Players could type conversational input and receive lifelike, text-based responses from NPCs. If you aren’t familiar with Skyrim, it’s produced by Bethesda and sets the gold standard for open-world fantasy RPGs. Although it’s a fairly gritty offering, the gameplay is excellent, the narrative is extremely compelling, and players are faced with a multitude of high-impact decision points that dictate their journey and storyline. Who wouldn’t want to plug GenAI into that formula and create a limitless fantasy world?

It would be a magnetic combination that could potentially trigger an industry-wide paradigm shift. That’s probably why it gets suggested so often. Our concept involves users typing out their prompts to NPCs, but the implementation could be less granular, with the model generating dialogue trees for users to choose from. Companies like Inworld AI already use this technique, and it makes sense because it streamlines user interaction. But for maximum immersion, a conversational implementation remains the end goal.

As tempting as it is to utilize some of the natural speech models, we’re going to limit our scope to simple textual responses. But keep in mind, with the huge strides we’re seeing in affordability, speech output isn’t out of the question. 

To estimate cost we studied the following interaction as our primary use case:


User input (text): “why would the queen want me dead?”

NPC output (text): “Your past exploits have been useful to Her Majesty. However, the moral compass you recently developed has interfered with her plans. You should have simply followed orders.”

For this estimate, we assume a power player plays 40+ hours a month and interacts with NPCs an average of 100 times per hour, or roughly 4,000 times a month. Players may spend periods heavily interacting with the model, like when they explore a new town or enter deep conversations as part of the plot, but much of their time would still be occupied by other trappings of the genre: exploring, delving, customizing, and the like. This will be a critical balance; gameplay needs to be compelling enough to make the NPC interactions impactful, yet we still want the game to rely on local resources, rather than model calls, as much as possible.


Before we start calculating, it’s important to lock down some other aspects of the workflow. To streamline things and avoid managing a costly knowledge base, we will assume that the model receives an NPC’s entire knowledge base and the player prompt with every request. Without storing and managing a complex knowledge base, it’s the best way to operate while still giving developers strong control over NPC behavior. However, clever development in this area could easily lead to a more efficient and cost-effective way to utilize a persistent knowledge base, such as tags that could determine NPC access to specific information.
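The tag idea mentioned above can be sketched in a few lines. This is a minimal illustration, not an implementation from any shipped game: the `LoreEntry` and `NPC` types, the tag names, the lore text, and the prompt wording are all hypothetical. Each lore entry carries the tags an NPC must hold to know it, and only the permitted entries are sent along with the player's line.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoreEntry:
    text: str
    tags: frozenset  # every tag an NPC must hold to know this entry


@dataclass(frozen=True)
class NPC:
    name: str
    access_tags: frozenset  # tags this character has been granted


def visible_lore(npc, corpus):
    """Return lore texts whose tags are all within the NPC's access tags."""
    return [entry.text for entry in corpus if entry.tags <= npc.access_tags]


def build_prompt(npc, corpus, player_input):
    """Assemble one request: the NPC's permitted lore plus the player's line."""
    lore = "\n".join(f"- {text}" for text in visible_lore(npc, corpus))
    return (
        f"You are {npc.name}. Answer in character using only this lore:\n"
        f"{lore}\n\n"
        f"Player: {player_input}\n"
        f"{npc.name}:"
    )


# Hypothetical two-entry corpus and two characters with different privileges.
corpus = [
    LoreEntry("The queen rules from the capital.", frozenset({"public"})),
    LoreEntry("The queen has turned against the player.", frozenset({"court"})),
]
guard = NPC("Gate Guard", frozenset({"public"}))
advisor = NPC("Royal Advisor", frozenset({"public", "court"}))

print(build_prompt(guard, corpus, "why would the queen want me dead?"))
```

In this sketch the guard's prompt contains only the public entry, while the advisor's prompt would also include the court entry. Filtering this way also shrinks the number of tokens sent per request, which matters for the cost estimate.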

So our prime use case has the model ingesting the player’s question along with 10,000 tokens worth of lore with every interaction. That should be enough to make an adequately complex world come to life in this example setting.
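To make the volume concrete, here is a rough sketch of the monthly token count implied by those numbers. The play time, interaction rate, and lore size come from the scenario above; the question and reply lengths are assumptions added for illustration, since real token counts depend on the model's tokenizer.

```python
HOURS_PER_MONTH = 40          # from the scenario above
INTERACTIONS_PER_HOUR = 100   # from the scenario above
LORE_TOKENS = 10_000          # from the scenario above
QUESTION_TOKENS = 10          # assumption: a short typed question
REPLY_TOKENS = 40             # assumption: a two- or three-sentence reply


def monthly_token_volume(
    hours=HOURS_PER_MONTH,
    per_hour=INTERACTIONS_PER_HOUR,
    lore_tokens=LORE_TOKENS,
    question_tokens=QUESTION_TOKENS,
    reply_tokens=REPLY_TOKENS,
):
    """Return (interactions, input_tokens, output_tokens) for one player-month."""
    interactions = hours * per_hour
    input_tokens = interactions * (lore_tokens + question_tokens)
    output_tokens = interactions * reply_tokens
    return interactions, input_tokens, output_tokens


interactions, input_tokens, output_tokens = monthly_token_volume()
print(f"{interactions:,} interactions per month")
print(f"{input_tokens:,} input tokens, {output_tokens:,} output tokens")
```

Under these assumptions the lore accounts for nearly all of the input tokens, so the size of the per-request knowledge base is the main lever on cost.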

Costs

Initially, we based our estimation on the Claude 3.5 Haiku / Sonnet models, which were cost-prohibitive on a per-user basis. However, while we were writing this piece, Amazon Nova was released, which is a far more affordable option and shows just how fast the technology is advancing.

Price Comparison (per user):

  • Claude 3.5 Sonnet: $81/mo
  • Claude 3.5 Haiku: $31/mo
  • Nova Pro: $24/mo
  • Nova Lite: $2.40/mo
  • Nova Micro: $1.41/mo

As you can see, we’re in the thick of it; truly on the precipice of viability. Because we only need textual responses, even Nova Micro may be powerful enough for our use case. At just $1.41 per user per month, margins are suddenly looking pretty attractive! With game subscriptions often costing $10+ a month, a killer app leveraging Nova could pull a huge player base, all willing to subscribe for the unique experience.
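For readers who want to plug in their own numbers, the sketch below shows the general shape of a per-user estimate. The token volumes and the per-million-token prices are placeholder assumptions rather than quoted rates, and the result is not meant to reproduce the figures in the comparison above; substitute current pricing for whichever model you are evaluating. The margin line uses the $1.41 cost and $10 subscription mentioned in this article.

```python
def monthly_cost(input_tokens, output_tokens, usd_per_m_input, usd_per_m_output):
    """Per-user monthly model cost, given prices per one million tokens."""
    return (
        (input_tokens / 1_000_000) * usd_per_m_input
        + (output_tokens / 1_000_000) * usd_per_m_output
    )


def gross_margin(subscription_usd, cost_usd):
    """Fraction of the subscription left after paying for model usage."""
    return (subscription_usd - cost_usd) / subscription_usd


# Assumed volume: 4,000 interactions x 10,000 lore tokens in, 40 tokens out.
# Placeholder prices (USD per million tokens); replace with real rates.
cost = monthly_cost(
    input_tokens=40_000_000,
    output_tokens=160_000,
    usd_per_m_input=0.035,
    usd_per_m_output=0.14,
)
print(f"Estimated model cost: ${cost:.2f} per user per month")

# Margin using the article's own figures: $1.41 cost, $10 subscription.
print(f"Gross margin on model cost alone: {gross_margin(10.0, 1.41):.1%}")
```

Note that this covers model usage only; hosting, bandwidth, and the rest of the game's operating costs would reduce the margin further.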

A Way Forward – The Future of Game Development

What is an eager developer to do? Throw in the towel and wait for someone else to solve these issues? Not a chance! GenAI gets so much attention that any solution provided in this problem space has the potential for extreme returns. Game developers are nothing if not true innovators of frugality, which is a root cause of the industry’s resistance to high-spend technologies. No matter what, it will be costly, and whatever game attempts to bridge the gap will need to be compelling enough to draw premium subscribers.

What’s needed is the kind of good old innovation game developers are known for: reduce scope, find the fun, focus on one extremely compelling innovation, and polish it up into a tidy package for the world to unwrap. There are already a handful of demos and proofs of concept out there, but very little AAA development in this arena, which is understandable considering the potential liabilities and legal implications. My guess is that an indie or AA developer will be the first to crack the code, and AAA companies will assimilate or adopt the formula.

Well, get to work people! I’d like to play after all :]

About TrackIt

TrackIt is an international AWS cloud consulting, systems integration, and software development firm headquartered in Marina del Rey, CA.

We have built our reputation on helping media companies architect and implement cost-effective, reliable, and scalable Media & Entertainment workflows in the cloud. These include streaming and on-demand video solutions, media asset management, and archiving, incorporating the latest AI technology to build bespoke media solutions tailored to customer requirements.

Cloud-native software development is at the foundation of what we do. We specialize in Application Modernization, Containerization, Infrastructure as Code and event-driven serverless architectures by leveraging the latest AWS services. Along with our Managed Services offerings which provide 24/7 cloud infrastructure maintenance and support, we are able to provide complete solutions for the media industry.