When it comes to enterprise content, sharper focus through fewer parameters gives small language models (SLMs) an edge over their larger counterparts. Generative AI agents powered by SLMs can be more accurate, more secure, and require fewer computing resources to operate. And all of those benefits can lead to lower costs to implement and maintain.
To understand language models and how they’re used, it’s helpful to first understand the notion of generative artificial intelligence (gen AI). Gen AI is a form of AI built on deep learning, which relies heavily on neural networks. Gen AI is “trained” to analyze existing content (also known as training data or the training corpus) and to use statistical probability to generate new content in the form of text, imagery, sound, and other formats. Gen AI isn’t looking only at individual words, sounds, or images in existing content; it’s also looking for patterns in how the elements of language work together to create meaning (also known as the language model).
Gen AI analyzes what has happened before (as described in the training corpus) so that it can apply the language model to generate new content. How a gen AI tool responds to a request depends on how it’s been trained and the language model, or models, that it’s using.
Take this sentence as an example: “Bob went out to walk the __.” If we ask Microsoft Copilot, a generative AI tool, to finish the sentence for us, Copilot responds:
“…iguana. Why not, right? Bob could be a trendsetter in exotic pet ownership! Or perhaps there’s another creative spin you’d like to put on this—what do you think Bob’s walking?”
A different generative AI tool, trained on different content and/or using different language models, might come back with a more common suggestion:
“dog. Bob went out to walk the dog.”
The output of today’s generative AI tools is limited to the content they’ve been trained on, the language models underlying that content, and the task they’ve been asked to complete. Generative AI agents can’t come up with ideas they haven’t encountered in some way before.
In The Content Advantage, Colleen Jones explains the concept of a large language model (LLM).
An important concept in generative AI is the token. A token is simply a unit of content: a word, a punctuation mark, a character, an image, etc.
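To make the idea of a token concrete, here is a minimal sketch of tokenization in Python. It uses a simple word-and-punctuation split for illustration only; production models use subword tokenizers (such as byte-pair encoding), but the principle is the same: content is broken into discrete units the model can count and compare.

```python
import re

def simple_tokenize(text):
    # Split text into word tokens and punctuation tokens.
    # Real language models use subword schemes (e.g., byte-pair
    # encoding), but the core idea is identical: content becomes
    # a sequence of discrete units.
    return re.findall(r"\w+|[^\w\s]", text)

tokens = simple_tokenize("Bob went out to walk the dog.")
print(tokens)  # ['Bob', 'went', 'out', 'to', 'walk', 'the', 'dog', '.']
```

Each item in the resulting list is one token; a model’s training corpus is, in effect, billions of such sequences.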
Another important concept is the parameter. A parameter is a variable or setting that language models use to “understand” what each token in the training corpus “means,” and to generate a response to a request (e.g., asking Microsoft Copilot to finish the sentence about Bob).
Small language models (SLMs) aren’t just like LLMs; they used to be LLMs. To create an SLM, developers use techniques such as distillation, pruning, and quantization to make an LLM smaller and more manageable while preserving as much of its original power as possible. Examples of SLMs and their LLM ancestors include:
| SLM | LLM Ancestor | Developer |
| --- | --- | --- |
| DistilBERT | BERT | Hugging Face |
| GPT-4o mini | GPT-4 | OpenAI |
| Gemma | Gemini | Google |
| Haiku | Claude | Anthropic |
| xGen-Sales | Agentforce | Salesforce |
| Granite 3.0 | Granite | IBM |
| Llama 3.2 | Llama | Meta |
| Phi-3 | Phi | Microsoft |
| Ministral 3B | Les Ministraux | Mistral AI |
For example, from Google’s Gemini came the SLM Gemma, and from BERT, DistilBERT. Microsoft’s Phi gave rise to Phi-3, and Mistral AI’s Les Ministraux to Ministral 3B.
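Of the compression techniques mentioned above, quantization is the easiest to illustrate in a few lines. The toy sketch below maps 32-bit float weights onto 8-bit integers with a shared scale; real quantization toolkits (and distillation or pruning) are far more sophisticated, but the space-for-precision trade is the same.

```python
def quantize_int8(weights):
    """Map float weights to the int8 range [-127, 127] with a shared scale."""
    scale = max(abs(w) for w in weights) / 127
    return [round(w / scale) for w in weights], scale

def dequantize(q_weights, scale):
    """Recover approximate float weights from the quantized integers."""
    return [q * scale for q in q_weights]

weights = [0.82, -1.54, 0.03, 0.97]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
# Each weight now fits in 1 byte instead of 4, at the cost of
# small rounding errors in the restored values.
```

Applied across billions of weights, this is how a model’s storage and memory footprint shrinks to a fraction of the original while its behavior stays close to the ancestor LLM’s.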
The difference between an LLM and an SLM is in the number of parameters the model uses to do its job. While an LLM may have 140 billion parameters, an SLM may have fewer than 5 billion, for example.
As is the case with any effort, focus is a force multiplier. The more focused the task, the more likely it is that an SLM will be better suited to the job than an LLM. SLMs are ideal for powering narrowly defined tasks.
Unless it’s deployed carefully, responsibly, and intentionally, generative AI has the potential to introduce significant risks to the enterprise. Their smaller size means that SLMs provide organizations with important opportunities to mitigate those risks.
The more parameters a language model has, the more expensive it is to operate. LLMs are notoriously resource intensive, requiring massive amounts of computing power. For example, while an LLM can require specialized hardware and hundreds of gigabytes (GB) of random-access memory (RAM), an SLM can run on a single computer with a few GB of memory. In one recent study that compared the BERT, DistilBERT, and TinyBERT language models, the LLM required more than 3 times as much energy as the smallest model to generate similarly accurate responses.
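A back-of-envelope calculation shows why the memory gap is so stark. This sketch assumes each parameter is stored as a 16-bit float (2 bytes) and ignores activations, optimizer state, and runtime overhead, so the figures are illustrative floors, not precise requirements.

```python
BYTES_PER_PARAM = 2  # assumes fp16 storage; fp32 would double this

def min_weight_memory_gb(num_params):
    # Minimum memory just to hold the model weights, in gigabytes.
    return num_params * BYTES_PER_PARAM / 1e9

llm_gb = min_weight_memory_gb(140e9)  # a 140-billion-parameter LLM
slm_gb = min_weight_memory_gb(5e9)    # a 5-billion-parameter SLM
print(f"LLM: ~{llm_gb:.0f} GB, SLM: ~{slm_gb:.0f} GB")
```

Even before counting any runtime overhead, the LLM needs hundreds of GB just to hold its weights, while the SLM’s weights fit comfortably on a single machine.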
For organizations concerned with sustainability and efficient stewardship, generative AI driven by SLMs simply costs less to operate and wastes fewer precious resources.
Their smaller size means that SLMs can be deployed in local environments, on premises, and even on devices. Most of the essential privacy and security protocols are within reach of an organization’s tech team rather than being outsourced to the cloud and subject to interference by unknown actors.
It also means that organizations can keep their assets (and those of their partners and customers) close, reducing the risk of accidentally sharing information that shouldn’t be shared.
While LLMs make generative AI powerful, they also make it unpredictable. As Colleen Jones observes in The Content Advantage:
[Generative AI] has potential to solve bigger problems at a wider scale with less human intervention over time, but it is less predictable, susceptible to biases and inaccuracies within its LLM, and requires large amounts of text and computing power.
Task-focused SLMs can be more accurate and relevant than LLMs because they don’t have to concern themselves with anything other than the domain of interest. An SLM that powers how-to guidance in setting up a smart TV, for example, doesn’t need to contemplate the history of Renaissance artists or other potentially confounding topics.
Smaller training data sets and smaller domains of concern mean that SLMs are easier to train and maintain. This is good news for organizations that need to keep up with content that changes often, quickly, or both. It also means fewer opportunities for the model to hallucinate because there are simply fewer concepts to understand.
Because SLMs analyze a smaller training corpus and have fewer parameters, their latency is lower than that of an LLM. Said another way, an SLM can be faster than an LLM because it doesn’t have to analyze everything under the sun.
Enterprises have been slow to adopt generative AI in general and SLMs in particular. But while SLMs may be small, they offer mighty potential to transform real-world applications.
Consider these articles, which delve into how organizations can start using SLMs to unlock progress in applying generative AI.
We also cover SLMs in our AI + Enterprise Content Certification with Content Science Academy.
And as SLMs become more widely used, you can be sure the Content Science team will share more developments in their application.