The Semantic Layer: A Reliable Map of the Enterprise Data Landscape

May 16, 2025
Reading Time: 9 min

Read about how the idea of a semantic layer started, what its elements are, and how it can be used to enhance the performance of large language models.

This post is an abbreviated and updated version of the Knowledge Graph Insights podcast, episode 14, hosted by Larry Swanson. Larry is a digital architect and strategist as well as a co-organizer of the Future of Content meetup in Amsterdam, the Knowledge Graph Conference, Connected Data London, and Decoupled Days. In this episode, he discusses enterprise data challenges, the semantic layer, and knowledge graphs with Andreas Blumauer, SVP Growth and Marketing at Graphwise.

How did it all start?

Data and content professionals have been having knowledge management discussions for a long time, and around 2005 the community was already divided into two camps. One side thought that documents contained information and that was enough. The other side argued that this was not knowledge: for data to become knowledge, a human needs to recognize it as such. But I always felt I was sitting between these two camps. I was excited about what AI could bring, but at the same time I understood that we shouldn’t substitute human intelligence with AI.

Looking at the enterprise data architectures in those days, I thought there was something missing. There was the data layer (the document repositories) and there was the business logic layer (the applications), and nothing in between. It seemed like a big hole. So, I was trying to introduce this new layer, which we started calling a “semantic layer,” and most people looked at me like I was an alien. But I kept going in that direction. Today, everybody is discussing what this layer should look like and, in the context of a good AI strategy, it has become more important than ever before.

Why do large language models fail to deliver?

Language allows us to develop knowledge; without language, we usually don’t develop anything more complex than what we need for our survival. But knowledge is more rigid than language. It’s a structure that depends very much on context: the same knowledge may be very important in certain situations and not so important in others.

Different types of knowledge need to be handled differently. It’s all about understanding how to embed knowledge in a process. Using language and the unstructured data side of things is just one way of moving forward. Using highly structured data to describe a particular element of our enterprise landscape is another. So, it’s very important to link various aspects of data, which gives us different ways of getting value out of it. As soon as we interconnect our data and contextualize it dynamically, we increase its value. But we need to know how to do it, and we don’t find this knowledge in our databases or our technical documentation. It’s just in our heads.

That’s why we do desktop data integration all day long. Large language models (LLMs) can’t do it for us. They can only produce answers based on what’s written down somewhere in a database, but they can’t link data across silos in a usable way. They simply can’t understand the inherent structure of structured data, so they throw it away and make random chunks out of the information. But we need that structure to understand the meaning of the data.
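To see what gets lost, here is a minimal Python sketch of the kind of naive fixed-size chunking many RAG pipelines apply to every document; the record, field names, and chunk size are all invented for illustration:

# A structured record as it might sit in a repository (invented example)
record = (
    "employee_id,name,skill,iso_standard\n"
    "42,A. Example,Data Modeling,ISO 9001\n"
)

# Naive fixed-size chunking, blind to the record's structure
chunk_size = 30  # arbitrary; real pipelines use larger windows
chunks = [record[i:i + chunk_size] for i in range(0, len(record), chunk_size)]

# Each chunk now cuts the header off from its values, so the link between
# a column name and its value, i.e. the structure, is gone.
print(chunks)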

To help them deliver, we first want to link different types of data and then enrich them with domain knowledge. This domain knowledge is the background knowledge we need to better understand which type of data to use in which situation. Here, knowledge graphs can be a lifesaver, as they can sustain the knowledge and help interpret content and data in a useful way.

Why do we need a map of the data landscape?

We live in a world where there’s a silo per application and when we try to follow the logic of a given domain, we get lost. We constantly need to ask our colleagues where we can find a report or to consult an expert on the topic we are working on. So, we need a map to be able not just to look around, but to navigate all data points. When we want to go from one place to another in our car, we use a map. We don’t want to keep stopping and asking for directions. But this is exactly how we work in our enterprises.

I think the next big step is the semantic data fabric, which offers many possibilities to navigate the data landscape. It’s driven by the domain knowledge model and it describes the way business objects are connected. For example, when we hire a new employee, we might want to find out where they worked before and what skills they acquired there. Then we’ll need to decide if we can reuse these skills for one of our projects or if the new person needs additional training. If so, what could be the best training for them?

All these questions keep us busy and, instead of focusing on our work, we spend our time doing desktop data integration. We pull together information, talk to colleagues, take notes, create spreadsheets, and so on. This isn’t collaborative. And even the best enterprise search engines can’t help us, because they don’t connect the dots. They can only search over repositories, which speeds up our task but doesn’t give us a map to navigate the data points. No wonder knowledge workers these days are so stressed, as they have to do all this research before they can make a decision. At the end of the day, it’s all about decision-making.

How can we use the semantic layer as a map?

First, we need to describe each relevant business object in a way that allows it to connect to other business objects. To do that, we want to have a description based on a controlled vocabulary, or a taxonomy, or an ontology. So, when we use this domain knowledge model to classify all business objects, we can instantly connect, for example, a description of a person’s skills to other business objects such as projects, products, or standards. We could call this a recommender service that can understand which business objects belong together.
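As a rough illustration of such a recommender, here is a small sketch using Python’s rdflib; the namespace, the hasSkill and requiresSkill properties, and the resources are hypothetical stand-ins for a real controlled vocabulary:

from rdflib import Graph, Namespace

EX = Namespace("https://example.org/")  # hypothetical vocabulary namespace

g = Graph()
# Business objects classified against the same domain knowledge model
g.add((EX.personA, EX.hasSkill, EX.dataModeling))
g.add((EX.project1, EX.requiresSkill, EX.dataModeling))

# A simple recommender query: which projects match this person's skills?
query = """
SELECT ?project WHERE {
    <https://example.org/personA> <https://example.org/hasSkill> ?skill .
    ?project <https://example.org/requiresSkill> ?skill .
}
"""
for row in g.query(query):
    print(row.project)  # -> https://example.org/project1

Because both objects point at the same concept URI rather than a free-text label, the match works across silos.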

This is possible not just because Person A is tagged with Skill B and Project 1 is tagged with the same skill, but because we use inference, or “inference tagging” as we call it. That means we know that if a person has a certain skill, we can automatically infer that this skill is connected to a particular ISO standard. The domain knowledge model allows us to do this inference automatically.
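A minimal sketch of that inference step, again with rdflib and invented property names; the person-to-standard link is never stated directly but is derived from the skill’s place in the domain model:

from rdflib import Graph, Namespace

EX = Namespace("https://example.org/")  # hypothetical namespace

g = Graph()
g.add((EX.personA, EX.hasSkill, EX.skillB))    # instance data: a tag
g.add((EX.skillB, EX.coveredBy, EX.iso9001))   # domain knowledge model

# Materialize the implied person-to-standard links ("inference tagging")
construct = """
CONSTRUCT { ?person <https://example.org/relatedStandard> ?standard }
WHERE {
    ?person <https://example.org/hasSkill>  ?skill .
    ?skill  <https://example.org/coveredBy> ?standard .
}
"""
for triple in g.query(construct):
    g.add(triple)  # personA is now connected to iso9001 automatically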

This is how we do desktop data integration in our heads as well. We pull together data from different silos and try to have a holistic view over it. Our background knowledge kicks in to help us understand how things are connected. So, all we need to do is digitize that process. We invest so much money in huge data and content repositories and then we skip the last 1%. The domain knowledge model is just 1% of the semantic layer, but it’s a core piece of any AI strategy.

Can LLMs accelerate knowledge graph development?

When LLMs appeared, the graph community was shaken because we thought they would take over completely. But that wasn’t the case. LLMs are great at bubbling up unstructured data, so enterprises are now much more aware of its value. Which is great, because there’s so much interesting stuff there. But we need to handle this data in a more structured way.

This is where domain knowledge models come in. They connect the pieces between the data and the content of an enterprise. And even the people! I’ve seen many times how the content people and the data people in an enterprise don’t talk to each other. But at the end of the day, it doesn’t matter if it’s data or content, it’s all about knowledge.

What are the elements of the semantic layer?

The semantic layer consists of two elements. The first is the domain knowledge model, and typically it’s not just one but several, interconnected and mapped to each other. This work is still done by subject matter experts together with knowledge engineers. The second part of the semantic layer is the enterprise knowledge graph. It’s generated automatically through the use of the domain knowledge models and other ways of extracting and transforming data from existing data repositories. So, the majority of the semantic layer is always built automatically.
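In practice, that automated part often looks like a small transformation step: records are lifted out of an existing repository and expressed as graph statements whose predicates and concept URIs come from the domain knowledge model. A hedged rdflib sketch, with an invented record shape and namespace:

from rdflib import Graph, Literal, Namespace

EX = Namespace("https://example.org/")  # hypothetical namespace

# A record as it might come out of an existing HR repository (assumed shape)
employee = {"id": "emp-42", "name": "A. Example", "skills": ["data-modeling"]}

g = Graph()
subject = EX[employee["id"]]
g.add((subject, EX.name, Literal(employee["name"])))
for skill in employee["skills"]:
    # The domain knowledge model supplies the canonical concept URI,
    # so the same skill tagged in two silos resolves to a single node.
    g.add((subject, EX.hasSkill, EX[skill]))

print(g.serialize(format="turtle"))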

At the moment, the content people are more aware of the importance of a semantic layer. The data people are still more on the mapping side of things and less focused on the value of enriching existing data with a domain knowledge model. A lot of AI strategy and RAG architecture discussions stick to the idea that vector databases will do the magic, which is rarely the case. Some data professionals are still hesitant to reuse taxonomies and ontologies to enrich their RAG architecture and prefer to do everything with algorithms, LLMs, database embeddings, and so on. So, it’s the content people who introduce taxonomy and ontology discussions into the decision-making processes. But, as I said, the content and the data people have started talking to each other as a result of the need to deliver on the AI promise.
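One lightweight way to bring taxonomies into a RAG architecture is to re-rank vector-search hits by the concepts the domain model has assigned to them. This is a sketch under assumed data shapes, not a prescription; the tuple layout, tags, and boost value are all invented:

def rerank(hits, query_tags, boost=0.2):
    """Re-rank vector-store hits using shared taxonomy concepts.

    hits: list of (chunk_id, similarity, tags) tuples as they might come
    back from a vector database (assumed shape); query_tags: the concepts
    the domain knowledge model assigned to the user's question.
    """
    rescored = []
    for chunk_id, similarity, tags in hits:
        overlap = len(query_tags & set(tags))  # shared concepts
        rescored.append((chunk_id, similarity + boost * overlap))
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)

# A chunk tagged with the same concept as the question floats to the top,
# even though raw embedding similarity alone ranked it second.
print(rerank(
    hits=[("c1", 0.80, ["iso-9001"]), ("c2", 0.78, ["data-modeling"])],
    query_tags={"data-modeling"},
))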

Wrapping it all up

Nowadays, we see a lot of mergers, acquisitions, and other cooperation models in many industries. This brings a strong need to map the different vocabularies: those used by humans as well as the ones used by machines. The semantic layer can do that. It can help interpret data in a way that can be instantly understood by others. So, this isn’t about imposing an ontology on the enterprise and making everyone use it. It’s about mapping different ways of looking at things. It’s an important tool for developing data governance that allows a degree of fusion between decentralized and centralized structures. We could call this glocalization: a mixture of global and local organizations coming together.

I think we even need a semantic layer in politics and religion to help different groups of people better understand and interact with each other. Then we’ll start to see that we are talking about the same things, just using different contexts around them. In my opinion, humans are much closer to each other than we currently think, and what separates us is only terminology.
