Interest detection from Social Media: how does it work?

Interests are a fundamental component of our audiences, stakeholders and customers. They tell us how to engage people, how to talk to them and, ultimately, the topics we can use to get their attention. And every marketer knows how precious and scarce this resource can be nowadays.

Because interests are qualitative and non-discrete, they used to be difficult information to obtain; fortunately, we now have social media, which offers a huge variety of data that can be used to infer interests. Listening tools such as KPI6 can do the trick for you: they extract interests from your user base and bring them to you in an easy, ready-to-read dashboard. Let’s learn together how that can be accomplished.

Are you eager to start using interest analysis to benefit your brand? Meet our experts and ask them now:


Interest Analysis

Social Media Analysis offers a large suite of tools we can use to improve our way of searching for a particular trend. Having a visual understanding of what a person talks about is a great advantage that can save us an enormous amount of time, especially if we are about to analyze not a single person, but a large group of people – an audience.

For example, we may want to understand what kind of people buy a given product or follow a particular influencer, or we may want to launch a marketing campaign targeting users who see the content of our Facebook page. To help our customers, KPI6’s AI Researchers have developed an intelligent system that eases the pain of navigating through more than one thousand interests.

We follow the standard set by IAB (Interactive Advertising Bureau), where interests are grouped into more than 20 main categories, each containing up to four levels of precision. We can show whether an audience talks more about rock music, which places it in the class named | Art and Entertainment | Music | Rock Music. Or maybe its members prefer to travel, as we can suppose if they write about Disneyland very often.

Knowledge Graph

We commonly assume that if a person writes frequently about an entity such as “Cristiano Ronaldo”, that person very likely talks about soccer often. In other words, we can infer that an entity (Cristiano Ronaldo) is linked to an interest (Soccer). But how can we do that on an analytics platform? Manually writing out every possible pair of entity and interest is not feasible, since it would require far too much effort.

How can we know that “Cristiano Ronaldo” is a football player, then?

We can, thanks to a technology that is not really brand-new: it has been used extensively by large and small companies to make their products appear a bit smarter. Think about what Google did with online search: if we type “Cristiano Ronaldo”, the search engine shows us information beyond the usual list of web pages. A panel appears, showing details such as age, partner, children and career statistics, as well as salary and how many goals he has scored during the current season.

Google is not doing it by hand and, believe it or not, not even with a magic wand. The answer is actually pretty simple: the phrase “Cristiano Ronaldo” is matched to a single entry in a large private data archive, the so-called “Knowledge Graph”, which stores all the information about the popular football player.

The entity matching problem

We won’t bore you with technical details; we’ll try instead to keep things as simple as they should be. Imagine we have entities like cities, foods, people and so on. We also have qualities like dates, professions and ages. A single entity is linked to some of those qualities, because a relation exists that connects the entity and the quality. Relations can also link entities to other entities, but never qualities to other qualities.

Now we have some basic rules that will tell us that, if we find the entity “Cristiano Ronaldo” with the relation “profession” and quality “soccer”, we will know that the aforementioned entity is a football player.
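To make the idea concrete, here is a minimal sketch of such a rule applied to a toy graph. It is only an illustration under stated assumptions: the `Triple` type, the `RULES` table and the `interests_for` function are hypothetical names, and the handful of facts stands in for a real knowledge graph.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str   # always an entity
    relation: str  # e.g. "profession"
    obj: str       # an entity or a quality


# Toy graph for illustration only; a real knowledge graph is vastly larger.
TRIPLES = [
    Triple("Cristiano Ronaldo", "profession", "soccer"),  # entity -> quality
    Triple("Cristiano Ronaldo", "born_in", "Funchal"),    # entity -> entity
    Triple("Funchal", "located_in", "Portugal"),          # entity -> entity
]

# Hypothetical rule table: a (relation, quality) pair implies an interest.
RULES = {("profession", "soccer"): "Soccer"}


def interests_for(entity: str) -> set[str]:
    """Collect every interest implied by the triples that start at `entity`."""
    return {
        RULES[(t.relation, t.obj)]
        for t in TRIPLES
        if t.subject == entity and (t.relation, t.obj) in RULES
    }


print(interests_for("Cristiano Ronaldo"))
```

The point of the design is that the rule is written once, for the pair (profession, soccer), and then applies to every footballer in the graph without anyone listing them by hand.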

But there’s a missing piece in the puzzle: we need a way to know if we are still talking about the “Cristiano Ronaldo” knowledge graph entity if we encounter words like “Cristiano Ronaldo”, “Ronaldo” or “CR7”.
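The simplest way to picture this step is an alias table that maps each surface form to one canonical entity. The sketch below assumes exactly that: `ALIASES` and `link_entities` are hypothetical names, and the plain dictionary lookup is a stand-in, not the NLP-based linking described in this article. In particular, a lookup like this cannot tell which “Ronaldo” is meant when the name is ambiguous; that is where context-aware models come in.

```python
import re

# Hypothetical alias table: lowercase surface form -> canonical entity name.
ALIASES = {
    "cristiano ronaldo": "Cristiano Ronaldo",
    "ronaldo": "Cristiano Ronaldo",
    "cr7": "Cristiano Ronaldo",
}


def link_entities(text: str) -> set[str]:
    """Return the canonical entities whose aliases appear as whole words."""
    lowered = text.lower()
    return {
        entity
        for alias, entity in ALIASES.items()
        if re.search(r"\b" + re.escape(alias) + r"\b", lowered)
    }


print(link_entities("CR7 scored again last night!"))
```

Because the result is a set, a post that mentions both “Ronaldo” and “CR7” still yields a single entity.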

When the user mentions an entity, we first recognize it in the text; only then do we link it to an entity of the knowledge graph, using NLP (Natural Language Processing) technology. Once the prediction has been made, we extract all the relevant information we want to show in the chart. We organize the information hierarchically, showing at first only the general interest classes like “Art and Entertainment” and “Automotive”. It is then up to the customer, according to her own needs, to expand a category. For example, in the Automotive category, we might see sub-interests like “Sports Car” at 80%, “Utility Cars” at 15% and “SUV” at 5%. Clearly, this particular audience likes to talk about luxury items; people belonging to it will probably have a high income, too.
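The expand-a-category behaviour can be sketched as a small roll-up over predicted interest paths. This is an illustrative sketch, not the dashboard’s actual code: `child_shares` is a hypothetical helper, and the sample paths are invented so that the Automotive breakdown reproduces the 80/15/5 split from the example.

```python
from collections import Counter

# Invented sample: one interest path per linked mention in the audience.
paths = (
    [("Automotive", "Sports Car")] * 16
    + [("Automotive", "Utility Cars")] * 3
    + [("Automotive", "SUV")] * 1
    + [("Art and Entertainment", "Music", "Rock Music")] * 5
)


def child_shares(paths, parent=()):
    """Percentage of each direct child category among paths under `parent`."""
    parent = tuple(parent)
    depth = len(parent)
    counts = Counter(
        p[depth] for p in paths if p[:depth] == parent and len(p) > depth
    )
    total = sum(counts.values())
    return {name: 100 * n / total for name, n in counts.items()}


print(child_shares(paths))                   # general classes only
print(child_shares(paths, ("Automotive",)))  # after expanding "Automotive"
```

Calling the function with an empty parent gives the collapsed view of general classes; passing a category as `parent` gives the view the customer sees after expanding it.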

With great power comes great responsibility

When people ask us to list all the interests we can predict, we feel a bit troubled. We really don’t know! What we can do is generate a subset of the interest hierarchy. Remember: a hierarchy here is a branch of a tree that goes from the abstract to the specific, like Automobile | Cars | Sports Cars. The rest of the hierarchy is only known when an entity is evaluated. We do not write all the interests by hand; instead, we let the system infer for us what our charts will display.

For example, imagine how many genres of music exist in the world. Imagine that a big database that stores all of them also exists (the Knowledge Graph) and that we can access that information with very little effort. To do so, we match the name of a song to the related entity in the database and look for the relation “genre”. If a hit is found, we take the genre, say “Rock Music”, and attach it to the prediction as a specific interest under “Music”.
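The genre lookup can be sketched in a few lines. Everything here is an assumption made for illustration: `GRAPH` is a tiny stand-in for the knowledge graph, and `MUSIC_ROOT` and `music_interest` are hypothetical names. The key point is that the last level of the path comes from the data, not from a hand-written list.

```python
# Tiny stand-in for a knowledge graph: entity -> {relation: value}.
GRAPH = {
    "Smoke on the Water": {"type": "song", "genre": "Rock Music"},
    "Disneyland": {"type": "theme park"},  # no "genre" relation
}

# Fixed, known part of the hierarchy; the last level is discovered at run time.
MUSIC_ROOT = ("Art and Entertainment", "Music")


def music_interest(entity: str):
    """Return the full interest path if the entity has a genre, else None."""
    genre = GRAPH.get(entity, {}).get("genre")
    if genre is None:
        return None
    return MUSIC_ROOT + (genre,)


print(music_interest("Smoke on the Water"))
print(music_interest("Disneyland"))
```

A new genre added to the graph tomorrow would show up in the charts without any change to this code, which is why the full list of predictable interests cannot be written down beforehand.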

We can predict more than one thousand interests this way, but we cannot list them in advance. Nevertheless, we do think that we have gained much more than we have lost.

Language Support

KPI6’s Interest Detection supports English and Italian, and we are planning to extend support to French, Spanish, German and Arabic. The cool thing is that, since our tool is more of an “Entity Detection” than an interest detection, we can find English and Italian entities inside text written in any other language. So, it can still work if you want to analyze a French text!

Japanese or Chinese audiences are still difficult to handle, because we cannot yet process ideographic scripts, but we have succeeded in extracting interests from Arabic-speaking users. Results are also more likely in audience analysis than for a single person: more people means more text, and hence a higher chance of finding an English or Italian word in it.

Interest Detection, as you can see, works in a complex way, but it has been developed to make things simpler for our users and to turn that complexity into powerful, understandable insights.

Chances are you can make good use of well-classified interests: any company would love to know one or two key topics that can quickly and reliably engage its audience. Sometimes interests are surprising and unpredictable. Without a tool, you always risk spending your money on unprofitable activities: you might be convinced that your customers love football, while they spend their Sundays playing golf… and your budget for TV spots is gone!

Interest detection is only a part of our Artificial Intelligence-based features. Find all the tools we can offer you to boost your company’s ROI:
