THE CHALLENGE: CREATING WORD LISTS OR TAXONOMIES

Imagine if you could search for key information in an unfamiliar field without knowing terminology specific to the industry.

Most likely, you’re an expert in your industry. You know which business terms are used most frequently by your company and your competitors. Now imagine trying to group or categorize the myriad documents, emails, and other electronic information floating around in your company without that understanding of the terminology. In the past, the only option was to somehow develop a relevant word list to aid in searching through the information.

Besides being a difficult task for an outsider, this approach also produces a problematic result. Taxonomies are always evolving and changing. For example, less than two years ago, if you had offered to “Google” something, people would have looked at you funny. “Google” is now an accepted verb for searching online.

THE SOLUTION: CONTENT ANALYST

Implementing Content Analyst in this case is as simple as feeding it a complete set of data and having it present a taxonomy based on the concepts contained in that data. For a taxonomy, a mere sampling of fewer than 100 documents yields 97.5+% accuracy (far above a taxonomy developed by people). You can also add specific categories, include additional modifiers, and fold in new data as new vocabulary or information is introduced to your business.

Solutions in Action:
Let your data do the talking
The information’s already in there: all the concepts, all the terms, all the contextual meanings and interpretations. Instead of making you develop a common list of categories or a taxonomy, Content Analyst literally pulls it from within the data itself, giving the data its own voice. Take a tip from your mother and listen to your data.

Life before Content Analyst
When you’re not an expert in an industry (a problem faced by attorneys and most outside knowledge workers), or when you’re trying to focus on a particular business issue or segment, you first need to develop a relevant word list: the underpinning of how you will organize and categorize that electronic information.

This is problematic. Taxonomies are always evolving and changing. Word lists are a source of continual negotiations – nobody can ever agree.

So you develop your taxonomy, only to have to redo it six months later when it’s woefully out of date. And you put in an “other” category that, over time, becomes 10% of your overall data.

Implementing Content Analyst
Implementing Content Analyst in this case is as simple as feeding it a complete set of data and having it present a taxonomy based on the concepts contained in that data. If you require manual additions (specific categories, additional modifiers, etc.), you can add these as well.

Life after Content Analyst
Taxonomy becomes one less thing you have to worry about. Content Analyst will merrily go along assigning categories based on content for as long as you run data through the engine. And to accommodate the changes that life throws at us, you have two choices: you can simply fold new data into Content Analyst, which adds the new terms to your vocabulary, or you can select a new sample and re-index it to create a new taxonomy. So when your business takes a major new tack, it’s no longer a major setback.

 
