THE CHALLENGE: SEARCHING ACROSS LANGUAGES

Imagine if you could search for important information regardless of language.

We live in a global world. Unfortunately, very few businesses operate in multiple languages. In the past, documents in foreign languages first had to be translated and only then reviewed for relevance. More recently machine translation was introduced, but it can’t fully deliver on the promise, primarily because it tries to correlate words or phrases across languages. One of the reasons foreign languages can be difficult to learn is that language is idiomatic. Each language expresses content in different conceptual terms or phrases. “Comment allez-vous?” means “How are you?” in French, but the literal translation, “How do you go?”, is meaningless in English.

Searching is no different. Cross-lingual solutions like BabelFish™ rely on the electronic equivalent of French-English dictionaries along with lists of common idiomatic terms. As a result, searches built on literally translated keywords are hardly reliable.

THE SOLUTION: CONTENT ANALYST

Content Analyst is trained in most major languages, including Middle Eastern and Asian languages such as Arabic, Chinese, and Japanese, and can accurately deliver results based on relevance rather than keyword or word-list association. How did we do it? Think of the Rosetta Stone, which carries the same decree in three scripts (Egyptian hieroglyphic, Demotic, and Ancient Greek). This allowed scholars to decipher the Egyptian hieroglyphs by matching them against the Greek text they could already read. The same idea, applied technologically, allowed us to train Content Analyst to “think” in these major languages. Content Analyst also “thinks” in Unicode, so if required it can be cross-trained to understand lesser-known languages as well.

Solutions in Action:
Lost in Translation - Life before Content Analyst
Life in the Intelligence Community may appear very “James Bond,” but the reality is much more like “9 to 5.” Each day another endless stream of news articles and RSS feeds arrives from countries across the globe. There is no ideal way to handle information in so many languages. You can hire multilingual researchers, but they are expensive and very few speak more than two or three languages. The next problem is deciphering enough of each document to decide which ones are worth translating in full. One reported mishap involved a car advertisement that was translated from Farsi to English. At a time when our intelligence community is supposed to know what is happening in near “real time,” time-wasting practices such as this are no longer acceptable.

Implementing Content Analyst
To begin accurate conceptual searches across multiple languages, all that is needed is to feed Content Analyst a small set of documents (as few as 20), each of which has been translated into the desired languages. Nothing else is required: the intelligence to handle specific language rules and requirements is already built into Content Analyst.
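The idea of training on translated copies of the same documents can be illustrated with a small sketch. This is not Content Analyst's actual algorithm, which is not described here; it is a deliberately simple stand-in that assumes each term can be represented by the training documents it appears in, so that terms from different languages end up close together when they occur in translations of the same document. All function names below are hypothetical.

```python
import math
from collections import Counter, defaultdict

PUNCT = ".,;:!?\"()"

def tokenize(text):
    # Lowercase, split on whitespace, and trim surrounding punctuation.
    words = (w.strip(PUNCT).lower() for w in text.split())
    return [w for w in words if w]

def train_term_vectors(parallel_docs):
    # parallel_docs: list of dicts mapping a language code to the text of
    # the same document in that language. Each term is described by how
    # often it occurs in each training document, whatever its language.
    term_vectors = defaultdict(Counter)
    for doc_id, translations in enumerate(parallel_docs):
        for text in translations.values():
            for term in tokenize(text):
                term_vectors[term][doc_id] += 1
    return term_vectors

def embed(text, term_vectors):
    # A text is the sum of the vectors of its known terms.
    vec = Counter()
    for term in tokenize(text):
        vec.update(term_vectors.get(term, {}))
    return vec

def cosine(a, b):
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def search(query, documents, term_vectors):
    # Rank documents (name -> text, any trained language) against the query.
    q = embed(query, term_vectors)
    scored = [(name, cosine(q, embed(text, term_vectors)))
              for name, text in documents.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Two training documents, each available in English and French.
parallel = [
    {"en": "the bank raised interest rates",
     "fr": "la banque a relevé les taux d'intérêt"},
    {"en": "the team won the football match",
     "fr": "l'équipe a gagné le match de football"},
]
vectors = train_term_vectors(parallel)
ranking = search("interest rates",
                 {"fr_bank": "les taux de la banque",
                  "fr_sport": "l'équipe a gagné"},
                 vectors)
```

In this toy setup an English query ranks the French banking text above the French sports text without any dictionary lookup. A production system would need far more training material and a more robust representation.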

Life after Content Analyst
The proverbial “needle in a haystack” can now be found. Content Analyst returns search results that span multiple languages and ranks the articles by relevance. Best of all, the client can use our unique Novelty filter, which ensures that any article already seen is never returned in subsequent search results. There are many benefits to using Content Analyst for searching multiple languages, but the ones most appreciated by customers are lower translation costs, a smaller backlog, and no more translation of irrelevant data.
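The behavior described for the Novelty filter, where an article already seen is left out of later results, can be sketched as follows. How the product actually decides that an article has been seen is not stated here; this minimal example assumes an exact match after normalizing case and whitespace, and the class name and methods are hypothetical.

```python
import hashlib

class NoveltyFilter:
    """Remembers articles already returned and drops them from later results."""

    def __init__(self):
        self._seen = set()

    def _fingerprint(self, text):
        # Assumption: two articles are "the same" if they match after
        # lowercasing and collapsing runs of whitespace.
        normalized = " ".join(text.lower().split())
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def filter(self, ranked_articles):
        # Keep the incoming relevance order; skip anything seen before.
        fresh = []
        for article in ranked_articles:
            key = self._fingerprint(article)
            if key not in self._seen:
                self._seen.add(key)
                fresh.append(article)
        return fresh
```

Calling `filter` on each new batch of ranked results returns only the articles the analyst has not been shown yet, in their original order.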

 
