Financial institutions’ Know Your Customer (KYC) and Anti-Money Laundering (AML) processes have historically incorporated “negative news” searches. When done correctly, negative news searches can inform risk assessments and add important context to flagged transactions. More importantly, they can help avoid the major reputational damage—and associated bottom-line impact—that comes with on-boarding the wrong types of customer. Unfortunately, legacy approaches to negative news mostly find results about the wrong people or irrelevant ones about the right people. The only real benefit of these solutions is that the financial institution can say it ran the searches, as an (apparent) appeasement to its directors and regulators. Any true risk mitigation is outweighed by the significantly higher operational costs that come with a deluge of false positive results. We think there is a better way.

Negative news, or adverse media, refers to information in the public domain that portrays a person or organization in a negative light. In practice it typically refers to reporting from a credible news organization about criminal activity, including allegations, lawsuits, arrests, and disciplinary actions. Unlike other data-driven processes required in AML, such as sanctions screening, the sources of information required for negative news are diffuse, subjective and effectively innumerable. Furthermore, the underlying data of interest – the free-form, unstructured text of the news articles – is not easily stored and queried from within a structured database with predefined rows and columns.

For years the AML industry has accepted negative news solutions from a few key heavyweight providers. Most of these providers started their AML software business in sanctions screening, and accordingly mapped this problem to the solution they already had: an authoritative list of names that they could run searches against.

The challenge of disparate media sources has forced legacy providers to employ small armies of analysts to manually screen countless sources every day and cut-and-paste names onto an expanded list of names.

The problem with this sanction screening-inspired approach is two-fold. The first problem also plagues sanctions screening applications: names are common. It is not unusual for a sanctions screening program to produce just one true positive name match for every 10,000 “hits.” While that false positive rate should be shockingly unacceptable in any industry, it is generally regarded as reasonable in sanctions screening due to the enormous inherent risk in a name appearing on a sanctions list. However, for negative news screening, a name appearing in a single news article can imply a wide range of risk. There is no justification for a financial institution to expect or accept such a ridiculous false positive rate. And yet, this is the embarrassing truth of the performance of solutions available.

The second problem is a problem for the providers themselves: news content proliferates every day and it is nearly impossible to keep up. At the same time, clients expect timeliness and are asking for monitoring programs with daily alerting.

Providers have historically responded by growing their human curation teams, but this path is not scalable. There is a rapidly approaching point where the customer will no longer tolerate the growing pass-through costs of the behind-the-scenes manual review team.

The most common adaptation of the solutions providers has been to automate the retrieval of news using a keyword-based filter. Now the extension of the authoritative list of names becomes an authoritative list of keywords: only news articles with one of
these keywords is flagged, and any occurrence of a customer name in the same article produces a hit. This adaptation breeds a second category of false positives: incorrect risk assignment. The customer mentioned may be a witness, victim, defendant, or otherwise incidental party in the news event, or the keyword itself may be used in an entirely different context (e.g. “‘I Killed His Confidence’- Marat Safin Blames Himself For Roger Federer’s Struggle Against John Millman”).

Recognizing these shortcomings in their solutions, many providers have inserted their own protective layer of professional services between the raw “hits” output from their software and the final alerts that are presented to investigators. These consultants help to mitigate the worst false positives but they can do little to identify the more subtle incorrect person identification. Furthermore, clients purchasing these solutions, thinking they are leveraging software technology to reduce operational costs and remove the bias and errors of human curation, find themselves sorely disappointed. It’s not surprising that many AML executives therefore choose instead to unleash their investigators on Google, despite the obvious shortcomings of a search engine optimized to return popular links and drive clicks.

A New Approach

Quantfind recognizes the untapped potential of adverse media in mitigating reputational and regulatory risk, and challenges the status quo that has existed for far too long. Quantifind’s approach to negative news screening is not inspired by sanctions name screening, but rather is based on intelligence automation and the dynamic discovery of predictive signals. Quantifind does not rely on a static list to match to, but rather analyzes tens of thousands of articles daily for emerging risk typologies. Machine learning is leveraged to analyze the context around a name to improve results across both the accuracy and relevance dimensions. Quantifind not only extracts the correct name, robust to common misspellings and aware of different cultural name structures, but also makes sure to flag only those articles that are in fact relevant to an AML investigation. Quantifind’s solutions have automated the gathering, extraction, and curation of results—at scale—without a team of human analysts in the loop.

Some of the key attributes of the Quantifind solution include:

  • Automated preprocessing of tens of thousands of news articles a day, before any specific searches are launched
  • Real-time, full-text search against millions of news articles on-demand
  • Automated on-the-fly results curation with machine learning and AI:
    • Estimation of probability of a true match, using advanced entity resolution models
    • Assignment of risk, using a suite of dynamic risk factor typology models
  • Automated news article clustering and summarization
  • Global, multi-language news data from over 10,000 individual sources
  • Delivered as pure Software-as-a-Service through both APIs and a web application
  • Service configurable to users’ risk tolerance by Risk Level, specific Risk Factor(s), and estimated accuracy of supporting evidence

Quantifind’s approach to negative news represents a next-generation solution that is in use at some of the world’s largest global financial institutions. The Quantifind negative news solution is a part of a more comprehensive software solution representing a one-stop shop for financial crimes investigators and analysts to access and integrate public domain information. The full solution uses machine learning on top of proprietary infrastructure specifically designed for unstructured text analytics, refined over ten years and billions of documents processed. To learn more about Quantifind’s solution and to schedule a demonstration, visit us at or contact us.

Quantifind’s solution for negative news—part of a broader solution for public domain data in financial crimes use cases—makes use of a series of model-based filters to isolate only the most accurate, relevant results. Quantifind’s service runs a full-text search against over six million news articles—prefiltered for relevance—to return on average fifty-four documents with a name match. Entity resolution and the application risk typology models isolate the one document with greater than 90% chance likely to be a true match (“Strong Confidence”) and which indicates High Risk. The entire search and curation process happens on-the-fly at the time of a search against Quantifind, with results returned in milliseconds.