
Are You Feeding Your Fraud Model Identity Data Designed for Risk?

We recently published a blog post explaining why tree-based models are popular in fraud prevention and why fraud models must be fed the right data. Today, there are many identity data providers to choose from, and most offer their data through an API. However, identity data APIs are not all the same. Few providers design their identity data and APIs specifically for risk. Many identity data APIs are developed without a particular model in mind; they are entirely generic and meant for any use case under the sun.

General Purpose Identity Data APIs in Fraud Models

A general purpose identity data API could be used in a fraud model, but there are some challenges to using an identity API that is for any use case and not risk in particular. For example, a common type of API used in fraud prevention models and systems is an address validation API. An effective address validation API ensures that addresses are valid and helps merchants make sure orders are shipped to the right addresses. But what happens when an address validation API is developed for general use cases and not with fraud in mind?

When an address validation API is general purpose and not built with a specific model in mind, the provider could change the output, and could do so at any time. Let’s say you’re using an address validation API and your model relies on its output features A, B, and C. What happens if the provider changes those features to D, E, and F? First, your model breaks because the features it was trained on are no longer there. Second, you must now retrain your model.
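One way to catch this kind of change before it reaches the model is to compare each API response against the feature set the model was trained on. The following is a minimal sketch only: the field names (`is_valid`, `is_commercial`, `address_id`) and the `schema_drift` helper are hypothetical and do not come from any real provider's API.

```python
# Hypothetical feature names the model was trained on (the "A, B, C" above).
EXPECTED_FEATURES = frozenset({"is_valid", "is_commercial", "address_id"})


def schema_drift(response: dict, expected: frozenset = EXPECTED_FEATURES):
    """Return (missing, unexpected) feature names for one API response."""
    received = set(response)
    missing = set(expected) - received      # features the model needs but did not get
    unexpected = received - set(expected)   # features the model has never seen
    return missing, unexpected


# A response whose fields were renamed by the provider (the "D, E, F" case).
changed = {"valid": True, "commercial": False, "address_id": "abc"}
missing, unexpected = schema_drift(changed)
if missing:
    print("Do not score this response; missing features:", sorted(missing))
```

A check like this only detects the change. It does not remove the need to retrain once the provider's output has actually changed.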

Consistency is not the only thing to consider when choosing an address validation API. The API should be able to normalize addresses entered in different formats, because not all customers input an address the same way. In addition, the API should not only validate addresses effectively but also provide a unique identifier (UUID) for every address. The UUID can be used in a fraud model to calculate velocities, such as how many orders reference the same address within a given time window. Our address validation API is built around these requirements.
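To illustrate how an address identifier can feed a velocity feature, here is a rough sketch that counts how many times the same address ID has appeared within a trailing time window. The `AddressVelocity` class, its method names, and the one-hour window are assumptions made for this example, and it assumes events arrive in timestamp order.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta


class AddressVelocity:
    """Counts events per address ID inside a trailing time window."""

    def __init__(self, window: timedelta):
        self.window = window
        self._events = defaultdict(deque)  # address ID -> timestamps in the window

    def record(self, address_id: str, ts: datetime) -> int:
        """Record one event and return the current velocity for that address."""
        events = self._events[address_id]
        events.append(ts)
        cutoff = ts - self.window
        # Drop timestamps that have fallen out of the window.
        while events and events[0] < cutoff:
            events.popleft()
        return len(events)


velocity = AddressVelocity(window=timedelta(hours=1))
start = datetime(2024, 1, 1, 12, 0)
velocity.record("addr-1", start)
count = velocity.record("addr-1", start + timedelta(minutes=30))
print(count)  # two orders for the same address ID within the hour
```

Because the count is keyed on the identifier rather than the raw text, two differently formatted spellings of the same address would contribute to the same velocity, provided the API maps them to the same ID.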

Mapping APIs Are Not Validation APIs

Some companies try to take a mapping API and adapt it into an address validation tool. For example, the Google Maps platform includes a Geocoding API capable of geocoding and reverse geocoding addresses. But geocoding is just one of the many mapping capabilities provided by the Google Maps platform, and Google Maps is not designed specifically for address validation. Future releases of the Google Maps platform could be problematic for your fraud model. Why is that?

The intent is different when it comes to a mapping platform vs. a validation or identity data platform. A mapping platform like Google Maps starts with an assumption of innocence and tries to “find the closest match to the address you’re looking for.” A validation or identity data platform starts with an assumption of suspicion and tries to “determine whether the address you gave is legitimate.” If you’re on the road trying to get directions, you want a service that will do its best to figure out what address you’re after even if it’s misspelled. But if you’re trying to determine if the address is fabricated, that’s a very different need.
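The difference in intent can be shown with a toy example. This is purely illustrative and is not how any real mapping service or validation API works: the address list and both functions are made up, and the fuzzy matching uses Python's standard `difflib` module.

```python
import difflib

# Toy reference list of known addresses, stored in normalized form.
KNOWN_ADDRESSES = ["123 main st", "45 oak ave", "9 elm rd"]


def _normalize(raw: str) -> str:
    """Lowercase and collapse whitespace."""
    return " ".join(raw.lower().split())


def closest_match(raw: str) -> str:
    """Mapping-style intent: always return the nearest known address."""
    # cutoff=0.0 means even a very poor match is accepted.
    return difflib.get_close_matches(
        _normalize(raw), KNOWN_ADDRESSES, n=1, cutoff=0.0
    )[0]


def is_listed(raw: str) -> bool:
    """Validation-style intent: accept only an address that is actually known."""
    return _normalize(raw) in KNOWN_ADDRESSES


fabricated = "999 Fake Blvd"
print(closest_match(fabricated))  # still returns some known address
print(is_listed(fabricated))      # the strict check rejects it
```

The first function is what you want when asking for directions. The second is closer to what a fraud model needs, because a fabricated address is flagged instead of being silently mapped to a real one.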

Identity Data APIs Designed for Risk

Whitepages Pro identity data APIs are designed for risk use cases. So, if you’re using our Global Address Validation API in your fraud model, you don’t have to worry that the features will eventually change. Rest assured, the API can be added to your fraud model without running the risk of future breaking changes. The API will continue to reliably and quickly parse, normalize, validate, and geocode addresses from any country in the world. Our Global Address Validation API is also often used to calculate address velocities that can be used in a fraud ML model.

When it comes to machine learning models, data consistency is incredibly valuable. And the identity data provided by our APIs is always consistent.

Our APIs are designed specifically for risk, extremely fast, and ideal for tree-based models. Need help choosing one of our data APIs for your ML model? Our team of machine learning solutions architects is happy to help. Contact us today.
