Use Address Validation API to process addresses at high volume

Objective

As a developer, you often work with datasets containing customer addresses that may be of poor quality. You need to ensure that addresses are correct for use cases ranging from customer ID verification to delivery and more.

The Address Validation API is a product from Google Maps Platform that you can use to validate an address. However, it only processes one address at a time. In this document, we will look into how to use the High Volume Address Validation under different scenarios, from API testing to one-time and recurring address validation.

Use cases

Now let's understand the use cases where High Volume Address Validation is useful.

Testing

You often want to test the Address Validation API by running thousands of addresses through it. You might have the addresses in a comma-separated values (CSV) file and want to validate the quality of the addresses.
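As a minimal sketch of such a test run, the following Python reads addresses from a CSV file and builds one request body per row, since the API processes one address at a time. The column names (customer_id, address, region_code) and the helper names are hypothetical. The validate_address helper posts to the API's validateAddress REST endpoint and is defined here but not called, because it needs a real API key.

```python
import csv
import io
import json
import urllib.request

# REST endpoint of the Address Validation API.
ENDPOINT = "https://addressvalidation.googleapis.com/v1:validateAddress"


def build_request_body(row):
    """Turn one CSV row into a request body.

    The column names ("address", "region_code") are hypothetical.
    """
    address = {"addressLines": [row["address"]]}
    if row.get("region_code"):
        address["regionCode"] = row["region_code"]
    return {"address": address}


def read_request_bodies(csv_file):
    """Yield one request body per CSV row (one address per API call)."""
    for row in csv.DictReader(csv_file):
        yield build_request_body(row)


def validate_address(body, api_key):
    """Send a single address to the API and return the parsed JSON response."""
    request = urllib.request.Request(
        f"{ENDPOINT}?key={api_key}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Example input; in practice this would be open("addresses.csv", newline="").
sample_csv = io.StringIO(
    "customer_id,address,region_code\n"
    '1001,"1600 Amphitheatre Pkwy, Mountain View, CA 94043",US\n'
    '1002,"111 8th Ave, New York, NY 10011",US\n'
)
bodies = list(read_request_bodies(sample_csv))
```

Each element of bodies can then be passed to validate_address in a loop, ideally with throttling that matches your project's quota.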

One-time validation of addresses

While onboarding to the Address Validation API, you want to validate your existing address database against the user database.

Recurring validation of addresses

A number of scenarios call for validating addresses on a recurring basis:

  • You may have scheduled jobs that validate addresses captured during the day, for example from customer signups, order details, or delivery schedules.
  • You may receive data dumps containing addresses from different departments, e.g. from sales to marketing. The department receiving the addresses often wants to validate them before use.
  • You might collect addresses during surveys or promotions and later enter them in the online system. You want to validate that the addresses are correct while entering them in the system.

Technical deep dive

For the purposes of this document, we assume that:

  • You are calling the Address Validation API with addresses from a customer database (i.e. a database with customer details)
  • You can cache validity flags against individual addresses in your database.
  • Validity flags are retrieved from the Address Validation API when an individual customer logs in.

Caching for production use

When using Address Validation API, you often want to cache some part of the response from the API call. While our Terms of Service limit what data can be cached, any data that can be cached via Address Validation API must be cached against a user account. This means that in the database, the address, or address metadata must be cached against a user's email address or other primary ID.

For the High Volume Address Validation use case, data caching must follow the Address Validation API Service Specific Terms, outlined in Section 11.3. The response fields listed below are the ones this document treats as cacheable without user consent. Based on this information, you can determine whether a user's address may be invalid, in which case you prompt the user for a corrected address during their next interaction with your application.

  • Data from the Verdict object

    • inputGranularity
    • validationGranularity
    • geocodeGranularity
    • addressComplete
    • hasUnconfirmedComponents
    • hasInferredComponents
    • hasReplacedComponents
  • Data from the AddressComponent object

    • confirmationLevel
    • inferred
    • spellCorrected
    • replaced
    • unexpected
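The list above can be turned into a small extraction step. The sketch below is one possible approach, not a definitive implementation: it copies only the listed Verdict and AddressComponent fields out of a response and leaves the address text behind. The function name and the output layout are hypothetical, and the code treats a missing boolean as false on the assumption that the JSON response omits false values.

```python
VERDICT_FIELDS = ("inputGranularity", "validationGranularity", "geocodeGranularity")
VERDICT_BOOLEANS = (
    "addressComplete",
    "hasUnconfirmedComponents",
    "hasInferredComponents",
    "hasReplacedComponents",
)
COMPONENT_BOOLEANS = ("inferred", "spellCorrected", "replaced", "unexpected")


def extract_cacheable_flags(response):
    """Keep only the flag fields listed above; drop all address text."""
    result = response.get("result", {})
    verdict = result.get("verdict", {})
    components = result.get("address", {}).get("addressComponents", [])

    cached_verdict = {name: verdict.get(name) for name in VERDICT_FIELDS}
    # Assumption: a boolean that is absent from the response means false.
    cached_verdict.update({name: verdict.get(name, False) for name in VERDICT_BOOLEANS})

    cached_components = []
    for component in components:
        entry = {"confirmationLevel": component.get("confirmationLevel")}
        entry.update({name: component.get(name, False) for name in COMPONENT_BOOLEANS})
        cached_components.append(entry)

    return {"verdict": cached_verdict, "components": cached_components}


# Hand-written sample in the shape of an API response (not real output).
sample_response = {
    "result": {
        "verdict": {
            "inputGranularity": "PREMISE",
            "validationGranularity": "ROUTE",
            "geocodeGranularity": "ROUTE",
            "hasInferredComponents": True,
        },
        "address": {
            "formattedAddress": "1 Main Street, Springfield",
            "addressComponents": [
                {
                    "componentName": {"text": "Main Street"},
                    "componentType": "route",
                    "confirmationLevel": "CONFIRMED",
                    "spellCorrected": True,
                }
            ],
        },
    }
}
flags = extract_cacheable_flags(sample_response)
```

Keeping the extraction in one function makes it easier to audit exactly which fields reach your database.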

If you want to cache any information about the actual address, then that data must be cached only with the user's consent. This ensures that the user is well aware why a particular service is storing their address and they are OK with the terms of sharing their address.

An example of user consent would be direct interaction with an ecommerce address form on a checkout page. There is an understanding that you will cache and process the address for the purposes of shipping a package.

With the user's consent, you can cache formattedAddress and other key components from the response. However, in a headless scenario, a user cannot provide consent because the address validation happens in the backend. Therefore, you can cache only very limited information in this headless scenario.

Understanding the response

If the Address Validation API response contains the following markers, then you can be confident the input address is of deliverable quality:

  • The addressComplete marker in the Verdict object is true,
  • The validationGranularity in the Verdict object is PREMISE or SUB_PREMISE
  • None of the AddressComponent objects are marked as:
    • inferred (Note: inferred=true can happen when addressComplete=true)
    • spellCorrected
    • replaced
    • unexpected, and
  • confirmationLevel: The confirmation level on the AddressComponent is set to CONFIRMED or UNCONFIRMED_BUT_PLAUSIBLE
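The markers above can be checked in a few lines. The following is a sketch under stated assumptions rather than an official check: the function name is_deliverable is hypothetical, the response is the parsed JSON body of a validateAddress call, and a boolean missing from the response is treated as false.

```python
GOOD_GRANULARITIES = {"PREMISE", "SUB_PREMISE"}
GOOD_CONFIRMATION_LEVELS = {"CONFIRMED", "UNCONFIRMED_BUT_PLAUSIBLE"}
ISSUE_FLAGS = ("inferred", "spellCorrected", "replaced", "unexpected")


def is_deliverable(response):
    """Return True only if every marker listed above is satisfied."""
    result = response.get("result", {})
    verdict = result.get("verdict", {})
    components = result.get("address", {}).get("addressComponents", [])

    # Verdict-level markers.
    if not verdict.get("addressComplete", False):
        return False
    if verdict.get("validationGranularity") not in GOOD_GRANULARITIES:
        return False

    # Component-level markers: no issue flags, acceptable confirmation level.
    for component in components:
        if any(component.get(flag, False) for flag in ISSUE_FLAGS):
            return False
        if component.get("confirmationLevel") not in GOOD_CONFIRMATION_LEVELS:
            return False
    return True


# Hand-written samples in the shape of an API response (not real output).
good_response = {
    "result": {
        "verdict": {"addressComplete": True, "validationGranularity": "PREMISE"},
        "address": {
            "addressComponents": [
                {"componentType": "route", "confirmationLevel": "CONFIRMED"},
                {"componentType": "street_number", "confirmationLevel": "CONFIRMED"},
            ]
        },
    }
}
corrected_response = {
    "result": {
        "verdict": {"addressComplete": True, "validationGranularity": "PREMISE"},
        "address": {
            "addressComponents": [
                {
                    "componentType": "route",
                    "confirmationLevel": "CONFIRMED",
                    "spellCorrected": True,
                }
            ]
        },
    }
}
```

Here is_deliverable(good_response) evaluates to True, while is_deliverable(corrected_response) evaluates to False because one component was spell corrected.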

If the API response does not contain the above markers, then the input address was likely of poor quality, and you can cache flags in your database to reflect that. Cached flags indicate that the address as a whole is of poor quality, while more detailed flags such as spellCorrected indicate the specific type of address quality issue. On the next customer interaction with an address flagged as poor quality, you can call the Address Validation API with the existing address. The Address Validation API returns the corrected address, which you can display via a UI prompt. Once the customer accepts the formatted address, you can cache the following from the response:

  • formattedAddress
  • postalAddress
  • addressComponent componentNames, or
  • UspsData standardizedAddress

Implementing a headless address validation

Based on the discussion above:

  • It is often necessary to cache some part of the response from the Address Validation API for business reasons.
  • However, the Google Maps Platform Terms of Service restrict what data can be cached.

In the following section, we discuss a two-step process for implementing high volume address validation in a way that conforms to the Terms of Service.

Step 1:

In the first step we will look into how to implement a high volume address validation script from an existing data pipeline. This process will allow you to store specific fields from the Address Validation API response in a Terms of Service compliant way.

Diagram A: The following diagram shows how a data pipeline can be enhanced with High Volume Address Validation logic.


  • According to the Terms of Service, you can cache addressComplete, validationGranularity, and validationFlags when validating addresses in a headless fashion.

  • You can cache addressComplete, validationGranularity, validationFlags, and PlaceID against a particular UserID in the customer database.

Thus during this step of the implementation we will cache the above mentioned fields against the UserID.
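One way to picture this caching step is a small table keyed by the user ID. The sketch below uses Python's built-in sqlite3 module purely for illustration. The table name, column names, and helper functions are hypothetical, and your own customer database will likely differ.

```python
import sqlite3


def create_flag_cache(connection):
    """Create a table holding one row of validation flags per user ID."""
    connection.execute(
        """
        CREATE TABLE IF NOT EXISTS address_validation_flags (
            user_id TEXT PRIMARY KEY,
            address_complete INTEGER NOT NULL,
            validation_granularity TEXT,
            validation_flags TEXT NOT NULL,
            place_id TEXT
        )
        """
    )


def cache_flags(connection, user_id, address_complete, validation_granularity,
                validation_flags, place_id=None):
    """Store (or overwrite) the flags for a user; no address text is stored."""
    connection.execute(
        "INSERT OR REPLACE INTO address_validation_flags VALUES (?, ?, ?, ?, ?)",
        (user_id, int(address_complete), validation_granularity,
         ",".join(sorted(validation_flags)), place_id),
    )
    connection.commit()


def load_flags(connection, user_id):
    """Return the cached flags for a user, or None if nothing is cached."""
    row = connection.execute(
        "SELECT address_complete, validation_granularity, validation_flags, place_id "
        "FROM address_validation_flags WHERE user_id = ?",
        (user_id,),
    ).fetchone()
    if row is None:
        return None
    return {
        "addressComplete": bool(row[0]),
        "validationGranularity": row[1],
        "validationFlags": row[2].split(",") if row[2] else [],
        "placeId": row[3],
    }


connection = sqlite3.connect(":memory:")
create_flag_cache(connection)
cache_flags(connection, "user-123", False, "ROUTE", {"inferred", "spellCorrected"})
cached = load_flags(connection, "user-123")
```

Storing validationFlags as a comma-separated string keeps the sketch short. A separate table with one row per flag would work as well.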

For more information see details on the actual data structure.

Step 2:

In step 1, we collected feedback that some addresses in the input dataset may not be of high quality. In the next step, we will take these flagged addresses and present them to the user and get their consent to correct the stored address.

Diagram B: This diagram shows what an end-to-end integration of the user consent flow could look like:


  1. When the user logs in, first check whether you have cached any validation flags in your system that indicate a problem, such as:

    • addressComplete is false
    • validationGranularity is not PREMISE or SUB_PREMISE
    • validationFlags contains inferred, spellCorrected, replaced, or unexpected

     If there are no such flags, there is high confidence that the existing cached address is of good quality, and it can be used.
  2. If there are flags, you should present the user with a UI to correct/update their address.

  3. You can call the Address Validation API again with the updated or cached address and present the corrected address to the user to confirm.

  4. If the address is of good quality, the Address Validation API returns a formattedAddress.

  5. You can either present that address to the user if corrections have been made, or silently accept if there are no corrections.

  6. Once the user accepts, you can cache the formattedAddress in the database.

Pseudo code implementing Step 2:

if addressComplete is FALSE
   OR validationGranularity is not PREMISE or SUB_PREMISE
   OR validationFlags contains inferred, spellCorrected, replaced, or unexpected
{
    # There are issues with the existing cached address
    Call UI to present the address to the user
}
else {
    # The existing cached address is good
    Proceed to checkout
}
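The pseudo code above could be written in Python roughly as follows. This is a sketch, not the reference implementation: the function names, the returned step names, and the shape of the cached record (a dictionary with addressComplete, validationGranularity, and validationFlags keys) are assumptions made for illustration.

```python
GOOD_GRANULARITIES = {"PREMISE", "SUB_PREMISE"}
ISSUE_FLAGS = {"inferred", "spellCorrected", "replaced", "unexpected"}


def needs_address_review(cached):
    """Return True if the cached flags point to a problem with the address."""
    if not cached.get("addressComplete", False):
        return True
    if cached.get("validationGranularity") not in GOOD_GRANULARITIES:
        return True
    # Any overlap between the cached flags and the issue flags is a problem.
    return bool(ISSUE_FLAGS & set(cached.get("validationFlags", [])))


def next_step(cached):
    """Map the cached flags to the next step in the user flow."""
    if needs_address_review(cached):
        # There are issues with the existing cached address.
        return "prompt_user_for_address"
    # The existing cached address is good.
    return "proceed_to_checkout"


flagged_user = {
    "addressComplete": True,
    "validationGranularity": "PREMISE",
    "validationFlags": ["spellCorrected"],
}
clean_user = {
    "addressComplete": True,
    "validationGranularity": "SUB_PREMISE",
    "validationFlags": [],
}
```

With these samples, next_step(flagged_user) selects the address prompt and next_step(clean_user) proceeds to checkout.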

Conclusion

High Volume Address Validation is a common use case that you are likely to encounter in many applications. This document demonstrates some scenarios and a design pattern for implementing such a solution in a way that conforms to the Google Maps Platform Terms of Service.

We have also written a reference implementation of High Volume Address Validation as an open source library on GitHub. Check it out to get started building with High Volume Address Validation quickly. Also visit the article on design patterns for using the library in different scenarios.

Next Steps

Download the Improve checkout, delivery, and operations with reliable addresses Whitepaper and view the Improving checkout, delivery, and operations with Address Validation Webinar.


Contributors

Google maintains this article. The following contributors originally wrote it.
Principal authors:

Henrik Valve | Solutions Engineer
Thomas Anglaret | Solutions Engineer
Sarthak Ganguly | Solutions Engineer