Using Automated Geospatial Functions in Grid Analysis to Select Ideal ATM Locations

Suvratanu Baidya, November 2, 2016

The banking industry generates a huge volume of data on a day-to-day basis. To differentiate themselves from the competition, banks are increasingly adopting big data analytics as part of their core strategy.

One challenge banks face is the selection of ideal geographic locations for new automated teller machines (ATMs). Two analytic methods are used to determine optimal locations:

  • Key performance indicators (KPIs) analysis, which we covered in depth in a previous blog post and will summarize here.
  • Grid analysis, which will be the focus of this post.

There are hundreds of KPIs to consider for each small area, which we refer to here as a “grid.” At the country level there are millions of grids, so the analysis generates a huge volume of data.

Banks must be strategic in picking the next location for their ATMs and branches. That is why they rely on 100+ KPIs to determine which area offers the greatest potential for a new ATM.

A bank with a nationwide or international presence faces a problem: how can it effectively analyse potential locations within small geographic areas? Executing this analysis manually is tedious and time-consuming, so we developed an automated process.

Sample Use Case: Automated Geospatial Functions in Grid Analysis

First, let’s consider a geographic area in which we need to find the best location for ATMs. We take as an example the U.S. state of Alaska, which we divide into grids. In this example, we use grids of 10 kilometres by 10 kilometres. We begin by overlaying a grid of these dimensions on the state map, which results in the image below.

Geospatial Grid
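
The uniform division described above can be sketched in a few lines of Java. This is only an illustration under stated assumptions: coordinates are taken to be planar and expressed in kilometres (real shape files would first need projecting), and the class and method names (UniformGridSketch, tile) are hypothetical rather than part of the solution described in this post.

```java
import java.awt.geom.Rectangle2D;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: tiles a bounding box with fixed-size square cells. */
final class UniformGridSketch {

    /**
     * Returns square cells of side cellSizeKm that cover the bounding box.
     * Assumption: coordinates are planar and measured in kilometres.
     */
    static List<Rectangle2D.Double> tile(Rectangle2D bounds, double cellSizeKm) {
        if (cellSizeKm <= 0) {
            throw new IllegalArgumentException("cellSizeKm must be positive");
        }
        // Round up so that partial cells at the far edges are still covered.
        int cols = (int) Math.ceil(bounds.getWidth() / cellSizeKm);
        int rows = (int) Math.ceil(bounds.getHeight() / cellSizeKm);

        List<Rectangle2D.Double> cells = new ArrayList<>(rows * cols);
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                cells.add(new Rectangle2D.Double(
                        bounds.getX() + c * cellSizeKm,
                        bounds.getY() + r * cellSizeKm,
                        cellSizeKm,
                        cellSizeKm));
            }
        }
        return cells;
    }

    public static void main(String[] args) {
        // A made-up 35 km x 20 km bounding box divided into 10 km cells.
        Rectangle2D bounds = new Rectangle2D.Double(0, 0, 35, 20);
        List<Rectangle2D.Double> cells = tile(bounds, 10);
        System.out.println(cells.size() + " cells");
    }
}
```

For the 35 km by 20 km box in the example, this yields 8 cells (4 columns by 2 rows). The cells in the last column overhang the box, which hints at the limitation of a purely uniform grid: it knows nothing about boundaries or unusable terrain.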

With this uniform grid, however, it is difficult to identify the grids that are covered with snow or otherwise uninhabitable. Many of the remaining grids are also shared across multiple districts, so additional effort is needed to re-divide them along district boundaries.

So we create grids to meet these criteria:

  • Be reflective of state or district boundaries
  • Be robust enough to handle the volume of data when dealing with millions of records
  • Be smart enough to consider only the habitable areas and to omit unusable areas, such as bodies of water, forests, etc.

The ideal solution would, with minimal manual effort, account for all of the underlying sub-shapes, such as districts, while also excluding uninhabitable areas.

Continuing with our example, we take a small area of the state and omit the lake in the middle. The district-level breakdowns are maintained during grid division. This makes the grids more meaningful, as illustrated below, and leaves us with less manual work to remove unnecessary grids.

Geospatial Grid 2

For this process, we perform the following steps:

  • Shape the original area for grid division.
  • Remove irrelevant areas from the original area.
  • Divide each relevant area into grids.
  • Segment and store each relevant area of grids.
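
The four steps above can be sketched roughly as follows. This is a minimal illustration, not the production process: it uses the standard java.awt.geom.Area class as a stand-in for the database's geospatial functions, it keeps the resulting pieces in an in-memory list instead of storing them in a table, and the names GridPipelineSketch, habitable and divide are hypothetical. Coordinates are again assumed to be planar, in kilometres.

```java
import java.awt.geom.Area;
import java.awt.geom.Rectangle2D;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of the shape / remove / divide / store steps. */
final class GridPipelineSketch {

    /** Steps 1-2: copy the original area and subtract every unusable region. */
    static Area habitable(Area original, List<Area> unusable) {
        Area result = new Area(original); // copy, so the input is not modified
        for (Area region : unusable) {
            result.subtract(region);
        }
        return result;
    }

    /** Steps 3-4: clip each grid cell to the habitable area and keep non-empty pieces. */
    static List<Area> divide(Area habitable, double cellSize) {
        Rectangle2D b = habitable.getBounds2D();
        List<Area> pieces = new ArrayList<>();
        for (double y = b.getMinY(); y < b.getMaxY(); y += cellSize) {
            for (double x = b.getMinX(); x < b.getMaxX(); x += cellSize) {
                Area cell = new Area(new Rectangle2D.Double(x, y, cellSize, cellSize));
                cell.intersect(habitable);   // drop the parts that are unusable
                if (!cell.isEmpty()) {       // a cell lying fully inside a lake vanishes
                    pieces.add(cell);
                }
            }
        }
        return pieces;
    }

    public static void main(String[] args) {
        // Made-up data: a 30 km x 30 km area with a lake covering its centre.
        Area region = new Area(new Rectangle2D.Double(0, 0, 30, 30));
        Area lake = new Area(new Rectangle2D.Double(8, 8, 14, 14));

        List<Area> pieces = divide(habitable(region, List.of(lake)), 10);
        System.out.println(pieces.size() + " grid pieces kept");
    }
}
```

In this made-up example the centre cell lies entirely inside the lake and is dropped, while the eight surrounding cells are kept with the lake portion cut away. A real run would operate on polygons read from shape files rather than on rectangles.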

We next employ a series of user-defined functions (UDFs) built with Vertica geospatial functions and Java. Vertica allows us to write custom code in Java, which can then be deployed in the database and called from Structured Query Language (SQL) queries like built-in functions. Considering the volume of data and the processing time involved, we can also deploy some of the functions as stand-alone executable Java Archive (JAR) files that run without logging into the Vertica database.

Case Study: Grid Analysis for a Major Banking Client

We developed a solution very similar to this use case for one of our banking clients. Some of the challenges we faced and the solutions developed included:

  • Challenge – If the parent shape has sub-shapes within it (e.g., state files inside a country or districts inside of states), the boundaries of the sub-shapes need to be retained while creating the grid.
    Solution – We designed the solution so the process loops through all of the sub-shapes and retains each child shape while creating the grids. This maintains the sub-shape boundaries even after grid division. If a grid falls across two districts, the process divides it into two shapes, one for each district. This ensures we have separate grids for an area overlaying multiple child shapes.
  • Challenge – The sheer volume of data resulting from the operation. Each operation can create millions of rows within each of the multiple steps, creating unnecessary load on the database.
    Solution – We created temporary tables to hold the intermediate data that is not needed once a step of the operation completes, and stored only the final relevant data in the database.
  • Challenge – How to identify the individual grids within each area once multiple, overlaying grids were created. Most of the grids would be squares or rectangles, rendering it impossible to tell each one apart from the others by merely looking at them or by looking at the coordinates.
    Solution – We saved every grid together with information on its parent sub-shape, adding this detail to each row so that a grid can be traced back to the sub-shape it belongs to.
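
The first and third solutions above, splitting a grid along sub-shape boundaries and storing parent information with each piece, can be illustrated with a short Java sketch. It is an assumption-laden stand-in rather than the client implementation: java.awt.geom.Area replaces the Vertica geospatial functions, the GridPiece class and its id format are hypothetical, and the district names and coordinates are invented.

```java
import java.awt.geom.Area;
import java.awt.geom.Rectangle2D;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: one shared grid, clipped per district and labelled. */
final class DistrictGridSketch {

    /** One stored row: a grid piece plus what is needed to identify it. */
    static final class GridPiece {
        final String district; // parent sub-shape
        final int row;         // row of the cell in the shared grid
        final int col;         // column of the cell in the shared grid
        final Area shape;

        GridPiece(String district, int row, int col, Area shape) {
            this.district = district;
            this.row = row;
            this.col = col;
            this.shape = shape;
        }

        /** Hypothetical identifier format, e.g. "A-r0-c1". */
        String id() {
            return district + "-r" + row + "-c" + col;
        }
    }

    /** Clips every cell of one shared grid to each district it overlaps. */
    static List<GridPiece> split(Map<String, Area> districts,
                                 Rectangle2D bounds, double cellSize) {
        int cols = (int) Math.ceil(bounds.getWidth() / cellSize);
        int rows = (int) Math.ceil(bounds.getHeight() / cellSize);
        List<GridPiece> pieces = new ArrayList<>();
        // Simple nested loops for clarity; no spatial index is used here.
        for (Map.Entry<String, Area> d : districts.entrySet()) {
            for (int r = 0; r < rows; r++) {
                for (int c = 0; c < cols; c++) {
                    Area piece = new Area(new Rectangle2D.Double(
                            bounds.getX() + c * cellSize,
                            bounds.getY() + r * cellSize,
                            cellSize, cellSize));
                    piece.intersect(d.getValue());
                    if (!piece.isEmpty()) {
                        pieces.add(new GridPiece(d.getKey(), r, c, piece));
                    }
                }
            }
        }
        return pieces;
    }

    public static void main(String[] args) {
        // Two invented districts sharing a border at x = 15 km.
        Map<String, Area> districts = new LinkedHashMap<>();
        districts.put("A", new Area(new Rectangle2D.Double(0, 0, 15, 10)));
        districts.put("B", new Area(new Rectangle2D.Double(15, 0, 15, 10)));

        Rectangle2D bounds = new Rectangle2D.Double(0, 0, 30, 10);
        for (GridPiece p : split(districts, bounds, 10)) {
            System.out.println(p.id());
        }
    }
}
```

In this example the middle cell straddles both districts, so it produces two pieces, one labelled with district A and one with district B. Because every district is clipped against the same grid, the cells stay aligned across district borders, and the stored district, row and column make otherwise identical squares distinguishable.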

This process can be run on a shape file that contains many child shapes or none at all. When the parent shape contains children, all of the grids are aligned to the same formation, regardless of where the child shapes lie. To identify the grids, we stored the parent information with each grid, so that the user can tell which sub-shape a grid belongs to. This process drastically reduces manual effort by automatically dividing the grids for KPI-based analysis.

Conclusion

We implemented the above solution by dividing geographic areas into grids and then calculating hundreds of KPIs for each area. The automated process significantly reduced manual effort. This solution can also be applied in other industries, such as retail, for example to find an ideal store location.

To learn more about the use of data-driven geospatial analysis for banking or other industry applications, contact us today.