👃 ☹️ 💶 In the access area. Find the distance from a point to an area and reduce reverse geocoding requests 🔍 🤚🏼 👼🏿

More than once I had to implement the functional of calculating the distance from a certain geographical point to an area on the map - for example, to the Moscow Ring Road. As a result, I found two ways to solve the problem that showed good results, and now we regularly use them in production. I will describe them in the first part of the article. And in the second I’ll show how you can cache geodata to use geocoder less.

Part one. Two ways to find the distance from a point to an area on a map

If your mobile application is truly mobile, it works with the coordinates of the device. The location of the user (and device) affects various business indicators of the application, such as delivery cost, complexity factor, etc.
Below I will show examples of implementation of algorithms in python using the scipy and shapely libraries.
For geocoding we use Google Maps. It suits both functionality and cost of use. At the time of this writing, Google allows you to make the first 20,000 geocoding requests per month for free.

Method 1. We calculate the route based on the vertices of the polygon

Suppose we need to find the distance from a point somewhere in the Moscow region to the Moscow Ring Road. A real path is needed, not a geometric one. Therefore, first we build a landfill from the exit points from the Moscow Ring Road, and they do not coincide with the vertices of the road outline on the map.

exit_coordinates: List[Tuple[float, float]]
latitude: float
longitude: float

To work with geometry, we use the shapely library . With its help it is easy to determine whether the point of interest to us is inside the polygon or not. If it is, the distance is obviously 0.

from shapely.geometry import Polygon, Point
polygon = Polygon(exit_coordinates)
polygon.contains(Point(latitude,longitude))

We are interested in the case when the point is outside the polygon. The task of finding the nearest objects (exits from the Moscow Ring Road) to the starting point is quite typical in mathematics. Usually it is solved using a KD tree. So let's do it. In python, the tree is implemented in the scipy library . We will find the nearest peaks from the exits from the Moscow Ring Road to our point.

from scipy.spatial import KDTree
kd_tree = KDTree(exits_coordinates)
dists, indexes = kd_tree.query((latitude, longitude), k=3)
nearest_coordinates = list()
for _, index in zip(dists, indexes):
		nearest_coordinates.append(exits_coordinates[index])

The number k - the number of peaks - should not be too small, because the nearest exit from the Moscow Ring Road may be temporarily blocked. Or it may be somewhere beyond the river and not have a direct path to our point. Too large k is also useless for us, because a suitable exit is located in several nearby peaks. Now let's turn to the Google Maps service. It already has a functionality for finding distances from an array of points to a specific point - the Distance Matrix API.
Upd: the Distance Matrix API service bills the route to each nearest peak separately. Thus, this method is more expensive than the second, described below. thanksovetnikov

import googlemaps
gmaps_client = googlemaps.Client(key=settings.GOOGLE_MAPS_API_KEY)
gmaps_client.distance_matrix(
    origins=nearest_coordinates,
    destinations=(latitude, longitude),
    mode='driving',
)

We can only parse the answer. Usually it has several routes, and the very first is not always the shortest. Therefore, we choose the appropriate value.

Method 2. We calculate the route from the center of the landfill

Now, in addition to the vertices of the polygon, we define a certain point inside it, from which we will build all the routes of Google Maps. We will use the Directions API, which will build routes for us, and choose the shortest of them.

start_point: Tuple[float, float],
destination: Tuple[float, float]
polygon: shapely.geometry.Polygon
gmaps_client = googlemaps.Client(key=settings.GOOGLE_MAPS_API_KEY)
driving_path = gmaps_client.directions(start_point, destination)

We start iteratively from the end to add the lengths of the path segments until the coordinate of the beginning of the segment is inside the polygon (parsing omitted):

for step in reversed(driving_path):
    start_point = step['start_location']['lat'], step['start_location']['lng']
    if is_inside_polygon(start_point, self.polygon):
        end_point = step['end_location']['lat'], step['end_location']['lng']
        distance += get_part_outside_polygon(
            get_line(start_point, end_point), polygon
        ) * step['distance']['value']
        break
    distance += step['distance']['value']

We can only add part of the segment outside the polygon. To do this, using the shapely library, we build a geometric line between the coordinates of the beginning and end of the path and find how much of its length lies outside the polygon. The same percentage of the length is calculated for the obtained segment of the path.

A path is a terrain route on real roads with natural curvature. If this is a long straight line (avenue or highway), then our approximation will allow us to calculate the route exactly in percentage terms.
If the road crossing the landfill is curved enough, such an approximation will be incorrect. But the curved parts in the Google Maps route itself are usually of short length, and the errors in their calculation will not affect the result.

from shapely.geometry import LineString, Polygon
line = LineString(point1, point2)
part_outside_len = float(line.difference(polygon).length) / float(line.length)

To be honest, I did not compare these two methods. I have been using them for more than a year, both work without failures. I decided not to paint their implementation in detail. Instead, he opened access to his geo library. Liba can also do regular geocoding, including with efficient caching. Issues and pull-requests are welcome!

Part two. Save Reverse Geocoding Requests

Often we need to determine the addresses of points located nearby and corresponding to one object in the real world. Each time, making a request to an external geocoding system is not cool, and here the question of caching reasonably arises.
Typically, the client sends the coordinates with excessive accuracy, for example, 59.80447691, 30.39570337. But how many signs in the fractional part will be significant for us?
For the lazy and in a hurry, the answer is four. For everyone else, see the explanation below.

First, a little geography.

The equator is 40,075.696 km long, and it is zero latitude.
Meridians are lines of longitude, perpendicular to lines of latitude. The length of any meridian is 40,008.55 km
Degree of latitude - 40,008.55 km / 360 = 111.134861111 km. So, one hundredth gives a change of about a kilometer, and one ten thousandth gives a change of 10 meters.
, , , 60 ( -) 2 . — 40 075,696 * 0.5 / 360 = 55,66066888 , — 5 .

An error of 1/10000 degrees gives a rectangle of error of 5-10 by 10 meters. This will allow us to exactly “get” into the building, as the building is much larger than the point. And if, due to an error, the point does not fall into the building, Google will still determine the closest to it.
Rounding up to three characters is risky, accuracy is already reduced significantly. And up to five - it makes no sense, because GPS accuracy in smartphones usually does not exceed just a few meters.

Thus, if there is a value in the cache with the key “address: 59.8045,30.3957”, then all coordinates, when rounded to 4 characters, the same key is obtained, correspond to one geocoded address. We make fewer requests - we work faster and pay less for using the geocoding service. Profit!

In the access area. Find the distance from a point to an area and reduce reverse geocoding requests

Part one. Two ways to find the distance from a point to an area on a map

Method 1. We calculate the route based on the vertices of the polygon

Method 2. We calculate the route from the center of the landfill

Part two. Save Reverse Geocoding Requests

First, a little geography.

More articles: