Concept | Geo join recipe

The Geo join recipe is a visual recipe that joins two or more datasets using geographic features that meet certain criteria, such as points within a specific distance, features that intersect, or points within a specific geography.

Note

This recipe is similar to the Geo-join processor but offers more functionality. The processor can only perform a geographic nearest-neighbor join between two datasets that have latitude and longitude coordinates.

Tip

If you are unfamiliar with joins, you might want to review the Join recipe concept before moving on here.

Use case

Let’s say you want to optimize the use of different city WiFi hotspots. You might want to combine two datasets: one containing the coverage areas of different hotspots and another containing foot traffic data.

In this case, you can use the Geo join recipe to combine these datasets. With the Is contained within matching operator, you can match foot traffic geopoints (coordinates) to the coverage areas that contain them. The matching operators are described in the next section.

A slide showing the input and output datasets for the example WiFi traffic use case.

Note

  • Dataiku lets you delineate a geographic area using polygons.

  • To understand the different ways in which you can visualize the results of a Geo join, take a look at the reference documentation on Map Charts.
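
To make the Is contained within match from the WiFi use case more concrete, here is a minimal plain-Python sketch of a point-in-polygon join. It is not Dataiku's implementation: the dataset names (`hotspots`, `foot_traffic`), the helper functions, and the coordinates are all hypothetical, and the ray-casting test treats coordinates as planar, which is only a rough approximation for small areas.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]  # (longitude, latitude), treated as planar x, y


def point_in_polygon(point: Point, polygon: List[Point]) -> bool:
    """Ray-casting test: count how many polygon edges a horizontal ray crosses."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point matter.
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def contained_within_join(
    points: Dict[str, Point], areas: Dict[str, List[Point]]
) -> List[Tuple[str, str]]:
    """Return (point_id, area_id) pairs where the point lies inside the area."""
    rows = []
    for point_id, point in points.items():
        for area_id, polygon in areas.items():
            if point_in_polygon(point, polygon):
                rows.append((point_id, area_id))
    return rows


# Hypothetical coverage areas (polygons) and foot traffic geopoints.
hotspots = {
    "hotspot_a": [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)],
    "hotspot_b": [(5.0, 0.0), (9.0, 0.0), (9.0, 4.0), (5.0, 4.0)],
}
foot_traffic = {
    "p1": (1.0, 1.0),    # inside hotspot_a
    "p2": (6.0, 2.0),    # inside hotspot_b
    "p3": (10.0, 10.0),  # outside both areas
}

matches = contained_within_join(foot_traffic, hotspots)
print(matches)
```

In this sketch, unmatched points such as `p3` are simply dropped, which mirrors an inner-join style result. Points that fall exactly on a polygon boundary are not handled specially here.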

Geo join matching operators

Instead of the logical or comparison operators used in a standard join, the Geo join recipe matches rows using a variety of geospatial matching operators.

Important

Key columns must have geometry or geopoint storage types.

A Dataiku screenshot of the Geo join settings window.

| Matching operator | Matches if |
|---|---|
| Is within distance of | Points or geometries are within a user-defined distance of each other. |
| Is beyond distance of | Points or geometries are beyond a user-defined distance from each other. |
| Contains | Points or geometries in the left dataset contain points or geometries in the right dataset. |
| Is contained within | Points or geometries in the left dataset are contained within points or geometries in the right dataset. |
| Intersects | Points or geometries in key columns have points in common. |
| Is disjoint to | Points or geometries in key columns do not have any points in common. |
| Touches | Points or geometries in key columns have points in common only on a boundary. Point-to-point matches are not included here. |
| Is strictly equal to | Geometries occupy the same space, even if the order of their vertices differs. |
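
As an illustration of the two distance-based operators, the following sketch compares two geopoints using the haversine great-circle distance. This is an assumption made for the example, not a description of how the recipe computes distance internally. The function names, the kilometer unit, and the choice to treat a distance exactly equal to the threshold as "within" are all hypothetical.

```python
import math
from typing import Tuple

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation

LatLon = Tuple[float, float]  # (latitude, longitude) in decimal degrees


def haversine_km(a: LatLon, b: LatLon) -> float:
    """Great-circle distance between two points on a sphere, in kilometers."""
    lat1, lon1 = a
    lat2, lon2 = b
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    h = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def is_within_distance(a: LatLon, b: LatLon, threshold_km: float) -> bool:
    """Sketch of 'Is within distance of' (boundary counted as within, by assumption)."""
    return haversine_km(a, b) <= threshold_km


def is_beyond_distance(a: LatLon, b: LatLon, threshold_km: float) -> bool:
    """Sketch of 'Is beyond distance of', the complement of the test above."""
    return haversine_km(a, b) > threshold_km


paris = (48.8566, 2.3522)
london = (51.5074, -0.1278)

print(round(haversine_km(paris, london), 1))
print(is_within_distance(paris, london, 400.0))
print(is_beyond_distance(paris, london, 400.0))
```

The two predicates are exact complements in this sketch, so every pair of points satisfies exactly one of them for a given threshold.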

What’s next?

Get some hands-on practice using the Geo join recipe with the Tutorial | Geo join recipe!

Note

Detailed information can be found in the reference documentation on Geo joins.