Working with Joins in LookML

LookML
Version

On this Page
Docs Menu

Joins enable the exploration of data from more than one view at the same time. You can “join” together different views in order to see how your data relates to each other.

For example, your database might include tables like order_items, orders, and users. We can use joins to explore data from all these tables at the same time. This page will walk through joining in LookML, including specific join parameters and joining patterns.

Joins Start with an Explore

Joins are defined in the model file to establish the relationship between an explore and a view. Joins connect one or more views to a single explore, either directly, or through another joined view.

Let’s consider two database tables order_items and orders. Once you have generated views for both of these tables, you can declare one or more of them as an explore in the model file, like this:

- explore: order_items
explore: order_items { … }

When we run a query from the order_items explore, order_items will appear in the FROM clause of the generated SQL, like this:

SELECT … FROM order_items

We can also join additional information to our order_items explore. For example, suppose we wanted to add data about the order that the order_item is a part of. You might do something like this:

- explore: order_items joins: - join: orders type: left_outer relationship: many_to_one sql_on: ${order_items.order_id} = ${orders.id}
explore: order_items { join: orders { type: left_outer relationship: many_to_one sql_on: ${order_items.order_id} = ${orders.id} ;; } }

The LookML above accomplishes two things. First, you will be able to see fields from both orders and order_items in the UI, like so:

Second, the LookML describes how to join orders and order_items together. That LookML would translate to the following SQL:

SELECT … FROM order_items LEFT JOIN orders ON order_items.order_id = orders.id

Below, we’ll step through these LookML parameters one-by-one to explain them in greater detail. You can also read the join parameter reference page to learn more about how this LookML is translated into SQL.

Join Parameters

There are four main parameters that are used to join: joins, join, type, relationship, and sql_on. Let’s walk through each of them from the example above.

Step 1: Starting the Explore

First, create the order_items explore, like so:

- explore: order_items joins:
explore: order_items { … }

Step 2: join

If you want to join a table, you must first declare it in a view. In this example, orders is an existing view in our model.

We can then use the join parameter to declare which view to join to order_items:

- explore: order_items joins: - join: orders
explore: order_items { join: orders { … } }

Step 3: type

We can next consider which type of join to perform. Looker supports LEFT JOIN, INNER JOIN, FULL OUTER JOIN, and CROSS JOIN. These correspond to the type parameter values of left_outer, inner, full_outer, and cross.

- explore: order_items joins: - join: orders type: left_outer
explore: order_items { join: orders { type: left_outer } }

The default value of type is left_outer, which will generally be your most popular join type.

Step 4: relationship

We can then define a join relationship between order_items and orders. Properly declaring the relationship of a join is important for Looker to calculate accurate measures. The relationship is defined from the order_items explore to the orders view. The possible options are one_to_one, many_to_one, one_to_many, and many_to_many.

In our example, there can be many order_items for a single order, so the relationship is many_to_one:

- explore: order_items joins: - join: orders type: left_outer relationship: many_to_one
explore: order_items { join: orders { type: left_outer relationship: many_to_one } }

If you do not include a relationship in your join, Looker will default to many_to_one.

Step 5: sql_on

The final step is declaring how to join these two tables together. This can be done with either sql_on or foreign_key. We usually suggest sql_on since it can do everything foreign_key can do, but is typically easier to understand.

sql_on is equivalent to the ON clause in the generated SQL for a query. With this parameter, we can declare which fields should be matched up to perform the join:

- explore: order_items joins: - join: orders type: left_outer relationship: many_to_one sql_on: ${order_items.order_id} = ${orders.id}
explore: order_items { join: orders { type: left_outer relationship: many_to_one sql_on: ${order_items.order_id} = ${orders.id} ;; } }

We can also write more complex joins. For example, you may want to join only orders with id greater than 1000:

- explore: order_items joins: - join: orders type: left_outer relationship: many_to_one sql_on: ${order_items.order_id} = ${orders.id} AND ${orders.id} > 1000
explore: order_items { join: orders { type: left_outer relationship: many_to_one sql_on: ${order_items.order_id} = ${orders.id} AND ${orders.id} > 1000 ;; } }

Check out substitution operators to learn more about the ${ ... } syntax in these examples.

Step 6: Testing

We’ve now created a new join in LookML! You can test that this join is functioning as expected by going to the Order Items Explore. You should see fields from both order_items and orders.

Check out Model Development to learn more about testing LookML changes.

Joining Through Another View

You may want to join a view to an explore through another view. In the example above, we joined orders to order_items via the order_id field. We might also want to join users data to the order_items explore, even though there is no common field between those two views. This can be done by joining through the orders view.

We can use sql_on or foreign_key to join users to orders instead of order_items. We do this by correctly scoping the field from orders as orders.user_id.

Here is an example using sql_on:

- explore: order_items joins: - join: orders type: left_outer relationship: many_to_one sql_on: ${order_items.order_id} = ${orders.id}   - join: users type: left_outer relationship: many_to_one sql_on: ${orders.user_id} = ${users.id}
explore: order_items { join: orders { type: left_outer relationship: many_to_one sql_on: ${order_items.order_id} = ${orders.id} ;; } join: users { type: left_outer relationship: many_to_one sql_on: ${orders.user_id} = ${users.id} ;; } }

Joining a View More than Once

Let’s say we have a users view that contains data for both buyers and sellers. We may want to join data from this view into order_items, but do so separately for buyers and sellers. We can join users twice, with different names, using the from parameter.

The from parameter allows you to specify which view to use in a join, while giving the join a different name. For example:

- explore: order_items joins: - join: orders type: left_outer relationship: many_to_one sql_on: ${order_items.order_id} = ${orders.id}   - join: buyers from: users type: left_outer relationship: many_to_one sql_on: ${orders.buyer_id} = ${buyers.id}   - join: sellers from: users type: left_outer relationship: many_to_one sql_on: ${orders.seller_id} = ${sellers.id}
explore: order_items { join: orders { type: left_outer relationship: many_to_one sql_on: ${order_items.order_id} = ${orders.id} ;; } join: buyers { from: users type: left_outer relationship: many_to_one sql_on: ${orders.buyer_id} = ${buyers.id} ;; } join: sellers { from: users type: left_outer relationship: many_to_one sql_on: ${orders.seller_id} = ${sellers.id} ;; } }

In this case, only buyer data will be joined in as buyers, while only seller data will be joined in as sellers.

Note: the users view must now be referred to by its aliased names buyers and sellers in the join.

Limiting Fields from a Join

The fields parameter can be used to specify which fields are brought from a join into an explore. By default, all fields from a view will be brought in when joined. However, sometimes you might want to bring through only a subset of fields.

For example, when orders is joined to order_items, we may want to bring only the shipping and tax fields through the join, like so:

- explore: order_items joins: - join: orders type: left_outer relationship: many_to_one sql_on: ${order_items.order_id} = ${orders.id} fields: [shipping, tax]
explore: order_items { join: orders { type: left_outer relationship: many_to_one sql_on: ${order_items.order_id} = ${orders.id} ;; fields: [shipping, tax] } }

You can also reference a set of fields, such as [set_a*]. Each set is defined within a view using the set parameter. Suppose we have the following set defined in the orders view:

sets: orders_set: - created_date - shipping - tax
set: orders_set { fields: [created_date, shipping, tax] }

We can then choose to bring only these 3 fields through when we join orders to order_items, like so:

- explore: order_items joins: - join: orders type: left_outer relationship: many_to_one sql_on: ${order_items.order_id} = ${orders.id} fields: [orders_set*]
explore: order_items { join: orders { type: left_outer relationship: many_to_one sql_on: ${order_items.order_id} = ${orders.id} ;; fields: [orders_set*] } }

Symmetric Aggregates

Looker uses a feature called “symmetric aggregates” to calculate aggregations (like sums and averages) correctly, even when joins result in a fanout. Symmetric aggregates are described in more detail in this Discourse post, while the fanout problem they solve is explained in this blog post.

In the most recent release, the following SQL dialects support symmetric aggregates:

  • Amazon Aurora
  • Amazon Redshift
  • Apache Spark 1.5+
  • Apache Spark 2
  • Aster Data
  • Clustrix
  • DataVirtuality
  • Denodo
  • Exasol
  • Google BigQuery
  • Google BigQuery Standard SQL
  • Google Cloud PostgreSQL
  • Google Cloud SQL
  • MariaDB
  • MemSQL
  • Microsoft Azure SQL Data Warehouse
  • Microsoft Azure SQL Database
  • Microsoft SQL Server 2005
  • Microsoft SQL Server 2008+
  • MySQL
  • Oracle
  • PostgreSQL
  • PrestoDB
  • Qubole Presto v0.157+
  • Snowflake
  • Teradata
  • Vectorwise
  • Vertica 6
  • Vertica 7+

The following SQL dialects do not currently support symmetric aggregates:

  • Amazon Athena
  • Apache Hive 2
  • Cloudera Impala
  • Google Cloud Spanner
  • Greenplum
  • IBM dashDB
  • IBM DB2
  • Qubole Presto v0.142
  • SAP HANA (in Labs)
  • XtremeData

If your dialect does not support symmetric aggregates, you will need to be careful when executing joins in Looker, as some types of joins can result in inaccurate aggregations (like sums and averages). This problem, and the workarounds for it, are described in great detail in this blog post.

Learn More About Joins

This has been a quick tutorial about joining in Looker. To learn more about join parameters in LookML, check out the Join Reference.

Still have questions?
Go to Discourse - or - Email Support
Top