Why and how to create modular LookML models

Hugo Selbie, Senior Professional Services Consultant, Looker at Google Cloud

Jun 14, 2021

The benefits of modular programming

Modular programming in software design is not a new technique, but it is an extremely powerful one. It makes your files smaller and easier to understand than they would be otherwise, which eases collaboration and simplifies the process of finding out where an error originated.

With modular programming, the functionality of a project is separated into independent, interchangeable modules. Each module contains everything necessary to execute only one aspect of the functionality you are building toward. Since each module has to work on its own, the model never becomes overly complicated or inscrutable to someone who comes to the project after it’s built.

In my role as a professional services consultant, one of my most closely held goals is to convey the necessity of developing in Looker with a modular approach.

The drawbacks of monolithic architecture

I often see the opposite: developers who think of themselves not as modular component builders, but as artisanal code creators who just need to get code to work by an allotted deadline. This usually results in monolithic architecture, meaning that all the components of the model are interconnected and interdependent. Those dependencies make the code more difficult to work with.

Common pitfalls of a monolithic architecture are:

  • Developers are inextricably bound to their model because it’s complicated and poorly documented.
  • New developers face a difficult ramping time, increasing their time before they can make meaningful model contributions.
  • Legacy code or broken code is sometimes preserved because new developers are afraid of breaking existing content and dependencies.

I recommend a modular approach instead so you can avoid these inefficiencies and get the most out of Looker.

Looker’s semantic modelling layer, LookML, is built for collaboration: it’s version controlled, and it allows you to declare repeatable pieces of generated SQL that are invoked by a user-friendly front end experience. These SQL queries are then executed against your database. The way LookML works goes a long way toward making the development process as sensible as possible, but it’s not foolproof. The way you use LookML can make or break the Looker experience for your team.

Simply put, your code will be in its most efficient and useful state when any incoming developer of similar LookML expertise can swap into your LookML model and understand/amend that LookML model in a reasonable amount of time.

How to create modular LookML models

Although there are many different approaches to achieving the goal of the modular developer, these are my top three recommendations.

1. Strict repeatable code commenting practices

Code commenting is an underutilized tool to increase the ability of the modular developer to understand the LookML they are looking at (that is, what it does and why it was designed in that way).

Good examples of this are:

  1. View files

#############################################################################################################
# Owner: Marketing Team
# Created by: Captain Anonymous
# Created: April 2020
# Purpose: ETL Common Analytics Terms_CATs_Components data
#          from https://docs.google.com/spreadsheets/d/myspreadsheet/
#          Marketing usage focuses on the EPD team and SME mapping
#############################################################################################################

  1. Lower LookML objects (dimensions, measures etc.)

dimension: pickup_time {
  #Added on 1/26/19 by Captain Anonymous
  #Filtered Dimension to exclude multi-host chats
  type: number
  sql: CASE WHEN ${TABLE}.pickup_time > 500 
THEN NULL  
ELSE ${TABLE}.pickup_time end  ;;
  }

Arguably the comment here could be better served as a description parameter, and there is no one “right” way to convey the information. Either way, you give a developer instant information about why this LookML object exists and what it does.

  1. Model Files

# Made by Captain Anonymous 10/16/19
# Purpose is to view OKRs combined with Github issues
# Challenges, we need full joins to Github data
explore: github_objective_key_results {
join: issue {
  view_label: "GitHub Issue"
  sql_on: ${customer_reported_issues.issue_number} = ${issue.number}
      and ${customer_reported_issues.html_url} = ${issue.url} ;;
  # Gross but we still want to see all Github issues here
  type: full_outer
  relationship: many_to_one
 }
}

This comment goes beyond the “why” and “what” to note a challenge associated with this model: it requires full joins to GitHub data. If there’s an easy way to break your LookML, make that clear so new developers don’t accidentally do the wrong thing.

2. Ensure your code follows repeatable style guidelines

Adherence to repeatable code structure is the biggest challenge you will face to eliminating entropic models; the key thing is that it’s simple, repeatable, and documented. A really good start is adherence to the Look at me Sideways (LAMS) style guide.

From there, I’m a big proponent of adopting a layered approach using Looker’s refinement syntax — either for separating natural objects like dimensions and measures, or potentially for grouping similar topics.

The advantage of this approach is if you are quickly navigating around a LookML file you can fold your LookML and hop to the appropriate section of the file.

Example code structure:


##### Order_Items Dimensions
view: order_items {
  sql_table_name: thelook_events.ORDER_ITEMS ;;
  drill_fields: [id]

  dimension: id {
    primary_key: yes
    type: number
    sql: ${TABLE}.ID ;;
  }

  dimension: sale_price {
    type: number
    sql: ${TABLE}.SALE_PRICE ;;
  }

  dimension_group: shipped {
    type: time
    timeframes: [
      raw,
      time,
      date,
      week,
      month,
      quarter,
      year
    ]
    sql: ${TABLE}.SHIPPED_AT ;;
  }

##### Order_Items Measures
view: +order_items {
  measure: count {
    type: count
    drill_fields: [detail*]
  }

  dimension: inventory_item_id {
    type: number
    # hidden: yes
    sql: ${TABLE}.INVENTORY_ITEM_ID ;;
  }
}

##### Order_Items Sets
view: +order_items {
  # ----- Sets of fields for drilling ------
  set: detail {
    fields: [
      id,
      inventory_items.product_name,
      inventory_items.id,
      users.last_name,
      users.first_name,
      users.id
    ]
  }
}

3. A modular approach to file management using Looker’s integration with GitHub folders

Many code snippets can be included in a model file. For example (and this is not an exhaustive list): access grants, datagroups, named value formats, and fiscal offsets. Having all this code cluttering your model files can seriously degrade the readability of the Explores and joins you’re accessing. Adopting a more modular approach, however, can clean these dependencies up, and you can include whatever snippets you need in a nice, declarative way.

Consider this structure:

  • model_include_files
    • access_grants.lkml
    • base_explores.lkml (to be extended across models; alternatively, you can define Explores in individual files to be included in the model in a targeted way)
    • datagroups.lkml
    • model_config.lkml (e.g. persist with, fiscal offsets, week start)
    • named_value_formats.lkml
  • model_files
    • Model_file.model.lkml
  • view_files
    • all_view_files
    • OR separated by natural project break

An example of usage in the model_file.model.lkml would be include: "/model_includes/*" to include all the associated files of the model_include_files folder. Alternatively, you can include individual files directly as they are needed across Looker model files.

Check your work

All this work to put in a best-in-class LookML structure means nothing if:

  1. You have introduced breaking changes to the model by accident
  2. You are not following the best practice procedures you have defined among your team

To protect your work from either scenario, I recommend you invoke the checks below.

  1. Verify the SQL in your LookML sql parameter is correct (you can check this using a tool like Spectacles)
  2. Verify that your LookML change hasn’t introduced content breaking changes (see this script example or use Spectacles)
  3. Run automated data tests on your data (for instance, using native Looker tests)
  4. Run external data tests using API calls (script example)
  5. Lint your code (you could use Look at me Sideways)

A lightweight series of checks can be set up by implementing GitHub Actions.

When it comes to designing those modules, remember this timeless advice: if you can only do one thing, do it well.


To ask questions and talk shop with other LookML developers, drop in to the LookML category of the Looker Community.

Next Previous

Subscribe for the latest posts