Software engineers know what good code looks like. It’s readable, organized, modular, version-controlled, well-tested, and it doesn’t repeat itself. It leverages work others have already done and it’s been reviewed before it ships, ensuring that it makes sense to more than just its author.
Developers realized that if they were going to be building immensely complex systems in collaboration with tens or hundreds (or thousands) of colleagues, all of these principles were essential to moving their craft forward.
Unfortunately, analytics has been slow to adopt similar principles. Data scientists have certainly moved things in the right direction, using R and Python to write real code that conforms to many of these principles. But most data work is still done in SQL and/or Excel, using manual, unaudited processes that waste time, impede collaboration, and lead to mistakes.
Looker has been focused on correcting that problem since its very beginning. As I’ve written, evolving SQL into a much more flexible, reusable abstraction (LookML) was the key first step that Looker took in moving analysts toward a better way of working.
Looker 6 takes us further than ever in bringing good coding practices to analysts of all stripes.
Version control has been included in Looker since the beginning, but we’re always improving it. In recent releases, we’ve added the ability to keep multiple branches of work in Git and to specify pull request and code review workflows for developers. With Looker 6, we’re adding the ability to organize increasingly complex projects by putting files into folders within Looker’s IDE.
To make sure that LookML developers can leverage work that others have already done, rather than repeating it, Looker 6 also gives you more options than ever for referring to others’ projects and importing their code seamlessly. In the same way that software developers point to others’ libraries and then use those functions in their own code, LookML developers can leverage analytic patterns and data models that others have written without having to copy and paste the code (and then worry that they’ll miss any future upgrades to that code).
With the directory of Looker Blocks™ constantly growing, there is a wealth of publicly available code that developers can import with a single line of code. And even if you’re just looking to manage internal projects in a simpler way, importing projects from where they live, rather than repeating the code in multiple places, is a great way to maintain a hub-and-spoke analytic organization.
Looker 6 also gives LookML developers more control than ever over how the fields they create are used and who can view them. This new field-level access control is critically important to writing good code, because it prevents developers from having to repeat code to create similar models with differing levels of access.
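To make the "don't repeat yourself" point concrete, here is a minimal Python sketch of the general idea, not of LookML itself. A single model tags each field with the grant needed to see it, and one function filters the fields per user, so no second copy of the model is needed. The names `Field`, `visible_fields`, `ORDER_FIELDS`, and the grant `can_view_pii` are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Field:
    """One field in a shared model (hypothetical structure)."""
    name: str
    required_grant: Optional[str] = None  # None means visible to everyone


def visible_fields(fields, user_grants):
    """Return the names of the fields this user is allowed to see."""
    return [
        f.name
        for f in fields
        if f.required_grant is None or f.required_grant in user_grants
    ]


# One model definition serves every audience; only the grants differ.
ORDER_FIELDS = [
    Field("order_id"),
    Field("order_total"),
    Field("customer_email", required_grant="can_view_pii"),  # hypothetical grant
]

general_view = visible_fields(ORDER_FIELDS, user_grants=set())
privileged_view = visible_fields(ORDER_FIELDS, user_grants={"can_view_pii"})
```

Without field-level control, the usual workaround is two near-identical models, one with the sensitive field and one without, which then have to be kept in sync by hand.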
Finally, Looker 6 brings a fundamental concept of software development to analytic model development: automated testing. As code gets increasingly complex, it becomes impossible to predict all of the downstream effects that a change might have. That’s why good code has comprehensive test coverage that knows what the software is supposed to do and warns the developer if something breaks.
Looker 6 brings this same concept to data, allowing you to specify known values for your data and test to make sure that your data arrives at that correct value as you integrate new transformations into your code. That way, if a change you make suddenly delivers an unexpected value for last year’s revenue, or a first transaction date decades before your business started, you can be alerted immediately, before you affect others.
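The idea of testing data against known values can be sketched in a few lines of Python. This illustrates the concept only and is not Looker's syntax: the function `check_known_values`, the metric names, the revenue figure, and the business start date are all assumptions made up for the example.

```python
from datetime import date


def check_known_values(metrics, expectations):
    """Return a list of failure messages; an empty list means every check passed."""
    failures = []
    for name, predicate in expectations.items():
        if name not in metrics:
            failures.append(f"{name}: missing")
        elif not predicate(metrics[name]):
            failures.append(f"{name}: unexpected value {metrics[name]!r}")
    return failures


BUSINESS_START = date(2012, 1, 1)  # hypothetical founding date

# Known facts about the data that no model change should ever alter.
expectations = {
    "revenue_last_year": lambda v: v == 626_000,  # hypothetical audited figure
    "first_transaction_date": lambda v: v >= BUSINESS_START,
}

# Values as computed before and after a (hypothetical) change to the model.
before_change = {
    "revenue_last_year": 626_000,
    "first_transaction_date": date(2012, 3, 14),
}
after_change = {
    "revenue_last_year": 626_000,
    "first_transaction_date": date(1970, 1, 1),  # e.g. a null parsed as the epoch
}

failures_before = check_known_values(before_change, expectations)
failures_after = check_known_values(after_change, expectations)
```

Running checks like these on every change is what turns a silent regression into an immediate warning for the developer who introduced it.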
In all, software engineers are still the leaders in using good tooling to write good code. But Looker 6 brings analysts further than ever before. And we’ve got a lot more planned to give analysts all the tools they need to build and maintain the complex data systems their organizations need to be successful.