Looker: how to manage your LookML project
If you are reading this article, you are certainly wondering how to actively manage your LookML. Looker is a great tool for managing versioning in a Business Intelligence environment. But if you are not familiar with push, pull, revert, shared branches, pull requests, commits… it can be a little tricky to pick the right approach.
Well, I’ll try to help you 😇
First of all, you will have to work with a git flow; please find the schema below:
In summary the process is:
- I’m a developer: I code and commit on my personal or shared branch (please stick to shared branches)
- Then I merge my commits into production
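Under the hood, that flow maps onto plain git operations. Here is a minimal local sketch (branch, file, and identity names are illustrative; in practice the Looker IDE runs these commands for you):

```shell
# Sketch of the basic flow: commit on a shared branch, then merge to production.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "dev@example.com"   # illustrative identity
git config user.name "LookML Developer"
git checkout -q -b main                   # "main" stands in for the production branch
git commit -q --allow-empty -m "initial production state"

# 1. The developer codes and commits on a shared branch
git checkout -q -b shared_feature_branch
echo 'view: orders { dimension: order_id {} }' > orders.view.lkml
git add orders.view.lkml
git commit -q -m "Add orders view"

# 2. The commit is merged back into production
git checkout -q main
git merge -q --no-ff shared_feature_branch -m "Merge shared_feature_branch"
```

After the merge, the production branch contains the new view file and the shared branch's history.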
It’s the simplest process you can implement in Looker to manage your LookML. But if you are dealing with a lot of users, dashboards, Looks, or LookML developers, it becomes impossible to keep a clear overview of what you are doing. You will need to go a little deeper, and you will maybe start to think about splitting your code into several instances…
To be honest, I’m not a huge fan of multiple instances, because they add complexity when you can do some really great stuff directly in a single one. With a smooth process and some basic reflexes, you can do almost everything in a single instance.
First, you will have to implement two simple things:
- Pull Request REQUIRED. After a developer commits changes to their development branch, the Git button in the Looker IDE prompts the developer to open a pull request. The developer must open a pull request to merge their development branch into the production branch. Then, other Looker developers can review and approve the pull request from the Git provider’s web interface. You will also have to define your Pull Request process in your Git provider, meaning a LookML developer CAN NOT validate and deploy their own Pull Request.
- Advanced deploy mode REQUIRED. With this option you can pick and choose the right commit to deploy, so it’s really easy to define which commit is production ready, and to roll back if you run into trouble. And do not forget to tag your deploys!
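On the git side, tagging a deploy is a one-liner. A minimal sketch (tag name, message, and identity are illustrative):

```shell
# Sketch: tag the commit you deploy through Advanced Deploy Mode,
# so rolling back later is just a matter of re-deploying an older tag.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "lead@example.com"   # illustrative identity
git config user.name "Tech Lead"
git checkout -q -b main
git commit -q --allow-empty -m "release candidate"

# Tag the commit selected in Advanced Deploy Mode
git tag -a deploy-2022-09-26 -m "Weekly deploy: small fixes"

# To roll back, look up the commit behind an earlier tag and deploy it again
git rev-parse 'deploy-2022-09-26^{commit}'
```

An annotated tag (`-a`) records who tagged and why, which makes the deploy history self-documenting.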
Those two things are mandatory in EVERY LookML project. But you will also have to define roles in your team. There are two profiles to target:
- Tech Lead: their role is to approve Pull Requests and deploy commits in Advanced Deploy Mode. They will also support developers and share best practices
- LookML Developer: their role is to write LookML, commit it, and create Pull Requests
Simple. You will also have to define your project rhythm. Let’s take an example with a two-week sprint:
As a Tech Lead, I will have to define a Pull Request validation routine. Let’s schedule it every Wednesday. During this meeting you will review all of the PRs and, when necessary, ask the developers to explain their code. It’s the moment to talk about the new features and how they have been implemented.
Still as a Tech Lead, after the PR sync you should jump into Advanced Deploy Mode to pick and choose the right commits for the instance.
Even if your project runs on a two-week sprint, you can still deploy every week: the first week for small features or fixes, the second week for bigger topics.
Warning: never EVER deploy on a Friday. If something fails you will not have time to handle fixes, you will work under stress, and that is never a good practice.
Lastly, you should consider having more than one LookML project once you have more than 4 or 5 developers working on it at the same time. Even if each developer works on their own model, for example, it’s always tricky to keep everyone in sync. Keep in mind that multiple LookML projects will make each team more autonomous. Having a single LookML project for all use cases can quickly become a nightmare to manage.
What should I do next?
For now you have a simple process with:
- Pull Request required
- A git flow where only the Tech Lead has the approver role
- Advanced Deploy Mode
- And a rhythm to manage everything
Well, that’s great, but you should also consider implementing Data Tests in your process. It’s always critical to test your code thoroughly, and even more so when you’re dealing with a single Looker instance. I wrote a small article on Data Tests. In summary, you should activate the “Require data tests to pass before deploying this project to production” option as soon as possible.
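For reference, a LookML data test looks like this (the explore and field names are illustrative):

```lookml
# Checks that order_id is unique in the orders explore:
# fetch the most-duplicated order_id and assert it appears only once.
test: order_id_is_unique {
  explore_source: orders {
    column: order_id { field: orders.order_id }
    column: order_count { field: orders.count }
    sorts: [orders.count: desc]
    limit: 1
  }
  assert: no_duplicate_order_ids {
    expression: ${orders.count} = 1 ;;
  }
}
```

With the option above enabled, a failing assertion like this one blocks the deploy to production.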
And you should systematically add a test-creation sub-task to ALL YOUR USER STORIES. Let’s be completely clear: for EVERY user story, you should spend 50% of your time on the data tests you implement.
If you want to go deeper, have a look at Spectacles, an awesome product that tests your LookML and surfaces errors you haven’t yet discovered. Spectacles can run tests during Pull Requests, or as health checks on a schedule you define.
With this little trick, as a Tech Lead you will receive clear, already-tested Pull Requests.
Last but not least, do not forget to use the Content Validator. It’s a powerful Looker service for checking what your code can break: even if your LookML is correct, you can still break things in Looker (table calculations, filter values…). And by the way, if you are in Development Mode, the validation results will reflect your saved LookML even if it hasn’t been pushed to production… So go for it.
Single or multiple instances… it’s always a complex topic. I strongly favor having the simplest structure BUT with a strong, clear process for everyone. Especially for developers, I prefer simple processes with best practices repeated often over complex structures.
Finding the right balance is always tricky…
Edit 30/09/2022: this process works fine for a small instance without too many developers or users. If your goal is to have many developers working in parallel on many different LookML projects, with thousands of users on Looker at the same time, you should think about splitting usage across different instances.
Hope this helps!