Why you should define your team skills matrix before designing your data architecture!

Michaël Scherding
6 min read · Dec 15, 2021
Do not know why he’s here, but I love it ^^

As usual, I hope you are all doing well!

Today I want to discuss a tricky part of any new technical project, especially when you are dealing with a “shift lines” project. Let’s start with the context.

Context

As you all know, the Cloud is now part of our daily lives, and many companies want to move to the Cloud to become more flexible and to get high availability at scale. The Cloud is great, but the downside is that for any single use case you have (a lot of) different ways to reach your goal. This is just as true for data, for example when you want to pick the best approach to ingest your data.

There are many ways to tackle a problem, and identifying the best solution can be tricky. Most of the time, we are tempted to choose the most powerful option in order to get the best possible results. But in my opinion, we should think differently. The Cloud is great, but we often forget that we “still” need real people to run all of this mess, and sometimes the team in charge can get lost among all the best practices we have to implement (security, versioning, CI/CD, FinOps, DevOps…). So why not approach the problem from another angle? Why not draw a clear skills map of the actual team in charge and design the best architecture around it? Let’s go deeper into this approach.

Step 1, define the skills

The first step is to create your matrix and define which skills are important in your context. Let’s take a simple example. I want to build my own data platform so that everything sits in a single place, and then use it to find new insights. In a data context, and to keep the example simple, we will focus on only 4 skills:

  • SQL
  • Code (Python, Java…)
  • Platform (GCP, AWS…)
  • Git

And I know that during the project I will have 5 different “resources” to build, deploy and then run the platform. The matrix should look like this:

The scale will be simple, with levels running from 1 to 4.

Step 2, complete the matrix

When you are comfortable with your matrix, you have 2 options to complete the document. The right approach depends on how you feel about the team: have they been working together for a long time? Do they have a good mindset? If yes, you can complete the matrix directly during a workshop with everyone in the same room. The right levels will emerge naturally, and during the process everyone should be able to share their ideas. If not, you could prepare a form that everyone fills in alone on their side. You can also anonymize the answers, since we only need the overall experience of the team.

To help them define the right level of experience for each topic, you can share a step-by-step tutorial that solves a business case and ask them “how do you feel about the tutorial?”. Answers will look like:

  • I can run the process without the tutorial (level 4)
  • I understand the tutorial and I will sometimes need to refer back to it (level 3)
  • I understand the tutorial, but I really need to deep dive (level 2)
  • I do not understand the tutorial (level 1)
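To make the scale concrete, here is a minimal Python sketch of how the four answers above could be recorded in a matrix. The names (`Level`, `empty_matrix`, `record_answer`) and the anonymized “Resource N” labels are assumptions made for illustration; a shared spreadsheet would do the same job.

```python
from enum import IntEnum

# The four skills kept for the example in Step 1.
SKILLS = ["SQL", "Code", "Platform", "Git"]


class Level(IntEnum):
    """Self-assessment scale built from the four tutorial answers."""
    DOES_NOT_UNDERSTAND = 1  # "I do not understand the tutorial"
    NEEDS_DEEP_DIVE = 2      # "I understand, but I really need to deep dive"
    NEEDS_REFERENCE = 3      # "I will sometimes need to refer back to it"
    AUTONOMOUS = 4           # "I can run the process without the tutorial"


def empty_matrix(members):
    """Return {member: {skill: None}}, one row per team member."""
    return {member: {skill: None for skill in SKILLS} for member in members}


def record_answer(matrix, member, skill, level):
    """Store one answer; Level(level) rejects values outside 1 to 4."""
    if skill not in SKILLS:
        raise ValueError(f"unknown skill: {skill}")
    matrix[member][skill] = Level(level)


# Hypothetical, anonymized labels for the 5 people on the project.
matrix = empty_matrix([f"Resource {i}" for i in range(1, 6)])
record_answer(matrix, "Resource 1", "SQL", 4)
```

Using an enumeration instead of bare numbers keeps each score tied to the answer it came from, which helps when the form is filled in individually.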

After this step, your matrix will look like this:

And you can start to create a simple graph to help you visualize the results:

At this point, you start to have a better overview of your capabilities. In my example, the team is comfortable with SQL but needs improvement in Code, Platform and Git.
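If you prefer code to a spreadsheet, the overview can be sketched in a few lines of Python. The scores below are invented for illustration (they are not the real matrix) and were only chosen to match the pattern described: strong in SQL, weaker elsewhere. `team_profile` and `text_chart` are hypothetical helper names.

```python
# Hypothetical scores (1 to 4) for five anonymized team members.
# These are made-up numbers, not the data behind the article.
MATRIX = {
    "Resource 1": {"SQL": 4, "Code": 2, "Platform": 2, "Git": 1},
    "Resource 2": {"SQL": 3, "Code": 1, "Platform": 2, "Git": 2},
    "Resource 3": {"SQL": 4, "Code": 3, "Platform": 1, "Git": 2},
    "Resource 4": {"SQL": 3, "Code": 2, "Platform": 2, "Git": 1},
    "Resource 5": {"SQL": 4, "Code": 2, "Platform": 3, "Git": 3},
}


def team_profile(matrix):
    """Average each skill over all team members."""
    skills = list(next(iter(matrix.values())))
    return {
        skill: sum(row[skill] for row in matrix.values()) / len(matrix)
        for skill in skills
    }


def text_chart(profile):
    """Render one text bar per skill, 5 characters per level."""
    lines = []
    for skill, avg in profile.items():
        bar = "#" * round(avg * 5)
        lines.append(f"{skill:<9}{bar} {avg:.1f}")
    return "\n".join(lines)


profile = team_profile(MATRIX)
print(text_chart(profile))
```

Averaging is the simplest aggregation; depending on the team, the minimum or the median per skill may tell a more honest story, since one expert can hide four beginners.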

Step 3, start to think about architecture and always come back to your matrix

Now it’s time to work on your use case and define the best approach. Let’s take a simple example. I’m currently using Salesforce to track my commercial leads. I want to ingest data from Salesforce in order to find new insights by combining it with other data sources. I know that I can call the Salesforce API to retrieve the data, so what are my options?

  1. Use Cloud Run
  2. Use Cloud Functions
  3. Use Dataflow
  4. Use Data Fusion
  5. Use Fivetran
  6. etc…

Then, for each solution, I always refer back to my skills matrix to keep in mind which solution will be the simplest to implement and which will be the trickiest to work with. Example:

Dataflow

Or Fivetran

What you can learn is that going with Dataflow may take you more time to master. The skills gap is significant, especially on code, platform and Git. You need to know that before choosing the right architecture for your company and also for your team.
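The comparison can also be written down as a small gap calculation. In this sketch, both the team averages and the level each solution requires are invented placeholders; they are not an assessment of Dataflow or Fivetran, and you would replace them with your own estimates. `skill_gaps` and `rank_solutions` are hypothetical names.

```python
# Hypothetical team averages per skill (scale 1 to 4).
TEAM = {"SQL": 3.6, "Code": 2.0, "Platform": 2.0, "Git": 1.8}

# Hypothetical level each option would require; placeholder values only.
REQUIRED = {
    "Dataflow": {"SQL": 2, "Code": 4, "Platform": 4, "Git": 3},
    "Fivetran": {"SQL": 3, "Code": 1, "Platform": 2, "Git": 1},
}


def skill_gaps(team, required):
    """Per skill, how far the team is below the required level (never negative)."""
    return {skill: max(0.0, level - team[skill]) for skill, level in required.items()}


def rank_solutions(team, solutions):
    """Sort solutions from the smallest total gap to the largest."""
    totals = {
        name: sum(skill_gaps(team, required).values())
        for name, required in solutions.items()
    }
    return sorted(totals.items(), key=lambda item: item[1])


for name, total in rank_solutions(TEAM, REQUIRED):
    print(f"{name}: total gap {total:.1f}")
```

A total gap is only a conversation starter. A large gap on a single skill may matter more than small gaps everywhere, so it is worth looking at the per-skill output of `skill_gaps` too.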

Step 4, conclude

Well, I do not want to jump to the conclusion that your team’s skills alone will define your future stack. You need to keep many options on the table, and one of them is especially tricky to evaluate: your team’s appetite for learning. It can be a game changer, because a team that is eager to learn will bridge the gap quickly even if it is not yet at the right level of competency.

This matrix can be really helpful, especially if you are a consultant facing a client. Our role is to help clients reach their goals, but we also need to confront them with their own capabilities.

I want to highlight that sometimes you need to focus more on the gap to fill between what you master today and what you will need to master tomorrow. I’m always really careful with the “feeling” of the human behind the code, and today especially we need to pay attention to both what we can do and what we want to do. Do not forget: even if we can do powerful things with the Cloud, we also need to progress step by step. Maybe the right option is to start with managed solutions to avoid complexity, and then move on to more complex and elegant projects in the future… when everyone feels ready to work on them. As always, humans first.

Take care.

Michaël
