SAP BO Universes versus Looker Explores

Michaël Scherding
4 min readJul 22, 2024

--

Introduction

In business intelligence, data modeling plays a crucial role in transforming raw data into actionable insights. Two prominent tools used for this purpose are SAP BusinessObjects (BO) and Looker. This article jump into the fundamental differences between these tools and explores best practices for data modeling with Looker and BigQuery, highlighting how to optimize performance and reduce costs.

Universes in SAP BO

A Universe in SAP BO is a semantic layer that translates raw data into business-friendly terms. Universes are highly flexible, allowing the creation of multiple contexts to handle complex join paths and avoid ambiguities. These contexts define alternative join paths, each representing a unique way to navigate through the data.

Explores in Looker

An Explore in Looker is a logical view that centers around a primary table, with other tables joined to it. This structured approach simplifies data exploration and ensures consistency.

Concrete example with Looker

Imagine an Explore based on the primary table Orders. We will join three tables directly to Orders and a fourth table to one of the directly joined tables.

Tables and Joins:

  • Primary Table: Orders

Direct Joins:

  • Customers: joined on Orders.customer_id = Customers.id
  • Products: joined on Orders.product_id = Products.id
  • Shippers: joined on Orders.shipper_id = Shippers.id

Indirect Join:

  • Regions: joined on Shippers.region_id = Regions.id

When selecting a dimension from the Regions table, such as Regions.region_name, Looker performs the necessary joins using the defined paths, starting from Orders and proceeding through Shippers to Regions.

So what’s the point?

In Looker, it’s essential to be meticulous with join conditions because it always starts from the main table when building queries, applying joins based on this starting point. This approach necessitates careful design to ensure that each Explore is relevant and the joins make sense. Unlike SAP BusinessObjects (BO), where you can define various contexts within a single Universe for different join conditions, Looker’s model encourages a more focused approach by splitting data into multiple Explores, each with a specific main table. While SAP BO offers powerful flexibility and can handle complex join scenarios within one Universe, this can lead to very large and intricate models that are harder to manage. Looker’s method may seem limited but ultimately promotes clarity and coherent data exploration by ensuring each Explore has well-defined, sensible joins.

Best practices for modeling with Looker on BigQuery

To ensure optimal performance and cost efficiency in Looker when using BigQuery, it’s essential to follow certain best practices inspired by BI Engine recommendations.

Minimize joins

Reducing the number of joins in each Explore is crucial. Too many joins complicate SQL queries, increasing processing time and costs. Aim for Explores with a maximum of 4 to 5 joins unless absolutely necessary.

Denormalization

BigQuery performs better with pre-joined or pre-aggregated tables. Data frequently accessed together should be stored together. Using materialized views can help create flattened tables that pre-aggregate and pre-join necessary data, reducing the need for repetitive joins in every query.

Configure specific Explores

Create dedicated Explores for different use cases, such as one for sales and another for customers. This avoids excessive joins in a single Explore and ensures efficient data exploration. Prioritize Explores based on the importance of analyses to streamline and optimize data retrieval.

Case Study, marketing analysis of sales performance by customer

Need: The marketing team wants to analyze sales performance by customer, combining information from customer_explore and sales_explore.

Option 1: Merge Explore

Combining data from two Explores (customer_explore and sales_explore) into one view seems straightforward but comes with limitations:

  • Row Limits: Limited to a certain number of rows, which may be insufficient for large-scale analyses.
  • Join Types: Uses LEFT OUTER joins, which can lead to duplicates or missing values.
  • Performance: Complex and slow queries, impacting costs and performance.

Option 2: Remodel with a Materialized Table

A more efficient approach involves creating a materialized table.

Step 1: Create the Materialized Table in BigQuery

Combine Customers, Orders, Order_Items, and Products:

CREATE MATERIALIZED VIEW my_project.my_dataset.customer_sales_summary AS
SELECT
c.customer_id,
c.customer_name,
o.order_id,
o.order_date,
oi.product_id,
oi.quantity,
p.product_name,
p.product_category,
p.price
FROM
my_project.my_dataset.Customers c
JOIN
my_project.my_dataset.Orders o ON c.customer_id = o.customer_id
JOIN
my_project.my_dataset.Order_Items oi ON o.order_id = oi.order_id
JOIN
my_project.my_dataset.Products p ON oi.product_id = p.product_id;

Advantages of materialized tables

  • Improved performance: Faster queries due to pre-aggregated and pre-joined data.
  • Simplicity: Simplified model with fewer necessary joins.
  • Reduced costs: Fewer repetitive calculations, lowering processing costs.

The semantic layer in Looker

The semantic layer in Looker represents the last mile of analytics. This layer is crucial for managing data access from a secure zone and leveraging caching to limit costs. To make the most of this semantic layer, it’s vital to define a data model that meets Looker’s simplicity requirements.

Best Practices for LookML

To maintain a readable and effective LookML:

  • Avoid derived tables: Minimize the creation of derived tables or persistent derived tables in Looker.
  • Preliminary modeling: Rely on a model built in advance with specialized tools, providing a solid foundation for Looker.

Conclusion

By following these practices and using materialized tables, you can create optimized data models that enhance performance and reduce costs while ensuring secure and efficient data management with Looker. This approach not only simplifies the data model but also leverages the full power of BigQuery and Looker’s semantic layer to deliver fast, reliable insights.

Take care 🤘

--

--

No responses yet