Star schema

A star schema is the simplest type of database/data warehouse schema used to store data, with the model’s design resembling a star shape.

What is a Star Schema?

A data warehouse schema defines how tables are structured and how they relate to one another within a database or data warehouse. In other words, it determines the shape your data takes.

The star schema consists of a single ‘fact table’ that contains information about events or facts, surrounded by a single level of ‘dimension tables’ (or lookup tables) that contain descriptive information about the dimensions of these facts or events. Data is denormalized into these dimensions and facts.
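As a minimal sketch of this layout, the following Python snippet uses the standard-library `sqlite3` module to create one fact table surrounded by a single level of dimension tables. The table and column names (`fact_sales`, `dim_date`, `dim_product`, `dim_store`) are hypothetical and chosen only for illustration; a real warehouse would use its own names and engine.

```python
import sqlite3


def build_star_schema(conn: sqlite3.Connection) -> None:
    """Create one fact table that references a single level of dimension tables."""
    conn.executescript(
        """
        -- Dimension tables: descriptive attributes, one row per member.
        CREATE TABLE dim_date (
            date_key INTEGER PRIMARY KEY,
            full_date TEXT,
            month INTEGER,
            year INTEGER
        );
        CREATE TABLE dim_product (
            product_key INTEGER PRIMARY KEY,
            product_name TEXT,
            category TEXT          -- kept inline (denormalized), not in its own table
        );
        CREATE TABLE dim_store (
            store_key INTEGER PRIMARY KEY,
            store_name TEXT,
            city TEXT,
            country TEXT
        );
        -- Fact table: one row per event, with a key into each dimension
        -- plus the numeric measures.
        CREATE TABLE fact_sales (
            date_key INTEGER REFERENCES dim_date(date_key),
            product_key INTEGER REFERENCES dim_product(product_key),
            store_key INTEGER REFERENCES dim_store(store_key),
            quantity INTEGER,
            revenue REAL
        );
        """
    )


conn = sqlite3.connect(":memory:")
build_star_schema(conn)

tables = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]
# Tables that fact_sales points at: every dimension is one hop from the fact table.
referenced = {
    row[2] for row in conn.execute("PRAGMA foreign_key_list(fact_sales)")
}
print(tables)
print(sorted(referenced))
```

Every dimension table is referenced directly by the fact table and by nothing else, which is what gives the model its star shape.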

What are the characteristics of the Star Schema?

The star schema has these features:

  • Denormalization: Star schemas denormalize the data. This means they add redundant columns to some dimension tables to make querying and working with the data faster and easier.
  • Non-hierarchical structure: Star schemas are single level, with multiple dimension tables connecting directly to the central fact table.

What are the advantages and disadvantages of the Star Schema?

What are the advantages of the Star Schema?

  • It is simple to understand, design and implement
  • It is well suited to running simple queries, and works well with OLAP cubes
  • It delivers faster query performance than more normalized schemas, because queries need fewer joins and fact and dimension tables can be indexed efficiently
  • It delivers clear, intuitive analysis, with easily understandable relationships
  • It is scalable and can handle large amounts of data

What are the disadvantages of the Star Schema?

  • Due to redundancy, it requires more storage space than other schema models, adding to costs.
  • It does not enforce data integrity, due to its denormalized structure, potentially impacting data quality.
  • Denormalized dimension tables can be more difficult to maintain, as updates may require changes in multiple places.
  • It can be harder to run more complex queries, such as those that span several fact tables or involve many-to-many relationships.
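The maintenance and integrity points above can be illustrated with a small Python sketch using the standard-library `sqlite3` module. The `dim_product` table and its sample rows are hypothetical. Because the category name is repeated on every product row, renaming a category has to touch every copy, and an update that misses some rows leaves two spellings of the same category behind.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE dim_product ("
    "product_key INTEGER PRIMARY KEY, product_name TEXT, category TEXT)"
)
conn.executemany(
    "INSERT INTO dim_product VALUES (?, ?, ?)",
    [
        (1, "Espresso beans", "Coffee"),
        (2, "Filter beans", "Coffee"),  # 'Coffee' is stored a second time
        (3, "Green tea", "Tea"),
    ],
)

# A correct rename must update every row that repeats the value.
cur = conn.execute(
    "UPDATE dim_product SET category = 'Hot coffee' WHERE category = 'Coffee'"
)
rows_touched = cur.rowcount

# A careless update keyed on a single product changes only one copy...
conn.execute(
    "UPDATE dim_product SET category = 'Coffee drinks' WHERE product_key = 1"
)

# ...and nothing in the schema stops the two coffee products from disagreeing.
categories = [
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT category FROM dim_product ORDER BY category"
    )
]
print(rows_touched)
print(categories)
```

In a normalized design the category name would live in one row of a separate table, so the second update could not create an inconsistency of this kind.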

What are the uses of the Star Schema?

Star schemas can be used in data warehouses, databases, data marts, and other tools, and are optimized to simplify the querying of large data sets.

Analysts are able to create queries that filter and group data by one or more dimensions and then aggregate results at different levels of granularity.
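The kind of query described here can be sketched as follows, again in Python with the standard-library `sqlite3` module. The schema, the sample rows, and the helper `revenue_by_grain` are hypothetical and exist only to show the pattern: filter on one dimension, group by another, and change the grouping columns to change the level of granularity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
    CREATE TABLE dim_product (
        product_key INTEGER PRIMARY KEY, product_name TEXT, category TEXT
    );
    CREATE TABLE fact_sales (
        date_key INTEGER REFERENCES dim_date(date_key),
        product_key INTEGER REFERENCES dim_product(product_key),
        revenue REAL
    );
    INSERT INTO dim_date VALUES (1, 2024, 1), (2, 2024, 2), (3, 2025, 1);
    INSERT INTO dim_product VALUES
        (1, 'Espresso beans', 'Coffee'), (2, 'Green tea', 'Tea');
    INSERT INTO fact_sales VALUES
        (1, 1, 100.0), (1, 2, 40.0), (2, 1, 60.0), (3, 1, 80.0), (3, 2, 20.0);
    """
)

# Only these groupings are accepted, so column names are never taken from user input.
ALLOWED_GRAINS = {("d.year",), ("d.year", "d.month")}


def revenue_by_grain(conn, grain, category):
    """Sum revenue for one product category, grouped at the requested date grain."""
    if grain not in ALLOWED_GRAINS:
        raise ValueError(f"unsupported grain: {grain!r}")
    cols = ", ".join(grain)
    sql = f"""
        SELECT {cols}, SUM(f.revenue)
        FROM fact_sales AS f
        JOIN dim_date AS d ON d.date_key = f.date_key
        JOIN dim_product AS p ON p.product_key = f.product_key
        WHERE p.category = ?          -- filter on the product dimension
        GROUP BY {cols}               -- group on the date dimension
        ORDER BY {cols}
    """
    return conn.execute(sql, (category,)).fetchall()


by_year = revenue_by_grain(conn, ("d.year",), "Coffee")
by_month = revenue_by_grain(conn, ("d.year", "d.month"), "Coffee")
print(by_year)   # [(2024, 160.0), (2025, 80.0)]
print(by_month)  # [(2024, 1, 100.0), (2024, 2, 60.0), (2025, 1, 80.0)]
```

The two calls use the same joins and differ only in the grouping columns, which is what makes it easy to move between levels of granularity.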

In contrast, the star schema does not lend itself to applications such as online transaction processing. This is because its denormalized structure requires data to be processed and verified carefully on an ongoing basis to ensure its integrity, impacting performance.

What is the difference between the Star Schema and the Snowflake Schema?

Data warehouses are normally built on either a star schema or snowflake schema model. There are seven main differences between them:

  • Normalization: Star schemas are denormalized, with values repeated within a table. By contrast, the snowflake schema has a fully normalized data structure, with dimensional hierarchies stored in separate dimensional tables.
  • Data redundancy: The star schema stores repeated data, leading to data redundancy. The snowflake schema reduces this redundancy by moving repeated values into separate tables.
  • Disk space: As it stores redundant data, the star schema uses more disk space than the snowflake schema.
  • Query complexity: It is simpler to run queries on a star schema database, as it has only one level of dimension tables. Each dimension is reached with a single join to the fact table, whereas a snowflake schema can require a chain of joins to reach the same attribute.
  • Query performance: Star schema queries are generally faster than those in snowflake schemas because they involve fewer joins.
  • Data integrity: In the star schema, the same value can be stored many times within a dimension table. This means inserts, updates, or deletes that miss one of those copies can compromise the integrity of the data. In contrast, the snowflake schema stores each piece of dimension data once, improving data integrity.
  • Setup and maintenance: As they are simpler, star schemas are easier to design and set up than snowflake schemas. However, due to potential data integrity issues, star schemas are harder to maintain when new data is added to a data warehouse.
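To make the comparison concrete, here is a small Python sketch (standard-library `sqlite3`) that models the same product dimension both ways. All table names and rows are hypothetical. The star version keeps the category name on each product row; the snowflake version moves it to a separate `dim_category` table. Both return the same totals, but the snowflake query needs one more join, while the star dimension stores the category name more than once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Star: the category name is repeated on every product row.
    CREATE TABLE dim_product_star (
        product_key INTEGER PRIMARY KEY, product_name TEXT, category_name TEXT
    );
    -- Snowflake: the category is normalized into its own table.
    CREATE TABLE dim_category (category_key INTEGER PRIMARY KEY, category_name TEXT);
    CREATE TABLE dim_product_snow (
        product_key INTEGER PRIMARY KEY, product_name TEXT,
        category_key INTEGER REFERENCES dim_category(category_key)
    );
    CREATE TABLE fact_sales (product_key INTEGER, revenue REAL);

    INSERT INTO dim_product_star VALUES
        (1, 'Espresso beans', 'Coffee'), (2, 'Filter beans', 'Coffee'),
        (3, 'Green tea', 'Tea');
    INSERT INTO dim_category VALUES (10, 'Coffee'), (20, 'Tea');
    INSERT INTO dim_product_snow VALUES
        (1, 'Espresso beans', 10), (2, 'Filter beans', 10), (3, 'Green tea', 20);
    INSERT INTO fact_sales VALUES (1, 100.0), (2, 60.0), (3, 20.0);
    """
)

# Star schema: one join reaches the category.
star_result = conn.execute(
    """
    SELECT p.category_name, SUM(f.revenue)
    FROM fact_sales AS f
    JOIN dim_product_star AS p ON p.product_key = f.product_key
    GROUP BY p.category_name
    ORDER BY p.category_name
    """
).fetchall()

# Snowflake schema: a chain of two joins reaches the same attribute.
snowflake_result = conn.execute(
    """
    SELECT c.category_name, SUM(f.revenue)
    FROM fact_sales AS f
    JOIN dim_product_snow AS p ON p.product_key = f.product_key
    JOIN dim_category AS c ON c.category_key = p.category_key
    GROUP BY c.category_name
    ORDER BY c.category_name
    """
).fetchall()

# How many times is the string 'Coffee' physically stored in each design?
star_copies = conn.execute(
    "SELECT COUNT(*) FROM dim_product_star WHERE category_name = 'Coffee'"
).fetchone()[0]
snowflake_copies = conn.execute(
    "SELECT COUNT(*) FROM dim_category WHERE category_name = 'Coffee'"
).fetchone()[0]

print(star_result, snowflake_result)
print(star_copies, snowflake_copies)
```

On a toy data set like this the performance difference is not measurable; the sketch only shows the structural trade-off between join depth and redundancy.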

What is a Starflake Schema?

As the name suggests, a starflake schema is a combination of a star schema and a snowflake schema. It aims to bring together the benefits of both approaches and is based on a snowflake schema where only some of the dimension tables have been denormalized.
Shared dimensional hierarchies are placed in outriggers in order to normalize the schema.
