7/8/2023 0 Comments Dimension tableIf location data is moved to an outrigger, each time there is Type 2 slow change to a location, all related dimension table rows must also undergo Type 2 changes. I discuss this technique in Chapter 7 of Star Schema The Complete Reference, and point out that it opens the door for a different kind of ETL challenge. This ensures a single consistent representation of locations. Each dimension that contains location data will contain a location_key. If their attributes or slow change characteristics are not managed identically, inconsistencies may develop across these tables.ĭesigners reason that creating one location table solves this problem. In this case, the concern is that the ETL routines may not consistently process locations across multiple dimension tables. The characteristics of a location might appear in a customer dimension table, a warehouse dimension table and a department dimension table. This happens when there is a set of attributes that appear in multiple dimension tables. There are times when some designers might choose a snowflake path - placing dimension keys into dimension tables. The fact table may contain other day_keys as well (such as day_key_claim, which supports analysis of claim data by the date of the claim.) To support analysis of claim information using the policy effective date, the fact table will contain a day_key_policy_effective. In any star where you may want filter/group/aggregate facts by some aspect of the date in question, add date keys to the fact tableįor example, in Robert's case there might be a Policy dimension table with an attribute called policy_effective_date.Place good old-fashioned dates in the dimension table.When you have an date that (1) clearly describes something represented by a dimension, and (2) will also be used to group facts, do two things: Some designers may allow limited use of snowflaking, but not in this situation.ĭate in Dimension Table, Day Key in Fact Table It increases SQL complexity, reduces schema understandability, and may impact performance. Examples include dimension tables that represent orders, contracts, tests and so forth.Īvoid placing date_key columns in dimension tables. To maintain the relationship between the demographic and customer columns, the demographic key should be a foreign key in the Customer dimension table.Montreal A: Many models feature at least one major dimension with a date that might be used to qualify facts in more than one star. In this case, separating the demographic columns into another table improves query response time. You can then relate the two tables to each other with a non-identifying relationship so that you can browse both dimensions interactively.įor example, if the Customer dimension table contains both customer address and demographic information, users may query the demographic fields such as, age, gender, income_level, and marital_status more frequently than they query the other columns. In the case where a denormalized dimension table is extremely large and you need to query a portion of the columns more often than the rest, you may want to break the single dimension table into two separate dimension tables (create a vertical partition) for efficiency. Overview of Large Dimension Tables Partitioning
0 Comments
Leave a Reply. |