Difference between revisions of "Cube DSL"
(→Time dimensions) |
|||
(76 intermediate revisions by 2 users not shown) | |||
Line 1: | Line 1: | ||
− | + | ||
− | + | ||
== Introduction == | == Introduction == | ||
Line 6: | Line 5: | ||
=== MDX and OLAP Cube === | === MDX and OLAP Cube === | ||
− | '''MDX'''(MultiDimensional eXpressions) | + | '''MDX'''(MultiDimensional eXpressions) is a query language for '''OLAP''' (OnLine Analytical Processing) databases, much like SQL is a query language for relational databases. It is also a calculation language, with syntax similar to spreadsheet formulas. |
− | SQL was designed to query dimensional data structures, called tables, where data are organized in rows | + | SQL was designed to query three-dimensional data structures, called tables, where data are organized in rows, columns and values. In OLAP, data are organized around multiple measures, dimensions, hierarchies, and levels. |
− | Similar to the way tables and columns are central to SQL, dimensions, hierarchies, and levels | + | Similar to the way tables and columns are central to SQL, dimensions, hierarchies, and levels are the centerpieces of MDX. In this way, business models are mapped to language-specific concepts (e.g. a list of countries will be mapped as an MDX dimension). ''Measures'' in MDX are a special case of dimensions and they hold numeric values. The concrete value of a dimension (e.g. Paris) is a member of the dimension and is similar to the value contained in an SQL table. |
− | An OLAP Cube is a collection of dimensions indexing a list of measures. | + | An ''OLAP Cube'' is a collection of dimensions indexing a list of measures. For example, a geometrical cube can have three dimensions, whereas a logistical "cube" can have many more dimensions. |
− | More information: | + | More information can be found at: |
http://en.wikipedia.org/wiki/MultiDimensional_eXpressions | http://en.wikipedia.org/wiki/MultiDimensional_eXpressions | ||
Line 24: | Line 23: | ||
===Mondrian=== | ===Mondrian=== | ||
− | Mondrian is an open source OLAP server, written in Java. It supports the MDX query language and | + | Mondrian is an open source OLAP server, written in Java. It supports the MDX query language and XML for analysis and olap4j interface specifications. It reads from SQL and other data sources and aggregates data in a memory cache. |
Mondrian is used for: | Mondrian is used for: | ||
− | * | + | *high performance, interactive analysis of large or small volumes of information |
− | * | + | *dimensional exploration of data, for example analysis of sales by product line, by region, by time period |
− | * | + | *parsing the MDX language into SQL to retrieve answers to dimensional queries |
− | * | + | *high-speed queries through the use of aggregate tables in the RDBMS (Relational DataBase Management System) |
− | * | + | *advanced calculations using the calculation expressions of the MDX language |
====Schema==== | ====Schema==== | ||
A schema defines a multi-dimensional database. It contains a logical model, consisting of cubes, hierarchies, and members, and a mapping of this model onto a physical model. | A schema defines a multi-dimensional database. It contains a logical model, consisting of cubes, hierarchies, and members, and a mapping of this model onto a physical model. | ||
− | The logical model consists of the constructs used to write queries in MDX language: cubes, dimensions, hierarchies, levels, and members. | + | The logical model consists of the constructs used to write queries in the MDX language: cubes, dimensions, hierarchies, levels, and members. |
The physical model is the source of the data which is presented through the logical model. It is typically a star schema, which is a set of tables in a relational database. | The physical model is the source of the data which is presented through the logical model. It is typically a star schema, which is a set of tables in a relational database. | ||
Line 43: | Line 42: | ||
=====Schema file===== | =====Schema file===== | ||
− | The schema file defines the dimensions and measures for the cube and maps them to the | + | The schema file is an XML file that defines the dimensions and measures for the cube and maps them to the database tables. The product and location dimensions contain several levels each. Each level defines its corresponding field in the database table. The different values of the field become the members of the level. |
'''Note''': The levels must be specified in the intended order of breakdown. | '''Note''': The levels must be specified in the intended order of breakdown. | ||
Line 149: | Line 148: | ||
|<NameExpression>||SQL expression used to compute the name of a member, in lieu of Level.nameColumn. | |<NameExpression>||SQL expression used to compute the name of a member, in lieu of Level.nameColumn. | ||
|- | |- | ||
− | |<CaptionExpression>||SQL expression used to compute the caption of a member, in lieu | + | |<CaptionExpression>||SQL expression used to compute the caption of a member, in lieu of Level.captionColumn. |
|- | |- | ||
− | |<OrdinalExpression>||SQL expression used to sort members of a level, in lieu | + | |<OrdinalExpression>||SQL expression used to sort members of a level, in lieu of Level.ordinalColumn. |
|- | |- | ||
− | |<ParentExpression>||SQL expression used to compute a measure, in lieu | + | |<ParentExpression>||SQL expression used to compute a measure, in lieu of Level.parentColumn. |
|- | |- | ||
|<Property>||Member property. The definition is against a hierarchy or level, but the property will be available to all members. | |<Property>||Member property. The definition is against a hierarchy or level, but the property will be available to all members. | ||
Line 171: | Line 170: | ||
|<Table>||Fact or dimension table. | |<Table>||Fact or dimension table. | ||
|- | |- | ||
− | |<View>||Defines a | + | |<View>||Defines a (virtual) table using an SQL query, which can have different variants for different underlying databases. |
|- | |- | ||
− | |<Join>||Defines a | + | |<Join>||Defines a (virtual) table by joining a set of queries. |
|- | |- | ||
|<InlineTable>||Defines a table using an inline dataset. | |<InlineTable>||Defines a table using an inline dataset. | ||
Line 213: | Line 212: | ||
|<CubeGrant>||A set of rights to a cube. | |<CubeGrant>||A set of rights to a cube. | ||
|- | |- | ||
− | |<HierarchyGrant>||A set of rights to a hierarchy and levels within that hierarchy. | + | |<HierarchyGrant>||A set of rights to a hierarchy and to levels within that hierarchy. |
|- | |- | ||
|<MemberGrant>||A set of rights to a member and its children. | |<MemberGrant>||A set of rights to a member and its children. | ||
Line 235: | Line 234: | ||
|<PropertyFormatter>||Property formatter. | |<PropertyFormatter>||Property formatter. | ||
|- | |- | ||
− | |<Script>||Script fragment to implement an SPI such as user-defined function, member formatter, or cell formatter. | + | |<Script>||Script fragment to implement an SPI such as a user-defined function, a member formatter, or a cell formatter. |
|} | |} | ||
Line 263: | Line 262: | ||
|<Value>||Value of a column in an <InlineTable> dataset. | |<Value>||Value of a column in an <InlineTable> dataset. | ||
|- | |- | ||
− | |<MeasureExpression>||SQL expression used to compute a measure, in lieu | + | |<MeasureExpression>||SQL expression used to compute a measure, in lieu of Measure.column. |
|- | |- | ||
|<SQL>||The SQL expression for a particular database dialect. | |<SQL>||The SQL expression for a particular database dialect. | ||
Line 270: | Line 269: | ||
=====Cubes===== | =====Cubes===== | ||
− | A | + | A Cube is a named collection of Measures and Dimensions. The one thing the Measures and Dimensions have in common is the fact table. The fact table holds the columns from which Measures are calculated, and contains references to the tables which hold the Dimensions. |
► '''Syntax''': | ► '''Syntax''': | ||
Line 282: | Line 281: | ||
− | The fact table is defined using the <code><Table></code> element. If the fact table is not in the default schema, you can provide an explicit schema using the | + | The fact table is defined using the <code><Table></code> element. If the fact table is not in the default schema, you can provide an explicit schema using the <code>schema</code> attribute, for example: |
Line 291: | Line 290: | ||
</syntaxhighlight> | </syntaxhighlight> | ||
− | You can also use the <code><View></code> construct to build more complicated SQL statements. The <code><Join></code> construct is not supported for fact tables. | + | You can also use the <code><View></code> construct to build more complicated SQL statements.<br> |
− | + | The <code><Join></code> construct is not supported for fact tables. | |
=====Measures===== | =====Measures===== | ||
− | Each | + | Each Measure has a <code>name</code>, a <code>column</code> in the fact table, and an <code>aggregator</code>. |
► '''Syntax''': | ► '''Syntax''': | ||
Line 315: | Line 314: | ||
− | The | + | The aggregators are |
− | *'''sum''' - | + | *'''sum''' - Returns the sum value of the specified set expression. |
− | *'''count''' - | + | *'''count''' - Returns the number of elements in the specified set. |
− | *'''min''' - | + | *'''min''' - Returns the minimum value of the specified set expression. |
− | *'''max''' - | + | *'''max''' - Returns the maximum value of the specified set expression. |
− | *'''avg''' - | + | *'''avg''' - Computes the average value of the specified set. |
− | *'''distinct-count''' - | + | *'''distinct-count''' - Returns the number of unique, non-empty elements in the specified set. It has some limitations if your cube contains a parent-child hierarchy. |
− | The optional datatype attribute specifies how cell values are represented in Mondrian's cache, and how they are returned via XML for | + | The optional <code>datatype</code> attribute specifies how cell values are represented in Mondrian's cache, and how they are returned via XML for analysis. |
− | The datatype attribute can have values | + | The <code>datatype</code> attribute can have values: |
*'''String''' | *'''String''' | ||
Line 336: | Line 335: | ||
*'''Timestamp'''. | *'''Timestamp'''. | ||
− | An optional formatString attribute specifies how the value is to be printed. The ',' and '.' symbols are locale-sensitive. | + | An optional <code>formatString</code> attribute specifies how the value is to be printed. The ',' and '.' symbols are locale-sensitive. |
− | A | + | A Measure can have a <code>caption</code> attribute to be returned by the Member.getCaption() method instead of the name. Defining a specific caption does make sense if special letters (e.g., Σ or Π) are to be displayed. |
► '''Example''': | ► '''Example''': | ||
Line 346: | Line 345: | ||
</syntaxhighlight> | </syntaxhighlight> | ||
− | =====Dimensions, | + | =====Dimensions, Hierarchies, Levels===== |
− | A member is a point within a | + | A member is a point within a Dimension determined by a particular set of attribute values. The ''gender'' Hierarchy has the two members 'M' and 'F'. 'San Francisco', 'California' and 'USA' are all members of the ''store'' Hierarchy. |
− | A | + | A Hierarchy is a set of members organized into a structure for convenient analysis. For example, the ''store'' Hierarchy consists of the store name, city, state, and nation. The Hierarchy allows you to form intermediate sub-totals: the sub-total for a state is the sum of the sub-totals of all of the cities in that state, each of which is the sum of the sub-totals of the stores in that city. |
− | A | + | A Level is a collection of members which have the same distance from the root of the Hierarchy. |
− | A | + | A Dimension is a collection of Hierarchies which discriminate on the same fact-table attribute (say, the day that a sale occurred). |
► '''Syntax''': | ► '''Syntax''': | ||
Line 363: | Line 362: | ||
<Table name="{table name}"/> | <Table name="{table name}"/> | ||
<Level name="{level name}" column="{column name}" | <Level name="{level name}" column="{column name}" | ||
− | + | uniqueMembers="true"/> | |
</Hierarchy> | </Hierarchy> | ||
</Dimension> | </Dimension> | ||
Line 380: | Line 379: | ||
</syntaxhighlight> | </syntaxhighlight> | ||
− | This | + | This Dimension consists of a single Hierarchy, which consists of a single Level called ''Gender''. |
− | The values for the | + | The values for the Dimension come from the ''gender'' column in the ''customer'' table. The ''gender'' column contains two values, 'F' and 'M', so the ''Gender'' Dimension contains the members <code>[Gender].[F]</code> and <code>[Gender].[M]</code>. |
− | For any given sale, the | + | For any given sale, the ''Gender'' Dimension is the gender of the customer who made that purchase. This is expressed by joining from the fact table "sales_fact_1997.customer_id" to the Dimension table "customer.customer_id". |
Line 399: | Line 398: | ||
======"ALL" member====== | ======"ALL" member====== | ||
− | By default, every | + | By default, every Hierarchy contains a top Level called <code>(All)</code>, which contains a single member called <code>(All {hierarchyName})</code>. This member is parent of all other members of the Hierarchy, and thus represents a grand total. It is also the default member of the Hierarchy; that is, the member which is used for calculating cell values when the Hierarchy is not included on an axis or in the slicer. The <code>allMemberName</code> and <code>allLevelName</code> attributes override the default names of the all Level and all member. |
− | + | ||
− | + | ||
+ | If the <code><Hierarchy></code> element has <code>hasAll="false"</code>, the 'all' Level is suppressed. The default member of that Dimension will now be the first member of the first Level; for example, in a ''Time'' Hierarchy, it will be the first year in the Hierarchy. Changing the default member can be confusing, so you should generally set <code>hasAll="true"</code>. | ||
======Time dimensions====== | ======Time dimensions====== | ||
− | Time dimensions based on year/month/week/day are coded differently in the Mondrian schema due to the MDX time related functions : | + | Time dimensions based on year/month/week/day are coded differently in the Mondrian schema due to the MDX time-related functions: |
− | *ParallelPeriod([level[, index[, member]]]) | + | *ParallelPeriod([level[, index[, member]]]) returns a member from a previous/next period with the same position in the level as the specified member. |
− | *PeriodsToDate([level[, member]]) | + | *PeriodsToDate([level[, member]]) returns the set of members starting from the first (earliest) sibling of the specified member and ending with the specified member itself. |
− | *WTD([member]) | + | *WTD([member]) returns PeriodsToDate( Week_Level, Member ). |
− | *MTD([member]) | + | *MTD([member]) returns PeriodsToDate( Month_Level, Member ). |
− | *QTD([member]) | + | *QTD([member]) returns PeriodsToDate( Quarter_Level, Member ). |
− | *YTD([member]) | + | *YTD([member]) returns PeriodsToDate( Year_Level, Member ). |
− | *LastPeriod(index[, member]) | + | *LastPeriod(index[, member]) returns the Level members that stand before the specified member, and including the specified member itself. |
− | Time dimensions have <code>type="TimeDimension"</code>. The role of a | + | Time dimensions have <code>type="TimeDimension"</code>. The role of a Level in a time dimension is indicated by the Level's <code>levelType</code> attribute, whose allowable values include: |
{| class="wikitable sortable" border="0" cellspacing="0" cellpadding="5" | {| class="wikitable sortable" border="0" cellspacing="0" cellpadding="5" | ||
Line 422: | Line 420: | ||
|TimeYears||Level is a year. | |TimeYears||Level is a year. | ||
|- | |- | ||
− | |TimeQuarters||Level is a | + | |TimeQuarters||Level is a quarter. |
|- | |- | ||
|TimeMonths||Level is a month. | |TimeMonths||Level is a month. | ||
Line 453: | Line 451: | ||
</Dimension> | </Dimension> | ||
</syntaxhighlight> | </syntaxhighlight> | ||
+ | |||
+ | ======Multiple hierarchies====== | ||
+ | |||
+ | A Dimension can contain more than one Hierarchy. By default, a Hierarchy has the same name as its Dimension. | ||
+ | |||
+ | ======Shared dimensions====== | ||
+ | |||
+ | A Dimension can contain more than one Hierarchy. By default, a Hierarchy has the same name as its Dimension. | ||
+ | |||
+ | When generating the SQL for a join, Mondrian needs to know which column to join to. If you are joining to a join, then you need to tell it which of the tables in the join that column belongs to (usually, it will be the first table in the join). | ||
+ | Because shared dimensions don't belong to a Cube, you have to give them an explicit table (or other data source). When you use them in a particular Cube, you specify the foreign key. This example shows the ''Store Type'' Dimension being joined to the ''Sales'' Cube using <code>sales_fact_1997.store_id</code> as the <code>foreignKey</code>, and to the ''Warehouse'' Cube using <code>warehouse.warehouse_store_id</code> as the <code>foreignKey</code>: | ||
+ | |||
+ | |||
+ | ► '''Example''': | ||
+ | |||
+ | <syntaxhighlight lang="xml"> | ||
+ | <Dimension name="Store Type"> | ||
+ | <Hierarchy hasAll="true" primaryKey="store_id"> | ||
+ | <Table name="store"/> | ||
+ | <Level name="Store Type" column="store_type" | ||
+ | uniqueMembers="true"/> | ||
+ | </Hierarchy> | ||
+ | </Dimension> | ||
+ | |||
+ | <Cube name="Sales"> | ||
+ | <Table name="sales_fact_1997"/> | ||
+ | ... | ||
+ | <DimensionUsage name="Store Type" source="Store Type" | ||
+ | foreignKey="store_id"/> | ||
+ | </Cube> | ||
+ | |||
+ | <Cube name="Warehouse"> | ||
+ | <Table name="warehouse"/> | ||
+ | ... | ||
+ | <DimensionUsage name="Store Type" source="Store Type" | ||
+ | foreignKey="warehouse_store_id"/> | ||
+ | </Cube> | ||
+ | </syntaxhighlight> | ||
+ | |||
+ | ======Snowflake schema (join)====== | ||
+ | |||
+ | A Dimension can be based upon more than one table, provided that there is a well-defined path to join these tables to the fact table. This kind of Dimension is known as a "snowflake", and is defined using the <code><Join></code> operator. | ||
+ | |||
+ | ► '''Example''': | ||
+ | |||
+ | <syntaxhighlight lang="xml"> | ||
+ | <Cube name="Sales"> | ||
+ | ... | ||
+ | <Dimension name="Product" foreignKey="product_id"> | ||
+ | <Hierarchy hasAll="true" primaryKey="product_id" | ||
+ | primaryKeyTable="product"> | ||
+ | <Join leftKey="product_class_key" rightAlias="product_class" | ||
+ | rightKey="product_class_id"> | ||
+ | <Table name="product"/> | ||
+ | <Join leftKey="product_type_id" rightKey="product_type_id"> | ||
+ | <Table name="product_class"/> | ||
+ | <Table name="product_type"/> | ||
+ | </Join> | ||
+ | </Join> | ||
+ | <!-- Level declarations ... --> | ||
+ | </Hierarchy> | ||
+ | </Dimension> | ||
+ | </Cube> | ||
+ | </syntaxhighlight> | ||
+ | |||
+ | |||
+ | This defines a "Product" Dimension consisting of three tables:<br> | ||
+ | *The fact table joins to the <code>product</code> table via <code>foreignKey="product_id"</code>, which in turn joins to the | ||
+ | *<code>product_class</code> table via <code>foreignKey="product_class_id"</code>, which in turn joins to the | ||
+ | *<code>product_type</code> table via <code>foreignKey="product_type_id"</code>. | ||
+ | We require a <code><Join></code> element nested within a <code><Join></code> element because <code><Join></code> takes two operands; the operands can be tables, joins, or even queries. | ||
+ | |||
+ | The arrangement of the tables seems complex, but the simple rule of thumb is to order the tables by the number of rows they contain. The <code>product</code> table has the most rows, so it joins to the fact table and appears first. <code>product_class</code> has fewer rows, and <code>product_type</code>, at the tip of the snowflake, has fewest of all. | ||
+ | |||
+ | Note that the outer <code><Join></code> element has a <code>rightAlias</code> attribute. This is necessary because the right component of the join (the inner <code><Join></code> element) consists of more than one table. No <code>leftAlias</code> attribute is necessary in this case, because the <code>leftKey</code> column unambiguously comes from the <code>product</code> table. | ||
+ | |||
+ | ==Cube DSL== | ||
+ | |||
+ | A Cube can be considered a multi-dimensional generalization of a two- or three-dimensional spreadsheet. Each cell of the cube holds a number that represents some measure of the business, such as sales, profits, expenses, budget and forecast. The elements of a dimension can be organized as a hierarchy, that is, a set of parent-child relationships, typically where a parent member summarizes its children. | ||
+ | |||
+ | CubeDSL generates the schema file. | ||
+ | |||
+ | The main semantic elements of the CubeDSL are: | ||
+ | *'''package''' - The root element that contains all the other elements. A model can contain multiple packages. | ||
+ | *'''schema''' - Defines the name of the Mondrian schema file | ||
+ | *'''import''' declarations - Used to import external models or even Java classes. | ||
+ | *'''cube''' - The abstraction of a Mondrian cube. It contains further elements such as dimension, measure. | ||
+ | *'''dimension''' - Defines an attribute, or set of attributes, which is a collection of hierarchies, and by which you can divide measures into sub-categories. For example, you might wish to break down product sales by their color, the gender of the customer and the store in which the product was sold; color, gender, and store are all dimensions. | ||
+ | *'''hierarchy''' - Defines a set of members organized into a structure for convenient analysis. For example, the store hierarchy consists of the store name, city, state, and nation. A hierarchy allows you to form intermediate sub-totals: the sub-total for a state is the sum of the sub-totals of all of the cities in that state, each of which is the sum of the sub-totals of all the stores in that city. | ||
+ | *'''entity''' - Defines which table and columns of the table the “hierarchy” uses. | ||
+ | *'''level''' - Defines a collection of members which have the same distance from the root of the hierarchy. | ||
+ | *'''property''' - Defines a member property. The definition is against a level, but the property is available to all members. | ||
+ | *'''dimensionUsage''' - Defines the shared dimension usage in a cube. | ||
+ | *'''measure''' - The members of a special dimension, a quantity being measured, for example, unit sales of a product, or cost price of inventory items. | ||
+ | |||
+ | ===Keywords=== | ||
+ | |||
+ | The CubeDSL generates a schema file in XML file type. It should be defined with the following keywords in a file with the ''.cube'' extension: | ||
+ | |||
+ | ====package, schema==== | ||
+ | ► '''Syntax''': | ||
+ | <syntaxhighlight lang="java"> | ||
+ | package <package name> schema <schema name>{ | ||
+ | ... | ||
+ | } | ||
+ | </syntaxhighlight> | ||
+ | |||
+ | ► '''Example''': | ||
+ | |||
+ | <syntaxhighlight lang="java"> | ||
+ | package local.test4doku.cubes schema TEST{ | ||
+ | ... | ||
+ | } | ||
+ | </syntaxhighlight> | ||
+ | |||
+ | It generates the following schema code in an XML file: | ||
+ | |||
+ | <syntaxhighlight lang="xml"> | ||
+ | <Schema name="TEST"> | ||
+ | ... | ||
+ | </Schema> | ||
+ | </syntaxhighlight> | ||
+ | |||
+ | ====dimension, hierarchy, level, property==== | ||
+ | ► '''Syntax''': | ||
+ | <syntaxhighlight lang="java"> | ||
+ | dimension <dimension name> { | ||
+ | hierarchy [hierarchy name] [has All] [allMemberName=<all member name>] | ||
+ | [defaultMember=<default member>]{ | ||
+ | entity <entity name> key=<key name> { | ||
+ | level <level name> column=<column name> [uniqueMembers]; | ||
+ | ... | ||
+ | } | ||
+ | } | ||
+ | ... | ||
+ | } | ||
+ | </syntaxhighlight> | ||
+ | |||
+ | '''Notes:''' | ||
+ | *If the hierarchy name is not defined, the default value is the dimension name. | ||
+ | *By default, every hierarchy contains a top level called <code>(ALL)</code>, which contains a single member called <code>(All {hierarchy name})</code>. This member is the parent of all other members of the hierarchy and also the default member of the hierarchy. Use <code>allMemberName</code> to override the default name of the <code>All</code> member. | ||
+ | *If <code>hasAll="false"</code>, the default member of the dimension will be the first member of the first level. Use <code>defaultMember</code> to override the default member of the hierarchy. | ||
+ | |||
+ | |||
+ | ► '''Example''': | ||
+ | |||
+ | <syntaxhighlight lang="java"> | ||
+ | dimension CustomerGenders { | ||
+ | hierarchy allMemberName="All Genders" defaultMember="[All Genders]"{ | ||
+ | entity Tcustomer key=cus_id { | ||
+ | level Gender column=gender ; | ||
+ | } | ||
+ | } | ||
+ | } | ||
+ | </syntaxhighlight> | ||
+ | |||
+ | It generates the following schema code in an XML file: | ||
+ | |||
+ | <syntaxhighlight lang="xml"> | ||
+ | <Dimension name="CustomerGenders"> | ||
+ | <Hierarchy hasAll="false" allMemberName="All Genders" defaultMember="[All | ||
+ | Genders]" primaryKey="CUS_ID"> | ||
+ | <Table name="TCUSTOMER" /> | ||
+ | <Level name="Gender" column="GENDER" uniqueMembers="false" | ||
+ | type="String"/> | ||
+ | </Hierarchy> | ||
+ | </Dimension> | ||
+ | </syntaxhighlight> | ||
+ | |||
+ | |||
+ | =====join===== | ||
+ | ► '''Syntax''': | ||
+ | <syntaxhighlight lang="java"> | ||
+ | dimension <dimension name> { | ||
+ | hierarchy [<hierarchy name>] [has All] [allMemberName=<all member name>] | ||
+ | [defaultMember=<default member>] { | ||
+ | entity <entity name> key=<primarykey name> { | ||
+ | level <level name> column=<column name> [uniqueMembers]; | ||
+ | ... | ||
+ | entity <entity name> [key=<key name>] { | ||
+ | over <leftkey name>; | ||
+ | level <level name> column=<column name> [uniqueMembers]; | ||
+ | ... | ||
+ | } | ||
+ | } | ||
+ | } | ||
+ | ... | ||
+ | } | ||
+ | </syntaxhighlight> | ||
+ | |||
+ | |||
+ | ► '''Example''': | ||
+ | |||
+ | <syntaxhighlight lang="java"> | ||
+ | dimension ProductTypes { | ||
+ | hierarchy hasAll { | ||
+ | entity Tproduct key=pro_id { | ||
+ | level proName column=pro_name uniqueMembers; | ||
+ | |||
+ | entity Tproduct_type key=pro_type_id { | ||
+ | over type; | ||
+ | level proTypeName column=pro_type_name uniqueMembers; | ||
+ | } | ||
+ | } | ||
+ | } | ||
+ | } | ||
+ | </syntaxhighlight> | ||
+ | |||
+ | It generates the following schema code in an XML file: | ||
+ | |||
+ | <syntaxhighlight lang="xml"> | ||
+ | <Dimension name="ProductTypes"> | ||
+ | <Hierarchy hasAll="true" primaryKey="PRO_ID" primaryKeyTable="TPRODUCT"> | ||
+ | <Join leftKey="TYPE_ID" rightKey="PRO_TYPE_ID"> | ||
+ | <Table name="TPRODUCT" /> | ||
+ | <Table name="TPRODUCT_TYPE"/> | ||
+ | </Join> | ||
+ | <Level name="proTypeName" table="TPRODUCT_TYPE" | ||
+ | column="PRO_TYPE_NAME" uniqueMembers="true" type="String"/> | ||
+ | <Level name="proName" table="TPRODUCT" column="PRO_NAME" | ||
+ | uniqueMembers="true" type="String"/> | ||
+ | </Hierarchy> | ||
+ | </Dimension> | ||
+ | </syntaxhighlight> | ||
+ | |||
+ | '''Notes''': | ||
+ | *The default key is <code>ID</code>. | ||
+ | *For the reference to an entity, e.g., ''type'' in Tproduct_type, the name of the key in the database is automatically defined as <code>type_id</code>. | ||
+ | |||
+ | =====typeTime===== | ||
+ | ► '''Syntax''': | ||
+ | <syntaxhighlight lang="java"> | ||
+ | dimension <dimension name> typeTime { | ||
+ | hierarchy <hierarchy name> [has All] [allMemberName=<all member name>] | ||
+ | [defaultMember=<default member>]{ | ||
+ | entity <entity name> key=<key name> { | ||
+ | level <level name> column=<column name> [uniqueMembers] | ||
+ | levelType=<type name>; | ||
+ | ... | ||
+ | } | ||
+ | } | ||
+ | ... | ||
+ | } | ||
+ | </syntaxhighlight> | ||
+ | '''Note''': | ||
+ | *<code>levelType</code> can have the values <code>Time</code>, <code>TimeSeconds</code>, <code>TimeMinutes</code>, <code>TimeHours</code>, <code>TimeDays</code>, <code>TimeWeeks</code>, <code>TimeMonths</code>, <code>TimeQuarters</code>, <code>TimeHalfYears</code>, <code>TimeYears</code> (see the definitions in the section [[#Time dimensions]]) | ||
+ | |||
+ | |||
+ | ► '''Example''': | ||
+ | |||
+ | <syntaxhighlight lang="java"> | ||
+ | dimension TimeTheDate typeTime { | ||
+ | hierarchy { | ||
+ | entity Ttime_by_day key=the_date { | ||
+ | level Year column=the_year uniqueMembers levelType=TimeYears; | ||
+ | level Quarter column=quarter levelType=TimeQuarters; | ||
+ | level Month column=month_of_year levelType=TimeMonths; | ||
+ | } | ||
+ | } | ||
+ | } | ||
+ | </syntaxhighlight> | ||
+ | |||
+ | It generates the following schema code in an XML file: | ||
+ | |||
+ | <syntaxhighlight lang="xml"> | ||
+ | <Dimension name="TimeTheDate" type="TimeDimension"> | ||
+ | <Hierarchy hasAll="false" primaryKey="THE_DATE"> | ||
+ | <Table name="TTIME_BY_DAY" schema=""/> | ||
+ | <Level name="Year" column="THE_YEAR" uniqueMembers="true" | ||
+ | levelType="TimeYears" type="String"/> | ||
+ | <Level name="Quarter" column="QUARTER" uniqueMembers="false" | ||
+ | levelType="TimeQuarters" type="String"/> | ||
+ | <Level name="Month" column="MONTH_OF_YEAR" uniqueMembers="false" | ||
+ | levelType="TimeMonths" type="Integer"/> | ||
+ | </Hierarchy> | ||
+ | </Dimension> | ||
+ | </syntaxhighlight> | ||
+ | |||
+ | =====property===== | ||
+ | ► '''Syntax''': | ||
+ | <syntaxhighlight lang="java"> | ||
+ | dimension <dimension name> { | ||
+ | hierarchy <hierarchy name> [has All] [allMemberName=<all member name>] | ||
+ | [defaultMember=<default member>]{ | ||
+ | entity <entity name> key=<key name> { | ||
+ | level <level name> column=<column name> [uniqueMembers]{ | ||
+ | property <property name> column=<column name> | ||
+ | [type=<type name>]; | ||
+ | } | ||
+ | ... | ||
+ | } | ||
+ | } | ||
+ | ... | ||
+ | } | ||
+ | </syntaxhighlight> | ||
+ | '''Note''': | ||
+ | *'''property type''' can be <code>Boolean</code>, <code>Date</code>, <code>Integer</code>, <code>Numeric</code>, <code>String</code>, <code>Time</code> or <code>TimeStamp</code>. | ||
+ | |||
+ | |||
+ | ► '''Example''': | ||
+ | |||
+ | <syntaxhighlight lang="java"> | ||
+ | dimension ProductTypes { | ||
+ | hierarchy { | ||
+ | entity TproductType key= pro_type_id { | ||
+ | level proTypeId column= pro_type_id uniqueMembers { | ||
+ | property proTypeName column= pro_type_name type=String; | ||
+ | } | ||
+ | } | ||
+ | } | ||
+ | } | ||
+ | </syntaxhighlight> | ||
+ | |||
+ | It generates the following schema code in an XML file: | ||
+ | |||
+ | <syntaxhighlight lang="xml"> | ||
+ | <Dimension name="ProductTypes"> | ||
+ | <Hierarchy hasAll="false" primaryKey="PRO_TYPE_ID"> | ||
+ | <Table name="TPRODUCT_TYPE" schema=""/> | ||
+ | <Level name="proTypeId" column="PRO_TYPE_ID" uniqueMembers="true" | ||
+ | type="Integer"> | ||
+ | <Property name="proTypeName" column="PRO_TYPE_NAME" | ||
+ | type="String" /> | ||
+ | </Level> | ||
+ | </Hierarchy> | ||
+ | </Dimension> | ||
+ | </syntaxhighlight> | ||
+ | |||
+ | ====cube, dimensionUsage, measure==== | ||
+ | ► '''Syntax''': | ||
+ | <syntaxhighlight lang="java"> | ||
+ | cube <cube name> [defaultMeasure <measure name>]{ | ||
+ | entity <entity name> [key=<key name>]{ | ||
+ | dimensionUsage <dimension name> over <key name>; | ||
+ | ... | ||
+ | measure <measure name> <aggregator name> column=<column name> | ||
+ | [notVisible]; | ||
+ | ... | ||
+ | } | ||
+ | ... | ||
+ | } | ||
+ | </syntaxhighlight> | ||
+ | '''Note''': | ||
+ | *Aggregator functions include <code>avg</code>, <code>sum</code>, <code>min</code>, <code>max</code>, <code>count</code>, <code>distinct-count</code> (see the definitions in the section [[#Measures]] ). | ||
+ | *Mondrian takes the first measure as the default measure. You can override this behavior by setting <code>defaultMeasure</code>. | ||
+ | |||
+ | |||
+ | |||
+ | ► '''Example''': | ||
+ | |||
+ | <syntaxhighlight lang="java"> | ||
+ | dimension ProductTypes { | ||
+ | hierarchy { | ||
+ | entity TproductType key= pro_type_id { | ||
+ | level proTypeId column= pro_type_id uniqueMembers { | ||
+ | property proTypeName column= pro_type_name type=String; | ||
+ | } | ||
+ | } | ||
+ | } | ||
+ | } | ||
+ | </syntaxhighlight> | ||
+ | |||
+ | It generates the following schema code in an XML file: | ||
+ | |||
+ | <syntaxhighlight lang="xml"> | ||
+ | <Dimension name="ProductTypes"> | ||
+ | <Hierarchy hasAll="false" primaryKey="PRO_TYPE_ID"> | ||
+ | <Table name="TPRODUCT_TYPE" schema=""/> | ||
+ | <Level name="proTypeId" column="PRO_TYPE_ID" uniqueMembers="true" | ||
+ | type="Integer"> | ||
+ | <Property name="proTypeName" column="PRO_TYPE_NAME" | ||
+ | type="String" /> | ||
+ | </Level> | ||
+ | </Hierarchy> | ||
+ | </Dimension> | ||
+ | </syntaxhighlight> | ||
+ | == Copyright Notice == | ||
+ | {{Copyright Notice}} |
Latest revision as of 19:22, 4 April 2017
Contents
Introduction
MDX and OLAP Cube
MDX(MultiDimensional eXpressions) is a query language for OLAP (OnLine Analytical Processing) databases, much like SQL is a query language for relational databases. It is also a calculation language, with syntax similar to spreadsheet formulas.
SQL was designed to query three-dimensional data structures, called tables, where data are organized in rows, columns and values. In OLAP, data are organized around multiple measures, dimensions, hierarchies, and levels.
Similar to the way tables and columns are central to SQL, dimensions, hierarchies, and levels are the centerpieces of MDX. In this way, business models are mapped to language-specific concepts (e.g. a list of countries will be mapped as an MDX dimension). Measures in MDX are a special case of dimensions and they hold numeric values. The concrete value of a dimension (e.g. Paris) is a member of the dimension and is similar to the value contained in an SQL table.
An OLAP Cube is a collection of dimensions indexing a list of measures. For example, a geometrical cube can have three dimensions, whereas a logistical "cube" can have many more dimensions.
More information can be found at:
http://en.wikipedia.org/wiki/MultiDimensional_eXpressions
http://en.wikipedia.org/wiki/OLAP_cube
Mondrian
Mondrian is an open source OLAP server, written in Java. It supports the MDX query language and XML for analysis and olap4j interface specifications. It reads from SQL and other data sources and aggregates data in a memory cache.
Mondrian is used for:
- high performance, interactive analysis of large or small volumes of information
- dimensional exploration of data, for example analysis of sales by product line, by region, by time period
- parsing the MDX language into SQL to retrieve answers to dimensional queries
- high-speed queries through the use of aggregate tables in the RDBMS (Relational DataBase Management System)
- advanced calculations using the calculation expressions of the MDX language
Schema
A schema defines a multi-dimensional database. It contains a logical model, consisting of cubes, hierarchies, and members, and a mapping of this model onto a physical model.
The logical model consists of the constructs used to write queries in the MDX language: cubes, dimensions, hierarchies, levels, and members.
The physical model is the source of the data which is presented through the logical model. It is typically a star schema, which is a set of tables in a relational database.
Schema file
The schema file is an XML file that defines the dimensions and measures for the cube and maps them to the database tables. The product and location dimensions contain several levels each. Each level defines its corresponding field in the database table. The different values of the field become the members of the level.
Note: The levels must be specified in the intended order of breakdown.
- The structure of the XML document is as follows:
<Schema> <Cube> <Table> <AggName> aggElements <AggPattern> aggElements <Dimension> <Hierarchy> relation <Closure/> <Level> <KeyExpression> <SQL/> <NameExpression> <SQL/> <CaptionExpression> <SQL/> <OrdinalExpression> <SQL/> <ParentExpression> <SQL/> <Property> <PropertyExpression> <SQL/> <DimensionUsage> <Measure> <MeasureExpression> <SQL/> <CalculatedMemberProperty/> <CalculatedMember> <Formula/> <CalculatedMemberProperty/> <NamedSet> <Formula/> <VirtualCube> <CubeUsages> <CubeUsage> <VirtualCubeDimension> <VirtualCubeMeasure> <Role> <SchemaGrant> <CubeGrant> <DimensionGrant> <HierarchyGrant> <MemberGrant/> <Union> <RoleUsage/> <UserDefinedFunction/> <Parameter/>
- relation ::=
<Table> <SQL/> <View> <SQL/> <InlineTable> <ColumnDefs> <ColumnDef> <Rows> <Row> <Value> <Join> relation
- aggElement ::=
<AggExclude> <AggFactCount> <AggIgnoreColumn> <AggForeignKey> <AggMeasure> <AggLevel>
Logical elements
Element | Description |
---|---|
<Cube> | A collection of dimensions and measures, all centered on a fact table. |
<VirtualCube> | A cube defined by combining the dimensions and measures of one or more cubes. A measure originating from another cube can be a <CalculatedMember>. |
<CubeUsages> | Base cubes that are imported into a virtual cube |
<CubeUsage> | Usage of a base cube by a virtual cube. |
<VirtualCubeDimension> | Usage of a dimension by a virtual cube. |
<VirtualCubeMeasure> | Usage of a measure by a virtual cube. |
<Dimension> | Dimension |
<DimensionUsage> | Usage of a shared dimension by a cube. |
<Hierarchy> | Hierarchy |
<Level> | Level of a hierarchy |
<KeyExpression> | SQL expression used as key of the level, in lieu of a column. |
<NameExpression> | SQL expression used to compute the name of a member, in lieu of Level.nameColumn. |
<CaptionExpression> | SQL expression used to compute the caption of a member, in lieu of Level.captionColumn. |
<OrdinalExpression> | SQL expression used to sort members of a level, in lieu of Level.ordinalColumn. |
<ParentExpression> | SQL expression used to compute a measure, in lieu of Level.parentColumn. |
<Property> | Member property. The definition is against a hierarchy or level, but the property will be available to all members. |
<Measure> | Measure |
<CalculatedMember> | A member whose value is derived using a formula, defined as part of a cube. |
<NamedSet> | A set whose value is derived using a formula, defined as part of a cube. |
Physical elements
Element | Description |
---|---|
<Table> | Fact or dimension table. |
<View> | Defines a (virtual) table using an SQL query, which can have different variants for different underlying databases. |
<Join> | Defines a (virtual) table by joining a set of queries. |
<InlineTable> | Defines a table using an inline dataset. |
<Closure> | Maps a parent-child hierarchy onto a closure table. |
Aggregate tables
Element | Description |
---|---|
<AggExclude> | Exclude a candidate aggregate table by name or pattern matching. |
<AggName> | Declares an aggregate table to be matched by name. |
<AggPattern> | Declares a set of aggregate tables by regular expression pattern. |
<AggFactCount> | Specifies name of the column in the candidate aggregate table which contains the number of fact table rows. |
<AggIgnoreColumn> | Tells Mondrian to ignore a column in an aggregate table. |
<AggForeignKey> | Maps foreign key in the fact table to a foreign key column in the candidate aggregate table. |
<AggMeasure> | Maps a measure to a column in the candidate aggregate table. |
<AggLevel> | Maps a level to a column in the candidate aggregate table. |
Access control
Element | Description |
---|---|
<Role> | An access-control profile. |
<SchemaGrant> | A set of rights to a schema. |
<CubeGrant> | A set of rights to a cube. |
<HierarchyGrant> | A set of rights to a hierarchy and to levels within that hierarchy. |
<MemberGrant> | A set of rights to a member and its children. |
<Union> | Definition of a set of rights as the union of a set of roles. |
<RoleUsage> | A reference to a Role. |
Extensions
Element | Description |
---|---|
<UserDefinedFunction> | Declares a user-defined function. |
<CellFormatter> | Cell formatter. |
<MemberFormatter> | Member formatter. |
<PropertyFormatter> | Property formatter. |
<Script> | Script fragment to implement an SPI such as a user-defined function, a member formatter, or a cell formatter. |
Miscellaneous
Element | Description |
---|---|
<Annotations> | Holder for annotations. |
<Annotation> | User-defined property attached to a metadata element. |
<Parameter> | Part of the definition of a Hierarchy; passed to a MemberReader, if present. |
<CalculatedMemberProperty> | Property of a calculated member. |
<Formula> | Holds the formula text within a <NamedSet> or <CalculatedMember>. |
<ColumnDefs> | Holder for <ColumnDef> elements. |
<ColumnDef> | Definition of a column in an <InlineTable> dataset. |
<Rows> | Holder for <Row> elements. |
<Row> | Row in an <InlineTable> dataset. |
<Value> | Value of a column in an <InlineTable> dataset. |
<MeasureExpression> | SQL expression used to compute a measure, in lieu of Measure.column. |
<SQL> | The SQL expression for a particular database dialect. |
Cubes
A Cube is a named collection of Measures and Dimensions. The one thing the Measures and Dimensions have in common is the fact table. The fact table holds the columns from which Measures are calculated, and contains references to the tables which hold the Dimensions.
► Syntax:
<Cube name="{cube name}">
<Table name="{table name}"/>
...
</Cube>
The fact table is defined using the <Table>
element. If the fact table is not in the default schema, you can provide an explicit schema using the schema
attribute, for example:
► Syntax:
<Table schema="{schema name}" name="{table name}"/>
You can also use the <View>
construct to build more complicated SQL statements.
The <Join>
construct is not supported for fact tables.
Measures
Each Measure has a name
, a column
in the fact table, and an aggregator
.
► Syntax:
<Measure name="{measure name}" column="{column name}"
aggregator="{aggregator name}" datatype="{type name}"
formatString="{format}"/>
► Example:
<Measure name="Store Sales" column="store_sales" aggregator="sum"
datatype="Numeric" formatString="#,###.00"/>
The aggregators are
- sum - Returns the sum value of the specified set expression.
- count - Returns the number of elements in the specified set.
- min - Returns the minimum value of the specified set expression.
- max - Returns the maximum value of the specified set expression.
- avg - Computes the average value of the specified set.
- distinct-count - Returns the number of unique, non-empty elements in the specified set. It has some limitations if your cube contains a parent-child hierarchy.
The optional datatype
attribute specifies how cell values are represented in Mondrian's cache, and how they are returned via XML for analysis.
The datatype
attribute can have values:
- String
- Integer, default value for "count" and "distinct-count" measures
- Numeric, default value, except for "count" and "distinct-count" measures
- Boolean
- Date
- Time
- Timestamp.
An optional formatString
attribute specifies how the value is to be printed. The ',' and '.' symbols are locale-sensitive.
A Measure can have a caption
attribute to be returned by the Member.getCaption() method instead of the name. Defining a specific caption does make sense if special letters (e.g., Σ or Π) are to be displayed.
► Example:
<Measure name="Sum X" column="sum_x" aggregator="sum" caption="Σ X"/>
Dimensions, Hierarchies, Levels
A member is a point within a Dimension determined by a particular set of attribute values. The gender Hierarchy has the two members 'M' and 'F'. 'San Francisco', 'California' and 'USA' are all members of the store Hierarchy.
A Hierarchy is a set of members organized into a structure for convenient analysis. For example, the store Hierarchy consists of the store name, city, state, and nation. The Hierarchy allows you to form intermediate sub-totals: the sub-total for a state is the sum of the sub-totals of all of the cities in that state, each of which is the sum of the sub-totals of the stores in that city.
A Level is a collection of members which have the same distance from the root of the Hierarchy.
A Dimension is a collection of Hierarchies which discriminate on the same fact-table attribute (say, the day that a sale occurred).
► Syntax:
<Dimension name="{dimension name}" foreignKey="{key name}">
<Hierarchy hasAll="true" primaryKey="{key name}">
<Table name="{table name}"/>
<Level name="{level name}" column="{column name}"
uniqueMembers="true"/>
</Hierarchy>
</Dimension>
► Example:
<Dimension name="Gender" foreignKey="customer_id">
<Hierarchy hasAll="true" primaryKey="customer_id">
<Table name="customer"/>
<Level name="Gender" column="gender" uniqueMembers="true"/>
</Hierarchy>
</Dimension>
This Dimension consists of a single Hierarchy, which consists of a single Level called Gender.
The values for the Dimension come from the gender column in the customer table. The gender column contains two values, 'F' and 'M', so the Gender Dimension contains the members [Gender].[F]
and [Gender].[M]
.
For any given sale, the Gender Dimension is the gender of the customer who made that purchase. This is expressed by joining from the fact table "sales_fact_1997.customer_id" to the Dimension table "customer.customer_id".
► Syntax:
<Table schema="{schema name}" name="{table name}"/>
You can also use the <View>
construct to build more complicated SQL statements. The <Join>
construct is not supported for fact tables.
"ALL" member
By default, every Hierarchy contains a top Level called (All)
, which contains a single member called (All {hierarchyName})
. This member is parent of all other members of the Hierarchy, and thus represents a grand total. It is also the default member of the Hierarchy; that is, the member which is used for calculating cell values when the Hierarchy is not included on an axis or in the slicer. The allMemberName
and allLevelName
attributes override the default names of the all Level and all member.
If the <Hierarchy>
element has hasAll="false"
, the 'all' Level is suppressed. The default member of that Dimension will now be the first member of the first Level; for example, in a Time Hierarchy, it will be the first year in the Hierarchy. Changing the default member can be confusing, so you should generally set hasAll="true"
.
Time dimensions
Time dimensions based on year/month/week/day are coded differently in the Mondrian schema due to the MDX time-related functions:
- ParallelPeriod([level[, index[, member]]]) returns a member from a previous/next period with the same position in the level as the specified member.
- PeriodsToDate([level[, member]]) returns the set of members starting from the first (earliest) sibling of the specified member and ending with the specified member itself.
- WTD([member]) returns PeriodsToDate( Week_Level, Member ).
- MTD([member]) returns PeriodsToDate( Month_Level, Member ).
- QTD([member]) returns PeriodsToDate( Quarter_Level, Member ).
- YTD([member]) returns PeriodsToDate( Year_Level, Member ).
- LastPeriod(index[, member]) returns the Level members that stand before the specified member, and including the specified member itself.
Time dimensions have type="TimeDimension"
. The role of a Level in a time dimension is indicated by the Level's levelType
attribute, whose allowable values include:
levelType value | Meaning |
---|---|
TimeYears | Level is a year. |
TimeQuarters | Level is a quarter. |
TimeMonths | Level is a month. |
TimeWeeks | Level is a week. |
TimeDays | Level is a day. |
► Example:
<Dimension name="Time" type="TimeDimension">
<Hierarchy hasAll="true" allMemberName="All Periods" primaryKey="dateid">
<Table name="datehierarchy"/>
<Level name="Year" column="year" uniqueMembers="true"
levelType="TimeYears" type="Numeric"/>
<Level name="Quarter" column="quarter" uniqueMembers="false"
levelType="TimeQuarters"/>
<Level name="Month" column="month" uniqueMembers="false"
ordinalColumn="month" nameColumn="month_name" levelType="TimeMonths"
type="Numeric"/>
<Level name="Week" column="week_in_month" uniqueMembers="false"
levelType="TimeWeeks"/>
<Level name="Day" column="day_in_month" uniqueMembers="false"
ordinalColumn="day_in_month" nameColumn="day_name"
levelType="TimeDays" type="Numeric"/>
</Hierarchy>
</Dimension>
Multiple hierarchies
A Dimension can contain more than one Hierarchy. By default, a Hierarchy has the same name as its Dimension.
A Dimension can contain more than one Hierarchy. By default, a Hierarchy has the same name as its Dimension.
When generating the SQL for a join, Mondrian needs to know which column to join to. If you are joining to a join, then you need to tell it which of the tables in the join that column belongs to (usually, it will be the first table in the join).
Because shared dimensions don't belong to a Cube, you have to give them an explicit table (or other data source). When you use them in a particular Cube, you specify the foreign key. This example shows the Store Type Dimension being joined to the Sales Cube using sales_fact_1997.store_id
as the foreignKey
, and to the Warehouse Cube using warehouse.warehouse_store_id
as the foreignKey
:
► Example:
<Dimension name="Store Type">
<Hierarchy hasAll="true" primaryKey="store_id">
<Table name="store"/>
<Level name="Store Type" column="store_type"
uniqueMembers="true"/>
</Hierarchy>
</Dimension>
<Cube name="Sales">
<Table name="sales_fact_1997"/>
...
<DimensionUsage name="Store Type" source="Store Type"
foreignKey="store_id"/>
</Cube>
<Cube name="Warehouse">
<Table name="warehouse"/>
...
<DimensionUsage name="Store Type" source="Store Type"
foreignKey="warehouse_store_id"/>
</Cube>
Snowflake schema (join)
A Dimension can be based upon more than one table, provided that there is a well-defined path to join these tables to the fact table. This kind of Dimension is known as a "snowflake", and is defined using the <Join>
operator.
► Example:
<Cube name="Sales">
...
<Dimension name="Product" foreignKey="product_id">
<Hierarchy hasAll="true" primaryKey="product_id"
primaryKeyTable="product">
<Join leftKey="product_class_key" rightAlias="product_class"
rightKey="product_class_id">
<Table name="product"/>
<Join leftKey="product_type_id" rightKey="product_type_id">
<Table name="product_class"/>
<Table name="product_type"/>
</Join>
</Join>
<!-- Level declarations ... -->
</Hierarchy>
</Dimension>
</Cube>
This defines a "Product" Dimension consisting of three tables:
- The fact table joins to the
product
table viaforeignKey="product_id"
, which in turn joins to the product_class
table viaforeignKey="product_class_id"
, which in turn joins to theproduct_type
table viaforeignKey="product_type_id"
.
We require a <Join>
element nested within a <Join>
element because <Join>
takes two operands; the operands can be tables, joins, or even queries.
The arrangement of the tables seems complex, but the simple rule of thumb is to order the tables by the number of rows they contain. The product
table has the most rows, so it joins to the fact table and appears first. product_class
has fewer rows, and product_type
, at the tip of the snowflake, has fewest of all.
Note that the outer <Join>
element has a rightAlias
attribute. This is necessary because the right component of the join (the inner <Join>
element) consists of more than one table. No leftAlias
attribute is necessary in this case, because the leftKey
column unambiguously comes from the product
table.
Cube DSL
A Cube can be considered a multi-dimensional generalization of a two- or three-dimensional spreadsheet. Each cell of the cube holds a number that represents some measure of the business, such as sales, profits, expenses, budget and forecast. The elements of a dimension can be organized as a hierarchy, that is, a set of parent-child relationships, typically where a parent member summarizes its children.
CubeDSL generates the schema file.
The main semantic elements of the CubeDSL are:
- package - The root element that contains all the other elements. A model can contain multiple packages.
- schema - Defines the name of the Mondrian schema file
- import declarations - Used to import external models or even Java classes.
- cube - The abstraction of a Mondrian cube. It contains further elements such as dimension, measure.
- dimension - Defines an attribute, or set of attributes, which is a collection of hierarchies, and by which you can divide measures into sub-categories. For example, you might wish to break down product sales by their color, the gender of the customer and the store in which the product was sold; color, gender, and store are all dimensions.
- hierarchy - Defines a set of members organized into a structure for convenient analysis. For example, the store hierarchy consists of the store name, city, state, and nation. A hierarchy allows you to form intermediate sub-totals: the sub-total for a state is the sum of the sub-totals of all of the cities in that state, each of which is the sum of the sub-totals of all the stores in that city.
- entity - Defines which table and columns of the table the “hierarchy” uses.
- level - Defines a collection of members which have the same distance from the root of the hierarchy.
- property - Defines a member property. The definition is against a level, but the property is available to all members.
- dimensionUsage - Defines the shared dimension usage in a cube.
- measure - The members of a special dimension, a quantity being measured, for example, unit sales of a product, or cost price of inventory items.
Keywords
The CubeDSL generates a schema file in XML file type. It should be defined with the following keywords in a file with the .cube extension:
package, schema
► Syntax:
package <package name> schema <schema name>{
...
}
► Example:
package local.test4doku.cubes schema TEST{
...
}
It generates the following schema code in an XML file:
<Schema name="TEST">
...
</Schema>
dimension, hierarchy, level, property
► Syntax:
dimension <dimension name> {
hierarchy [hierarchy name] [has All] [allMemberName=<all member name>]
[defaultMember=<default member>]{
entity <entity name> key=<key name> {
level <level name> column=<column name> [uniqueMembers];
...
}
}
...
}
Notes:
- If the hierarchy name is not defined, the default value is the dimension name.
- By default, every hierarchy contains a top level called
(ALL)
, which contains a single member called(All {hierarchy name})
. This member is the parent of all other members of the hierarchy and also the default member of the hierarchy. UseallMemberName
to override the default name of theAll
member. - If
hasAll="false"
, the default member of the dimension will be the first member of the first level. UsedefaultMember
to override the default member of the hierarchy.
► Example:
dimension CustomerGenders {
hierarchy allMemberName="All Genders" defaultMember="[All Genders]"{
entity Tcustomer key=cus_id {
level Gender column=gender ;
}
}
}
It generates the following schema code in an XML file:
<Dimension name="CustomerGenders">
<Hierarchy hasAll="false" allMemberName="All Genders" defaultMember="[All
Genders]" primaryKey="CUS_ID">
<Table name="TCUSTOMER" />
<Level name="Gender" column="GENDER" uniqueMembers="false"
type="String"/>
</Hierarchy>
</Dimension>
join
► Syntax:
dimension <dimension name> {
hierarchy [<hierarchy name>] [has All] [allMemberName=<all member name>]
[defaultMember=<default member>] {
entity <entity name> key=<primarykey name> {
level <level name> column=<column name> [uniqueMembers];
...
entity <entity name> [key=<key name>] {
over <leftkey name>;
level <level name> column=<column name> [uniqueMembers];
...
}
}
}
...
}
► Example:
dimension ProductTypes {
hierarchy hasAll {
entity Tproduct key=pro_id {
level proName column=pro_name uniqueMembers;
entity Tproduct_type key=pro_type_id {
over type;
level proTypeName column=pro_type_name uniqueMembers;
}
}
}
}
It generates the following schema code in an XML file:
<Dimension name="ProductTypes">
<Hierarchy hasAll="true" primaryKey="PRO_ID" primaryKeyTable="TPRODUCT">
<Join leftKey="TYPE_ID" rightKey="PRO_TYPE_ID">
<Table name="TPRODUCT" />
<Table name="TPRODUCT_TYPE"/>
</Join>
<Level name="proTypeName" table="TPRODUCT_TYPE"
column="PRO_TYPE_NAME" uniqueMembers="true" type="String"/>
<Level name="proName" table="TPRODUCT" column="PRO_NAME"
uniqueMembers="true" type="String"/>
</Hierarchy>
</Dimension>
Notes:
- The default key is
ID
. - For the reference to an entity, e.g., type in Tproduct_type, the name of the key in the database is automatically defined as
type_id
.
typeTime
► Syntax:
dimension <dimension name> typeTime {
hierarchy <hierarchy name> [has All] [allMemberName=<all member name>]
[defaultMember=<default member>]{
entity <entity name> key=<key name> {
level <level name> column=<column name> [uniqueMembers]
levelType=<type name>;
...
}
}
...
}
Note:
levelType
can have the valuesTime
,TimeSeconds
,TimeMinutes
,TimeHours
,TimeDays
,TimeWeeks
,TimeMonths
,TimeQuarters
,TimeHalfYears
,TimeYears
(see the definitions in the section #Time dimensions)
► Example:
dimension TimeTheDate typeTime {
hierarchy {
entity Ttime_by_day key=the_date {
level Year column=the_year uniqueMembers levelType=TimeYears;
level Quarter column=quarter levelType=TimeQuarters;
level Month column=month_of_year levelType=TimeMonths;
}
}
}
It generates the following schema code in an XML file:
<Dimension name="TimeTheDate" type="TimeDimension">
<Hierarchy hasAll="false" primaryKey="THE_DATE">
<Table name="TTIME_BY_DAY" schema=""/>
<Level name="Year" column="THE_YEAR" uniqueMembers="true"
levelType="TimeYears" type="String"/>
<Level name="Quarter" column="QUARTER" uniqueMembers="false"
levelType="TimeQuarters" type="String"/>
<Level name="Month" column="MONTH_OF_YEAR" uniqueMembers="false"
levelType="TimeMonths" type="Integer"/>
</Hierarchy>
</Dimension>
property
► Syntax:
dimension <dimension name> {
hierarchy <hierarchy name> [has All] [allMemberName=<all member name>]
[defaultMember=<default member>]{
entity <entity name> key=<key name> {
level <level name> column=<column name> [uniqueMembers]{
property <property name> column=<column name>
[type=<type name>];
}
...
}
}
...
}
Note:
- property type can be
Boolean
,Date
,Integer
,Numeric
,String
,Time
orTimeStamp
.
► Example:
dimension ProductTypes {
hierarchy {
entity TproductType key= pro_type_id {
level proTypeId column= pro_type_id uniqueMembers {
property proTypeName column= pro_type_name type=String;
}
}
}
}
It generates the following schema code in an XML file:
<Dimension name="ProductTypes">
<Hierarchy hasAll="false" primaryKey="PRO_TYPE_ID">
<Table name="TPRODUCT_TYPE" schema=""/>
<Level name="proTypeId" column="PRO_TYPE_ID" uniqueMembers="true"
type="Integer">
<Property name="proTypeName" column="PRO_TYPE_NAME"
type="String" />
</Level>
</Hierarchy>
</Dimension>
cube, dimensionUsage, measure
► Syntax:
cube <cube name> [defaultMeasure <measure name>]{
entity <entity name> [key=<key name>]{
dimensionUsage <dimension name> over <key name>;
...
measure <measure name> <aggregator name> column=<column name>
[notVisible];
...
}
...
}
Note:
- Aggregator functions include
avg
,sum
,min
,max
,count
,distinct-count
(see the definitions in the section #Measures ). - Mondrian takes the first measure as the default measure. You can override this behavior by setting
defaultMeasure
.
► Example:
dimension ProductTypes {
hierarchy {
entity TproductType key= pro_type_id {
level proTypeId column= pro_type_id uniqueMembers {
property proTypeName column= pro_type_name type=String;
}
}
}
}
It generates the following schema code in an XML file:
<Dimension name="ProductTypes">
<Hierarchy hasAll="false" primaryKey="PRO_TYPE_ID">
<Table name="TPRODUCT_TYPE" schema=""/>
<Level name="proTypeId" column="PRO_TYPE_ID" uniqueMembers="true"
type="Integer">
<Property name="proTypeName" column="PRO_TYPE_NAME"
type="String" />
</Level>
</Hierarchy>
</Dimension>
Copyright Notice
All rights are reserved by Compex Systemhaus GmbH. In particular, duplications, translations, microfilming, saving and processing in electronic systems are protected by copyright. Use of this manual is only authorized with the permission of Compex Systemhaus GmbH. Infringements of the law shall be punished in accordance with civil and penal laws. We have taken utmost care in putting together texts and images. Nevertheless, the possibility of errors cannot be completely ruled out. The Figures and information in this manual are only given as approximations unless expressly indicated as binding. Amendments to the manual due to amendments to the standard software remain reserved. Please note that the latest amendments to the manual can be accessed through our helpdesk at any time. The contractually agreed regulations of the licensing and maintenance of the standard software shall apply with regard to liability for any errors in the documentation. Guarantees, particularly guarantees of quality or durability can only be assumed for the manual insofar as its quality or durability are expressly stipulated as guaranteed. If you would like to make a suggestion, the Compex Team would be very pleased to hear from you.
(c) 2016-2024 Compex Systemhaus GmbH