Difference between revisions of "Cube DSL"

From OS.bee documentation
Jump to: navigation, search
(Aggregate tables)
 
(89 intermediate revisions by 2 users not shown)
Line 1: Line 1:
== Copyright Notice ==
+
 
{{Copyright Notice}}
+
  
 
== Introduction ==
 
== Introduction ==
Line 6: Line 5:
 
=== MDX and OLAP Cube ===
 
=== MDX and OLAP Cube ===
  
'''MDX'''(MultiDimensional eXpressions) is a query language for '''OLAP''' (OnLine Analytical Processing) databases, much like SQL is a query language for relational databases. It is also a calculation language, with syntax similar to spreadsheet formulas.
+
'''MDX'''(MultiDimensional eXpressions) is a query language for '''OLAP''' (OnLine Analytical Processing) databases, much like SQL is a query language for relational databases. It is also a calculation language, with syntax similar to spreadsheet formulas.
  
SQL was designed to query dimensional data structures, called tables, where data are organized in rows and columns. In OLAP, data are organized around multiple measures, dimensions, hierarchies, and levels.
+
SQL was designed to query three-dimensional data structures, called tables, where data are organized in rows, columns and values. In OLAP, data are organized around multiple measures, dimensions, hierarchies, and levels.
  
Similar to the way tables and columns are central to SQL, dimensions, hierarchies, and levels, are the centerpieces of MDX. They are mapping business models into language-specific concepts (e.g. a list of countries will be mapped as an MDX dimension). Measures in MDX are a special case of dimensions and they hold numeric values. The concrete value of a dimension (e.g. Paris) is a member of the dimension and is similar to the value of an SQL table.
+
Similar to the way tables and columns are central to SQL, dimensions, hierarchies, and levels are the centerpieces of MDX. In this way, business models are mapped to language-specific concepts (e.g. a list of countries will be mapped as an MDX dimension). ''Measures'' in MDX are a special case of dimensions and they hold numeric values. The concrete value of a dimension (e.g. Paris) is a member of the dimension and is similar to the value contained in an SQL table.
  
An OLAP Cube is a collection of dimensions indexing a list of measures. E.g. geometrical cube can have three dimensions, whereas an logistical cube can have many more dimensions.  
+
An ''OLAP Cube'' is a collection of dimensions indexing a list of measures. For example, a geometrical cube can have three dimensions, whereas a logistical "cube" can have many more dimensions.  
  
  
  
More information:
+
More information can be found at:
  
 
http://en.wikipedia.org/wiki/MultiDimensional_eXpressions
 
http://en.wikipedia.org/wiki/MultiDimensional_eXpressions
Line 24: Line 23:
 
===Mondrian===
 
===Mondrian===
  
Mondrian is an open source OLAP server, written in Java. It supports the MDX query language and the XML for analysis and olap4j interface specifications. It reads from SQL and other data sources and aggregates data in a memory cache.
+
Mondrian is an open source OLAP server, written in Java. It supports the MDX query language and XML for analysis and olap4j interface specifications. It reads from SQL and other data sources and aggregates data in a memory cache.
  
 
Mondrian is used for:
 
Mondrian is used for:
  
*High performance, interactive analysis of large or small volumes of information
+
*high performance, interactive analysis of large or small volumes of information
*Dimensional exploration of data, for example analyzing sales by product line, by region, by time period
+
*dimensional exploration of data, for example analysis of sales by product line, by region, by time period
*Parsing the MDX language into SQL to retrieve answers to dimensional queries
+
*parsing the MDX language into SQL to retrieve answers to dimensional queries
*High-speed queries through the use of aggregate tables in the RDBMS(Relational DataBase Management System)
+
*high-speed queries through the use of aggregate tables in the RDBMS (Relational DataBase Management System)
*Advanced calculations using the calculation expressions of the MDX language
+
*advanced calculations using the calculation expressions of the MDX language
  
 
====Schema====
 
====Schema====
 
A schema defines a multi-dimensional database. It contains a logical model, consisting of cubes, hierarchies, and members, and a mapping of this model onto a physical model.
 
A schema defines a multi-dimensional database. It contains a logical model, consisting of cubes, hierarchies, and members, and a mapping of this model onto a physical model.
  
The logical model consists of the constructs used to write queries in MDX language: cubes, dimensions, hierarchies, levels, and members.
+
The logical model consists of the constructs used to write queries in the MDX language: cubes, dimensions, hierarchies, levels, and members.
  
 
The physical model is the source of the data which is presented through the logical model. It is typically a star schema, which is a set of tables in a relational database.
 
The physical model is the source of the data which is presented through the logical model. It is typically a star schema, which is a set of tables in a relational database.
Line 43: Line 42:
 
=====Schema file=====
 
=====Schema file=====
  
The schema file defines the dimensions and measures for the cube and maps them to the DB tables. The product and location dimensions contain several levels each. Each level defines its corresponding field in the DB table. The different values of the field become members of the level.  
+
The schema file is an XML file that defines the dimensions and measures for the cube and maps them to the database tables. The product and location dimensions contain several levels each. Each level defines its corresponding field in the database table. The different values of the field become the members of the level.  
  
 
'''Note''': The levels must be specified in the intended order of breakdown.
 
'''Note''': The levels must be specified in the intended order of breakdown.
Line 123: Line 122:
  
 
{| class="wikitable sortable" border="0" cellspacing="0" cellpadding="5"  
 
{| class="wikitable sortable" border="0" cellspacing="0" cellpadding="5"  
!width="250px"|Element!!width="550px"|Description
+
!width="250px"|Element!!width="750px"|Description
 
|-
 
|-
 
|<Cube>||A collection of dimensions and measures, all centered on a fact table.
 
|<Cube>||A collection of dimensions and measures, all centered on a fact table.
 
|-
 
|-
|<VirtualCube>||A cube defined by combining the dimensions and measures of one or more cubes. A measure originating from another cube can be a <CalculatedMember>.
+
|<VirtualCube>||A cube defined by combining the dimensions and measures of one or more cubes. <br>A measure originating from another cube can be a &lt;CalculatedMember&gt;.
 
|-
 
|-
 
|<CubeUsages>||Base cubes that are imported into a virtual cube
 
|<CubeUsages>||Base cubes that are imported into a virtual cube
Line 149: Line 148:
 
|<NameExpression>||SQL expression used to compute the name of a member, in lieu of Level.nameColumn.
 
|<NameExpression>||SQL expression used to compute the name of a member, in lieu of Level.nameColumn.
 
|-
 
|-
|<CaptionExpression>||SQL expression used to compute the caption of a member, in lieu ofLevel.captionColumn.
+
|<CaptionExpression>||SQL expression used to compute the caption of a member, in lieu of Level.captionColumn.
 
|-
 
|-
|<OrdinalExpression>||SQL expression used to sort members of a level, in lieu ofLevel.ordinalColumn.
+
|<OrdinalExpression>||SQL expression used to sort members of a level, in lieu of Level.ordinalColumn.
 
|-
 
|-
|<ParentExpression>||SQL expression used to compute a measure, in lieu ofLevel.parentColumn.
+
|<ParentExpression>||SQL expression used to compute a measure, in lieu of Level.parentColumn.
 
|-
 
|-
 
|<Property>||Member property. The definition is against a hierarchy or level, but the property will be available to all members.
 
|<Property>||Member property. The definition is against a hierarchy or level, but the property will be available to all members.
Line 167: Line 166:
  
 
{| class="wikitable sortable" border="0" cellspacing="0" cellpadding="5"  
 
{| class="wikitable sortable" border="0" cellspacing="0" cellpadding="5"  
!width="250px"|Element!!width="550px"|Description
+
!width="250px"|Element!!width="750px"|Description
 
|-
 
|-
 
|&lt;Table>||Fact or dimension table.
 
|&lt;Table>||Fact or dimension table.
 
|-
 
|-
|<View>||Defines a 'table' using a SQL query, which can have different variants for different underlying databases.
+
|<View>||Defines a (virtual) table using an SQL query, which can have different variants for different underlying databases.
 
|-
 
|-
|<Join>||Defines a 'table' by joining a set of queries.
+
|<Join>||Defines a (virtual) table by joining a set of queries.
 
|-
 
|-
 
|<InlineTable>||Defines a table using an inline dataset.
 
|<InlineTable>||Defines a table using an inline dataset.
Line 213: Line 212:
 
|<CubeGrant>||A set of rights to a cube.
 
|<CubeGrant>||A set of rights to a cube.
 
|-
 
|-
|<HierarchyGrant>||A set of rights to a hierarchy and levels within that hierarchy.
+
|<HierarchyGrant>||A set of rights to a hierarchy and to levels within that hierarchy.
 
|-
 
|-
 
|<MemberGrant>||A set of rights to a member and its children.
 
|<MemberGrant>||A set of rights to a member and its children.
Line 235: Line 234:
 
|<PropertyFormatter>||Property formatter.
 
|<PropertyFormatter>||Property formatter.
 
|-
 
|-
|<Script>||Script fragment to implement an SPI such as user-defined function, member formatter, or cell formatter.
+
|<Script>||Script fragment to implement an SPI such as a user-defined function, a member formatter, or a cell formatter.
 
|}
 
|}
  
Line 263: Line 262:
 
|<Value>||Value of a column in an <InlineTable> dataset.
 
|<Value>||Value of a column in an <InlineTable> dataset.
 
|-
 
|-
|<MeasureExpression>||SQL expression used to compute a measure, in lieu ofMeasure.column.
+
|<MeasureExpression>||SQL expression used to compute a measure, in lieu of Measure.column.
 
|-
 
|-
 
|<SQL>||The SQL expression for a particular database dialect.
 
|<SQL>||The SQL expression for a particular database dialect.
 
|}
 
|}
 +
 +
=====Cubes=====
 +
 +
A Cube is a named collection of Measures and Dimensions. The one thing the Measures and Dimensions have in common is the fact table. The fact table holds the columns from which Measures are calculated, and contains references to the tables which hold the Dimensions.
 +
 +
► '''Syntax''':
 +
 +
<syntaxhighlight lang="xml">
 +
<Cube name="{cube name}">
 +
    <Table name="{table name}"/>
 +
    ...
 +
</Cube>
 +
</syntaxhighlight>
 +
 +
 +
The fact table is defined using the <code>&lt;Table></code> element. If the fact table is not in the default schema, you can provide an explicit schema using the <code>schema</code> attribute, for example:
 +
 +
 +
► '''Syntax''':
 +
 +
<syntaxhighlight lang="xml">
 +
<Table schema="{schema name}" name="{table name}"/>
 +
</syntaxhighlight>
 +
 +
You can also use the <code><View></code> construct to build more complicated SQL statements.<br>
 +
The <code><Join></code> construct is not supported for fact tables.
 +
 +
=====Measures=====
 +
 +
Each Measure has a <code>name</code>, a <code>column</code> in the fact table, and an <code>aggregator</code>.
 +
 +
► '''Syntax''':
 +
 +
<syntaxhighlight lang="xml">
 +
<Measure name="{measure name}" column="{column name}"
 +
    aggregator="{aggregator name}" datatype="{type name}"
 +
    formatString="{format}"/>
 +
</syntaxhighlight>
 +
 +
 +
► '''Example''':
 +
 +
<syntaxhighlight lang="xml">
 +
<Measure name="Store Sales" column="store_sales" aggregator="sum"
 +
    datatype="Numeric" formatString="#,###.00"/>
 +
</syntaxhighlight>
 +
 +
 +
The aggregators are
 +
 +
*'''sum''' - Returns the sum value of the specified set expression.
 +
*'''count''' - Returns the number of elements in the specified set.
 +
*'''min''' - Returns the minimum value of the specified set expression.
 +
*'''max''' - Returns the maximum value of the specified set expression.
 +
*'''avg''' - Computes the average value of the specified set.
 +
*'''distinct-count''' - Returns the number of unique, non-empty elements in the specified set. It has some limitations if your cube contains a parent-child hierarchy.
 +
 +
The optional <code>datatype</code> attribute specifies how cell values are represented in Mondrian's cache, and how they are returned via XML for analysis.
 +
 +
The <code>datatype</code> attribute can have values:
 +
 +
*'''String'''
 +
*'''Integer''', default value for "'''count'''" and "'''distinct-count'''" measures
 +
*'''Numeric''', default value, except for "'''count'''" and "'''distinct-count'''" measures
 +
*'''Boolean'''
 +
*'''Date'''
 +
*'''Time'''
 +
*'''Timestamp'''.
 +
 +
An optional <code>formatString</code> attribute specifies how the value is to be printed. The ',' and '.' symbols are locale-sensitive.
 +
 +
A Measure can have a <code>caption</code> attribute to be returned by the Member.getCaption() method instead of the name. Defining a specific caption does make sense if special letters (e.g., Σ or Π) are to be displayed.
 +
 +
► '''Example''':
 +
 +
<syntaxhighlight lang="xml">
 +
<Measure name="Sum X" column="sum_x" aggregator="sum" caption="&#931; X"/>
 +
</syntaxhighlight>
 +
 +
=====Dimensions, Hierarchies, Levels=====
 +
 +
A member is a point within a Dimension determined by a particular set of attribute values. The ''gender'' Hierarchy has the two members 'M' and 'F'. 'San Francisco', 'California' and 'USA' are all members of the ''store'' Hierarchy.
 +
 +
A Hierarchy is a set of members organized into a structure for convenient analysis. For example, the ''store'' Hierarchy consists of the store name, city, state, and nation. The Hierarchy allows you to form intermediate sub-totals: the sub-total for a state is the sum of the sub-totals of all of the cities in that state, each of which is the sum of the sub-totals of the stores in that city.
 +
 +
A Level is a collection of members which have the same distance from the root of the Hierarchy.
 +
 +
A Dimension is a collection of Hierarchies which discriminate on the same fact-table attribute (say, the day that a sale occurred).
 +
 +
► '''Syntax''':
 +
 +
<syntaxhighlight lang="xml">
 +
<Dimension name="{dimension name}" foreignKey="{key name}">
 +
    <Hierarchy hasAll="true" primaryKey="{key name}">
 +
        <Table name="{table name}"/>
 +
        <Level name="{level name}" column="{column name}"
 +
            uniqueMembers="true"/>
 +
    </Hierarchy>
 +
</Dimension>
 +
</syntaxhighlight>
 +
 +
 +
► '''Example''':
 +
 +
<syntaxhighlight lang="xml">
 +
<Dimension name="Gender" foreignKey="customer_id">
 +
    <Hierarchy hasAll="true" primaryKey="customer_id">
 +
        <Table name="customer"/>
 +
        <Level name="Gender" column="gender" uniqueMembers="true"/>
 +
    </Hierarchy>
 +
</Dimension>
 +
</syntaxhighlight>
 +
 +
This Dimension consists of a single Hierarchy, which consists of a single Level called ''Gender''.
 +
 +
The values for the Dimension come from the ''gender'' column in the ''customer'' table. The ''gender'' column contains two values, 'F' and 'M', so the ''Gender'' Dimension contains the members <code>[Gender].[F]</code> and <code>[Gender].[M]</code>.
 +
 +
For any given sale, the ''Gender'' Dimension is the gender of the customer who made that purchase. This is expressed by joining from the fact table "sales_fact_1997.customer_id" to the Dimension table "customer.customer_id".
 +
 +
 +
 +
► '''Syntax''':
 +
 +
<syntaxhighlight lang="xml">
 +
<Table schema="{schema name}" name="{table name}"/>
 +
</syntaxhighlight>
 +
 +
You can also use the <code><View></code> construct to build more complicated SQL statements. The <code><Join></code> construct is not supported for fact tables.
 +
 +
 +
======"ALL" member======
 +
 +
By default, every Hierarchy contains a top Level called <code>(All)</code>, which contains a single member called <code>(All {hierarchyName})</code>. This member is parent of all other members of the Hierarchy, and thus represents a grand total. It is also the default member of the Hierarchy; that is, the member which is used for calculating cell values when the Hierarchy is not included on an axis or in the slicer. The <code>allMemberName</code> and <code>allLevelName</code> attributes override the default names of the all Level and all member.
 +
 +
If the <code><Hierarchy></code> element has <code>hasAll="false"</code>, the 'all' Level is suppressed. The default member of that Dimension will now be the first member of the first Level; for example, in a ''Time'' Hierarchy, it will be the first year in the Hierarchy. Changing the default member can be confusing, so you should generally set <code>hasAll="true"</code>.
 +
 +
======Time dimensions======
 +
 +
Time dimensions based on year/month/week/day are coded differently in the Mondrian schema due to the MDX time-related functions:
 +
*ParallelPeriod([level[, index[, member]]]) returns a member from a previous/next period with the same position in the level as the specified member.
 +
*PeriodsToDate([level[, member]]) returns the set of members starting from the first (earliest) sibling of the specified member and ending with the specified member itself.
 +
*WTD([member]) returns PeriodsToDate( Week_Level, Member ).
 +
*MTD([member]) returns PeriodsToDate( Month_Level, Member ).
 +
*QTD([member]) returns PeriodsToDate( Quarter_Level, Member ).
 +
*YTD([member]) returns PeriodsToDate( Year_Level, Member ).
 +
*LastPeriod(index[, member]) returns the Level members that stand before the specified member, and including the specified member itself.
 +
 +
Time dimensions have <code>type="TimeDimension"</code>. The role of a Level in a time dimension is indicated by the Level's <code>levelType</code> attribute, whose allowable values include:
 +
 +
{| class="wikitable sortable" border="0" cellspacing="0" cellpadding="5"
 +
!width="250px"|levelType value!!width="750px"|Meaning
 +
|-
 +
|TimeYears||Level is a year.
 +
|-
 +
|TimeQuarters||Level is a quarter.
 +
|-
 +
|TimeMonths||Level is a month.
 +
|-
 +
|TimeWeeks||Level is a week.
 +
|-
 +
|TimeDays||Level is a day.
 +
|}
 +
 +
 +
► '''Example''':
 +
 +
<syntaxhighlight lang="xml">
 +
<Dimension name="Time" type="TimeDimension">
 +
    <Hierarchy hasAll="true" allMemberName="All Periods" primaryKey="dateid">
 +
        <Table name="datehierarchy"/>
 +
        <Level name="Year" column="year" uniqueMembers="true"
 +
            levelType="TimeYears" type="Numeric"/>
 +
        <Level name="Quarter" column="quarter" uniqueMembers="false"
 +
            levelType="TimeQuarters"/>
 +
        <Level name="Month" column="month" uniqueMembers="false"
 +
            ordinalColumn="month" nameColumn="month_name" levelType="TimeMonths"
 +
            type="Numeric"/>
 +
        <Level name="Week" column="week_in_month" uniqueMembers="false"
 +
            levelType="TimeWeeks"/>
 +
        <Level name="Day" column="day_in_month" uniqueMembers="false"
 +
            ordinalColumn="day_in_month" nameColumn="day_name"
 +
            levelType="TimeDays" type="Numeric"/>
 +
    </Hierarchy>
 +
</Dimension>
 +
</syntaxhighlight>
 +
 +
======Multiple hierarchies======
 +
 +
A Dimension can contain more than one Hierarchy. By default, a Hierarchy has the same name as its Dimension.
 +
 +
======Shared dimensions======
 +
 +
A Dimension can contain more than one Hierarchy. By default, a Hierarchy has the same name as its Dimension.
 +
 +
When generating the SQL for a join, Mondrian needs to know which column to join to. If you are joining to a join, then you need to tell it which of the tables in the join that column belongs to (usually, it will be the first table in the join).
 +
Because shared dimensions don't belong to a Cube, you have to give them an explicit table (or other data source). When you use them in a particular Cube, you specify the foreign key. This example shows the ''Store Type'' Dimension being joined to the ''Sales'' Cube using <code>sales_fact_1997.store_id</code> as the <code>foreignKey</code>, and to the ''Warehouse'' Cube using <code>warehouse.warehouse_store_id</code> as the <code>foreignKey</code>:
 +
 +
 +
► '''Example''':
 +
 +
<syntaxhighlight lang="xml">
 +
<Dimension name="Store Type">
 +
    <Hierarchy hasAll="true" primaryKey="store_id">
 +
        <Table name="store"/>
 +
        <Level name="Store Type" column="store_type"
 +
        uniqueMembers="true"/>
 +
    </Hierarchy>
 +
</Dimension>
 +
 +
<Cube name="Sales">
 +
    <Table name="sales_fact_1997"/>
 +
    ...
 +
    <DimensionUsage name="Store Type" source="Store Type"
 +
    foreignKey="store_id"/>
 +
</Cube>
 +
 +
<Cube name="Warehouse">
 +
    <Table name="warehouse"/>
 +
    ...
 +
    <DimensionUsage name="Store Type" source="Store Type"
 +
    foreignKey="warehouse_store_id"/>
 +
</Cube>
 +
</syntaxhighlight>
 +
 +
======Snowflake schema (join)======
 +
 +
A Dimension can be based upon more than one table, provided that there is a well-defined path to join these tables to the fact table. This kind of Dimension is known as a "snowflake", and is defined using the <code><Join></code> operator.
 +
 +
► '''Example''':
 +
 +
<syntaxhighlight lang="xml">
 +
<Cube name="Sales">
 +
...
 +
    <Dimension name="Product" foreignKey="product_id">
 +
        <Hierarchy hasAll="true" primaryKey="product_id"
 +
        primaryKeyTable="product">
 +
            <Join leftKey="product_class_key" rightAlias="product_class"
 +
            rightKey="product_class_id">
 +
                <Table name="product"/>
 +
                <Join leftKey="product_type_id" rightKey="product_type_id">
 +
                    <Table name="product_class"/>
 +
                    <Table name="product_type"/>
 +
                </Join>
 +
            </Join>
 +
            <!-- Level declarations ... -->
 +
        </Hierarchy>
 +
    </Dimension>
 +
</Cube>
 +
</syntaxhighlight>
 +
 +
 +
This defines a "Product" Dimension consisting of three tables:<br>
 +
*The fact table joins to the <code>product</code> table via <code>foreignKey="product_id"</code>, which in turn joins to the
 +
*<code>product_class</code> table via <code>foreignKey="product_class_id"</code>, which in turn joins to the
 +
*<code>product_type</code> table via <code>foreignKey="product_type_id"</code>.
 +
We require a <code><Join></code> element nested within a <code><Join></code> element because <code><Join></code> takes two operands;  the operands can be tables, joins, or even queries.
 +
 +
The arrangement of the tables seems complex, but the simple rule of thumb is to order the tables by the number of rows they contain. The <code>product</code> table has the most rows, so it joins to the fact table and appears first. <code>product_class</code> has fewer rows, and <code>product_type</code>, at the tip of the snowflake, has fewest of all.
 +
 +
Note that the outer <code><Join></code> element has a <code>rightAlias</code> attribute. This is necessary because the right component of the join (the inner <code><Join></code> element) consists of more than one table. No <code>leftAlias</code> attribute is necessary in this case, because the <code>leftKey</code> column unambiguously comes from the <code>product</code> table.
 +
 +
==Cube DSL==
 +
 +
A Cube can be considered a multi-dimensional generalization of a two- or three-dimensional spreadsheet. Each cell of the cube holds a number that represents some measure of the business, such as sales, profits, expenses, budget and forecast. The elements of a dimension can be organized as a hierarchy, that is, a set of parent-child relationships, typically where a parent member summarizes its children.
 +
 +
CubeDSL generates the schema file.
 +
 +
The main semantic elements of the CubeDSL are:
 +
*'''package''' - The root element that contains all the other elements. A model can contain multiple packages.
 +
*'''schema''' - Defines the name of the Mondrian schema file
 +
*'''import''' declarations - Used to import external models or even Java classes.
 +
*'''cube''' - The abstraction of a Mondrian cube. It contains further elements such as dimension, measure.
 +
*'''dimension''' -  Defines an attribute, or set of attributes, which is a collection of hierarchies, and by which you can divide measures into sub-categories. For example, you might wish to break down product sales by their color, the gender of the customer and the store in which the product was sold;  color, gender, and store are all dimensions.
 +
*'''hierarchy''' - Defines a set of members organized into a structure for convenient analysis. For example, the store hierarchy consists of the store name, city, state, and nation. A hierarchy allows you to form intermediate sub-totals:  the sub-total for a state is the sum of the sub-totals of all of the cities in that state, each of which is the sum of the sub-totals of all the stores in that city.
 +
*'''entity''' - Defines which table and columns of the table the “hierarchy” uses.
 +
*'''level''' - Defines a collection of members which have the same distance from the root of the hierarchy.
 +
*'''property''' - Defines a member property. The definition is against a level, but the property is available to all members.
 +
*'''dimensionUsage''' - Defines the shared dimension usage in a cube.
 +
*'''measure''' - The members of a special dimension, a quantity being measured, for example, unit sales of a product, or cost price of inventory items.
 +
 +
===Keywords===
 +
 +
The CubeDSL generates a schema file in XML file type. It should be defined with the following keywords in a file with the ''.cube'' extension:
 +
 +
====package, schema====
 +
► '''Syntax''':
 +
<syntaxhighlight lang="java">
 +
package <package name> schema <schema name>{
 +
  ...
 +
}
 +
</syntaxhighlight>
 +
 +
► '''Example''':
 +
 +
<syntaxhighlight lang="java">
 +
package local.test4doku.cubes schema TEST{
 +
  ...
 +
}
 +
</syntaxhighlight>
 +
 +
It generates the following schema code in an XML file:
 +
 +
<syntaxhighlight lang="xml">
 +
<Schema name="TEST">
 +
  ...
 +
</Schema>
 +
</syntaxhighlight>
 +
 +
====dimension, hierarchy, level, property====
 +
► '''Syntax''':
 +
<syntaxhighlight lang="java">
 +
dimension <dimension name> {
 +
    hierarchy [hierarchy name] [has All] [allMemberName=<all member name>]
 +
    [defaultMember=<default member>]{
 +
        entity <entity name> key=<key name>  {
 +
            level <level name> column=<column name> [uniqueMembers];
 +
            ...
 +
        }
 +
    }
 +
    ...
 +
}
 +
</syntaxhighlight>
 +
 +
'''Notes:'''
 +
*If the hierarchy name is not defined, the default value is the dimension name.
 +
*By default, every hierarchy contains a top level called <code>(ALL)</code>, which contains a single member called <code>(All {hierarchy name})</code>. This member is the parent of all other members of the hierarchy and also the default member of the hierarchy. Use <code>allMemberName</code> to override the default name of the <code>All</code> member.
 +
*If <code>hasAll="false"</code>, the default member of the dimension will be the first member of the first level. Use <code>defaultMember</code> to override the default member of the hierarchy.
 +
 +
 +
► '''Example''':
 +
 +
<syntaxhighlight lang="java">
 +
dimension CustomerGenders {
 +
    hierarchy allMemberName="All Genders" defaultMember="[All Genders]"{
 +
        entity Tcustomer key=cus_id  {
 +
            level Gender column=gender ;
 +
        }
 +
    }
 +
}
 +
</syntaxhighlight>
 +
 +
It generates the following schema code in an XML file:
 +
 +
<syntaxhighlight lang="xml">
 +
<Dimension name="CustomerGenders">
 +
    <Hierarchy hasAll="false" allMemberName="All Genders" defaultMember="[All
 +
    Genders]" primaryKey="CUS_ID">
 +
        <Table name="TCUSTOMER" />
 +
        <Level name="Gender" column="GENDER" uniqueMembers="false"
 +
        type="String"/>
 +
    </Hierarchy>
 +
</Dimension>
 +
</syntaxhighlight>
 +
 +
 +
=====join=====
 +
► '''Syntax''':
 +
<syntaxhighlight lang="java">
 +
dimension <dimension name> {
 +
    hierarchy [<hierarchy name>] [has All] [allMemberName=<all member name>]
 +
    [defaultMember=<default member>] {
 +
        entity <entity name> key=<primarykey name>  {
 +
            level <level name> column=<column name> [uniqueMembers];
 +
            ...
 +
            entity <entity name> [key=<key name>]  {
 +
                over <leftkey name>;
 +
                level <level name> column=<column name> [uniqueMembers];
 +
                ...
 +
            }
 +
        }
 +
    }
 +
    ...
 +
}
 +
</syntaxhighlight>
 +
 +
 +
► '''Example''':
 +
 +
<syntaxhighlight lang="java">
 +
dimension ProductTypes {
 +
    hierarchy hasAll {
 +
        entity Tproduct key=pro_id {
 +
            level proName column=pro_name uniqueMembers;
 +
               
 +
            entity Tproduct_type key=pro_type_id {
 +
                over type;
 +
                level proTypeName column=pro_type_name uniqueMembers;
 +
            }
 +
        }
 +
    }
 +
}
 +
</syntaxhighlight>
 +
 +
It generates the following schema code in an XML file:
 +
 +
<syntaxhighlight lang="xml">
 +
<Dimension name="ProductTypes">
 +
    <Hierarchy hasAll="true" primaryKey="PRO_ID" primaryKeyTable="TPRODUCT">
 +
        <Join leftKey="TYPE_ID" rightKey="PRO_TYPE_ID">
 +
            <Table name="TPRODUCT" />
 +
            <Table name="TPRODUCT_TYPE"/>
 +
        </Join>
 +
        <Level name="proTypeName" table="TPRODUCT_TYPE"
 +
        column="PRO_TYPE_NAME" uniqueMembers="true" type="String"/>
 +
        <Level name="proName" table="TPRODUCT" column="PRO_NAME"
 +
        uniqueMembers="true" type="String"/>
 +
    </Hierarchy>
 +
</Dimension>
 +
</syntaxhighlight>
 +
 +
'''Notes''':
 +
*The default key is <code>ID</code>.
 +
*For the reference to an entity, e.g., ''type'' in Tproduct_type, the name of the key in the database is automatically defined as <code>type_id</code>.
 +
 +
=====typeTime=====
 +
► '''Syntax''':
 +
<syntaxhighlight lang="java">
 +
dimension <dimension name> typeTime {
 +
    hierarchy <hierarchy name> [has All] [allMemberName=<all member name>]
 +
    [defaultMember=<default member>]{
 +
        entity <entity name> key=<key name>  {
 +
            level <level name> column=<column name> [uniqueMembers]
 +
            levelType=<type name>;
 +
            ...
 +
        }
 +
    }
 +
    ...
 +
}
 +
</syntaxhighlight>
 +
'''Note''':
 +
*<code>levelType</code> can have the values <code>Time</code>, <code>TimeSeconds</code>, <code>TimeMinutes</code>, <code>TimeHours</code>, <code>TimeDays</code>, <code>TimeWeeks</code>, <code>TimeMonths</code>, <code>TimeQuarters</code>, <code>TimeHalfYears</code>, <code>TimeYears</code> (see the definitions in the section [[#Time dimensions]])
 +
 +
 +
► '''Example''':
 +
 +
<syntaxhighlight lang="java">
 +
dimension TimeTheDate typeTime {
 +
    hierarchy {
 +
        entity Ttime_by_day key=the_date {
 +
            level Year column=the_year uniqueMembers levelType=TimeYears;
 +
            level Quarter column=quarter levelType=TimeQuarters;
 +
            level Month column=month_of_year levelType=TimeMonths;
 +
        }
 +
    }
 +
}
 +
</syntaxhighlight>
 +
 +
It generates the following schema code in an XML file:
 +
 +
<syntaxhighlight lang="xml">
 +
<Dimension name="TimeTheDate" type="TimeDimension">
 +
    <Hierarchy hasAll="false" primaryKey="THE_DATE">
 +
        <Table name="TTIME_BY_DAY" schema=""/>
 +
        <Level name="Year" column="THE_YEAR" uniqueMembers="true"
 +
            levelType="TimeYears" type="String"/>
 +
        <Level name="Quarter" column="QUARTER" uniqueMembers="false"
 +
            levelType="TimeQuarters" type="String"/>
 +
        <Level name="Month" column="MONTH_OF_YEAR" uniqueMembers="false"
 +
            levelType="TimeMonths" type="Integer"/>
 +
    </Hierarchy>
 +
</Dimension>
 +
</syntaxhighlight>
 +
 +
=====property=====
 +
► '''Syntax''':
 +
<syntaxhighlight lang="java">
 +
dimension <dimension name> {
 +
    hierarchy <hierarchy name> [has All] [allMemberName=<all member name>]
 +
    [defaultMember=<default member>]{
 +
        entity <entity name> key=<key name>  {
 +
            level <level name> column=<column name> [uniqueMembers]{
 +
                property <property name> column=<column name>
 +
                [type=<type name>];
 +
            }
 +
            ...
 +
        }
 +
    }
 +
    ...
 +
}
 +
</syntaxhighlight>
 +
'''Note''':
 +
*'''property type''' can be <code>Boolean</code>, <code>Date</code>, <code>Integer</code>, <code>Numeric</code>, <code>String</code>, <code>Time</code> or <code>TimeStamp</code>.
 +
 +
 +
► '''Example''':
 +
 +
<syntaxhighlight lang="java">
 +
dimension ProductTypes {
 +
    hierarchy {
 +
        entity TproductType key= pro_type_id {
 +
            level proTypeId column= pro_type_id uniqueMembers {
 +
                property proTypeName column= pro_type_name type=String;
 +
            }               
 +
        }
 +
    }
 +
}
 +
</syntaxhighlight>
 +
 +
It generates the following schema code in an XML file:
 +
 +
<syntaxhighlight lang="xml">
 +
<Dimension name="ProductTypes">
 +
    <Hierarchy  hasAll="false" primaryKey="PRO_TYPE_ID">
 +
        <Table name="TPRODUCT_TYPE" schema=""/>
 +
        <Level name="proTypeId" column="PRO_TYPE_ID" uniqueMembers="true"
 +
        type="Integer">
 +
            <Property name="proTypeName" column="PRO_TYPE_NAME"
 +
            type="String" />
 +
        </Level>
 +
    </Hierarchy>
 +
</Dimension>
 +
</syntaxhighlight>
 +
 +
====cube, dimensionUsage, measure====
 +
► '''Syntax''':
 +
<syntaxhighlight lang="java">
 +
cube <cube name> [defaultMeasure <measure name>]{
 +
    entity <entity name> [key=<key name>]{
 +
        dimensionUsage <dimension name> over <key name>;
 +
        ...
 +
        measure <measure name> <aggregator name> column=<column name>
 +
        [notVisible];
 +
        ...
 +
    }
 +
    ...
 +
}
 +
</syntaxhighlight>
 +
'''Note''':
 +
*Aggregator functions include <code>avg</code>, <code>sum</code>, <code>min</code>, <code>max</code>, <code>count</code>, <code>distinct-count</code> (see the definitions in the section [[#Measures]] ).
 +
*Mondrian takes the first measure as the default measure. You can override this behavior by setting <code>defaultMeasure</code>.
 +
 +
 +
 +
► '''Example''':
 +
 +
<syntaxhighlight lang="java">
 +
dimension ProductTypes {
 +
    hierarchy {
 +
        entity TproductType key= pro_type_id {
 +
            level proTypeId column= pro_type_id uniqueMembers {
 +
                property proTypeName column= pro_type_name type=String;
 +
            }               
 +
        }
 +
    }
 +
}
 +
</syntaxhighlight>
 +
 +
It generates the following schema code in an XML file:
 +
 +
<syntaxhighlight lang="xml">
 +
<Dimension name="ProductTypes">
 +
    <Hierarchy  hasAll="false" primaryKey="PRO_TYPE_ID">
 +
        <Table name="TPRODUCT_TYPE" schema=""/>
 +
        <Level name="proTypeId" column="PRO_TYPE_ID" uniqueMembers="true"
 +
        type="Integer">
 +
            <Property name="proTypeName" column="PRO_TYPE_NAME"
 +
            type="String" />
 +
        </Level>
 +
    </Hierarchy>
 +
</Dimension>
 +
</syntaxhighlight>
 +
== Copyright Notice ==
 +
{{Copyright Notice}}

Latest revision as of 19:22, 4 April 2017


Introduction

MDX and OLAP Cube

MDX(MultiDimensional eXpressions) is a query language for OLAP (OnLine Analytical Processing) databases, much like SQL is a query language for relational databases. It is also a calculation language, with syntax similar to spreadsheet formulas.

SQL was designed to query three-dimensional data structures, called tables, where data are organized in rows, columns and values. In OLAP, data are organized around multiple measures, dimensions, hierarchies, and levels.

Similar to the way tables and columns are central to SQL, dimensions, hierarchies, and levels are the centerpieces of MDX. In this way, business models are mapped to language-specific concepts (e.g. a list of countries will be mapped as an MDX dimension). Measures in MDX are a special case of dimensions and they hold numeric values. The concrete value of a dimension (e.g. Paris) is a member of the dimension and is similar to the value contained in an SQL table.

An OLAP Cube is a collection of dimensions indexing a list of measures. For example, a geometrical cube can have three dimensions, whereas a logistical "cube" can have many more dimensions.


More information can be found at:

http://en.wikipedia.org/wiki/MultiDimensional_eXpressions

http://en.wikipedia.org/wiki/OLAP_cube

Mondrian

Mondrian is an open source OLAP server, written in Java. It supports the MDX query language and XML for analysis and olap4j interface specifications. It reads from SQL and other data sources and aggregates data in a memory cache.

Mondrian is used for:

  • high performance, interactive analysis of large or small volumes of information
  • dimensional exploration of data, for example analysis of sales by product line, by region, by time period
  • parsing the MDX language into SQL to retrieve answers to dimensional queries
  • high-speed queries through the use of aggregate tables in the RDBMS (Relational DataBase Management System)
  • advanced calculations using the calculation expressions of the MDX language

Schema

A schema defines a multi-dimensional database. It contains a logical model, consisting of cubes, hierarchies, and members, and a mapping of this model onto a physical model.

The logical model consists of the constructs used to write queries in the MDX language: cubes, dimensions, hierarchies, levels, and members.

The physical model is the source of the data which is presented through the logical model. It is typically a star schema, which is a set of tables in a relational database.

Schema file

The schema file is an XML file that defines the dimensions and measures for the cube and maps them to the database tables. The product and location dimensions contain several levels each. Each level defines its corresponding field in the database table. The different values of the field become the members of the level.

Note: The levels must be specified in the intended order of breakdown.

  • The structure of the XML document is as follows:
    <Schema>
       <Cube>
          <Table>
             <AggName>
                aggElements
             <AggPattern>
                aggElements
             <Dimension>
                <Hierarchy>
                   relation
                <Closure/>
                <Level>
                <KeyExpression>
                   <SQL/>
                <NameExpression>
                   <SQL/>
                <CaptionExpression>
                   <SQL/>
                <OrdinalExpression>
                   <SQL/>
                <ParentExpression>
                   <SQL/>
                <Property>
                   <PropertyExpression>
                      <SQL/>
             <DimensionUsage>
             <Measure>
                <MeasureExpression>
                   <SQL/>
                <CalculatedMemberProperty/>
             <CalculatedMember>
                <Formula/>
                <CalculatedMemberProperty/>
             <NamedSet>
                <Formula/>
             <VirtualCube>
                <CubeUsages>
                   <CubeUsage>
                <VirtualCubeDimension>
                <VirtualCubeMeasure>
             <Role>
                <SchemaGrant>
                   <CubeGrant>
                      <DimensionGrant>
                      <HierarchyGrant>
                         <MemberGrant/>
          <Union>
             <RoleUsage/>
       <UserDefinedFunction/>
       <Parameter/>
    

  • relation ::=
       <Table>
          <SQL/>
       <View>
          <SQL/>
       <InlineTable>
          <ColumnDefs>
             <ColumnDef>
          <Rows>
             <Row>
                <Value>
       <Join>
          relation
    

  • aggElement ::=
       <AggExclude>
       <AggFactCount>
       <AggIgnoreColumn>
       <AggForeignKey>
       <AggMeasure>
       <AggLevel>
    

Logical elements
Element Description
<Cube> A collection of dimensions and measures, all centered on a fact table.
<VirtualCube> A cube defined by combining the dimensions and measures of one or more cubes.
A measure originating from another cube can be a <CalculatedMember>.
<CubeUsages> Base cubes that are imported into a virtual cube
<CubeUsage> Usage of a base cube by a virtual cube.
<VirtualCubeDimension> Usage of a dimension by a virtual cube.
<VirtualCubeMeasure> Usage of a measure by a virtual cube.
<Dimension> Dimension
<DimensionUsage> Usage of a shared dimension by a cube.
<Hierarchy> Hierarchy
<Level> Level of a hierarchy
<KeyExpression> SQL expression used as key of the level, in lieu of a column.
<NameExpression> SQL expression used to compute the name of a member, in lieu of Level.nameColumn.
<CaptionExpression> SQL expression used to compute the caption of a member, in lieu of Level.captionColumn.
<OrdinalExpression> SQL expression used to sort members of a level, in lieu of Level.ordinalColumn.
<ParentExpression> SQL expression used to compute a measure, in lieu of Level.parentColumn.
<Property> Member property. The definition is against a hierarchy or level, but the property will be available to all members.
<Measure> Measure
<CalculatedMember> A member whose value is derived using a formula, defined as part of a cube.
<NamedSet> A set whose value is derived using a formula, defined as part of a cube.
Physical elements
Element Description
<Table> Fact or dimension table.
<View> Defines a (virtual) table using an SQL query, which can have different variants for different underlying databases.
<Join> Defines a (virtual) table by joining a set of queries.
<InlineTable> Defines a table using an inline dataset.
<Closure> Maps a parent-child hierarchy onto a closure table.
Aggregate tables
Element Description
<AggExclude> Exclude a candidate aggregate table by name or pattern matching.
<AggName> Declares an aggregate table to be matched by name.
<AggPattern> Declares a set of aggregate tables by regular expression pattern.
<AggFactCount> Specifies name of the column in the candidate aggregate table which contains the number of fact table rows.
<AggIgnoreColumn> Tells Mondrian to ignore a column in an aggregate table.
<AggForeignKey> Maps foreign key in the fact table to a foreign key column in the candidate aggregate table.
<AggMeasure> Maps a measure to a column in the candidate aggregate table.
<AggLevel> Maps a level to a column in the candidate aggregate table.
Access control
Element Description
<Role> An access-control profile.
<SchemaGrant> A set of rights to a schema.
<CubeGrant> A set of rights to a cube.
<HierarchyGrant> A set of rights to a hierarchy and to levels within that hierarchy.
<MemberGrant> A set of rights to a member and its children.
<Union> Definition of a set of rights as the union of a set of roles.
<RoleUsage> A reference to a Role.
Extensions
Element Description
<UserDefinedFunction> Declares a user-defined function.
<CellFormatter> Cell formatter.
<MemberFormatter> Member formatter.
<PropertyFormatter> Property formatter.
<Script> Script fragment to implement an SPI such as a user-defined function, a member formatter, or a cell formatter.
Miscellaneous
Element Description
<Annotations> Holder for annotations.
<Annotation> User-defined property attached to a metadata element.
<Parameter> Part of the definition of a Hierarchy; passed to a MemberReader, if present.
<CalculatedMemberProperty> Property of a calculated member.
<Formula> Holds the formula text within a <NamedSet> or <CalculatedMember>.
<ColumnDefs> Holder for <ColumnDef> elements.
<ColumnDef> Definition of a column in an <InlineTable> dataset.
<Rows> Holder for <Row> elements.
<Row> Row in an <InlineTable> dataset.
<Value> Value of a column in an <InlineTable> dataset.
<MeasureExpression> SQL expression used to compute a measure, in lieu of Measure.column.
<SQL> The SQL expression for a particular database dialect.
Cubes

A Cube is a named collection of Measures and Dimensions. The one thing the Measures and Dimensions have in common is the fact table. The fact table holds the columns from which Measures are calculated, and contains references to the tables which hold the Dimensions.

Syntax:

<Cube name="{cube name}">
    <Table name="{table name}"/>
    ...
</Cube>


The fact table is defined using the <Table> element. If the fact table is not in the default schema, you can provide an explicit schema using the schema attribute, for example:


Syntax:

<Table schema="{schema name}" name="{table name}"/>

You can also use the <View> construct to build more complicated SQL statements.
The <Join> construct is not supported for fact tables.

Measures

Each Measure has a name, a column in the fact table, and an aggregator.

Syntax:

<Measure name="{measure name}" column="{column name}" 
    aggregator="{aggregator name}" datatype="{type name}" 
    formatString="{format}"/>


Example:

<Measure name="Store Sales" column="store_sales" aggregator="sum" 
    datatype="Numeric" formatString="#,###.00"/>


The aggregators are

  • sum - Returns the sum value of the specified set expression.
  • count - Returns the number of elements in the specified set.
  • min - Returns the minimum value of the specified set expression.
  • max - Returns the maximum value of the specified set expression.
  • avg - Computes the average value of the specified set.
  • distinct-count - Returns the number of unique, non-empty elements in the specified set. It has some limitations if your cube contains a parent-child hierarchy.

The optional datatype attribute specifies how cell values are represented in Mondrian's cache, and how they are returned via XML for analysis.

The datatype attribute can have values:

  • String
  • Integer, default value for "count" and "distinct-count" measures
  • Numeric, default value, except for "count" and "distinct-count" measures
  • Boolean
  • Date
  • Time
  • Timestamp.

An optional formatString attribute specifies how the value is to be printed. The ',' and '.' symbols are locale-sensitive.

A Measure can have a caption attribute to be returned by the Member.getCaption() method instead of the name. Defining a specific caption does make sense if special letters (e.g., Σ or Π) are to be displayed.

Example:

<Measure name="Sum X" column="sum_x" aggregator="sum" caption="&#931; X"/>
Dimensions, Hierarchies, Levels

A member is a point within a Dimension determined by a particular set of attribute values. The gender Hierarchy has the two members 'M' and 'F'. 'San Francisco', 'California' and 'USA' are all members of the store Hierarchy.

A Hierarchy is a set of members organized into a structure for convenient analysis. For example, the store Hierarchy consists of the store name, city, state, and nation. The Hierarchy allows you to form intermediate sub-totals: the sub-total for a state is the sum of the sub-totals of all of the cities in that state, each of which is the sum of the sub-totals of the stores in that city.

A Level is a collection of members which have the same distance from the root of the Hierarchy.

A Dimension is a collection of Hierarchies which discriminate on the same fact-table attribute (say, the day that a sale occurred).

Syntax:

<Dimension name="{dimension name}" foreignKey="{key name}">
    <Hierarchy hasAll="true" primaryKey="{key name}">
        <Table name="{table name}"/>
        <Level name="{level name}" column="{column name}" 
            uniqueMembers="true"/>
    </Hierarchy>
</Dimension>


Example:

<Dimension name="Gender" foreignKey="customer_id">
    <Hierarchy hasAll="true" primaryKey="customer_id">
         <Table name="customer"/>
         <Level name="Gender" column="gender" uniqueMembers="true"/>
    </Hierarchy>
</Dimension>

This Dimension consists of a single Hierarchy, which consists of a single Level called Gender.

The values for the Dimension come from the gender column in the customer table. The gender column contains two values, 'F' and 'M', so the Gender Dimension contains the members [Gender].[F] and [Gender].[M].

For any given sale, the Gender Dimension is the gender of the customer who made that purchase. This is expressed by joining from the fact table "sales_fact_1997.customer_id" to the Dimension table "customer.customer_id".


Syntax:

<Table schema="{schema name}" name="{table name}"/>

You can also use the <View> construct to build more complicated SQL statements. The <Join> construct is not supported for fact tables.


"ALL" member

By default, every Hierarchy contains a top Level called (All), which contains a single member called (All {hierarchyName}). This member is parent of all other members of the Hierarchy, and thus represents a grand total. It is also the default member of the Hierarchy; that is, the member which is used for calculating cell values when the Hierarchy is not included on an axis or in the slicer. The allMemberName and allLevelName attributes override the default names of the all Level and all member.

If the <Hierarchy> element has hasAll="false", the 'all' Level is suppressed. The default member of that Dimension will now be the first member of the first Level; for example, in a Time Hierarchy, it will be the first year in the Hierarchy. Changing the default member can be confusing, so you should generally set hasAll="true".

Time dimensions

Time dimensions based on year/month/week/day are coded differently in the Mondrian schema due to the MDX time-related functions:

  • ParallelPeriod([level[, index[, member]]]) returns a member from a previous/next period with the same position in the level as the specified member.
  • PeriodsToDate([level[, member]]) returns the set of members starting from the first (earliest) sibling of the specified member and ending with the specified member itself.
  • WTD([member]) returns PeriodsToDate( Week_Level, Member ).
  • MTD([member]) returns PeriodsToDate( Month_Level, Member ).
  • QTD([member]) returns PeriodsToDate( Quarter_Level, Member ).
  • YTD([member]) returns PeriodsToDate( Year_Level, Member ).
  • LastPeriod(index[, member]) returns the Level members that stand before the specified member, and including the specified member itself.

Time dimensions have type="TimeDimension". The role of a Level in a time dimension is indicated by the Level's levelType attribute, whose allowable values include:

levelType value Meaning
TimeYears Level is a year.
TimeQuarters Level is a quarter.
TimeMonths Level is a month.
TimeWeeks Level is a week.
TimeDays Level is a day.


Example:

<Dimension name="Time" type="TimeDimension">
    <Hierarchy hasAll="true" allMemberName="All Periods" primaryKey="dateid">
        <Table name="datehierarchy"/>
        <Level name="Year" column="year" uniqueMembers="true" 
            levelType="TimeYears" type="Numeric"/>
        <Level name="Quarter" column="quarter" uniqueMembers="false" 
            levelType="TimeQuarters"/>
        <Level name="Month" column="month" uniqueMembers="false" 
            ordinalColumn="month" nameColumn="month_name" levelType="TimeMonths" 
            type="Numeric"/>
        <Level name="Week" column="week_in_month" uniqueMembers="false" 
            levelType="TimeWeeks"/>
        <Level name="Day" column="day_in_month" uniqueMembers="false" 
            ordinalColumn="day_in_month" nameColumn="day_name" 
            levelType="TimeDays" type="Numeric"/>
    </Hierarchy>
</Dimension>
Multiple hierarchies

A Dimension can contain more than one Hierarchy. By default, a Hierarchy has the same name as its Dimension.

Shared dimensions

A Dimension can contain more than one Hierarchy. By default, a Hierarchy has the same name as its Dimension.

When generating the SQL for a join, Mondrian needs to know which column to join to. If you are joining to a join, then you need to tell it which of the tables in the join that column belongs to (usually, it will be the first table in the join). Because shared dimensions don't belong to a Cube, you have to give them an explicit table (or other data source). When you use them in a particular Cube, you specify the foreign key. This example shows the Store Type Dimension being joined to the Sales Cube using sales_fact_1997.store_id as the foreignKey, and to the Warehouse Cube using warehouse.warehouse_store_id as the foreignKey:


Example:

<Dimension name="Store Type">
    <Hierarchy hasAll="true" primaryKey="store_id">
        <Table name="store"/>
        <Level name="Store Type" column="store_type" 
        uniqueMembers="true"/>
    </Hierarchy>
</Dimension>

<Cube name="Sales">
    <Table name="sales_fact_1997"/>
    ...
    <DimensionUsage name="Store Type" source="Store Type" 
    foreignKey="store_id"/>
</Cube>

<Cube name="Warehouse">
    <Table name="warehouse"/>
    ...
    <DimensionUsage name="Store Type" source="Store Type" 
    foreignKey="warehouse_store_id"/>
</Cube>
Snowflake schema (join)

A Dimension can be based upon more than one table, provided that there is a well-defined path to join these tables to the fact table. This kind of Dimension is known as a "snowflake", and is defined using the <Join> operator.

Example:

<Cube name="Sales">
...
    <Dimension name="Product" foreignKey="product_id">
        <Hierarchy hasAll="true" primaryKey="product_id" 
        primaryKeyTable="product">
            <Join leftKey="product_class_key" rightAlias="product_class" 
            rightKey="product_class_id">
                <Table name="product"/>
                <Join leftKey="product_type_id" rightKey="product_type_id">
                    <Table name="product_class"/>
                    <Table name="product_type"/>
                </Join>
            </Join>
            <!-- Level declarations ... -->
        </Hierarchy>
    </Dimension>
</Cube>


This defines a "Product" Dimension consisting of three tables:

  • The fact table joins to the product table via foreignKey="product_id", which in turn joins to the
  • product_class table via foreignKey="product_class_id", which in turn joins to the
  • product_type table via foreignKey="product_type_id".

We require a <Join> element nested within a <Join> element because <Join> takes two operands; the operands can be tables, joins, or even queries.

The arrangement of the tables seems complex, but the simple rule of thumb is to order the tables by the number of rows they contain. The product table has the most rows, so it joins to the fact table and appears first. product_class has fewer rows, and product_type, at the tip of the snowflake, has fewest of all.

Note that the outer <Join> element has a rightAlias attribute. This is necessary because the right component of the join (the inner <Join> element) consists of more than one table. No leftAlias attribute is necessary in this case, because the leftKey column unambiguously comes from the product table.

Cube DSL

A Cube can be considered a multi-dimensional generalization of a two- or three-dimensional spreadsheet. Each cell of the cube holds a number that represents some measure of the business, such as sales, profits, expenses, budget and forecast. The elements of a dimension can be organized as a hierarchy, that is, a set of parent-child relationships, typically where a parent member summarizes its children.

CubeDSL generates the schema file.

The main semantic elements of the CubeDSL are:

  • package - The root element that contains all the other elements. A model can contain multiple packages.
  • schema - Defines the name of the Mondrian schema file
  • import declarations - Used to import external models or even Java classes.
  • cube - The abstraction of a Mondrian cube. It contains further elements such as dimension, measure.
  • dimension - Defines an attribute, or set of attributes, which is a collection of hierarchies, and by which you can divide measures into sub-categories. For example, you might wish to break down product sales by their color, the gender of the customer and the store in which the product was sold; color, gender, and store are all dimensions.
  • hierarchy - Defines a set of members organized into a structure for convenient analysis. For example, the store hierarchy consists of the store name, city, state, and nation. A hierarchy allows you to form intermediate sub-totals: the sub-total for a state is the sum of the sub-totals of all of the cities in that state, each of which is the sum of the sub-totals of all the stores in that city.
  • entity - Defines which table and columns of the table the “hierarchy” uses.
  • level - Defines a collection of members which have the same distance from the root of the hierarchy.
  • property - Defines a member property. The definition is against a level, but the property is available to all members.
  • dimensionUsage - Defines the shared dimension usage in a cube.
  • measure - The members of a special dimension, a quantity being measured, for example, unit sales of a product, or cost price of inventory items.

Keywords

The CubeDSL generates a schema file in XML file type. It should be defined with the following keywords in a file with the .cube extension:

package, schema

Syntax:

package <package name> schema <schema name>{
   ...
}

Example:

package local.test4doku.cubes schema TEST{
   ...
}

It generates the following schema code in an XML file:

<Schema name="TEST">
   ...
</Schema>

dimension, hierarchy, level, property

Syntax:

dimension <dimension name> {
    hierarchy [hierarchy name] [has All] [allMemberName=<all member name>]
    [defaultMember=<default member>]{
        entity <entity name> key=<key name>  {
            level <level name> column=<column name> [uniqueMembers];
            ...
        }
    }
    ...
}

Notes:

  • If the hierarchy name is not defined, the default value is the dimension name.
  • By default, every hierarchy contains a top level called (ALL), which contains a single member called (All {hierarchy name}). This member is the parent of all other members of the hierarchy and also the default member of the hierarchy. Use allMemberName to override the default name of the All member.
  • If hasAll="false", the default member of the dimension will be the first member of the first level. Use defaultMember to override the default member of the hierarchy.


Example:

dimension CustomerGenders {
    hierarchy allMemberName="All Genders" defaultMember="[All Genders]"{
        entity Tcustomer key=cus_id  {
            level Gender column=gender ;
        }
    }
}

It generates the following schema code in an XML file:

<Dimension name="CustomerGenders">
    <Hierarchy hasAll="false" allMemberName="All Genders" defaultMember="[All 
    Genders]" primaryKey="CUS_ID">
        <Table name="TCUSTOMER" />
        <Level name="Gender" column="GENDER" uniqueMembers="false" 
        type="String"/>
    </Hierarchy>
</Dimension>


join

Syntax:

dimension <dimension name> {
    hierarchy [<hierarchy name>] [has All] [allMemberName=<all member name>] 
    [defaultMember=<default member>] {
        entity <entity name> key=<primarykey name>  {
            level <level name> column=<column name> [uniqueMembers];
            ...
            entity <entity name> [key=<key name>]  {
                over <leftkey name>;
                level <level name> column=<column name> [uniqueMembers];
                ...
            }
        }
    }
    ...
}


Example:

dimension ProductTypes {
    hierarchy hasAll {
        entity Tproduct key=pro_id {
            level proName column=pro_name uniqueMembers;
                
            entity Tproduct_type key=pro_type_id {
                over type;
                level proTypeName column=pro_type_name uniqueMembers;
            }
        }
    }
}

It generates the following schema code in an XML file:

<Dimension name="ProductTypes">
    <Hierarchy hasAll="true" primaryKey="PRO_ID" primaryKeyTable="TPRODUCT">
        <Join leftKey="TYPE_ID" rightKey="PRO_TYPE_ID">
            <Table name="TPRODUCT" />
            <Table name="TPRODUCT_TYPE"/>
        </Join>
        <Level name="proTypeName" table="TPRODUCT_TYPE" 
        column="PRO_TYPE_NAME" uniqueMembers="true" type="String"/>
        <Level name="proName" table="TPRODUCT" column="PRO_NAME" 
        uniqueMembers="true" type="String"/>
    </Hierarchy>
</Dimension>

Notes:

  • The default key is ID.
  • For the reference to an entity, e.g., type in Tproduct_type, the name of the key in the database is automatically defined as type_id.
typeTime

Syntax:

dimension <dimension name> typeTime {
    hierarchy <hierarchy name> [has All] [allMemberName=<all member name>] 
    [defaultMember=<default member>]{
        entity <entity name> key=<key name>  {
            level <level name> column=<column name> [uniqueMembers] 
            levelType=<type name>;
            ...
        }
    }
    ...
}

Note:

  • levelType can have the values Time, TimeSeconds, TimeMinutes, TimeHours, TimeDays, TimeWeeks, TimeMonths, TimeQuarters, TimeHalfYears, TimeYears (see the definitions in the section #Time dimensions)


Example:

dimension TimeTheDate typeTime { 
    hierarchy { 
        entity Ttime_by_day key=the_date {
            level Year column=the_year uniqueMembers levelType=TimeYears;
            level Quarter column=quarter levelType=TimeQuarters;
            level Month column=month_of_year levelType=TimeMonths;
        }
    }
}

It generates the following schema code in an XML file:

<Dimension name="TimeTheDate" type="TimeDimension">
    <Hierarchy hasAll="false" primaryKey="THE_DATE">
        <Table name="TTIME_BY_DAY" schema=""/>
        <Level name="Year" column="THE_YEAR" uniqueMembers="true" 
            levelType="TimeYears" type="String"/>
        <Level name="Quarter" column="QUARTER" uniqueMembers="false" 
            levelType="TimeQuarters" type="String"/>
        <Level name="Month" column="MONTH_OF_YEAR" uniqueMembers="false" 
            levelType="TimeMonths" type="Integer"/>
    </Hierarchy>
</Dimension>
property

Syntax:

dimension <dimension name> {
    hierarchy <hierarchy name> [has All] [allMemberName=<all member name>] 
    [defaultMember=<default member>]{
        entity <entity name> key=<key name>  {
            level <level name> column=<column name> [uniqueMembers]{
                property <property name> column=<column name> 
                [type=<type name>];
            }
            ...
        }
    }
    ...
}

Note:

  • property type can be Boolean, Date, Integer, Numeric, String, Time or TimeStamp.


Example:

dimension ProductTypes {
    hierarchy {
        entity TproductType key= pro_type_id {
            level proTypeId column= pro_type_id uniqueMembers {
                property proTypeName column= pro_type_name type=String;
            }                
        }
    }
}

It generates the following schema code in an XML file:

<Dimension name="ProductTypes">
    <Hierarchy  hasAll="false" primaryKey="PRO_TYPE_ID">
        <Table name="TPRODUCT_TYPE" schema=""/>
        <Level name="proTypeId" column="PRO_TYPE_ID" uniqueMembers="true" 
        type="Integer">
            <Property name="proTypeName" column="PRO_TYPE_NAME" 
            type="String" />
        </Level>
    </Hierarchy>
</Dimension>

cube, dimensionUsage, measure

Syntax:

cube <cube name> [defaultMeasure <measure name>]{
    entity <entity name> [key=<key name>]{
        dimensionUsage <dimension name> over <key name>;
        ...
        measure <measure name> <aggregator name> column=<column name> 
        [notVisible];
        ...
    }
    ...
}

Note:

  • Aggregator functions include avg, sum, min, max, count, distinct-count (see the definitions in the section #Measures ).
  • Mondrian takes the first measure as the default measure. You can override this behavior by setting defaultMeasure.


Example:

dimension ProductTypes {
    hierarchy {
        entity TproductType key= pro_type_id {
            level proTypeId column= pro_type_id uniqueMembers {
                property proTypeName column= pro_type_name type=String;
            }                
        }
    }
}

It generates the following schema code in an XML file:

<Dimension name="ProductTypes">
    <Hierarchy  hasAll="false" primaryKey="PRO_TYPE_ID">
        <Table name="TPRODUCT_TYPE" schema=""/>
        <Level name="proTypeId" column="PRO_TYPE_ID" uniqueMembers="true" 
        type="Integer">
            <Property name="proTypeName" column="PRO_TYPE_NAME" 
            type="String" />
        </Level>
    </Hierarchy>
</Dimension>

Copyright Notice

All rights are reserved by Compex Systemhaus GmbH. In particular, duplications, translations, microfilming, saving and processing in electronic systems are protected by copyright. Use of this manual is only authorized with the permission of Compex Systemhaus GmbH. Infringements of the law shall be punished in accordance with civil and penal laws. We have taken utmost care in putting together texts and images. Nevertheless, the possibility of errors cannot be completely ruled out. The Figures and information in this manual are only given as approximations unless expressly indicated as binding. Amendments to the manual due to amendments to the standard software remain reserved. Please note that the latest amendments to the manual can be accessed through our helpdesk at any time. The contractually agreed regulations of the licensing and maintenance of the standard software shall apply with regard to liability for any errors in the documentation. Guarantees, particularly guarantees of quality or durability can only be assumed for the manual insofar as its quality or durability are expressly stipulated as guaranteed. If you would like to make a suggestion, the Compex Team would be very pleased to hear from you.

(c) 2016-2024 Compex Systemhaus GmbH