SSAS 04: Data Mining Tutorials

Microsoft SQL Server Analysis Services makes it easy to create data mining solutions using wizards and integrated visualizations. Particularly if you are new to machine learning, the tools in Analysis Services are an easy way to design, train, and explore data mining models. The data in your models can be stored in a cube, relational database, or any other source support by Analysis Services. After creating a model, you can put it into production by accessing the model to create predictions using prediction multiple clients, including Integration Services and ASP.NET.

Tutorials

Basic Data Mining Tutorial (SQL Server 2014) – This tutorial walks you through a targeted mailing scenario. It demonstrates how to use the data mining algorithms, mining model viewers, and data mining tools that are included in Analysis Services. You will build three data mining models to answer practical business questions while learning data mining concepts and tools.

Intermediate Data Mining Tutorial (SQL Server 2014) – This tutorial contains a collection of lessons that introduce more advanced data mining concepts and techniques such as, forecasting, market basket analysis, neural networks and logistic regression, and sequence clustering.

DMX Tutorials (SQL Server 2014) – The Data Mining Extensions (DMX) query language has syntax like that of SQL but can be used to create, query, and manage predictive models stored in Analysis Services. These tutorials demonstrate how to create a new mining structure and mining models by using the DMX language, and how to create DMX prediction queries for use in applications.

See Also

Data Mining Solutions
Microsoft SQL Server Data Mining resources
Creating and Querying Data Mining Models with DMX: Tutorials (Analysis Services – Data Mining)

SSAS 03: Multidimensional Modeling Tutorial

This tutorial describes how to use SQL Server Data Tools to develop and deploy an Analysis Services project, using the fictitious company Adventure Works Cycles for all examples.

Lessons

This tutorial includes the following lessons.

Lesson Estimated time to complete
Lesson 1: Defining a Data Source View within an Analysis Services Project 15 minutes
Lesson 2: Defining and Deploying a Cube 30 minutes
Lesson 3: Modifying Measures, Attributes and Hierarchies 45 minutes
Lesson 4: Defining Advanced Attribute and Dimension Properties 120 minutes
Lesson 5: Defining Relationships Between Dimensions and Measure Groups 45 minutes
Lesson 6: Defining Calculations 45 minutes
Lesson 7: Defining Key Performance Indicators (KPIs) 30 minutes
Lesson 8: Defining Actions 30 minutes
Lesson 9: Defining Perspectives and Translations 30 minutes
Lesson 10: Defining Administrative Roles 15 minutes

What you learn

In this tutorial, you will learn the following:

  • How to define data sources, data source views, dimensions, attributes, attribute relationships, hierarchies, and cubes in an Analysis Services project within SQL Server Data Tools.
  • How to view cube and dimension data by deploying the Analysis Services project to an instance of Analysis Services, and how to then process the deployed objects to populate them with data from the underlying data source.
  • How to modify the measures, dimensions, hierarchies, attributes, and measure groups in the Analysis Services project, and how to then deploy the incremental changes to the deployed cube on the development server.
  • How to define calculations, Key Performance Indicators (KPIs), actions, perspectives, translations, and security roles within a cube.

A scenario description accompanies this tutorial so that you can better understand the context for these lessons. For more information, see Analysis Services Tutorial Scenario.

Prerequisites

You will need sample data, sample project files, and software to complete all of the lessons in this tutorial. For instructions on how to find and install the prerequisites for this tutorial, see Install Sample Data and Projects for the Analysis Services Multidimensional Modeling Tutorial.

Additionally, the following permissions must be in place to successfully complete this tutorial:

  • You must be a member of the Administrators local group on the Analysis Services computer or be a member of the server administration role in the instance of Analysis Services.
  • You must have Read permissions in the AdventureWorksDW sample database. This sample database is valid for the SQL Server 2017 release.

SSAS 02: Tabular Modelling Tutorial

This tutorial provides lessons on how to create and deploy a tabular model at the 1400 compatibility level. If you’re new to Analysis Services and tabular modeling, completing this tutorial is the quickest way to learn how to create and deploy a basic tabular model by using Visual Studio. Once you have the prerequisites in-place, it should take two to three hours to complete.

Lessons

This tutorial includes the following lessons:

Supplemental lessons

These lessons are not required to complete the tutorial, but can be helpful in better understanding advanced tabular model authoring features.

Lesson Estimated time to complete
Detail Rows 10 minutes
Dynamic security 30 minutes
Ragged hierarchies 20 minutes

What you learn

  • How to create a new tabular model project at the 1400 compatibility level in Visual Studio with SSDT.
  • How to import data from a relational database into a tabular model project workspace database.
  • How to create and manage relationships between tables in the model.
  • How to create calculated columns, measures, and Key Performance Indicators that help users analyze critical business metrics.
  • How to create and manage perspectives and hierarchies that help users more easily browse model data by providing business and application-specific viewpoints.
  • How to create partitions that divide table data into smaller logical parts that can be processed independent from other partitions.
  • How to secure model objects and data by creating roles with user members.
  • How to deploy a tabular model to an Azure Analysis Services server or SQL Server 2017 Analysis Services server by using SSDT.

Prerequisites

To complete this tutorial, you need:

Scenario

This tutorial is based on Adventure Works Cycles, a fictitious company. Adventure Works is a large, multinational manufacturing company that produces and distributes bicycles, parts, and accessories for commercial markets in North America, Europe, and Asia. The company employs 500 workers. Additionally, Adventure Works employs several regional sales teams throughout its market base. Your project is to create a tabular model for sales and marketing users to analyze Internet sales data in the AdventureWorksDW database.

To complete the tutorial, you must complete various lessons. In each lesson, there are tasks. Completing each task in order is necessary for completing the lesson. While in a particular lesson there may be several tasks that accomplish a similar outcome, but how you complete each task is slightly different. This method shows there is often more than one way to complete a task, and to challenge you by using skills you’ve learned in previous lessons and tasks.

The purpose of the lessons is to guide you through authoring a basic tabular model by using many of the features included in SSDT. Because each lesson builds upon the previous lesson, you should complete the lessons in order.

 

 

SSAS 01 : SQL Server Analysis Services Tutorial Scenario

 


 

This tutorial is based on Adventure Works Cycles, a fictitious company. Adventure Works Cycles is a large, multinational manufacturing company that produces and distributes metal and composite bicycles to commercial markets in North America, Europe, and Asia. The headquarters for Adventure Works Cycles is Bothell, Washington, where the company employs 500 workers. Additionally, Adventure Works Cycles employs several regional sales teams throughout its market base.

In recent years, Adventure Works Cycles bought a small manufacturing plant, Importadores Neptuno, which is located in Mexico. Importadores Neptuno manufactures several critical subcomponents for the Adventure Works Cycles product line. These subcomponents are shipped to the Bothell location for final product assembly. In 2005, Importadores Neptuno became the sole manufacturer and distributor of the touring bicycle product group.

Following a successful fiscal year, Adventure Works Cycles now wants to broaden its market share by targeting advertising to its best customers, extending product availability through an external Web site, and reducing the cost of sales by reducing production costs.

Current Analysis Environment

To support the data analysis needs of the sales and marketing teams and of senior management, the company currently takes transactional data from the AdventureWorks2012 database, and non-transactional information such as sales quotas from spreadsheets, and consolidates this information into the AdventureWorksDW2012 relational data warehouse. However, the relational data warehouse presents the following challenges:

  • Reports are static. Users have no way to interactively explore the data in the reports to obtain more detailed information, such as they could do with a Microsoft Office Excel pivot table. Although the existing set of predefined reports is sufficient for many users, more advanced users need direct query access to the database for interactive queries and specialized reports. However, because of the complexity of the AdventureWorksDW2012 database, too much time is needed for such users to master how to create effective queries.
  • Query performance is widely variable. For example, some queries return results very quickly, in only a few seconds, while other queries take several minutes to return.
  • Aggregate tables are difficult to manage. In an attempt to improve query response times, the data warehouse team at Adventure Works built several aggregate tables in the AdventureWorksDW2012 database. For example, they built a table that summarizes sales by month. However, while these aggregate tables greatly improve query performance, the infrastructure that they built to maintain the tables over time is fragile and prone to errors.
  • Complex calculation logic is buried in report definitions and is difficult to share between reports. Because this business logic is generated separately for each report, summary information sometimes is different between reports. Therefore, management has limited confidence in the data warehouse reports.
  • Users in different business units are interested in different views of the data. Each group is distracted and confused by data elements that are irrelevant to them.
  • Calculation logic is particularly challenging for users who need specialized reports. Because such users must define the calculation logic separately for each report, there is no centralized control over how the calculation logic is defined. For example, some users know that they should use basic statistical techniques such as moving averages, but they do not know how to construct such calculations and so do not use these techniques.
  • It is difficult to combine related sets of information. Specialized queries that combine two sets of related information, such as sales and sales quotas, are difficult for business users to construct. Such queries overwhelmed the database, so the company requires that users request cross-subject-area sets of data from the data warehouse team. As a result, only a handful of predefined reports have been defined that combine data from multiple subject areas. Additionally, users are reluctant to try to modify these reports because of their complexity.
  • Reports are focused primarily on business information in the United States. Users in the non-U.S. subsidiaries are very dissatisfied with this focus, and want to be able to view reports in different currencies and different languages.
  • Information is difficult to audit. The Finance department currently uses the AdventureWorksDW2012 database only as a source of data from which to query in bulk. They then download the data into individual spreadsheets, and spend significant time preparing the data and manipulating the spreadsheets. The corporate financial reports are therefore difficult to prepare, audit, and manage across the company.

The Solution

The data warehouse team recently performed a design review of the current analysis system. The review included a gap analysis of current issues and future demands. The data warehouse team determined that the AdventureWorksDW2012 database is a well-designed dimensional database with conformed dimensions and surrogate keys. Conformed dimensions enable a dimension to be used in multiple data marts, such as a time dimension or a product dimension. Surrogate keys are artificial keys that link dimension and fact tables and that are used to ensure uniqueness and to improve performance. Moreover, the data warehouse team determined that there currently are no significant problems with the loading and management of the base tables in the AdventureWorksDW2012 database. The team has therefore decided to use Microsoft Analysis Services to accomplish the following:

  • Provide unified data access through a common metadata layer for analytical analysis and reporting.
  • Simplify users’ view of data, speeding the development of both interactive and predefined queries and predefined reports.
  • Correctly construct queries that combine data from multiple subject areas.
  • Manage aggregates.
  • Store and reuse complex calculations.
  • Present a localized experience to business users outside the United States.

The lessons in the Analysis Services tutorial provide guidance in building a cube database that meets all of these goals. To get started, continue to the first lesson: SSAS 02: Create a New Tabular Model Project.


Analysis Services is an analytical data engine used in decision support and business analytics. It provides enterprise-grade semantic data models for business reports and client applications such as Power BI, Excel, Reporting Services reports, and other data visualization tools.

A typical workflow includes creating a tabular or multidimensional data model project in Visual Studio, deploying the model as a database to a server instance, setting up recurring data processing, and assigning permissions to allow data access by end-users. When it’s ready to go, your semantic data model can be accessed by client applications supporting Analysis Services as a data source.

Analysis Services is available in two different platforms:

  1. Azure Analysis Services – Supports tabular models at the 1200 and higher compatibility levels. DirectQuery, partitions, row-level security, bi-directional relationships, and translations are all supported. To learn more, see Azure Analysis Services.
  2. SQL Server Analysis Services – Supports tabular models at all compatibility levels, multidimensional models, data mining, and Power Pivot for SharePoint.

The series of articles are mostly for SQL Server Analysis Services. However, at least for tabular models, how you create and deploy your tabular model projects is much the same, regardless of the platform you’re using. Check out these sections to learn more:

In general, Azure Analysis Services documentation is included with Azure documentation. If you’re interested in having your tabular models in the cloud, it’s best to start there.