Start with 7 free days of training.

Gain instant access to our entire IT training library, free for your first week.
Train anytime on your desktop, tablet, or mobile devices.

Implementing a Data Warehouse with Microsoft SQL Server 2012

This SQL Server video training course with Garth Schulte covers Microsoft’s database management system including designing and implementing dimensions, troubleshooting data integration issues, and implementing master data services....
This SQL Server video training course with Garth Schulte covers Microsoft’s database management system including designing and implementing dimensions, troubleshooting data integration issues, and implementing master data services.

SQL Server business intelligence developers are rock stars in the eyes of an organization's decision makers. Managing and providing high-level data through a data warehouse helps businesses make accurate decisions that save time and money.

This course provides you with the skills you need to create and manage a data warehouse in SQL Server 2012. Learn how to design and implement a data warehouse, implement and automate an ETL solution using SQL Server Integration Services to load warehouse data, and utilize Master Data Services along with Data Quality Services to standardize, enrich, and evolve data over time.

Exam 70-463 is the final step toward attaining the Microsoft MCSA: SQL Server 2012 certification, which completes the pre-requisites toward both MCSE certifications: Data Platform and Business Intelligence. This course helps prepare you for the 70-463 exam and beyond with real-world examples and live demonstrations.

Note: The exam associated with this course was retired December 1, 2015. However, this course still retains value as a training resource. Furthermore, an instructional PDF has been added to the Supplemental Files for your use within the Virtual Lab.

Recommended Experience
  • Experience creating Business Intelligence solutions
  • Experience with data cleansing
  • Experience with Exact Transform Load and data warehouse implementation
Recommended Equipment
  • Microsoft SQL Server 2012
Related Certifications
  • MCSA: SQL Server 2012
  • MCSE: Data Platform
  • MCSE: Business Intelligence
  • MCSM: Data Platform
Related Job Functions
  • IT professionals
  • Database developers
  • Database analysts
  • ETL developers
  • Data warehouse developers
Garth Schulte has been a CBT Nuggets trainer since 2001 and holds a variety of Microsoft and Google certifications spanning system administration, development, databases, cloud, and big data.
 show less
1. Introduction to 70-463: Implementing a Data Warehouse with SQL Server 2012 (21 min)
2. Designing and Implementing Dimensions (45 min)
3. Designing and Implementing Fact Tables (35 min)
4. Introduction to SSIS and Connection Managers (54 min)
5. Designing Data Flow (35 min)
6. Implementing Data Flow (46 min)
7. Managing SSIS Package Execution (34 min)
8. Implementing Script Tasks (31 min)
9. Designing Control Flow (22 min)
10. Implementing Control Flow (47 min)
11. Implementing Package Logic (44 min)
12. Data Loading Options (48 min)
13. Implementing Script Components (38 min)
14. Troubleshooting Data Integration Issues (32 min)
15. Installing and Maintaining SSIS Components (22 min)
16. Implementing Auditing, Logging and Event Handlers (21 min)
17. Deploying SSIS Solutions (26 min)
18. Configuring SSIS Security (24 min)
19. Installing and Maintaining Data Quality Services (24 min)
20. Implementing Data Quality Services (29 min)
21. Implementing Master Data Services (23 min)

Introduction to 70-463: Implementing a Data Warehouse with SQL Server 2012

00:00:00

Introduction to 70-463. Implementing a data warehouse with SQL Server 2012. Hey Everyone. Garth Schulte here. Thank you for letting me be your guide through this series. This is going to be an introductory Nugget to talk about everything 70-463. So we'll start with a series overview and just get a feel for what's going to be required of us, some of the technologies we're going to need to understand and work with in order to be successful in creating data warehouses in SQL Server 2012.

00:00:29

So we'll get familiar with what is a data warehouse. What is OLTP versus OLAP-- OLTP versus OLAP, as we call it. What's relational OLAP versus multi-dimensional OLAP? And then, again, we'll talk about all of the technologies that are going to be required of us to pass the 70-463 exam and work with data warehousing, things like SQL Server Integration Services, master data services, data quality services.

00:00:52

And those are going to be, really, the big ones. That's really the meat of the entire series right there. So once we get a good overview of all those technologies and data warehousing, we'll head over to Microsoft's website and look at the detailed objectives that they have laid out for us and also show you how we can relate those objectives back into the Nuggets in this series.

00:01:08

At the end of this Nugget, we're going to head over to the Virtual Nugget Lab and get our environment set up. So we're just going to install the sample databases. Now, if you're not familiar with the Virtual Nugget Lab, it is where we're going to all our demonstrations.

00:01:20

For this series, it's just a single machine over in the Nugget Lab. For other series, it could be many machines networked together. But the idea is you can work on the same environment that I'm working on. You can gain access to the same exact world that I'll be doing the demonstrations on.

00:01:33

And what's really cool about this is it takes you from just watching these Nuggets to being a part of them. It's like you're sitting next to me. Because now, at the beginning of every Nugget our environments are going to be synced. They're going to be the same.

00:01:45

You see me do a demonstration. You can pause the Nugget and try to do it yourself. Or you could do it at will. It's a great environment for really anything. It's a sandbox for you to learn in, to play in, to follow me in the Nuggets, and really do whatever you want.

00:01:58

So head over to nuggetlab.com or cbtnuggets.com to find out more information on how you can gain access to the virtual Nugget Lab and work on the same machine that I'll be working on in this series. So the 70-463 exam is really focused on data warehousing and the surrounding technologies we can use to load and manage the data inside of a data warehouse.

00:02:18

So first off here, what is a data warehouse? And really, to fully understand what a data warehouse is, we need to look at the big picture. And the big picture really starts down at the detail level in what we call our operational data store. And this is in the OLTP database, an online transaction processing database.

00:02:34

This is a relational database where all the detailed data flows in all day long. So it could be from a website where customers are ordering things all day long. That's all your detailed data, all those orders' detailed data. It could be an internal application where you have a bunch of data entry users that are popping data into the system all day long.

00:02:53

So OLTP databases are our transactional detail data. The other side of the database world is OLAP, O-L-A-P. And that stands for online analytical data processing. And there's actually a couple of different flavors of OLAP out there. The first one is the one that this series really focuses on, which is actually ROLAP, which stands for relational online analytical processing.

00:03:17

And the other one we have is a MOLAP, which stands for multidimensional online analytical processing. First of all, OLAP in general is really the summarized data. It is taking all of our detailed data, rolling it up, and storing it as summarized data in either of these worlds.

00:03:38

So why do we need to do that? Well, imagine your OLTP database is a terabyte, or even a petabyte in size that would equate to millions, if not hundreds of millions of rows. Who wants their users running reports against that? Nobody. Especially not the user.

00:03:55

Because that will make for some seriously long wait times generating reports that have to roll up millions of rows of data. So one of the big reasons to have a data warehouse is to pre-process all that information, load it up into a data warehouse, and run reports against the data warehouse where all that data is already aggregated.

00:04:17

So reports will be much faster, and then we also take that burden off of our OLTP databases. Now, back to the different flavors of OLAP here. ROLAP, relational data warehousing, is much like constructing your standard OLTP database. We're going to use SQL Server Management Studio.

00:04:33

We're going to build tables, and we're going to define columns. We're going to have multiple tables with relationships in them. And so it's very similar to constructing an OLTP database. The difference lies in the design. Transaction processing databases are complex.

00:04:46

We They need to be complex. They need to be highly normalized so they work efficiently with lots of data the data warehousing world with relational data warehouses is simplified. And it needs to be for reporting purposes, so our business intelligence developers and users can make sense of it.

00:05:03

And we're only working with the rolled-up data anyway. So the structures are actually fairly simple. And again, we want them to be that way. Now, with multi-dimensional OLAP, you're going to generate your data warehouse using a tool known as SQL Server Analysis Services.

00:05:18

So this is your multi-dimensional platform in SQL Server. So you'll fire up SQL Server Data Tools. You'll create Analysis Services projects where you'll design your cubes, your measures, your dimensions, your hierarchies, attributes-- the list goes on and on.

00:05:30

And then you'll deploy your project as a database inside of Analysis Services. So the big difference between these two is how the data's stored. Is it going to be stored in the relational world, or is it going to be stored in specialized objects in Analysis Services?

00:05:44

And when it comes time for you to implement a data warehouse in your organization, how do you determine which one to do? And that's going to come down to a lot of factors, because there's pros and cons to both of these worlds. The big pro for this world, the relational world, is that it's easy to do.

00:05:59

And every DBA should know how to build a transaction processing database. So building a data warehouse, if they don't know how, isn't hard to learn, because there's not a lot to it. It's just a different design philosophy. Outside of that, you're still working with tables and columns and relationships, and it's very familiar.

00:06:17

Finding someone in the Analysis Services world will be a little more challenging, because it's a specialized area, and it's a little different to work with than the relational world. On the other side, there's performance. Performance is going to be paramount to those dealing with ridiculously large databases with hundreds of millions, if not billions of rows and petabyte-sized databases.

00:06:39

Because Analysis Services was built for OLAP, so it's going to outperform the relational world. And that's just scratching the service. Again, there's pros and cons with everything, especially when it comes to data warehousing. Now, this series focuses solely in this exam on relational data warehousing.

00:06:58

Analysis Services, since it is specialized, it's down in the MCSE Business Intelligence Certification track, and that is exam 70-466 that focuses solely on Analysis Services and building multi-dimensional models. So the first couple of Nuggets in this series are going to focus on designing and implementing a data warehouse, getting familiar with the concepts and the design philosophies of getting a relational data warehouse up and running.

00:07:23

Then, as I mentioned, a big chunk of this series focuses on SQL Server Integration Services, which is SQL Server's ETL tool. ETL stands for extract, transform, and load. So designing an ETL solution is a big part of a data warehouse, because we need a way to take the data from our transactional systems, our operational data store, and load it into our data warehouse, and then not only that, but automate that process so we can keep the data refreshed inside of our data warehouse.

00:07:53

So the whole extract part of ETL is getting the data out of our system, out of our transaction processing system, which could be an in one database. It could be in multiple databases. It could span flat files on the file system. It could be in Excel files.

00:08:08

It could be in other relational data storage. It could be anywhere. So the extract part is getting the data out of all of those sources and bringing it into SQL Server Integration Services, which we can then use to transform the data into the format that we need to get it into to push it into or load our data warehouse.

00:08:27

So that's what ETL is all about-- extract it, transform it, and push it into our data warehouse. So we'll get very familiar with Integration Services in this series. We're also going to get familiar with how to implement Master Data Services, which is SQL Servers' way to implement master data management.

00:08:42

What is master data management? Well, it's a way that we can create master lists from non-transactional data, such as customers, employees, products. Those are all your non-transactional lists of data. And the whole reason that we would want that kind of stuff is, let's say that we have multiple databases in our application, and they all implement their own customer module.

00:09:05

Well, that's going to cause problems with inconsistencies, right? So Master Data Services gives us a way to create a master listing, keep it consistent across all of our databases. And there's some neat components in Master Data Services that make it easy to work with.

00:09:19

For us on the administration side, we have a web interface that we can use to manage our models and to come up with our business rules surrounding our master lists. For the user there's an Excel add-in that we can have them work with so they can manage their lists from Excel rather than us needing to give them access to SQL Server.

00:09:37

We're also going to get familiar with a brand-new feature here in SQL 2012 called Data Quality Services. And this is a really cool feature. What this allows us to do is enrich our data. So it's one of those things where we're going to make our data better, and our data is going to make us smarter.

00:09:53

And we can do this by correction and deduplication, a couple of ways that we can clean up our data. And it really runs as a service. So it's always running in the background, monitoring our data, profiling our data, correcting our data. So this is one of the things that we're going to create a knowledge base, and we're going to improve our data over time.

00:10:10

And our data is going to improve us, because we're going to understand it better, and we're going to be able to create better business rules around that data. We can also utilize cloud-based reference providers. So with Data Quality Services, we can tap into already-standardized data out there in the cloud, pull it in, and work with it as part of our knowledge base.

00:10:30

We can also have any Data Quality Services from Master Data Services so our lists adhere to our data quality standards. So a couple of great technologies. Really, all of these are great technologies that we're going to get very familiar with here in this series.

00:10:43

Let's take a look at what Microsoft expects from us as far as exam objectives and requirements go. So if you want to get to this page, this is the 70-463 exam page, just Google 70-463. It'll be the first link that pops up. Or you can see the URL up here and just copy that.

00:10:59

And the first tab here just gives us an overview of what the audience profile is, some of the things we'll be expecting to know, and then at the bottom, what kind of credit we get towards what certification. So if you're doing the exams sequentially, 70-461, 462, and 463, this will be the last exam that you'll need to successfully attain your MCSA in SQL Server 2012.

00:11:25

It's also a credit towards the two MCSE certifications, Data Platform and Business Intelligence. If we get into the Skills Measured tab, this will give us a better idea of what to expect. Now, how the Nuggets are laid out in this series, each major bullet point is a Nugget.

00:11:40

So that's how you can relate. If you need to work in a specific area and you're looking at these exam objectives, just look for something like implementing data flow, and then look for that title in the series, implementing data flow. And you'll be able to quickly find what Nugget to watch that will cover all of these objectives.

00:11:58

Now, just to briefly go over the sections here, the beginning of this series, as I mentioned, is going to focus on designing and implementing a data warehouse. We're going to get familiar with dimensions and fact tables and designing our data warehouse structure.

00:12:10

So from there, we're going to start getting familiar with SQL Server Integration Services, where we have a big chunk here on extracting and transforming data. We're going to start with Connection Manager. So I'll get you familiar with SQL Server data tools, packages, and how to define our source and our destination connections.

00:12:26

And I want to get you familiar with a lot of different data providers. Because in the real world or to be working with flat files. You're going to be working with Excel spreadsheets. You're going to be working with Access databases. You're going to be working with Oracle databases.

00:12:38

There's just a ton of databases out there. So rather than just do SQL Server SQL Server, we'll definitely get familiar with multiple data providers. It's good, because every data provider has a different connection string, so it's good to get familiar with a lot of these different connection managers.

00:12:56

So we'll get very familiar with them there. And then we'll take a big look at data flows. Data flow's going to be the main task for transforming data, and taking data when it comes into our package, and trying to get it into a format that we can utilize inside of our data warehouse, where it's going to go into the destination.

00:13:14

So we'll get a good look here at all the different things we can do with the data flow, a lot of the tasks that we can utilize in the data flow to transform data. We'll also take a look, then, at managing SSI's package execution, all the different ways that we can execute a package, from SQL Server Management Studio, to SQL Server Agent, to dtexec, to the dtexec UI, and a lot of different ways here, and even PowerShell here.

00:13:35

So there's a lot of different ways that we can execute packages. And it's good to understand all those ways, because the requirements may call for something different depending on your situation. We'll also take a look at script tasks. And then we'll take a big look at how to load data.

00:13:49

So we'll look at control flow, which is outside of your data flow, and how we control all of our data flows. Because we're not going to have just one data flow. We're going to have many data flows. So how do we handle our flow between all of those points of data?

00:14:03

Well, let's take a look at how to implement package logic using variables and parameters. It's good to keep our packages dynamic. Again, outside of that, it's good to keep our packages' configuration information outside of the packages on the file system.

00:14:18

I love storing all variables and connection string information, everything, outside of the package in an XML file. That way, if we ever need to make changes on the fly, we don't have to open it up in SQL Server Data Tools. We may not even have SQL Server Data Tools if we're on somebody else's machine that we developed a package for.

00:14:36

So it's easier to store all that information in a configuration file where we can just edit the XML. And then the next time the package runs, it'll grab all those values. So we'll take a look at how to do that stuff. We'll also look at, then, how to implement control flow, work with some data load options.

00:14:51

We've got some different strategies that we can use for loading data, so we'll look at how to utilize those, and also work with the script components. So we've got script tasks and script components. We'll look at both of them and how to work with them.

00:15:02

We're also going to take a look at how to configure and deploy SSIS solutions. So we're going to start with troubleshooting data integration issues. Probably one of the more difficult things to do is troubleshoot data issues-- then, on top of that, also package execution issues.

00:15:17

And I say that because I used to work with DTS before it was SSIS. It was Data Transformation Services. And that was even worse to work with. Thankfully, with SSIS we've got a lot of nice tools that we can utilize to help us pinpoint where the issue's coming from.

00:15:33

And the reason is difficult to pinpoint. Data issues-- well, data issues are data issues, and the error messages are never that descriptive. But when it comes to package execution issues, we've got the network to be concerned about, because a lot of times our sources and our destinations aren't on the same machine.

00:15:46

We've got security to worry about. We've got SQL Server Agent and SQL Server to worry about. So there's a lot of different areas that are outside of our control and outside of the package's control. So I'll give you some good real-world tips in here when we get into dealing with data integration and package execution issues.

00:16:03

We'll also take a look here at how to install and maintain SSIS and its components, implementing auditing, logging, and event handling. These are some of those tools that'll really help us pinpoint issue where it's truly coming from and get some more descriptive information on the errors.

00:16:16

We'll also take a look here at how to deploy our SSIS solutions. So do we want to deploy them to the file system? Do we want to deploy them into SQL Server? We'll take a look at a couple of different ways that we can do that. We'll look at security settings, how we can configure package security, amongst other things.

00:16:33

And at the end here, we'll take a look at the data quality solutions, where we'll get very familiar with Data Quality Services and Master Data Services. Why don't we head on over to the Virtual Nugget Lab and get familiar with the machine that we're going to be working on in this series and also get our sample databases up and running.

00:16:50

So here we are in the Virtual Nugget Lab. And if you do have access to the Virtual Nugget Lab and you were to log in for the first Nugget in the series, you would have a clone of where I am right now. And this is a Windows Server 2008 R2 machine. And I installed SQL Server 2012 developer edition, a full installation.

00:17:08

And you can see that by heading into the Start menu, into All Programs, into SQL Server 2012, and you can see all of our tools in here for us. Now, most of this series, we're going to focus on and in SQL Server Management Studio in SQL Server Data Tools.

00:17:22

So I added those, I pinned those into the Start menu. And then as we go through the series, if we work with any more tools, as we will, then we'll add them into here as we go. So we'll be able to access all the tools quickly. The other area that we'll be working heavily in on this virtual machine is in the C drive, the 70-463 Support Files folder.

00:17:44

So as we go through this series, we'll add a folder in here for the specific Nugget that we're working in. And I'll put any support files, scripts, and such inside of there. And you can see right now, I have two data files in the root. One is the AdventureWorks2012 OLTP database.

00:18:00

And the other one is the AdventureWorks2012 data warehouse OLAP database. So our big task for this Nugget is going to be getting these databases attached so we can work with them. So let's fire up SQL Server Management Studio. We'll head down to Start.

00:18:15

And let's just fire it right up from the Start Menu there. And as soon as this gets loaded, we'll get a connection into our default instance. I just installed SQL Server on the default instance, so whenever we connect it, we can just use the name of the PC that we're on here, the virtual machine, which is SQL Nugget.

00:18:31

So we'll hit Connect. And you'll see here, if we expand databases, that we have nothing, just a clean installation with nothing on it. And if we want to attach these databases, we can right click on Database, choose Attach, and now we'll just need to browse-- by hitting the Add button here-- browse to our MDF files, our primary data files.

00:18:50

So let's head up here into 70-463 Support Files. And let's first use the OLTP database. We'll hit OK here. We do not have a LOG file associated with it, so we can remove it. And if we hit OK, that's going to attach it and then create the log file for us for the database.

00:19:07

There we go. So we have the first OLTP database attached. Now let's attach the data warehouse. So we'll right click again. We'll choose Attach. We'll hit Add. We'll browse back into our 70-463 support files, and this time hit our AdventureWorksDW2012 primary data file, hit OK, remove the log file once again, hit OK again, and look at that.

00:19:26

We're up and running. If you're not familiar with the AdventureWorks databases, they're Microsoft-supplied samples that model a fictitious bicycle company. So in the OLTP database, we have tables that track everything from the internals of the company through employees, through customer information, down to the product and inventory information, down to purchasing, and also sales of those products.

00:19:50

In the AdventureWorks data warehouse, we have all of our dimensions and fact tables. So there's a lot of different ways that we provide that we can view the data. So these are good universal samples. And we'll get very familiar with both sides of this world.

00:20:05

And we'll create the ETL process that takes the data from the OLTP database and loads up our data warehouse. In this CBT Nugget, we got an introduction to 70-463, implementing a data warehouse with SQL Server 2012. We started off with a series overview.

00:20:20

I wanted to get you familiar with all the components that we're going to be working with, the bit one being a relational data warehouse, how to design and implement it, and then how to load it by extracting information from an OLTP database, transforming it, and loading our data warehouse, and also how we're going to utilize Master Data Services and Data Quality Services to bring clarity, consistency, standardization, and enrichment to our data.

00:20:46

I also showed you where you could find the exam objectives for the 70-463 exam over Microsoft's website, so you can get some good, detailed information about what's going to be expected of you as far as exam goes. Finally, we jumped into the Virtual Nugget Lab and got familiar with the environment that we're going to be doing all of our demonstrations on and talked about how you can also gain access to the same exact environment that I'll be working in so you can follow along.

00:21:08

Once we got familiar with the environment and the tools that we're going to be working with, we set up those sample databases so that we're ready to go for 70-463. I hope this has been informative for you, and I thank you for viewing.

Designing and Implementing Dimensions

Designing and Implementing Fact Tables

Introduction to SSIS and Connection Managers

Designing Data Flow

Implementing Data Flow

Managing SSIS Package Execution

Implementing Script Tasks

Designing Control Flow

Implementing Control Flow

Implementing Package Logic

Data Loading Options

Implementing Script Components

Troubleshooting Data Integration Issues

Installing and Maintaining SSIS Components

Implementing Auditing, Logging and Event Handlers

Deploying SSIS Solutions

Configuring SSIS Security

Installing and Maintaining Data Quality Services

Implementing Data Quality Services

Implementing Master Data Services

Please help us improve by sharing your feedback on training courses and videos. For customer service questions, please contact our support team. The views expressed in comments reflect those of the author and not of CBT Nuggets. We reserve the right to remove comments that do not adhere to our community standards.

comments powered by Disqus
Intermediate 13 hrs 21 videos

COURSE RATING

Training Features


Practice Exams
These practice tests help you review your knowledge and prepare you for exams.

Virtual Lab
Use a virtual environment to reinforce what you are learning and get hands-on experience.

Offline Training
Our iOS and Android mobile apps offer the ability to download videos and train anytime, anywhere offline.

Accountability Coaching
Develop and maintain a study plan with one-to-one assistance from coaches.

Supplemental Files
Files/materials that supplement the video training.

Speed Control
Play videos at a faster or slower pace.

Bookmarks
Included in this course
Pick up where you left off watching a video.

Notes
Included in this course
Jot down information to refer back to at a later time.

Closed Captions
Follow what the trainers are saying with ease.
Garth Schulte
Nugget trainer since 2002