IT professionals, data analysts, and anyone whose work involves making sense of data rely on tools to get things done. At one end are files in a folder on your Windows desktop with Excel spreadsheets and Word documents. On the other end are complex data models forecasting the future being fed by terabytes of data loaded in real-time by multi-million dollar investments in infrastructure. Your job is to figure out how best to get the job done.
There are many options for tools in the database and business intelligence industry. To name a few, there’s Microsoft SQL Server/BIDS, Oracle/BI, IBM/Cognos, Teradata, and Sybase/IQ. The choice of which to choose, for the IT and data professional, comes down to experience and market demand. I’ve chosen to learn as much as I can about Microsoft’s offerings because of my five years of experience with the suite, the active community surrounding the tools, and the potential to get more work in the field using these skills.
Why SQL Server?
Microsoft SQL Server is undoubtedly one of the major players in the database industry. It has steadily improved since its Sybase roots in the early 90s and has attracted an involved community of users, developers, and administrators. Bundled with many packages of SQL Server is the Microsoft Business Intelligence Developer Studio, abbreviated as BIDS, which serves numerous important functions in managing, processing, analyzing, and reporting on data stored both in SQL Server databases and elsewhere. BIDS offers great value for business and provides a vast opportunity for professionals to develop important skills.
The Components of BIDS
BIDS is comprised of three major types of projects – Integration Services (SSIS), Reporting Services (SSRS), and Analysis Services (SSAS).
SSIS is a set of tools that allows developers to move and process data in an intelligent and timely manner. The basic idea is to take data from one location and transfer it to another while making necessary transformations to the data in between. This process of extracting from a source, transforming in between, and loading the data to a destination is abbreviated as an often used term called ETL. One or more ETL procedures that can grouped together into a package and these packages can then be run to perform the ETLs’ functions.
Furthermore, database administrators can use SQL Server’s built-in scheduling tool, called SQL Agent, to schedule SSIS packages to run at certain times of the day. This allows for scheduling of a batch of ETL packages during off-business hours, such as in the middle of the night.
SSRS is a set of tools to develop and view reports based off data sources. Because reports, whether tabular or graphical, are so important to decision makers in businesses, developing reports and supplying them in an accurate and timely manner is a cornerstone of business intelligence. SSRS provides an interface to pull data from data sources into what are called datasets. Reports are then formatted based on one or more datasets. Reporting Services also provides a website-based viewing functionality where reports can be viewed based on a user’s reporting access permissions.
It should be noted that while the SSRS report viewing functionality is very useful, many business opt for other user portals such as custom built solutions or Microsoft’s SharePoint suite.
SSAS provides functionality which I like to group into two sections – multidimensional modeling and analysis, and data mining. Multidimensional modeling boils down to the development of measures and dimensions organized into what is called a cube. If you consider a database table to be a square, with its two dimensions row and column and its value per row & column intersection, then a multidimensional cube is a table with more than two dimensions. For example, sales for a particular product can be seen in a particular month, in a particular state, split out by all stores in that state. It’s this “slice and dice” functionality that makes SSAS appealing to many users.
Users can access these cubes via interfaces such as Excel, report writers can use the MDX query language to pull data into datasets, and statisticians can use them to create data mining models. The data mining functionality allows for numerous models to be performed on a data set, such as correlation. This functionality is one that I am still studying and will be certain to explain once I learn more.
The Whole is Greater than the Sum of Its Parts
Learning and using SQL Server can be very useful for your company and for your professional development. The addition of SSIS, SSRS, and SSAS to your environment can greatly improve your company’s ROI on technology, your daily productivity, and your value in the industry marketplace. It’s worth your while to check out these tools and get a feel for their capabilities.
How to get Started
In the next article in the Technical category of posts I’ll explain how to get evaluation copies of SQL Server and BIDS, as well as a development version, and how to put together a virtual machine to run SQL Server 2008 R2 evaluation edition and SQL Server 2008 R2.