SQL Server 2008: Database Infrastructure and Scalability: Topics for Discussion

December 23, 2009 Visitor Blogs Leave a comment

Essential SQL Server 2008: Database Infrastructure and Scalability

Database Mirroring enhancements, automatic page repair, log stream

Backup compression and using Transparent Data Encryption (TDE) to encrypt data

Hot-add CPU, DBCC CHECKDB enhancements, failover clustering enhancements

Peer-to-peer replication enhancements

Policy-Based Management and Multi-Server Administration

Central management server metadata, multi-server script execution

Policy-based management overview, conditions, facets and policies

Applying facets and conditions to policies, policy enforcement

Policy based management architecture and execution

Applying policies to multiple servers

Policy based management advanced topics

Performance Data Collection

Creating and configuring a management data warehouse

Data collection overview and architecture

Configuring system data collection on an instance

Transact-SQL custom data collection

Data collection reports and module summary

Resource Management and Troubleshooting Enhancements

Resource Governor

Extended events, predicates, actions, targets, types, maps, sessions

Security Enhancements

Transparent data encryption

Backups and transparent data encryption, transparent data encryption limitations

Integrated authentication enhancements, SQL Server 2008 auditing improvements

SQL Server 2008 all actions audited

Security Summary

Management Implications of New Features Part 1

spatial indexes, integrated full-text search

Problems addressed by sparse columns

Sparse columns and column sets

Sparse columns and filtered indexes, best practices for sparse columns

Constructing tabular data sets with sparse columns

Limitations of sparse columns and interaction with other features

Overview of filtered indexes

Using filtered indexes with sparse columns

Filtered statistics, monitoring filtered indexes and statistics

Using filtered indexes for improved index coverage

Management Implications of New Features Part 2

Filtered index best practices

Filtered index system metadata

Filtered index limitations

Change tracking overview, change tracking best practices

Change data capture (CDC) overview

Blob storage with Filestream overview, enabling Filestream

Filestream security, filestream and transactions, monitoring filestream storage

Filestream manageability

Scalability Enhancements

Data compression overview, data compression trade-offs

Estimating data compression space savings

Estimating data compressing space savings, choosing row compression

Enabling compression and comparing performance

Enabling and disabling compression, data compression options and syntax

Service broker improvements, Service Broker conversation priority

Query optimizer enhancements, MERGE overview

Forcing query optimizer plans with plan guides

Using plan guides

Star join optimizer improvements

Partitioning enhancements overview, partition-aligned indexed views

Partitioning and lock escalation enhancements

Improving concurrency with partition lock escalation

Lifecycle of a QA Professional: Testing Steps

December 21, 2009 Visitor Blogs Leave a comment

Quality Engineer: Segmentation: Skills

Microsoft SQL Server 2008 Reporting Services

December 21, 2009 Visitor Blogs Leave a comment

Reporting Services 2008 is no longer dependant on Internet Information Services (IIS). Reports are hosted within reporting services itself, levaraging HTTP.sys from the operating system.

This removal reduces the total cost of ownership for maintaining reporting services and makes it easier to configure. Furthermore, it reduces the attack surface of the report server.

Thresholds can be defined on whether Reporting Services is under low-, medium-, or heavy-memory pressure. This control allows the report server to respond in a more timely fashion to periods of high-reporting demands

Exploring Report Designer Components

Taken from the Microsoft Office 2007 product range, ribbons within the Report Designer provide three context sensitive Ribbons.

The Home ribbon provides common office and formatting functions such as Clipboard tools and Formatting options such as Fonts, Alignments and Borders tab. There is also an Arrange tab that allows that management of objects on the designer canvass such as ‘Send to Front ’ and ‘Send to Back’ buttons. There is finally a Preview button.

On the Insert Ribbon there is a notable absence of the Tablix button on the Data Region tab. The Tablix data region is a combination of the best features of the Table and Matrix data region. Therefore Tablix capabilities are available once a Table or Matrix data region is placed on the canvass designer. This can be seen when right clicking over a table or matrix data region as Tablix properties is shown in the menu.  A Chart and List data region also appear on this tab. An Additional tab include Report Items tab containing line, rectangle, image and textbox – which now support rich text formatting –  items. There is also a Subreport and Header & Footer tab.

The View ribbon consists of the Report Views tab which allows the user to switch between design and preview view as shown by the buttons. There is also a Show\Hide tab that allows the user to control the elements that can be shown in the Report Designer such as rulers and properties pane.

Features of Microsoft SQL Server 2008 Reporting Services Authoring

A grouping pane is located at the bottom of the Report Designer. This provides a quick way of managing groups of data in the canvas designer, without the need for entering into dialog boxes

Enhanced Visualization using Gauges

Gauges provide high visualization for key performance indicators (KPIs) that can be embedded not only in reports, but also within individual cells of a data region, providing useful reports that impart key information to the consumer in an instant.

There are four types of gauges that  are supported in Reporting Services 2008 as shown in the following exhibit.

·         Circular gauges

·         Linear gauges

·         Numeric indicators

·         State indicators

The gauges are fully editable to the smallest detail depending on the gauge that is chosen. An expression can be set to define the goal within the gauge that specifies the KPI to measure and whether performance is deemed to be good or bad at a simple level. Like charts, right clicking an area of the gauge allows you to set properties on that particular attribute of the gauge.

Microsoft SQL Server 2008 Reporting Services Management and Delivery

Memory configuration

Configuration settings can now be specified to set an upper limit on the amount of memory that is used by Reporting Services application giving greater control over server resources. Furthermore, thresholds can be set that cause the report server to change how it prioritizes and processes requests depending on whether it is under low, medium, or heavy memory pressure . Configuration involves setting the boundaries for each of the memory pressure bandings in the RSReportServer.config file. This control allows the report server to respond in a more timely fashion to periods of high reporting demands.

Configuration settings that control memory allocation for the report server include

  • WorkingSetMaximum
  • WorkingSetMinimum
  • MemorySafetyMargin
  • MemoryThreshold

The WorkingSetMinimum and WorkingSetMaximum are used to define the availbale memory to Report Server application. This is a useful feature if the server hosts multiple applications

The MemorySafetyMargin and MemoryThreshold set the boundaries for low, medium, and high levels of memory pressure is displayed in the graphic.

SharePoint Integration

Reporting services installation now includes the option for SharePoint Integration mode. This option provides the deep integration that allows users to browse SharePoint libraries for reports. The reports can take advantage of the document management features such as versioning control and document change alerts. The rendering, data processing and delivery is dealt by the report server.

The followin g additional functionality is available when Reporting Services is deployed in SharePoint Integrated Mode

  • Use document management and collaboration features, including alerts and version control.
  • Use SharePoint permissions and authentication providers to control access to reports, models, and other items.
  • Subscribe to and deliver reports to a SharePoint library using a new SharePoint delivery extension.
  • Create ad hoc reports using report models and Report Builder.
  • Manage reports, models, data sources, schedules, and report history in custom application pages on a SharePoint site.
  • View a report in a new customizable Report Viewer Web Part.
  • Publish or upload reports, report models, resources, and shared data source files to a SharePoint library, including Report Center in Office SharePoint Server.
  • Connect a Report Viewer Web Part to a Filter Web Part or a Library Web Part in dashboards or Web Part pages.
  • Use SharePoint deployment topologies to distribute reports over an Internet connection outside the firewall.

Within Microsoft SQL Server 2008 Reporting Services, reports are stored within which database object?

Select the one best answer.

Stored Procedure

No. Stored procedures are pre-compiled objects available to applications to access a relational database.

Indexed View

No. Indexed views materialize the results of a view into it’s own data structure. It is used to improve performance of select statements.

SQL CLR

Correct! SQL CLR (Common Language Runtime) object are used to store .Net objects within SQL Server. As IIS is no longer required to store the web pages. They are stored in the SQLCLR object itself

SQL Table

In which Reporting Services configuration file can the settings be defined to control how much memory Microsoft SQL Server 2008 Reporting Services uses?

Select the one best answer.

Within the web.config file

No. A web.config file is a configuration file for ASP.Net web applications.

Within the rsreportserver.config file

Correct. The rsreportserver.config file contains configuration settings for the report server. Memory used by the report server is controlled in this file using the following four settings. WorkingSetMaximum, WorkingSetMinimum, MemorySafetyMargin and MemoryThreshold

Within the rswebapplication.config file

No. The rswebapplication.config file is used to store configuration information pertaining to Report Manager.

Within the Master database.

Which Tool can be used to for configuring Microsoft SQL Server 2008 Reporting Services after installing the product?

Select the one best answer.

Microsoft Report Builder

No. Introduced in Microsoft SQL Server 2005 Reporting Services for the creation of ad-hoc reports against a report model data source

Microsoft Report Designer

No. A new tool introduced in Microsoft SQL Server 2008 Reporting Services that allows business users to connect to any data source and produce richly formatted reports

Report model

No. A report model provides an abstration of data that allows a Report Builder to access for the creation ad-hoc reports

Reporting Services Configuration Tool

Correct. a GUI that provides a comprehensive range of settings that allows the configuration of the reporting server. Batch configurations can be done with rsconfig.exe, rskeymgmt.exe and rs.exe through command prompt and are provided for backward compatability.

Which of the following is a new render format in Microsoft SQL Server 2008 Reporting Services?

Select the one best answer.

Microsoft Office Word

Correct. Microsoft Office Word has been introduced in Microsoft SQL Server 2008 Reporting Services

Microsoft Office Powerpoint

No. This is not a supported render format in any version of Reporting services.

Microsoft Office Excel

No. The Microsoft Office Excel render format is available in all versions of Reporting services.

CSV

Pre Installation Considerations

Security Considerations

Follow Security Best Practise.

Best practises for security considerations for Microsoft SQL Server 2008 should be adhered to in the following areas

  • Physical Security
    • Ensuring that the SQL Server and backup devices are physically secure
  • Firewall Configuration
    • Ensuring that only the relevant ports are open and inside a Windows domain, configure interior firewalls to permit Windows Authentication
  • Service Isolation
    • Avoid installing SQL Server on a domain controller and use different service accounts for each SQL Server components
  • Disable NetBIOS and Server Message Block
    • Within environments that use just DNS, security threats can be mitigated by disabling NetBIOS and Server Message Blocks

Report Server Service Account

Microsoft SQL Server 2008 Reporting Services is now managed as a single service that defines the security context under which Report Manager, Report Server Web service,  and a background processing application run under.

The Report Server service account can be defined during the installation and can also be changed using Reporting Services Configuration tool. As the service account is stored in many different locations within Reporting Services always use this tool for changing the service account as it will automatically place the defined service account in the relevant locations.

One of four account types can be chosen for defining the Reporting Server service account. Each type has its own advantages and limitations.

  • A dedicated domain account.
  • Local system account.
  • Local service account.
  • Network Service account.

Best practise for a production environment is to use a dedicated domain account as this account can be centrally audited, have the required permissions manually assigned to the account and allows access to resources beyong the SQL Server box.

Reporting Services Components

Microsoft SQL Server 2008 Report Services consist of both server-side and client-side components that can be installed multiple times.

The server-side report server must be installed on a machine that hosts a SQL Server database and have a valid license as this hosts

  • The Report Server database.
  • Report Manager, now held as a SQL CLR object within the Report Server
  • Report Builder
  • Report Server Configuration Tool
  • Command line tools RSconfig.exe, RS.exe and RSkeymgmt.exe
  • Background processing applications.

The client-side tools can be hosted on any machine and does not require a valid license. These tools include

  • SQL Server Management Studio
  • Business Intelligence Development studio.

All components are available as options in the Components to Install screen in setup. Server components are installed by selecting Reporting Services and the client tools are installed by selecting Workstation components, Books Online and development tools. Within this area there is an Advanced button to allow the selection of specific client tools.

Post Installation Tasks

Verify that the report server is installed and running.

  • Run the Reporting Services Configuration tool, connect to the report server instance you just installed, and review the status indicator for each setting to verify that it is configured.
  • Open the Services window and verify that the Report Server Windows service is running. To view the status of the Report Server Windows service, click Start, point to Control Panel, double-click Administrative Tools, and then double-click Services. When the list of services appears, scroll to Report Server (MSSQLSERVER). The status should be Started.
  • Open a browser and type the report server URL in the address bar. The address consists of the server name and the virtual directory name that you specified for the report server during setup. By default, the report server virtual directory is named ReportServer. You can use the following URL to verify report server installation: http://<computer name>/ReportServer<instance name>. The URL will be different if you installed the report server as a named instance.
  • Run reports to test report server operations. For this step, you can install and publish the sample reports if you do not have any reports available. For instructions on how to install sample databases, applications, and data files that are used in Reporting Services tutorials.

Verify that Report Manager is installed and running

  • Open a browser and type the Report Manager URL in the address bar. The address consists of the server name and the virtual directory name that you specified for the Report Manager during setup. By default, the Report Manager virtual directory is Reports. You can use the following URL to verify Report Manager installation: http://<computer name>/Reports<instance name>.
  • Use Report Manager to create a new folder or upload a file to test whether definitions are passed back to the report server database. If these operations are successful, the connection is functional.

Verify that Report Designer is installed and running

  • Open Business Intelligence Development Studio, and create a new project based on a Report Server project type.
  • If you installed report samples, open the sample report project files and publish the reports to a report server.

Reporting Services Configuration

For post installation configuration, the Reporting Services Configuaration tool will provide easier management, particularly for those moving from Microsoft SQL Server 2000 Reporting Services.

Introduced in Microsoft SQL Server 2005 Reporting Services, the notable change to this version; apart from the look, is the fact that there is only one service account that needs to be configured now the dependancy on Internet Information Services (IIS) is removed. Previous version required a windows service account and web service account to be configured. Below is a summary of options.

Server Node Provides the ability to stop and start the reporting services.

Service Account Provides the ability the change the security context under which the Report Service runs.

Web Service URL Configures the URL access to the Report Server by creating a URL reservation. An URL reservation defines the virtual directory, IP address, TCP port and optionally a SSL certificate. HTTP.sys uses the URL reservation to direct users requests and allow access to the URL.

Database Provides the ability to change the location of the Report Server database and to change the credentials for connecting to the Report Server database.

Report Manager Creates an URL reservation to access the Report Manager website.

Email Settings Provides the ability to integrate with email by defining a Server address and an email address that has the ability to send email from the Report Server.

Execution Account Provides the ability to specify an account to use against report data souces that do not require credentials or to access remote servers.

Encryption Keys Allows the management of the keys that are used to encrypt the sensitive data stored in the reporting services database. There is the ability to backup and restore the encryption keys. Change encryption keys and delete the encrypted content within the database.

Scale-out Deployment Provides the ability to manage multiple instances of Reporting Services that share the same Reportserver database such as a clustered instanced. Microsoft SQL Server 2005 Reporting Services refered to scale-out deployment as Initialization

Glossary

Ad-hoc reports

The ability to create reports on the fly against a report model. Report Builder is used for the purpose.

Business Intelligence

The process of improving methods of making business decisions using technology.

Business Intelligence Developmeny Studio

The Microsoft Application based on Visual Studio that enbles the creation of Business Intellegence solutions.

Chart

A data region within report designer that enables the visual display of information.

Cubes

A multidimension structure that contains aggregated data at each intersection.

Data Pane

The area of the Report Designer that provides the ability to add Global parameters, images and fields to a report.

Data Region

The area of a report that holds the results of a data set. Data regions can include a table, matrix, tablix, chart or list.

Data Source

Holds connection specific information enbling a report to connect to a database.

Dataset

The results of a table, view are stored procedure that is used to populate a data region.

Encryption Keys

used to encrypt the sensitive data stored in the reporting services database.

Execution Account

a pre-defined account used against report data souces that do not require credentials or to access remote servers.

Gauge

A visual component added to Microsoft SQL Server 2008 Reporting Services that provides a measure of KPI’s.

Grouping Pane

An area of Report Designer tht enables the management of hierarchical data within a data region.

HTTP.SYS

A file within the Microsoft Windows Operating system that enables the hosting of web pages without the need for IIS.

Installation

The act of placing an application on the operating system.

Internet Information Services

The Microsoft web server products for hosting web pages.

Jewel

refer to Microsoft Office Button.

Key Performance Indicator

A measure that is used by an organization to assess the performance of critical business operations. Known as a KPI.

List

A data region in reporting services.

Matrix

A data region within report designer that enables the arrangement of information on both column and rows.

Microsoft Office Button

Known a the Jewel button. The area at the top left of Report Designer that provides the ability to add new reports, open reports save and publish reports.

Microsoft Word

Microsoft Word processing application introduced as a new rendering option in Microsoft SQL Server 2008 Reporting Services.

Parallel Dynamic Groups

A benefit of the tablix data region that enables the multiple grouping of fields in both rows and columns.

Properties Pane

The area at the right of Report Designer that provides the ability to set additional attributes and propoerties against the report.

Query

A search request that is submitted against a database.

rsreportserver.config

A file that is used to set configuration of reporting services such as memory configuration.

RDBMS

Relational DataBase Management System that organizes data into multiple tables that are related together by common fields.

Rendering

The process of turning the result of a dataset in a complete report within a application such as Microsoft Office Word or Excel.

Report

A publication of information presented in a variety of data regions.

Report Builder

The Microsoft application used for the creation of ad-hoc reports against a report model.

Report Canvass

The area of the Report Designer that provides the ability to create the the report.

Report Designer

The Microsoft application that allows both report developers and business users to create reports from a wide variety of data sources.

Report Manager

Web pages that enables the viewing and management of published reports.

Report Model

Provides an abstraction layer on top of the data that enables users of Report Builder and Report Designer the ability to create reports against.

Report Services Configuration Tool

Used to configure setting that enables Reporting Services to run.

Report Viewer Control

A Microsoft .Net object that provides the ability to embed reports within a custom application.

Reporting Services

A server based platform that supports the management, distribution and authoring of reports.

Ribbons

The area of the Report Designer that provides shortcuts for the common functions required to author a report.

Scale Breaks

A new feature of the chart data region that automatically adjusts the visual representation of reports with widely differing values to ease report consumption.

Scale Out Deployment

An option in Reporting Services Configuration Tool that provides the ability to manage multiple instances of Reporting Services that share the same Report Server database.

Service Account

The account that defines the security context under which an application will run under.

SharePoint Services

The Microsoft application that facilitates collaboration and provides content management features.

SQL Server

Microsoft’s RDBMS that provides a data platform for an organization.

SQL Server Management studio

Microsoft SQL Server’s management application that enables the management of the database engine and the report services engine.

Tablix

A new data region introduced to allow the creation of flexible reporting structures.

Managing Data in Excel

December 18, 2009 Visitor Blogs Leave a comment

Creating Drop-Down Lists

Data Validation can create a drop-down list of values, which restricts any other values from being entered and prevents data entry errors.

1. On the Data menu, select Validation.
2. On the Settings tab, in the Allow drop-down menu, select List.
3. Click into the Source field, and select the cells in the spreadsheet that contain the list values.
4. Click OK.
Result: Drop-down menus will display what cells with data validation lists are selected.

Arranging Data

Data Entry Shortcuts

Today’s date………………………………………………………………………………………………………… CTRL + ;
Copy value from cell above…………………………………………………………………………………… CTRL + “
Fill down…………………………………………………………………………………………………………….. CTRL + D
Fill to the right…………………………………………………………………………………………………….. CTRL + R
Insert AutoSum formula………………………………………………………………………………………….. ALT + =
Edit the active cell………………………………………………………………………………………………………….. F2
Select current region ……………………………………………………………………………………. CTRL SHIFT *
Fill range with current entry…………………………………………………………………………..CTRL + ENTER
Cut……………………………………………………………………………………………………………………. CTRL + X
Copy…………………………………………………………………………………………………………………. CTRL + C
Paste once…………………………………………………………………………………………………………….. ENTER
Paste multiple times……………………………………………………………………………………………. CTRL + V
Paste formatting only………………………………………………………………………………. CTRL + SHIFT + V
Clear cell contents………………………………………………………………………………………………… DELETE
Delete selection……………………………………………………………………………………….. CTRL + HYPHEN
Undo………………………………………………………………………………………………………………….. CTRL + Z
Repeat last command………………………………………. F4 (Note: does not work when running Mekko)
Insert a new worksheet ……………………………………………………………………………………. SHIFT + F11
Create a chart using selected range………………………………………………………………………………. F11
Insert cell comment…………………………………………………………………………………………… SHIFT + F2

AutoFilter

AutoFilter organizes columns of data with filtering buttons, allowing you to examine certain data in a list.  AutoFilter can also be used to locate inconsistent data in a list.  With the filtering buttons at the top of a list, you have the option of filtering for a specific value, a top 10 value, or a custom set of specified criteria.
1. On the Data menu, select Filter | AutoFilter.

The Top 10 AutoFilter displays only the records with the top 10 values or in the top 10 percent.
1. Click an AutoFilter button, and select Top 10.

2. Adjust the Top 10 filter as desired.  In the first field, select Top or Bottom.  In the second field, adjust the number by using the arrows, or typing directly into the field.  In the last field, select Items or Percent.

Grouping Data

The Grouping feature consolidates data by adding buttons that expand and collapse the data for quick viewing.
1. Select the desired rows or columns to be grouped.
2. On the Data menu, select Group and Outline | Group.
To expand and collapse the groups, click the plus or minus sign at the top or left side of the screen, or the corresponding number buttons at the top left of the screen.
Minus sign button to collapse group

Manual Subtotals That Change With Filtered Data

Subtotals can be manually inserted into a filtered list.  Manual subtotals will re-calculate the data based on the way it is filtered, so you don’t need to re-calculate your cells each time you filter the list differently.  To subtotal an Excel list, insert the SUBTOTAL formula at the bottom of the list.
Syntax for SUBTOTAL function:
=SUBTOTAL(function_num,ref1)
Function_num is the type of function to be used in the subtotal, such as SUM, AVERAGE, etc.  The table below lists the type of function_nums used within the SUBTOTAL function:
Function_num (includes hidden values)     Function_num (ignores hidden values)     Function
1 101 AVERAGE
2 102 COUNT
3 103 COUNTA
4 104 MAX
5 105 MIN
6 106 PRODUCT
7 107 STDEV
8 108 STDEVP
9 109 SUM
10 110 VAR
11 111 VARP

Ref1 is the range of cells to apply the subtotal function to.  SUBTOTAL functions can calculate up to 29 separate ranges of cells.

Database Formulas (Database, Field, Criteria)

DAVERAGE

The DAVERAGE function averages all numbers in a column in a list, based on a given criteria.
Syntax for DAVERAGE function:
=DAVERAGE(range, field, criteria)
Range is the list you want to apply the criteria against
Field is the column to average the values. You can either specify the numerical position of the column in the list or the column label in double quotation marks
Criteria is the range of cells that contains your criteria.  To create a criteria range, copy and paste the field names at the top of the list to another location in the same worksheet

Example:

If you wanted to calculate the average weekly salary for employees that work 30 hours per week, use the DAVERAGE function.

The DAVERAGE function for the above example would look like this: =DAVERAGE(A1:B10,”Gross Weekly Salary”,D1:E2).

DCOUNTA

The DCOUNTA function counts cells in a column or list that contain nonblank values, and meets a given criteria.
Syntax for DCOUNTA function:
=DCOUNTA(range, field, criteria)
Range is the list you want to apply the criteria against.
Field is the column to average the values. You can either specify the numerical position of the column in the list or the column label in double quotation marks.
Criteria is the range of cells that contains your criteria.  To create a criteria range, copy and paste the field names at the top of the list to another location in the same worksheet.

Example:
If you wanted to count the employees who had Medical benefits, use the DCOUNTA function.

DSUM

The DSUM function sums the numbers in a column or database that meets a given criteria.
Syntax for DSUM function:
=DSUM( range, field, criteria )
Range is the range of cells that you want to apply the criteria against.
Field is the column to sum the values. You can either specify the numerical position of the column in the list or the column label in double quotation marks.
Criteria is the range of cells that contains your criteria.
Example:
If you wanted to If you wanted to total all the salaries of Water Rides employees, use the DSUM function.

SUMIF

The SUMIF function adds all numbers in a range of cells, based on a given criteria.
Syntax for SUMIF function:
=SUMIF( range, criteria, sum_range )
Range is the range of cells that you want to apply the criteria against.
Criteria is used to determine which cells to add.
Sum_range are the cells to sum.

Example:
If you wanted to total the salary of all Germany employees, use the SUMIF function.

VLOOKUP

The VLOOKUP function searches for a value in the left-most column of a table array and returns the value in the same row based on the index number.

Syntax for VLOOKUP function:

=VLOOKUP( value, table_array, index_number, range_lookup )

Value is the value to search for in the first column of the table_array.

Table_array is two or more columns of data that are sorted in ascending order.

Index_number is the column number in table_array from which the matching value must be returned. The first column is 1.

Range_lookup determines if you are looking for an exact match based on value.

Enter FALSE to find an exact match.

Enter TRUE to find an approximate match. This means that if an exact match is not found, then the VLOOKUP function will look for the next largest value that is less than the Value.

Example of a VLOOKUP Using an Exact Match (FALSE):

If you wanted to calculate a bonus based on whether or not an employee works in Shows and Children’s attractions, use the VLOOKUP function.

Notes:

  • If index_number is less than 1, the VLOOKUP function will return #VALUE!.
  • If index_number is greater than the number of columns in table_array, the VLOOKUP function will return #REF!.
  • If an exact match is not found, then the VLOOKUP function will return #N/A.

Example of a VLOOKUP Using an Approximate Match (TRUE):

If you wanted to calculate the bonus based on the amount of the Gross Weekly Salary, use the VLOOKUP function.

In this case, the VLOOKUP function returns a 10% bonus for employees earning a $100.00 – $299.00 weekly salary, and a 20% bonus for employees earning a salary of $300.00 or above.

The Value is the salary to compare to the table_array (C2).

The Table_array is the table of salaries and their corresponding bonuses ($F$5:$G$6).

The Index_number is 2, which returns the 2nd column of the table_array.

The Range_lookup is TRUE, which returns an approximate match if no exact match is found.

Notes:

  • If index_number is less than 1, the VLOOKUP function will return #VALUE!.
  • If index_number is greater than the number of columns in table_array, the VLOOKUP function will return #REF!.
  • If you enter FALSE for the range_lookup parameter and no exact match is found, then the VLOOKUP function will return #N/A.

Relative Cell Reference Macros

Macros recorded in Excel use absolute cell referencing by default. Absolute cell referencing means the macro works only with the exact cell addresses that were used when the macro was recorded. Macros with relative cell referencing are different. Relative reference macros record cell addresses that are relative to the starting cell location.

Relative reference macros are recorded by clicking the Relative Reference button on the Stop Recording toolbar once the macro has been named. From this point on, the macro will record actions on cells in relation to the starting cell.

Notes:

If you want to record a macro that performs an action on selected cells only (e.g., changing the font, making cells bold), click Relative Reference and perform the actions on one cell. To run the relative reference macro, select the desired cells before running. The macro will perform the actions on the selected cells only, because Relative Reference was turned on while recording the macro.

Relative referencing can also be used in combination with absolute referencing while recording a macro. To switch back to absolute reference, while recording click Relative Reference on the Stop Recording toolbar to turn off relative referencing.

Recording a Macro with Relative Cell Referencing

Like the ‘Blue’ macro previously, this macro will change the fill color of cells to blue. Since it is a Relative Reference macro, it will only change the selected cells to blue.

  1. Launch Microsoft Excel.
  2. On the Tools menu, select Macro | Record New Macro.
  3. Type the name “Blue_Relative” in the Macro name field.
  4. Click OK. The Stop Recording toolbar displays on the screen. Note: At this point, all of your actions in Excel are being recorded. Ensure that all of the actions have been practiced, and are deliberate. Clicking the Undo button will record that action within the macro, and may cause problems when you run the macro later. If you need to undo an action, start over and re-record the macro.
  5. Click Relative Reference. Note: this will record actions relative to the starting cell location.

Excel 2007 training courses – Training – Microsoft Office Online

StringBuilder versus String

December 17, 2009 Visitor Blogs Leave a comment

Both String and StringBuilder can be used to handle strings. If the program uses only a few string objects, it probably doesn’t matter which class you use. (Actually, using Stringwould be better because the Stringobject initializes faster than StringBuilder). However, if you are creating many String objects, you need to think about it.

Use the Stringclass for strings that do not change; otherwise, use StringBuilder.

Specifically, if you are performing multiple string concatenation operations, definitely use StringBuilder rather than String. If you use String, a new object will be created each time string addition (+) is performed. On the other hand, StringBuliderwill use a single object for concatenation. The following code snippet illustrates the difference. Two label objects, label1and label2, are used to display the elapse times of using Stringand StringBuilder.

This code uses String and StringBuilderfor the same task: concatenating strings 5000 times. Both strings are initialized with the same string literal: “This is a test string”. We measure the elapsed time in both cases. Here is the result of one run on a Smartphone emulator:

❑ String elapse time: 4 milliseconds

❑ StringBuilder elapse time: 2 milliseconds

As you can see, using StringBuilder for a large number of string concatenations is much faster than using String. Using String.Concat() instead of the addition operator of String yields similar results. By turning on performance counters (see below), you can obtain more detailed information about managed string objects. The following example shows the result of using the selected performance counters introduced early in this chapter to compare String and StringBuilder performance.

Using only String for the concatenation:

Managed String Objects Allocated 5125

Bytes of String Objects Allocated 50232832

Garbage Collections (GC) 53

Using only StringBuilderfor the concatenation:

Managed String Objects Allocated 134

Bytes of String Objects Allocated 35580

Garbage Collections (GC) 0

In the first case, there are more than 5000 Stringobjects allocated, and the garbage collector has run 53 times. The overhead of creating and garbage-collecting those String objects is probably the leading cause of poor performance. When using StringBuilder, only 134 String objects are created, and because not too much heap space has been taken, the garbage collection does not even run.

Note that the preceding sample code uses Debug.Writeln() to display text strings in the console. The Debugclass is in the System.Diagnosticsnamespace. In fact, you can build a console project for Windows Mobile Smartphone devices and use Debug.Writeln()to output debug information to standard output. (If you debug the program in Visual Studio, standard output can be viewed in the output window.)

Introduction to SQL Server 2005 Integration Services

December 10, 2009 Visitor Blogs Leave a comment

with the release of SQL Server 2005, Microsoft introduced a replacement to its Data Transformation Services (DTS) known as SQL Server Integration Services (SSIS). Familiarity with D’IS, fortunately, is not prerequisite knowledge for understanding the concepts in this chapter This chapter is desied to help you understand the features and management of SSIS. Although this chapter will not (and realistically cannot) be an exhaustive review of SSIS, it should give you a good foundation of understanding how data can be moved between various sources and destinations, and how it can he modified in the process. This chapter covers the following topics:

  • A general introduction to SSIS and its features
  • The import and export tools used to move data around
  • The different options for transforming data using SSIS

About SSIS
As mentioned earliet SSIS is designed to be a replacement for DTS, Although many database administrators found DTS to be an extremely useful tool for manipulating and automating the transfer of data from one source to anothet it was prohibitively difficult for many administrators who lacked significant programming or scripting skills to be able to perform complex transformalions. SSIS, or simply Integration Services, changes all that.

Integration Services is part of a suite of tools included in the Business Intelligence (BE) Development Studio, The BI Development Studio is an instance of Visual Studio 2005, which includes the add-ins for designing solutions for Integration Services, Analysis Services, and Reporting Services, One of the benefits of using the BE Development Studio is that it allows you to develop Integration Services solutions without having to maintain an active connection to an existing SQL Server This gives you the ability to design solutions that can run on multiple servers, or can be executed from a file system,

Integration Services is an Extract, Transform, and Load (ETL) tool. This means that with Integration Services, you define a data source, and, from thai data source, you define which data you are interested in copying to a new destination. After having extracted the data, you then perform any optional transformations on the data to prepare it for its destination. For example, you may want to take a column that stores a string value that is either “True” or “False” into a Boolean value, “1” or “0,’ respectively’. This allows you to match the current data type of the destination. Finally, the load sequence takes the transformed data and injects it into the appropriate destination.

SQL Server Integration Services is actually made up of four different components:

  • Integration Services itself
  • Integration Services oect model
  • Integrated Services run-time
  • Integrated Services data flow

Each of these components is used to create a robust experience for designing, managing, and executing packages built for SSIS. In the next few sections, you learn about each of these components.

Integration Service

The Integration Service itself is actually managed through SQL Server Management Studio, not unlike many of your other server components. This component is used to handle the management and monitoring of both stored and running packages. Packages can he stored either in the file system or they can be stored in the nsdb database on a running instance of SQL Server 2005.

Integration Services, when installed, assumes that the local default instance contai as the rnsdb database that will be used [or the package repository. However, because SQL Server 2005 n-,d SQL Server 2000 can bofh be installed on the sante niaclune, it is possible that your delault instance is running a legacy version of SQL Server ([this is the case, you must manually edit the <serverflame> element in the MsDtsSrvr. mi ,xmi file. This file is in the 90\DTS\Binn directory of your SQL Server installation folder.

You can use SQL Server Managemmit Studio to connect to an instance of SSIS, as shown in Figure 13-1. The following is a list of manageable features of the Integration Service in SQL Server Management Studio:

  • Connect to multiple Integration Services servers
  • Manage package storage
  • Customize storage folders
  • Import and export packages
  • Start local and remote stored packages
  • Stop local and remote running packages
  • Monitor local and remote running packages
  • View the Windows Event log

Integration Services Object Model

Integration Services includes a new object model for including both native and managed application programming interfaces (APIs) for customizing the behavior of your Integration Services solutions. You can use these APIs for accessing SSIS tools, command-line functions, or custom applications. You can also use the object model for executing SSIS tools and packages from within your own applications.

Integration Services Run-time

The Integration Services Run—time engine is responsible for saving the control flow logic and execution of SSIS packages. Integration Services run-time executables include packages, containers, predefined and custom tasks, and event handlers, The run-time handles execution order, logging, variables, and event handling. Programming the Integration Services Run-time engine allows you to automate the creation, configuration, and execution of packages through the object model,

Integration Services Packages

Packages are units of execution that are composed of a series of other elements, including containers, tasks, and event handlers You can create and manage packages through the 81 Development Studio, or programmatically, using the Integration Services object model. Each package contains a control flow, which is a series of tasks (related or not) that will execute as a unit. Similar to jobs in the SQL Server Agent service (see Chapter 8), Integration Services packages use a customizable logic flow that controls the timed or constrained execution of individual tasks.

Integration Services Tasks

Tasks are the basic unit of work within an Integration Services package. Each task defines an action that will be taken as part of the execution of this package. Some of the basic task types include Execute SQL tasks, in which a T-SQL script will be executed; file system tasks, which interact with a local or remote file system; and data flow tasks, which control how data is copied between a source and destination. Many other types of tasks are discussed later in this chapter.

Integration Services Containers

Containers are objects that exist within the Integration Services environment to allow you to define one or more tasks as a unit of work. You can use containers to define parameters for the execution of these tasks. Four types of containers am available, and you learn more about them later in this chapter.

Integration Services Event Handlers

Event handlers are similar to packages, in that within them, you can define tasks and containers. One major difference, though, is that event handlers are reactionary. This means that the tasks defined within an event handler will only be executed when a specific event occurs. These events are defined on tasks, containers, or the package itself, and include events that are fired before, during, and after the execution of the package.

Integration Services Data Flow

One of the most significant benefits of the SSIS features is the separation of the control flow from the data flow. Each package that contains a data flow task (such as an import or export) identifies the data flow task to the run-time engine, but then a separate data flow engine is invoked for that operation. The data flow engine manages what is typically the whole point of an SSIS package, and that is extracting, transforming, and loading data. The data flow engine will extract data from data files or relational databases, manage any and all transforms that manipulate that data, and then provide that transfonned data to the destination, A package may have more than one data flow task, and each task will execute its own data flow process for moving and manipulating data.

Importing and Exporting Data

One of the easiest ways to understand SSIS, and to see it in action, is through the Import/Export Wizard, which can he run from the Management Studio, The process is essentially the same for both operations. The primal’v difference between the import operation and the export operation is whether your SQL Server is the source or the destination. It should he noted, however, that SSIS doesn’t need to use a SQL Server as either the source or the destination! You can use SSIS to import data from a fiat-file source  (such a comma-separated value file) into a Microsoft Access database.

Transforming Data with SSIS

Now the fun really begins. By now you should have a pretty good understanding of the concepts of how Integration Services can manage control and data flow, and you’ve seen a simple example of how to get data into and out of the SQL Server using the basic tools, In this section, you’re going to see how those components can be expanded on to provide a more complete scenario for working with Integration Services.

You should first become familiar with the Integrated Development Environment (IDE). Integration Services, along with Analysis Services and Reporting Services, relies heavily on the 81 Development Studio. The Bl Development Studio is really just a fancy name for Visual Studio, and is, in fact the Visual Studio 2005 IDE, but it includes only those add-ins for SQL-based services. If you install the full version of Visual Studio 2005, or just language add-ins like Visual Basic and c#, you’ll find that it starts the same environment. For the sake of simplicity, let’s just refer to the IDE as Visual Studio.

To begin creating a new SQL Server Integration Services packages, simply launch Visual Studio and create a new project. Of the types of projects available, Integration Services is available under Business Intelligence Projects (see Figure 13-11). Once you’ve created your new project, notice that like other Visual Studio pmjects, you have a toolbox that contains controls and rsourees for your projects. The development environment is broken up into four different sections, each of which allows you to control different aspects of your project. These sections include a management area for the Control Flow, management of the Data Flow, Event Handlers, and a Package Exploret During the execution or debugging of a package, a fifth tab appears, allowing you to view package execution progress. In the next few seclions, you learn about each of the different management areas, how they’re used, and what options are available when working in those areas.

Package Elements
When creating a new 5515 package, it’s important that you be familiar with the different elements available. Familiarizing yourself with the variety of tools available will help you create mote robust packages, which can execute a complex series of tasks. This section introduces you to these resources.

Control Flow
Your main environment is the Control Flow section, shown on the left side of Figure 13-12. The control flow environment allows you to define one or more tasks that will he executed for this Integration Services project, and specify the order in which those tasks are executed (hence the term “Control Flow”). You can choose to define tasks that are serialized, meaning that one task must reach a completion state before the next task begins. You can also execute tasks in parallel, allowing multiple operations to be executed simultaneously As long as there are no dependencies between these tasks, this can take advantage of your system resources, and dramatically decrease the execution time of your packages. You can reduce the total time the package takes to execute significantly with parallel execution.

To add items to your package’s control flow, simply choose the appropriate item and drag it into the Control Flow pane, which is the first tab on the left in the middle of Figure 13-12. Once you’ve dragged an item into the Control Flow pane, you can then configure that item. Some tasks may display with an error symbol (the red circle with the white X) or a warning symbol (the yellow triangle with the black exclamation point) to indicate that further configuration is needed for that task to be able to execute properly In some cases, simply configuring the task can “fix” the problem. You can also view the Error List (by pressing CtrlE or selecting View — Error List) and review the available errors and warnings. You can also double-click on an error or warning to have the IDE take you straight to where you need to go to fix the event.

Control Flow Tasks

This section briefly identifies each of the control flow tasks that can be used to build your packages.

Also listed are Database Maintenance tasks that may be useful to you as well.

  1. For Loop Container—Containers are interesting in that they are both a task and collection of tasks at the same time, In this case, the For Loop container allows you to execute one or more tasks that will continually execute until the result of the executed task returns a Boolean “false” value.
  2. ForEach Loop Container—Similar to the For Loop container, ForEach Loop containers allow you to execute tasks for each instance of a type of object. The ForEach loop includes enumerators for files, items, ADO record sets, ADO.NET scheinas, variables, XML nodes, or SQL Management Objects.
  3. Sequence Container—These containers allow you to define a series of tasks that will execute in sequence. It is similar in many regards to simply grouping tasks, which is covered later in this section, but allows additional functionality (such as limiting the scope of a variable to only the tasks within this container).

Each of the following tasks is also a container unto itself This is referred to as the Task Host Container, and is not a container h/pc you need to separately manage or add to your package.


  1. ActiveX Script Task—The ActiveX script task allows you to run scripts that use VBScript or JavaScript as a step in your process flow This has been largely superseded by the Script task, which uses VisualBasic,Net scripts. ActiveX scripts are primarily used with older packages that have been upgraded from DTS 2000.
  2. Analysis Services Execute DDL Task—This task allows you to execute a Data Definition Language (DDL) statement in Analysis Services.
  3. Analysis Services Processing Task—This task contains configuration options for processing Analysis Services objects.
  4. Bulk Insert Task— This task is used to import a large amount of data from a flat file source.
  5. Data Flow Task—This task is pretty much the bread and butter of Integration Services. It allows you to define how data is processed as it moves from source to destination,
  6. Data Mining Query Task —You can use the Data Mining Query task to run prediction queries using a Data Mining Extension (DMX) statement,
  7. Execute DTS 2000 Package Task—This task is used to execute packages created in DTS from SQL 2000, but have not yet been upgraded or migrated to Integration Services,
  8. Execute Package Task—This task can he used to launch another Integration Services task. U Execute Process Task—You can use this to execute a Windows application or batch file.
  9. Execute SQL Task—Use this task to execute a SQL script. Second only to the Data Flow task, this is probably one of the more commonly used tasks.
  10. File System. Task—File system tasks are used to interact with the file system, such as creating files and directories.
  11. FTP Task — You can use the File Transfer Protocol (FTP) to upload or download files to FTP servers.
  12. Message Queue Task— Using this task allows you to configure your package to interact with Microsoft Message Queuing (MSMQ) services.
  13. Script Task— This task is used to execute a Microsoft Visual BasicNet script.
  14. Send Mail Task—Use this task to send email messages during the execution of a package.
  15. Transfer Database Task—This task is used to transfer entire databases between different instances of SQL Server.

Precedence Constraints

Now that you’ve been introduced to the different tasks available for your control flow, you must understand how you can arrange or use these tasks together. When you add tasks to the control flou you must specify how and when to execute these tasks, If you were to just add a bunch of tasks into the control flow, and then run the package, all of the tasks would try to execute at the same time. That may be desirable in some instances, hut in most cases, you want to ensure that there is a defined logic to how and when the different tasks will execute. In more complex packages, some steps may not be executed at all unless there is a problem.

Precedence constraints are used to control the order in which tasks are executed, and whether or not they are executed based on the prior task failing, succeeding, or either, Tasks may also have more than one precedence constraint defined. When defining multiple precedence constraints on a task, you can also specify whether they are evaluated using the AND operator, which requires both constraints to evaluate to true; or the OR operator, which will execute the task as long as one of the constraints is met.

Remember that using multiple precedence constraints on a task means that either all conditions must be met, or just one. The conditions, however, could be completely different. For example, Task C has two precedence constraints defined, The first one requires that Task A succeeds, and the second one requires that Task B fails. If the AND operator is specified, then Task A must succeed, and Task B must fail. If Task B executes successfully, Task C will not run. If the OR operator is used, then Task C will run if Task A succeeds ,regardless of the outcome of Task B, or if Task B fails, regardless of the outcome of Task A.

Figure shows three tasks included in the control flow of a package. The first task will execute a  SQl statement and if it succeeds, the data flow task will execute. If the SQL task fails, then the Send  Mail task will execute, and notify the appropriate personnel.

  1. Transfer Error Messages Task— This task allows you to copy user-defined error messages (with an error number above 50,000) between instances of SQL Server. The data is stored in the sysmessages table of the Master database.
  2. Transfer lobs Task—This task is used to copy jobs between instances of SQL Server.
  3. Transfer Logins Task—Use the Transfer Logins task to copy logins between SQL Servers. This can be useful when creating a redundant server for fault-tolerance.
  4. Transfer Master Stored Procedures Task— You can use this task if you’ve created user-defined stored procedures in the Master database that you want to copy to another instance of SQL Server.
  5. Transfer SQL Server Object Task—Use this task to transfer other SQL objects such as tables, views, stored procedures, and triggers between instances.
  6. Web Service Task—The Web Service task is used to initiate a connection to a Web site, and can be used to return information to a variable or file.
  7. WMI Data Reader Task — The WMI Data Reader task can be used to query Windows Management Instrumentation (WMO namespaces to return information about the computer system.
  8. WA4I Event Watcher Task—This task can be used to query WMI for events relating to system behavior or performance.
  9. XML Task— You can add XML tasks to work with XML files and data sets.
  10. Back Up Database Task—The Back Up Database task does exactly what the name suggests, and allows you to configure a database backup as part of a package.
  11. Check Database integrity Task— Use the Check Database Integrity task to check the structural integrity and space allocation of all objects within a specified database.
  12. Execute SQL Server Agent Job Task—You can also configure an Integration Services package to launch a SQL Server Job as part of its process flow.
  13. Execute T-SQL Statement Task—Similar to the Execute SQL task, this requires that you specifically use the Transact-SQL dialect of the SQL language.
  14. History Cleanup Task— This task can be used to remove extemporaneous data from the rnsdb database, specifically historical information about backups, restores, jobs, and maintenance plans.
  15. Maintenance Cleanup Task—Use this task to remove leftover data from maintenance plans, such as backup files or text reports.
  16. Notify Operator Task— During the execution of a package, you may want to notify an operator that a certain step has completed or failed.
  17. Rebuild Index Task—You can use this task to rebuild an index during the execution of your package.
  18. Reorganize index Task—Rather than rebuilding, you can also reorganize one or more indices from one or more databases.
  19. Shrink Database Task—Use this task to shrink one or more databases.
  20. Update Statistics Task—This task allows you to execute a controlled UPDATE STATISTICS task for one or more databases.

Task Grouping

There is also a feature in Integration Services that allows you to add multiple tasks and containers to a group. Unlike containers, groups are not treated as a unit of execution, but they can help clean up the logic flow. As a UI enhancement more than anything else, when you group tasks together, you can “hide” the tasks from view by collapsing the group. This can be helpful when you have a complex package, and want to simplify the view, Also, note that precedence constraints cannot he defined on a group, but are defined on the tasks within that group.

Data Flow
As mentioned earIiei one of the biggest improvements to managing Integration Services packages was removing the data flow logic from the control flow, This allows you to create more complex transformations that are easier to design and are more manageable. Data flow tasks appear as a single unit of execution in the conlwl flow, but may have many complex steps in the data flow view.
When you switch to the data flow view, a drop-down list appears with all data flow tasks in the package. Each data flow task will allow you to configure at least a source and destination. You can optionally apply one or more bansforms that can modify or prepare the data before it reaches its destination.

Data Sources
Out of the box, SQL Server 2005 includes six data sources you can use in your data flow. You can, in fact, use multiple data sources in your task. This allows you to use a merge transformation to join the data sets before it reaches the destination. The following data sources are available:

  1. Data Reader Saurce—Use the Data Reader Source with an ADONET connection manager to connect to a NET provider.
  2. Excel Source—Use with Microsoft Excel spreadsheets.
  3. Flat File Source—This allows you to import from a variety of flat file formats (such as comma- separated, tab-separated, and fixed-length fields).
  4. OLE DB Source—Use with any OLE DB data source.
  5. Raw File Saarce—This is a specialized fiat file format that is optimized for quick use. This is typically created by Raw File deslinations to allow fast processing of the data, because minimal translation is required.
  6. XML Source —This allows you to use an XML file as a data source. You must ensure that elements within your XML file can be mapped to SQL fields, You can do this by creating a schema mapping file, using an inline schema within the file, or allowing Integration Services to try to create an XSD mapping file for this data source.

Transformations

Integration Services includes a number of built-in transforms to help modify or improve your data during the transfer process. With DTS 2000, many of these transforms had to be built programmatically, making the process of performing complex transformations burdensome. Afthough there are still programmatic options that let you build your own custom transfonns, you can use the following transforms to create complex packages with minimal programming ability:

  1. Aggregate—This transform is used to perform aggregate calculations, such as AVERAGE, GROUP BY, and COUNT.
  2. Audit— Use the Audit transfonn to output additional data about the environment.
  3. Character Map —The Character Map transfonn lets you transform string data. You can use this to convert uppercase to lowercase, and traditional Chinese to simplified Chinese, for example.
  4. Conditional Split —The conditional split transform is used to send different data sets from the same soune to different destinations.
  5. Data Conversion — Use the Data Conversion transform when you need to change the data type between source and destination.
  6. Data Minitg Query —This can he used to perform prediction queries against data mining models.
  7. Derived Column — Derived Columns can be used to modify the data itself during transformation.
  8. Export Column—This transform imports data into a file.
  9. Fuzzy Cronping — You can use this transform to help standardize your data. It allows you to look for string data that is similai and replaces the variants with a standard value.
  10. Fuzzy Lookup—Similar to the lookup transform, the fuzzy lookup uses values in a reference table, but accepts variants on the data.
  11. Import Column—This transform reads data from a file, and adds it to the data flow.
  12. Lookup—Viith a Lookup transform, existing data is joined to data being imported. This references values in a lookup table, but with an exact match.
  13. Merge— As the name suggests, the Merge transform merges data between two data sets. Use the Union All transform to join more than two data sources.
  14. Merge Join — Similar to the Merge transform, this transform uses JOIN statements to combine the data.
  15. Multicast —This transform allows the data to be sent to multiple destinations.
  16. OLE DR Command— Use this to execute a SQL command for each row in the input flow.
  17. Percentage Sampling —You can use this transform to return a random sampling of data from the input. When using this transform, specify a percentage of the total rows that will be sent to the output.
  18. Pivot — This can he used to denormalize data.
  19. Row Count — Use this transform to return a count of the total number of rows passed through, and return the count to a variable.
  20. Row Sampling —Similar to Percentage Sampling, Row Sampling outputs random rows, hut you specify the total number of rows to be returned.
  21. Script Component— This transform allows you to execute a custom script task that will transform the data.
  22. Slowly Changing Dimension — Used with dimension tables to perform changing attribute, historical attribute, fixed attribute, and inferred member changes.
  23. Sort—This transform allows you to sort the data on one or more import columns.
  24. Term Exiraction — You can use this transform to extract English nouns or noun phrases and redirect the terms to the output.
  25. Tern? Lookup —This transform uses a reference table and returns a count of the items in the reference table that appear in the data flow.
  26. Union All — This transform can be used to merge multiple data sets.
  27. Unpivot—Use the Unpivot transform to normalize data.

Data Destinations

Integration Services allows you to specify one or more destinations in a data flow task, The following is a list of data flow destinations that are available with SQL Server out-of-the-box:

  1. Data Mining Model Training—This destination passes data through data mining model algorithms to train the data mining model.
  2. DotaReader Destination — This uses the DataReader interface in ADO.NET.
  3. Dimension Processing— This destination loads data into an Analysis Services dimension.
  4. Exo’l Destination — Use this to output your data to a Microsoft Excel file.
  5. Flat File Destination — This can be used to output the data to comma-separated, tab-separated, or fixed-length file.
  6. OLE DB Destination — This destination uses an OLE DB provider.
  7. Partition Proo’ssing—This destination is used to output the data to an Analysis Services partition.
  8. Raw File Destination — Used as an intermediary output between data flow tasks, this format allows for quick processing as minimal formatting options need to be defined.
  9. Recordset Destination — This outputs the data to an ADO recordset,
  10. SQL Server Destinat ion—This inserts data into a Microsoft SQL Server destination.
  11. SQL Server Mobile Destination —This destination is used for SQL Server Mobile Edition clients.

Event Handling

Event handling is another feature of Integration Services that provides more granular control over the execution of your packages, and the tasks within them, Frequently associated with error handling, event handling allows you to execute additional tasks before a task executes, during task execution, and after task execution.

Any well-designed package includes the ability to control or monitor the execution of the tasks within. Using efficient error handling and event handling is the cornerstone for creating packages that require minimal maintenance and hands-on execution. The Event Handling tab includes options for configuring tasks that execute for the following 12 package- or task-level events:

  1. OnError
  2. OnaxecstatusChanged
  3. Onlnformation
  4. OnPostExecute
  5. OnPostvalidate
  6. OnPreExecute
  7. OnPreV’aIidate
  8. OnProgress
  9. OnQueryCancel
  10. OnTaskpaiiect
  11. onvariablevaiuechanged
  12. Onwarning

Configuring additional tasks on these events can improve error handling and provide you with more precise control over execution of your packages. Be careful, though. Too much granularity can cause more administrative work than necessary.

Package Explorer
The Package Explorer is a useful utility that allows you to view the different elements of your package in an organized, hierarchical structure, Though this view isn’t representative of the control or data flow, it can help you quickly find an element of either, You can view or modify the properties, or delete unused elements from your package.

Creating a Simple Package
So, now that you have a fairly good understanding of the different elements of an Integration Services package, it’s time to put it to use, In this scenario, the employees at Adventureworks have decided to pool some of their resources and decided to keep a DVD library of titles in the office to share among themselves. Anyone is welcome to participate, but if you’re going to borrow, you’re going to contribute!

Up until now, one of the employees had been keeping track of their collection using an InfoPath form, However, over the last couple of months, the collection has grown significantly, and the core group who started the library wants a better way to manage it, They’ve decided to store the data in SQL, hut rather than having to re-enter all the data by hand, they would prefer to import the XML file into the database,

For this example, you can download an XML file from http:J /p2p.wrox, com (search for the ISBN— 04700047046—or title of the book), or you could build your own sample file.

Create the Connection
You’ll begin by ciating a new Integration Services project called AdventureWorks IS Project. When you create the project, a new package named Package. dtsx is created. Rename the package XMLImporc . dtsx.

Follow these steps:
1. Although you’ll only be creating one package in this exercise, create a new Data Source that will be available to all packages that are part of this project. Right-click Data Sources and select New Data Source to launch the Wizard.
2. In the Wizard, click Next on the introduction page. Then, on the next page, “Select how to define the connection,” click New. In the Connection Manager dialog box, either enter your server name, or select it from the drop-down list. Use Windows Authentication, and select AdventureWorks as the database. Click OK when these options have been selected.
3. Click Next to go to the Completing the Wizard page, and leave the default Data Source name. Click Finish to complete the wizard.
4. Now, below the Confrol Flow pane, you should see the Connection Managers pane. Right-click anywhere in the pane and select New Connection from Data Source.
5. Select Adventure WOrks and click OK.

Create the Data Flow Task
Now it’s time to put the package to work, The first thing to do is create a new Data Flow task, From your Toolbox, drag the Data Flow task into the control flow window, and rename it XML Transfer.
Click the Data Flow tab, You should see XML Transfer in the drop-down list, If you had more than one data flow task in your package, you could navigate through them without having to go back to the control flow task to manage it.
The first thing to do to build the data flow is to specify the source, Drag the XML Source from the Toolbox into the data flow pane, and then double-click it to open up the XML Source Editor.
For the Data Access mode, ensure that XML file location is selected, Type the path or browse to the DVDLib . xml file. Because there is no schema mapping file, you can have the SQL Server generate one. Click the Generate XSD button, and use the default filename and location.
You can navigate to the Columns page to view the columns that will be imported into your database, but you should not need to change anything in there. Click OK to exit the editor.

Summary
In summary, SQL Server 2005 Integration Services is a very powerful tool for controlling data transformation operations between SQL Servers and other data stores. It allows you to create very simple data flow models, or very complex ones. It is also extensible, allowing you to programmatically build additional transformations and conbols, This chapter provided you with a high-level overview of how the Integration Service works, how to use the Import and Export wizards, and how to build simple packages using the SSIS designer. If you would like to learn more about Integration Services, check out Profrssional SQL Server 2005 iniegration Service (Indianapolis: Wiley, 2006).

SQL Server™2005 Administration: Introduction to Replication

December 10, 2009 Visitor Blogs Leave a comment

Replication Overview

SQLServer replication operates similarly to print media. In fact, many of the terms used in replication are also used with newspaper and magazine distribution. Replication begins at the server known as the publisher. Just as a newspaper collects articles its readership is interested in, the publisher compiles and arranges data into articles, and one or more articles are included in a publication. To get data to subscribers, the newspaper needs to employ a distributor. In SQLServer replication, just as in the newspaper business, the subscribers can choose to have the distributor deliver the publication to them, or they can go to the distributor to pick up the publication. This is not unlike going to a newsstand or your favorite coffee shop and picking up a newspaper. SQLServer also allows you to specify whether the publisher is also the distributor, or if another server is used as an intermediary distributor. This is not unlike the newspaper hiring carriers directly, versus outsourcing the work to a third party. Here’s a review of some of the key terms used in the last paragraph, and a preview of some in the SQLServer Replication Agents section, to help provide a clearer understanding of the way replication works:

❑Publisher—The server providing the source data that will be made available to subscribers.

❑Article—Acollection of data that exists as part of a replication publication.

❑Publication—One or more articles that act as a unit of replication.

❑Distributor—The server that is responsible for providing data to subscribers.

❑Subscriber—Aserver or client that receives a publication.

❑Push Replication—Amethod of replication whereby the distributor delivers the data to the subscriber.

❑Pull Replication—Amethod of replication whereby the subscriber requests the data from the distributor.

❑Publisher/Distributor—Apublishing server that acts as its own distributor.

SQLServer replication can be used to provide data to another database (which can be used for Analysis and Reporting Services), to a remote office for local application use, or to client applications for offline or asynchronous use. When choosing a replication strategy, you must have a solid understanding of how the data will be used. Consider the following questions:

❑Do the users need access to up-to-the-minute data, or can they work with data that is provided on a regular schedule?

❑Do the users need to be able to make changes to the data, and should those changes be synchronized with the original source material?

❑Will the data originate from a single source, or will multiple sources be used to provide data to a centralized database?

Finding answers to these questions (and more) can help you make the appropriate decisions about your replication topology. You can use this information to help build solutions to make your database applications more responsive and more useful to your users. Also note that when replication is configured, a new system database named distributionis created. This database is used for storing metadata and history information about your replication topology. One distribution database is created for a single instance of SQLServer and is shared among all databases.

SQL Server Replication Agents

Now that you’ve got the terminology down, and have an idea of what your users’ needs are, it’s time to take a look at the mechanics behind replication. This section introduces the various replication agents that are available for your solution. Replication agentsare programs that run certain tasks to control preparation and distribution of your data. The agents that are used depend on the type of replication used in your solution. Some of these agents are available to all replication types, and some are specific tojust one type of replication. Replication relies on the SQLServer Agent service when the server is a publisher or distributor to automate the replication process. Though it is possibleto manually execute the replication agents, it would be more work than it’s probably worth. Ensure that your SQLServer Agent service is running, and, in most cases, it should be configured to start when the server starts.

Snapshot Agent

With SQLServer replication, the Snapshot Agent is used for almost all replication types. The Snapshot Agent is executed at the distributor, and is responsible for preparing the initial data files and schema information about the articles that will be published. The data is written to a snapshot folder on the file system and synchronization data is written to the distribution database. It often acts as the baseline for other replication agents. In this way, it’s not unlike the way a full backup works when using differential or transaction log backups.

Log Reader Agent

The Log Reader Agent is used specifically with transactional replication. When a transaction that is part of a publication is written to the transaction log, the Log Reader Agent copies those transactions from the publisher to the distribution database on the distributor. This allows those transactions to then be executed on the subscriber databases. Each database that participates in transactional replication has its own Log Reader Agent.

Distribution Agent

Used with both snapshot and transactional replication, the Distribution Agent applies snapshots to subscribers, and, in the case of transactional replication, moves the transactions to the subscribers. If pull replicationis used, the Distribution Agent is executed at the subscriber. If push replicationis used, it is run at the distributor.

Merge Agent

When merge replicationis used, the Merge Agent provides the initial snapshot to the subscriber, not unlike the Distribution Agent. Each subscription has its own Merge Agent that handles the reconciliation of data between the publisher and the subscriber. Another similarity it shares with the Distribution Agent is that it runs at the distributor for push subscriptions, and at the subscriber for pull subscriptions. When communicating between publishers and the subscribers, the Merge Agent typically downloads the changes to the subscriber, and uploads changes to the publisher.

Queue Reader Agent

The Queue Reader Agent is used with a specific type of transactional replication that allows updatable subscriptions. When updates from the subscriber are provided to the publisher, these updates can be queued, and then processed as a unit. This agent runs at the distributoi and only one instance is required for all publications in the distribution database.

SQL Server Replication Types

The term replication tqpc refers to the logical model of a replication topology, and although there are only three main types (transactional, snapshot, and merge), each one offers additional configuration options that allow you to have more granular control over how data geta from source to destination, and what happens to it when it gets there. Consideration for the different replication types is based on the way the data is going to be used, and how important it is that the data be current.

Two key terms to remember when choosing a replication type are autonomy and latency Autononnj refers to the amount of” hands-off-ness” of the data, Latency refers to the amount of time that elapses between when a change is made to the data and when the data is replicated to the subscriber, In the different replication types available, there is a direct correlation between autonomy and latency

For example, AdventureWorks has a regional office that uses quarterly updates on sales figures for an employee incentive program. Because this program is only in place at the regional office, there’s no need to provide updates back to the home office. Once the home office provides the data to the regional office, they don’t do anything else with it. Any changes to the data will not be reflected back in the corporate database, much to the relief of the company’s corporate auditors. In this scenario, a high amount of autonomy and latency are evident. The regional office gets the updated data once every three months; and once they get it, they own the data for their own needs,

Conversely a retail chain that requires real-time inventory tracking from their Point-of-Sale system that would not only update the local inventory database, but also ensure that each retail and warehouse location are aware of the product inventory at the other locations. This is an example of an application that requires very’ low latency. In this scenario, when a customer is looking for a DVD player that’s out of stock at the Seattle location, a sales clerk can inform the customer of the availability of that model at the Bellevue and Tacoma locations,

Distributed Transactions

Although distributed transactions aren’t part of SQL Server replication, per se, it’s important to know how distributed transactions fit into the distributed data model, First and foremost, any transaction that executes across more than one database, even if it is attached to the same instance, is considered a distributed transaction, This is because the scope of the transaction exists outside of the context of the current database.

For example, the following SQL script executes in the context of the Adventureworks database, but only one of the tables is actually created in the tempdb database. Because both CREATE TABLE statements are wrapped in the BEGIN TRANSACTION and COtn’IIT TRAIJSACTION statements, they must both execute, or the entire transaction fails.

This is a very simplified example of hou distributed transaction ale dcsigned to work, but you get the idea. When an application executes di tubuted transaction again t multiple server , n additional step may be taken in order to ensure the availability of the target servers. This is what’s knows as a two-phase commit (2PC).

The first phase is the preparation phase. This step prepares the destination servers, known as resource managers, by sending out a command to inform them that a transaction is coming. The resource managers take every possible precaution to ensure that when the transaction is received, it can be processed without failure, This helps ensure the stability and reliability of the distributed transaction. The resource manager then informs the transaction manager, usually the Microsoft Distributed Transaction Coordinator (MS DTCL whether or not the preparations were successful.

The second phase is executed when all the resource managers have reported successful preparation. In this phase, the transaction manager expects to receive a successful commit from each of the resource managers. When it does, then it can report the transaction as having been committed to the application. If just one of the resource managers reports failure, the transaction must be rolled back from all resource managers, and the transaction manager reports the failure to the application.
Distributed transactions can also be executed directly from stored procedures, or other Transact-SQL methods. You can use the BEGIN DISTRIBUTED TRANSACTION statement to explicitly invoke a dis tributed transaction.

Distributed transactions do not define a publisher, distributor, and subscriber the way replication does. Instead, it relies on the application design to control how data is processed across multiple servers. It does offer the least amount of autonomy and latency because the transactions are immediately processed on destination servers.

Transactional Replication

Transactional replication has the lowest latency and autonomy of the three standard replication types. With transactional replication, you begin with a snapshot of the data that will be used as a baseline for further transactions to be applied against. As transactions are committed, those that apply to data that participates in transactional replication are copied to the distribution database on the distributor, Then the subscribers can receive the transactions and apply the changes to the copied data.

Standard Transactional Publication

Standard transactional publication replication is used when the subscriber accepts the publication for read- only use. This prevents the subscriber from being able to update the data on the publisher, but does not prevent clients from updating data on the subscriber itself. For example, a remote server may use the rephcated data in conjunction with SQL Reporting Services to provide access to historical and trend data, and modifications may be made locall hut no changes can be submitted at the remote server that will be accepted at the original publisher.

Transactional Publications with Updatable Subscriptions in a Hierarchical Topology

This implementation of transactional replication allows a model where you have a single publisher with multiple subscribers, Periodicall the subscribers may need to make changes to the replicated data, in which case, the update is sent back to the original publisher. The original publisher provides those updates (through the disinbutod to all the subscribers. This can he helpful in an environment where a remote site receives corporate sales data hut occasionally submits updates regarding their local sales department.

Updatable subscriptions in a hierarchical topology allow both immediate and queued updates to be submitted from the subscribers. Immediate updates are processed similarly to distributed transactions, in that a two-phase commit is used, If immediate updates are not necessary subscriber updates can be stored in a queue, and then applied asynchronously whenever the publisher is available,

Transactional Publications in a Peer-to-Peer Topology

New to SQL Server with SQL Server 2005, peer-to-peer transactional replication creates an environment where all participants are both publishers and subscribers. This implementation allows you to create a distributed database environment where all SQL Servers can provide and receive updates with low latency. This further lowers the autonomy previously offered by transactional replication, by allowing any data to be changed on any server, and all participating servers will receive the updates. When using peer-to-peer transactional replication, use SQL security features to take appropriate precautions to ensure that data can only be updated from approved locations.

Snapshot Replication

Snapshot replication can be used in replication topologies where there can be significant latency between when changes are committed on the publisher, and when they are received by the subscriber. Rather than providing updates on a transaction-by-transaction basis, snapshots of the entire article are taken at the publisher periodically, and then applied to the subscriber as a unit.

When a snapshot is generated, it is saved as a file to a file system that must be accessible to the publisher, the distributor, and the subscriber. If one or more of these are on different physical servers, you should specify a Universal Naming Convention (UNC) path for the snapshot folder location. If all the components reside on the same server (which can be the case if you’re using replication to populate a separate database on the same server for Reporting Services or Analysis Services), you can use a local file system path. The files generated by snapshot replication are files that can be easily applied to the subscriber using BCE

The obvious benefit of snapshot replication is for asynchronous environments where there can be a delay between the publisher and subscriber. However, because the snapshots are copied and applied each time replication occurs, it can be prohibitively resource-consuming for large amounts of data, There are options that can allow you to compress the snapshot files, as well as to help reduce disk usage and transfer times between the distributor and subscriber, though.

As with transactional replication, snapshot replication also supports the use of immediate or queued updating subscriptions.

Merge Replication

Merge replinition is used in environments where clients and servers may be working independently of one another, but they periodically need to connect to synchronize data, Merge replication is often used in client and server environments. For example, if you have a mobile sales force where all members of the sales team need access to customer information, but will not always have connectivity to your server, an application that replicates data relevant to their customers can provide them access to the information they need, They can also use the application to submit orders, track invoices, and keep notes about their clients, When they are back in the office, or can connect to the corporate network remotely, they can synchronize their changes with their home servers.

Merge replication introduces a number of changes to your publishing databases that are used for tracking and synchronization. This includes adding a column to published tables used for tracking, as well as additional tables and triggers that are used to store row history data for published rows, These elements  are designed to be unobtrusive, and the triggers created by replication will not affect any user-defined triggers that you have created.

Merge replication offers both a blessing and a curse with its ability to handle synchronizations from multiple subscribers. Its blessings come from the ability to grant users offline access to data sets that are critical to their needs, Its curse comes from the fact that multiple offline subscribers may be trying to synchronize changes to the same data.
The change-tracking mechanism employed by merge replication allows you define how conflicts are resolved between multiple subscribers. For example, a particular subscriber may be given more weight than others, and, therefore, its changes should be considered authoritative. There are also mechanisms to programmatically build more complex resolvers. This allows you to create more granular rules about conflict resolution, giving priority to certain subscribers only if specific criteria are met.

Oracle Replication

Microsoft SQL Server 2005 supports replication of Oracle objects. To expect that this works perfectly out-of-the-box with no additional configuration is a bit naïve, but you should be familiar with some of the concepts and caveats of using SQL Server for Oracle replication. For example, SQL Server requires that your Oracle server be running at least version 8.0.5. Configuration of your Oracle server as the publisher may depend on the specific version you are using and the tools available for that version. Refer to the topic “Oracle Publishing Overview” in Books Online for a more complete list of requirements.

Snapshot replication from an Oracle database operates similarly to SQL Server snapshots. The Snapshot Agent connects to the Oracle publication, and retrieves rows and creates schema scripts for each published table. As with SQL snapshot replication, the entire dataset is created each time the table is run.

Unlike homogenous SQL Server transactional replication (which monitors the transaction log for changes), transactional replication for Oracle requires that changes be made to the Oracle database by creating tracking tables and triggers. When changes to a published Oracle table are made, the triggers fire and inseit the changes into the tracking table. This is not unlike the behavior seen in merge replication. SQL Server, again acting as the distributor, executes the Log Reader Agent to move the changes from the tracking table to the distribution database. The distributor then provides the changes to the subscribers as would be expected.

SQL Server Replication Models

Now that you have an understanding of the different types of replication available, the next consideration in building your replication topology is to identify the model that will be used. Whereas the type defined the logical flow of data, the model defines the physical implementation of how the data will be distributed Any of the aforementioned SQL Server replication types can use any of the SQL Server replication models. The overall design and topology for replication should be built around how the data is used, and the accessibility requirements for your users and your applications.

Another consideration when choosing an appropriate replication model is whether to have the publisher and the distributor on the same instance. Geographical distribution and network availability may influence your decision to use a local distributor versus a remote one,

Single Publisher / Multiple Subscribers

In cases where the data should originate from only location only, a single publisher/multiple subscriber model can be used to provide access to data for remote locations. For example, Adventureworks has offices across the United States. The corporate office, headquartered in Tacoma, might need to make its sales data available to its remote offices in Omaha, Baton Rouge, and Rochester for a reporting application. In this case, the company may want to use a single distributor in the Tacoma office to provide updates using snapshot replication to each of the field offices.

In this scenario, the AdventureWorks field office will get sales data updates through an asynchronous delivery method, Because they will not be changing the data that is replicated, there is no need for updatable subscriptions, and each office will receive the same data.

Multiple Publishers / Single Subscriber

The multiple publishers/single subscriber model can be used where multiple servers will track and update data, hut that data will need to be consolidated on a single serven Let’s look at a retail chain as an example of how this can work.

An electronics retailer uses an inventory-tracking database to keep track of product stock at each location, Each stow maintains its own inventory through its shipping department and point of sale application. Each location uses transactional replication to provide changes to its local stock to a regional warehouse, which holds the subscriber database. This helps the regional inventory manager keep track of when a specific store is running low on a certain product, and she can make arrangements to provide the items to the store, Because transactional replication is used in this topology; the inventory database at the regional warehouse gets updated with minimal delay.

Multiple Publishers / Multiple Subscribers

The multiple publishers / multiple subscribers model works well for environments whew data must be shared among peers. This can he useful in applications where the local database stores information about local and remote operations. Each publisher can provide updates made locally to all other replication partners that participate in this model, and receive the updates in turn.

For example, three friends decide to get into the fast-food business, and each buys several franchises within their cities. Each restaurant keeps track of its own inventory, as well as being able to see the inventory at the other locations. This is so that the local manager can call another store for spare ingredients, just in case the supplier cannot deliver in time. To prevent having a separate inventory table for each location, merge replication can be used. Each store updates its own inventory values daily and then synchronizes with the other locations so that each store is aware of the inventory at the other locations.

Replication Tools

When reviewing the options provided for designing the replication topology, additional considerations may be evaluated for determining what data is replicated and how, This section provides an overview of some of the available tools and procedures that can help provide a more robust replication architecture.

Filtering

It is not always appropriate to replicate entire tables from one server to another. You can, in fact, use filtering at the publisher to limit what will be available to the subscribers. Subscriptions can also employ filters at the subscriber to ensure that only data relevant to that subscriber is received and processed. Four types of filters are available:

  1. Static row filters
  2. Column filters
  3. Pararneterized row filters
  4. Join filters

Static Row Filters

Static row filters can be used with all types of replication. They are defined at the publisher, and allow you to limit which rows wilt be made available in a publication by simply using a WHERE clause. For example, you could provide regional managers with Human Resources data about the employees at only their respective locations by using a row filter based on the employee’s city field, or another location-identifying column.

Column Filters

column filtering can be used to remove certain columns from all rows in a publication. For example, if you have Human Resources data that will be made available to multiple databases for different applications, it may not always be appropriate to include confidential data such as salary information or the employees’ Social Security numbers. In this case, you can create publications that eliminate the unnecessary data from the publication without removing it from the base table.
Column filtering can be used with all types of replication. Howevei certain types of columns may be excluded from filtering depending on which replication type you are using. You can use both column filters and row filters in a single publication to narrow the scope of the published data.

Parameterized Row Filters

Paranieterized row filters are available only with merge replication, and are similar in concept to static row filters. In execution, though, they are significantly different. The purpose of parameterized row filters is to be able to create multiple data partitions that will he replicated without having to create multiple publications. For example, if you use the same base table, and you have two different subscribers that each need a different subset of that same table, using standard row filters would require you to create two publications, one for subscriber A, and the other for subscriber B.

With parameterized row filters, you can specify that for subscriber A, you are interested in rows that have the values WA, NE, and OK in the state field. For subscriber B, you are interested in providing rows that contain the values CA, OR, and AK in the state field, Each of these data sets exists as part of the same publication.

The partitions created by parameterized row filters can also overlap. Using the preceding example, if a new subscriber wanted all the rows with the values WA, OR, NV, and TX in the state column, then that is an example of an overlapping partition. You can configure overlapping partitions to allow updates from the subscriber to any column, or only any non-shared column. Nonoverlapping partitions can be made available to multiple subscribers, preventing the subscriber from updating the changes. If a nonoverlapping partition is available to only one subscriber, then that subscriber can make changes to all columns in that partition.

Join Filters

Join filters are also limited to merge replication, and are commonly used to extend the data in a publication that uses parameterized row filters. This operates similarly to a JOIN statement in Transact-SQL to combine the data from one or more tables. The data in the related tables is published only if it meets the condition of the JOIN FILTER clause.

IBM – IBM Software Newsletter

November 2, 2009 Visitor Blogs Leave a comment
  1. Effective Software Delivery (e.g. managing teams, requirements, quality)
  2. Empowering People (e.g. collaboration, IM, conferencing, eforms, portals)
  3. Leveraging Information (e.g. business intelligence, data/content mgmt.)
  4. Leveraging Middleware Investments (on Linux, System z, POWER)
  5. Optimizing IT (e.g. virtualization, cloud, Software as a Service)
  6. Product Lifecycle Management (e.g. virtual product design, digital mfg.)
  7. Service Management (e.g. security, resilience, risk, compliance)
  8. Service Oriented Architecture (SOA) and the IBM Smart SOA approach
  9. Green software (reducing energy consumption, emissions, cost)
  10. Solutions for small and midsized businesses (SMBs)

IBM – IBM Software Newsletter.

6 Advanced JavaScript Techniques You Should Know

November 2, 2009 Visitor Blogs Leave a comment

SDL Optimization Model

November 2, 2009 Visitor Blogs Leave a comment

Microsoft SDL – Developer Starter Kit.

 

 

The four security maturity levels of the SDL Optimization Model

 

MiniFuzz File Fuzzer

MiniFuzz is a basic testing tool designed to help detect code flaws that may expose security vulnerabilities in file-handling code. This tool creates multiple random variations of file content and feeds it to the application to exercise the code in an attempt to expose unexpected application behaviors.

BinScope Binary Analyzer

The BinScope Binary Analyzer is a Microsoft verification tool that analyzes binaries to ensure that they have been built in compliance with Microsoft’s Security Development Lifecycle (SDL) requirements and recommendations.  BinScope checks that SDL-required compiler/linker flags are being set, strong-named assemblies are in use, and up-to-date build tools are in place.

BinScope also reports on dangerous constructs that are prohibited or discouraged by the SDL (e.g. read/write shared sections and global function pointers).

SDL Process Template for VSTS

This template makes writing secure code easier by integrating the policy, process and tools of the SDL v4.1 directly into Visual Studio Team System 2008. It eases adoption of the SDL, enables auditable security requirements and status, and demonstrates security return on investment in a framework that is familiar to developers, testers, and program managers.

FxCop

FxCop is an application that analyzes managed code assemblies (code that targets the .NET Framework common language runtime) and reports information about the assemblies, such as possible design, localization, performance, and security improvements .

FxCop is intended for class library developers. However, anyone creating applications that should comply with the .NET Framework best practices will benefit.

SiteLock

SiteLock enables an ActiveX developer to restrict access to a control in a predetermined list of domains, or for a certain length of time. It includes a simple utility that displays the site list and expiration date of a site-locked control.

The SiteLock ATL template enables an ActiveX developer to restrict access so that a control is only deemed safe when used in a predetermined list of domains. This limits the ability of Web page authors to reuse the control for malicious purposes.

Code Analysis for C/C++ (/analyze in Visual Studio)

A static analysis tool that plows through source code one function at a time, and looks for C/C++ coding patterns and incorrect code usage that may indicate a programming error.

Code Analysis Tool .NET (CAT.NET) v1 CTP

CAT.NET is a binary code analysis tool that helps identify common variants of certain prevailing vulnerabilities that can give rise to common attack vectors such as Cross-Site Scripting (XSS), SQL Injection and XPath Injection. The tool can function as a plug-in for Visual Studio 2005/2008, FxCop custom rule, MSBuild custom task or through the command line prompt and analyzes compiled .NET binaries.

Banned.h

The banned.h header file is a sanitizing resource which supports the SDL requirement to remove banned functions from a code. It lists all banned APIs and allows any developer to locate them in a code.

SDL Process Guidance

SDL Threat Modeling Tool

SDL Tools Repository

The Amazing Adventures of Kevlarr and the SDL

 

SDL Optimization Model.