UPDATE/ADD/ALTER Column and Table in SQL

UPDATE Table

The following SQL statement updates the first customer (CustomerID = 1) with a new contact person and a new city.

Example

UPDATE Customers
SET ContactName = 'Alfred Schmidt', City= 'Frankfurt'
WHERE CustomerID = 1;

UPDATE Multiple Records

It is the WHERE clause that determines how many records that will be updated.

The following SQL statement will update the contactname to “Juan” for all records where country is “Mexico”:

Example

UPDATE Customers
SET ContactName='Juan'
WHERE Country='Mexico';

Update Warning!

Be careful when updating records. If you omit the WHERE clause, ALL records will be updated!

Example

UPDATE Customers
SET ContactName='Juan';

ALTER TABLE – ADD Column

To add a column in a table, use the following syntax:

ALTER TABLE table_name
ADD column_name datatype;

ALTER TABLE – ALTER/MODIFY COLUMN

To change the data type of a column in a table, use the following syntax:

SQL Server / MS Access:

ALTER TABLE table_name
ALTER COLUMN column_name datatype;

DELETE COLUMN/TABLE TABLE

DELETE FROM Customers
WHERE CustomerName='Alfreds Futterkiste';

DELETE FROM table_name
WHERE condition;

Delete All Records

DELETE * FROM table_name;

ALTER TABLE – DROP COLUMN

To delete a column in a table, use the following syntax (notice that some database systems don’t allow deleting a column):

ALTER TABLE table_name
DROP COLUMN column_name;

SQL Inteview Questions

--CREATE a TABLE FOR INCREMENTAL NUMBBER FROM 1 to 1000
CREATE TABLE dbo.TEST_NUMBER (IncrNum int NOT NULL)
;
--INSERT VALUES USING LOOP
DECLARE @first AS INT
SET @first = 1
DECLARE @step AS INT
SET @step = 1
DECLARE @last AS INT
SET @last = 1000

BEGIN TRANSACTION
WHILE(@first <= @last) BEGIN INSERT INTO TEST_NUMBER VALUES(@first) SET @first += @step

END COMMIT TRANSACTION

-- SELECT TOP 1000 ROWS 
SELECT TOP (1000) [IncrNum]
 FROM [dbo].[TEST_NUMBER]

-- SELECT [IncrNum] FROM 1 to 1000
SELECT *
FROM [dbo].[TEST_NUMBER]
WHERE [IncrNum] <= 1000

-- SELECT ODD NUMBERS
SELECT *
FROM [dbo].[TEST_NUMBER]
WHERE [IncrNum] <= 1000
AND ([IncrNum] % 2 <> 0)

SISS to SQL Server Data Type Translations

It can be extremely confusing when you first encounter SSIS data types.  At first glance, they seem to be nothing like SQL Server data types you love and know.  That’s why I’ve provided below a conversion chart of SSIS data types to SQL Server data types.  This information is readily available on MSDN but it always seems difficult to find.  Hope this helps!

SISS to SQL Server Data Type Translations

http://www.sqlservercentral.com/blogs/dknight/2010/12/22/ssis-to-sql-server-data-type-translations/

SQL – Queries Tuning and Optimization Techniques

In SQL, it is very difficult to write complex SQL queries involving joins across many (at least 3-4) tables and involving several nested conditions because a SQL statement, once it reaches a certain level of complexity, is basically a little program in and of itself.

A database index is a data structure that improves the speed of operations on a database table. Indexes can be created using one or more columns of a database table, providing the basis for both rapid random lookups and efficient access of ordered records. Indexing is incredibly important when working with large tables, however, occasionally smaller tables should be indexed, if they are expected to grow.

Try to consistently indent and don’t be afraid to use multiple lines. You don’t have to write it all at once. Complex queries can sometimes just be a collection of simple queries. You need to follow some basic guidelines and Take the time to think these through such as-

List all of the columns that are to be returned

  1. List all of the columns that are used in the WHERE clause
  2. List all of the columns used in the JOINs (if applicable)
  3. List all the tables used in JOINs (if applicable)
  4. Get the correct records selected first
  5. Save the complex calculations for last
  6. If you do use a Common Table Expression (CTE), be aware that the query only persists until the next query is run, so in some cases where you are using the CTE in multiple queries, it might be better for performance to use a temp table.

 

Once you have the above information organized into this easy-to-comprehend form, it is much easier to identify those columns that could potentially make use of indexes when executed.

It makes a big difference to really understand how the data is combined, selected, filtered, and output. Here, query Optimization tricks come into the picture to increase the performance of the program or software. There are a lot of guideline points to tune your query which do work as the boost of the query performance.  These guideline points are mentioned below:

  1. SET NOCOUNT ON at the beginning of each stored procedure you write. This statement should be included in every stored procedure, trigger, etc. that you write.
  2. The SQL query becomes faster if you use the actual columns names in SELECT statement instead of than ‘*’.
  3. HAVING clause is used to filter the rows after all the rows are selected if you are using aggregation functions. It is just like a filter. Do not use HAVING clause for any other purposes.
  4. It is the best practice to avoid sub queries in your SQL statement and try to minimize the number of subquery block in your query if possible.
  5. Use operator EXISTS, IN and table joins appropriately in your query. The reason is- Usually IN has the slowest performance
  6. IN is efficient when most of the filter criteria are in the sub-query.
  7. EXISTS is efficient when most of the filter criteria is in the main query.
  8. Use EXISTS instead of DISTINCT when using joins which involves tables having one-to-many relationship.
  9. Be careful while using conditions in WHERE clause.
  10. To write queries which provide efficient performance follow the general SQL standard rules.
  11. Use single case for all SQL verbs
  12. Begin all SQL verbs on a new line
  13. Separate all words with a single space
  14. Right or left aligning verbs within the initial SQL verb
  15. Indexes have the advantages as well as disadvantages as given below-
  16. Do not automatically add indexes on a table because it seems like the right thing to do. Only add indexes if you know that they will be used by the queries run against the table.
  17. Indexes should be measured on all columns that are frequently used in WHERE, ORDER BY, GROUP BY, TOP and DISTINCT clauses.
  18. Do not add more indexes on your OLTP tables to minimize the overhead that occurs with indexes during data modifications.
  19. Drop all those indexes that are not used by the Query Optimizer, generally.
  20. If possible, try to create indexes on columns that have integer values instead of characters. Integer values use less overhead than character values.
  21. To provide up-to-date statistics, the query optimizer needs to make smart query optimization decisions. You will generally want to leave the “Auto Update Statistics” database option on. This helps to ensure that the optimizer statistics are valid, ensuring that queries are properly optimized when they are run.
  22. If you want to boost the performance of a query that includes an AND operator in the WHERE clause, consider the following:
  23. Of the search criteria in the WHERE clause, at least one of them should be based on a highly selective column that has an index.
  24. If at least one of the search criteria in the WHERE clause is not highly selective, consider adding indexes to all of the columns referenced in the WHERE clause.
  25. If none of the columns in the WHERE clause are selective enough to use an index on their own, consider creating a covering index for this query.
  26. queries that include either the DISTINCT or the GROUP BY clauses can be optimized by including appropriate indexes. Any of the following indexing strategies can be used:
  27. Include a covering, non-clustered index (covering the appropriate columns) of the DISTINCT or the GROUP BY clauses.
  28. Include a clustered index on the columns in the GROUP BY clause.
  29. Include a clustered index on the columns found in the SELECT clause.
  30. Adding appropriate indexes to queries that include DISTINCT or GROUP BY is most important for those queries that run often.
  31. When you need to execute a string of Transact-SQL, you should use the sp_executesql stored procedure instead of the EXECUTE statement.
  32. When calling a stored procedure from your application, it is important that you call it using its qualified name.
  33. Use stored procedures instead of views because they offer better performance and don’t include code, variable or parameters that don’t do anything.
  34. If possible, avoid using SQL Server cursors. They generally use a lot of SQL Server resources and reduce the performance and scalability of your applications.
  35. Instead of using temporary tables, consider using a derived table instead. A derived table is the result of using a SELECT statement in the FROM clause of an existing SELECT statement. By using derived tables instead of temporary tables, you can reduce I/O and often boost your application’s performance.
  36. Don’t use the NVARCHAR or NCHAR data types unless you need to store 16-bit character (Unicode) data. They take up twice as much space as VARCHAR or CHAR data types, increasing server I/O and wasting unnecessary space in your buffer cache.
  37. If you use the CONVERT function to convert a value to a variable length data type such as VARCHAR, always specify the length of the variable data type. If you do not, SQL Server assumes a default length of 30.
  38. If you are creating a column that you know will be subject to many sorts, consider making the column integer-based and not character-based. This is because SQL Server can sort integer data much faster than character data.
  39. Don’t use ORDER BY in your SELECT statements unless you really need to, as it adds a lot of extra overhead. For example, perhaps it may be more efficient to sort the data at the client than at the server.
  40. Don’t return more data (both rows and columns) from SQL Server than you need to the client or middle-tier and then further reduce the data to the data you really need at the client or middle-tier. This wastes SQL Server resources and network bandwidth.
Conclusion
To tune our SQL queries, understanding our database does play the most important role. In SQL, typically each table column has an associated data type. Text, Integer, Varchar, Date, and more, are typically available types for developers to choose from. When writing SQL statements, make sure you choose the proper data type for the column. Sometimes it’s easier to break up subgroups into their own select statement. To write a query, we need to know about the actual need for the query and scope of the query also.

 

 

 

read more:
http://www.sql-datatools.com/2016/05/sql-queries-tuning-and-optimization.html

Quick Tricls: Column and block text selection using SSMS

There are several ways to select text as shown below, including the ability to select and edit columns.

Using SHIFT to Select Text

It is well known that using the SHIFT key you can perform normal text selection in SSMS.

If you put your cursor to the left of “dbo.DimEmployee” and hold the SHIFT key and then put your cursor at the end of “dbo.DimReseller” it will select the first three lines of code as shown below.

ssms shift select

Using SHIFT+ALT to Select Columns

If you would like to select columns or blocks then Microsoft SQL Server offers a solution for you. You can use the key shortcut SHIFT+ALT as described in the following steps. Please note that this feature works using SSMS for SQL Server 2008 and up.

Place your cursor to the left of “dbo.DimEmployee”, press SHIFT+ALT then click at the end of “dbo” in “dbo.DimProductCategory”. This will select columns or blocks in SQL Server Management Studio as shown below.

ssms shift alt select

Using SHIFT+ALT to Select Columns and Insert Text

In SSMS for SQL Server 2012 and up, you can also use SHIFT+ALT to insert text in this block mode.

First place the cursor in the first row where you would like to insert the text (to the left dbo.DimEmployee in our example). Press SHIFT+ALT and click in the last line where you would like to append this text (left of dbo.DimProductCategory). Now type “SELECT * FROM ” and this text will be inserted for each line as shown below.

If you would like to select columns or blocks then Microsoft SQL Server offers a solution for you

Using CTRL+SHIFT+END to Select Text

If you want to select all text from a starting point to the end you can use CTRL+SHIFT+END.

Put your cursor at the beginning point and press CTRL+SHIFT+END to select all text from that point to the end of the text as shown below.

ssms ctrl shift end to select

Using CTRL+SHIFT+HOME to Select Text

If you want to select all text from a starting point to the beginning you can use CTRL+SHIFT+HOME.

Put your cursor at the beginning point and press CTRL+SHIFT+HOME to select all text from that point to the beginning of the text as shown below.

ssms ctrl shift home select

Using CTRL+A to Select All Text

If you want to select all text you can use CTRL+A.

Just press CTRL+A anywhere in the query editor and this will select all text as shown below.

ssms ctrl a to select all text

 

Read more:
https://www.mssqltips.com/sql-server-tip-category/52/sql-server-management-studio/

 

SQL Server Data Types you Must Know

Why data types are important

  1. The data is stored in the database in a consistent and known format.
  2. Knowing the data type allows you to know which calculations and formulations you can use on the column.
  3. Data types affect storage. Some values take up more space when stored in one data type versus another.  Take our age tables above for example.
  4. Data types affect performance. The less time the database has to infer values or convert them the better.  “Is December, 32, 2015 a date?”

Commonly used SQL Server Data Types

There are over thirty different data types you can choose from when defining columns.  Many of these are set up for very specific jobs such as storing images, and others more suitable to general use.

Here is the data types you’ll most frequently encounter in your everyday use of SQL.  These are:

  • INT
  • VARCHAR, NVARCHAR
  • DATETIME
  • DECIMAL, FLOAT
  • BIT

INT – Integer Data Type

The integer data type is used to store whole numbers.  Examples include -23, 0, 5, and 10045.  Whole numbers don’t include decimal places.  Since SQL server uses a number of computer words to represent an integer there are maximum and minimum values which it can represent.  An INT datatype can store a value from -2,147,483,648 to 2,147,483,647.

Practical uses of the INT data type include using it to count values, store a person’s age, or use as an ID key to a table.

But INT wouldn’t be so good to keep track of a terabyte hard drive address space, as the INT data type only goes to 2 billion and we would need to track into the trillions.  For this you could use BIGINT.

The INT data type can be used in calculations.  Since DaysToManufacture is defined as INT we can easily calculate hours by multiplying it by 24:

SELECT Name,
       DaysToManufacture,
       DaysToManufacture * 24 as HoursToManufacture
FROM   Production.Product

Here you can see the results

Use of INT to perform calculations.

There are many operations and functions you can use with integers which we’ll cover once we dig into functions.

VARCHAR and NVARCHAR – Text Values

Both VARCHAR and NVARCHAR are used to store variable length text values.  “VARCHAR” stands for variable length character.

The number of characters to store in a VARCHAR or NVARCHAR are defined within the column.   For instance as you can see in the following column definition from the object explorer, the product name is defined to hold fifty characters.

VARCHAR definition shown in SQL Server Management Studio

What makes VARCHAR popular is that values less than fifty characters take less space.  Only enough space to hold the value is allocated.  This differs from the CHAR data type which always allocates the specified length, regardless of the length of the actual data stored.

The VARCHAR datatype can typically store a maximum of 8,000 characters.  The NVARCHAR datatype is used to store Unicode text.  Since UNICODE characters occupy twice the space, NVARCHAR columns can store a maximum of 4,000 characters.

The advantage NVARCHAR has over VARCHAR is it can store Unicode characters.  This makes it handy to store extended character sets like those used for languages such as Kanji.

If your database was designed prior to SQL 2008 you’ll most likely encounter VARCHAR; however, more modern databases or those global in nature tend to use NVARCHAR.

DATETIME – Date and Time

The DATETIME data type is used to store the date and time.  An example of a DATATIME value is

1968-10-23 1:45:37.123

This is the value for October 23rd, 1968 at 1:45 AM.  Actually the time is more precise than that.  The time is really 45 minutes, 37.123 seconds.

In many cases you just need to store the date.  In these cases, the time component is zeroed out.  Thus, November 5th, 1972 is

1972-11-05 00:00:00.000

A DATETIME can store dates from January 1, 1753, through December 31, 9999.  This makes the DATETIME good for recording dates in today’s world, but not so much in William Shakespeare’s.

As you get more familiar with the various SQL built-in functions you’ll be able to manipulate the data.  To give you a glimpse, we’ll use the YEAR function to count employees hired each year.  When given a DATETIME value, the YEAR function return the year.

The query we’ll use is

SELECT   YEAR(HireDate),
         Count(*)
FROM     HumanResources.Employee
GROUP BY YEAR(HireDate)
ORDER BY YEAR(HireDate)

And here are the results

Use YEAR on DATETIME data type

The benefit is the DATETIME type ensures the values are valid dates.  Once this is assured, we’re able to use a slew of functions to calculate the number of days between dates, the month of a date and so on.

We’ll explore these various functions in detail in another blog article.

DECIMAL and FLOAT – Decimal Points

As you may have guessed DECIMAL and FLOAT datatypes are used to work with decimal values such as 10.3.

I lumped DECIMAL and FLOAT into the same category, since they both can handle values with decimal points; however, they both do so differently:

If you need precise values, such as when working with financial or accounting data, then use DECIMAL.  The reason is the DECIMAL datatype allows you to define the number of decimal points to maintain.

DECIMAL

DECIMAL data types are defined by precision and scale.  The precision determine the number of total digits to store; whereas, scale determine the number of digits to the right of the decimal point.

A DECIMAL datatype is specified as DECIMAL(precision,scale).

A DECIMAL datatype can be no more than 38 digits.  The precision and scale must adhere to the following relation

0 <= scale <= precision <= 38 digits

In the Production.Product table, the weight column’s datatype is defined as DECIMAL(8,2).  The first digit is the precision, the second the scale.

Weight is defined to have eight total digits, two of them to the right of the decimal place.  We’ll the following sample query to illustrate how this data type.

SELECT   DISTINCT Weight
FROM     Production.Product
WHERE    Weight BETWEEN 29.00 and 189.00
ORDER BY Weight DESC

The results follow:

Using DECIMAL data type to display results

FLOAT

Where DECIMAL datatypes are great for exact numbers, FLOATs are really good for long numeric values.  Though a DECIMAL value can have 38 digits total, in many engineering and scientific application this is inadequate.  For scientific applications where extreme numeric values are encountered, FLOAT rises to the top!

FLOATS have a range from – 1.79E+308 to 1.79E+308.  That means the largest value can be 179 followed by 306 zeros (large indeed!).

Because of the way float data is stored in the computer (see IEEE 754 floating point specification) the number stored is an extremely close approximation.  For many application this is good enough.

Because of the approximate behavior, avoid using <> and = operators in the WHERE clause.  Many a DBA has been burned by the statement.

WHERE mass = 2.5

Their expectation are dashed when mass is supposed to equal 2.5, but really, in the computer it is stored as 2.499999999999999; therefore, not equal to 2.500000000000000!

That is the nature of floating points and computers.  You and I see 2.499999999999999 and think for practical purposes it is 2.5, but to the computer, were off just a bit.  J

BIT – Boolean or Yes/No values

There’s times when you just need to store whether something “is” or “is not.”  For instance, whether an employee is active.  It is in these cases that the BIT datatype comes to its own.  This data type be one of three states: 1, 0, or NULL.

The value of 1 signifies TRUE and 0 FALSE.

In this query we’re listing all salaried position job titles

SELECT DISTINCT JobTitle
FROM   HumanResources.Employee
WHERE  SalariedFlag = 1

Here are the results

Using the BIT data type in Searches

We could have also use ‘True’ instead of 1.  Here is the same example using ‘True’

SELECT DISTINCT JobTitle
FROM   HumanResources.Employee
WHERE  SalariedFlag = 'True'

And the opposite using ‘False’

SELECT DISTINCT JobTitle
FROM   HumanResources.Employee
WHERE  SalariedFlag = 'False'

I tend to stick with 1 and 0, since it is easier to type, but if you’re going for readability, then ‘True’ and ‘False’ are good options.

Read more:
https://www.essentialsql.com/commonly-used-sql-server-data-types/

How to Create a Stored Procedures with Dynamic Search & Paging (Pagination)

I used this article as a reference for my very first stored procedure for my internship project so I would like to share it with you if you are looking for some ideas to start a  stored procedure with search and pagination.


How can we apply paging (or pagination) on the returned recordsets from a Stored Procedures?

Pagination is required when a lot of records are returned from an SP and the webpage becomes very heavy to load it, like more than 1k records. And it also becomes difficult for a user to handle that much records and view on a single screen.

Many webpages displaying records has First, Previous, Next & Last buttons on top and bottom of the data grid. So to make the buttons function we’ve to implement the paging functionality on SPs itself.

Let’s check with a sample code using [Person].[Contact] Table of [AdventureWorks] Database:

CREATE PROCEDURE USP_GET_Contacts_DynSearch_Paging
(
    -- Pagination
    @PageNbr            INT = 1,
    @PageSize           INT = 10,
    -- Optional Filters for Dynamic Search
    @ContactID          INT = NULL,
    @FirstName          NVARCHAR(50) = NULL,
    @LastName           NVARCHAR(50) = NULL,
    @EmailAddress       NVARCHAR(50) = NULL,
    @EmailPromotion     INT = NULL,
    @Phone              NVARCHAR(25) = NULL
)
AS
BEGIN
    DECLARE
        @lContactID         INT,
        @lFirstName         NVARCHAR(50),
        @lLastName          NVARCHAR(50),
        @lEmailAddress      NVARCHAR(50),
        @lEmailPromotion    INT,
        @lPhone             NVARCHAR(25)
    
    DECLARE
        @lPageNbr   INT,
        @lPageSize  INT,
        @lFirstRec  INT,
        @lLastRec   INT,
        @lTotalRows INT
 
    SET @lContactID         = @ContactID
    SET @lFirstName         = LTRIM(RTRIM(@FirstName))
    SET @lLastName          = LTRIM(RTRIM(@LastName))
    SET @lEmailAddress      = LTRIM(RTRIM(@EmailAddress))
    SET @lEmailPromotion    = @EmailPromotion
    SET @lPhone             = LTRIM(RTRIM(@Phone))
 
    SET @lPageNbr   = @PageNbr
    SET @lPageSize  = @PageSize
    
    SET @lFirstRec  = ( @lPageNbr - 1 ) * @lPageSize
    SET @lLastRec   = ( @lPageNbr * @lPageSize + 1 )
    SET @lTotalRows = @lFirstRec - @lLastRec + 1
 
    ; WITH CTE_Results
    AS (
        SELECT ROW_NUMBER() OVER (ORDER BY ContactID) AS ROWNUM,
            ContactID,
            Title,
            FirstName,
            MiddleName,
            LastName,
            Suffix,
            EmailAddress,
            EmailPromotion,
            Phone
        FROM Person.Contact
        WHERE
            (@lContactID IS NULL OR ContactID = @lContactID)
        AND (@lFirstName IS NULL OR FirstName LIKE '%' + @lFirstName + '%')
        AND (@lLastName IS NULL OR LastName LIKE '%' + @lLastName + '%')
        AND (@lEmailAddress IS NULL OR EmailAddress LIKE '%' + @lEmailAddress + '%')
        AND (@lEmailPromotion IS NULL OR EmailPromotion = @lEmailPromotion)
        AND (@lPhone IS NULL OR Phone = @lPhone)
    )
    SELECT
        ContactID,
        Title,
        FirstName,
        MiddleName,
        LastName,
        Suffix,
        EmailAddress,
        EmailPromotion,
        Phone
    FROM CTE_Results AS CPC
    WHERE
        ROWNUM > @lFirstRec
    AND ROWNUM < @lLastRec
    ORDER BY ROWNUM ASC
 
END
GO

let’s test this SP:

-- No parameters provided, fetch first 10 default records:
EXEC USP_GET_Contacts_DynSearch_Paging
 
-- On providing @PageSize=20, will fetch 20 records:
EXEC USP_GET_Contacts_DynSearch_Paging @PageSize=20
 
-- On providing @PageNbr=2, @PageSize=10, will display second page, ContactID starting from 11 to 20:
EXEC USP_GET_Contacts_DynSearch_Paging @PageNbr=2, @PageSize=10
 
-- On providing @PageNbr=1, @PageSize=50, @FirstName = 'Sam', it will search FurstName like Sam and will fetch first 50 records:
EXEC USP_GET_Contacts_DynSearch_Paging @PageNbr=1, @PageSize=50, @FirstName = 'Sam'
Retrived from :

https://sqlwithmanoj.com/2011/12/30/creating-stored-procedures-with-dynamic-search-paging-pagination/

Install AdventureWorks2014 and AdventureWorksDW2014 Step by Step

Before installation, please download 2 sample databases from below,


 

1

Open SQL Server Management Studio, Right click on “Databases”, select “Restore Database”,

2
Select “Device” and click button on the right,

3
Click “Add” button,

4
Select AdventureWorks2014.bak, click “OK” button.

5
Click “OK” button

6
Click “OK” button.

7
AdventureWorks2014 has been installed.

Please refer above steps to install AdventureWorksDW2014, it’s exactly the same steps.

 

 

 

Installing SQL Server 2016 Developer Edition and SQL Server Management Studio Step by Step

Installing SQL Server 2016 Developer Edition

First, you need SQL Server 2016 Developer Edition. Here is a link of how to get SQL Server 2016 Developer Edition for free: SQL Server 2016 Developer Edition is Free.

Once you have SQL Server 2016 Developer Edition, start the installation by executing setup.exe. SQL Server Installation Center starts:

1

Click the Installation page:

2

Click the link labelled “New SQL Server stand-alone installation or add features to an existing installation”:

3

Accept the defaults on the Product Key page – Developer Edition and no product key, then click the Next button:

4

On the License Terms page check the “I accept the license terms” checkbox and click the Next button:

5

If the Install Rules page tests pass, click the Next button to proceed:

6

On the Feature Selection page, select the features you want to install and a location for the instance root files. I installed the Database Engine Services, Analysis Services, Reporting Services – Native, Integration Services, and Client Tools Connectivity, Client Tools Backwards Compatibility, Master Data Services. Click the Next button to proceed:

7

On the Instance Configuration page select default instance if it’s your first time installing SQL Server, otherwise select Named instance and give an instance name and clicked the Next button:

8

On the Server Configuration page, I accepted the defaults for services and startup types:

9

On the Database Engine Configuration page, I opted to use Mixed Mode, supplied a strong password, and added the current user to the SQL Server Administrators group.

10

On the Analysis Services Configuration page, select Multidimensional and Data Mining Mode (from where you can also see Tabular Mode and PowerPivot Mode), and click Add Current User, then click Next button:

11

On the Reporting Services Configuration page, select install only, then click the Next button:

The Ready to Install page displayed; I clicked the install button to begin the installation.

12

Installing SQL Server Management Studio

Once the installation completed, I clicked the “Install SQL Server Management Tools” link on the SQL Server Installation Center’s Installation page:

13

The link takes me to a page titled Download SQL Server Management Studio.

14

Once the download is complete, click the Run button to install SSMS 2016:

15

The SQL Server Management Studio (SSMS) installation starts:

16

 

3 Effective Ways of Generating Surrogate Keys With SSIS

A surrogate key is an auto-generated value, usually integer, in the dimension table. It is made the primary key of the table and is used to join a dimension to a fact table. Among other benefits, surrogate keys allow you to maintain history in a dimension table. Despite their popularity, SSIS doesn’t have a built-in solution for generating surrogate keys. Let’s take a look at a few alternatives in this post.

First, create the following table. We will import data from Person. Person table (AdventureWorks sample) into this table. Note: PersonSK is the surrogate key.
CREATE TABLE [dbo].Person
(
[PersonSK] INT IDENTITY(1,1) NOT NULL
, [FirstName] NVARCHAR(50) NULL
, [LastName] NVARCHAR(50) NULL
, CONSTRAINT PK_PersonSK PRIMARY KEY [PersonSK]
) ;


1. With IDENTITY()

Drag a data flow task on to the control flow and configure the OLE DB source.

Next, drag OLE DB destination and connect it to the source. Specify the connection manager, and choose Table or view – fast load as the data access mode. This performs a bulk insert and is the fastest of all.

Destination table has 3 columns, but source has only 2 columns. In the mappings page, map input and output columns for FirstName and LastName and ignore the mapping for PersonSK.

When you run the package, becasue PersonSK is an identity column in the table, SQL Server will automatically generate values for you. This solution is easy and fast, but sometimes depending on your ETL methodology, you can’t rely on IDENTITY().


2.With Script Component

I frequently use Script Transformation. The steps are nicely written in this post by Phil Brammer (b).  This is simply a script used as a transformation. The script generates a row number for each row passed through the data flow.


3.. With ROW_NUMBER()

You can use ROW_NUMBER() when working with SQL Server data source to let the database engine do the work for you. This can be faster.

If you’re doing an incremental load, first find the maximum key value from the destination. I’ll use the following query. It’ll return zero if there were no rows, else it returns the maximum value.

SELECT  ISNULL(MAX(PersonSK),0) SK
FROM    dbo.Person

Add an Execute SQL Task to the control flow and set the result property to single row. Then, add a variable to hold the return value.

Next, connect a data flow task to the execute SQL task. We will use the following SQL statement in the OLE DB source editor. In addition to the LastName and FirstName columns, we are using ROW_NUMBER() function to generate a unique number for every row.

SELECT
ROW_NUMBER() OVER(ORDER BY [LastName]) SK
, [LastName]
, [FirstName]
FROM
Person.Person

The query will generate numbers starting from 1 for each row, but while loading to the destination we don’t want to start from 1. We want to find the maximum value in the destination and start from the next highest value. So, I’m adding the max value to every row number using parameters in the OLE DB source.


In the OLE DB destination, check the box that says Keep Identity. By doing this we are asking SSIS to keep the identity values that are generated by the source. In the mappings page, you’ll see a new input that was created in the OLE DB source. Map it to PersonSK surrogate key.

Go ahead and run the package. If it’s all good you will see cute little green tick marks like below.

Summary

In this example, we looked at different options to generate surrogate keys while loading dimensions using SSIS. We used IDENTITY() constraint in SQL Server. We talked about Script Component. Finally, we saw making use ROW_NUMBER() function. Last option is twice as fast as using the Script Component with around 20,000 rows and an index on LastName column.