maart 2014 ~ SSIS Tips and Solutions

dinsdag 25 maart 2014

Select data from previous day, previous week, previous month or previous year

DATEADD method

The DATEADD method returns a specified date with the specified number interval (signed integer) added to a specified DATEPART of that date.
(http://technet.microsoft.com/en-us/library/ms186819.aspx)

With DATEADD, you can add a period of time to your date. Plus is in the future, minus is in the past. DATEADD gives you the results you want if you want to search for a specific time.

It depends on the time frame you wish to search in. If you want the data from last day or only 5 days ago, then DATEADD is fine. If you want the data for the last 5 days then you are talking about a period of time starting 5 days ago till now (or yesterday). Then we need to use DATEDIFF within DATEADD. The same applies if you search on other intervals (weeks, months, quarters, years).

You should use the DATEADD method to all parts of your date if you have for example multiple months and/or years. Otherwise you will end up with the data of all previous selected day number available.

Example:

SELECT * FROM yourTable
WHERE  DATEPART(YEAR,YourDate) = DATEPART(YEAR,DATEADD(day,-1,GETDATE()))
 AND   DATEPART(MONTH,YourDate) = DATEPART(MONTH,DATEADD(day,-1,GETDATE()))
 AND   DATEPART(DAY,YourDate) = DATEPART(DAY,DATEADD(day,-1,GETDATE()))

Using the day as interval and splitting the month and year from the date you ensure three facts:

Ignore time, in this case irrelevant
When it is the first day of the month, the query automatically goes and checks the previous month. When it is the first day of the year, it also looks in the previous year
Only get the wanted year and month. For example, today it is 2014-03-25. If we only use the day part of the date, the query will return all records with the day number 25. This will apply for Jan 25th, Feb 25th and Mar 25th. If you have data in 2013, you will get the 25th of all months present in your dataset for 2013.

If your interval is a month´s period, then just drop the last part of the query (DATEADD -> day) and change your interval in the DATEADD part of the query to MONTH

Example:

SELECT * FROM yourTable
WHERE  DATEPART(YEAR,YourDate) = DATEPART(YEAR,DATEADD(month,-1,GETDATE()))
 AND   DATEPART(MONTH,YourDate) = DATEPART(MONTH,DATEADD(month,-1,GETDATE()))

You can do the same with a week´s period.

SELECT * FROM yourTable
WHERE  DATEPART(YEAR,YourDate) = DATEPART(YEAR,DATEADD(week,-1,GETDATE()))
 AND   DATEPART(MONTH,YourDate) = DATEPART(MONTH,DATEADD(week,-1,GETDATE()))
AND   DATEPART(WEEK,YourDate) = DATEPART(WEEK,DATEADD(week,-1,GETDATE()))

Mind that a -1 week returns the record from last week and it depends on your settings which days will be part of this (Sunday to Saturday or Monday to Sunday etc). This is totally different that the question: give me all sales for the last 7 days. Seven days are also a week, but not calendar week.

DATEDIFF method

The DATEDIFF method returns the count of the specified datepart boundaries crossed between the specified startdate and enddate.
(http://technet.microsoft.com/en-us/library/ms189794.aspx)

By using the DATEDIFF function together with a specific DATEPART, let’s say DAY, we eliminate the existence of time in a date value. Together with the DATEADD method, we can determine the minimum date by subtracting the required days of your start off date. Then you can use your start off date as the maximum date for the comparison. Usually the start off day is today-> GETDATE(). This way we can set a start and an end date for our query without bothering for the eventually existent time.

Get today no time:

SELECT DATEADD(DAY,DATEDIFF(DAY,0,GETDATE()),0)

Get yesterday no time:

SELECT DATEADD(DAY,DATEDIFF(DAY,1,GETDATE()),0)

Below query will return all rows from only yesterday:

SELECT * FROM yourTable
WHERE YourDate BETWEEN dateadd(day,datediff(day,1,GETDATE()),0)
AND dateadd(day,datediff(day,0,GETDATE()),0)

If you, for example, wish to retrieve all data for the last 5 days your query becomes:

SELECT * FROM yourTable
WHERE YourDate BETWEEN dateadd(day,datediff(day,5,GETDATE()),0)
AND dateadd(day,datediff(day,0,GETDATE()),0)

For an overview of all Transact-SQL date and time data types and functions, see Date and Time Data Types and Functions (Transact-SQL)

Find the gaps in a sequence with SQL

15:56 No comments

Case

At Allianz Global Assistance we sell, among other products, travel insurances. Every day of the year. Once a month our sold policies are extracted from a middleware system in several CSV files and need to be imported in our Datawarehouse. Before doing that, we want to check if there are no days missing in the files. So I needed to create something to check for gaps in the date sequence. The date in the file I need to check is in an ISO format: 20140321

Solution

There are different approaches on this matter. In all cases we need to have a reference "table" where we know the sequence is 100% correct. Then we can compare our real data against this reference list.

I found an interesting post at some website called eidias. There you can find 5 possible ways to address this problem.

Finding gaps in a sequence of identifier values of sql table

I used a query similar to the one of option 2 in the above website. First of all I created a table with one column containing all dates from 20120101 to 20201231. Doing that in Excel cost me 10 minutes. If you are in a Datawarehouse environment, you could probably also make use of your Time dimension.

This is my stored procedure, quite simple. I guess you will need to focus at the bottom query as all of the rest is related to the specific needs of my problem.

ALTER PROCEDURE [dbo].[sp_MCR_CheckIntegrationDatesMissing]
 @CurrPeriod AS NVARCHAR(6)
AS
BEGIN
 -- SET NOCOUNT ON added to prevent extra result sets from
 -- interfering with SELECT statements.
 SET FMTONLY OFF
 SET NOCOUNT ON;

-- DECLARE @CurrPeriod AS NVARCHAR
 DECLARE @CurrYear AS INTEGER
 DECLARE @CurrMonth AS INTEGER
 DECLARE @NumberOfDaysInMonth AS INTEGER


-- SET @CurrPeriod = 201212
 SET @CurrYear = CAST(LEFT(@CurrPeriod,4) AS INT)
 SET @CurrMonth = CAST(RIGHT(@CurrPeriod,2) AS INT)

 --Find the minimum and maximum value of the sequence column of your table
 DECLARE @MinDate AS INTEGER
 DECLARE @MaxDate AS INTEGER

 SET @MinDate = CAST(CAST(@CurrYear AS VARCHAR(4))+CAST(@CurrMonth AS VARCHAR(2))+'00' AS INT)

 SET @NumberOfDaysInMonth =  CASE @CurrMonth
        WHEN 1 THEN 31
        WHEN 2 THEN 28
        WHEN 3 THEN 31
        WHEN 4 THEN 30
        WHEN 5 THEN 31
        WHEN 6 THEN 30
        WHEN 7 THEN 31
        WHEN 8 THEN 31
        WHEN 9 THEN 30
        WHEN 10 THEN 31
        WHEN 11 THEN 30
        WHEN 12 THEN 31
        END
        
 SET @MaxDate = @MinDate + @NumberOfDaysInMonth

 --Query to list the missing sequence numbers
 SELECT  *
 FROM
 (
  SELECT IntegrationDate AS IntegrationDateMissing FROM POL_REF_Integration_Dates WHERE IntegrationDate BETWEEN @MinDate AND @MaxDate
 ) temp
 where IntegrationDateMissing not in
 (
  SELECT [Integration date] FROM PolicyFile
 )

SSRS Failed Subscription Notifications

16:53 1 comment

Case

A nice feature of Reporting Services is that it can email reports on a regular basis to certain recipients including attachments in several file types. After having implemented several reports with a lot of subscriptions to different recipients it becomes a bit of a headache to monitor if reports are still delivered.

SSRS is quite picky. It takes only one invalid email adres to cause the report not be sent. To no one. If one of the report parameters is changed and you forget to change the subscription the whole thing fails. With no notification.

I would like to know which subscriptions on reports failed the last x number of days.

Solution

I created a job on SQL server with one step that executes every morning. That job creates a HTML table in the email body telling me which subscriptions failed over the last 48 hours.

Step 1: Create a new job

Step 2: Create a new step, give it a nice name and choose for Type: Transact-SQL script (T-SQL). Save the SQL code you can find further below on this page, save it, create a schedule and you are good to go.

Replace <SERVERNAME> with the server name where the reporting services (databasename: ReportServer) are running.

DECLARE @TableHTML NVARCHAR(MAX);
	
	SET @TableHTML = 
		N'<h1> Failed Report Subscriptions Last 48 Hours</H1>' +
		N'<table border="1">' +
		N'<tr><th>Name</th><th>LastRunTime</th><th>LastStatus</th><th>Description</th></tr>' +
		CAST ((
			SELECT	td = C.Name, '',
					td = S.LastRunTime, '',
					td = S.LastStatus, '',
					td = S.Description
			FROM	<servername>.ReportServer.dbo.Subscriptions AS S
			LEFT OUTER JOIN <servername>.ReportServer.dbo.[Catalog] AS C
					ON C.ItemID = S.Report_OID
			WHERE	LEFT (S.LastStatus, 12) != 'Mail sent to'
			AND		LEFT (S.LastStatus, 12) != 'New Subscrip'
			AND		RIGHT (S.LastStatus, 9) != '0 errors.'
			AND		S.lastruntime > dateadd(hour,-48,cast(cast(GETDATE() as DATE) as datetime)) FOR XML PATH('tr'), TYPE
		) AS NVARCHAR(MAX) ) + N'</table>' ;

    EXEC msdb.dbo.sp_send_dbmail  
    @recipients = 'your.address@email.com',
--    @copy_recipients = <cc field>,
    @profile_name = 'System Administrator',
    @subject = 'SSRS Subscription Failures',
    @body = @TableHTML,
    @body_format = 'HTML';

Mapping Stored Procedures in SSIS OLE DB Source Editor

20:51 1 comment

Case

Usually I use the "Table of view" or a plain SQL command in an OLE DB Source Editor to collect the data I need. That is rather straight forward and simple. Last week I needed to add some more steps and logic in my query and decided to go for a Stored Procedure. To make it more usable, I wanted to add a parameter. Peanuts one would think. Everything seemed to work fine on design mode, I even got a list of expected columns when clicking on the columns tab in the OLE DB Source Editor.

Well... It didn't work.

For this tutorial I used the AdvancedWorks database voor SQL Server 2008

I started with creating a simple SPROC which returns all products based on size and color. You can ofcourse replace this with your own SPROC. After that I defined my variables and created a connection to the database. I also created a connection to a flat file just to send the data somewhere and check if it is correct.

After that I added a Data Flow Task in my project and an OLE DB Source in there. Linked it to AdventureWorks and chose for SQL Command as Data Access Mode. See the following screenshot.

This seems all right. I even got the column list back when clicking on "Columns".

Unfortunatelly when running the package the following error occurs:

[Get Products By Color And Size [1]] Error: The SQL command requires a parameter named "@Size", which is not found in the parameter mapping.

What happened? I did pass the parameters so what's the problem??

Solution

Well, it seems like SSIS really wants the parameters linked to the SPROC having exactly the same name as the parameters in your SPROC.

Check the changes in the "SQL Command" and the "Set Query Parameters" windows in the following screenshot.

You can download a copy of the package with the link below:
MappingSPROCParamsInSSIS.dtsx

SSIS Tips and Solutions

Tips and Solutions related to Microsoft BI (SSRS, SSAS, SSIS)

dinsdag 25 maart 2014

Select data from previous day, previous week, previous month or previous year

DATEADD method

DATEDIFF method

zaterdag 22 maart 2014

Find the gaps in a sequence with SQL

Case

Solution

vrijdag 21 maart 2014

SSRS Failed Subscription Notifications

Case

Solution

donderdag 20 maart 2014

Mapping Stored Procedures in SSIS OLE DB Source Editor

Case

Solution

Popular Posts

Labels

Blogarchief