Data Engineering Technical Blog: December 2017

Thursday, December 14, 2017

Download Links

https://1drv.ms/b/s!ApWtBIJhNRPShXA6005hVJKk1SoK

PowerBI Learning (Android Apps)

https://play.google.com/store/apps/details?id=com.thunkable.android.pgayathri15.PowerBI_Training

https://play.google.com/store/apps/details?id=com.wLearnPowerBI_7062146

Friday, December 8, 2017

Download and Install Azure Feature Pack for SSIS

Azure Feature pack allows SSIS package to connect with Azure services with readily available component which has to be installed separately after installing SSDT. Azure feature pack is available for SQL 2012 and later version.

You can download and install the feature pack from microsoft website - (https://docs.microsoft.com/en-us/sql/integration-services/azure-feature-pack-for-integration-services-ssis)

Please go through the prerequisite before installing the feature pack.

Installing feature pack will allow you to connect with big adat sources in Azure like Azure DataLake Store and HDInsight. You will get the complete list of connection managers in the download page.

After clicking on the link for a specific version you are redirected to the download page where you get the download button and all system requirements. Click on download button to download the specific architecture (x86 for 32 bit and x64 for 64 bit)

Once you download the MSI file you can double click the file to install the pack. Once pack is installed then you can refresh the toolbox for open projects to see the feature pack listed, it automatically displays the feature pack for new projects.

Sunday, December 3, 2017

SQL Server Procedure for Index maintenance

Index maintenance

For better query result its always good to have index statistics up to date. We normally look fragmentation percentage on indices which can be due to batch load or batch delete etc. We might observe fragmentation due to fill factors also.

If we go by the MSDN guideline then we normally do reorganize of the indices if the fragmentation is less than 35% and do rebuild of indices if the fragmentation is greater than 35%.

Below procedure accepts DB name as parameter and does the job for all the tables.

Stored procedure for index rebuild and reorganize

CREATE PROCEDURE [dbo].[SP_MainTainIndices]
(
@Database NVARCHAR(4000) = N'TestDB'
)
AS
BEGIN
DECLARE @ReorgIndex TABLE (ID INT IDENTITY(1,1),indexID INT,Table_Name NVARCHAR(4000), Index_Name NVARCHAR(4000))
DECLARE @RebuildIndex TABLE (ID INT IDENTITY(1,1),indexID INT,Table_Name NVARCHAR(4000), Index_Name NVARCHAR(4000))
DECLARE @id INT =1 , @idMax INT ;
DECLARE @sql NVARCHAR(MAX) = ''

INSERT INTO @ReorgIndex (indexID,Table_Name,Index_Name)
SELECT a.index_id,('['+D.name + '].['+ c.name+']') AS Table_Name, b.name AS Index_Name
FROM sys.dm_db_index_physical_stats (DB_ID(@Database), null, NULL, NULL, NULL) AS a
    JOIN sys.indexes (nolock) AS b ON a.object_id = b.object_id AND a.index_id = b.index_id
       JOIN sys.tables (nolock) AS C ON a.object_id = C.object_id
       INNER JOIN sys.schemas (nolock) AS D ON C.schema_id = D.schema_id
       WHERE avg_fragmentation_in_percent > 0 AND avg_fragmentation_in_percent <= 35
       AND b.Name IS NOT NULL
       ORDER BY D.name,C.name,avg_fragmentation_in_percent;

INSERT INTO @RebuildIndex (indexID,Table_Name,Index_Name)
SELECT a.index_id,('['+D.name + '].['+ c.name+']') AS Table_Name, b.name AS Index_Name
FROM sys.dm_db_index_physical_stats (DB_ID(@Database), null, NULL, NULL, NULL) AS a
    JOIN sys.indexes (nolock) AS b ON a.object_id = b.object_id AND a.index_id = b.index_id
       JOIN sys.tables (nolock) AS C ON a.object_id = C.object_id
       INNER JOIN sys.schemas (nolock) AS D ON C.schema_id = D.schema_id
       WHERE avg_fragmentation_in_percent > 35
       AND b.name IS NOT NULL
       ORDER BY D.name,C.name,avg_fragmentation_in_percent;

SELECT @idMax = MAX(ID) FROM @ReorgIndex
WHILE (@id <= @idMax)
BEGIN

SELECT @sql = (' ALTER INDEX ['+ Index_Name + '] ON ' + Table_Name +' REORGANIZE ')
FROM @ReorgIndex WHERE ID = @id
EXEC (@sql)
SET @sql = ''
SET @id = @id + 1

END

SET @id = 1
SELECT @idMax = MAX(ID) FROM @RebuildIndex

WHILE (@id <= @idMax)
BEGIN

SELECT @sql = (' ALTER INDEX ['+ Index_Name + '] ON ' + Table_Name +' REBUILD ')
FROM @RebuildIndex WHERE ID = @id
EXEC (@sql)
SET @sql = ''
SET @id = @id + 1

END

END

SQL Server Integration services 2017

SQL Server Integration Services 2017 new features

scale out - this is the most waited feature for better buffer management and performance and high availability.
New SSIS supports SQL Server in Linux environment, it allows you to run SSIS in Linux environment using command line.
Distributed processing of the SSIS could have been possible due to the scale out feature.
Odata source and connection manager supports Odata feed from Dynamics Ax and Dynamics CRM online.
Now you can deploy your SSIS package in Azure SQL SSIS Catalogue (SSISDB).

Many improvements done on database engine, MDS, Analysis Services and Azure Feature pack.

Saturday, December 2, 2017

PowerBI Syllabus

PART 1

Understanding BI and Visualization and usage
DataSources which PowerBI can connect to.
Connecting to sources and creating computed columns and joins using M-Language.

PART 2

Visualizing data in power bi.
Introduction to visual and slicers
modelling data, setting relationship.
Deploying and viewing report.

PART 3

Using powerBI portal
Row level security
Sharing Dashboard
Workspaces
Custom Visuals
DAX and Measures

PART 4

Lab for pulling data from HdInsight
Lab for custom visual
Exercise

SQL Server Common Table Expression

Common Table Expression

Common table expression is intermediate (temporary) result-set on which we can do DML operation.

Scope of CTE is next to the definition of the CTE object.

Usage of CTE

CTE can be used to created recursive query like getting the company hierarchy from a self referencing table.
Delete duplicate data from a table.
Solve complex grouping and ranking needs and result-set based on that.

Limitations

We can not use INTO, ORDER BY, OPTION
CTE can be followed by one DML statement which can reference CTE.
Query referenced by CTE can not be used to define cursor.

Syntax

WITH <CTE_NAME> AS

(

CTE_QUERY

)

DML_Query_Ref_CTE

Queries on common table expression

From Employee table with columns emp_id, emp_name, manager_id find the employee hierarchy of the company.
From product table with columns prod_id, prod_name, price, last_update. Delete the duplicate products by keeping only the ones with most recent last_updated date.

Friday, December 1, 2017

Google BigQuery Dataset

Google BigQuery

Google BigQuery is google's big data platform where you can store your data and use it for your analysis and reporting.

Prerequisite

You have already signed up for google cloud trial license and can access your project.

Create and Use BigQuery dataset

Login to google cloud console and navigate to bigdata section on left menus.
Click on BigQuery, it will open a new tab for Google BigQuery
In the new window go to you project name and click the dropdown as shown below.

Click on the "Create new dataset" option, you will get a new window.
New window has options for dataset expirary,location and id, you can fill the required details based on your requirement and click on ok.
Once your dataset name appears below your project hover there and click on the + icon as shown below.

Click on above icon you get the option to upload your file, you can choose appropriate options like delimiter, skiprows, file format etc.

You can save your data and use it for query or getting into other application. You can look into the below video on how to get data from BigQuery to PowerBI.

Data Engineering Technical Blog

Thursday, December 14, 2017

Power BI PDF Download Links

Download Links

https://1drv.ms/b/s!ApWtBIJhNRPShXA6005hVJKk1SoK

PowerBI Learning (Android Apps)

Friday, December 8, 2017

Azure Feature Pack for SSIS

Download and Install Azure Feature Pack for SSIS

Sunday, December 3, 2017

SQL Server Procedure for Index maintenance

Index maintenance

Stored procedure for index rebuild and reorganize

SQL Server Integration services 2017

SQL Server Integration Services 2017 new features

Saturday, December 2, 2017

PowerBI Syllabus

PowerBI Syllabus

PART 1

PART 2

PART 3

PART 4

SQL Server Common Table Expression

Common Table Expression

Usage of CTE

Limitations

Syntax

Queries on common table expression

Friday, December 1, 2017

Google BigQuery Dataset

Google BigQuery

Prerequisite

Create and Use BigQuery dataset

PowerBI monthly report refresh

Pages

Search This Blog