SQL Server basic Question and Answer

What is RDBMS?
Relational Data Base Management Systems (RDBMS) are database management systems that maintain data records and indices in tables. Relationships may be created and maintained across and among the data and tables. In a relational database, relationships between data items are expressed by means of tables. Interdependencies among these tables are expressed by data values rather than by pointers. This allows a high degree of data independence. An RDBMS has the capability to recombine the data items from different files, providing powerful tools for data usage.

What is normalization?
Database normalization is a data design and organization process applied to data structures based on rules that help build relational databases. In relational database design, the process of organizing data to minimize redundancy. Normalization usually involves dividing a database into two or more tables and defining relationships between the tables. The objective is to isolate data so that additions, deletions, and modifications of a field can be made in just one table and then propagated through the rest of the database via the defined relationships.

What are different normalization forms?
1NF: Eliminate Repeating Groups Make a separate table for each set of related attributes, and give each table a primary key. Each field contains at most one value from its attribute domain. 2NF: Eliminate Redundant Data If an attribute depends on only part of a multi-valued key, remove it to a separate table. 3NF: Eliminate Columns Not Dependent On Key  If attributes do not contribute to a description of the key, remove them to a separate table. All attributes must be directly dependent on the primary key BCNF: Boyce-Codd Normal Form If there are non-trivial dependencies between candidate key attributes, separate them out into distinct tables. 4NF: Isolate Independent Multiple Relationships No table may contain two or more 1:n or n:m relationships that are not directly related. 5NF: Isolate Semantically Related Multiple Relationships  There may be practical constrains on information that justify separating logically related many-to-many relationships. ONF: Optimal Normal Form A model limited to only simple (elemental) facts, as expressed in Object Role Model notation. DKNF: Domain-Key Normal Form A model free from all modification anomalies.

Remember, these normalization guidelines are cumulative. For a database to be in 3NF, it must first fulfill all the criteria of a 2NF and 1NF database.

What is Stored Procedure?
A stored procedure is a named group of SQL statements that have been previously created and stored in the server database. Stored procedures accept input parameters so that a single procedure can be used over the network by several clients using different input data. And when the procedure is modified, all clients automatically get the new version. Stored procedures reduce network traffic and improve performance. Stored procedures can be used to help ensure the integrity of the database. e.g. sp_helpdb, sp_renamedb, sp_depends etc.

What is Trigger?
A trigger is a SQL procedure that initiates an action when an event (INSERT, DELETE or UPDATE) occurs. Triggers are stored in and managed by the DBMS.Triggers are used to maintain the referential integrity of data by changing the data in a systematic fashion.  A trigger cannot be called or executed; the DBMS automatically fires the trigger as a result of a data modification to the associated table. Triggers can be viewed as similar to stored procedures in that both consist of procedural logic that is stored at the database level. Stored procedures, however, are not event-drive and are not attached to a specific table as triggers are. Stored procedures are explicitly executed by invoking a CALL to the procedure while triggers are implicitly executed. In addition, triggers can also execute stored procedures.

Nested Trigger: A trigger can also contain INSERT, UPDATE and DELETE logic within itself, so when the trigger is fired because of data modification it can also cause another data modification, thereby firing another trigger. A trigger that contains data modification logic within itself is called a nested trigger.

What is View?
A simple view can be thought of as a subset of a table. It can be used for retrieving data, as well as updating or deleting rows. Rows updated or deleted in the view are updated or deleted in the table the view was created with. It should also be noted that as data in the original table changes, so does data in the view, as views are the way to look at part of the original table. The results of using a view are not permanently stored in the database. The data accessed through a view is actually constructed using standard T-SQL select command and can come from one to many different base tables or even other views.

What is Index?
An index is a physical structure containing pointers to the data. Indices are created in an existing table to locate rows more quickly and efficiently. It is possible to create an index on one or more columns of a table, and each index is given a name. The users cannot see the indexes, they are just used to speed up queries. Effective indexes are one of the best ways to improve performance in a database application. A table scan happens when there is no index available to help a query. In a table scan SQL Server examines every row in the table to satisfy the query results. Table scans are sometimes unavoidable, but on large tables, scans have a terrific impact on performance.

Clustered indexes define the physical sorting of a database table’s rows in the storage media. For this reason, each database table may have only one clustered index. Non-clustered indexes are created outside of the database table and contain a sorted list of references to the table itself.

What is the difference between clustered and a non-clustered index?
A clustered index is a special type of index that reorders the way records in the table are physically stored. Therefore table can have only one clustered index. The leaf nodes of a clustered index contain the data pages.

A nonclustered index is a special type of index in which the logical order of the index does not match the physical stored order of the rows on disk. The leaf node of a nonclustered index does not consist of the data pages. Instead, the leaf nodes contain index rows.

What are the different index configurations a table can have?
A table can have one of the following index configurations:  No indexes A clustered index A clustered index and many nonclustered indexes A nonclustered index Many nonclustered indexes

What is cursors?
Cursor is a database object used by applications to manipulate data in a set on a row-by-row basis, instead of the typical SQL commands that operate on all the rows in the set at one time.

In order to work with a cursor we need to perform some steps in the following order: Declare  cursor Open cursor Fetch row from the cursor Process fetched row Close cursor Deallocate cursor

What is the use of DBCC commands? DBCC stands for database consistency checker. We use these commands to check the consistency of the databases, i.e., maintenance, validation task and status checks. E.g. DBCC CHECKDB - Ensures that tables in the db and the indexes are correctly linked. DBCC CHECKALLOC - To check that all pages in a db are correctly allocated. DBCC CHECKFILEGROUP - Checks all tables file group for any damage.

What is a Linked Server? Linked Servers is a concept in SQL Server by which we can add other SQL Server to a Group and query both the SQL Server dbs using T-SQL Statements. With a linked server, you can create very clean, easy to follow, SQL statements that allow remote data to be retrieved, joined and combined with local data. Storped Procedure sp_addlinkedserver, sp_addlinkedsrvlogin will be used add new Linked Server.

What is Collation? Collation refers to a set of rules that determine how data is sorted and compared. Character data is sorted using rules that define the correct character sequence, with options for specifying casesensitivity, accent marks, kana character types and character width.

What are different type of Collation Sensitivity? Case sensitivity A and a, B and b, etc.

Accent sensitivity a and á, o and ó, etc.

Kana Sensitivity When Japanese kana characters Hiragana and Katakana are treated differently, it is called Kana sensitive.

Width sensitivity When a single-byte character (half-width) and the same character when represented as a double-byte character (full-width) are treated differently then it is width sensitive.

What's the difference between a primary key and a unique key? Both primary key and unique enforce uniqueness of the column on which they are defined. But by default primary key creates a clustered index on the column, where are unique creates a nonclustered index by default. Another major difference is that, primary key doesn't allow NULLs, but unique key allows one NULL only.

How to implement one-to-one, one-to-many and many-to-many relationships while designing tables? One-to-One relationship can be implemented as a single table and rarely as two tables with primary and foreign key relationships. One-to-Many relationships are implemented by splitting the data into two tables with primary key and foreign key relationships. Many-to-Many relationships are implemented using a junction table with the keys from both the tables forming the composite primary key of the junction table.

What is a NOLOCK?
Using the NOLOCK query optimiser hint is generally considered good practice in order to improve concurrency on a busy system. When the NOLOCK hint is included in a SELECT statement, no locks are taken when data is read. The result is a Dirty Read, which means that another process could be updating the data at the exact time you are reading it. There are no guarantees that your query will retrieve the most recent data. The advantage to performance is that your reading of data will not block updates from taking place, and updates will not block your reading of data. SELECT statements take Shared (Read) locks. This means that multiple SELECT statements are allowed simultaneous access, but other processes are blocked from modifying the data. The updates will queue until all the reads have completed, and reads requested after the update will wait for the updates to complete. The result to your system is delay(blocking). 
What is difference between DELETE & TRUNCATE commands? Delete command removes the rows from a table based on the condition that we provide with a WHERE clause. Truncate will actually remove all the rows from a table and there will be no data in the table after we run the truncate command. 
TRUNCATE TRUNCATE is faster and uses fewer system and transaction log resources than DELETE. TRUNCATE removes the data by deallocating the data pages used to store the table’s data, and only the page deallocations are recorded in the transaction log.     TRUNCATE removes all rows from a table, but the table structure and its columns, constraints, indexes and so on remain. The counter used by an identity for new rows is reset to the seed for the column. You cannot use TRUNCATE TABLE on a table referenced by a FOREIGN KEY constraint. Because TRUNCATE TABLE is not logged, it cannot activate a trigger. TRUNCATE can not be Rolled back. TRUNCATE is DDL Command.  TRUNCATE Resets identity of the table. 
DELETE DELETE removes  rows one at a time and records an entry in the transaction log for each deleted row. If you want to retain the identity counter, use DELETE instead. If you want to remove table definition and its data, use the DROP TABLE statement. DELETE Can be used with or without a WHERE clause DELETE Activates Triggers. DELETE Can be Rolled back. DELETE is DML Command. DELETE does not reset identity of the table. 

Difference between Function and Stored Procedure? UDF can be used in the SQL statements anywhere in the WHERE/HAVING/SELECT section where as Stored procedures cannot be. UDFs that return tables can be treated as another rowset. This can be used in JOINs with other tables. Inline UDF's can be though of as views that take parameters and can be used in JOINs and other Rowset operations.

When is the use of UPDATE_STATISTICS command? This command is basically used when a large processing of data has occurred. If a large amount of deletions any modification or Bulk Copy into the tables has occurred, it has to update the indexes to take these changes into account. UPDATE_STATISTICS updates the indexes on these tables accordingly.

What types of Joins are possible with Sql Server? Joins are used in queries to explain how different tables are related. Joins also let you select data from a table depending upon data from another table. Types of joins: INNER JOINs, OUTER JOINs, CROSS JOINs. OUTER JOINs are further classified as LEFT OUTER JOINS, RIGHT OUTER JOINS and FULL OUTER JOINS.

Creating and using memory optimized table

Introduction

Memory optimized tables allow you to keep the data in memory and return the result fast whenever required. Since the data is stored on RAM so the disk io is 0 for memory optimized tables.

Steps to create memory optimized tables

Step1 

Create a database and add a filegroup and file and set memory optimize data on as below.

CREATE DATABASE imoDB   GO 
-------------------------------------- 
-- create database with a memory-optimized filegroup and a container. 
ALTER DATABASE imoDB   ADD FILEGROUP imoDB_mod CONTAINS MEMORY_OPTIMIZED_DATA   ALTER DATABASE imoDB   ADD FILE (name='memoltp_mod1', filename='c:\data\imoltp_mod1') TO FILEGROUP imoDB_mod   ALTER DATABASE imoDB   SET MEMORY_OPTIMIZED_ELEVATE_TO_SNAPSHOT=ON  GO

Step 2 

Using above database you can use the below syntax to create a memory optimized table.
USE imoDB
GO
CREATE TABLE EMPLOYEE
(
EMPID INT IDENTITY(1,1),
EMPNAME NVARCHAR(255),
[ADDRESS] NVARCHAR(400)
) WITH (MEMORY_OPTIMIZED=ON)

Step 3 
Create natively compiled stored to insert and retrieve data.

CREATE PROCEDURE ADDEmployees
(
@number INT
)

with native_compilation,
         schemabinding,
         execute as owner
as
begin atomic
with (transaction isolation level = snapshot,
          language = N'us_english')
DECLARE @EmpID INT = 1

WHILE @EmpID <= @number
BEGIN

INSERT INTO EMPLOYEE VALUES('Employee_'+ CAST(@EmpID AS NVARCHAR(10)),'Address_'+CAST(@EmpID AS NVARCHAR(10)))

SET @EmpID = @EmpID + 1
END

To retrieve data

CREATE PROCEDURE ADDEmployees
(
@EmpID INT
)

with native_compilation,
         schemabinding,
         execute as owner
as
begin atomic
with (transaction isolation level = snapshot,
          language = N'us_english')
SELECT * FROM EMPLOYEE WHERE EMPID = @EmpID
end;

SQL Server In Memory OLTP - Memory Optimized Table

The memory optimized tables were first introduced with SQL 2014. The basic concept behind the introduction of memory optimized tables was to equip the sql server with 100x processing capabilities for some specific requirements (tables).
Before recommending memory optimized table you must understand about the memory optimized tables and how the data is stored, what type of indexes can be created on IM (In Memory) tables. There are basically two types of IM tables i.e schema durable and schema and data durable.

To learn more about In Memory OLTP you can go to the below MSDN Link.

https://docs.microsoft.com/en-us/sql/relational-databases/in-memory-oltp/memory-optimized-tables

I have done few test with the processing capabilities of the IM tables and compared the same with disk based tables which are mentioned below.

I had loaded around 18000000 rows and tried to query data against that. to return 10000 rows based on some parameter condition the disk based table took approximately 5520 millisecond and IM table took 853 millisecond.

Now lets come to pros and cons (SCHEMA_DATA durable table)
  • We see 10-100x performance improvement while retrieving data using natively compiled procedures.
  • Disk based table occupied around 8GB data on drive and for same amount of data around 27GB of RMA was used.
  • You can see the retrieval performance through natively optimized code only.
  • DDL operation is very time and memory intensive once data is added to the table, so carefully use it.
  • You might need to create resource pool in case if you face any out of memory exception while creating/loading a large table.
  • If you restart the SQL Server service then for sometime your DB might go to recovery mode till all the IM table data loaded back to RAM.

Please read through the query processing guideline


Also read about indexes and buckets



T-SQL LEAD LAG and SUM function based query

  Query on T-SQL window clause Below is the sales table Order_Date Name Product SubCategory ...