New data compression - Columnstore Archive

All who have used columnstore index must know that columnstore index also compress data.
Many cases after some time this data become historical and use of this data reduces over time.
In this case we can further compress data sung columnsore_archive compression option. Retrieving data takes time after this compression but its acceptable as it reduces storage and its usage frequency is low compare to current data

Lets see with example



CREATE TABLE test
  (
     id   INT,
     data VARCHAR(max)
   
  )

DECLARE @i INT

SET @i = 1

WHILE @i <= 100000
  BEGIN
      INSERT INTO test
      VALUES      (@i,
                   'The value of i is '
                   + CONVERT(VARCHAR(10), @i))

      SET @I = @i + 1
  END

SP_SPACEUSED test

Normal Table


Now we will create clustered columnstore index here.
It will compress data

 
CREATE CLUSTERED COLUMNSTORE INDEX c1 ON test

Table with clustered columnstore index









Now we will compress more using columnstore_archive option 
This will compress data even more. 


ALTER INDEX c1 ON test REBUILD WITH (data_compression = columnstore_archive)

Table with columnstore_archive compression option

 
 



Here we can see size reduces from 8848 KB to 2704 KB and then 1232 KB.


Inline indexes


SQL 2014 has many silent features one of them is inline indexes.
Whats inline indexes?
Inline indexes are which can be created while creating table.
We can specify index in Create Table statement itself.

Here is an example


CREATE TABLE test_table
  (
     id   INT,
     data VARCHAR(max),
       INDEX idx1 (id)
   )






Limitations
1.We can create filtered index
2.We can not create columnstore index.

Good news is that both this issues resolved in SQL Server 2014.

Optimized select into in SQL Server 2014

Lets have quick glance of this feature.
As we all know select .... into clause and how to use it.
Now in SQL 2014 its more optimized and will operate in parallel also for better performance.


Lets see with example

I have one table test_table with 200 Million records in SQL Server 2014 and 2012 both.

We will run Select .... into query and check execution plan



SQL Server 2012

SELECT *
INTO   new_table
FROM   xyz




SQL Server 2014


SELECT *
INTO   new_table
FROM   xyz






As we can see in execution plan optimizer is using parallelism operator to dump data.


Popular Posts