Recently i was working on a requirement on exporting the data into excel file. As the data may exceed more than millions the data needs to be split into multiple excel files as row limitation is 1048576 i.e, if the source returns 3 million rows the data needs to be split into 3 destination files.
For this i have used the script component destination in the data flow task to achieve this
The source query for the data source has one single column "CSVCol"
SELECT
convert(varchar(50),D1.DateKey)+ ',' +
convert(varchar(50),D1.FullDateAlternateKey)+ ',' +
convert(varchar(50),D1.DayNumberOfWeek)+ ',' +
convert(varchar(50),D1.EnglishDayNameOfWeek)+ ',' +
convert(varchar(50),D1.SpanishDayNameOfWeek)+ ',' +
convert(varchar(50),D1.FrenchDayNameOfWeek)+ ',' +
convert(varchar(50),D1.DayNumberOfMonth)+ ',' +
convert(varchar(50),D1.DayNumberOfYear)+ ',' +
convert(varchar(50),D1.WeekNumberOfYear)+ ',' +
convert(varchar(50),D1.EnglishMonthName)+ ',' +
convert(varchar(50),D1.SpanishMonthName)+ ',' +
convert(varchar(50),D1.FrenchMonthName)+ ',' +
convert(varchar(50),D1.MonthNumberOfYear)+ ',' +
convert(varchar(50),D1.CalendarQuarter)+ ',' +
convert(varchar(50),D1.CalendarYear)+ ',' +
convert(varchar(50),D1.CalendarSemester)+ ',' +
convert(varchar(50),D1.FiscalQuarter)+ ',' +
convert(varchar(50),D1.FiscalYear)+ ',' +
convert(varchar(50),D1.FiscalSemester)+ ',' AS CSVCol
FROM DimDate D1
Note : This logic can be implemented inside the script task itself to concatenate the columns into the comma separated values.
The following variables are used for the below purposes
Bucket Size - For how many records a new file needs to be created
Filename - Name of the file. Multiple files will be created with _fileno Ex- Sorna_1
FilePath - Path in which file will be created
Header Row - In each file header row needs to be added. For this demo it is hard coded. This also can be achieved dynamically using expressions or other mechanisms
Following namespaces needs to be added in the script component code
using System;
using System.Data;
using Microsoft.SqlServer.Dts.Pipeline.Wrapper;
using Microsoft.SqlServer.Dts.Runtime.Wrapper;
using System.IO;
using Microsoft.CSharp;
Below is the code written in script component
public class ScriptMain : UserComponent
{
int RowCount = 0;
int batchcount = 0;
int FileCount = 0;
int BucketSize = 0;
String Filename = "" ;
String RunFileName = "";
String Filepath = "";
String FileFullpath = "";
String Excelfilepath = "";
string HeaderRow = "";
StreamWriter writer;
Object mv;
public override void PreExecute()
{
base.PreExecute();
RowCount = 0;
batchcount = 0;
FileCount = 1;
Filepath = Variables.FilePath;
Filename = Variables.FileName;
BucketSize = Variables.BucketSize;
HeaderRow = Variables.HeaderRow;
}
public override void PostExecute()
{
base.PostExecute();
writer.Dispose();
}
public override void Input0_ProcessInputRow(Input0Buffer Row)
{
if (RowCount ==0)
{
RunFileName = Filename + "_" + FileCount.ToString();
FileFullpath = Filepath + RunFileName + ".csv";
writer = new System.IO.StreamWriter(FileFullpath);
writer.WriteLine(HeaderRow);
}
writer.WriteLine(Row.CSVCol.ToString());
RowCount++;
batchcount++;
if (batchcount == BucketSize)
{
writer.Dispose();
FileCount++;
batchcount = 0;
RunFileName = Filename + "_" + FileCount.ToString();
FileFullpath = Filepath + RunFileName + ".csv";
writer = new System.IO.StreamWriter(FileFullpath);
writer.WriteLine(HeaderRow);
}
}
}
The package is available in the below link
http://1drv.ms/1o44vnJ
This package will create the files as CSV. This also can be parameterized by putting this in a variable.
Hope this post is helpful for this scenario.