When loading files into staging tables using SSIS, it is often helpful to include the filename during the load so you can verify where the data came from. This is easily accomplished with a buried option which in my opinion should be better spotlighted called the Advanced Editor for modifying flat file sources.
Often when loading data from another system into a data warehouse, someone will ask “where did this data come from?”. You can say, it just came from some file that was given to you or you can be a data professional and give them the exact filename that was used.
Below is a table that gets populated from a file from Accounts Receivable showing new accounts. It would be nice to know which file that they supplied us that was used to load them into the database.
First some background on our current ETL process. Below is a simple foreach loop that loads multiple csv files that are located in a specific directory. And you saw above that our table had 3 columns.
First thing we need to do is modify our table to include a new field name. Below I name the field “SourceFileName”, but you can pick any name. I also made the data type NVARCHAR. You can make it varchar, but then you will need to do a data type conversion anyway to get it to load.
Now go into the data flow tab and right click the data source for flat file accounts or whatever yours may be named for the source file. Then click Show Advanced Editor.
This brings up the advanced editor. Click the Component Properties tab. At the bottom there are options for custom properties. For the FileNameColumnName field, put in your column name that you added to the database.
You can use any name, but I like to match my fields with the names in the database so I use the same name: SourceFileName that we added above.
Now double click the destination.
Click the Mapping option on the left side. You can see the destination column SourceFileName has an ignore input column.
Change the input column to SourceFileName which is what we added in the advanced editor for flat file above. Your input column could be named different from what I used below.
I truncated the database table and reran the SSIS package again.
Looking at the table again, we can see the filename for each row got loaded into the SourceFileName field.
Using this feature, you can now better document your ETL processes.