Consolidate data in Excel and merge multiple sheets into one worksheet
Without a solid knowledge of Excel VBA programming, this task typically entails opening each file, copying the data, and then pasting the data into a single workbook. Today, I want to show you a relatively easy way to do this using a Power Query function. The basis for this tip comes from an article Hilmar Buchta posted about consolidating Hadoop files.
Query steps are embedded lines of M code that allow your actions to be repeated each time you refresh your Power Query data. You can extend the utility of that embedded code with your own custom function. The idea is relatively simple. Build a starter query via the Power Query Editor, and then wrap the resulting M code in your own function.
Each file contains a table with the same structure on a worksheet named MySheet. Build a query that connects to just one of the Excel files. After a few seconds, the Navigator pane will activate. In the Navigator pane, choose the sheet that holds the data which needs to be consolidated, and then click the Edit button to open the Query Editor. Next, apply any needed transformation actions, using the Query Editor to apply some basic transformations.
In this example, you can see that the first row was promoted to column headers and a few unneeded columns were removed. Edit the embedded code to create your function. As you can see, while you leisurely build your query in the Query Editor, Power Query diligently creates the bulk of the code for you. Note that in the portion of the code highlighted in gray (for illustration), Power Query hard-coded the file path and the file name of the Excel file originally selected.
The idea is to wrap this starter code in a custom function which will pass a dynamic file path and file name. Wrap the starter code with your function tags, specifying a FilePath parameter and a FileName parameter. Replace the hard-coded file path and file name with your dynamic parameters. In the Query Settings pane, change the name of the query in the Name input box. The goal here is to give your function a reasonably descriptive name (in this scenario, fGetMyFiles).
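As a rough sketch of what the wrapped function might look like in the Advanced Editor: the FilePath and FileName parameters, the MySheet worksheet, and the fGetMyFiles query name come from the steps above, while the step names and the "Column To Drop" column are hypothetical stand-ins for whatever your own starter query generated.

```powerquery
// fGetMyFiles: starter query wrapped in a function (a sketch, not the exact generated code)
(FilePath as text, FileName as text) as table =>
let
    // The hard-coded path and file name are replaced by the two parameters
    Source = Excel.Workbook(File.Contents(FilePath & FileName), null, true),
    // Navigate to the worksheet that holds the data in every file
    MySheet = Source{[Item = "MySheet", Kind = "Sheet"]}[Data],
    // Promote the first row to column headers
    PromotedHeaders = Table.PromoteHeaders(MySheet),
    // "Column To Drop" is a hypothetical name; list the columns you actually removed
    RemovedColumns = Table.RemoveColumns(PromotedHeaders, {"Column To Drop"}, MissingField.Ignore)
in
    RemovedColumns
```

Concatenating the two parameters with `&` assumes the folder path ends with a backslash, which is how the Folder Path column of a folder query is normally formatted.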
Unfortunately, there is no way to create and use a custom function without also creating a table. Use your newly created function to combine all Excel files. The Query Editor window will activate to show you a table containing a record for each file in the chosen directory.
The Folder Path and Name columns will provide our function with the needed FilePath and FileName parameters. Right-click on any column header and select the Insert Custom Column action. In the Insert Custom Column dialog box, invoke your function and pass the Folder Path and Name fields as parameters separated by a comma. Once you confirm your changes, Power Query will trigger the function for each row in the data table.
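The resulting query might look roughly like the sketch below. The folder `C:\MyFiles` and the step names are assumptions for illustration; the function name fGetMyFiles and the Folder Path and Name fields are the ones described above.

```powerquery
let
    // Hypothetical folder; point this at the directory holding your Excel files
    Source = Folder.Files("C:\MyFiles"),
    // Call the custom function once per row, passing that row's folder path and file name
    AddedCustom = Table.AddColumn(Source, "Custom", each fGetMyFiles([Folder Path], [Name]))
in
    AddedCustom
```

In the Insert Custom Column dialog box itself, only the expression after `each` is typed: `fGetMyFiles([Folder Path], [Name])`.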
Click the Custom column header to see a list of fields included in each table array. Here, you can choose which fields in the table array to show, click the Expand radio button, and then click the OK button.
With each table array expanded, Power Query exposes the columns pulled from each Excel file and adds the detailed records to the data preview. As long as the Excel files are in the same location with the same naming conventions, you can right-click the output and click the Refresh option to have Power Query automatically re-combine any fresh data. This means you can run through this setup once, then simply refresh the query to re-combine your Excel files.
One small annoyance is that Power Query functions apply only to the workbook in which they reside. Now, this may seem like a lot of steps, but think about it.
For all the steps required to accomplish this task, very little effort was actually expended on writing the code for the function. Power Query wrote the code for the core functionality, and we simply wrapped that code into a function. You can leverage the Query Editor to create some base code, then just customize from there. Excel and I will be putting on a new Business Intelligence Boot Camp April in Bentonville Arkansas, where 75 lucky registrants will be getting tips and tricks like the one you just learned.
This 3-day event is aimed squarely at business analysts and managers who find it increasingly necessary to become more efficient at working with the new Microsoft BI tools like Power Pivot and Power Query.
Click here to get more details.

Since I already have a number of files which consume allogmerated!

After you output the results, go back in and edit the query for the output. In the Advanced Editor, find the line of code that expands the table array. Notice that it contains two sets of column names: the original names and the funky output names starting with Custom.
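For illustration, here is a small self-contained sketch of such an expand step. The Region and Sales columns and the sample rows are made up; only the shape of the `Table.ExpandTableColumn` call matters. The third argument lists the original column names, and the fourth lists the output names.

```powerquery
let
    // Made-up stand-in for the table produced by the custom column step
    Nested = #table(
        {"Name", "Custom"},
        {{"a.xlsx", #table({"Region", "Sales"}, {{"East", 10}})}}
    ),
    // Generated form would use {"Custom.Region", "Custom.Sales"} as the last argument;
    // here the second list has been edited to plain names
    Expanded = Table.ExpandTableColumn(Nested, "Custom", {"Region", "Sales"}, {"Region", "Sales"})
in
    Expanded
```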
All you have to do is edit this line to change the second set of column names to the ones you want. In this example, I simply removed the silly Custom. prefix.

I do struggle with this part: I can then enter the folder and file, but it cannot find the file.
Mike, thx for this one. I tried this with multiple CSV files converted from Excel, but I am still trying to find a way to remove the repeated headers that populate the entire table. I guess since my new table contains more than rows it cannot reach those text values (headers). Cheers and greetings from Lima, Peru.

This is a very interesting tool; however, what if someone needs to add data from two tables side by side instead of one below the other?
I am seeking this as my data has one table with primary key and half columns in other file without primary key.
Cannot convert a value of type Table to type Function. Need some assistance on step 3 to add embedded code to create a dynamic file path and file name to pull multiple files out of a folder. What do I put in the gray box? I am brand new to Power Query. What goes where the hard coded file name was to make the command dynamic?
I have a long list of Excel files from which I want to lift data and combine it in a single summary table. All files reside on a shared drive.

This helped me with my work at university for a long time, but for the past few weeks the files I created with the merge options never refresh correctly. Any idea what might be going wrong?

Several comments have asked about the grey box. In Step 2, you take a single sample file and carry out the steps needed to pull in the data from that file, then clean and structure the data however you want.
This same process is what will then be applied to all of the files. Initially, the value in the grey box was the path to the single example file you had written the procedure on. When turning this into a general formula, the grey box gets changed to variables instead of a constant path. The grey box shows where these variables are actually used in the body of the function itself. Step 4 shows you how to generate a list of files in a particular folder and pass EACH of these files to the function. The function pulls in and processes the data from each file, and the resulting tables are then aggregated into a single big table.
I want to know if there is some possibility to work with a relative path instead of an absolute path in Power Pivot and Power Query with Excel and Excel

However, this will work, as I end up downloading a new spreadsheet from the website, anyway.

That is really worth knowing. However, sometimes I think using UNIX is so much more elegant…pretty much all of the above can be achieved using.
Is there any way to take a similar approach to loop over Excel sources in the new Power BI Desktop? I have created the function but kind of get stuck there.
Got this to work in current version of Power Query; only a couple button names had changed. This saves me SO much time! Data is in D4 till L22 — all cells have data. Data is in B1 till N46 and all cells have data. P2 till H48 and all cells have data. This is great work — thank you for sharing! I have an interesting dilemma after applying this method to my files.
The files I am using have row headers spread across a few rows. I ended up creating two functions: Then I append the queries and use the first row as column headers. This works well if the files have the same number of columns and they do not change.
But in reality, as you may have guessed, the columns in my files are all over the place.

Great help, but I need some more: I have lots of similar files with a few different worksheet names. Is it possible to somehow put an OR operator in the Source statement so that I could use a few different worksheet names?
Build a query that connects to just one of the Excel files: on the Power Query tab, select the From Excel connection option. Browse to the directory which contains all the Excel files and choose just one of them. Apply any needed transformation actions: use the Query Editor to apply some basic transformation actions, then apply any other needed actions.
You can see your actions in the Query Settings pane. When completed, close the Advanced Editor. Power Query creates a fairly useless-looking table. At this point, the custom function is ready to be used.