When using Power Query (Get & Transform) in Power BI and Excel, you have the ability to merge queries together and expand the dataset to view all the columns in your merged dataset.
While you can do this in Power BI, the intrinsic join capabilities of that software program do not require you to perform merges and expands. This will primarily be used in Excel.
To start we need to related tables as queries in your Data queries. In this post, I will use the Tasks and TaskBaselines tables to pull in existing data. To improve your data refreshes, please be sure to limit your data by filtering rows and columns.
In this instance I only want to link together the baseline 0 start and finish dates with my task data. My Tasks table only has 6 columns: ProjectId, ProjectName, TaskId, TaskName, TaskStartDate, and TaskFinishDate. My TaskBaselines table only has 5 columns: ProjectId, TaskId, BaselineNumber, BaselineStartDate, and BaselineFinishDate. I also limited my rows of data to just BaselineNumber = 0.
Merging your data:
Open your Tasks query in the query editor, and in the Home ribbon click on the Merge Queries dropdown. You get two options: Merge Queries or Merge Queries as New.
‘Merge Queries’ will merge the data of one query into the data of a second query. ‘Merge Queries as New’ will merge your two queries and create a third query in your dataset. I will use ‘Merge Queries’.
Before I click on ‘Merge Queries’, I need to be on my primary query (Tasks, in this case). Then click Merge Queries. This will open the Merge Queries dialog box.
My Tasks table will be on top, and in the blank dropdown menu, I will choose my TaskBaselines table. Once that is done, I need to choose which columns will link the two tables together. In this screen, I can choose a single column to match: TaskId. Make a pairing by clicking the TaskId column in the Tasks table and the TaskId column in the TaskBaselines table.
Or, I can choose additional pairs by control clicking additional matching columns. Power Query will remember the order in which you select columns. For this query I will join on the ProjectId and the TaskId.
The ‘Join Kind’ dropdown at the bottom of the screen defaults to ‘Left Outer (all from first, matching from second)’ This will show all my tasks and only the task baselines that match. There several options, but in this case the default works.
Click OK. This will add an additional Applied Step to your Tasks Query and a new column called ‘New Column’. I’ll Rename the column to TaskBaselines and click on the expand icon to the right of the column name.
Expanding Your Data:
Click on the Expand icon (shown above). This will open the Expand dialog box to allow you to choose which columns to expand. You can either expand (create new columns) or aggregate columns (sum, count, etc.). I will expand the columns so I can use the actual dates. In this case, I only need the BaselineStartDate and the BaselineFinishDate columns as the others are redundant or used for filtering.
Click OK to expand your data. The end result should look like this:
The left side shows two of my existing Tasks table columns and the right (green) are my expanded columns from the TaskBaselines table.
One note of caution when using this technique. If your second table has multiple matches to each row of your primary table, (with the Left Outer join), you will get multiple lines of identical Tasks data that matches each line of TaskBaselines data. This is why I chose to filter my TaskBaselines data to BaselineNumber = 0.
I noticed in the Tasks table that there are several joined tables already, can’t I just expand those existing merged tables? Alas, no. This is not allowed at this time. You will get an error message when trying to expand those merged tables.
DataFormat.Error: OData: The format ‘application/atom+xml;charset=utf-8’ is not supported.