Updated: Jun 9, 2020
Note - This post keeps our client, their projects, identity, and any intellectual property confidential. While the situation described bellow is equivalent to the real world problem, all identifying information has been removed and all the data is fictional.
Need help with a data related project?
While Microsoft Excel packs hundreds of ready-to-use tools and formulas, there are some problems that cannot be solved easily (or at all) by them alone. This was the case with a recent request by an Arithmech client. The problem involved conditional formatting on spreadsheets with varying dimensions and configurations. Keep reading to learn how the team at Arithmech successfully fulfilled the clients request using VBA.
The client regularly produces spreadsheets with a varying number of columns, rows, and configurations.
Take the example spreadsheet below:
Similar spreadsheets are sent out to teams of people who's job it is to input data into one or multiple columns. For this example, lets say the column they are concerned with is the "Location" column.
The team member inputs a response into the "Location" column and the cell is highlighted to signify that name has been done. However, the "Name" column may have any number of matching names. As you can see in the example, the name "Mike" appears three times and not in succession.
The client request was to come up with a solution that would format the cells in the "Location" column of any group of matching names IF one of the "Location" column cells for that group had a data input.
This would make the input job easier by identifying what names already had a location value. As the team member is going down the "Location" column, any cell formatted in red can be skipped because the name already has an input somewhere else in the spreadsheet. This makes the process much more efficient for large spreadsheets with hundreds of rows.
In addition to the main conditional formatting aspect, the client also requested the following attributes:
When a cell is formatted, add "Yes" to the "Done?" column for each row in the group
Color all cells in the group except for the initial cell that had data input
If the user erases an entry that has been formatted, un-format all cells in the group (similar to an "Undo" function)
Have this work for spreadsheets with any number of columns and rows. And, have it work for multiple columns of data input.
Here is an example of what a final version of the conditional formatting would look like if a user input a location for "Patrick" and "Mike". This example has one input column:
Here is a video example of what a final version of the conditional formatting would look for two input columns:
With the complexity of the logic, and the necessity to make this feature adaptable to any shape of spreadsheet AND multiple input columns, the native functions of Excel just couldn't cut it (or at least we couldn't figure it out). So that lead us to creating our own solution by programming a macro using VBA.
Summary of Code
The macro Arithmech developed is split into four for-loops:
Loop 1 - Find what column is the "Done?" column where a confirming "Yes" string would be placed.
Loop 2 - Add an identifier for the cell that has a data input.
Loop 3 - Format each cell in the group (except the one with data).
Loop 4 - Remove formatting for each cell in the group if the cell in the group that contains data is cleared.
The macro first identifies three variables:
Number of Rows (NumRows) - Finds the last row in the sheet that contains data
Number of Columns (LastColumn) - Finds the last column in the sheet that contains data
Active Column (ActiveCol) - Get the column that contains the active cell. This variable is dictated by the user. Since the spreadsheet may have multiple input columns the user defines what column to run the macro on by selecting a cell in that column and then running the process. The process only runs on the column that contains the active cell.
Next, the macro runs the first loop to identify where the "Done?" column is and stores the column index as a variable.
Then, the second loop goes through each row in the active column and checks if the cell is empty or not empty. If a cell is not empty (i.e. - the user has added some data), then the string "YES" is added to the cell in the same row, but in the column LastColumn + ActiveCol. The reason for this is due to the possibility of multiple input columns. This keeps the "YES" for the first input column different from any "YES" for second input column.
Why is the "YES" needed? The client requested that the cell containing data in the group NOT be formatted red. The "YES" in the outlying column identifies the cell in the group that contains data. As you can see in the image above, the cells B5 and C9 contain data for their groups and are identified with "YES" in G5 and H9 respectively.
Now that the cell containing data in the group has been identified, the third loop can go through and perform the color changes for all the other cells in the group. The loop only runs for the column containing the active cell. The loop always starts from row 2 (the top) and runs for every row in the data set. The loop first checks if the outlying column contains the string "YES" - if it does, the name in the "Name" column is stored in a variable.
Once a name is stored, the loop starts over at row 2 until it finds a row with the same name. The loop has to start from the top again because there may be a row in the group that is above the cell containing data.
If the loop finds a matching name, it checks if the active column cell is the one that contains data. If it does not contain data, the cell is formatted red and the string "Yes" is added to the "Done?" column. If it does contain data, the cell is formatted with no fill and "Yes" is added to the "Done?" column.
The fourth loop, which runs every time the macro runs, is dedicated to the situation where the user erases data. For example, if the user erased "Here" in cell B5 the cells in the group need to be un-formatted and corresponding Yes's removed - essentially an "Un-Do" function.
The fourth loop looks down the active column until it finds a red cell. Once it finds a red cell, it looks at the name in the "Name" column.
Similar to the third loop, it now starts back at the top and looks down the "Name" column until it finds a matching name. If the active column cell is empty and not red then store the name and start back from the top.
If the loop finds a matching name, remove the active column cell formatting and delete the "Yes" in the "Done?" column
This conditional formatting process requires a level of logic and adaptability that cannot easily (or ever) be accomplished by the native Excel functions alone. Creating a custom VBA macro to perform the task was our best option. While the whole macro requires four different nested loops to run each time the macro runs, the formatting is handled within a second or two at most for spreadsheets containing 500 rows. It may become sluggish with spreadsheets in the 1000's of rows. Initially, we had the macro run with each change in a cell (essentially running it on auto), however there were too many issues with this method, so we instead recommended assigning the macro to a hot key combo such as ctrl+x. A user can make any number of data additions or subtractions to the spreadsheet then run the macro rather than running the macro after each change.
The great thing about this project (or any project involving writing code) is the ability for us to accommodate any request made by the client. Since we are creating our own solution, there is essentially no limit to what functionality we can provide.