Friday, May 27, 2011

Sorting Excel data using formulas

Sometimes, you sort data in excel. This is handy since the order provides a framework around which our mind can be wrapped. However, it's not always handy to have to select the cells you want to sort and use the menus to sort them. Sometimes, it would be handy to sort the data automatically.

One way to do this is to use the large() and small() functions. These functions return the kth largest or smallest number in a set of numbers. If you have 10 numbers you want to sort largest to smallest, you would enter these formulas (assuming the 10 numbers you want to sort are in A1:A10):

If the data you want to sort also has category labels (i.e. every data point has a label), you can (depending on the uniquiness of the data) lookup the category of the number based on the number itself. If the data is not unique enough, you can combine the label with the number by converting the label to an integer dividing by a huge number (say 1,000,000) and adding it to the number. This way, the label addition won't severely change the sorting values, but will still make each value unique.

Lost? I just read that back to myself and I even got a little lost. Let's propose an example. Let's say that i have 10 salesmen on my team and they each have their monthly sales numbers. I'd like to be able to sort this data. Let's also say that this data gets updated every month and i'd rather not sort it manually. Here's the data:

Salesman, 12/2010 Sales
Han Solo, 827
Luke Skywalker, 1207
Mace Windu, 1162
Anakin Skywalker, 1385
Qui-Gon Jinn, 1071
Obi-Wan Kenobi, 1121
Bail Organa, 1163
Boba Fett, 932
Wedge Antilles, 1393
Lando Calrissian, 1141

Build the spreadsheet like this:
Sort Value Salesman 12/2010 Sales Name converted to number
=C2+D2 Han Solo 827 =(CODE(MID(B2,1,1))+CODE(MID(B2,2,1))+CODE(MID(B2,3,1))+CODE(MID(B2,4,1)))/1000000
=C3+D3 Luke Skywalker 1207 =(CODE(MID(B3,1,1))+CODE(MID(B3,2,1))+CODE(MID(B3,3,1))+CODE(MID(B3,4,1)))/1000000
=C4+D4 Mace Windu 1162 =(CODE(MID(B4,1,1))+CODE(MID(B4,2,1))+CODE(MID(B4,3,1))+CODE(MID(B4,4,1)))/1000000
=C5+D5 Anakin Skywalker 1385 =(CODE(MID(B5,1,1))+CODE(MID(B5,2,1))+CODE(MID(B5,3,1))+CODE(MID(B5,4,1)))/1000000
=C6+D6 Qui-Gon Jinn 1071 =(CODE(MID(B6,1,1))+CODE(MID(B6,2,1))+CODE(MID(B6,3,1))+CODE(MID(B6,4,1)))/1000000
=C7+D7 Obi-Wan Kenobi 1121 =(CODE(MID(B7,1,1))+CODE(MID(B7,2,1))+CODE(MID(B7,3,1))+CODE(MID(B7,4,1)))/1000000
=C8+D8 Bail Organa 1163 =(CODE(MID(B8,1,1))+CODE(MID(B8,2,1))+CODE(MID(B8,3,1))+CODE(MID(B8,4,1)))/1000000
=C9+D9 Boba Fett 932 =(CODE(MID(B9,1,1))+CODE(MID(B9,2,1))+CODE(MID(B9,3,1))+CODE(MID(B9,4,1)))/1000000
=C10+D10 Wedge Antilles 1393 =(CODE(MID(B10,1,1))+CODE(MID(B10,2,1))+CODE(MID(B10,3,1))+CODE(MID(B10,4,1)))/1000000
=C11+D11 Lando Calrissian 1141 =(CODE(MID(B11,1,1))+CODE(MID(B11,2,1))+CODE(MID(B11,3,1))+CODE(MID(B11,4,1)))/1000000

Let's break it down. First, build the column to the right of the sales numbers. This column takes the salesperson's name and converts the first four characters to a number. Given that the first four characters of each salesperson's name is different, this provides a unique value that we can use to identify the salesperson. We divide it by 1,000,000 so that it has little effect on the sorting value.
Then build the column to the left of the names. This combines the value we want to sort by with the unique salesperson identifier. This becomes the new value we will sort by.

Now we just need to pull the data together. In the first column below, i put the order i wanted the sales people in. I could have just as easily put these values into the formulas in the second column, but this was easier. The second column uses the LARGE function to display the values from the first column of the above table in descending order. The values returned here will be the unique combination of sales numbers and salesperson's name (converted to a number). We now will just need to get the salespeople's names and numbers based on the unique identifier. Use VLOOKUP for this.
Order Sorted Values Salesman Sales number
1 =LARGE($A$2:$A$11,A14) =VLOOKUP(B14,$A$2:$C$11,2,FALSE) =VLOOKUP(B14,$A$2:$C$11,3,FALSE)
2 =LARGE($A$2:$A$11,A15) =VLOOKUP(B15,$A$2:$C$11,2,FALSE) =VLOOKUP(B15,$A$2:$C$11,3,FALSE)
3 =LARGE($A$2:$A$11,A16) =VLOOKUP(B16,$A$2:$C$11,2,FALSE) =VLOOKUP(B16,$A$2:$C$11,3,FALSE)
4 =LARGE($A$2:$A$11,A17) =VLOOKUP(B17,$A$2:$C$11,2,FALSE) =VLOOKUP(B17,$A$2:$C$11,3,FALSE)
5 =LARGE($A$2:$A$11,A18) =VLOOKUP(B18,$A$2:$C$11,2,FALSE) =VLOOKUP(B18,$A$2:$C$11,3,FALSE)
6 =LARGE($A$2:$A$11,A19) =VLOOKUP(B19,$A$2:$C$11,2,FALSE) =VLOOKUP(B19,$A$2:$C$11,3,FALSE)
7 =LARGE($A$2:$A$11,A20) =VLOOKUP(B20,$A$2:$C$11,2,FALSE) =VLOOKUP(B20,$A$2:$C$11,3,FALSE)
8 =LARGE($A$2:$A$11,A21) =VLOOKUP(B21,$A$2:$C$11,2,FALSE) =VLOOKUP(B21,$A$2:$C$11,3,FALSE)
9 =LARGE($A$2:$A$11,A22) =VLOOKUP(B22,$A$2:$C$11,2,FALSE) =VLOOKUP(B22,$A$2:$C$11,3,FALSE)
10 =LARGE($A$2:$A$11,A23) =VLOOKUP(B23,$A$2:$C$11,2,FALSE) =VLOOKUP(B23,$A$2:$C$11,3,FALSE)

The resulting data is shown below. In reality, the formulas in the first two columns below could be combined into the formulas for the second two columns. This would result in a cleaner look.
Order Sorted Values Salesman Sales number
1 1393.000391 Wedge Antilles 1393
2 1385.000379 Anakin Skywalker 1385
3 1207.000401 Luke Skywalker 1207
4 1163.000376 Bail Organa 1163
5 1162.000374 Mace Windu 1162
6 1141.000383 Lando Calrissian 1141
7 1121.000327 Obi-Wan Kenobi 1121
8 1071.000348 Qui-Gon Jinn 1071
9 932.000372 Boba Fett 932
10 827.000311 Han Solo 827