# our data
li = ["apple","orange","apple","pear","apple","orange"]
# Create a set to find unique values.
# You can think of a set like a dictionary,
# except it holds values only, instead of key-value
# pairs. Like dictionary keys, set items are unique,
# and you can't rely on the order of the items in a set.
uniques = set(li)
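# A quick sketch of that behaviour, using a throwaway
# list ('letters' and 'has_a' are illustrative names only,
# not part of the original code):
letters = set(["a", "b", "a", "c", "b"])
# Duplicates collapse, so len(letters) is 3.
# Membership tests work just as they do on dictionary keys.
has_a = "a" in letters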
# Build a dictionary to count occurrences
# but skip frequencies less than or equal to 1
rank = {}
for fruit in uniques:
    count = li.count(fruit)
    if count > 1:
        rank[fruit] = count
# This can also be written more concisely as the
# following (one might argue, by sacrificing readability)
# rank = dict((u, li.count(u)) for u in uniques if li.count(u) > 1)
# 'rank' should now look something like this
# {'apple': 3, 'orange': 2}
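#
# As an aside, here is a minimal sketch of the same counting
# step using collections.Counter from the standard library.
# This is an alternative, not the approach used above, and the
# names 'fruit_list', 'counts' and 'repeated' are illustrative
# only. Counter tallies every item in a single pass instead of
# calling .count() once per unique value.
from collections import Counter

fruit_list = ["apple", "orange", "apple", "pear", "apple", "orange"]
counts = Counter(fruit_list)
# Keep only the fruits that occur more than once
repeated = {fruit: n for fruit, n in counts.items() if n > 1}
# 'repeated' holds the same key-value pairs as 'rank' above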
#
# Then we can use the 'sorted' built-in to
# create a list of keys in 'rank' ordered
# by the values (frequency of occurrence,
# in this case)
li = sorted(rank, key=rank.get, reverse=True)
# ['apple', 'orange']
# The 'sorted' built-in function takes any
# iterable and, by default, returns a new
# list of its items in their natural order
# (alphabetical for strings, ascending for
# numbers).
#
# The 'reverse' argument to sorted should be
# fairly self-explanatory: if True, the
# resulting list is returned in reverse order.
#
# In this case we have also provided the 'key'
# argument, which can be any function that
# takes one argument. 'sorted' calls it on
# each item and uses the return value, rather
# than the item itself, as the basis for
# comparison. If my explanation here has
# fallen short, please have a look at the
# official docs at
# http://docs.python.org/lib/built-in-funcs.html
# or ask for further clarification.
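#
# A small sketch of default order, 'reverse' and 'key' side
# by side, using a made-up list ('words' and the result names
# are only for illustration):
words = ["pear", "fig", "banana", "kiwi"]
by_default = sorted(words)
# ['banana', 'fig', 'kiwi', 'pear']
reversed_order = sorted(words, reverse=True)
# ['pear', 'kiwi', 'fig', 'banana']
by_length = sorted(words, key=len)
# ['fig', 'pear', 'kiwi', 'banana']
# With key=len, each word is compared by len(word). 'pear' and
# 'kiwi' tie at 4 and keep their original relative order.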
#
# So, in our example above the important concepts
# to understand are 1) dictionaries are iterable
# too, 2) 'sorted' is more or less a looping
# construct whose loop happens behind the scenes,
# and 3) the 'key' argument to 'sorted' takes
# any function that accepts one argument.
#
# When you use a dictionary in a for loop, you
# are implicitly looping over its keys.
# So, we could have also written it thus:
#
# li = sorted(rank.keys(), key=rank.get, reverse=True)
# ... or, in Python 2 only (iterkeys was removed in Python 3) ...
# li = sorted(rank.iterkeys(), key=rank.get, reverse=True)
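#
# To see the implicit key iteration for yourself, here is a
# short sketch with a throwaway dictionary ('stock' and the
# list names are hypothetical):
stock = {"apple": 3, "orange": 2}
keys_from_loop = [name for name in stock]
keys_explicit = list(stock.keys())
# Iterating the dictionary directly yields its keys, so both
# lists hold 'apple' and 'orange'.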
#
# Each key in our dictionary 'rank' is a fruit,
# and each value is the number of times that
# fruit appears in our original list 'li'. That
# makes it convenient to pass the .get method of
# 'rank' as the key argument to sorted. For each
# fruit in rank.keys(), rank.get(fruit) is called,
# and the return value is used to sort the resulting
# list.
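#
# Put another way, passing the .get method is shorthand for
# writing the lookup function out by hand. A sketch with its
# own data ('tally', 'by_get' and 'by_lambda' are illustrative
# names):
tally = {"orange": 2, "apple": 3}
by_get = sorted(tally, key=tally.get, reverse=True)
by_lambda = sorted(tally, key=lambda fruit: tally[fruit], reverse=True)
# Both give ['apple', 'orange'], because tally.get('apple')
# is 3 and tally.get('orange') is 2.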
#
# In any case, I hope that makes a little more sense.