More twitter: naive friend recommendation

August 22, 2009

To finish the holy trilogy of twitter madness, there goes a naive friend recommender script for twitter. It basically applies a simple idea over the social graph: from all my friends, select all their friends that I don’t follow, rank them by ocurrence, filter against who is already in my list and recommend them to me.

Interesting enough, the results are useful. I managed to found some people that I really didn’t knew that had a twitter account or simply that I liked to follow.

The inspiration for this came from this article, which describes a more refined algorithm. I thought that the simplification made sense because, without a tuned text classification module, all recommendations based in the social graph really don’t get better results even with more tunning.

This code also abuses the python twitter project and the defaultdict from python’s collection. It would be really interesting to split it in a Map/Reduce setup using Disco Project more in a sense of nerdness than a practical vantage point.

Don’t forget to ask twitter for whitelisting, because it can really eat up the 150 hits/hour limit. I tried to fix it by using shorter friend lists and refiltering it with my friend’s list later. You can read more about rate limiting here.


import twitter
from urllib import urlencode
import simplejson
from collections import defaultdict

my_user_name = ''  #here goes your twitter username
my_pass = '' #here goes the passwd

api = twitter.Api(username=my_user_name, password=my_pass)
api.SetUserAgent("ZenMachine recommender")

# keep checking this user's rate limit
print simplejson.loads(api._FetchUrl("https://twitter.com/account/rate_limit_status.json"))  

friends_of_friends = defaultdict(lambda: 0)

def download_friends(screen_name):
 if screen_name == None: return []
 friends = simplejson.loads(api._FetchUrl("http://twitter.com/friends/ids/%s.json" % screen_name))  
 # limiting friends avoid bashing twitter's api, but it can bias the result or taint it
 # with people you already have as friend. Do it only if you have a LOT of friends
 return friends #[:20]  

def download_friends_by_id(id):
 if id == None: return []
 friends = simplejson.loads(api._FetchUrl("http://twitter.com/friends/ids.json?user_id=%s" % id))  
 # limiting here wouldn't affect twitter's api...
 return friends

def compute_friends_of_friends(my_friends):
 for friend in my_friends:
 ff = download_friends_by_id(friend)
 for af in ff:
 if af not in my_friends:
 friends_of_friends[af]+=1
 return friends_of_friends

def generate_friend_recommendation_ranking(friends_of_friends):
 friend_list = []
 for a in friends_of_friends.keys():
 if friends_of_friends[a] > 1:
 # order by count of similar friends and translate user id into screen_name
 u_info = api.GetUser(a)
 print "http://twitter.com/%s (%d): %d" % (u_info.screen_name, a, friends_of_friends[a])
 friend_list.append({"id": a, "screen_name":u_info.screen_name, "ranking": friends_of_friends[a]})
 friend_list.sort(key = lambda x: x['ranking'])
 return friend_list

def main():
 # my friends list
 my_friends = download_friends(my_user_name)
 my_info = api.GetUser(my_user_name)

 # then, for each friend, I get all of his/her friends and fill a dict
 # as an extra pass, I already exclude the ones I already have in my friends list
 # in a Map/reduce setup, this would be a good candidate to Map
 # the trick for limiting is chop down my friends here and filter them again later

 friends_of_friends = compute_friends_of_friends(my_friends[:20])

 # filter friends of friends again. The initial filtering may be disabled if you want.

 for fr in friends_of_friends.keys():
 if fr in my_friends:
 del(friends_of_friends[fr])
 # delete myself, if I happen to be on the list 

 del(friends_of_friends[my_info.id]) 

 # and this one would be a good candidate to be Reduce
 friend_list = generate_friend_recommendation_ranking(friends_of_friends)
 print friend_list

if __name__ == "__main__":
 main()

Advertisements

Leave a Reply

Please log in using one of these methods to post your comment:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: