algorithm - Redis Twitter-like Follow/Unfollow Design Pattern -
suppose duplicating twitter's follow function. far can tell, agrees following design using redis.
all tweets followed joe stored in sorted set "ss:joe" key=tweet_id, score=tweet_timestamp
so when joe follows ladygaga, ladygaga's tweets added "ss:joe", far good.
the question is: how remove ladygaga's tweets "ss:joe" when joe unfollows ladygaga?
iterating through every single "ss:joe" tweet , remove belong ladygaga out.
the best can think of maintain sorted set every user storing own tweets, ladygaga have sorted set "tweets:ladygaga" key=tweet_id, score=tweet_timestamp, can pick out ladygaga's tweets zinterstore "ss:joe" , "tweets:ladygaga".
is there better solution?
there bigger problem design. storing tweet_ids in ss:joe means system cannot account gaga creating new tweet (or deleting one, if supported) without modifying ss:joe. imagine having few hundred celebs 50,000 followers each, , each writing dozen tweets per day. that's lot of inserts lot of sets, cannot distribute either. and, it's lot of duplicate data (remember redis ram-only database, , although ram gets cheaper it's still near "unlimited"). edit: , updating follower records, need know followers (since iterating on every user on every newly written tweet hardly option). need maintain list of backlinks well.
an alternative design store user id of followed person in set (or sorted set, if will, user can shuffle order). each person further has sorted set tweet ids (sorted date).
this require additional query per followed person tweet ids, reduce unfollowing removing 1 value set, , keep updated automatically new tweets created.
lookups less costly inserts/removes (which may require rebalancing or rehashing), if follow dozen people, queries aren't of issue more frequent updates.
plus, lookups can happen on network of replicated slaves (a second or 2 may pass before new tweet visible everyone, cares -- scales infinitely).
Comments
Post a Comment