Self-referencing many-to-many through
Django's ManyToManyField takes a 'through' argument that lets you describe the relationships between objects. I've written a post about this (Describing Relationships, Django's ManyToMany Through), so I won't cover the details of its implementation or usage here. What I want to talk about in this post is how to create ManyToMany relationships between objects of the same kind and, more than that, how those relationships can be described using through models.
Asymmetrical Relationships - the Twitter model
As we all know by now, on Twitter you follow people. Maybe some people follow you, but the relationships are all one-directional, or asymmetrical. One of the benefits of this model is that I can follow anyone I want and they don't have to approve me, because the relationship is one-way and implies nothing on their part.
So how do you model something like this in Django? It's actually pretty simple. The relationships will all be between 'people', so there needs to be a class for that. Since the relationships being described should be somewhat flexible, there also should be a model for them.
from django.db import models


class Person(models.Model):
    name = models.CharField(max_length=100)
    relationships = models.ManyToManyField('self', through='Relationship',
                                           symmetrical=False,
                                           related_name='related_to')

    def __unicode__(self):
        return self.name


RELATIONSHIP_FOLLOWING = 1
RELATIONSHIP_BLOCKED = 2
RELATIONSHIP_STATUSES = (
    (RELATIONSHIP_FOLLOWING, 'Following'),
    (RELATIONSHIP_BLOCKED, 'Blocked'),
)


class Relationship(models.Model):
    from_person = models.ForeignKey(Person, related_name='from_people')
    to_person = models.ForeignKey(Person, related_name='to_people')
    status = models.IntegerField(choices=RELATIONSHIP_STATUSES)
So, taking a look at the models, what's important to note is that on the Person model I've created a ManyToMany to 'self' through 'Relationship'. The 'symmetrical' attribute is set to False. At the time of writing this is a must when you're using an intermediary model, because Django won't know how to describe the other side of the relationship: the through model may have any number of fields besides the ForeignKeys. Which brings up the next model, Relationship. It has two foreign keys to Person and a status, which indicates the type of relationship 'from_person' has to 'to_person'. Now, let's add some methods to the Person model to make it easier to talk about how these relationships can be used:
    def add_relationship(self, person, status):
        relationship, created = Relationship.objects.get_or_create(
            from_person=self,
            to_person=person,
            status=status)
        return relationship

    def remove_relationship(self, person, status):
        Relationship.objects.filter(
            from_person=self,
            to_person=person,
            status=status).delete()
        return
Adding and removing relationships is pretty straightforward: we can deal directly with the Relationship model and create or delete instances of it. If we wanted to find out who is following a user, though, it's sort of obnoxious to query Relationship and then extract the people from the returned queryset. This is where the ManyToMany comes in. We can query 'relationships' (and its partner 'related_to') to filter on Relationship objects and return the people they refer to. Here are some more methods for the Person model:
    def get_relationships(self, status):
        return self.relationships.filter(
            to_people__status=status,
            to_people__from_person=self)

    def get_related_to(self, status):
        return self.related_to.filter(
            from_people__status=status,
            from_people__to_person=self)

    def get_following(self):
        return self.get_relationships(RELATIONSHIP_FOLLOWING)

    def get_followers(self):
        return self.get_related_to(RELATIONSHIP_FOLLOWING)
Looking at this at the SQL level might make it clearer. Creating a relationship is a simple INSERT into the relationship table. But reading relationships out and referring them back to people in a meaningful and efficient way is the biggest win of using the ManyToMany. Here is the SQL for getting who a person is following:
SELECT twitter_person.id, twitter_person.name
FROM twitter_person
INNER JOIN twitter_relationship
    ON (twitter_person.id = twitter_relationship.to_person_id)
WHERE (twitter_relationship.from_person_id = 1
    AND twitter_relationship.status = 1)
This is opposed to what the ORM would run if we got a Relationship queryset and then iterated over it to find out who each 'to_person' was:
SELECT twitter_relationship.id, twitter_relationship.from_person_id,
       twitter_relationship.to_person_id, twitter_relationship.status
FROM twitter_relationship
WHERE (twitter_relationship.status = 1
    AND twitter_relationship.from_person_id = 1)

-- followed by this for every twitter user returned:
SELECT * FROM twitter_person WHERE id = X
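The difference between the two approaches can be tried outside Django. Below is a minimal sketch using Python's built-in sqlite3 module and the table names from the queries above. The schema is a hand-written approximation rather than what Django would generate, the helper names following_with_join and following_n_plus_one are hypothetical, and the query counts are tallied by hand in the code rather than measured.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE twitter_person (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE twitter_relationship (
        id INTEGER PRIMARY KEY,
        from_person_id INTEGER REFERENCES twitter_person(id),
        to_person_id INTEGER REFERENCES twitter_person(id),
        status INTEGER
    );
    INSERT INTO twitter_person (id, name)
    VALUES (1, 'John'), (2, 'Paul'), (3, 'Yoko');
    -- John (1) follows Paul (2) and Yoko (3); status 1 means "following".
    INSERT INTO twitter_relationship (from_person_id, to_person_id, status)
    VALUES (1, 2, 1), (1, 3, 1);
""")

FOLLOWING = 1


def following_with_join(conn, person_id):
    """One query: join people to the relationship rows that point at them."""
    rows = conn.execute(
        "SELECT p.name FROM twitter_person AS p "
        "INNER JOIN twitter_relationship AS r ON p.id = r.to_person_id "
        "WHERE r.from_person_id = ? AND r.status = ? ORDER BY p.id",
        (person_id, FOLLOWING),
    ).fetchall()
    return [name for (name,) in rows], 1  # names, number of queries issued


def following_n_plus_one(conn, person_id):
    """1 + N queries: fetch relationship rows, then look up each person."""
    rels = conn.execute(
        "SELECT to_person_id FROM twitter_relationship "
        "WHERE from_person_id = ? AND status = ? ORDER BY to_person_id",
        (person_id, FOLLOWING),
    ).fetchall()
    queries = 1
    names = []
    for (to_id,) in rels:
        (name,) = conn.execute(
            "SELECT name FROM twitter_person WHERE id = ?", (to_id,)
        ).fetchone()
        names.append(name)
        queries += 1  # one extra round trip per followed person
    return names, queries


join_names, join_queries = following_with_join(conn, 1)
loop_names, loop_queries = following_n_plus_one(conn, 1)
print(join_names, join_queries)  # ['Paul', 'Yoko'] 1
print(loop_names, loop_queries)  # ['Paul', 'Yoko'] 3
```

Both functions return the same people, but the second one issues one extra query for every person followed, which is the cost the JOIN avoids.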
It's generally much more efficient to use the JOIN and execute just one query. The 'get_relationships' and 'get_related_to' methods are simple wrappers around filter that build the appropriate query. Here's an example of what you might do:
In : from twitter.models import Person
In : john = Person.objects.create(name='John')
In : paul = Person.objects.create(name='Paul')
In : from twitter.models import RELATIONSHIP_FOLLOWING
In : john.add_relationship(paul, RELATIONSHIP_FOLLOWING)
Out: <Relationship: Relationship object>
In : john.get_following()
Out: [<Person: Paul>]
In : paul.get_followers()
Out: [<Person: John>]
In : paul.add_relationship(john, RELATIONSHIP_FOLLOWING)
Out: <Relationship: Relationship object>
In : paul.get_following()
Out: [<Person: John>]
In : yoko = Person.objects.create(name='Yoko')
In : john.add_relationship(yoko, RELATIONSHIP_FOLLOWING)
Out: <Relationship: Relationship object>
In : paul.remove_relationship(john, RELATIONSHIP_FOLLOWING)
In : john.get_following()
Out: [<Person: Paul>, <Person: Yoko>]
In : paul.get_following()
Out: []
Now, let's add one more thing to the mix. Say that if two people are following each other, we'll call them 'friends'. You can implement this by combining the two queries for get_followers and get_following:
    def get_friends(self):
        return self.relationships.filter(
            to_people__status=RELATIONSHIP_FOLLOWING,
            to_people__from_person=self,
            from_people__status=RELATIONSHIP_FOLLOWING,
            from_people__to_person=self)
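To see what combining the two conditions amounts to, here is a rough sketch in plain SQL run through Python's sqlite3 module. The query is hand-written for illustration and was not captured from Django, so the ORM's actual output may differ. The friends_of helper and the 'outgoing' and 'incoming' aliases are hypothetical names. The idea is to join the relationship table twice, once for each direction.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE twitter_person (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE twitter_relationship (
        id INTEGER PRIMARY KEY,
        from_person_id INTEGER,
        to_person_id INTEGER,
        status INTEGER
    );
    INSERT INTO twitter_person (id, name)
    VALUES (1, 'John'), (2, 'Paul'), (3, 'Yoko');
    -- John and Paul follow each other; John follows Yoko one-way.
    INSERT INTO twitter_relationship (from_person_id, to_person_id, status)
    VALUES (1, 2, 1), (2, 1, 1), (1, 3, 1);
""")


def friends_of(conn, person_id, following=1):
    """Return names of people who follow person_id and are followed back."""
    rows = conn.execute(
        """
        SELECT p.name
        FROM twitter_person AS p
        -- rows where person_id follows p
        INNER JOIN twitter_relationship AS outgoing
            ON outgoing.to_person_id = p.id
        -- rows where p follows person_id
        INNER JOIN twitter_relationship AS incoming
            ON incoming.from_person_id = p.id
        WHERE outgoing.from_person_id = ? AND outgoing.status = ?
          AND incoming.to_person_id = ? AND incoming.status = ?
        ORDER BY p.id
        """,
        (person_id, following, person_id, following),
    ).fetchall()
    return [name for (name,) in rows]


print(friends_of(conn, 1))  # ['Paul']
print(friends_of(conn, 3))  # []
```

Yoko does not show up as John's friend because only the outgoing half of the pair exists for her.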
Symmetrical Relationships - the Facebook model
In the world of Facebook, when you become friends with someone, you are their friend and they are yours; it's a mutual kind of thing. Django's ManyToManyField allows you to specify a 'symmetrical' attribute, but at the time of writing you cannot use it when also specifying a 'through' model. We can actually use most of the model definitions from above; the only change will be to the ManyToMany field:
class Person(models.Model):
    name = models.CharField(max_length=100)
    relationships = models.ManyToManyField('self', through='Relationship',
                                           symmetrical=False,
                                           related_name='related_to+')
Note the plus sign at the end of related_name. This tells Django that the reverse relationship should not be exposed. Since the relationships are symmetrical, this is the desired behavior: after all, if I am friends with person A, then person A is friends with me. Django won't create the symmetrical relationships for you, so a bit needs to be added to the add_relationship and remove_relationship methods to explicitly handle the other side of the relationship:
    def add_relationship(self, person, status, symm=True):
        relationship, created = Relationship.objects.get_or_create(
            from_person=self,
            to_person=person,
            status=status)
        if symm:
            # avoid recursion by passing `symm=False`
            person.add_relationship(self, status, False)
        return relationship

    def remove_relationship(self, person, status, symm=True):
        Relationship.objects.filter(
            from_person=self,
            to_person=person,
            status=status).delete()
        if symm:
            # avoid recursion by passing `symm=False`
            person.remove_relationship(self, status, False)
Now, whenever we create a relationship going one way, its complement is created (or removed). Since the relationships go in both directions, we can get rid of the following/followers stuff and simply use:
    def get_relationships(self, status):
        return self.relationships.filter(
            to_people__status=status,
            to_people__from_person=self)
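The 'symm' flag used in add_relationship and remove_relationship can be tried without a database. The sketch below swaps the Django models for a hypothetical in-memory Person class that keeps its relationships in a set. It is only meant to show that mirroring happens exactly once per call, and it is not a replacement for the model code above.

```python
class Person:
    """In-memory stand-in for the Django model (hypothetical, illustration only)."""

    def __init__(self, name):
        self.name = name
        # Set of (other_person, status) pairs, playing the role of Relationship rows.
        self.edges = set()

    def add_relationship(self, person, status, symm=True):
        self.edges.add((person, status))
        if symm:
            # Mirror the edge once; symm=False keeps the two calls
            # from invoking each other forever.
            person.add_relationship(self, status, symm=False)

    def remove_relationship(self, person, status, symm=True):
        self.edges.discard((person, status))
        if symm:
            person.remove_relationship(self, status, symm=False)

    def get_relationships(self, status):
        return sorted(p.name for (p, s) in self.edges if s == status)


FRIENDS = 1  # hypothetical status value
john, paul = Person("John"), Person("Paul")

john.add_relationship(paul, FRIENDS)
after_add = (john.get_relationships(FRIENDS), paul.get_relationships(FRIENDS))

paul.remove_relationship(john, FRIENDS)
after_remove = (john.get_relationships(FRIENDS), paul.get_relationships(FRIENDS))

print(after_add)     # (['Paul'], ['John'])
print(after_remove)  # ([], [])
```

Either person can start or end the friendship, and both sides stay in step.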
Using it in the admin
You may want to access the Relationships in the context of a Person in the admin. Since the Relationship model has two foreign keys to Person, the underlying code that instantiates the inlines will blow up unless you specify which ForeignKey to use with 'fk_name'. Here's how to make it work:
# admin.py
from django.contrib import admin

from twitter.models import Person, Relationship


class RelationshipInline(admin.StackedInline):
    model = Relationship
    fk_name = 'from_person'


class PersonAdmin(admin.ModelAdmin):
    inlines = [RelationshipInline]


admin.site.register(Person, PersonAdmin)
Doing other cool stuff
The pattern here can be used to do a lot more than just describe who's friends with who. One possible improvement is to normalize status types into their own proper model, so status types can be defined more dynamically. You could even make a Relationship's status a ManyToMany itself. Another possible use would be if you had a tree-like structure but wanted to describe relationships between Nodes that may not be direct descendants of one another. Anyways, that's about it for this post - I hope you found it useful!