Self-referencing many-to-many through
Django's ManyToMany through attribute allows you to describe relationships between objects. I've written a post about this - (Describing Relationships, Django's ManyToMany Through) - and so I won't cover here the details of its implementation or usage. What I want to talk about in this post is how to create ManyToMany relationships between objects of the same kind, and more than that, to show how those relationships can be described using through models.
Asymmetrical Relationships - the Twitter model
On Twitter you follow people. Maybe some people follow you, but each relationship runs in one direction only: it is asymmetrical. In Django you can implement this using a ManyToMany relationship. We don't need a special through model for this, but suppose we wanted to attach some metadata to those relationships. Below is sample code for a Twitter-style database of people and their relationships with one another. The relationships carry a status column denoting whether a particular user is following another or blocking another:
from django.db import models

class Person(models.Model):
    name = models.CharField(max_length=100)
    relationships = models.ManyToManyField('self', through='Relationship',
                                           symmetrical=False,
                                           related_name='related_to')

    # use __str__ instead on Python 3
    def __unicode__(self):
        return self.name

RELATIONSHIP_FOLLOWING = 1
RELATIONSHIP_BLOCKED = 2
RELATIONSHIP_STATUSES = (
    (RELATIONSHIP_FOLLOWING, 'Following'),
    (RELATIONSHIP_BLOCKED, 'Blocked'),
)

class Relationship(models.Model):
    # On Django 2.0+, each ForeignKey also needs an on_delete argument,
    # e.g. on_delete=models.CASCADE
    from_person = models.ForeignKey(Person, related_name='from_people')
    to_person = models.ForeignKey(Person, related_name='to_people')
    status = models.IntegerField(choices=RELATIONSHIP_STATUSES)
Taking a look at the models, what's important to note is that on the Person model I've created a ManyToMany to self through Relationship. The attribute symmetrical is False. When you're using intermediary models in Django this is a must, because Django won't know exactly how to describe the other side of the relationship, since the through model may have any number of fields besides ForeignKeys. Which brings up the next model, Relationship. Relationship has two foreign keys to Person, and a status, which indicates the type of relationship 'from_person' has to 'to_person'. Now, let's add some methods to the Person model to make it easier to talk about how these relationships can be used:
    # methods on the Person model
    def add_relationship(self, person, status):
        relationship, created = Relationship.objects.get_or_create(
            from_person=self,
            to_person=person,
            status=status)
        return relationship

    def remove_relationship(self, person, status):
        Relationship.objects.filter(
            from_person=self,
            to_person=person,
            status=status).delete()
        return
Adding and removing relationships requires no magic - we can deal directly with the Relationship model and create or delete instances of it. If we wanted to find out who is following a user, though, it's sort of obnoxious to query Relationship and then extract the people from the returned queryset. This is where the ManyToMany comes in. We can query the 'relationships' (and its partner 'related_to') to look at Relationship objects and return the people they refer to. Here are some more methods for the Person model:
    def get_relationships(self, status):
        return self.relationships.filter(
            to_people__status=status,
            to_people__from_person=self)

    def get_related_to(self, status):
        return self.related_to.filter(
            from_people__status=status,
            from_people__to_person=self)

    def get_following(self):
        return self.get_relationships(RELATIONSHIP_FOLLOWING)

    def get_followers(self):
        return self.get_related_to(RELATIONSHIP_FOLLOWING)
Looking at the actual SQL helps me understand what these ORM incantations actually mean. Creating a relationship between two users is a simple INSERT into the relationships table. But reading relationships out and referring them back to people in a meaningful and efficient way is the biggest win of using the ManyToMany. Here is the SQL for getting who a person is following:
SELECT twitter_person.id, twitter_person.name
FROM twitter_person
INNER JOIN twitter_relationship
ON (twitter_person.id = twitter_relationship.to_person_id)
WHERE
(twitter_relationship.from_person_id = 1 AND
twitter_relationship.status = 1)
This is opposed to what the ORM would run if we got a Relationship queryset and then iterated over it to find out who the 'to_person' was:
SELECT twitter_relationship.id, twitter_relationship.from_person_id,
twitter_relationship.to_person_id, twitter_relationship.status
FROM twitter_relationship
WHERE
(twitter_relationship.status = 1 AND
twitter_relationship.from_person_id = 1)
-- followed by this for every twitter user returned:
SELECT * FROM twitter_person WHERE id = X
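To make the difference concrete, here is a rough sketch that runs both approaches against an in-memory SQLite database using Python's sqlite3 module. It is not what Django executes verbatim: the table layout is a hand-written approximation of the two models, and the helper names following_with_join and following_one_by_one are made up for this illustration. Each helper returns the names it found along with the number of queries it issued.

```python
import sqlite3

# In-memory stand-ins for the twitter_person / twitter_relationship tables
# (an approximation of the schema, written by hand for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE twitter_person (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE twitter_relationship (
        id INTEGER PRIMARY KEY,
        from_person_id INTEGER REFERENCES twitter_person(id),
        to_person_id INTEGER REFERENCES twitter_person(id),
        status INTEGER
    );
    INSERT INTO twitter_person (id, name)
    VALUES (1, 'John'), (2, 'Paul'), (3, 'Yoko');
    -- John follows Paul and Yoko (status 1 = following)
    INSERT INTO twitter_relationship (from_person_id, to_person_id, status)
    VALUES (1, 2, 1), (1, 3, 1);
""")


def following_with_join(person_id):
    """One query: join people to the relationships that point at them."""
    rows = conn.execute(
        """
        SELECT twitter_person.id, twitter_person.name
        FROM twitter_person
        INNER JOIN twitter_relationship
            ON (twitter_person.id = twitter_relationship.to_person_id)
        WHERE twitter_relationship.from_person_id = ?
          AND twitter_relationship.status = 1
        ORDER BY twitter_person.id
        """,
        (person_id,),
    )
    return [name for _id, name in rows], 1  # (names, queries issued)


def following_one_by_one(person_id):
    """1 + N queries: fetch the relationships, then look up each person."""
    rels = conn.execute(
        "SELECT to_person_id FROM twitter_relationship "
        "WHERE status = 1 AND from_person_id = ? ORDER BY to_person_id",
        (person_id,),
    ).fetchall()
    queries = 1
    names = []
    for (to_id,) in rels:
        row = conn.execute(
            "SELECT name FROM twitter_person WHERE id = ?", (to_id,)
        ).fetchone()
        names.append(row[0])
        queries += 1
    return names, queries


join_names, join_queries = following_with_join(1)
loop_names, loop_queries = following_one_by_one(1)
print(join_names, join_queries)
print(loop_names, loop_queries)
```

Both helpers return the same people, but the second one issues an extra query per followed person, so its cost grows with the size of the result.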
It's generally much more efficient to use the JOIN and execute just one query. The 'get_relationships' and 'get_related_to' methods are simple wrappers around filter that build the appropriate query. Here's an example of what you might do:
In [1]: from twitter.models import Person
In [2]: john = Person.objects.create(name='John')
In [3]: paul = Person.objects.create(name='Paul')
In [4]: from twitter.models import RELATIONSHIP_FOLLOWING
In [5]: john.add_relationship(paul, RELATIONSHIP_FOLLOWING)
Out[5]: <Relationship: Relationship object>
In [6]: john.get_following()
Out[6]: [<Person: Paul>]
In [7]: paul.get_followers()
Out[7]: [<Person: John>]
In [8]: paul.add_relationship(john, RELATIONSHIP_FOLLOWING)
Out[8]: <Relationship: Relationship object>
In [9]: paul.get_following()
Out[9]: [<Person: John>]
In [10]: yoko = Person.objects.create(name='Yoko')
In [11]: john.add_relationship(yoko, RELATIONSHIP_FOLLOWING)
Out[11]: <Relationship: Relationship object>
In [12]: paul.remove_relationship(john, RELATIONSHIP_FOLLOWING)
In [13]: john.get_following()
Out[13]: [<Person: Paul>, <Person: Yoko>]
In [14]: paul.get_following()
Out[14]: []
Now, let's add one more thing to the mix. Say that if two people are following each other, we'll call them 'friends'. You can implement this by combining the two queries for get_followers and get_following:
    def get_friends(self):
        return self.relationships.filter(
            to_people__status=RELATIONSHIP_FOLLOWING,
            to_people__from_person=self,
            from_people__status=RELATIONSHIP_FOLLOWING,
            from_people__to_person=self)
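The idea behind get_friends is that the relationship table gets joined twice: once for the row going out from you, once for the row coming back. Here's a small sketch of that double join using sqlite3. The table names (person, relationship) and the aliases outgoing and incoming are simplified inventions for this example, and the SQL Django actually generates for the filter above will look different, so treat it as an illustration of the logic only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE relationship (
        from_person_id INTEGER, to_person_id INTEGER, status INTEGER
    );
    INSERT INTO person VALUES (1, 'John'), (2, 'Paul'), (3, 'Yoko');
    -- John and Paul follow each other; John -> Yoko is one-way.
    INSERT INTO relationship VALUES (1, 2, 1), (2, 1, 1), (1, 3, 1);
""")

FOLLOWING = 1


def get_friends(person_id):
    """People this person follows who also follow this person back."""
    rows = conn.execute(
        """
        SELECT p.name
        FROM person AS p
        JOIN relationship AS outgoing
            ON outgoing.to_person_id = p.id      -- me -> p
        JOIN relationship AS incoming
            ON incoming.from_person_id = p.id    -- p -> me
        WHERE outgoing.from_person_id = :me AND outgoing.status = :status
          AND incoming.to_person_id = :me AND incoming.status = :status
        ORDER BY p.name
        """,
        {"me": person_id, "status": FOLLOWING},
    )
    return [name for (name,) in rows]


print(get_friends(1))  # John's friends
print(get_friends(3))  # Yoko's friends
```

John follows both Paul and Yoko, but only Paul follows him back, so only Paul counts as a friend. Yoko follows nobody, so she has no friends in this sense.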
Symmetrical Relationships - the Facebook model
Django's ManyToManyField allows you to specify a 'symmetrical' attribute, but at the time of writing you cannot set it to True when also specifying a 'through' model (Django 3.0 and later relax this restriction). We can actually use most of the model definitions from above. The only change will be to the ManyToMany field:
class Person(models.Model):
    name = models.CharField(max_length=100)
    relationships = models.ManyToManyField('self', through='Relationship',
                                           symmetrical=False,
                                           related_name='related_to+')
It's hard to spot the difference. Note the plus-sign at the end of related_name. This indicates to Django that the reverse relationship should not be exposed. Since the relationships are symmetrical, this is the desired behavior: after all, if I am friends with person A, then person A is friends with me. Django won't create the symmetrical relationships for you, so a bit needs to get added to the add_relationship and remove_relationship methods to explicitly handle the other side of the relationship:
    def add_relationship(self, person, status, symm=True):
        relationship, created = Relationship.objects.get_or_create(
            from_person=self,
            to_person=person,
            status=status)
        if symm:
            # avoid recursion by passing `symm=False`
            person.add_relationship(self, status, False)
        return relationship

    def remove_relationship(self, person, status, symm=True):
        Relationship.objects.filter(
            from_person=self,
            to_person=person,
            status=status).delete()
        if symm:
            # avoid recursion by passing `symm=False`
            person.remove_relationship(self, status, False)
Now, whenever we create a relationship going one way, its complement is created (or removed). Since the relationships go in both directions, we can get rid of the following/followers stuff and simply use:
    def get_relationships(self, status):
        return self.relationships.filter(
            to_people__status=status,
            to_people__from_person=self)
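If you want to convince yourself that the mirrored add and remove logic behaves as described, you can try it without a database. The sketch below swaps the Relationship table for a plain Python set of (from, to, status) tuples. The RelationshipStore class and the FRIEND constant are hypothetical stand-ins made up for this example, and people are just strings, but the symm flag works the same way as in the methods above.

```python
class RelationshipStore:
    """Hypothetical in-memory stand-in for the Relationship table."""

    def __init__(self):
        self.rows = set()  # (from_person, to_person, status) tuples

    def add_relationship(self, from_person, to_person, status, symm=True):
        # a set ignores duplicates, loosely mirroring get_or_create
        self.rows.add((from_person, to_person, status))
        if symm:
            # create the mirror row; symm=False stops the recursion
            self.add_relationship(to_person, from_person, status, symm=False)

    def remove_relationship(self, from_person, to_person, status, symm=True):
        self.rows.discard((from_person, to_person, status))
        if symm:
            # remove the mirror row; symm=False stops the recursion
            self.remove_relationship(to_person, from_person, status, symm=False)

    def get_relationships(self, person, status):
        # everyone `person` has an outgoing row to, with the given status
        return sorted(
            to for frm, to, st in self.rows if frm == person and st == status
        )


FRIEND = 1  # made-up status value for this sketch

store = RelationshipStore()
store.add_relationship("john", "paul", FRIEND)
store.add_relationship("john", "yoko", FRIEND)
johns_friends = store.get_relationships("john", FRIEND)
pauls_friends = store.get_relationships("paul", FRIEND)

store.remove_relationship("paul", "john", FRIEND)
johns_friends_after = store.get_relationships("john", FRIEND)
pauls_friends_after = store.get_relationships("paul", FRIEND)
print(johns_friends, pauls_friends)
print(johns_friends_after, pauls_friends_after)
```

Adding from John's side makes John show up in Paul's list too, and removing from Paul's side clears both rows, which is why a single get_relationships is enough in the symmetrical case.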
Using it in the admin
You may want to access the Relationships in the context of a Person in the admin. Since the Relationship model has two foreign keys to Person, the underlying code that instantiates the inlines will blow up unless you specify which ForeignKey to use. Here's how to make it work:
# admin.py
from django.contrib import admin
from twitter.models import Person, Relationship

class RelationshipInline(admin.StackedInline):
    model = Relationship
    fk_name = 'from_person'

class PersonAdmin(admin.ModelAdmin):
    inlines = [RelationshipInline]

admin.site.register(Person, PersonAdmin)
Doing other cool stuff
The pattern here can be used to do a lot more than just describe who's friends with whom. One possible improvement is to normalize status types into their own proper model, so status types can be defined more dynamically. You could even make a Relationship's status a ManyToMany itself. Another possible use would be if you had a tree-like structure but wanted to describe relationships between Nodes that may not be direct descendants of one another. Anyways, that's about it for this post - I hope you found it useful!
Relationships App
Comments (3)
Charles Leifer | jan 17 2010, at 05:22pm
Thanks for the good word! I am actually on twitter, though I'm woefully bad about using it. My username is 'beetlemeyer'
Samuel Clay | jan 17 2010, at 03:32pm
Excellent guide. How are you not on Twitter? This is the exact kind of blog where I look for the "Follow me on twitter" link and click it.
kurs | jan 17 2010, at 09:31pm
I always find the m2m relation to be a little bit tricky in Django.