2014-04-01

Large web apps in Python: A good architecture

If you are setting out to write a large application in Python using a relational database, this long post is for you. Herein I share some experience acquired over 6 months while writing large applications that use SQLAlchemy (the most advanced ORM available for the Python language) in a large team.

To be honest, I think this post might be too convoluted, trying to teach too many things at once. But it is really trying to show how multiple aspects converge to fail together.

The dangers ahead

It would be near impossible to explain all the reasons for the bad software I have seen, but it is safe to say they result from the interplay of some forces:

haste
suboptimal technological choices
the fact that SQLAlchemy requires a few weeks of study before a developer can use it wisely
lack of a better overall architecture than MVC
inconsistent understanding of MVC among developers
developers who don’t yet realize that test automation should inform the way the main code is factored
the need to practice TDD (Test-Driven Development), which means writing your unit test first
the deleterious effect that a relational database can have on the speed of one’s test suite
the need to learn what unit tests really are: some devs write integration tests thinking they are unit tests

I will talk about all of these forces, their relationships and some solutions.

Haste is bad... Don’t hurry... Mkay?

Unless the software is to be extremely short-lived, it never pays off to write it in a hurry. Creating software takes time, research, learning, experimentation, refactoring and testing. Every concession you make to haste will have nasty repercussions for sure. But don’t believe me; suffer these yourself ― everyone does.

Carefully pick your frameworks

The recommendations contained in this section stand as of April 2014.

When writing large applications, one should choose tools more carefully. Don’t just take the quickest path. For instance, it pays off to choose a more complex web framework such as Pyramid, which is full-featured, beautifully designed and thoroughly documented, and which shows a better defined scope and superior decoupling. Something like Flask gets small jobs done more quickly, but suffers from thread-local variables everywhere, incomplete documentation that seems to only explain something to someone who already knows it, and plugin greed (Flask enthusiasts may want everything in the system to be a Flask plugin).

You will be tempted to choose Django. Its single advantage is its community size. But it suffers from age and the need for backwards compatibility. Django had to create its own ORM (SQLAlchemy didn’t exist then) which is much worse than SQLAlchemy. Also, if you only know Django, you are likely to have a misconception of the importance of SQLAlchemy. You see, the Django ORM follows the Active Record pattern, while SQLAlchemy adheres to the completely different Unit of Work pattern. It might take you some time to understand the session and the lack of a save() method on your models. This talk helps with that. Finally, Django’s templating engine is severely (and deliberately) handicapped; Genshi, for instance, is much, much richer in features.

Django is also monolithic (as in Not Invented Here syndrome) ― a design trait that flies in the face of everything I say below ― and you see people boasting that Django is a single package, as if this were a good thing. Often they avoid using the package management tools in Python and that is just silly. Many things in Django are just silly...

Now, SQLAlchemy is so advanced in comparison to all the other ORMs in the Python world that it is safe to say, if you are accessing a relational database through anything else, you are missing out. This is why you should not choose the web2py web framework, either. It does not have an ORM, just a simple DAL to generate SQL.

(Since I have already recommended Pyramid and SQLAlchemy, why not a word about form validation toolkits? Since the ancient FormEncode, many libraries have been created for this purpose, such as WTForms and ToscaWidgets. I will just say that you owe it to yourself to try out the Deform and Colander combination – they have unique features, such as the conversion of data to a Python structure, and the separation of schema (Colander) from widgets (Deform), that really get it right. This architectural clarity results in a much more powerful package than the competition. But again, it will be necessary to spend a little more time learning how to use these tools. The learning curve of the competition might be less steep, but you can suffer more later on.)

MVC is not enough for a large project

You probably know the MVC (model, view, controller) architecture as applied to web development. (If you don’t, you are not ready to create a large application yet: go make a small one first, using an MVC framework, and come back to this post in a few months. Or years.)

Strictly speaking, MVC is an ancient concept, from early Smalltalk days, that doesn’t really apply to web development. The Django folks have correctly understood that in Python we actually practise MTV (model, template, view):

The template contains HTML content and presentation logic. It is written in a templating language such as Kajiki. It gets data from the view and outputs a web page.
The view (also sometimes called “controller”), written in Python, is just glue code. It uses the web framework to put everything together. It can see all the other layers. It defines URLs, maps them to functions which receive data from the web framework and use the other layers to finally answer back to the web framework. It should be kept small, because the code in it is not very reusable. Even if you keep it lean, web forms tend to make it complex.
The model layer is essentially a persistence layer: its most important dependency is SQLAlchemy. The model knows how to save the data, constituting the most reusable code in the entire project. It certainly does not know anything about HTTP or your web framework. It represents the essence of your system without the details of a user interface.

But wait. Where should you put the soul of your project, the business rules? In the view or in the model? The template layer is automatically out because it is not written in Python. So 3 possible answers remain:

The view. This is the worst choice. The view should contain only glue code, should be kept thin, and should isolate the web framework from the rest of your system, so the system can be reached independently of the web framework, in usage and in unit testing. Also, business logic should reside in a more reusable place. The view layer is considered a part of the presentation logic, so business logic is out of there. Indeed, to create a desktop UI in addition to the web UI, one would ignore views and HTTP and need business rules to be reusable ― therefore, out of the view layer.
The model. This is slightly preferable, because at least the model is a reusable layer. But the model should focus on persistence; it should depend on little more than SQLAlchemy (which is already a complex beast).
A new layer. This is the correct answer. Let us understand this better through an example below.

MTV certainly is all you need to create a blog. But for more complex projects, there is at least one layer missing there. I mean, you need to remove the business logic from where it stands and put it in a new, reusable layer, which most people call a “Service” layer, but I like to call “Action” layer.

Why do you need that?

In larger applications, it is common for a single user action to cause multiple activities. Suppose, for instance, the user successfully signs up for your service. Your business rules might want to trigger many processes as a consequence:

Create data in multiple tables of the relational database, using the models.
Enqueue a task that will send an email to the user.
Enqueue a task that will send an SMS to the user’s phone.
Enqueue a task that will create space and other resources in preparation for actual use of the service.
Enqueue a task that will update user statistics.
...

This is a good example of what we understand by a “business rule”: Given a user action (e. g. sign up), these are the activities the system must perform. This business rule had better be captured in a single function; in which layer should this function go?

If all this were implemented in a model, can you imagine how complex it would be? Models are hard enough when they focus only on persistence. Now imagine a model doing all that stuff, too. How many external services does it have to consume? How many imports are there at the top of the file? Don’t some of those modules, in turn, want to import this model, too, creating a cyclic dependency that crashes the system before it starts?

A circular dependency alone is a clear sign that you aren’t seeing your architecture properly.

It simply isn’t clean for a model to depend on Celery, to know how to send emails and SMS and consume external services etc. Persistence is a complex enough subject for the model to handle. You need to capture many of these business rules outside of the model – in a layer that stands between the model and the view. So let’s call it the “Action” layer.

Also remember that a model usually maps to a single table in the relational database. If you are at once inserting a User and a Subscription, which of these 2 models should contain the above logic? It is almost impossible to decide. This is because the set of activities being performed is much larger than either the User’s concerns or the Subscription’s concerns. Therefore, this business rule should be defined outside of either model.

When a dev is performing maintenance, sometimes she wants to run each of these steps manually; other times she will execute the whole action. It helps her to have these activities implemented separately and called from a single Action function.
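
The sign-up example could be sketched roughly as follows. This is only an illustration under stated assumptions: FakeUserStore and FakeTaskQueue are hypothetical stand-ins for the Model layer and for a task queue such as Celery, and every function name here is made up for the example. The point is the shape: each activity is a small function that can be run by hand, and one Action function captures the whole business rule.

```python
from dataclasses import dataclass, field


@dataclass
class FakeTaskQueue:
    """Hypothetical stand-in for a task queue; it only records what was enqueued."""
    enqueued: list = field(default_factory=list)

    def enqueue(self, task_name, **payload):
        self.enqueued.append((task_name, payload))


@dataclass
class FakeUserStore:
    """Hypothetical stand-in for the Model layer (a real one would use SQLAlchemy)."""
    users: list = field(default_factory=list)

    def create_user(self, email, phone):
        user = {"email": email, "phone": phone}
        self.users.append(user)
        return user


# Each activity is its own small function, so it can be run manually.
def persist_user(store, email, phone):
    return store.create_user(email, phone)


def enqueue_welcome_email(queue, user):
    queue.enqueue("send_welcome_email", email=user["email"])


def enqueue_welcome_sms(queue, user):
    queue.enqueue("send_welcome_sms", phone=user["phone"])


def sign_up(store, queue, email, phone):
    """Action: the business rule 'what happens when a user signs up'."""
    user = persist_user(store, email, phone)
    enqueue_welcome_email(queue, user)
    enqueue_welcome_sms(queue, user)
    return user


store, queue = FakeUserStore(), FakeTaskQueue()
new_user = sign_up(store, queue, "ana@example.com", "+5511999990000")
```

Because the store and the queue are passed in, neither the models nor the task code need to import each other, which is one way to stay clear of cyclic dependencies.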

You might wonder if what I am proposing isn’t really an instance of an antipattern called Anemic Domain Model. Models without behaviour are just contrary to object-oriented design! I am not saying “remove all methods from your models”, but I am saying “move away methods that need external services”. A method that finds in a model all the data that it needs... really wants to belong to that model. A method that looks out to the world, consumes external services and barely looks into self... has been misplaced in the model.

Another reason that makes this a successful architecture is testing. TDD teaches a programmer to keep things decoupled and it always results in better software. If you have to set up a Celery application and who knows what other external services before you can test your model, you are going to be frequently in pain.

There is a final reason to keep business rules outside the view layer. In the future, when you finally decide it’s time to switch from Flask to Pyramid, you will be glad that your views are as thin as possible. If all your view does is deal with the web framework and call a couple methods on your Action layer, it is doing one job (as should be) and code is properly isolated. Web frameworks can be greedy; don’t let them have their way with your system.
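
A thin view might look something like the sketch below. It is framework-neutral on purpose: the request is a plain dict standing in for whatever request object your web framework provides, and list_invoices, invoices_view and the template names are hypothetical. The view only reads the request, calls one Action and hands the result to a template.

```python
def list_invoices(customer_id):
    """Hypothetical Action; a real one would load models through SQLAlchemy."""
    return [{"customer_id": customer_id, "total": 120}]


def invoices_view(request, action=list_invoices):
    """View: glue only. Read the request, call one Action, feed the template."""
    try:
        customer_id = int(request["params"]["customer_id"])
    except (KeyError, ValueError):
        # Bad or missing input is a presentation concern, handled right here.
        return {"status": 400, "template": "error.html", "context": {}}
    invoices = action(customer_id)
    return {"status": 200, "template": "invoices.html",
            "context": {"invoices": invoices}}


response = invoices_view({"params": {"customer_id": "7"}})
bad_response = invoices_view({"params": {}})
```

Swapping the web framework would then mean rewriting functions of this size, not the business rules.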

So here are the layers I propose for a typical large application in Python:

Model is the lowest, most reusable, most visible layer. It focuses solely on persistence. It is all right for a model to contain behaviour, as long as this behaviour pertains only to this model and not to other things. Models can be returned by each layer, all the way to the template at the end of a request.
External services. Make one of these for each service such as sending email.
Action. This layer is the core of your system. It contains business rules and workflows. It uses the external services to achieve certain goals and uses the models to persist data. With the above layers, it constitutes the whole system, including configuration, except for the user interface.
Template. This layer only contains presentation logic such as looping through a list to output an HTML table.
View. It is the highest, least reusable, layer. It depends on (and isolates from the rest of the system) the web framework. It depends on (and isolates from the rest of the system) the form validation library. It can see the Template and Action layers. It cannot call a Model directly ― it must go through an Action. But when an Action returns models, these can be passed on to the template. (A Celery task is analogous to a web view.)

This architecture helps avoid heroic debugging sessions insofar as it clearly defines responsibilities. It is also eminently testable because it is more decoupled, thus requiring less test setup and fewer test mocks and stubs.

Good architecture is all about decoupling things. If you ever catch yourself trying to resolve a cyclic dependency, rest assured you haven’t thought well about the responsibilities of your layers. When you see yourself giving up and importing something inside a function, know that your architecture has failed.

It also goes without saying that your web app should be kept separate from your Celery (tasks) app. There can be code reuse between them ― especially models ― but there is no reason for a Celery app to ever import the web framework! Obtaining configuration is no exception. Reading configuration is easy in Python.
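
As a minimal sketch of reading configuration without any web framework, the standard library's configparser is enough for INI-style settings. The section names, keys and values below are hypothetical; in a real project the text would live in a file and be loaded with parser.read(path).

```python
import configparser

# Hypothetical settings, inlined here so the snippet runs on its own.
SETTINGS_TEXT = """
[database]
url = sqlite:///app.db

[mail]
host = localhost
port = 25
"""


def load_settings(text):
    """Parse INI-style settings into a plain nested dict."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {section: dict(parser[section]) for section in parser.sections()}


settings = load_settings(SETTINGS_TEXT)
mail_port = int(settings["mail"]["port"])  # values are strings until converted
```

Both the web app and the Celery app can call a function like this, so neither has to import the other just to learn the database URL.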

Automated testing is a mighty challenge, too

Python is a very flexible, expressive, reflective language. A downside of its dynamism is that it catches fewer errors at “compile time”. One wouldn’t create large systems in Java without automated testing today; even less so in Python.

You start writing tests as soon as you realize how much they contribute to your confidence in the software. The first tests you write have enormous value. They give you an incredible boost in confidence in your system. You are having fun.

However, soon your tests start feeling more like a burden. You now have hundreds of tests and the test suite takes forever to run ― minutes, even. In this sense, each new test you write makes your life worse. At this point, some might become disenchanted and conclude testing isn’t worth it. A premature conclusion.

You thought you knew how to write tests. But in your naiveté, you have been writing nothing but integration tests. You call them unit tests, but they really aren’t. Each test goes through almost the whole stack in your system. You put your mocks at the most remote points possible. You thought this was good (it was testing more this way). Now you start to see this isn’t good.

A unit test is the opposite. A real unit test is focused like a laser. It executes only one function of one layer, it uses mocks to prevent descent into other layers, it never accesses external resources, it asserts only one condition, and it is lightning fast.
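
Here is a small sketch of what such a focused test can look like, using only unittest and unittest.mock from the standard library. The function notify_signup and its mailer argument are hypothetical; the mock keeps the test from descending into a real email service, and each test checks a single condition.

```python
import unittest
from unittest import mock


def notify_signup(user_email, mailer):
    """Hypothetical function under test: one layer, one responsibility."""
    if "@" not in user_email:
        raise ValueError("invalid email address")
    mailer.send(to=user_email, subject="Welcome!")
    return True


class NotifySignupTest(unittest.TestCase):
    def test_sends_one_welcome_email(self):
        mailer = mock.Mock()  # stands in for the external email service
        notify_signup("ana@example.com", mailer)
        mailer.send.assert_called_once_with(
            to="ana@example.com", subject="Welcome!")

    def test_rejects_address_without_at_sign(self):
        with self.assertRaises(ValueError):
            notify_signup("not-an-address", mock.Mock())


suite = unittest.defaultTestLoader.loadTestsFromTestCase(NotifySignupTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Nothing here touches a database, a network or the file system, which is what keeps a test of this kind fast.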

To add insult to injury, when your tests do their job ― showing you that you messed up ― they are a nuisance. Instead of a single focused failed unit test showing you exactly where you did something wrong, you have dozens of integration tests failing (all of them for the same reason, because they all go through the same code) but it takes you a long time to figure out where the bug really is. You still need some heroic debugging. You need to learn to write better tests!

Experts recommend that about 99% of your tests be real, focused, mocked, sniper unit tests, and only 1% all-layers-encompassing integration tests. If you had done so from the start, your unit test suite would still be running in only a couple of seconds, which is required for TDD to be feasible. If a unit test is not instantaneous (less than 10 milliseconds for sure), it’s really some other kind of test, but not a unit test.

If this were a small application, you could still get away with your sluggish integration tests. But we are talking about large applications, remember? In this context, the reality is, you either optimize your tests’ performance, or you cannot practise TDD!

Also, as you can remember, some tests have been difficult to write. They required too much work to set up. (Integration tests tend to.) Someone explains this is because your tests aren’t really unit tests and you aren’t doing Test First: you are writing tests for existing code that wasn’t written with enough decoupling to be easily testable. Here you start to see how TDD really changes not only the tests, but your actual system, for the better.

Watch these talks about test writing.

Fast Test, Slow Test
Integration tests are a scam
Stop mocking, start testing (Does not really attack the practice of mocking, just proposes the reuse of mocks and stubs.)

To find out which are your 2 slowest tests, run this command:

py.test -s --tb=native -x --durations 2

SQLAlchemy and testing

But the system uses SQLAlchemy! Data travels between layers in the form of model instances. A database query is performed and boom, you are already over the 10ms limit. This forces you to conclude: if it hits the database, it is not a unit test. (It is instantaneous to instantiate a SQLAlchemy model, but it is expensive to talk to SQLite, even if it is in memory.) Yeah, TDD will force you to keep queries, session.flush() and session.commit() outside of a function suitable for unit testing!
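
One way to read this in practice: let an outer layer run the query, and keep the interesting logic in a function that only receives objects that were already loaded. In the sketch below, Subscription is a plain dataclass standing in for a SQLAlchemy model instance, and monthly_revenue is a hypothetical piece of business logic.

```python
from dataclasses import dataclass


@dataclass
class Subscription:
    """Plain stand-in for a SQLAlchemy model instance (no session attached)."""
    plan: str
    monthly_price: int  # in cents
    active: bool


def monthly_revenue(subscriptions):
    """Pure logic: works on already-loaded objects and never runs a query."""
    return sum(s.monthly_price for s in subscriptions if s.active)


# In production, an outer layer would run the query and pass the resulting
# model instances in. A unit test simply builds the objects in memory.
subs = [
    Subscription("basic", 1000, True),
    Subscription("pro", 2500, True),
    Subscription("pro", 2500, False),
]
total = monthly_revenue(subs)
```

The function never sees a session, so testing it needs no database at all.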

You still need to write a few integration tests anyway. They test the contracts between the layers and catch bugs that unit tests don’t catch. For integration tests, John Anderson has a good approach: Use SQLAlchemy, but never allow the transaction to be committed. At the end of each test, do a session.rollback() so the next test can run without the database having been changed. This way you don’t need to recreate tables for each test you run.
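
To show the rollback idea in a form that runs anywhere, the sketch below swaps SQLAlchemy for the standard library's sqlite3 module; with SQLAlchemy the tearDown step would call session.rollback() instead. The table and test names are hypothetical. The schema is created once, and every test's writes are undone before the next test runs.

```python
import sqlite3
import unittest


class RollbackTestCase(unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        # Create the schema once for the whole class, then commit it.
        cls.conn = sqlite3.connect(":memory:")
        cls.conn.execute("CREATE TABLE users (email TEXT NOT NULL)")
        cls.conn.commit()

    @classmethod
    def tearDownClass(cls):
        cls.conn.close()

    def tearDown(self):
        # Undo whatever the test wrote, so the next test starts clean.
        self.conn.rollback()

    def count_users(self):
        return self.conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]

    def test_insert_is_visible_inside_the_transaction(self):
        self.conn.execute("INSERT INTO users VALUES ('ana@example.com')")
        self.assertEqual(self.count_users(), 1)

    def test_table_starts_empty(self):
        # Passes in any test order, because every test is rolled back.
        self.assertEqual(self.count_users(), 0)
        self.conn.execute("INSERT INTO users VALUES ('bob@example.com')")
        self.assertEqual(self.count_users(), 1)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(RollbackTestCase)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```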

To make that possible, you can’t be committing the session all over the place. It is probably best to stipulate a rule: the system can only call session.commit() in the outermost layer possible. This means the web view or the Celery task. Don’t commit in the Model layer! Don’t commit in the Action layer!

This creates a final problem: How do I write a unit test for a task, if the task commits the transaction? I need a way for the test to call the task saying: exceptionally, just this once (because this is a test), please don’t commit. Otherwise the unit test would hit the database server and exceed the maximum of 10 milliseconds.

I finally came up with a mechanism to give an external function (e. g. a test case) control over whether the transaction is to be committed or not. With this scheme, a task commits the transaction by default, but allows a test to tell it not to commit. You are welcome to the following code:

from functools import wraps

# NOTE: ``db`` is assumed to be your application's object that exposes the
# SQLAlchemy session as ``db.session``; import it from wherever it is defined.

def transaction(fn):
    '''Decorator that encloses the decorated function in a DB transaction.
    The decorated function does not need to session.commit(). Usage::

        @transaction
        def my_function():  # (...)

    If any exception is raised from this function, the session is rolled back.

    But if you pass ``persist=None`` when calling your decorated function,
    the session is neither committed nor rolled back. This is great for tests
    because you are still in the transaction when asserting the result.

    If you pass ``persist=False``, the session is always rolled back.

    The default is ``persist=True``, meaning yes, commit the transaction.
    '''
    @wraps(fn)
    def wrapper(*a, **kw):
        persist = kw.pop('persist', True)
        try:
            result = fn(*a, **kw)
        except BaseException:
            # Roll back on any error, then let the error propagate.
            db.session.rollback()
            raise
        else:
            if persist is False:
                db.session.rollback()
            elif persist is True:
                db.session.commit()
            return result  # pass the decorated function's return value on
    return wrapper
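
As a rough usage sketch, here is how a test might call a decorated task with persist=None. To keep the snippet runnable on its own, it includes a condensed copy of the decorator (written so that it passes the return value through) and replaces the real ``db`` with a stand-in whose session is a mock; the register function is hypothetical.

```python
from functools import wraps
from types import SimpleNamespace
from unittest import mock

# Stand-in for the application's ``db`` object: a mock session that records
# calls instead of talking to a database.
db = SimpleNamespace(session=mock.Mock())


def transaction(fn):
    """Condensed copy of the decorator above, so this snippet runs alone."""
    @wraps(fn)
    def wrapper(*a, **kw):
        persist = kw.pop('persist', True)
        try:
            result = fn(*a, **kw)
        except BaseException:
            db.session.rollback()
            raise
        if persist is False:
            db.session.rollback()
        elif persist is True:
            db.session.commit()
        return result
    return wrapper


@transaction
def register(name):
    """Hypothetical task body; a real one would add models to the session."""
    return name.upper()


# A test passes persist=None: neither commit nor rollback happens.
value = register("ana", persist=None)
calls_in_test_mode = list(db.session.method_calls)

# Production code calls the task normally and the transaction is committed.
register("bob")
```

The test stays inside the open transaction, so it can assert on the result and then roll back by itself.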

This post is dedicated to my friend Luiz Honda who taught me most of it all.

Posted by Nando Florestan

Tags: in English, programming
