Python/Style Guide

From Jonathan Gardner's Tech Wiki

Two kinds of style

There are two kinds of style that I worry about.

One is what I might call "lexical" style: How the code is actually written.

The other is more of a "design" style: how the code works.

I'll talk about both.

Lexical Style

4 spaces, no tabs

Indent each level of code by 4 spaces. You aren't going to write deeply nested code, so you never need 2-space or 1-space indents.

I feel so strongly about this rule that I almost feel like we should make it part of the Python language itself.

USE_CONSTANTS = False

There is literally no reason to use constants in Python, except one: DRY (Don't Repeat Yourself). When you have magic numbers that you can't get rid of, you kind of have to use constants, but try to avoid them otherwise.
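Here is a minimal sketch of the one case where a constant earns its keep: a magic number that is needed in more than one place. The names `MAX_RETRIES`, `should_retry`, and `describe_policy` are made up for illustration.

```python
# Hypothetical example: the same limit is needed in two places, so it
# gets one name instead of two copies of the literal 3.
MAX_RETRIES = 3


def should_retry(attempt):
    """Return True while another attempt is still allowed."""
    return attempt < MAX_RETRIES


def describe_policy():
    """Return a human-readable description of the retry policy."""
    return "giving up after %d attempts" % MAX_RETRIES
```

If the limit were only used once, the literal with a short comment would arguably do just as well.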

Caps and such

  • StudlyCaps for class names.
  • ALL_CAPS for constants (very rare)
  • no_caps_at_all for everything else.

To be honest, I don't like StudlyCaps. I'd prefer we used Studly_Spaced_Caps instead, but what are you going to do? Honestly, treating classes like they are special leads to wrong thinking.

Design Style

Here are some rules of thumb I have for Python code design.

Functions are First Class Citizens

You'll need to understand and use a functional language to truly understand this, but there are a lot of things that can be made simpler if you allow people to pass functions around. Let them.
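As a small illustration of passing functions around, the sketch below hands both a user-defined function and a built-in method to another function. `apply_to_each` and `double` are hypothetical names chosen for this example.

```python
def apply_to_each(func, values):
    """Call func on every value and return the results as a list."""
    return [func(value) for value in values]


def double(number):
    """Return number multiplied by two."""
    return number * 2


# Functions are ordinary values: no wrapper class or interface needed.
doubled = apply_to_each(double, [1, 2, 3])       # [2, 4, 6]
shouted = apply_to_each(str.upper, ["a", "b"])   # ['A', 'B']
```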

Classes are First Class Citizens

Like functions, classes can be passed around. Learn what `type(name, bases, dict)` is all about. And use it.
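A minimal sketch of what the three-argument form of `type` does: it builds a class object at runtime from a name, a tuple of base classes, and a dict of attributes. The `Duck` class and the helper names here are hypothetical.

```python
def describe(self):
    """Return a short description built from the class name."""
    return "I am a " + type(self).__name__


# Roughly equivalent to writing `class Duck(object): ...` by hand.
Duck = type("Duck", (object,), {"sound": "quack", "describe": describe})


def make_instance(cls):
    """Create an instance of whatever class is passed in."""
    return cls()


# The class is an ordinary value, so it can be passed as an argument.
duck = make_instance(Duck)
```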

In the rare case you actually need metaclasses, use them. There's no reason to be shy about it.

Duck Typing

Never, ever ask for the type of an object. There is really only one exception to this rule: It's hard to tell lists and strings apart. So don't write code that accepts either a string or a list. Accept lists or accept strings, but not both.
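To see why strings and lists are hard to tell apart by behavior alone, consider this sketch. `total_length` is a hypothetical function that expects a list of strings; a bare string slips through without any error.

```python
def total_length(names):
    """Sum the lengths of the items in names (expects a list of strings)."""
    return sum(len(name) for name in names)


# A list of strings behaves as intended.
from_list = total_length(["spam", "eggs"])   # 4 + 4 = 8

# A bare string is also iterable, so nothing fails. Each character is
# treated as an item of length 1, which is almost never what was meant.
from_string = total_length("spam")           # 1 + 1 + 1 + 1 = 4
```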

There is one other case: Files. You might get a filename or an open filehandle. (See the next rule.)

The Proper Way to Handle Files

This isn't a standard, but I think it should be. There should be 2 ways to specify any file:

  • By filename (a string)
  • By file handle.

Being able to pass around strings as if they were files is very useful. Often, you don't even want to open the file.

(Maybe I should invent an object that is a file but is not a file, i.e., it isn't opened yet, but could be with a simple call to "open".)
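One possible shape for a function that accepts either form is sketched below. `read_text` is a hypothetical helper, not a standard API, and the type check on `str` is the one exception to the duck typing rule mentioned earlier.

```python
import io


def read_text(source):
    """Return the text of source: a filename (str) or an open file handle.

    Hypothetical helper. A handle passed in by the caller is left open;
    a file opened here is also closed here.
    """
    if isinstance(source, str):
        with open(source) as handle:
            return handle.read()
    return source.read()


# Anything with a read() method will do, not just real files.
text = read_text(io.StringIO("hello\n"))
```

Leaving a caller's handle open is a design choice: whoever opens the file is the one who closes it.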

What you pass into __init__ should end up as attributes you can modify

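Users expect these two forms to behave the same:

# Set the value at construction time.
x = X(a=5)

# Or create the object first and set the same value afterwards.
x = X()
x.a = 5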

With descriptors, there is no reason why you can't do that. Exception: Sometimes you have objects that can't change once they are created.

Regardless, you should make the params to __init__ available as attributes.
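A minimal sketch of one way to get this behavior with a property (the most common kind of descriptor). The `Circle` class and its validation rule are assumptions made for the example.

```python
class Circle(object):
    """Hypothetical class whose constructor argument is also an attribute."""

    def __init__(self, radius=1.0):
        # Assign through the property so both paths share one code path.
        self.radius = radius

    @property
    def radius(self):
        """The radius of the circle; must not be negative."""
        return self._radius

    @radius.setter
    def radius(self, value):
        if value < 0:
            raise ValueError("radius must not be negative")
        self._radius = value


# Both forms end up in the same state.
a = Circle(radius=5)

b = Circle()
b.radius = 5
```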

Descriptors in the wrong places

Speaking of which, don't abuse descriptors.

  • Don't use descriptors which have major side effects. Use a method instead. People expect a.x to just return a value while a.x() will do something.
  • Don't make get_x and set_x functions. Use a descriptor, even if there are side effects (such as changing cached results).
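The two rules above might look like this in practice. This is only a sketch: `Document`, `words`, and `save` are hypothetical names. Assigning `text` has a minor side effect (resetting a cache), so it stays a property; writing output is a major side effect, so it is a method.

```python
import io


class Document(object):
    """Hypothetical class that caches a value derived from its text."""

    def __init__(self, text=""):
        self.text = text

    @property
    def text(self):
        """The document text. Assigning it discards the cached words."""
        return self._text

    @text.setter
    def text(self, value):
        self._text = value
        self._words = None  # minor side effect: reset the cache

    @property
    def words(self):
        """The words in the text, split on first access and then cached."""
        if self._words is None:
            self._words = self._text.split()
        return self._words

    def save(self, handle):
        """Write the text to handle. A major side effect, so a method."""
        handle.write(self._text)


doc = Document("spam and eggs")
first = doc.words          # ['spam', 'and', 'eggs']
doc.text = "just spam"     # plain assignment, no set_text()
second = doc.words         # ['just', 'spam']

buffer = io.StringIO()
doc.save(buffer)           # the call syntax signals that something happens
```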

Making Code Testable

If you have a class that is a wrapper around another class, make it so that the user can replace the class with a Mock object. That means you need to expose the underlying magic with an explicit attribute.
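A minimal sketch of exposing the wrapped object, assuming a hypothetical `Stopwatch` class that wraps the standard `time` module. The test replaces the real clock with a `Mock` from the standard library's `unittest.mock`.

```python
import time
from unittest import mock


class Stopwatch(object):
    """Hypothetical wrapper around the time module."""

    def __init__(self, clock=time):
        # The wrapped object is a plain, public attribute so that a
        # test can swap it for a mock.
        self.clock = clock
        self.started = self.clock.time()

    def elapsed(self):
        """Return the seconds since the stopwatch was created."""
        return self.clock.time() - self.started


# In a test, replace the real clock with a Mock that returns fixed times.
fake_clock = mock.Mock()
fake_clock.time.side_effect = [10.0, 12.5]
watch = Stopwatch(clock=fake_clock)
elapsed = watch.elapsed()   # 12.5 - 10.0 = 2.5
```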

Is that really private?

People tend to overuse "private" attributes: attributes or methods whose names start with '_'. The only times you should make something private are:

  • When there is NO foreseeable case where the user of your code might ever access that.
  • When there is a high probability you will change it in the near future.

If those two conditions aren't both satisfied, don't do it. That's because nothing is private in Python. It's simply impossible. (Well, you CAN use a trick that I won't tell you about because I don't want you to use it. It starts with "C" and ends in "losure".)

Document EVERYTHING!!!!

Get in the habit of adding doc strings to everything. EVERYTHING.

You should feel bad when you write code without even a single sentence describing the thing.
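For instance, a single sentence at each level is already enough to count. The module and function below are hypothetical.

```python
"""Utilities for converting temperatures (hypothetical module)."""


def celsius_to_fahrenheit(celsius):
    """Return the temperature in Fahrenheit for a value in Celsius."""
    return celsius * 9.0 / 5.0 + 32.0
```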

Minimize the number of classes, functions, etc...

Try as hard as you can to minimize the number of moving parts. You'll thank me later, as well as the people who use your code.

  • Limit the number of variables. Put values that go together in a single variable, a dict or a list.
  • Limit the number of functions. Write bigger functions that do bigger things. Break out code that is actually shared.
  • Limit the number of parameters. See my note above about how to group values.
  • Limit the number of classes. Try to keep the number of methods and attributes small as well.
  • Limit the number of modules. You should have 1 module or 5+, but never 2-4.

Avoid classes

If you don't need a class, don't use a class.

Why would you want a class?

  • You have a set of functions that follow a pattern and always go together.
  • You want to create a new type that can be used to replace an existing type.
  • You have a paradigm that you want people to stick to.

Otherwise, don't use a class.

You can just write a bunch of functions and put them in a module and it's fine.

Avoid Functions

Write bigger functions. Only break out code from functions when it is genuinely useful somewhere else.

Think HARD about names

Try to get as much documentation as you can in the names themselves.

But don't make your names too long.

Be consistent with naming.

Think HARD before you use singular or plural forms. Plurals tend to denote lists. Singulars denote single values. Package and module names should usually be singular.

Factory Pattern: NO!

The Factory Pattern is an anti-pattern in Python. You don't need it. Just make a better __init__.
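One way to read "a better __init__" is sketched below: keyword arguments with defaults cover the variations a factory would otherwise hide, and the class itself can be passed wherever "something that makes objects" is needed. `Connection` and `make_pool` are hypothetical and do not describe any real library.

```python
class Connection(object):
    """Hypothetical connection settings; not a real library API."""

    def __init__(self, host="localhost", port=5432, timeout=None):
        self.host = host
        self.port = port
        self.timeout = timeout


def make_pool(make_connection, size):
    """Build a list of connections by calling make_connection size times."""
    return [make_connection() for _ in range(size)]


# Keyword defaults cover the common variations...
default = Connection()
remote = Connection(host="db.example.com", timeout=30)

# ...and the class is itself callable, so it already is the factory.
pool = make_pool(Connection, 3)
```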

Prepare for evolution

The way to prepare for code evolution is to limit the interconnectivity of your code. One piece of code should call into another piece of code with only a few parameters.

Use dicts!

99% of the problems I have can be solved by dicts. I don't need a class, just a dict.
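Two typical cases, as a rough sketch: a plain dict standing in for a small record class, and a dict of functions standing in for an if/elif chain. All names and data here are made up.

```python
# A record as a plain dict instead of a small class (hypothetical data).
user = {"name": "alice", "email": "alice@example.com", "active": True}

# A dict of functions instead of an if/elif chain.
operations = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
}


def calculate(operation, a, b):
    """Look up operation by name and apply it to a and b."""
    return operations[operation](a, b)


result = calculate("add", 2, 3)   # 5
```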

Use Lists!

The other 1% of the problems I have are solved by lists.

Use iterators and generators!

Always think of how iterators can make your code simpler. You should be able to break every problem down into one or two for loops.
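A small sketch of how generators keep each step down to one or two for loops. The function names and sample input are hypothetical.

```python
def non_blank_lines(lines):
    """Yield each line stripped of whitespace, skipping empty ones."""
    for line in lines:
        line = line.strip()
        if line:
            yield line


def words(lines):
    """Yield every word from every non-blank line."""
    for line in non_blank_lines(lines):
        for word in line.split():
            yield word


raw = ["alpha beta\n", "\n", "  gamma  \n"]
found = list(words(raw))   # ['alpha', 'beta', 'gamma']
```

Because each stage is lazy, `raw` could just as well be an open file that is never fully read into memory.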

Use Comprehensions

Comprehensions are a major time saver. Remember that you don't want intermediate variables. You should be able to get all you want from a single comprehension.
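For comparison, here is the same result built with a loop and an intermediate list, then with a single expression. The `prices` data is made up for the example.

```python
prices = {"apple": 1.5, "pear": 0.0, "plum": 2.0}

# Loop version: needs an intermediate list that is filled step by step.
priced = []
for name, price in prices.items():
    if price > 0:
        priced.append(name.upper())
priced.sort()

# Comprehension version: one expression, no intermediate variable.
priced_again = sorted(
    name.upper() for name, price in prices.items() if price > 0
)

# Dict comprehensions work the same way.
name_lengths = {name: len(name) for name in prices}
```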