Sunday, September 11, 2005

Code Advice #3: No Tabs! Ever!

WARNING: This blog entry was imported from my old blog on blogs.sun.com (which used different blogging software), so formatting and links may not be correct.


(See intro for a background and caveats on these coding advice blog entries.)



Should you use spaces or tab characters when indenting your code?
This question has been debated at length in the past, with a fervor similar to the "emacs versus vi" editor debate.
But unlike "emacs versus vi", we cannot just agree to disagree. We can each choose to use a different IDE,
but the source code is often shared, and if there's one thing that's worse than a source file indented with tabs, it's a source
file indented with a mix of tabs and spaces. This is typically the result of a file edited by multiple users.



My advice is simple: Always use spaces to indent. That doesn't mean you can't use the Tab key on your keyboard to indent - most tools will automatically do the right thing and insert spaces instead. In other words, the Tab key is the Indent key, not the Tab character key.



So why is it bad to use tabs instead of spaces?



There are several reasons. Obviously, there's the reason I started out with: that we really need to pick one convention. Spaces for indentation is the most common scheme used today, so it's a reasonable choice on that basis alone.



One of the problems with tabs is that a tab character needs to be converted into whitespace by the editor when displaying the file. How much whitespace should each tab character be replaced with? In an ideal world, the old typewriter functionality could be used, where each tabstop had a certain pixel position. That way people could even use proportional width fonts in their editors (instead of the blocky monospace fonts used by practically all code editors today), and the code would still indent nicely. However, no editor that I'm aware of supports this, so that's not a practical avenue. Instead, editors typically assume that a tab is either 8 characters wide (common in ye olde days) or 4 characters wide (common in Java editors today). Some editors will stick with the 8-character assumption but support 4-character indents (which are common in Java), so when indenting to level 3, they will insert a tab followed by 4 spaces, to get a 12-character indent using an 8-character tab.
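The arithmetic is easy to check. Here is a minimal sketch in Python (chosen only for illustration; the helper name `display_width` is made up for this example) that measures how wide the same "tab plus 4 spaces" indent appears under the two common tab-size assumptions:

```python
def display_width(indent: str, tab_size: int) -> int:
    """Columns occupied by `indent` when each tab advances to the
    next multiple of `tab_size` (the usual editor behavior)."""
    return len(indent.expandtabs(tab_size))


# One tab followed by four spaces: the level-3 indent written by an
# editor that assumes 8-column tabs and 4-column indentation levels.
level_3_indent = "\t" + " " * 4

width_at_8 = display_width(level_3_indent, 8)  # 12 columns, as intended
width_at_4 = display_width(level_3_indent, 4)  # 8 columns, looks like level 2
```

The bytes in the file are identical in both cases; only the viewer's assumption differs, and that alone moves the line from level 3 to level 2.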



Why is this bad? Because code is viewed in more than one tool. In the presence of tabs, code often gets misaligned. Code in integration e-mail diffs, code viewed in other editors, and code edited by other tools that treat tabs differently will easily get "mangled" (for example, ending up with a mix of spaces and tabs).
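A quick way to see whether a file has already been mangled is to classify the leading whitespace of each line. The following is a rough sketch, not a polished tool; the function name `indentation_styles` and the sample text are hypothetical:

```python
def indentation_styles(source: str) -> set:
    """Return which indentation styles occur in `source`:
    "spaces", "tabs", and/or "mixed" (both on the same line)."""
    styles = set()
    for line in source.splitlines():
        # Leading whitespace is whatever lstrip() would remove.
        leading = line[: len(line) - len(line.lstrip(" \t"))]
        if not leading:
            continue
        if "\t" in leading and " " in leading:
            styles.add("mixed")
        elif "\t" in leading:
            styles.add("tabs")
        else:
            styles.add("spaces")
    return styles


# A small file that has passed through several editors.
sample = "class A {\n    int x;\n\tint y;\n\t    int z;\n}\n"
found = indentation_styles(sample)
```

Anything other than a single style in the result suggests the file has been edited with conflicting settings.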



(Sidenote: In the old days, source files sometimes included a comment at the top of the file, with special "tokens" (-*-) intended for Emacs. These tokens would identify the language mode as well as the intended tab size for the file. When loading the file, Emacs would use the specified tab size. Thus, the source files actually carried the tab information needed to edit the file as intended. However, this doesn't really solve the problem, since every other tool that processes or displays the file would also need to be aware of this metadata.)
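To make the "every other tool" point concrete, this is roughly what each tool would have to do before it could display such a file correctly. It is a simplified sketch: the function name `emacs_file_variables` and the example header are invented for illustration, and it only handles the `name: value;` form on a single line, not everything Emacs itself accepts.

```python
import re

# The "-*- ... -*-" block that Emacs looks for near the top of a file.
_FILE_VARS = re.compile(r"-\*-(.*?)-\*-")


def emacs_file_variables(first_line: str) -> dict:
    """Extract `name: value` pairs from a "-*- ... -*-" header line."""
    match = _FILE_VARS.search(first_line)
    if match is None:
        return {}
    variables = {}
    for entry in match.group(1).split(";"):
        name, separator, value = entry.partition(":")
        if separator:
            variables[name.strip()] = value.strip()
    return variables


# Hypothetical first line of a Java source file.
header = "// -*- mode: java; tab-width: 4; indent-tabs-mode: t -*-"
settings = emacs_file_variables(header)

# Fall back to 8 columns when the file does not say otherwise.
tab_width = int(settings.get("tab-width", 8))
```

Diff viewers, e-mail clients, and web-based code browsers would all need equivalent logic, which is why the convention never solved the problem outside Emacs.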



I've heard people put forward two arguments in favor of using the tab character:


  1. If a file uses ONLY tab characters for indentation, it is easy for users to read code at their own favorite indentation level.
    In other words, I can read your source file with tabs=4, you can read it with tabs=2.

  2. It's easier to move back and forth between indentation levels, since a single left/right keystroke will jump across tab characters, i.e.
    whole indentation levels.



Regarding argument 1: There are lots of other things I want to customize when I read other people's code too. You see, people don't all agree with the code rules I'm putting forth in these blog entries :-) So if I read code that is indented poorly, or, worse yet, puts spaces between function names and the parentheses, or commits other horrible coding sins, I hit Shift-F10 to reformat the source properly first anyway. This solution is more comprehensive than simply adjusting the indentation depth.



Regarding argument 2: I don't see a big use case for being able to move the caret up and down indentation levels. This only applies at the beginning of the code line, and the Home key should alternate between jumping to the beginning of the line and the first non-space character on the line. Why would you ever need to go somewhere else? Perhaps you want to move some code up an indentation level. That's what the Reformat feature is for. Just reformat the buffer instead.


(Minor sidenote: In Emacs, and I believe in JBuilder, the Tab key was bound to a reindent action, NOT to inserting indentation. This is a much better use of the Tab key. When you're on a new line, pressing Tab should move the caret to the correct indentation level (reindent), NOT insert, say, 4 characters. If you're on a line with existing code, hitting Tab should NOT insert 4 characters where the caret is located; it should adjust the line's indentation so that it's correct. Thus, if I put an if block around a piece of code, I can just hit Tab, Arrow Down a couple of times to indent the block correctly. I submitted a patch for NetBeans to do this a while ago, but this behavior is apparently a bit controversial. For a previous XEmacs user like myself it's indispensable.)




Therefore, in my opinion, these potential advantages do not make up for the massive problems and ugly code that result.
Let's all use the same convention - no tabs.



All IDEs let you do this. (I even believe most IDEs default to using spaces. Please double check in your own.)
Here's the option in the new NetBeans 5.0 options dialog:






The people who seem to rely the most on Tabs today are people using old-style editors where Tab characters are still the default.
If you're using Emacs, add the following to your .emacs file:


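;; Indent with spaces rather than tab characters, and display any
;; existing tab characters as 4 columns wide.
(custom-set-variables
 '(indent-tabs-mode nil)
 '(tab-width 4))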

Here's how you do the same thing in Vim.


10 comments:

  1. <q>So why is it bad to use spaces?</q>
    Shouldn't that read "tabs"?

  2. Yes indeed. Thanks, I've updated the entry.

  3. The one thing to be careful with in your suggestion is Makefiles. Make requires use of the tab character, so if you switch your editor to always substitute spaces for tabs, then your Makefile won't come out correctly. That being said, I do agree with you on the spaces versus tabs issue.

  4. I use tabs and a proportional font to code in. Why? Because tabs are for indenting. Spaces are for separating keywords.
    Spaces are so 1990s. Do you still limit the width of your code to 80 characters? It's 2005 for God's sake, let's make some progress here.
    In Eclipse we can set our own tab = N spaces.
    Anyone not using Eclipse we have working on chipping out stone boulders into wheels. No kidding!

  5. Arbitrarily reformatting code using S-F10 just because you don't like someone's indentation level causes configuration management problems. When you need to commit a real change back to the repository you're expected to annotate what you changed, and why. "I like code to start at column 8" is not a valid reason to obfuscate necessary changes. Diffing successive versions of the code in the repository will show that every line in the file has changed, making it nearly impossible to figure out what the real change is.

  6. I also prefer tabs and would argue that argument #2 is really about being able to backspace to the previous indentation level with one keystroke. And it's not laziness - if you use spaces and hit the backspace key too many or too few times, all of a sudden you have a line that is in limbo and not lined up at the right indentation level.
    I don't think anyone will disagree that mixing spaces and tabs is bad. But that's why you should always have published coding standards and force new developers joining the team to read them.

  7. >> Regarding argument 1: ... Using Shift-F10

    And if i do not want/can modify the text ?

    And if the formatter style is not the same ?
    Why not using spaces :

    - Size of the source file containing 8xSpaces instead of ONE tab char.

    - Source comparators performances (diff,...).

    - Source navigation.

    - Spaces->Tab conversion may not be possible. Tab->Spaces is.

  8. Hi Tor, love you on the Posse. Do not love spaces, have much love for tabs. Reiterate previous comments, reformatting will make CVS think you have made changes when you really haven't, whereas tabs allows people to view however they want without it giving CVS fits.
    PS Need more info on Semplice. Have searched java.sun.com site and have come up with diddly/squat. May even be worth upgrading from JDK 1.2 for (just kidding :D )
    PPS How does Semplice handle the abomination which causes desolation? (AKA 1.5 Generics)

  9. Hi Tor,
    - the fact that we have been working with spaces to do lay out: that is really ugly. So I vote in favor of tabs. The issue that you address of editors not in a right way supporting tabs, let us make that the real issue.
    - Well, if cvs thinks you changed all lines when you only changed white space, isn't it time for a better cvs?

  10. Well, honestly, mixing spaces and tabs is wrong, but that's because spaces force your idea of indentation on everyone else.
    Tabs are indentation levels (in block-structured languages such as all C dialects); spaces aren't.
    When I want to view a file with indentationlevel=2, I can do that. Other people (or myself when editing C) will sometimes set the indentation level to 8. Fine. Tabs will be displayed wider. But that's only an editor option.
    While tabs will automatically respect the user's tab width setting, spaces are rigid and will always be the same, so you're only telling other people (with other tab width preferences) that *your* indentation is the best, which is honestly kind of arrogant in a collaborative environment (i.e. programming team).
