WARNING: This blog entry was imported from my old blog on blogs.sun.com (which used different blogging software), so formatting and links may not be correct.
(See intro for a background and caveats on these coding advice blog entries.)
Should you use spaces or tab characters when indenting your code?
This question has been debated at length in the past, with a fervor similar to the "emacs versus vi" editor debate.
But unlike "emacs versus vi", we cannot just agree to disagree. We can each choose to use a different IDE.
But the source code is often shared, and if there's one thing that's worse than a source file indented with tabs, it's a source
file partially indented with tabs and spaces. This is typically the result of a file edited by multiple users.
My advice is simple: Always use spaces to indent. That doesn't mean you can't use the Tab key in your keyboard to indent - most tools will automatically do the right thing with spaces instead. In other words, the Tab key is the Indent key, not the Tab character key.
So why is it bad to use tabs instead of spaces?
There are several reasons. Obviously, there's the reason I started out with: that we really need to pick one convention. Spaces for indentation is the most common scheme used to today, so it's a reasonable choice on that basis alone.
One of the problems with tabs is that a tab character needs to be converted into whitespace by the editor when displaying the file. How much whitespace should each tab character be replaced with? In an ideal world, the old typewriter functionality could be used, where each tabstop had a certain pixel position. That way people could even use proportional width fonts in their editors (instead of the blocky monospace fonts used by practically all code editors today), and the code would still indent nicely. However, no editor that I'm aware of supports this, so that's not a practical venue. Instead, editors typically make an assumption that a tab is either 8 characters (common in ye old days) or 4 characters (common in Java editors today). Some editors will stick with the 8 character assumption, but support 4-character indents in Java (which is common), so when indenting to level 3, they will insert a tab, followed by 4 characters, to get a 12 character indent using an 8-character tab.
Why is this bad? Because code is viewed in more than one tool. In the presence of tabs, code often gets misaligned. Code integration e-mail diffs, code viewed in other editors, code edited by other tools which treats tabs in a different way will easily get "mangled" (e.g. start getting mixed spaces and tabs).
(Sidenote: In the old days, source files sometimes included a comment at the top of the file, with special "tokens" (
-*-) intended for Emacs. These tokens would identify the language mode as well as the intended tab size for the file. When loading the file, emacs would use the specified tab size. Thus, the source files actually carried the tab information needed to edit the file as intended. However, this solution doesn't really solve the problem since all other tools which process and display the file would also need to be aware of this metadata.)
I've heard people put forward two arguments in favor of using the tab character:
- If a file uses ONLY tab characters for indentation, it is easy for users to read code at their own favorite indentation level.
In other words, I can read your source file with tabs=4, you can read it with tabs=2
- It's easier to move back and forth indentation levels, since a single left/right keystroke will jump across tab characters, e.g.
whole indentation levels.
Regarding argument 1: There are lots of other things I want to customize when I read other people's code too. You see, people don't all agree with my code rules that I'm putting forth in these blog entries :-) So if I read code that is indented poorly, or worse yet put spaces between function calls and the parenthesis, or other horrible coding sins, I hit Shift-F10 to reformat the source properly first anyway. This solution is more comprehensive than simply adjusting the indentation depth.
Regarding argument 2: I don't see a big usecase for being able to move the caret up and down indentation levels. These only apply at the beginning of the code line, and the Home key should alternate between jumping to the beginning of the line and the first nonspace character on the line. Why would you ever need to go somewhere else? Perhaps you want to move some code up an indentation level. That's what the Reformat feature is for. Just reformat the buffer instead.
(Minor sidenote: In
Emacs, and I believe in JBuilder, the Tab key was bound to a reindent action, NOT inserting indentation. This is a much better use of the Tab key. When you're on a new line, pressing Tab should move the tab to the correct indentation level (reindent), NOT inserting say 4 characters. If you're on a line with existing code, hitting Tab should NOT insert 4 characters where the caret is located, it should adjust the line indentation such that it's correctly indented. Thus, if I put an
if block around a piece of code, I can just hit Tab, Arrow Down a couple of times to indent the block correctly. I submitted a
patch for NetBeans to do this a while ago but this behavior is apparently a bit controversial. For a previous XEmacs user like myself it's indispensable.)
Therefore, in my opinion, these potential advantages do not make up for the massive problems and ugly code that result.
Let's all use the same convention - no tabs.
All IDEs let you do this. (I even believe most IDEs default to using spaces. Please double check in your own.)
Here's the option in the new NetBeans 5.0 options dialog:
The people who seem to rely the most on Tabs today are people using old-style editors where Tab characters are still the default.
If you're using Emacs, add the following to your .emacs file:
Here's how you do the same thing in Vim.