This document is very far from being finished. Further, a
sizable part of
the contents (in particular concerning Exception
s) were written on paper
during a period of computer malfunction, and were
subsequently added in bulk with sub-optimal integration.
I hope to improve this over time; however, as I have left Java development, it is not a high personal priority.
Much of the contents were written as early as 2009 and might reflect a different technical situation than the time of reading. While the general principles are unlikely to be much affected (and are often not Java-specific to begin with), some details might need to be taken with caution.
There are many existing style guides for Java (e.g. http://java.sun.com/docs/codeconv/html/CodeConvTOC.doc.htmle); and this page is not an attempt to provide yet another. Instead, it should be considered an add-on to existing style guides, with some finer points typically not included in these. It further contains some particularly important points that would typically be included, and a few issues that, IMO, are often handled sub-optimally elsewhere.
Generally: Style guides are, by their very nature, highly subjective; and there is often no one right answer. Instead, the answer will depend on personal preference, circumstances that vary from organization to organization, backwards compatibility and what developers are used to, etc. In particular, the ideas in a separate discussion of rules and rule-breaking applies. Nevertheless, a “usually the best way to go” can often be found, and this page should be seen as an attempt to identify such cases based on rational criteria.
The recommendations will often not be limited to Java, but reflect more general principles; however, care should be taken not to compare apples and oranges, when generalizing to other languages/areas. A particular issue is that different languages might have developed different idioms that can be contrary to general principles, but be so established that it would be foolish to ban them.
Note that there are cases below where it can be disputed whether “style guide entry” or e.g. “best practice” is the better term. (However, the legitimate scope of a style guide is not limited to naming conventions and layout.)
Further note that there are cases where terminology is not ideal. For instance, cf. an early item, “variable” is often used in a technical meaning in Java that is not compatible with the idea of something that is variable, i.e. varies, leading to paradoxes like a variable being declared final. For the time being, I have made no greater attempts to find better solutions in such situations, as the intent should always be clear and as natural language can be treated a bit more casually than code, humans more casually than compilers, etc.
Below is a continually growing list of recommendations with a corresponding rationale.
Declare any new variable as final
, unless it is known in advance that it
will be changed. If and when a legitimate need to change it arises, then remove
the final
.
Rationale: This allows the compiler to provide checks against accidental changes, gives other developers a hint not to make unwise alterations, and prevents re-use of variables (a common source of bugs).
Beware, however, that a declaration as final
only affects which object
is pointed to: The object can no longer be switched for another; however, it
can still be altered.
General principle: Help the compiler to help you!
This principle is very important in its own right and should be kept in mind both when reading the below (even when it is not explicitly mentioned) and when actively working, as there might be many cases where it applies that are not listed here, and as it is independent of Java.
TODO There are many other examples not listed below. Add them.
What constitutes “legitimate” in “legitimate need to change it” is open to dispute. (Ditto for other uses of “legitimate” and similar words.) However, I recommend erring on the side of the conservative, especially as, per the above, it was not known in advance that the variable needed to be changed, which implies that any later change is disproportionately likely to be e.g. an illegitimate re-use—not to mention poorly thought through.
A similar idea can apply to methods/classes/whatnots, but should be viewed with
more caution. For instance, use of final
on a method can prevent an
accidental override, but it is not usually possible to know in advance what
non-private
methods might be legitimately overridden by someone else—and,
possibly, someone who does not have the option of removing that final
,
e.g. because he uses a fix library. Here, it is better
to make methods never intended for
overriding private
(resp. as restricted as possible, if they can not be made private
).
Of course, a deliberate override should always use @override
, and this opens the door for
automatic checking.
Reduce direct access to member variables to a bare minimum, preferably only to getter/setter methods and, possibly, constructors.
Rationale: A direct access might look as a convenient time saver for the developer, but it is also a potential source of bugs and maintenance problems. Instead, always use getters and setters to provide a “single point of access”. This makes it easier to enforce particular behaviors (e.g. preventing that a variable is set to null), allows centralized changes, and facilitates debugging and logging.
An access from
a class defined in another file should never use anything but getters and
setters. (Note that this does not apply to constants, e.g.
java.lang.Math.PI
.) Ideally, only one getter respectively setter per
variable is present; if several are used, only one should do direct access.
Note that with a good IDE, or an editor like Vim or Emacs, the extra work involved in setting up the accessor methods will be minimal; and even were it not, this would be a small one time cost.
Always use the this
keyword when accessing an instance variable or
method internally. Benefits include:
It is easier for a reader to differ between instance and class members.
If the programmer mistakenly thinks that a member is on the instance level, this gives the compiler a chance to catch that error. (Note that the worst-case effects of such errors are far from trivial.)
Many editors/IDEs help with code completion after an input of
this.
; starting immediately with the member name does not activate
this help. (And if it does, the number of potential matches is much
larger, which reduces the value of the help.)
Counter-argument (I actually once had a team lead who was
emphatically, but not, to me, convincingly, against use of this
):
Use of the this
keyword would be a sign that the
programmer does not understand the language sufficiently to work
without crutches. This is, of course, absolute nonsense. Not only is
there no such connection, but the argument also leaves later
readers out of the equation: One of the most important considerations, when
writing code, is the later readability and maintainability—and there
is no guarantee that later readers are on the same competence level as the
original author. Besides, no-one is infallible and “Help the compiler to help you!” applies.
Do not name variables according to schemes like theString
,
theName
, etc. (Other common and equally bad prefixes are “tmp” and
“a”.)
Rationale: Such schemes almost
always lead to uninformative and too generic names of variables—use
descriptive names like textToTranslate
instead. An additional
complication is that there is no good way of handling collections: compare e.g.
aNames
or aCollectionWithNames
with namesToIndex
.
A common rationale in favor of these prefixes is that their use
would make it
easier to keep variables of different kinds apart
(e.g. by using “tmp”
for method-local variables and “the” for method arguments). This is partially correct,
but the benefit is not large enough to outweigh the poor names. Further, if
this benefit is really wanted, it can be better achieved
by using a prefix without “natural language”
meaning, which is affixed to an already formed logical name. For instance,
“l” for local variables and “p” for arguments/parameters can be used.
(Not, however, “a” for “argument”—this could lead to confusion with
the indefinite article.) Instead of e.g.
theName
and tmpName
, we would now have pNameToIndex
and
lCanonicalName
.
Why not simply instruct the developers to make sure that the names are informative, even when they do begin with e.g. “the”?
Simply put, it will not work:
I have seen such schemes used in practice on
several occasions, and more often than not the names degenerate into
prefix–type combinations like aString
, aCollection
, etc. Even
“tmp”, which might seem more abstract, has this problem, because a semantic
connection is made and the “tmp” is seen as an actual part of the logical
name. The advantage of entirely abstract prefixes, like “l”, is that this
problem goes away—at least as long as the “l” is not explicitly read out as
“local”. (Why not use e.g. “q” to avoid even this risk? A prefix without no
connection whatsoever would increase the risk for mistakes and inconsistencies:
With e.g. “l” both the coder and later readers can easily recognize the right
category based on the prefix.)
This is an example of the general principle that style guides should make it easy for developers to do their job well. Cruise controlsw and alcolocksw in cars are good examples of the same principle—when they work correctly. Unfortunately, some not uncommon guidelines, like the naming schemes above, do the opposite—like a cruise control that automatically increases the maintained speed by 5 km/hour at random intervals.
Another issue is the odd names that can result
from having several variables (etc.) of the same type, e.g. the common combination of aX
and anotherX
. And what happens when we have three or more Xs? aThirdX
?
A sometimes common practice is to signify interfaces with an “I”
(e.g. ICircle
and ISquare
), with the “plain” names reserved for
the implementations (e.g. Circle
and Square
). Variations include
CircleI
. Do not follow this bad practice.
Rationale: A general principle in object-oriented development is to always
program against interfaces and to avoid explicit implementation names in any
public context. By implication, if a variable corresponds to a circle, if a
method returns or operates on a circle, etc., then the interface for circles should be used,
not a specific circle implementation. In this situation, a name
for the interface like ICircle
is cumbersome, redundant, and misses the
point; Circle
, OTOH, is spot on. Additionally, an interface might have
several independent implementations, and using Circle
for exactly one of
these introduces a lot of arbitrariness. (Using Circle
for more than
one, through different packages, would in turn be a source of confusion and
errors.) Lastly, it is possible that a class hierarchy is refactored to make
e.g. a class into an interface or vice versa (this should, obviously, be
avoided for externally used classes/interfaces). The forced additional name
change is an unnecessary complication, and lacks any logical reason.
The last points to a subtle complication: An interface in a generic sense need not correspond to an interface in the specific Java sense. (However, for any non-trivial or widely used hierarchy, I recommend that they do.)
The rationale for the bad practice is usually along the lines that the
developer knows whether he is dealing with an interface or not.
This, however, is a specious argument: Firstly, the developer
should not need to know this. Secondly, the same benefit can be
provided by using suitable names for the implementations (e.g. by including an
“impl”, “abstract”, “concrete”, or similar in the name; or by using a
more specific name, e.g. List
for the interface and ArrayList
for
an implementing class).
Prefer empty collections over null
collections, and only allow the
latter in cases where both are necessary for semantic reasons
(hypothetically for a value that can be unset, set-and-empty,
set-and-not-empty).
Rationale: The possibility of encountering a null
value forces the
developers to use additional if
statements to check for this case,
while an empty collection can often be handled generically by the same code
that handles the non-empty collections, e.g.:
Collection<Place> placesToVisit = ...
for (Place currentPlace : placesToVisit) {
currentplace.visit(this);
}
In fact, cases where empty collections need special treatment are very rare outside of UI programming, when this idiom is applied.
Premature optimizers might complain that a null
check is faster or
that the extra collections waste resources: This is a highly naive opinion
in all but the most extreme cases, and only goes to prove that they are a
danger to software development.
Do not duplicate the contents of constructors, but always have one constructor call another (typically fewer arguments -> more arguments) until the one “true” constructor is reached. An exception: When a sub-class wants to provide constructors from a super-class, it is often better to have each individual constructor call its super-correspondent; however, this is a judgment call, depending on the amount of logic that is or is not added.
Similar applies to methods of the same name, but different arguments. Here, however, errors are rare to begin with.
Rationale: This makes it easier to avoid duplication and to keep a neat logical structure. It also gives a single point of control.
Consider:
public class Gizmo {
...
public Gizmo() {
this(null);
}
public Gizmo(String name) {
this(name, null);
}
public Gizmo(String name, String inventor) {
super();
this.name = name;
this.inventor = inventor;
}
...
Note that going the other way (by having the two-argument constructor call the one-argument, which in turn calls the zero argument) would make for messier and harder-to-maintain code—even if it could look like a good idea of “incremental development” at a casual glance.
Some care must be
taken when the constructors have arguments of different types or semantics:
For constructors with String name
and Name name
arguments a
simple conversion is almost always the best solution (the direction will
depend on the internals of the class); however, for highly different types
some other solution might be necessary. (Then again, if the types are that
different, there is a fair chance that they do not belong in the same class
to begin with, but should be put in two separate implementations of the
same interface. Generally, an excess of constructors is often a sign
that the class tries to do to much.)
Any design decision that is not obvious should be documented, including (but not limited to) optimization tricks, unexpected data types, hacks of various kinds, choice of algorithm, whether something is a “for now” solution, what could break if a more expected solution had been chosen, ... Unfortunately, this is something that even experienced developers tend to be too lazy to do.
Rationale: There is no guarantee that the current developer will do all the
future work—nor that any future developers will have psychic powers. It is
also quite possible that the original developer has, himself, forgotten his
reasons a few months later. The result, more likely than not, is that later
changes unintentionally break something, be it by causing a bug or compatibility
problem or by worsening performance by a factor of ten. If nothing else: By
explicitly telling others why a decision was made, the original developer can
avoid looking like a fool in the eyes of others (e.g. for having used a
Vector
instead of an ArrayList
). Note that this is not limited
to technical decisions, but also includes e.g. domain requirements of an unexpected
nature.
I once read a very illustrative anecdote on this topic: A family recipe for pot roast included cutting of large parts of the roast. The latest family member in line to cook was confused as to why this done. Several older family members were consulted, before the habit was traced to a still living member of an older generation. Her eventual answer: “I do not know why you do it. I did so, because otherwise the roast would not fit into my pan.”
I have tried in vain to dig up the original reference for this story. However, interestingly, a Google search for “"pot roast" "why you do"” yielded three apparent references as the first three hits—with two different versions and an unreadable “Google Books Result” which might have contained a third.
Correspondingly, the factuality of the story might be low. Nevertheless, the principle holds.
Within reasonable limits, code units should be as small as possible. This includes, but is not limited to, methods, classes, and class files: If a method grows too long, split the contents over several methods. If a class becomes to large, try to remodel it (likely, it has been given too much responsibility—a sign of poor modeling); if this fails, divide the work onto internal helpers. If a file is too large, break it up into smaller units (this includes files in general, e.g. HTML, XML, build files for ANT, and MS Word—although the last is easier said than done...).
Rationale: Larger units are harder to read and overview, making a division forces the developer to think interfaces through, calls to well-chosen method names are much more informative than the corresponding blocks of code, smaller units reduce the risk of conflicting changes, smaller units facilitate re-use and centralization of control, ...
A counter-argument that I have occasionally heard is that, contrarily, having all code in one place (e.g. a method) would make it easier to overview. These developers have been weak in their ability to group information, abstract things, and similar; and have relied on having the code immediately in front of their respective nose in order to process it linearly. Linear reading, however, is not the way to read code—code is not a novel. A good developer learns to think more abstractly, divide the code into distinct chunks that are investigated separately, etc. For him, a higher degree of division (using good names) will be helpful, not detrimental. (Indeed, someone who sets out to write quality code, with good “chunking”, proper separation of concerns, etc., will only rarely run into the danger of some unit growing so large as to need division into smaller units.)
As for the linear readers: They show a sufficiently large deficiency that it is only acceptable in beginners; and when someone who has worked as a developer for more than, possibly, a year has not learned non-linear reading, then he should seriously reconsider his career choices.
The above is a completely different topic from spaghetti code, which often forces the reader to follow a linear thread-of-content that is distributed in a file in a non-linear manner. What is discussed here is the ability to get a higher level overview and investigate lower levels if and when the information becomes relevant. Spaghetti code resembles gamebooksw; the above, a book with a good table of contents, an overall overview, individual summaries for each chapter, descriptive names for each chapter/section, etc.
A recommendation to enable linear reading over spaghetti-code reading has my full support—but it is even better to strive directly for support of “chunking”.
... or at least be very careful when using them.
Rationale: These might not be understood by everyone else, typically fit poorly within the framework of Java (more generally, any “foreign” language), and often lead to absurd code.
Example: One of my previous employers started as a SmallTalk specialist. When a move to Java was made, the then developers missed several features present in SmallTalk (notably, anonymous code-blocks), and tried to emulate them in Java. The result was many cases of hard-to-read code, with disadvantages far outweighing the advantages. (Although they might have worked as long as everyone had a solid SmallTalk background. This state, however, did not last long; and ten years later...)
In fact, they were so stuck in their SmallTalk thinking that they used
formulations like “send object X the message Y” when discussing the Java
method call X.Y()
in documentation intended for the public. Not only
does this indicate an inability to think in the proper paradigm, but it is also
highly likely to confuse readers—most of whom will not be familiar with
SmallTalk, but will often be familiar with message passing in other
senses (e.g. between two threads).
Notably, this applies even if the paradigms are beneficial in and by themselves: The anonymous code blocks of SmallTalk, e.g., are wonderful—in SmallTalk. That language has built-in support for them; Java does not. In an analogy: It can be disputed whether electric razors are superior or inferior to classic ones; but the point is moot when no electricity is available.
Always wrap third-party products in an abstracting interface—including logging, DB-access, and similar. Notably, for a first time use, a perfunctory, internal mini-class might be enough.
Rationale: There is no telling when a change might be needed, and single points of control are invaluable. This will also help in keeping modularization up, and to keep “feature discipline” high (developers will think twice before using features of the third-party product that are not available elsewhere). A common issue is that the third-party products are often too low-level for them to be used directly.
Their being low-level is not in any way wrong—it can even be necessary in order to preserve sufficient flexibility. This does not mean, however, that it is always a good decision to use the provided interfaces directly; instead, they should be considered a generic framework to build the needed own interfaces.
Here, however, it is important to keep the own interfaces sufficiently abstract, so that a switch is actually possible later on, and so that the local developers do not have to deal with the low-level details of the foreign framework (except, of course, when working on the wrapper).
The built-in language features and standard libraries, e.g. file access, can almost always be used directly; however, exceptions can exist. For instance, logging in Java has two frameworks of considerable popularity (the built-in and Log4J), and there is a non-trivial probability that a switch or dual use will occur in any one application. Even when a single framework is used, a new version of that framework might have a different interface from the old version—do we now change each and every use in a gazillion classes or do we change a single wrapper class? Further, the extra layer can be highly beneficial for other reasons, e.g. simplifications and to have a single point of control.
Explicitly state input/output conditions, how null values are treated, and similar. In particular, state which conditions on input are actively checked and which are (at least potentially) silently assumed to be correct.
Similarly, explicitly document which methods are idempotent and which are not,
whenever relevant (e.g. for a deleteItem
method, but not e.g.
getXXX
).
Rationale: A developer must be able to know and rely upon the behavior of the existing code that he uses. Note, in particular, that methods with similar names and conventions do not always behave similarly—often they vary even within in the same team or for the same developer. Making an explicit statement helps overcome this problem. Any unstated side-effects (and similar) must be sufficiently trivial that they will only be relevant in extremely rare cases (say the writing of something to a log file, which could cause a timing problem to be triggered, but will almost never do so).
The issue of what checks are performed is not secondary: If a method
actively checks for its pre-conditions, developers can rely on these checks
to catch any errors they make (at least after writing test cases with a decent
coverage); if it does not, they must themselves take corresponding
precautions—and they must know of this need in advance. Notably, illegal input
to a (non-checking) method need not lead to an Exception
or other visible
error, but can result in a data inconsistency or other hard-to-detect
problem.
Obviously, consistency is something to strive for and it is highly recommended that the local style guide lay down a standard (and for the developers to adhere to that standard...); however, even then there will always be methods and cases that do not fit the corresponding scheme—and external users of a certain module might follow a different convention.
Do use checked Exception
s.
Rationale: Unchecked Exception
s are almost always a bad idea, because the
developers have no reliable way of being aware of them, catch them
appropriately, and so on. In effect, an unchecked Exception
will be caught
when the first “catch-all” occurs, where it can rarely be treated in a
good way—or it will not be caught at all. (Even giving a good error
message to the user might unnecessarily be impossible.) As a side-effect, this
makes it harder to build good hierarchies of Exception
s and to use module-specific
Exception
s.
A good test is whether anyone in the course of normal development (as
opposed to e.g. testing) could reasonably want to catch and handle the
Exception
: If so, he should be helped to do so.
(Note that a “catch-all” should be a last resort for unexpected events,
nothing more. Further, that even a hypothetical tool that informs the
developer of the currently thrown unchecked Exception
s will be extremely
unreliable, because it can guarantee neither completeness nor immunity to
future changes.)
Apparently, there is a school of thought strongly opposed to checked exceptions—and, unfortunately, this school is gaining popularity.
So far, knock-on-wood, its arguments have not been convincing, often leaving
me with the impression that this is a matter of short-term personal
convenience—save five minutes for me now; lose ten minutes for several others
down the line. (Something which goes against what I see as the core of
software development: Long-term laziness. Generally, many who claim to be
agile developers seem to be using agility as an excuse to be lazy in the short term,
to be “cowboy coders”, whatnot, rather than applying it for the benefits that it
legitimately can bring.) Some of the arguments used
are implicitly dealt with below, e.g. complaints about methods that have
nothing to do with X declaring that they throw X-related Exception
s (a result of
poor propagation—not of checked Exception
s).
One argument that does have some validity
is that there are Exception
s that are naturally fatal—no reasonable
way of recovering exists. However, this is rare, the one developer cannot
always determine what is reasonable for another, and short of outright Error
s,
this is a tricky area, where it is better to err on the side of checked Exception
s.
If in doubt, if an Exception
truly cannot be handled, it can always be re-thrown
(as is or after wrapping). Then there is the complication of where such an Exception
should end (e.g. whether it should kill the running programming, be caught and logged in a main
method, be caught and logged elsewhere). As this “where” need not be the same for all types
of Exception
, even within the same application, a good decision might not be possible with an untyped Exception
.
From another point of view, the original rationale for unchecked Exception
s
was that (a) to e.g. check for an IndexOutOfBoundsException
upon each and every index-based access
would be extremely cumbersome, (b) the Exception
is exceptional even by the standards of Exception
s—only
very rarely does an own method have a similar issue (not already covered by existing RuntimeException
s) and
these come close to being the opposite of the “naturally fatal” Exception
s.
(That a method passes on unchecked Exception
s that it encounters can be quite in order.
Ditto that e.g. an own Array
class throws an IndexOutOfBoundsException
when it is
called with an out-of-bounds index, respectively some other own class/method throws another existing unchecked Exception
when it fits the situation. Both are different matters entirely.)
Let a module throw only its own (checked) Exception
s; not just pass
through the Exception
s of others, but catch and wrap. (The latter will
usually also be checked; however, catching-and-wrapping unchecked Exception
s can make
sense in some cases.)
Rationale: This makes for a more consistent, simpler, and easier-to-handle
interface. In particular, any developer using the module will know that he has
only a limited set of Exception
s that he needs to be concerned about. Further,
the change resilience is increased, in that no additional effort is needed when
something changes in a lower layer.
Use fine-grained Exception
s (in a suitable tree-hierarchy) for any module.
While it is rarely possible (or a good idea) to give every
special case its own class (or interface), at a minimum, every major category that actually occurs
with the module at hand should
have one (hypothetically: IO Exception
s, security Exception
s,
incorrect-input Exception
s—this will vary from case to case). Particularly
important individual Exception
s should also have own classes. When in doubt,
err on the side of too many individual classes.
Rationale: This enables other classes to make their treatment correspondingly fine-grained. They might choose not do so, but the decision should be left to them, not you.
While it is important to not get carried away with the number of classes, arguments around e.g. the effort
needed are usually misplaced, as much can be done with code generation and as the amount
of work per Exception
class/interface can be quite small: usually, no actual
functionality is needed beyond that already provided by standard Exception
s
and the role is more to give a marker that allows for discrimination.
Never use catch–do-nothing: Minimum is a log message. (Some minor
exceptions can be allowed, like failed numerical conversions, where the
Exception
is used as a flow control feature, rather than a “true”
Exception
—this, however, is usually a bad practice.)
Rationale: With a catch–do-nothing, there is no way to find out what went wrong after the fact. Indeed, it will often be hard to find out that something went wrong at all.
Configure logging to send emails (or similar messages) upon any Exception
(or other noticed problem) above a certain severity.
Rationale: Too often, log files are not checked (time constraints, lazy co-workers, a sloppy mentality). Even more often, they are not checked in a timely manner. The result is errors that remain undetected and uncorrected, that are detected too late, that re-occur half-a-dozen times even though they could have been fixed after the first occurrence, etc. Explicitly sending an email reduces the risk for this considerably—and can even allow developers to correct a problem before the users become aware of it.
A case in point: I was once a member of a team that developed and maintained one of Germany’s leading online-auction platform for a customer. My code was written to send emails, as discussed above. Once I received an error email triggered by someone at the customer’s playing around with the newsletter interface. The error was entirely uncritical for (and probably went unnoticed by) him—but for us the alarm bells went off: The error was caused by the ISP’s failure to mount all NFS file-systems correctly. If we had not noticed this, a major malfunction would have occurred at a later stage; as is, we called the ISP and had it correct the error before any non-trivial problems occurred.
Note that for consumer applications this must be restricted to development phases or be made contingent on explicit consent by the users: Sending emails without the user’s knowledge and full control is evil. Further, there is always a risk that one single code error results in a hundred thousand emails, once the application has been released.
A FileNotFoundException
(the same applies, m.m., to other specific
Exception
s) should only be thrown by methods that have a very clear
association with files, e.g. a convenience method around Java IO file calls.
In contrast, a generic resource or stream class should throw a special purpose
Exception
, e.g. ResourceNotFoundException
, when a needed file (stream, database
entry, whatnot) is not found. This
Exception
, in turn, should wrap the original Exception
(if one exists).
Rationale: Doing anything else leads to poor abstraction, forces other classes to do a lot of special-case handling, etc.
Make sure that each module has its own Exception
s hierarchy with one “root” per
module (which, on occasion, might inherit from a more abstract root—if
logically reasonable). If several roots are needed, this is usually a sign that
the module does too much and should be split into several; however, the
existence of several sub-roots, to justify the different conceptual needs
of the module, is entirely acceptable.
Another module should inherit the hierarchies only when this is conceptually
justified—not when it happens to be convenient. Using the same Exception
will hardly ever be justified.
A common recommendation is some variation of “Handle Exception
s/errors as early as possible. Do
not let higher levels see the problem.”. This recommendation is
fundamentally flawed: A problem should be handled at the most appropriate
level—and this can be quite high:
Consider e.g. a file not found error after the user has manually requested that
a specific file be opened in an editor.
Here, telling the user that the file requested could not be found is the right
thing to do; to log that the file is missing, to open an empty dummy file, and then to proceed as if nothing went wrong, would
be inexcusable.
The pertinent issues are who knows most about the current situation, who
could find a workaround during runtime, who could fix the error, who needs to know,
etc.
Obviously, this does not mean that an Exception
should be propagated
willy-nilly; the relevant information should be brought to the right point in a
controlled manner. Further, the right point can be where the problem
is first detected.
Another problem is that this recommendation is often misinterpreted to forbid a propagation at all—or to require hiding vital information by a try-catch-log-continue. This is only very rarely an acceptable, let alone appropriate, solution.
(See also my discussion of error messages.)
When logging Exception
s, make sure to include the stack trace and all
“recursive” Exception
s.
Rationale: The stack trace is a developer’s best friend when it comes to debugging. Not having the corresponding information, or only having it in an incomplete form, makes debugging much harder (and is extremely frustrating).
Some developers like to complain that this or that code convention leads to unnecessary work. This, however, is seldom a valid concern: If a developer wants to save time for himself now, he does so at the price of time lost later—often by others, often through more difficult maintenance or unnecessary bugs, often outweighing the short-term save by several orders. Notably, this “later” can be in the near future, often even the same day...
As a general rule, good software developers focus on making life easy for others, not for themselves: They make sure that they choose good names, they write comments, structure their code well, write test cases, etc.—even if this slows them down in the short term. The reason: They know that this extra effort will pay off in the long run.
Being able to hold the workings of a nightmare class in one’s head, does not make one a good developer—making sure that no class is a nightmare, so that no-one needs to, does. Also note that as good as everyone is wrong, who considers himself intelligent enough to handle the complexities of software; and that even if he were right, not all of his colleagues would be.
A sometimes heard argument against use of some conventions is brevity, e.g.
that use of this
or accessor methods leads to cluttered code or too long
lines; however, this should not be an issue for good coders: Problems arise
when statements are too complex, line-breaks are not used correctly, and
similar. In effect, if adding a this
in front of a name leads to
problems with reading or maintaining the code, then the code already is in a
severe need of clean-up.
Neither should the additional keystrokes be an issue: Any negative effects here can be neutralized by learning touch typing or how to use macros; further, the additional hints for code-completion systems will often lead to a net decrease in the number of key strokes.
There are a number of tools available for automatic checks of compliance to a certain style. These make it easy to check for and enforce the right use; and will help in avoiding future problems. Checkstylew is one example of such a checker, also available as an Eclipse plug-in.
Notably, the single greatest problem with style guides is how to get developers to follow them: Attitudes like “I know what I am doing, and do not need a style guide.”, “I don’t give a f-ck!”, and “Style guides are good in theory, but I do not have the time and memory to learn ours by heart.” are very common—never mind innocent oversights and errors. Just mandating that the style guide be followed will not work with all employees, threatening with repercussions is likely to backfire, and attempts at creating awareness and understanding are time-consuming and will not work with everyone. By instead having an automatic compliance check, e.g. in daily builds or as a pre-requisite before it is (technically) possible to commit a change set, these problems can be reduced—while at the same time making it easier for the developers to follow the style guide through direct feedback to deviations.
It can be interesting to have a closer look at the attitudes mentioned:
“I know what I am doing...” can actually be true; however, overlooks the benefits of consistency throughout a group of developers or a product. There are often several equally worthy solutions, yet keeping to one of them throughout makes it easier for individual developers to read and understand each other’s code, and reduces the risk of misunderstandings. Further, in my experience, a clear majority of those who consider themselves competent, or even highly competent, usually are not; thus, merely having this attitude does not automatically put the developer in the group where a limited justification of the attitude is possible.
A related complication is that having a “You do what you want!” rule for one developer and a “You obey the style guide!” rule for another can lead to unnecessary conflicts or make the latter insist that he, too, be allowed to ignore the style guide.
“I don’t give a f-ck!” (and, yes, I have encountered this sentiment—usually in incompetents) indicates someone with a severe attitude problem. Unless there are mitigating circumstances, or the developer is extremely competent in other regards, his position in the company might need re-thinking. Actually getting someone of this mindset to change is very hard. (However, I stress that no-one should be let go without having been explicitly told what the problem is, and given an opportunity to change. Further, great care should be taken to ensure that the mindset has not been misunderstood.)
“Style guides are ...” is often an excuse; however, when it is honestly meant, the possibilities of a change in attitude are comparatively large. Apart from use of tools (as above), it can help to stress that later time savings will outweigh the short-term problems, and that the difficulty of learning a style guide are comparatively small—in particular, if a gradual learning is allowed.
However, a dual lesson is that the author of a style guide should take care not to go into too much detail, and that he might be better off sticking to “industry standards” over finding his own rules for everything. In this manner, he reduces the time investment needed to master the style guide.
Why not simply ensure that the code is automatically formatted to match the rules before every check-in? That way, everyone could write the way that he wants.
When we look explicitly at formatting (white-space, line-breaks) or some simple
issues (order of imports, whether non-mandatory braces are present, whether
this
is used, ...) this
can be a viable solution—and one that was not available in a
to-be-taken-seriously manner at the original time of writing. However, it is
not a complete solution, being unable to handle many other typical guidelines.
To e.g. determine whether a certain name is suitable will usually require
intelligent thought—and even apparently easy changes, say ensuring different
prefixes for parameters and local variables, can turn out to be quite tricky.
Further, there might be legitimate reasons why a particular guideline should be
ignored in a particular context, requiring intelligent discretion. (Say, that
the parameter names of a particular class deviate from a prefix rule in order
to match, identically, the names used in a conceptually related third-party
interface.)
A further complication is that the promised benefit will only work trivially in one direction: A user who wants “his” rules to apply in the files that he has checked out must himself implement a back-translation, might run into problems with comparisons between different code bases, might cause spurious changes through cycles of translation and back-translation, whatnot.
Yet another complication is that there is often reason to format some code in one of several style-guide compliant manners, which an automatic formatter could naively turn into a different compliant manner. (With variations on the same theme.) For instance, I have repeatedly had the issue that an automatic formatter insists on bringing parameters of a method onto the same line, as long as the resulting line does not exceed a given maximal length, after I had deliberately put them on more than one line for better readability or a better conceptual grouping.
In other words: Such tools are a great complement to a style-guide—but they are not a replacement.
In an addendum to the addendum: We have now reached 2023, and it does not seem unreasonable that AI’s might now or in the foreseeable future manage tasks like judging the quality of a name (cf. above). Then again, it is not inconceivable that much of software development will be reduced to an AI task as time passes, which could render not just this page but this entire category, maybe even this entire website, redundant.
The following is an automatically generated list of other pages linking to this one. These may or may not contain further content relevant to this topic.