<?xml version="1.0" encoding="us-ascii"?>
|
|
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
|
|
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
|
|
<head>
|
|
<link type="text/CSS" rel="stylesheet" href="style.css" />
|
|
<link type="image/x-icon" rel="shortcut icon" href="favicon.png" />
|
|
<meta http-equiv="Content-Type" content="text/html; charset=us-ascii" />
|
|
<title>Thalassa CMS official documentation</title>
|
|
</head><body>
|
|
<div class="theheader">
|
|
<a href="index.html"><img src="logo.png"
|
|
alt="thalassa cms logo" class="logo" /></a>
|
|
<h1><a href="index.html">Thalassa CMS official documentation</a></h1>
|
|
</div>
|
|
<div class="clear_both"></div>
|
|
<div class="navbar" id="uppernavbar"> <a href="cpp_subset.html#uppernavbar" title="previous" class="navlnk">⇐</a> <a href="devdoc.html#coding_style" title="up" class="navlnk">⇑</a> <a href="scriptpp.html#uppernavbar" title="next" class="navlnk">⇒</a> </div>
|
|
|
|
<div class="page_content">
|
|
|
|
<h1 class="page_title"><a href="">Coding style guide</a></h1>
|
|
<div class="page_body">
|
|
<ul>
|
|
<li><a href="#formatting">Code formatting, indentation, spaces and the
|
|
like</a></li>
|
|
<ul>
|
|
<li><a href="#sacredrule">The sacred 80 column rule</a></li>
|
|
<li><a href="#indentation">Basic indentation</a></li>
|
|
<li><a href="#braces">Curly braces placement</a></li>
|
|
<li><a href="#longlines">Breaking up long lines</a></li>
|
|
</ul>
|
|
<li><a href="#alphabet">Alphabet and language</a></li>
|
|
<ul>
|
|
<li><a href="#asciionly">ASCII only</a></li>
|
|
<li><a href="#english">English only</a></li>
|
|
<li><a href="#Identifiers">Identifiers</a></li>
|
|
</ul>
|
|
<li><a href="#restrictions">More restrictions</a></li>
|
|
<ul>
|
|
<li><a href="#typedefs">No committee-invented typedefs</a></li>
|
|
<li><a href="#sideeffects">Side effects</a></li>
|
|
<li><a href="#goto">Goto is only allowed in two situations</a></li>
|
|
</ul>
|
|
</ul>
|
|
|
|
<p><hr />
|
|
</p>
|
|
<h2 id="formatting">Code formatting, indentation, spaces and the like</h2>
|
|
|
|
<p>First of all, no auto formatters, such as the well-known GNU
<code>indent</code> program, are allowed. The rules from this section must
be obeyed continuously, which means that your code <em>at any given
moment</em> must be rules-compliant. Once you have done something to the
code so that it is no longer compliant, you <strong>must not</strong> do
anything else until you have made it compliant again.
|
|
</p>
|
|
|
|
<h3 id="sacredrule">The sacred 80 column rule</h3>
|
|
|
|
<p>Thou shalt not cross 80 columns in thy file.
|
|
</p>
|
|
<p>Once again: <strong>Thou shalt not cross 80 columns in thy file.</strong>
|
|
</p>
|
|
<p>If you use tabs for indentation (which is <strong>not</strong> recommended,
|
|
but still allowed), the 80 columns rule must be obeyed for 8-column tabs.
|
|
</p>
|
|
<p>In fact, it is recommended to keep lines no longer than 75 columns, but
if you really need to, 78 is still okay. Even 79 is still okay. 80
is not okay, but, well, tolerable. <strong>For lines of 81 columns and
longer, a zero-tolerance policy is in effect.</strong>
|
|
</p>
|
|
<p>If your line doesn't want to fit into this limit, see the section devoted
|
|
to <a href="#longlines">long lines</a> for further instructions (spoiler:
|
|
no, there's no exception for the sacred 80 column rule).
|
|
</p>
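<p>To make the arithmetic concrete, here is a minimal sketch of how the width
of a line could be measured with 8-column tab stops. This is a checker, not
an auto formatter; the names <code>line_width</code> and
<code>line_too_long</code> are hypothetical and are not part of Thalassa CMS.
</p>

```c
enum { tab_width = 8, max_columns = 80 };

/* Display width of a line (up to the newline or the end of the
   string), assuming tab stops at every 8th column. */
int line_width(const char *line)
{
    int col = 0;
    int i;
    for (i = 0; line[i] && line[i] != '\n'; i++) {
        if (line[i] == '\t')
            col = (col / tab_width + 1) * tab_width;
        else
            col++;
    }
    return col;
}

/* Nonzero if the line occupies 81 columns or more. */
int line_too_long(const char *line)
{
    return line_width(line) > max_columns;
}
```

<p>Note that a single tab after two ordinary chars moves the position to
column 8, not to column 10, which is why tab-indented lines hit the limit
sooner than they seem to in an editor configured for narrower tabs.
</p>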
|
|
<p><div class="remark">
|
|
</p>
|
|
<p>People often argue there's no real reason to maintain the 80-column rule
|
|
nowadays, when monitors are wide and so on. Some even recall that the
|
|
figure of 80 in fact came from a punch card width; those people would tell
|
|
you the punch card epoch is over so traditions should be revised.
|
|
</p>
|
|
<p>Damn all the crap like this. To understand how misleading it is, just go
to your bookshelf (well, you do have some books printed on paper, don't
you? if you don't, then visit a local library or one of your friends who
still has books), take any arbitrary book, printed in any year from, say,
the 18th century to the present time, in any place in the world, in any
language, in any alphabet (well, not hieroglyphic, so a book in Japanese,
Chinese or Korean will not fit, but any of English, Spanish, Russian,
Armenian or Arabic will work; it doesn't matter that Arabic is written
right to left), open it on a random page, pick a line from somewhere in
the middle of the page, and <strong>count letters, spaces and punctuation
marks on that line</strong>.
|
|
</p>
|
|
<p>The result will be 40 to 75. With 40 to 50 letters per line, books are
often printed in a two-column layout; for single-column typesetting,
typical line lengths are from 58 to 67 “symbols” (including spaces), 73 is
rare enough, and it is absolutely predictable that you will never see a
book having lines longer than 75. This is because <strong>lines longer
than 75 symbols are hard for a human to read</strong>, and book publishers
have known this fact for centuries. That's why the well-known 80-column
punch card was so popular; other formats existed, but were rarely used.
The first four columns were usually occupied by the line number, one was
left blank, and the rest (75 columns, you see) contained the actual text.
The width of 80 columns was not in any way arbitrary, and nothing has
actually changed in the real reasons behind the 80-column rule since punch
cards became ancient history.
|
|
</p>
|
|
<p>So we repeat it once again: <strong>Thou shalt not cross 80 columns in thy
file.</strong> It is unfair to others to force them to keep their terminal
windows wider than the traditional 80 columns.
|
|
</p>
|
|
<p></div>
|
|
</p>
|
|
|
|
<h3 id="indentation">Basic indentation</h3>
|
|
|
|
<p>The recommended indentation is <strong>four spaces</strong>, but we
consider it acceptable to use two spaces, three spaces or one tab for
indentation. <strong>It is prohibited to use single-space indentation, as
well as more than four spaces or more than one tab</strong>.
|
|
</p>
|
|
<p>Also the following rules are to be strictly obeyed:</p>
|
|
<ul>
|
|
|
|
<li>no mixture of tabs and spaces is allowed; either you use spaces, or you
|
|
use tabs, but not both, and if your text editor replaces 8 spaces with a
|
|
tab, then change the editor;</li>
|
|
|
|
<li>the same indentation must be maintained within any single file; it is
|
|
also strongly discouraged to use different indentations within a single
|
|
“unit”, be it a program or a library;</li>
|
|
|
|
<li>tab is assumed to be 8-columns; if you use different tab stops in your
|
|
editor, always keep in mind others use 8-cols.</li>
|
|
|
|
</ul>
|
|
|
|
|
|
<h3 id="braces">Curly braces placement</h3>
|
|
|
|
<p>Curly braces that <strong>delimit a function body</strong> are placed like
|
|
this:
|
|
</p>
|
|
<pre>
int f(int x)
{
    /* ... */
}
</pre>
|
|
|
|
<p>and <strong>not</strong> like this:
|
|
</p>
|
|
<pre class="wrongcode">
|
|
int f(int x) {
|
|
</pre>
|
|
|
|
<p>The only exception to this rule is made for C++ methods in case the body
is placed right inside the class (or structure). Such a body must be short
enough (one line, maybe two, but <strong>never more than three</strong>),
and for the sake of compactness of the class header itself, it may be
formatted in other ways.
|
|
</p>
|
|
|
|
|
|
<p>Curly braces <strong>within control statements</strong> are placed like this:
|
|
</p>
|
|
<pre>
while (a != b) {
    /* ... */
}

if (a == b) {
    /* ... */
} else {
    /* ... */
}

do {
    /* ... */
} while (a != b);
</pre>
|
|
|
|
<p>There's one important exception from this rule, which will be discussed in
|
|
the section devoted to <a href="#longlines">breaking up long lines</a>.
|
|
</p>
|
|
|
|
<h3 id="longlines">Breaking up long lines</h3>
|
|
|
|
<p>There are a lot of cases where something doesn't fit on a single line of
code. One of the most important is when the head of a statement like
<code>if</code>, <code>while</code>, <code>for</code> or even
<code>switch</code> becomes too long because of the conditional expression.
In this situation we do the following:</p>
|
|
<ul>
|
|
|
|
<li>indent additional lines of the conditional expression by the usual
|
|
indentation step, e.g., four spaces;</li>
|
|
|
|
<li>enclose the statement's body with a block, even if it only contains one
|
|
non-block statement;</li>
|
|
|
|
<li>write the opening “<code>{</code>” on a separate line, precisely
under the first char of the statement's name. <strong>This is an exception
to the general rule that prescribes writing the “<code>{</code>” on the
same line as the statement's head</strong>.</li>
|
|
|
|
</ul>
|
|
|
|
<p>Together it looks like this:
|
|
</p>
|
|
<pre>
while (!the_collection->known_set->first &&
    the_collection->to_parse->first &&
    the_collection->to_parse->first->s == ' ')
{
    skip_space(the_collection);
}
</pre>
|
|
|
|
<p>What we explicitly disallow here are things like the following:
|
|
</p>
|
|
<pre class="wrongcode">
while (!the_collection->known_set->first &&
    the_collection->to_parse->first &&
    the_collection->to_parse->first->s == ' ') {
    skip_space(the_collection);
}
</pre>
|
|
|
|
<p>or like this:
|
|
</p>
|
|
<pre class="wrongcode">
|
|
while (!the_collection->known_set->first &&
|
|
the_collection->to_parse->first &&
|
|
the_collection->to_parse->first->s == ' ')
|
|
skip_space(the_collection);
|
|
</pre>
|
|
|
|
<p>or like this:
|
|
</p>
|
|
<pre class="wrongcode">
|
|
while (!the_collection->known_set->first &&
|
|
the_collection->to_parse->first &&
|
|
the_collection->to_parse->first->s == ' ')
|
|
skip_space(the_collection);
|
|
</pre>
|
|
|
|
|
|
|
|
|
|
<h2 id="alphabet">Alphabet and language</h2>
|
|
|
|
<h3 id="asciionly">ASCII only</h3>
|
|
|
|
<p>For most programmers around the world, this is obvious, but
unfortunately not for all; otherwise, all these “wide strings” would
never have slipped into language specifications.
|
|
</p>
|
|
<p>So here is the rule: any source file for any programming language, not only
|
|
for C and C++, must only contain chars from ASCII alphabet. See '<code>man
|
|
7 ascii</code>' for what ASCII alphabet is.
|
|
</p>
|
|
<p>Non-ASCII chars, such as Latin letters with diacritics, letters from
non-Latin alphabets (be it Cyrillic, Greek or whatever else), hieroglyphs,
math operators and so on, <strong>are not allowed in source code</strong>.
Not only are they prohibited in identifiers, where most interpreters and
compilers reject them anyway; <strong>they must also never appear
in string constants or even in comments</strong>.
|
|
</p>
|
|
<p>As a rule of thumb: a correct source file must, first of all, be
considered as a sequence of 8-bit bytes, one byte per character
(fortunately, not many compilers agree to work with committee-invented
“encodings” such as UTF-32, but things are bad enough that
this has to be mentioned explicitly), and these bytes can only have the
following values: 9 (tab), 10 (newline), 13 (carriage return, not
recommended but still acceptable), 32 (space), 33–126 (printable
ASCII chars). That's all.
|
|
</p>
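<p>The list of allowed byte values translates directly into a check. The
following is only a sketch; the function names
<code>is_allowed_byte</code> and <code>find_forbidden_byte</code> are made up
for this illustration.
</p>

```c
/* Nonzero if the byte value may appear in a source file:
   tab, newline, carriage return, space, printable ASCII. */
int is_allowed_byte(int c)
{
    return c == 9 || c == 10 || c == 13 || (c >= 32 && c <= 126);
}

/* Index of the first forbidden byte among the first len bytes
   of buf, or -1 if all of them are allowed. */
int find_forbidden_byte(const char *buf, int len)
{
    int i;
    for (i = 0; i < len; i++) {
        int c = (unsigned char)buf[i];
        if (!is_allowed_byte(c))
            return i;
    }
    return -1;
}
```

<p>The cast to <code>unsigned char</code> matters: on platforms where plain
<code>char</code> is signed, bytes above 127 would otherwise show up as
negative numbers.
</p>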
|
|
|
|
|
|
<h3 id="english">English only</h3>
|
|
|
|
<p>There's only one natural language to be used within source code, and that
language is English. Identifiers must be derived from English words: not
Spanish, not German, not Russian, not French, not Arabic, but English.
Comments must be written in English, or not written at all.
|
|
</p>
|
|
<p class="remark">
|
|
Damn, this has nothing to do with American or British chauvinism, even if
such chauvinism really exists (which is doubtful). The original author of
this text (and of Thalassa CMS) is not a native English speaker, and this
fact must be obvious to any native English speaker reading this text, heh
(sorry guys, I realize how disgusting it is to read a text written in your
native language by a non-native author).
|
|
</p>
|
|
|
|
<p class="remark">
|
|
The mere fact is that all programmers around the world understand English
at least to some extent, so English is THE language we programmers can
use to communicate with each other. It isn't so bad, as, among all more or
less popular natural languages, English is the simplest to learn.
|
|
</p>
|
|
|
|
|
|
<h3 id="Identifiers">Identifiers</h3>
|
|
|
|
<p>In plain C, <strong>all identifiers but macro names are written lowercase,
|
|
optionally using underscores to separate the words</strong>, like this:
|
|
<code>i</code>, <code>namelen</code>, <code>name_length</code> and so on.
|
|
</p>
|
|
<p>Please note <strong>all but macro names</strong> means exactly this: all
|
|
but macro names. Hence, <strong>enum constants are written in
|
|
lowercase</strong>. So, this is okay:
|
|
</p>
|
|
<pre>
|
|
enum traffic_lights { tl_red, tl_yellow, tl_green };
|
|
</pre>
|
|
|
|
<p>But the following is NOT okay, even though you might be used to things
like this:
|
|
</p>
|
|
<pre class="wrongcode">
|
|
enum traffic_lights { TL_RED, TL_YELLOW, TL_GREEN };
|
|
</pre>
|
|
|
|
<p>Macro names, and <strong>only</strong> macro names, are written
all-uppercase, with optional underscores, and must never be shorter than
five chars. So both of these are okay:
|
|
</p>
|
|
<pre>
|
|
#define MYMESSAGE "This is a message"
|
|
#define MY_MESSAGE "This is a message"
|
|
</pre>
|
|
|
|
<p>but all the following are NOT:
|
|
</p>
|
|
<pre class="wrongcode">
|
|
#define MSG "This is a message"
|
|
#define MyMessage "This is a message"
|
|
#define mymessage "This is a message"
|
|
</pre>
|
|
|
|
<p><strong>In plain C, mixed case in identifiers is never used</strong>, and
|
|
never means never.
|
|
</p>
|
|
<p>In C++, we use CamelCase for everything related to object-oriented
programming and abstract data types (BTW, you don't confuse these two
completely different paradigms, do you?). This means, effectively, that
CamelCase (okay, every word starts with a capital, all the other letters
are lowercase... hence, the first letter is always uppercase, is that
clear?) is used for names of classes and methods. And that's all.
|
|
</p>
|
|
<p>Structure names are written in CamelCase only when they are not, actually,
structures as they are in plain C; e.g., if your structure has
methods, or if it has some private members, then it is no longer a
structure. It is up to you whether to use the <code>class</code> keyword
for all such structures, or stick with <code>struct</code> sometimes, but
they are no longer structures, so please name them in CamelCase.
|
|
</p>
|
|
<p>Everything else, including</p>
|
|
<ul>
|
|
|
|
<li>functions which are not methods (even if they accept or return objects
|
|
of classes),</li>
|
|
|
|
<li>fields, that is, members that aren't methods, even if they are actually
|
|
objects,</li>
|
|
|
|
<li>variables, even if they are of a class type</li>
|
|
|
|
</ul>
|
|
<p>— is named all-lowercase.
|
|
</p>
|
|
<p>Please note we never use identifiers such as <code>isEmpty</code>,
|
|
<code>getValue</code>, <code>feedTheCat</code> and the like — that
|
|
is, mixed case starting with lowercase.
|
|
</p>
|
|
<p>Furthermore, we never use underscores in mixed-case identifiers.
|
|
</p>
|
|
<p>And one more thing: all <em>globally-visible</em> identifiers must be
reasonably long and as meaningful as possible. On the other hand, local
variables should have short names, with rare exceptions. For example, if
you're going to write a <code>for</code> loop with an integer loop variable
that just increments or decrements (maybe with an inc/dec step other
than 1), it would look stupid to give that variable a name any longer than
just <code>i</code>, <code>j</code>, <code>n</code> and so on. However, it
is strictly prohibited to use the 1-char identifiers <code>l</code>,
<code>o</code>, <code>I</code> and <code>O</code>, because they can be
confused with digits (yes, even the lowercase <code>o</code>, and yes,
there are a lot of people around who don't use syntax highlighting), as
well as any multichar identifiers that consist of only these four chars,
such as <code>Ill</code>, <code>IO</code>, <code>loo</code> and so on.
|
|
</p>
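<p>The ban on identifiers built only from <code>l</code>, <code>o</code>,
<code>I</code> and <code>O</code> is easy to express in code. Here is a
sketch; the function name <code>is_confusable_name</code> is hypothetical.
As a side benefit, the condition is long enough to show the rule for
breaking up long lines in action.
</p>

```c
/* Nonzero if the name is non-empty and consists only of the
   chars l, o, I and O, which are easy to confuse with digits. */
int is_confusable_name(const char *name)
{
    int i;
    if (!name[0])
        return 0;
    for (i = 0; name[i]; i++) {
        if (name[i] != 'l' && name[i] != 'o' &&
            name[i] != 'I' && name[i] != 'O')
        {
            return 0;
        }
    }
    return 1;
}
```

<p>With this sketch, <code>Ill</code>, <code>IO</code> and <code>loo</code>
are reported as confusable, while <code>i</code>, <code>n</code> and
<code>len</code> are not.
</p>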
|
|
|
|
|
|
<h2 id="restrictions">More restrictions</h2>
|
|
|
|
<h3 id="typedefs">No committee-invented typedefs</h3>
|
|
|
|
<p>Are you already used to all these <code>size_t</code>, <code>off_t</code>,
<code>time_t</code>, <code>uint32_t</code> and the like? Now (at least if
you work on Thalassa CMS code) please start avoiding these as far as
possible.
|
|
</p>
|
|
<p>Unfortunately, it is not <em>always</em> possible. For example, if you use
a syscall or a standard library function which accepts or returns a
<em>pointer</em> to such a type, you can blame the committee that invented
it, but you actually have to obey. Fortunately, it is unlikely you'll need
such calls (getgroups, accept, recvfrom and the like) in Thalassa CMS.
|
|
</p>
|
|
<p>The well-known <code>time</code> syscall gives a perfect example of a
|
|
situation where you <em>can</em> avoid these idiotic type names. Instead
|
|
of
|
|
</p>
|
|
<pre class="wrongcode">
|
|
time_t tm;
|
|
time(&tm);
|
|
</pre>
|
|
|
|
<p>please write
|
|
</p>
|
|
<pre>
|
|
long long tm;
|
|
tm = time(0);
|
|
</pre>
|
|
|
|
<p>(replace the <code>0</code> with <code>NULL</code> for plain C code; in
|
|
C++, keep the zero as it is <a href="cpp_subset.html#nokeywords">the</a>
|
|
representation for a null pointer).
|
|
</p>
|
|
|
|
|
|
<h3 id="sideeffects">Side effects</h3>
|
|
|
|
<p>There are two rules for side effects, each with one exception. The rules
|
|
are: </p>
|
|
<ol>
|
|
<li>no more than one side effect per <em>expression statement</em>;</li>
|
|
<li>no side effects in conditional expressions.</li>
|
|
</ol>
|
|
|
|
<p>The first rule means it is not good to write, e.g.,
|
|
</p>
|
|
<pre class="wrongcode">
|
|
x = v[n++];
|
|
</pre>
|
|
<p>Instead, two statements must be written:
|
|
</p>
|
|
<pre>
|
|
x = v[n];
|
|
n++;
|
|
</pre>
|
|
|
|
<p class="remark">
|
|
|
|
BTW, this means we never make use of the difference between
|
|
<code>i++</code> and <code>++i</code>, so we always write <code>i++</code>.
|
|
These STL addicts may argue we should definitely always write
|
|
<code>++i</code> instead, but the fact is that we don't use STL, so their
|
|
reasoning isn't valid for us.
|
|
|
|
</p>
|
|
|
|
<p>The obvious exception is when you need to call a function which has a side
effect but nonetheless returns something important as its return
value. In most cases we shouldn't ignore such values, and sometimes
attempts to ignore them make our program obviously wrong, as
with the <code>read</code> syscall. Hence, the very minimum we have to do is
to <em>assign</em> the value to a variable, and assignment is a side
effect, too. So, statements like
|
|
</p>
|
|
<pre>
|
|
res = func(arg1, arg2);
|
|
</pre>
|
|
|
|
<p>are considered valid, even though there are two side effects here, but the
expression in such a statement <strong>must only consist of the function
call and the assignment operator</strong>. No additional operators are
allowed, and no side effects are allowed in the function arguments, so the
following statements (provided that <code>func</code>, <code>foo</code> and
<code>bar</code> all have side effects):
|
|
</p>
|
|
<pre class="wrongcode">
|
|
res = func(arg1) + 1;
|
|
res = foo(bar(arg2));
|
|
</pre>
|
|
|
|
<p>are both not allowed.
|
|
</p>
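<p>One way to bring such statements into compliance is to give every side
effect its own statement. The sketch below assumes two hypothetical
functions with side effects, <code>next_ticket</code> and
<code>add_to_total</code>; neither of them comes from Thalassa CMS.
</p>

```c
static int ticket_counter = 0;
static int running_total = 0;

/* Hypothetical: advances a counter and returns its new value. */
int next_ticket(void)
{
    ticket_counter++;
    return ticket_counter;
}

/* Hypothetical: adds x to a running total, returns the total. */
int add_to_total(int x)
{
    running_total = running_total + x;
    return running_total;
}

/* instead of   res = next_ticket() + 1;   */
int ticket_plus_one(void)
{
    int res;
    res = next_ticket();
    res++;
    return res;
}

/* instead of   res = add_to_total(next_ticket());   */
int add_next_ticket(void)
{
    int tk, res;
    tk = next_ticket();
    res = add_to_total(tk);
    return res;
}
```

<p>Each statement now contains either one plain side effect or exactly one
call together with the assignment of its result.
</p>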
|
|
<p>The second rule means you must not write anything like this:
|
|
</p>
|
|
<pre class="wrongcode">
|
|
if (close(fd) == -1) {
|
|
</pre>
|
|
|
|
<p>nor like this:
|
|
</p>
|
|
<pre class="wrongcode">
|
|
if (-1 == close(fd)) {
|
|
</pre>
|
|
|
|
<p>Although the latter is better than the former, it is still bad enough,
because <code>close</code> has a side effect (actually, this side effect is
what it exists for, heh...), and there must be <strong>no side effects in
conditional expressions</strong>. However, there's one exception to this
rule, too.
|
|
</p>
|
|
<p>In practice, we often need to construct a loop according to the "get,
|
|
check, handle" model. Examples for such a loop are reading from a stream
|
|
and the main loop in an event-driven application; well, other examples
|
|
exist, too.
|
|
</p>
|
|
<p>The problem is that the check has to be placed between getting and
handling, which means “in the middle of the loop”. Programming languages
don't provide us with a statement for this; at best they provide loops
with a precondition and a postcondition, but not with an
“in-the-middle-condition”. So, what is better, this?
|
|
</p>
|
|
<pre>
n = 0;
c = getchar();
while (c != EOF) {
    if(c == '\n') {
        printf("%d\n", n);
        n = 0;
    } else {
        n++;
    }
    c = getchar();
}
</pre>
|
|
|
|
<p>Or, maybe, this?
|
|
</p>
|
|
<pre>
n = 0;
for (;;) {
    c = getchar();
    if(c == EOF)
        break;
    if(c == '\n') {
        printf("%d\n", n);
        n = 0;
    } else {
        n++;
    }
}
</pre>
|
|
|
|
<p>Or, well, finally this?
|
|
</p>
|
|
<pre>
n = 0;
while ((c = getchar()) != EOF) {
    if(c == '\n') {
        printf("%d\n", n);
        n = 0;
    } else {
        n++;
    }
}
</pre>
|
|
|
|
<p>Honestly speaking, all three are ugly. But the first version involves
duplication of the “get” in “get, check, handle”; we are lucky
if it is only a getchar, but consider the well-known <code>select</code>
syscall with all the preparations (such as filling in the sets, computing
the timeout until the closest time-based event, all that), and you will no
longer be happy with duplicating such an amount of code.
|
|
</p>
|
|
<p>The second version might look better, but when an average reader of your
program sees the <code>for (;;)</code> (or <code>while (1)</code>, no
matter), (s)he expects a real <em>endless</em> loop. It is okay for a main
event loop in an event-driven program, because in that case the loop only
ends together with the program itself, but for simple stream reading or the
like, it might look misleading.
|
|
</p>
|
|
<p>So, here is the exception to our second rule: it is only acceptable to have
a side effect within the conditional expression of a <code>while</code> loop
(but not <code>do-while</code>, nor <code>for</code>) in case the loop is
built according to the “get, check, handle” scheme and the side
effect corresponds to the “get”.
|
|
</p>
|
|
<p>Please note that there are no similar exceptions for <code>if</code>,
|
|
<code>switch</code>, <code>for</code> and <code>do-while</code>. Side
|
|
effects are NOT allowed in their conditional expressions.
|
|
</p>
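<p>To see the second rule and its only exception side by side, consider this
sketch built around the <code>read</code> and <code>close</code> syscalls.
The function name <code>count_and_close</code> is made up for the
illustration. The <code>while</code> head contains the “get”, which is
allowed; the result of <code>close</code> is first assigned and only then
checked.
</p>

```c
#include <unistd.h>

/* Count the bytes that can be read from fd until the end of
   file, then close fd.  Returns the count, or -1 in case
   either read or close fails. */
long count_and_close(int fd)
{
    char buf[4096];
    long total = 0;
    int rc, res;

    /* get, check, handle: the read call is the "get" */
    while ((rc = read(fd, buf, sizeof(buf))) > 0) {
        total = total + rc;
    }
    /* no exception for if: assign first, check afterwards */
    res = close(fd);
    if (rc == -1 || res == -1)
        return -1;
    return total;
}
```

<p>Note that the results of <code>read</code> and <code>close</code> are
stored in plain <code>int</code> variables, in accordance with the rule about
typedefs; the buffer is small enough for this to be safe.
</p>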
|
|
|
|
<h3 id="goto">Goto is only allowed in two situations</h3>
|
|
|
|
<p>Many people argue <code>goto</code> must never be used at all. Some say
exactly the opposite: that there's nothing wrong with <code>goto</code>
(well, at all). BTW, Linus Torvalds often says this in his interviews.
|
|
</p>
|
|
<p>Okay, they are wrong. Even Linus Torvalds.
|
|
</p>
|
|
<p>It is really easy to turn a piece of code into a complete mess, and
|
|
<code>goto</code> is an efficient tool for that (although, surely, other
|
|
tools exist for the same purpose).
|
|
</p>
|
|
<p>However, those who prefer to ban <code>goto</code> once and for all seem
to be missing one important thing. The final goal is to make the code as
clear as possible. Once again, the goal is <strong>not</strong> to make
the code free of <code>goto</code>s or whatever else, it is <em>to make the
code clear</em>.
|
|
</p>
|
|
<p>There are exactly two situations when <code>goto</code> obviously makes the
code easier to read, and attempts to write the same code without
<code>goto</code>s surprisingly complicate the code. Always remember what
the final goal is; whenever we see we're doing something that moves us away
from the goal, it means we're doing it wrong.
|
|
</p>
|
|
<p>The first of the two situations is simple: it is when we need to
|
|
<strong>bail out from inside several nested statements</strong>, such as
|
|
loops and the <code>switch</code> statement. With only a single statement,
|
|
we can use <code>break</code>, but it doesn't work for more than one
|
|
statement.
|
|
</p>
|
|
<p>Certainly, some obvious measures must be taken in order not to let the code
become messy. The label must have a meaningful and self-descriptive name,
and it must be placed right after the outermost of the loops (or, well,
loops and switches) we're jumping out of. But if we do so, everything will
be fine.
|
|
</p>
|
|
<p>Some people will tell you it is easy to do without goto here. Yes, it is
really so. We can isolate the nested statements into a separate function
and do a <code>return</code> from it; we can add a flag checked in outer
loops, set it in the innermost loop and do a <code>break</code>; we can
invent other things as well. But the truth is that <strong>in this
situation the code with <code>goto</code> will be the clearest
one</strong>. Try it yourself if you don't believe it.
|
|
</p>
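<p>A minimal sketch of the first situation might look as follows: a search in
a matrix stored row by row, with the label placed right after the outermost
loop. The function name <code>find_in_matrix</code> and the label
<code>search_done</code> are hypothetical.
</p>

```c
/* Look for key in a rows x cols matrix stored row by row.
   On success, store the position in *row and *col and return 1;
   otherwise set both to -1 and return 0. */
int find_in_matrix(const int *m, int rows, int cols, int key,
    int *row, int *col)
{
    int i, j;
    *row = -1;
    *col = -1;
    for (i = 0; i < rows; i++) {
        for (j = 0; j < cols; j++) {
            if (m[i * cols + j] == key) {
                *row = i;
                *col = j;
                goto search_done;
            }
        }
    }
search_done:
    return *row != -1;
}
```

<p>The jump goes forward and leaves both loops at once; no flag has to be
checked in the head of the outer loop.
</p>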
|
|
<p>The second situation is simple, too. Suppose you grab something valuable
|
|
at the start of your function, and you need to, well, <em>ungrab</em> it
|
|
before you return. The role of “something valuable” is most often played
|
|
by dynamic memory, but it can also be, e.g., an open file (okay... it could
|
|
be a mutex as well if we didn't ban multithreading, but
|
|
<a href="banned_techniques.html#multithreading">we did</a>). Anyway, you've
|
|
got to do something right before you're done, no matter how your function
|
|
finishes. And now you need to... guess what? quit your function from its
|
|
middle.
|
|
</p>
|
|
<p>Okay, you can duplicate all your cleanup code from the end of the function
into every place where you're going to place another <code>return</code>.
<strong>Please don't</strong>. Better write exactly one
<code>return</code> as the last line of your function, place all the
cleanup right before it, and <strong>mark the cleanup code with a
label</strong>. The label should have a short and meaningful name;
<code>quit</code> or <code>cleanup</code> may be good choices, just to name
a couple. To quit the function “from the middle”, use
<code>goto quit</code> instead of <code>return</code>.
|
|
</p>
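<p>Here is a sketch of the second situation, with an open file and a piece of
dynamic memory playing the role of “something valuable”. The function
<code>first_line_length</code> and the constant <code>line_buf_size</code>
are invented for this example.
</p>

```c
#include <stdio.h>
#include <stdlib.h>

enum { line_buf_size = 256 };

/* Length of the first line of the file (newline excluded),
   or -1 if the file can't be opened or read, or if there's
   not enough memory. */
int first_line_length(const char *path)
{
    int result = -1;
    char *buf = NULL;
    char *got;
    FILE *f;
    int i;

    f = fopen(path, "r");
    if (!f)
        goto quit;
    buf = malloc(line_buf_size);
    if (!buf)
        goto quit;
    got = fgets(buf, line_buf_size, f);
    if (!got)
        goto quit;
    i = 0;
    while (buf[i] && buf[i] != '\n')
        i++;
    result = i;
quit:
    free(buf);      /* free(NULL) is harmless */
    if (f)
        fclose(f);
    return result;
}
```

<p>There is exactly one <code>return</code>, and every early exit passes
through the same cleanup code, so adding one more failure check later will
not require copying the cleanup again.
</p>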
|
|
<p>Please note that in both cases <code>goto</code> is to be used to jump
<strong>forward</strong> in the code, and at least one level from inner to
outer code constructions. If you feel like doing a goto in the backward
direction, please recall there are <strong>three</strong> different loop
statements both in C and C++ (namely while, do-while and for), so please
don't invent another one with jumps. Please also don't jump from one point
to another when they are at the same nesting level; this is exactly
how <code>goto</code> turns your code into a snake wedding.
|
|
</p>
|
|
</div>
|
|
|
|
</div>
|
|
<div class="navbar" id="bottomnavbar"> <a href="cpp_subset.html#bottomnavbar" title="previous" class="navlnk">⇐</a> <a href="devdoc.html#coding_style" title="up" class="navlnk">⇑</a> <a href="scriptpp.html#bottomnavbar" title="next" class="navlnk">⇒</a> </div>
|
|
|
|
<div class="bottomref"><a href="map.html">site map</a></div>
|
|
<div class="clear_both"></div>
|
|
<div class="thefooter">
|
|
<p>© Andrey Vikt. Stolyarov, 2023-2026</p>
|
|
</div>
|
|
</body></html>
|