Wednesday, April 30, 2008

Ternary operator considered harmful

The distinct syntax of the ternary operator confused me until I read about it in a book.

(predicate) ? (if-true-value) : (else-value)

for example,

indecipherable = (true ? more : less);

As in this example, the ternary operator invariably makes the code more indecipherable.

I think it is always bad news if the explanation for an operator is required before one can read it effectively. I wouldn't be surprised if the original author of this device eventually regretted his decision. It may have been created before the existence of the QWERTY keyboard, when typists couldn't manage more than a few words per minute.

I prefer a more verbose and natural style:

if (predicate)
{ statements... }
else
{ statements... }

If you discount the braces and other syntactic conventions, you aren't really gaining much typing efficiency by using the ternary operator. You are losing understandability and maintainability. As soon as more than one statement is required, the ternary operator must be refactored into an if-else block anyway.
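
For instance, here is the same assignment written both ways (is_member and discount are hypothetical names of my own). The ternary form:

discount = is_member ? 0.10 : 0.0;

and the equivalent if-else form:

if (is_member)
{ discount = 0.10; }
else
{ discount = 0.0; }

The moment a second statement belongs in either branch, only the second form survives without a rewrite.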

I know there is a party of programmers out there that are keen on jamming a whole bunch of logic into the smallest space possible. I think that sort of thinking is short-sighted. Understandability always trumps conciseness in programming. The ternary operator is a device from a long-gone era when this principle was not well understood.

Friday, October 26, 2007

Multiline statement style

There are occasions when it makes sense to extend a programming statement beyond one line:

std::cerr << "The following execution is " <<
"probably best placed on multiple lines. <<

or, in another case:
if (the_job_is_finished &&
    there_are_no_jobs_left &&
    (timer > 10))

Novices might be more acquainted with this formatting:

std::cerr << "I am a novice programmer, and"
    << " I am still learning the best-practices"
    << " for programming style."

Which better serves the paradigm of programming maintenance? I cannot contend against the aesthetics of the second, amateurish attempt. The alignment of the dual brackets is nothing if not captivating. But do they serve the purpose of easing the reading of the code? No. To demonstrate the inefficiency of this method, consider reading the code, one line at a time.

std::cerr << "I am a novice programmer, and"
Great- a line is being printed to stderr.

<< " I am still learning the best-practices"
...Wait- what does that "<<" mean- I've got to look at the previous line... Okay, now I see- we're continuing output.

The "..." indicate a momentary interruption to the left-to-right, top-down flow of code. This momentary interruption is significant because it requires the reader to investigate context in code that he has already read. This extra effort could be avoided with proper style.

Now, consider the more efficient and excellent way:

std::cerr << "The following execution is " <<
A line is being printed to stderr and it appears the next line will continue the output.

"probably best placed on multiple lines. <<
Clearly this line is a continuation from the previous line of code.

Hundreds of years of typesetting have used hyphenation to indicate when a word extends beyond the line in which it commenced. I do not expect to ever see the follow
-ing in a book. The practice of prefixing multiple line statements with an operator is akin to prefacing a line with a comma, or a hyphen- it's bad practice in typesetting- and it's bad practice in coding.

Tuesday, October 16, 2007


I'd like to apologize for the previous post on build procedures. The subject matter had too much substance to be considered part of this blog. Nor did it give the opportunity for my nemesis to rebut the claims made therein. I've removed the post and put it in a more appropriate location.

Wednesday, October 10, 2007

What is a header file?

My nemesis' previous post is a little disturbing, both in its dense verbiage and in its lack of understanding of a fundamental programming principle.

The purpose of a header file is to define part of the contract for a function's invocation. If you leave out the names of a function's parameters, in most cases there will not be enough information for the average developer to use the function. Consider the following function signature:

void draw_Foobar_widget(
    int,
    int,
    char *);

Now consider the following:

void draw_Foobar_widget(
    int width,
    int height,
    char *title);

Which is clearer?

I suppose Mr. Cuddleyourbraces is trying to optimize for the DRY principle to the detriment of the information hiding principle. The DRY principle encourages maintainable code. The information hiding principle encourages usable code. The developer who sees the first declaration is compelled to investigate the source code for draw_Foobar_widget, and thus must confront all the complexity thereof. It is therefore a well-accepted convention to document the external view of a function in the header file. That means providing a function signature that is understandable without consulting the source code. And named function parameters.

Using named function parameters in two places does violate the DRY principle, which is probably why some programming languages don't require a prior function signature at all (Java, Perl, Python, PHP, etc.). But you lose a lot more than you gain by observing the DRY principle to the extreme of not naming function parameters in your header files. The compiler cannot enforce good documentation via a man page or something of that nature. The compiler can enforce the existence of a function signature. A well-disciplined programmer knows how to use it.
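
For concreteness, here is a sketch of what such a header might look like, using the hypothetical Foobar widget from above (the comment wording is my own):

/* foobar_widget.h -- the external view of the Foobar widget. */

/* Draws the Foobar widget at the given size, with the given
   title across its top. */
void draw_Foobar_widget(
    int width,
    int height,
    char *title);

A caller can now write draw_Foobar_widget(640, 480, title) without ever opening the source file.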

Choose your poison:
  1. Slightly more maintainable code that is never used.

  2. Usable, well-designed software that takes an extra dash of discipline to maintain.

Wednesday, October 3, 2007

Against CamelCase

I can think of a few reasons why someone would choose camelCase over underscore_delimited text.

1. The underscore character doesn't have a convenient place on the keyboard.
2. The shift key isn't working.
3. You need to inscribe your code on gold leaf, so the extra character costs $.
4. You forgot how to do Hungarian notation.

I prefer underscore-delimited text. Perhaps in the past there was a premium to be paid for writing software with variable names that were a few characters longer. Back then it made good sense to use a compressed notation. But nowadays, that premium is gone. So why does the compressed style continue to perpetuate itself? These are traditions that need to be overcome. I acknowledge that it is important to maintain the coding style of legacy software modules when modifying those modules. But how many new modules of code still adhere to the old conventions?

I prefer underscore notation because it most closely mimics the convention used in natural language for delimiting words. And I don't think you need to capitalize the first letter of every New_Word that appears in a variable name (unless you're German and it's a noun). I think the English rules for spelling and clauses are sufficient. I feel the same way about abbreviations. Abbreviating simple things like "copy" to "cp" might have made sense when working with 2K of system memory, but it doesn't simplify anyone's life when, 5-6 years later, somebody can't tell whether you meant "compose", "compress", or "copy". The whole camelCase fiasco promotes this idea: since it already looks outlandish, what's the harm in making it more outlandish by incorporating abbreviations?

Abbreviations and camelCase in software implicitly encourage the notion that the hardest part of writing code is typing. Typing is the easy part. If it were the hardest part, we might expect a single developer to be able to write 50,000 lines of code in 6 weeks (that's 50,000 extremely dense lines at 60 wpm, 40 hrs/week).
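
(Back-of-the-envelope arithmetic: 60 words per minute is 3,600 words per hour, or 144,000 words in a 40-hour week; over 6 weeks that's roughly 864,000 words, or about 17 words for each of those 50,000 lines.)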

Nothing says "This is solid, tested code" like:
void initialize_distributed_system(
    int number_of_clients,
    int seconds_before_timeout,
    const char* log_filename)
{ ... }

and nothing induces nausea like:
void initDSys(
    int clientNm,
    int tmoutSec,
    const char* logFname)
{ ... }

"Long live the underscore_delimiters!" -Dune (1984)

Friday, September 28, 2007

What does ++i mean?

What does "++i" mean? Well, it means "add 1 to i and return the value of i". This differs from "i++", which means "add 1 and return the value of i-1" or "return the value of i, then add 1 to it". Honestly, I'd prefer that such mechanisms weren't used as components of evaluated expressions. They are similar syntactically, but they differ in their results. These results are different enough to cause all sort of problems if not understood.

Performance-wise, historically "++i" has been favored. I'm not sure exactly why it had better performance. But I'm convinced that this performance difference is unobservable in the light of current CPU and compiler technology.
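
For what it's worth, the usual explanation I've seen involves class types such as iterators: the postfix form has to keep a copy of the old value in order to return it. A minimal sketch (the Counter class is invented for illustration):

struct Counter
{
    int value;

    Counter& operator++()        // prefix: bump the value, return *this
    {
        ++value;
        return *this;
    }

    Counter operator++(int)      // postfix: copy, bump, return the old copy
    {
        Counter old = *this;
        ++value;
        return old;
    }
};

For a plain int whose result is discarded, any decent compiler generates the same code for both forms, which only strengthens the point.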

I must admit that the behavior of the "i++" operation is more confusing. Unlike most expressions, it returns the value of the variable before the operation occurs, rather than after. This is a break in coherence with the "+" operator. Or any operator, for that matter.

So why am I supporting the postfix "i++" operator over the prefix operator? Not because its returned value is more sensible. I think I've established that using these operators in evaluated expressions is a no-no. Rather, I prefer "i++" because it more closely resembles its mathematical analogue: "i = i + 1". What does "++i" resemble?: "1 + i = i". That lacks any coherence with mathematics- it suggests an operation opposite of its actual result. In the end "i++" trades coherence within the programming language for coherence with the language of mathematics. And "++i" does the opposite. And I prefer mathematics.

Thursday, September 27, 2007

An endless rebuttal

In response to this travesty of an argument...

"a while loop is really a syntactic sugar for the more general for loop in which a condition is expected to be false at some time during its execution."
Who says a condition is "expected to be false"? Maybe we're writing server code and the loop really is intended to continue forever.

Of "for (;;)" he writes: "It simply and clearly states that a loop will continue to iterate until some internal condition dictates that it is now time to break out."
That's poppycock. How many steps in translation does it take to get from "for(;;)" to "loop forever"? Once you've learned that "for" means "loop", then you've got to figure out that "(;;)" means "forever". I know that seems pretty simple to all you veteran programmers out there, but I'd prefer to save brain-space for something more useful. The "while(true)" construct doesn't compel me to maintain two definitions for "while" in my brainspace (I speak English natively). While means while and true means true. Granted, we could extend the syntax with prepositions. But no semicolons, that's for sure.

I know what you're thinking: "English isn't C and C isn't English. If you're overlapping languages in this way, you're bound to mess something up." I agree. I'm just saying that there's no sense in creating new syntax if the old syntax is just as effective at accomplishing the same task. In this case, the plain-English while(true) is just as effective. (When I say effective, I mean "well-designed".)

Finally, I think we could both agree that any language would be better served with a "loop" construct that doesn't require any condition for execution. My opinion is that the following construction is preferable to either while(true), or (the ghastly) for (;;):

    if (something_happened())
        break;

I've learned from sad experience that the order of execution in a loop can make a big difference when it comes to the correctness of the code. Because the exit condition is explicitly located relative to the processing, there's no ambiguity.
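
C and C++ have no bare "loop" keyword, so as a sketch of the idea I have to borrow one of the spellings above; the point is where the exit sits relative to the processing (the job-handling functions are hypothetical):

while (true)
{
    receive_next_job();
    if (something_happened())
        break;              // the exit sits exactly where the logic demands
    process_the_job();
}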