How Can I Explain This?: coding style

Showing posts with label coding style. Show all posts

Thursday, March 4, 2010

Coding styles that make me twitch, part 8

One of the worst programming habits I know of, is assuming that everything will work out; you just perform the command/function call, and it has to work. In other words: do not bother checking for error conditions or propagating them, because they are not going to happen anyway. right?

Wrong.

But the code works. Mostly. Except when an error condition occurs. And it will, eventually.

In some instances, such code is a glorified shell script. I have ranted about that before. But catching error conditions properly can be tricker than with in-Perl functions and modules. Especially if you have no control over what the external program does, but the original programmer did.

Even with in-Perl functions and modules, you might run into, ehrm, interesting usage.

Usually, the root of the problem is that the original programmer did not foresee that some his code could grow into something else.

While it may have been a good idea to create an SQL query wrapper to hide parts of what DBI does for auto-committing statements:

sub do_sql {
    my $qry = shift;
    my $ignore = shift;
    if (!$dh) {
        &slogin();
        return if (!$dh);
    }
    my $sh = $dh->prepare($qry);
    if ($sh && !$sh->execute) {
        if ($sh->errstr =~ /(MySQL server has\
 gone away)|(Lost connection to MySQL server)/) {
            &log("Lost MySQL, reconnecting.");
            &slogin();
            return if (!$dh);
            $sh = $dh->prepare($sqlstmt);
            undef $sh if ($sh && !$sh->execute);
        } else {
            undef $sh;
        }
    }
    &log("SQL failed: '$qry'.") if (!$sh && !$ignore);
    return $sh;
}

that is no guarantee that the implementation will be future-proof, when the subroutine is used like this:

my $sh=&do_sql("SELECT * FROM tab1 WHERE\
 x<>'$cgi->{data}' AND y<>42");
if (my $res=$sh->fetchrow_hashref) {
    $cgi->{ref}=$res->{ref};
}
&do_sql("UPDATE tab1 SET x='$cgi->{newdata}',\
 y=23, z='$cgi->{ref}'");
&do_sql("UPDATE tab2 SET tab1_changed='yes'");
&log("Updated tab1, ready for externals");
# Process the changes made above:
system("/usr/local/bin/changestuff.pl");
&log("Finished processing");

Imagine now that changestuff.pl also performs changes in the database tables mentioned above, and that the code above is called in a cron job every minute or so.

Here is an attempt at listing the worst parts:

do_sql only pretends to do proper error checking, it mostly does not do anything useful about the error situations.
The query result seems of little consequence.
There is an obvious need for bound variables in the prepared statement, but do_sql does not support that. So the code pretends the problem does not exist.
When do_sql is used for updates, there is no check whether the returned statement handle ($sh) is empty or not, there is no way of knowing whether we are clobbering the database.
Why do we not care whether the external commands succeeds or not?
SQL transactions, anyone?
Yeah, other parts of the style sucks, too.

Now imagine the example above multiplied to thousands of lines of inter-dependant code.

I am happy to say that I do not see things like this too often.

However, cleaning up code like this is a PITA, and it is often easier just to close your eyes, add your own code, and leave well enough alone.

Thursday, February 25, 2010

Coding styles that make me twitch, part 7

Parentheses and precedence

This week, I will strive to be better at explaining the problem scope, while heroically and miraculously maintaining brevity.

I used to be annoyed at seeing extraneous parentheses with the ternary conditional operator pair ? :, but then I discovered that the PHP language designers decided to break precedence (see example #3). So I have stopped twitching as much as I used to when seeing parentheses galore in that context.

In Perl Best Practices, Damian Conway notes that even though the low-precedence logic operators and, not and or may seem nicer, they can confuse the reader regarding the intended meaning of the code. But he does not — as far as I can recall or find — address how these relates to parentheses and precedence (I suppose that should be "parentheses && precedence").

I start twitching when someone insists on using parentheses like this (imagine that the example was more convoluted):

if (expr1 ||
    (expr2 && expr3) ||
    expr4
    ...)

That does not help understanding. It confuses me, as the read, regarding your intentions. Is there a piece of code missing, perhaps a little bit of || expr2b or || expr3b?

Sometimes, I even see misguidedly mixed-in letter-literal operators:

if (expr1 or
    (expr2 && expr3) or
    (!expr4)
    ...)

I much prefer seeing

if (expr1 ||
    expr2 && expr3 ||
    expr4
    ...)

if (expr1
    || expr2 && expr3
    || expr4
    ...)

depending on what floats your boat in the most stylish way imaginable.

I would have thought that the logical and/or/not part of operator precedence would be easy to understand. It is essentially the same in most programming languages: Ruby, Python, Perl, Java, C, ... and not even PHP managed to mess this one up.

The chance that anyone is going to be confused by your code because of these parentheses not being there is far, far smaller than the chance that they are going to be confused by their presence.

I have heard the defense "but the code is so non-obvious that I had to add them!" Well, make your code obvious instead.

Thursday, February 18, 2010

Coding styles that make me twitch, part 6

Edit: There are, apparently, strong feelings that I should not post my personal preferences. Reader discretion advised: this post expresses my personal opinions regarding a limited use case for file handling and filehandles, and must not be read as general advise on how to deal with filehandles in Perl, and so on.

Today's theme is: unnecessary variable clutter.

my $fh = new Filehandle("/usr/bin/somecmd someargs |");
while (my $l=<$fh>) {
    if ($l=~m/foo bar zot/) {
        # lots of code that does not depend on
        # a variable file handle, nor creative
        # usage of $l where $_ could not be used
        # implicitly just as easily
    }
}

Sure, there are situations where it makes sense to assign filehandles to variables, or using $l instead of $_, but the above example is not one of them. I found no particular reason why the original programmer had used new Filehandle, either.

Perl 5 has, on purpose, made this easier for us than in the above example:

open my $FH, '-|', '/usr/bin/somecmd someargs'
    or croak "OMG\n";
while (<$FH>) {
    if (/foo bar zot/) {
        # lots of code
    }
}

Monday, October 19, 2009

Coding styles that make me twitch, part 5

Let us say that we have some code using DBI - old-fashioned, but it still works, kindof.

How would you like to see the following in a 6000 line CGI script you're supposed to debug?


    my $sth=&query("SELECT id FROM invoices WHERE invoiced='N'");
    my $invoiced=$sth->rows;

    my $sth2=&query("SELECT count(*) nusers FROM users \
LEFT JOIN people ... # long SQL statement
    my $row2 = $sth2->fetchrow_hashref;
    my $nusers = $row2->{nusers};

    my $sth3=&query("SELECT count(*) npeople FROM users \
LEFT JOIN people ... # long SQL statement
    my $row3 = $sth3->fetchrow_hashref;
    my $npeople = $row3->{npeople};

    my $sth4=...

And then, after carefully checking scope, you discover that the variables $sth2, $row2, $sth3, $row3, $sth4, $row4 etc. are not used anywhere else within the same scope.

Would you develop a tick?

I did, and the tick didn't lessen in strength when I discovered unsafe variable interpolation as well as a sub query that used $dbh->prepare($qtext) but didn't allow passing of usable parameters, such as, you know, bind variables.

I've started on "refactoring" the real life code this example is based on, but I get depressed from the sheer amount of work in fixing many problems at once. Maybe I should pick up drunken coding to become less perfectionist.

Wednesday, July 8, 2009

Coding styles that make me twitch, part 3

Today's twitchiness is sponsored by ... no, wait, I don't have sponsors. Ah, well.

I have an issue with people who insist on using <"> as a string delimiter when the (static) string itself contains that very character. It gets fugly all too soon:

my $html_output = "<a href=\"http://www.example.com/foobar/$pagename.html\" title=\"Oh, a link to $pagename\"> ...\n";

It's so easy to avoid having to quote the <"> while still allowing variable interpolation:

my $html_output = qq(<a href="http://www.example.com/foobar/$pagename.html" title="Oh, a link to $pagename"> ...\n);

Since the example is HTML (and could be e.g. SQL), and it might be multi-line, why not ...

my $html_output = <<EOL;
<a href="http://www.example.com/foobar/$pagename.html"
   title="Oh, a link to $pagename>
EOL

That wasn't so hard? Or fugly?

Saturday, May 30, 2009

Coding styles that make me twitch, part 2

I'll make today's post short, because it's all about line length.

Limiting the length of code lines is something I try to be good at, not because I think the next guy will have a VT100 terminal and needs a friendly piece of code, but because of basic readability.

When we read an ordinary text, e.g. in a book (you remember books, right?), there is usually quite few characters printed on each line. Here are a few from Douglas Coupland's JPod:

the door to see that all my new furniture was gone, and my
original furniture hadn't come back. Fuck. I phoned Greg,
but realized he was on Cathay Pacific 889, headed to Hong
Kong. I phoned Mom.

Notice how the text area is even narrower than that in this friendly blog?

This comes from a long tradition in printing, it isn't as if they couldn't have squeezed twice as many words in there, if the print was only smaller. But if the print was smaller, they'd probably make the book narrower as well.

It's far easier to read something when you don't have to move your eyes around too much from line to line. This is important to both slow and quick readers.

So, back to coders who make lines of 100+ characters:

What in friggin' Ruritania are you thinking? Not much, that's what.

*GROWL*

Friday, May 15, 2009

Coding styles that make me twitch, part 1

We'll see how long this particular series gets.

I'll try to come up with some example of coding styles that annoy me, and post about it.

First off is the appended conditional at the end of long one-line Swiss knife code snippets:


my @var = sort { length($b) <=> length($a) } split /[-.,_+ ]/ , $input{longvariablename} if defined $input{longvariablename} && length $input{longvariablename} > 4;

(Yes, that's supposed to be one line, though it doesn't look like it.)

Please, pretty please, don't append conditionals at the end of long one-liners.

Really, just don't, m'kaaay.

Code should, unless it's a one-off one-liner in your shell prompt, be maintainable for others. "Others" includes yourself some time in the future, when you've forgotten what the (insert mst-inspired expletives here) you were thinking at the time you coded the stuff.

The above example isn't particularly complex, or difficult to understand, but it's all on one line, and hardly is easy to parse even if you've got that 170 char wide window to code in.

A few parentheses, a helper variable and a few more lines -- preferably keeping well within 80 columns -- surely won't hurt that badly.


my $lvn = $input{longvariablename};
if (defined $lvn && length($lvn) > 4) {
    my @var = sort { length($b) <=> length($a) }
              split (/[-.,_+ ]/, $lvn);
}

Of course, these are just my personal opinions, and I won't be knocking on your door with a baseball bat in hand if you don't do as I suggest.

How Can I Explain This?