How Can I Explain This?: 2009

Thursday, December 31, 2009

2009 In Perl

Repeat after me: I will not pretend to be an analyst or doomsayer, even though the end (of 2009) is nigh.

In 2009, Perl grew up a bit more, both as a language and as a community.

Language Development

Perl 5.10.1 came with a pony to those of us who fear the .0 releases.

The Perl 5.11 development tree got started, and it looks like it is rolling on rails. At this rate, we will see 5.12.0 quicker than you can say antidisestablishmentarianism.

... Perl 6 has made progress both on the specification side and in implementations -- yep, that is plural. It is sometimes confusing when naming changes under your feet, but it is acceptable while the spec is still settling.

Community

In 2009, I think I saw more openness regarding the internal conflicts in the Perl community as a whole; there were abundant admissions that we were not communicating nearly as well as we should, that there was at least a small amount of internal bickering over the present and future state of the onion -- onions, I must inject, tend to come in many shapes and flavours, and are not always the same inside -- really, which way we are going, are we having a conflict or not (yes we are -- no we are not -- huh, are we talking? -- pass the chips), and get off my lawn before I shoot or hug you.

In brief, it looks to me like 2009 was the year when the community showed renewed signs of self-awareness.

But much more happened. We got a closer focus on Perl visibility, from my POV mainly owing to Matt S. Trout's lightning talk challenge from NPW 2009, plus a whole range of people working on other PR aspects for ourselves. And mst still keeps his hair colour. Wow.

Other Stuff

I made new friends, I learned a lot, I even got to help out a bit, and I hope that this will continue in 2010.

I hope you will too.

Happy new year!

Monday, December 21, 2009

Dice Roller Deconstructed

As promised, here are the elements of last week's dice rolling code:

use v6;

This is a nice way to say that we are in Perl 6 land.

subset D10 of Int where 1..10;

A "D10" is a 10-sided die, and it can only have integer values in the range 1..10. Subtyping Int is an acceptable way of taking care of that.

sub is_success (D10 $roll, D10 $target) {

Here, I am already using the subtype D10 of Int. This subroutine compares the rolled die $roll with the target number $target, and is called from the subroutine roll() for each die in the dice pool. I chose to create an explicit subroutine because it seems a bit clearer what happens in the special case of a rolled 10, which means that you get to re-roll that die for a potential new success.

    my $n = 0;
    if ($roll == 10) {
        say "10 again";
        $n += roll 1,$target;
    }

If we roll a 10, then the roll() subroutine is called with a dice pool of 1 and the same target number as we got originally for determining success.

    $roll >= $target ?? $n + 1 !! $n;
}

We always return the number of successes from the roll for the "10 again" rule (if it happened), and in case this roll was a success, we return an additional success.

sub roll (Int $poolsize where { $_ > 0 }, D10 $target? = 8) {

The dice pool size can of course not be negative, but it also cannot be zero; you always get to roll a die, so I have added a type constraint for that. The target number is optional, defaults to 8, and has to be possible with a D10.

    my D10 @rolls = (1..10).pick($poolsize, :replace);

From left to right:

@rolls is an array that will contain the results of the normal die rolls
(1..10).pick($poolsize is a way of picking $poolsize dice having possible values in the range 1..10 and "rolling" (randomizing) each of them.
pick($poolsize, :replace) means that we not only pick a result, but we also make it possible to achieve the same result again. Specifically, it is important for us that each die can have ANY value, not just values that have not been picked before. The semantics of pick() are explained in .pick your game (the 15th gift in the Perl 6 Advent Calendar).

    say "Roll: " ~ @rolls.sort.join(",");

@rolls.sort.join(",") sorts the elements of the @rolls array and stringifies them joined with a comma, e.g. "1,2,3,3,4" for @rolls = 4,1,3,2,3

    [+] @rolls.map: { is_success $_,$target };
}

This piece of code maps is_success $_,$target on every value in the @rolls array and creates a sum of those results. In other words, it sums up the number of successful die rolls.

given @*ARGS.elems {

The @*ARGS array contains the command line arguments to the program, and .elems therefore is the number of arguments used.

    when 2   {
               say "Target number: " ~ @*ARGS[1];
               continue;
             }

This block only runs in case we have two arguments, but it explicitly says that we may not be done yet: the continue statement counters the default implicit break to ensure that we can match the input value against other tests.

    when 1|2 {
               my $n = roll |@*ARGS>>.Int;
               say "Successes rolled: " ~ $n;
               $n >= 5 and say "Exceptional success!";
             }

We start off with a junction to say that either 1 or 2 is fine by us, we want both to match. Then we call roll() with the same arguments we got in, but each converted to Int. White magic. We store the value, and exclaim that the result is an exceptional success if it is.

    when *   {
               $*ERR.say("roll.p6 poolsize [target]");
               exit(64);
             }
}

This is the equivalent of C's default, the catch-all that handles remaining uncaught cases. We print a helpful usage string to STDERR ($*ERR in Perl 6) and exit with the correct Unix exit code, praying that nobody uses a different kind of system.
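For readers more at home in Perl 5, the counting rule itself could be sketched roughly like this. It is not a translation of the program above: the name count_successes and the $next_die hook (a code reference that returns one die result, so that fixed rolls can be fed in) are my own assumptions, and the re-rolls are handled with a loop instead of recursion.

```perl
use strict;
use warnings;

# Count successes for a pool of d10s, Storyteller style.
# $next_die is a hypothetical hook (not in the Perl 6 program): a code
# reference returning one die result, so that fixed rolls can be supplied.
sub count_successes {
    my ($poolsize, $target, $next_die) = @_;
    $target   = 8 unless defined $target;
    $next_die ||= sub { 1 + int rand 10 };

    die "pool size must be a positive integer\n"
        unless defined $poolsize && $poolsize =~ /\A[1-9][0-9]*\z/;
    die "target must be in 1..10\n"
        unless $target =~ /\A(?:[1-9]|10)\z/;

    my $successes = 0;
    my $to_roll   = $poolsize;
    while ($to_roll > 0) {
        $to_roll--;
        my $roll = $next_die->();
        $successes++ if $roll >= $target;
        $to_roll++   if $roll == 10;    # "10 again": one extra die
    }
    return $successes;
}
```

Calling count_successes(5) rolls five random dice against the default target of 8; passing a third argument replaces the random source.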

Wednesday, December 16, 2009

Dice Rolls for Role-Players

I realize that the title of this post is a bit of an oxymoron, because a Real Role-Player of course doesn't roll dice often. ;)

But in the cases where the Real Role-Player does roll dice, wouldn't it be nice to have a computer program to forget at home rather than some even more easily mislaid dice?

The Perl 6 Advent Calendar provided some inspiration for this post.

A problem with many minor programming examples you see on the net is that they do not take into account the needs of a role-player. Role-players play many different systems, with different criteria for success in dice rolls. D6 (the regular six-sided cubic dice used for playing Monopoly, Yahtzee, etc.) are not used much in the majority of systems.

Therefore, I'll look at the Storyteller System, which is used in the World of Darkness series of games.

The general principle is that you have a pool of dice to roll, and you count your successes, which in this system is the number of dice that have a value greater than or equal to a given target number for the roll. The standard target number is 8 in most implementations. Five successes in the same roll is an exceptional success. Obviously, it's nice to have many dice to roll!

Here's a real Perl 6 program that works with Rakudo today: it accepts two command line parameters, the first being the size of the dice pool, the optional second parameter defines the target number for success:

use v6;

subset D10 of Int where 1..10;

sub is_success (D10 $roll, D10 $target) {
    my $n = 0;
    if ($roll == 10) {
        say "10 again";
        $n += roll 1,$target;
    }
    $roll >= $target ?? $n + 1 !! $n;
}

sub roll (Int $poolsize where { $_ > 0 }, D10 $target? = 8) {
    my D10 @rolls = (1..10).pick($poolsize, :replace);
    say "Roll: " ~ @rolls.sort.join(",");
    [+] @rolls.map: { is_success $_,$target };
}

given @*ARGS.elems {
    when 2   {
               say "Target number: " ~ @*ARGS[1];
               continue;
             }
    when 1|2 {
               my $n = roll |@*ARGS>>.Int;
               say "Successes rolled: " ~ $n;
               $n >= 5 and say "Exceptional success!";
             }
    when *   {
               $*ERR.say("roll.p6 poolsize [target]");
               exit(64);
             }
}

Thanks to moritz++ for ironing out two annoying mistakes!

Here are a few usage examples:

$ perl6 roll.p6
roll.p6 poolsize [target]

$ perl6 roll.p6 5
Roll: 1,2,7,8,9
Successes rolled: 2

$ perl6 roll.p6 5 2
Target number: 2
Roll: 1,2,2,4,9
Successes rolled: 4

$ perl6 roll.p6 5 4
Target number: 4
Roll: 6,8,9,10,10
10 again
Roll: 8
10 again
Roll: 2
Successes rolled: 6
Exceptional success!

There are no comments in this piece of code; I want people to try to understand it as-is, based on the Perl 6 Advent Calendar. If you have any questions, comments, corrections, etc., don't hesitate, just write!

In my next blog entry, I'll pick the program apart and comment on what I've done and why, and who knows, maybe someone has come up with an elegant solution to the same problem.

Wednesday, December 9, 2009

GCD - A Small Language Enthuser

fun gcd (x:int,y:int) : int =
    case x of 0 => y
  | _ => if x < 0 then gcd(y,0-x) else
         if y < 0 then gcd(0-y,x) else
         if y > x then gcd(y-x,x) else gcd(x-y,y);

"But that's not Perl!"

Yeah, yeah, I hear you.

I'll rectify that minor detail in a bit.

But first, an anecdote.

Back in the late nineteen-nineties, I was studying computer science, and one of the classes was about program specification and verification.

Several of the students already had a background with several programming languages, some were functional, some were imperative, and other languages were a bit confused about what they really were.

When studying program specification and verification, you either become rather obsessed with program correctness -- and hopefully elegance -- or you fail spectacularly.

There are several ways to muster enthusiasm when dealing with such studies; they can be rather, ehrm, theoretical.

I therefore flitted about, flirting with various programming languages, comparing them with the eagerness that young idealists do.

For reasons unknown to men to this day, I found Euclid's GCD algorithm particularly fascinating.

The Perl version I saw was rather awful, and technically incorrect:

sub gcd {
    if (!$_[0]) {
        return $_[1];
    }
    if ($_[1] > $_[0]) {
        return gcd ($_[1]-$_[0],$_[0]);
    }
    return gcd ($_[0]-$_[1],$_[1]);
}

Yikes. I mean, eep. And Perl does have a modulo operator.

sub gcd {
    my ($x, $y) = @_;
    $y ? gcd ($y, $x % $y) : abs $x;
}

I won't claim that the above code is the epitome of elegance, but it solves the problem in a general and easily read way (I admit a prejudice against $_[N]), while retaining correctness.

This is, BTW, one place where some golfers miss the boat; the GCD cannot be a negative integer. That's why the ML code at the top is so verbose.
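A quick way to convince yourself about the sign handling is to feed the modulo version a few negative arguments. This is just a small check harness around the same sub, relying on Perl 5's % following the sign of the right operand (without "use integer"); the list of test pairs is my own pick.

```perl
use strict;
use warnings;

sub gcd {
    my ($x, $y) = @_;
    return $y ? gcd($y, $x % $y) : abs $x;
}

# Print a few sign combinations; no result should be negative.
for my $pair ([12, 18], [-12, 18], [12, -18], [-12, -18], [0, 5], [5, 0]) {
    printf "gcd(%d, %d) = %d\n", @$pair, gcd(@$pair);
}
```

The abs at the bottom of the recursion is what takes care of the sign; drop it, and some of these pairs come back negative.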

Small challenges like these kept me going, and it can be an inspiring way to learn details in a new language. So, what would it look like in Perl 6?

sub gcd (Int $x, Int $y) {
    $y ?? gcd($y, $x % $y) !! $x.abs;
}

What's your favourite algorithm for playtesting languages?

Tuesday, December 1, 2009

Oslo.pm Past and Future

In case the title wasn't a give-away: this is a non-technical blog entry.

I became an Oslo.pm member by signing up for the mailing list shortly after the Nordic Perl Workshop 2009. That's cheap (well, free!), easy, and therefore newbie-friendly.

Last week, I dropped in at the general assembly and exercised my speaking and voting rights, and got an inside scoop on how this Perl organization works. The board members were, after all, the guys who did a terrific job arranging this spring's workshop, and mostly the same people who held the workshop of 2006, which also went quite well.

From my point of view, Oslo.pm has come from being an anonymous group to a rather solid little volunteer organization. Before 2006, I'd have said "huh?" if someone asked me who might have anything to do with Perl in Norway. Afterwards, I knew there was something called Oslo.pm, and so did a few people in Europe and the USA. After NPW 2009, I think it's safe to say that the organization is now known as a stable and capable group of Perl mongers. That's a decent achievement, especially in this age, when it seems like almost nobody (in Norway) is willing to do anything free of charge.

So what did they think about themselves, and what's going to happen in the near future?

True to the Norwegian spirit, they were modest and self-disparaging, but they were very happy that the attendees were apparently happy, even months later.

Salve J. Nilsen, the Great Leader of 2009, bowed and said farewell to the post of chairman, and now Marcus Ramberg is at the helm.

The new Oslo.pm board will attempt to increase local activity, and there will probably be some kind of technical talk on the first Wednesday of almost every month in 2010. They aim to increase cooperation with local Perl-using companies, as well as aiming for some cross-language and language agnostic sessions.

First out is tomorrow's Perl 5.10 session at Redpill Linpro, which I'm sure will be technically rewarding for those who show up. I plan to!

Wednesday, November 25, 2009

The morality of helping

Yesterday, a friend asked me, "are you a Perl expert?"

I answered in the only way possible: "eeeeh..."

It turned out that my friend did not ask for help for himself, but for someone else, who had posted a programming class question on a non-programming bulletin board.

The poor fellow was struggling with a question of parsing a two-column input to generate a certain output format, essentially also two-column, except with a slightly different layout.

On Usenet, there was - and maybe still is - a long-standing tradition of not solving people's homework for them. The reasoning behind this is that we do not learn quite as well when people solve our problems for us, as when we struggle with them ourselves.

In some cases, school questions would be met with derision, in other cases with genuinely unhelpful and false answers, and sometimes with helpful clues about how to solve the problem; where to look, tips for using stepping debuggers, which book chapter or manual page would clarify the problem, etc.

Okay, that is fair enough.

The guy had gotten only one answer, from another guy who regretted that he had not touched Perl in ages, and therefore could not help. And I thought that Perl was like learning a particularly catchy, but annoying song: you might think that you have forgotten, but then someone hums or whistles the tune, and WHAM - there it is, stuck in your head again.

So I had a look. Maybe I could provide a hint or two, you never know. I know my way around some of the less scary parts of Perl 5 City, anyway.

This guy had essentially nailed the problem semantically, but he was evidently struggling with his code; it just did not work.

I immediately saw a few major concerns:

Some parts were copy-pasted from bad textbook Perl.

Some parts must have come from a poor programming education.

The code was overly complex and verbose.

There was no error checking or debug print-outs.

And it would take me more time to helpfully point out these things than write something that might be a solution myself.

The moral dilemma then was:

Should I help the guy out by tearing his code apart and pointing out all the flaws that made it thoroughly lousy code, thereby provoking a true emo-American melodrama?

Or should I just write an alternate solution, with sound error-checking, simplicity, and debug print-outs?

In this case, I thought the latter was the way to go. I put the code up anonymously somewhere, gave the link to my friend, and perhaps the fellow with the problem now has a better understanding of how simple and beautiful Perl can be.

Yeah, right. :D

Sunday, November 15, 2009

What stops me from using Perl 6, today?

Since I got hooked on the Perl community, and got a taste of Perl 6, I've been wondering about:

what, exactly, is it that I could use Perl 6 for, right now?
why am I not actively using Perl 6 now?

Those are easy questions, but answering is hard, so this may be a long post.

Sure, the points listed below are not exactly Perl 6 specific; I could probably have picked some other programming language, but I somehow feel more comfortable in the way that Perl 6 still is Perl.

What I could use Perl 6 for right now

I think it's fair to say that using Perl 6 today mostly means using Rakudo, and that I wouldn't use it in what we popularly call a "production setting". But many of us programmers, sysadmins, geeks and nerds have perfectly suitable hobby projects, where we won't have clients wringing our necks if there is three minutes of downtime in a month, or if we don't deliver the Speedy Gonzalez of services; we have projects that are neither computing performance constrained nor stability constrained.

So that's where I could have started using Perl 6 half a year ago, and of course still can.

I know I can use Perl 6 for e.g. a fairly complex web site using Web.pm and Squerl for a SQLite backend. It will probably work just fine, for a lot of projects.

I know I can use it for lots of one-liner scripts.

I know that in some regards, Perl 6 will outperform classic Perl 5 in terms of programmer time spent. An example is the given-when control structure, which (to me) is semantically superior to if-elsif-elsif-elsif. Programmer time is important to me; I hate coding too much for menial tasks. And I'm sorry to say that Perl 5.10 doesn't do it yet for me, as I cannot rely on its presence, even for hobby projects.

And I know I can use Perl 6 to refresh some of the knowledge about programming language specifics (terminology, technique, methodology, etc.) that I've allowed to rust since I left university in 2001.

Concrete projects, in no particular order

web page for registration of pool billiards tournament results; it's not performance critical, and the users could check and verify the dataset themselves after input

conversion of historical results data in CSV format to a database; one-time job, needs manual verification no matter what programming language I use to do it

contributing to the Temporal.pm specification and implementation in Perl 6

personal web gallery generation; I positively loathe most of the online galleries, because they sooner rather than later are discovered to have HUGE, GAPING security vulnerabilities

blogging tool; I'm not very comfortable with blog software running on servers, either, and whatever blogging I do, it's not actual work

That's quite a lot, isn't it? It ought to have been enough to get me going in a jiffy!

Why I'm not actively using Perl 6 now

This may be a surprise to some: it's not because of a lack of maturity in the tools, a lack of confidence in the language or tools, stability issues, etc. As I tangentially mentioned above, I believe there is no technical hindrance for me to start coding on a hobby project.

I have plenty of hobby projects to choose from. They are also quite manageable in terms of eventual lines of code.

However, there is something holding me back, and that's a certain degree of perfectionism mixed with procrastination fever.

mst mentioned during the NPW hackathon this spring that perfectionism was a barrier against getting started. If you're too obsessed with getting things right at first, at wanting to avoid failure, procrastinating is too easy. Getting slightly intoxicated (yup, drinking alcohol, which of course is only a recourse for adults) is a way of reducing your own perfection anxiety. This is almost exactly what Randall Munroe's xkcd calls the Ballmer Peak.

But I don't sleep too well after drinking alcohol, and I also tend to do hobby projects in my "running breaks" during work hours, in which case alcohol intake may be a very bad idea.

In addition, my time at work is a series of interruptions, which really isn't conducive to sitting down and learning something new and complex.

When I get home from work, I'm usually so fed up with computers that I don't want to have anything to do with them.

So my spare time, whatever is left of it, usually isn't spent on programming. Note that I don't even do these projects in a programming language I already know well; they are on hold regardless of that.

All in all, there's nothing much wrong with Perl 6.

Blaming the immaturity of Rakudo would just be a silly excuse. There's something wrong with my capacity for finding the time to get down and dirty with it, that's what; I'm apparently not currently capable of saying honestly:

This is my Perl 6 hour. This hour, I'm going to do Perl 6 stuff, and this time is sacred.

Monday, November 9, 2009

What the #perl6 IRC bots do

Do you feel like a n00b on #perl6, like I do, and wonder what the different bots do?

I keep forgetting what they are, so here's a list for you and me both:

dalek

Announces commits (mainly to rakudo, nqp-rx and the Perl 6 book)

hugme

Used for hugging another user without "direct" contact:

hugme: hug masak

ilbot2

Near-realtime IRC logs with automatic link generation to irclog.perlgeek.de. The original ilbot sucked, according to moritz.

ilogger2

Another logging bot

lambdabot

Keeps track of karma ("moritz++" adds one to moritz's karma score, "frettled--" subtracts one from mine)

lisppaste3

Announces entries pasted to http://paste.lisp.org/new/perl6 (which is where we paste code and other stuff, so that we avoid spamming the channel too much, and also don't have to worry about creating our own temporary web pages)

masak

Submits rakudo bugs. Aw, okay, then, he's not a bot, just a really nice guy!

mubot

Also tracks karma, attempting to be slightly less annoying than lambdabot. mubot is clever enough to recognize that your nick may vary slightly from time to time and channel to channel. mubot is written in Perl 6!

p6eval

Perl 6 code evaluation bot. We use this for live testing of code that may be of interest to others; it chats back to the channel. perl6: my $a; will result in a test against several Perl 6 interpreters (elf, mildew, mildew-js, pugs, rakudo, sprixel), nqp: say('foo') tests nqp-rx, std: my $a will parse the expression with STD.pm.

phenny

Our secretary. Sample usage:

phenny, tell frettled I'll get back to you on that

phenny will then let me know when I become active on the channel again.

pointme

Provides links to projects tracked by proto. Example usage:

< carlin> pointme: rssbot
< pointme> carlins's rssbot is at http://github.com/carlins/rssbot

pointme is written in Perl 6!

pugs_svn

Tracks commits to the pugs repository, most of which are changes to the test suite and spec.

zaslon

Tracks blog posts from a certain group of bloggers. Zaslon is written in Perl 6!

Thanks to carlin, Juerd, jnthn and moritz for late night clarifications!

Tuesday, November 3, 2009

Checking and fixing Unix file modes

Among other tasks in my sysadmin role at a web hosting provider, I work with fallout from poorly designed PHP code - which is ubiquitous - and I use Perl 5 to perform a bunch of semi-automated tasks.

If you just want to see how I utilize the Fcntl module, look further down.

One of the many things that tend to go wrong is the assumption that PHP always runs as mod_php (blatantly disregarding php-cgi, suphp, and other per-user PHP frameworks), and therefore directories (folders) and files used by PHP "must be" prepared with chmod 777 and chmod 666, respectively. The latter number is a BIG FRIGGIN' HINT, it's the number of the beast.

Whenever documentation tells you to use the number of the beast for your chmod command, that documentation is also telling you to lube up and bend over.

Unfortunately, customers and users don't necessarily see this gotcha; no matter how good the hosting provider's documentation is, they'll naturally trust the software monger's instructions.

That means that the hosting provider ought to have tools available for identifying and fixing such writable directories and files. There are many wrong ways of doing it; one is chmod -R, since that touches ALL files, recursively, overwriting the ctime stamp. An okay way is to use find, which (in most versions) allows you to fix things up quite neatly (bash/sh compatible syntax, GNU find compatible options; $dir represents the directory to fix recursively, and $uname is the username whose files you want to change):

find $dir -xdev -user $uname \
\( \( -perm /og=w -exec chmod og-w {} \; \) -o \
   \( -perm /g=w -exec chmod g-w {} \; \) -o \
   \( -perm /o=w -exec chmod o-w {} \; \) \)

Phew, that was quite a mouthful, but it's rather nice in resource usage, and it doesn't cross filesystem boundaries (-xdev).

So why would I want to do this in Perl, you ask?

"Eeerrr. Good question, let me tell you why!"

There are many other problems that it is sensible to look for while you're at it. Just to mention a few:

Backdoors
Hidden IFRAMEs
Malicious JavaScript
Malware URLs
Malware redirects (e.g. in .htaccess)
Outdated software versions
Root exploits
Spamming scripts
Viruses

These things belong in a program, not a teeny weeny Unix one-liner, or even a set of them.

While you're at it, you might want to create a log of what you found, and perhaps which line numbers are relevant for which files, both for pointing out where to consider fixing things, as well as having something to use for debugging your false positives.
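To give an idea of what such a log could look like, here is a minimal sketch that records file name and line number for every line matching a table of patterns. The sub name scan_file and the single pattern in the usage note are made up for illustration; they are not the detection rules I actually use, and real rules need far more care.

```perl
use strict;
use warnings;

# Return one "path:line: label" entry per matching line and pattern.
# The pattern table is supplied by the caller; anything shown with this
# sketch is purely illustrative.
sub scan_file {
    my ($path, $patterns) = @_;
    my @findings;
    open(my $fh, '<', $path) or die "Cannot open $path: $!\n";
    while (my $line = <$fh>) {
        for my $label (sort keys %$patterns) {
            next unless $line =~ $patterns->{$label};
            # $. holds the current line number of the handle just read
            push @findings, sprintf("%s:%d: %s", $path, $., $label);
        }
    }
    close $fh;
    return @findings;
}
```

A call might look like scan_file($file, { 'hidden iframe' => qr/<iframe[^>]*display\s*:\s*none/i }), and the returned lines can go straight into a log file for later false positive hunting.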

The code

Here's how I identify the files with too liberal write permissions, utilizing the Fcntl module. The filename is stored in $_, the user's real UID in $r_uid, and I also store various file information and what kind of file we're looking at.

use Fcntl ':mode';

my %badperms;
my ($dev,$ino,$mode,$nlink,
    $uid,$gid,$rdev,$size) = lstat($_);
my $g_write = $mode & S_IWGRP;
my $o_write = $mode & S_IWOTH;
my ($isfile,$isdir,$islink) = (-f _, -d _, -l _);

my $filetype = $isfile ? "File"
             : $isdir  ? "Directory"
             : $islink ? "Symlink" : "Other";
my $fn = $_;

# Writable for others
if (!$islink && ($g_write || $o_write)) {
    $badperms{$fn} =
        sprintf ("%s: [%04o] %s\n", $filetype,
                 S_IMODE($mode), $fn);
}
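The snippet above assumes that something else walks the directory tree and sets $_. One way to do that, as a minimal sketch, is File::Find from the core distribution. The sub name find_bad_perms is hypothetical, and this version only reports; it neither checks ownership nor changes any modes.

```perl
use strict;
use warnings;
use Fcntl ':mode';
use File::Find;

# Walk $dir and return a hash of path => report line for every
# non-symlink entry that is writable by group or others.
sub find_bad_perms {
    my ($dir) = @_;
    my %badperms;
    find({
        no_chdir => 1,                       # keep full paths in $File::Find::name
        wanted   => sub {
            my $fn = $File::Find::name;
            my @st = lstat($fn) or return;   # vanished or unreadable entry
            my $mode = $st[2];
            return if S_ISLNK($mode);        # symlink modes tell us nothing
            return unless $mode & (S_IWGRP | S_IWOTH);
            my $filetype = S_ISREG($mode) ? "File"
                         : S_ISDIR($mode) ? "Directory"
                         :                  "Other";
            $badperms{$fn} = sprintf("%s: [%04o] %s\n",
                                     $filetype, S_IMODE($mode), $fn);
        },
    }, $dir);
    return %badperms;
}
```

Printing the values of the returned hash, sorted by key, gives a report with one line per offending file or directory.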

In a future post, I'll try to get back with how this fits in my bigger picture of vulnerability detection.

As always, suggestions for improvements and questions are very welcome.

Monday, October 26, 2009

Small and cute shell and Perl scriptlets

Once upon a time, the world was full of small and cute shell and Perl scripts that served a small but useful purpose, and which were shared freely - just because hackers were nice guys. So we bring some of these around with us, in slightly modified or improved versions.

I'd like to carry on the tradition of sharing some of these before they are lost, even though they are about as trivial as you can get.

Eval

Here's one that I picked up at the University of Oslo 10-15 years ago, which I still use, just out of habit. The file is traditionally named Eval, and I believe this particular version was concocted by Kjetil T. Homme (who has some other nice hacks, BTW). The idea is to have an easy-to-use command line calculator:

#! /bin/sh -

exec perl -e '$a = ('"$*"');
 if ($a == int($a)) {
   print $a, "\n";
 } else {
   printf ("%.3f\n", $a);
 }'

So what's the point?

For me, it's the ease of use, and not least the ease of installability. All I need is a Perl version that understands a very simple set of commands. I don't have to worry whether bc or dc is installed, I don't have to open a GUI, and I can work with fairly advanced expressions. They just have to end up as valid Perl.
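The only logic in Eval is the output formatting, which can be shown in isolation. This is a sketch with a sub name of my own choosing (format_result); the script itself does the same thing inline.

```perl
use strict;
use warnings;

# Mirror of the formatting rule in Eval: whole numbers print as-is,
# everything else is rounded to three decimals.
sub format_result {
    my ($value) = @_;
    return $value == int($value) ? "$value\n" : sprintf("%.3f\n", $value);
}

print format_result(6 * 7);
print format_result(10 / 4);
```

On the command line, that corresponds to running Eval '6*7' and Eval '10/4'; the quotes keep the shell from expanding the asterisk.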

7

Did you ever forget exactly which octal code to use for the ASCII letter 'h', or the hex number for the backtick (`) character? 7 to the rescue!

#! /usr/bin/perl -C

for (32..126) {
    printf "%3d 0x%02X 0%03o 0b%08b %c\n", ($_)x5;
}

And if you still haven't made the transition to Unicode/UTF-8, printing the printables in e.g. a Latin character set may still be interesting; create a copy of the file called 8, which works on the range 160..255 instead. Or be fancy with basename checking and all that. :)
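The "fancy" variant with basename checking might look something like this sketch, where one file serves as both 7 and 8 (via a copy or a link). The sub names range_for and table_row are hypothetical, and encoding the output as UTF-8 is my assumption about the terminal.

```perl
use strict;
use warnings;
use File::Basename qw(basename);

# Pick the code point range from the name the script was invoked under:
# "8" selects the upper Latin-1 range, anything else the ASCII printables.
sub range_for {
    my ($name) = @_;
    return basename($name) eq '8' ? (160, 255) : (32, 126);
}

# One table row: decimal, hex, octal, binary and the character itself.
sub table_row {
    my ($code) = @_;
    return sprintf "%3d 0x%02X 0%03o 0b%08b %c", ($code) x 5;
}

# Only print the table when run as a script, not when loaded from elsewhere.
unless (caller) {
    binmode STDOUT, ':encoding(UTF-8)';
    my ($first, $last) = range_for($0);
    print table_row($_), "\n" for $first .. $last;
}
```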

Answer

When I write scripts that do potentially Dangerous Stuff on production servers (that happens way too often), I usually like to include some code that tells the user to think a bit before continuing. That is, I don't want all scripts to be fire-and-forget. This is another classical piece of code, which I've massaged for my own needs (and perhaps your needs as well?):

print "About to run $code.\n\n";
print "Are you sure? (y/N): ";
my $answer = <STDIN>;
chomp $answer;
if ($answer !~ /^y(es)?$/i) {
    die "Okay, aborting.\n";
} else {
    print "Okay, continuing!\n";
}
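If you want to reuse the prompt in several scripts, the check can be pulled out into a small sub. This is a sketch under my own naming (is_yes and confirm_or_die are hypothetical), and it treats end-of-file on the input as a "no".

```perl
use strict;
use warnings;

# True only for an explicit "y" or "yes" (any case); EOF, an empty line
# and everything else count as "no", matching the (y/N) default.
sub is_yes {
    my ($answer) = @_;
    return 0 unless defined $answer;
    chomp $answer;
    return $answer =~ /^y(es)?$/i ? 1 : 0;
}

# Ask for confirmation on $in (STDIN unless given) and die on anything
# but a yes.
sub confirm_or_die {
    my ($what, $in) = @_;
    $in ||= \*STDIN;
    print "About to run $what.\n\nAre you sure? (y/N): ";
    is_yes(scalar <$in>) or die "Okay, aborting.\n";
    print "Okay, continuing!\n";
    return 1;
}
```

Taking the input handle as an optional argument makes it possible to test the prompt without a terminal.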

Monday, October 19, 2009

183 days of Iron Man blogging

Today it's 183 days - more than half a year - since I started blogging, spurred on by mst's lightning talk at NPW 2009.

That ain't half bad.

But he's not losing his bet yet, argh! :D

Coding styles that make me twitch, part 5

Let us say that we have some code using DBI - old-fashioned, but it still works, kind of.

How would you like to see the following in a 6000 line CGI script you're supposed to debug?


    my $sth=&query("SELECT id FROM invoices WHERE invoiced='N'");
    my $invoiced=$sth->rows;

    my $sth2=&query("SELECT count(*) nusers FROM users \
LEFT JOIN people ... # long SQL statement
    my $row2 = $sth2->fetchrow_hashref;
    my $nusers = $row2->{nusers};

    my $sth3=&query("SELECT count(*) npeople FROM users \
LEFT JOIN people ... # long SQL statement
    my $row3 = $sth3->fetchrow_hashref;
    my $npeople = $row3->{npeople};

    my $sth4=...

And then, after carefully checking scope, you discover that the variables $sth2, $row2, $sth3, $row3, $sth4, $row4 etc. are not used anywhere else within the same scope.

Would you develop a tick?

I did, and the tick didn't lessen in strength when I discovered unsafe variable interpolation as well as a sub query that used $dbh->prepare($qtext) but didn't allow passing of usable parameters, such as, you know, bind variables.
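What I would rather see is something along these lines: a query helper that accepts bind values, and one small sub per figure, so that handles and rows never leak into the surrounding scope. This is a sketch and not the refactored real code; the names query and count_uninvoiced and the exact SQL are assumptions, and it presumes a DBI handle created with RaiseError enabled. It also asks the database for a count(*) instead of calling rows, since the DBI documentation cautions against relying on the row count of a SELECT.

```perl
use strict;
use warnings;

# Hypothetical replacement for the script's query helper: takes the
# database handle and bind values explicitly instead of interpolating
# variables into the SQL text.
sub query {
    my ($dbh, $sql, @bind) = @_;
    my $sth = $dbh->prepare($sql);
    $sth->execute(@bind);
    return $sth;
}

# One sub per figure keeps $sth and the fetched row out of the caller's scope.
sub count_uninvoiced {
    my ($dbh) = @_;
    my $sth = query($dbh,
        'SELECT count(*) FROM invoices WHERE invoiced = ?', 'N');
    my ($n) = $sth->fetchrow_array;
    return $n;
}
```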

I've started on "refactoring" the real life code this example is based on, but I get depressed from the sheer amount of work in fixing many problems at once. Maybe I should pick up drunken coding to become less perfectionist.

Monday, October 12, 2009

Use Digest::MD5 - it's easy

In my previous entry, I presented and purposefully ignored a rather non-portable way of getting an MD5 sum from a file:

$md5 = `md5sum $filename.new | awk '{ print $1 }'`;
$md5 =~ s/\n$//;

There are stupider and better ways of doing this in the system call, but it completely ignores the problem that the command is not always called md5sum.

Digest::MD5 comes to the rescue!


use autodie; # Hee-hee
use Digest::MD5;
my $digester = Digest::MD5->new;
open(my $fh, '<', "$filename.new");
binmode($fh);
$digester->addfile($fh);
my $md5 = $digester->hexdigest;

Okay, that looks slightly over-complicated; one might argue that Digest::MD5 should hide the file handle fiddling from the user. The value added comes when you have a chunk of data already in a variable, then you just do use Digest::MD5 qw(md5_hex); and call md5_hex($data).

Oh, and I snuck in something else again, didn't I.
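Both routes can be put side by side in a small sketch. The sub name file_md5 is my own invention; md5_hex, addfile and hexdigest are the module's documented interface.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Hex MD5 digest of a file's contents, read in binary mode.
sub file_md5 {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Cannot open $path: $!\n";
    binmode($fh);
    my $md5 = Digest::MD5->new->addfile($fh)->hexdigest;
    close $fh;
    return $md5;
}

# Data already in a variable needs no file handle at all.
print md5_hex("hello"), "\n";
```

For a file containing exactly the five bytes "hello", file_md5 returns the same digest as the md5_hex call.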

Tuesday, October 6, 2009

Coding styles that make me twitch, part 4

All too often, I see code like this:


$md5 = `md5sum $filename.new | awk '{ print $1 }'`;
$md5 =~ s/\n$//;
if ($md5 ne $origmd5) {
  system("mv $filename $filename.old");
  system("mv $filename.new $filename");
  …
}

Well, you get the idea.

Sure, Perl can pretend to be a glorified shell script, but there are perfectly well-functioning internal functions for these things.

Let's ignore the non-portability of the md5sum command for the time being, and also avoid shaving that awk call down with cut and an enclosing echo -n $(…) - we're here for the Perl, not the shell, right?

The above system() calls can easily be replaced with the following to save yourself a few unnecessary forks:


rename($filename,"$filename.old") and
rename("$filename.new",$filename);

Note the slight refinement of using and to avoid clobbering the file. But the documentation for rename() suggests that we use move() from File::Copy instead:


use File::Copy qw(mv);
mv($filename,"$filename.old") and
mv("$filename.new",$filename);
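A runnable sketch of the same rotation, with each step checked separately so that a failure says which move went wrong. The temporary directory and file names are assumptions made for the demo, and move() is used under its long name.

```perl
use strict;
use warnings;
use File::Copy qw(move);
use File::Temp qw(tempdir);

# Throwaway directory and file names, purely for illustration.
my $dir      = tempdir(CLEANUP => 1);
my $filename = "$dir/data.txt";

# Create the current file and its ".new" replacement.
for my $spec ([$filename, "old"], ["$filename.new", "new"]) {
    my ($path, $content) = @$spec;
    open(my $fh, '>', $path) or die "Cannot write $path: $!";
    print {$fh} "$content\n";
    close($fh) or die "Cannot close $path: $!";
}

# move() returns true on success and sets $! on failure.
move($filename, "$filename.old")
    or die "Cannot move $filename to $filename.old: $!";
move("$filename.new", $filename)
    or die "Cannot move $filename.new to $filename: $!";
```

Dying on the first failure has the same protective effect as chaining with and: the second move never runs if the first one failed.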

Similarly, there are nice built-in functions for many other frequent victims of system(), e.g.:


chmod
chown (which also covers chgrp)
link
mkdir
rmdir
symlink
unlink

And of course there is a bunch of more or less helpful modules on CPAN (in addition to the already mentioned File::Copy), e.g. File::Path.
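As a rough sketch of how a few of these fit together, assuming a reasonably recent File::Path that provides make_path and remove_tree. All paths are invented for the demo and live in a temporary directory.

```perl
use strict;
use warnings;
use File::Path qw(make_path remove_tree);
use File::Temp qw(tempdir);

my $base = tempdir(CLEANUP => 1);      # scratch area for the demo
my $deep = "$base/a/b/c";              # hypothetical nested path

# make_path() creates intermediate directories, like `mkdir -p`.
make_path($deep);

my $file = "$deep/note.txt";
open(my $fh, '>', $file) or die "Cannot write $file: $!";
close($fh) or die "Cannot close $file: $!";

# chmod and unlink return the number of files changed, rmdir returns
# true or false, so each can be checked without forking a shell.
chmod(0600, $file) == 1 or die "chmod failed: $!";
unlink($file)      == 1 or die "unlink failed: $!";
rmdir($deep)            or die "rmdir failed: $!";

# remove_tree() removes what is left, like `rm -r`.
remove_tree("$base/a");
```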

Wednesday, September 30, 2009

Contributing to Perl

Inspiration is a big bother, at least when you don't have it.

I never felt that I had anything to contribute to the Perl community until the Nordic Perl Workshop this year, when I suddenly was involved in a small way.

I'm still involved in a small way, and only occasionally, but that works rather well.

In retrospect, my expectation that contributing would be a Big Deal turned out to be wrong.

I don't have to solve all the problems in the world, and I don't have to solve the Big Ones, either.

It doesn't even take much of my time, and this level of contribution is one that suits many people.

Inspiration comes from the impression I have that there are many more people like me, and that all those little trickles of water we add to the pool are enough to keep the more active contributors going.

What is difficult, however, is to keep up with the Iron Man blogging challenge. I knew it would be when I got started, but I figured that there would always be something to write about without seeming too inane.

Today's post shows that I was wrong on that count. :D

Wednesday, September 23, 2009

I blame society

This is only vaguely Perl-related.

This weekend, I visited Abigail and her spouse in their home in the Netherlands.

I know Abigail from several settings, and Perl is one of them; Abigail has taught me many cool finesses in Perl 5, which has saved me considerable time both personally and professionally.

The friendships and connections we make "virtually" can also be very strong IRL, and I feel that has been the case with us.

If there is a point to this post, then that is: visit your Perl friends!

Sunday, September 13, 2009

Some ways that Perl 6 is grand, part 3 of ?

I'll be really brief this time, I promise!

Several things about Perl 6 are there to save programmer time. Some of them even will do so. ;) This week's favourite is pure laziness. Let's imagine we have one of those lists with month names in them again, and for some odd reason only want to do something to every third month.

for ^12 :by(3) {
  say @months[$_];
}

I think that's just neat.

The caret notation is the "upto" operator, and shorthand for a range from 0 up to, but not including, its argument. So in the example above, we're asking for the range from 0 to 11.

The by adverbial is new in the specification. It denotes the increments for the upto operator, and allows e.g. real numbers, so that you can increment by 0.25 if it makes sense for your code to do so.

The above example then essentially iterates over 0, 3, 6, 9.

The by adverbial isn't yet implemented in Rakudo (well, duh, the spec just saw it added), where you have to settle for this for now:

for ^4 {
  say @months[$_*3];
}

… and something different if you want to increment by 0.25. I hope you don't want to increment through the list of months by 0.25.
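For comparison, here is a Perl 5 sketch of the same stepping. The month list is re-declared so the snippet stands alone, and the second variant is just one alternative among several.

```perl
use strict;
use warnings;

my @months = qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);

# C-style loop: start at 0, stop before 12, step by 3.
my @picked;
for (my $i = 0; $i < 12; $i += 3) {
    push @picked, $months[$i];
}
print "@picked\n";   # prints Jan Apr Jul Oct

# Alternative: filter the index list and take an array slice.
my @also = @months[ grep { $_ % 3 == 0 } 0 .. $#months ];
```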

Sunday, September 6, 2009

Print-and-log in Perl 6

In one of my early blog entries, "Simple print-and-log subroutine", I shared a small piece of code that has been a nice, every-day tool - in Perl 5.

Today, I set about converting that to a naïve Perl 6 version, and being quite the Perl 6 n00b still, I ran into a few hurdles along the way.

The hurdles were easy enough to avoid, once masak++ and moritz++ had bonked my head sufficiently.

I'll walk through the code, step by step, to illustrate what I learned today, and what others might need to know.

use v6;

Ah, first, remember this statement. It's a nice hint for other software if you want to continue using the .pl file suffix instead of .p6l etc.

my $verbose = 1;
my $logfile = "/tmp/test.log";

my $*PREFIX = "plog";

Variables starting with the twigil $* are what we call contextual variables, and they just started working yesterday in Rakudo. Contextual variables follow a dynamic call path. When we reuse this variable later, it will depend on which call path we followed.

sub plogwarn ($msg)
{ 
  my $*PREFIX = "plogwarn";

$*PREFIX now has a different value only for the cases where we've called the plogwarn subroutine, and the subsequent call to plog inherits this value.

  plog $msg,1;

Oh, BTW, it's really, REALLY important to keep track of where you add whitespace or not. I'd been sleeping in class, and forgotten completely that plog($msg,1) would call plog with the parameters $msg and 1, while plog ($msg,1) would call plog with the single parameter ($msg,1) (yep, a list). It may be safer to avoid parentheses altogether when you're not dealing with lists - except when you have to.

}

sub plogdie ($msg)
{ 
  my $*PREFIX = "plogdie";
  plog $msg,2;
  exit 1;
}

sub plog ($msg is copy, $level?)

The trait is copy means that I want to make a copy of the incoming parameter $msg, so that I can modify it inside of plog. Normally, parameters in Perl 6 are read-only and pass-by-reference (though not "reference" as you know it from Perl); the parameter as named is merely an alias for the actual variable. I can change the parameter in the caller if I use is rw, but that's not what I want to do here. The question mark in the second parameter - $level? - means that the parameter is optional, and I've used that in both plogwarn and plogdie above.

{ 
  my $dt = Time.gmtime.date.iso8601
         ~ " "
         ~ Time.gmtime.time.iso8601;

Normally, I'd use localtime, but this is NYI (not yet implemented) in Temporal.pm.


  $msg = ($dt,"[$*PREFIX]",$msg).join(" ");

Here, I abuse that is copy to save myself one precious variable, just as an example.


  if $verbose {
    given $level {
      when 1 { warn $msg; }
      when 2 { die $msg; }
      default { say $msg; }
    }

The given…when construct is similar to the switch…case construct in other languages, but subtly different, as it allows more possible values and value types than most. Coming from a world of Perl 5.8 and older, this is simply lovely.

  }
  # Append message to $logfile
  my $*OUT = open $logfile, :a;

Here's another use of a contextual variable, but this time it's the one normally associated with STDOUT. print FILEHANDLE $msg is no longer necessary, because you can always assign $*OUT in its own scope. The open statement has :a to indicate that I'm opening for append (so the syntax is less unixy than Perl 5's >>).

  say $msg;
  $*OUT.close;
}

This is how we can use the three subroutines, and the output should illustrate the different contextuals nicely:


plog("Oh, hello, there, sweetie.");
plogwarn("Consider yourself warned");
plogdie("Die, frackin' monster, die!");

…outputs…

2009-09-06 19:59:32 [plog] Oh, hello, there, sweetie.
2009-09-06 19:59:32 [plogwarn] Consider yourself warned
2009-09-06 19:59:32 [plogdie] Die, frackin' monster, die!

And here's the complete example code, which works with the current Rakudo:

use v6;

my $verbose = 1;
my $logfile = "/tmp/test.log";
my $*PREFIX = "plog";

sub plogwarn ($msg)
{ 
  my $*PREFIX = "plogwarn";
  plog $msg,1;
}

sub plogdie ($msg)
{ 
  my $*PREFIX = "plogdie";
  plog $msg,2;
  exit 1;
}

sub plog ($msg is copy, $level?)
{ 
  my $dt = Time.gmtime.date.iso8601
         ~ " "
         ~ Time.gmtime.time.iso8601;

  $msg = ($dt,"[$*PREFIX]",$msg).join(" ");

  if $verbose {
    given $level {
      when 1 { warn $msg; }
      when 2 { die $msg; }
      default { say $msg; }
    }
  }
  # Append message to $logfile
  my $*OUT = open $logfile, :a;
  say $msg;
  $*OUT.close;
}

Monday, August 31, 2009

Some ways that Perl 6 is grand, part 2 of ?

Okay, this is really part 1b of ?, but…

In my earlier post, I used the zip operator to join two lists into a hash.

There was one obvious use of the operator that escaped me at the time, and that was how I sometimes need to create a new hash from the keys of two hashes, or keys and values. And now I think it's starting to look neat:

my %A = { a => 1, b => 2 };
my %B = { z => 9, y => 8 };

my %AB = %A.keys Z %B.keys;
# { "a" => "z", "b" => "y" }

%AB = %A.keys Z %B.values;
# { "a" => 9, "b" => 8 }

However, this is a bit unpredictable, since the hash key order is undefined. So if you expect sorted keys, do that at the same time:

%AB = %A.keys.sort Z %B.keys.sort;
# { "a" => "y", "b" => "z" }

# Sort by B's values - two variants
%AB = %A.keys.sort Z map { %B{$_} }, %B.keys.sort;
%AB = %A.keys.sort Z %B.sort.map( { .value } );
# { "a" => 8, "b" => 9 }

The equivalent Perl 5.10 version would be:

use List::MoreUtils qw/zip/;
my @k = sort(keys(%A));
my @v = map { $B{$_} } sort(keys(%B));
%AB = zip @k, @v;

I now have a nice-ish argument for upgrading to Perl 5.10.1 on $workplace's servers. :D

masak++ for helping a tired me with the map expression.
Chas. Owens++ for spotting the missing use statement for Perl 5.
Pm++ for another way of sorting by value, just what I was hoping for!
isec++ for spotting a missing sort() for Perl 5.

Sunday, August 23, 2009

Autovivification - a reminder

Most of you already know this by heart, but the odd reader may have forgotten.

Autovivification is what we call the process of automatically creating entries in built-in data structures (Perl 5: array/list and hash), usually at the time we check whether an inner element exists or not.

This can be a royal PITA, if you don't pay attention to the problem. That's why it keeps being mentioned.

Here's a simple example:


my %hash;
my $n = 0;
while (!exists ($hash{x}) && $n < 5) {
    $n++;
    if (!exists ($hash{x}{y})) {
        print "hash{x}{y} does not exist: $n\n";
    }
}

Q: How many times does the above while loop run in Perl 5?
A: Once.

The simple matter of checking the existence of the inner hash resulted in an entry being created for $hash{x}.

That means that tests like these should be written more carefully:


…
    if (defined ($hash{x})) {
        if (!exists ($hash{x}{y})) {
            print "$n: hash{x}{y} does not exist.\n";
        }
    } else {
        print "$n: hash{x} is undefined.\n";
    }
…

This prints $n: hash{x} is undefined. (with an incrementing $n) five times.
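The effect can also be observed directly, without a loop. This is a small sketch using only core Perl; the variable names are made up for the demo.

```perl
use strict;
use warnings;

my %hash;

# Merely looking at the inner key creates the outer one.
my $inner_exists      = exists($hash{x}{y});
my $outer_after_naive = exists($hash{x});    # true: $hash{x} was autovivified

# Guarded check on a fresh hash: && short-circuits before {x}{y} is touched.
my %guarded;
my $guarded_inner     = exists($guarded{x}) && exists($guarded{x}{y});
my $outer_after_guard = exists($guarded{x}); # false: nothing was created

print "naive check created \$hash{x}\n"   if $outer_after_naive;
print "guarded check left %guarded empty\n" unless $outer_after_guard;
```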

Edit 2009-08-24 18:49 UTC: MST commented that there is an autovivification module on CPAN that lets us say no autovivification; - and it's even lexically scoped! That's just $notreallyanexpletive brilliant! Thanks, Matt, and thanks, Vincent!

Oh, BTW, Perl 6 has a useful specification for autovivification, which illuminates the problem further.

Sunday, August 16, 2009

Hash key sort order - a Perl 6 community sunshine story

The order of hash keys is implementation dependent and arbitrary. Unless %hash is altered in any way, successive calls to .keys, .kv, .pairs, .values, or .iterator will iterate over the elements in the same order.

S09 - Hashes

This is new. It may not actually say much, but it does say what was implicit before, so that there is little room for doubt.

So here's my little sunshine story about how easy it is to clarify a part of the spec.

In my previous post, I used an imaginary case for showing off some features of Perl 6 - some of which also are available in Perl 5.10, as mentioned by Robert 'phaylon' Sedlacek in a comment.

This wasn't the only useful comment; I think there's a bit to be learned by reading those, so please do.

But I digress from the point of this post, which is a question that was raised in another comment to last Sunday's post:

By the way, do you know where the behavior of ~%h is spec'ed? I keep getting the keys back in the same order I put them in and don't know if that is an implementation quirk or a feature.

- Chas. Owens

The spec wasn't very clear about this; S32/Containers - Hash said that certain iterator methods iterate "… the elements of %hash in no apparent order, but the order will be the same between successive calls to these functions, as long as %hash doesn't change."

S09 - Hashes didn't say anything about it at all.

I said I would ask around. Thanks to the excellent community channel #perl6 on Freenode, I got an answer similar to this: no, this is unspecified/undefined behaviour, but feel free to come up with a better way of saying it, and update the synopses.

And how hard is it to update the synopses? Not at all! If we want to contribute, we get access. It's as easy and simple as that.

First, you need to check out (the part of) the svn repository that you want to contribute to:

svn co http://svn.pugscode.org/pugs/docs/Perl6

Then you change whatever you want to change, preferably discuss it with some of the experienced souls on #perl6 or e.g. the Perl 6 language mailing list, show diffs on e.g. gist.github.com or paste.lisp.org, and if you think you're doing the right thing - commit the change.

"But I can't commit, I only have read access" you might say. Just ask in any of the mentioned fora for a "commit bit" and state your e-mail address and preferred username, and someone will help you out with that part.

As we say on #perl6: community++

I got started, will you join me?

Sunday, August 9, 2009

Some ways that Perl 6 is grand, part 1 of ?

(After YAPC::Europe in Lisbon, we no longer say "awesome", we say "grand". ;))

In Lisbon, there were several talks that aided us to a better understanding of how Perl 6 may be more pleasant and useful than Perl 5.8, or even Perl 5.10.

I thought I'd try to illuminate some of these as I progress in my own knowledge of Perl 6.

First choice: the zip operator and the new quoting syntax for generating lists and selecting items from hashes:


my @months = <Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec>;
my %days = @months Z 31,28,31,30,31,30,31,31,30,31,30,31;
say "Days in March: " ~ %days<Mar>;
say "Days in June: {%days<Jun>}";
say ~%days;

That was quite a bit in one go. :) First, I create a new list of abbreviated month names. Then I combine the list of months with a list of days in that month (for 2009, obviously) using the zip (Z) operator, and print the number of days in March and June. Finally, I pretty-print each month with the corresponding number of days.

The result looks like this:


Days in March: 31
Days in June: 30
Jan     31
Feb     28
Mar     31
Apr     30
May     31
Jun     30
Jul     31
Aug     31
Sep     30
Oct     31
Nov     30
Dec     31

Now imagine what that would have to look like in plain Perl 5.

There are several new things here that make your programming days easier:

The well-known sigils $, @ and % are now used in a consistent manner. @ signifies a list, % signifies a hash, also when you access items by index.
Interpolation of non-scalars is now handled with curly brackets {}.
Building a list can be done using commas, no parentheses are necessary.
Building a quoted list can be done using angle brackets, which are auto-quoting.
Accessing a hash item is done using angle brackets, which should be welcome to those of us who don't use US/UK keyboards (bye-bye to {'argh'}, but it still works).
say is a new pretty-printing friend, which automatically adds newlines.
The concatenation operator is now tilde (~) instead of a period.
As a unary prefix operator, tilde by default stringifies a hash pair-wise with newline separators (while a list is stringified with space separators). It is possible to change this behaviour in derived subclasses.
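For comparison with the Perl 6 example above, here is one way the same thing might look in plain Perl 5. It is a sketch using a hash slice rather than a zip operator.

```perl
use strict;
use warnings;

my @months = qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);

# A hash slice pairs each key with the value in the same position.
my %days;
@days{@months} = (31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31);

print "Days in March: $days{Mar}\n";
print "Days in June: $days{Jun}\n";

# Hash order is arbitrary, so iterate over @months to keep calendar order.
print "$_\t$days{$_}\n" for @months;
```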

Edit 2009-08-10 08:34-09:18 UTC: Clarified and added interpolation.

Tuesday, August 4, 2009

Mini report from YAPC::Europe 2009

The second day of the conference is now finished.

Yet again, the lightning talks have been a great source of entertainment, but the best regular entertainment was probably Paul Fenwick's and Damian Conway's sparring ... in Klingon.

If you can lay your hands on the video of their respective talks, see them in order, and enjoy the show.

You might also learn some really interesting Perl, or fall deep into the pits of insanity.

Friday, July 31, 2009

YAPC::Europe 2009

Here we go again!

Thankfully, my employer thought that this year's YAPC::Europe (Corporate Perl) could be relevant for me, and so I'm going, hoping to learn yet more from all the excellent minds present there (modulo swine flu, I suppose ...).

The schedule contains way too many interesting subjects at the same time, but I'm happy to say that there's plenty of breathing room and time for socializing.

And this time, I'm bringing a computer, so that I can pretend to be doing something useful. Haha.

See you there!

Monday, July 27, 2009

Perl, open source and gender bias

Recently, I have been reading a discussion regarding gender bias in the tech industry.

I realize it is a problem, but for the past years, it has been more of an intellectual realization than something that hit me in the gut. Perhaps it has something to do with the work place culture where I have been working, combined with the informal technical fora I have seen, as well as the general state of things here in Norway.

However, these things are perhaps not talked about quite as much as they should, and today I re-learned that although things may seem mostly rosy, they are not necessarily so.

Kirrily "Skud" Robert has been active in various open source contexts for years, and should be no stranger to the Perl community. She recently held a presentation at OSCON -- Standing Out in the Crowd -- where she discussed some of the challenges when women are in the minority, and experiences with open source projects where they are in the majority.

What kicked me in the gut this time was the rather low percentage of women involved in Perl. While better than open source in general, it seems excessively low.

But Skud does not sling mud at the Perl community. Instead -- if I understand her correctly -- she aims to help open source communities -- and I suppose particularly new projects -- to utilize the resourcefulness of women in development, to reduce the gender bias by increasing the overall activity.

I can only recommend that you read the presentation, and contemplate the issue for yourself.

It certainly made me think. Again.

Tuesday, July 21, 2009

A comment on sandboxing and web security

In "What I Want In Firefox (Parrot)", Ovid expresses his desire for Perl 6 scripting functionality, with the caveat "if the sandbox is secure enough".

Aha, there's your problem.

The Firefox developers clearly didn't have sandboxing in mind when they designed the browser; scripts are essentially free to do what they wish at least with the DOM, as well as access many central browser functions, plus a bunch of things that we really don't want security vulnerabilities for.

Another problem is that JavaScript is fundamentally entrenched in Firefox; too much of Firefox internals are based on JavaScript, and I suspect that a Perl 6 addition would only work so-so at best, even if that hypothetical secure sandbox existed.

Oh, and BTW: the example commits one of the great no-nos of web programming: pushing server side security into the browser, which essentially is no security at all, as seen from the server's point of view.

Why am I being so negative all of a sudden? Well, it might be sudden for this blog, but I've been working with security related cleanup and detection in the context of Linux system administration for a few years now. It's not enjoyable, I can tell you, and it definitely colours my perception of these things. Also, I've had a fair bit to do with thinking about security earlier.

Recommended reading (that should be required) for web programmers:

Innocent Code by Sverre H. Huseby (a buddy of mine), ISBN: 978-0470857441

Friday, July 17, 2009

Iron Man going strong

90 days on ...

English, German, Hebrew, Japanese, Magyar, Perl 5, Perl 6, Russian, ...

We're positively flourishing here!

When mst announced the Iron Man concept in his lightning talk at NPW 2009, I felt inspired, but I didn't expect it to work this well.

There are many extremely interesting blogs out there, sharing knowledge and experience at a level I definitely didn't expect.

It feels unique, it's wonderful, and I hope this will continue snowballing!

Thanks, guys!

Wednesday, July 8, 2009

Coding styles that make me twitch, part 3

Today's twitchiness is sponsored by ... no, wait, I don't have sponsors. Ah, well.

I have an issue with people who insist on using <"> as a string delimiter when the (static) string itself contains that very character. It gets fugly all too soon:

my $html_output = "<a href=\"http://www.example.com/foobar/$pagename.html\" title=\"Oh, a link to $pagename\"> ...\n";

It's so easy to avoid having to quote the <"> while still allowing variable interpolation:

my $html_output = qq(<a href="http://www.example.com/foobar/$pagename.html" title="Oh, a link to $pagename"> ...\n);

Since the example is HTML (and could be e.g. SQL), and it might be multi-line, why not ...

my $html_output = <<EOL;
<a href="http://www.example.com/foobar/$pagename.html"
   title="Oh, a link to $pagename">
EOL

That wasn't so hard? Or fugly?
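A quick sketch to show that the variants build the same string. The page name and the helper variable $url are assumptions made to keep the lines short.

```perl
use strict;
use warnings;

my $pagename = "about";   # hypothetical page name
my $url      = "http://www.example.com/foobar/$pagename.html";

# Backslash-escaped double quotes inside a double-quoted string.
my $escaped = "<a href=\"$url\" title=\"Oh, a link to $pagename\">\n";

# qq() interpolates like double quotes, without the escaping.
my $with_qq = qq(<a href="$url" title="Oh, a link to $pagename">\n);

# Other delimiters work too, handy if the text contains parentheses.
my $with_braces = qq{<a href="$url" title="Oh, a link to $pagename">\n};

# A double-quoted here-document; the trailing newline comes for free.
my $heredoc = <<"EOL";
<a href="$url" title="Oh, a link to $pagename">
EOL

print "all four are identical\n"
    if $escaped eq $with_qq
    && $with_qq eq $with_braces
    && $with_braces eq $heredoc;
```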

Thursday, July 2, 2009

Ruby users anonymous

Last weekend, I visited a couple of friends -- Ina and Stig -- and being nerdy geeks, we (well, Stig and I, anyway) ended up chatting about programming and stuff.

Stig has just started programming Perl again, after a long period where he'd probably claim to be apostate.

Everybody, meet Stig. Stig used to program Ruby, but now he's recovering.

Edit: Stig's blog address injected.

I hope this shows that there's hope for the ones who got lost.

Wednesday, June 24, 2009

Virus scanning with File::Scan::ClamAV

This is almost ridiculously easy.

Problem: a bunch of user directories need virus scanning and per-user reports

Solution: A Perl script using File::Scan::ClamAV

Prerequisites:

A unixy OS
Perl 5.8 or 5.10
A functional and running clamd, preferably listening to a socket
The module File::Scan::ClamAV (and its dependencies)

The following code could (after some sensible adjustments) run in a loop through all usernames on your system.


use File::Scan::ClamAV;
# (...)
# $dir contains the full path of the user's directory
my $av = File::Scan::ClamAV->new(find_all => 1,
                                 port => '/tmp/clamd.socket');
# find_all asks clamd to report every infected file instead of
# stopping at the first one it finds.
# /tmp/clamd.socket is where my clamd has its socket.
# Other clamd configurations may differ.

unless ($av->ping) {
    plogdie "clamd isn't running, aborting virus scan";
} else {
    plog "Performing virus scan for $uname";

    # Save virus information per username ($uname).
    # Note! scan() returns a hash (as a list), so wrap it in an
    # anonymous hash to store one reference per user.
    $a_viruses{$uname} = { $av->scan($dir) };

    if (%{$a_viruses{$uname}}) {
        my @vfiles = sort keys %{$a_viruses{$uname}};
        plog "$uname has ".@vfiles." viruses.";

        # Home assignment: print contents of $a_viruses{$uname}
        # to a file, using the sorted list @vfiles.
    }
}

Monday, June 15, 2009

Frequently freaky freakin' one-liners

So, hey, I'm sitting here without anything good to blog about, probably like most people on the net.

I'm wondering what daily Perl usage that's even vaguely useful that I do, which could be improved upon.

Ah, of course, triple-f one-liners!

As a tool, the perl command often seems to replace a jungle of echo + egrep + cut + tr + sed + awk and whatnot. perl -nawe and ctrl+r (reverse i-search) in bash are good friends of mine, but after using the same one-liners a few times in a row, I usually end up converting them to tidy files with Getopt::Long, comments and other insanities.

And at some stage later, I say to myself: damnit, I should've coded this more generally, I start a recode, get distracted, solve a new problem with one-liners, and the circle of life goes on.

Do I need professional help?

Monday, June 8, 2009

DateTime performance hit

This is mainly in answer to the comment to my previous post on revisiting plog.

plog is a piece of code ready for copy+paste into whatever codebase you're in at the moment, and almost completely agnostic. Yes, in code where I'm already using DateTime, then I use those routines.

You could also create a small sub to hide the "nasty idiom".

Or you could use Date::Format instead (ca. 10 kB codebase instead of 110 kB, and between one and two orders of magnitude quicker for this particular purpose):

require Date::Format;
@lt = localtime();
# require doesn't import strftime, so call it by its full name:
$dt = Date::Format::strftime("%Y-%m-%d %T",\@lt);

Of course, that leaves you with the problem of whether the TimeDate packages are installed or not, and another test for that.

Besides, the bloat is not exactly insignificant if you have code that's running repeatedly.

The following example with 100,000 repetitions may seem ludicrous, but I actually have production code where timestamped log entries run into that order of magnitude. And yes, I would very much like to save that extra time.

Edit 2009-06-09 00:32 UTC: thanks to Dave Rolsky for the simplified testing code, and to Ilmari for reminding me of POSIX::strftime, which I'd previously rejected based on other people's claims that it was dog slow. I've removed the home-grown tests in favour of the results from a slightly modified version of Dave's sample test script.

Here are the numbers for 100k iterations, times derived from the rates:

Solution	Time	Rate
DateTime	490.2 s	204/s
DateTime (cached tz)	76.7 s	1 304/s
Date::Format	6.4 s	15 528/s
POSIX	3.3 s	30 030/s
localtime	0.6 s	163 934/s
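The two core-only rows can be compared with something like the sketch below. This is not Dave's script, which isn't reproduced here; the sub names and the iteration count are my own assumptions, and the numbers will differ from the table depending on the machine.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);
use POSIX ();

# Format a localtime()-style list by hand.
sub stamp_sprintf {
    my @lt = @_;
    return sprintf("%d-%02d-%02d %02d:%02d:%02d",
                   $lt[5] + 1900, $lt[4] + 1, $lt[3],
                   $lt[2], $lt[1], $lt[0]);
}

# The same timestamp via POSIX::strftime, which takes the list directly.
sub stamp_posix {
    return POSIX::strftime("%Y-%m-%d %H:%M:%S", @_);
}

# Iteration count is arbitrary; raise it for steadier numbers.
cmpthese(100_000, {
    localtime => sub { stamp_sprintf(localtime()) },
    POSIX     => sub { stamp_posix(localtime()) },
});
```

Both subs take the time list as arguments, so they can be checked against each other with a fixed timestamp before trusting the rates.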

Sunday, June 7, 2009

Print-and-log revisited

A month ago, I made a post with a simple print-and-log subroutine called plog.

I was recently asked two questions about this piece of code, and I'll answer them briefly now:

"What's up with the curly braces on a separate line after the sub declaration?"
"Why don't you use DateTime?"

Okay, those should be easy to answer while retaining the illusion of clue:

That's merely a personal preference, it quickly aids me in noticing whether something is a subroutine or a control statement block. I am aware that many Perl programmers disagree, and prefer all blocks to start in the same way regardless of what kind of block it is.
DateTime is an external module which, believe it or not, is not installed on all systems. Some people would also argue that it's bloated and slow. But in case you want to use DateTime instead, and/or check whether it's available at run-time, then ... well, see below.

sub plog
{
  my $msg = shift;
  my $level = shift;
  my $dt;
  eval {
    require DateTime;
    $dt = DateTime->now(time_zone=>'local')->strftime("%F %T");
  };
  if ($@) {
    my @lt = localtime;
    # Format current datetime manually:
    $dt = sprintf("%d-%02d-%02d %02d:%02d:%02d",
                  $lt[5]+1900,$lt[4]+1,
                  $lt[3],$lt[2],$lt[1],$lt[0]);
  }

(...)

As you can see, there isn't much code saved by using DateTime, even if we know it's already installed and don't need to add paranoia. The method of extracting data from localtime() is well-known, proven, and fairly nice on resources. Use DateTime if you want to, but perhaps it's best to save it for when you need to do more complicated stuff.

Saturday, May 30, 2009

Coding styles that make me twitch, part 2

I'll make today's post short, because it's all about line length.

Limiting the length of code lines is something I try to be good at, not because I think the next guy will have a VT100 terminal and needs a friendly piece of code, but because of basic readability.

When we read an ordinary text, e.g. in a book (you remember books, right?), there are usually quite few characters printed on each line. Here are a few from Douglas Coupland's JPod:

the door to see that all my new furniture was gone, and my
original furniture hadn't come back. Fuck. I phoned Greg,
but realized he was on Cathay Pacific 889, headed to Hong
Kong. I phoned Mom.

Notice how the text area is even narrower than that in this friendly blog?

This comes from a long tradition in printing; it isn't as if they couldn't have squeezed twice as many words in there, if the print was only smaller. But if the print was smaller, they'd probably make the book narrower as well.

It's far easier to read something when you don't have to move your eyes around too much from line to line. This is important to both slow and quick readers.

So, back to coders who make lines of 100+ characters:

What in friggin' Ruritania are you thinking? Not much, that's what.

*GROWL*

Friday, May 22, 2009

Querying quotas with Quota.pm

I never claimed that this blog would be an exercise in expertise. ;)

Prerequisites: Unixy OS with quotas enabled, Perl 5.8.8, Quota.pm 1.6.3.

I always like to provide some sort of progress display for my programs. As a sysadmin, there are times when it might be prudent to check users' files without inspecting them by hand, e.g. when checking for parasitical exploits in websites. The number of files and/or amount of disk space used seem like reasonable measurements for keeping track of that progress.

So you use Quota; and get coding, right?

Except that you may not know beforehand whether you're scanning a local file system, or a remote file system, and Quota.pm requires that you have a magical device identifier before asking what the quota is.

The manual says that you should do something like this:


my $r_uid; # User's real UID
my $dev = Quota::getqcarg($directory);

my @quotadata = Quota::query($dev, $r_uid);

Now we've got some nice quota data, in the following order:

Current blocks used, block soft limit, block hard limit, soft block time limit, current inodes used, inode soft limit, inode hard limit, soft inode time limit.

But no, there's a catch! If you run this on the local file server, rather than via NFS/RPC, then Quota::query() will barf, because $dev is erroneous. How did that happen?

Well, Quota::query() doesn't work if the device is local!

So we have to do this after calling Quota::getqcarg():


if ($dev =~ m{^/dev/}) {
    $dev = "127.0.0.1:$directory";
}

The irony is then that there appears to be a need for an RPC listener on the loopback device, at least.

Anyway, I hope this is useful; it helped me to make things Just Work.

Friday, May 15, 2009

Coding styles that make me twitch, part 1

We'll see how long this particular series gets.

I'll try to come up with some example of coding styles that annoy me, and post about it.

First off is the appended conditional at the end of long one-line Swiss knife code snippets:


my @var = sort { length($b) <=> length($a) } split /[-.,_+ ]/ , $input{longvariablename} if defined $input{longvariablename} && length $input{longvariablename} > 4;

(Yes, that's supposed to be one line, though it doesn't look like it.)

Please, pretty please, don't append conditionals at the end of long one-liners.

Really, just don't, m'kaaay.

Code should, unless it's a one-off one-liner in your shell prompt, be maintainable for others. "Others" includes yourself some time in the future, when you've forgotten what the (insert mst-inspired expletives here) you were thinking at the time you coded the stuff.

The above example isn't particularly complex, or difficult to understand, but it's all on one line, and is hardly easy to parse even if you've got that 170 char wide window to code in.

A few parentheses, a helper variable and a few more lines -- preferably keeping well within 80 columns -- surely won't hurt that badly.


my $lvn = $input{longvariablename};
my @var;    # declared out here so it is still in scope after the block
if (defined $lvn && length($lvn) > 4) {
    @var = sort { length($b) <=> length($a) }
           split(/[-.,_+ ]/, $lvn);
}
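
If the same split-and-sort shows up in more than one place, another option is to give it a name. This is only a sketch of that idea; the sub name pieces_longest_first and the sample input are made up for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: split a string on the separators used above and
# return the pieces longest first. Returns an empty list when the input
# is undefined or not longer than four characters.
sub pieces_longest_first {
    my ($string) = @_;
    unless (defined $string && length($string) > 4) {
        return ();
    }
    return sort { length($b) <=> length($a) }
           split /[-.,_+ ]/, $string;
}

my %input = (longvariablename => 'a-bbb.cc');
my @var   = pieces_longest_first($input{longvariablename});
print "@var\n";    # bbb cc a
```

The calling code then fits comfortably on one short line, and the condition lives next to the logic it guards.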

Of course, these are just my personal opinions, and I won't be knocking on your door with a baseball bat in hand if you don't do as I suggest.

Thursday, May 7, 2009

Simple print-and-log subroutine

I find that I have a use for this almost all the time. It's a silly little set of subs, but in my role as sysadm, I often need to go back and see what all those printed messages were.

So here is my not-so-elegant workhorse for when I need to stuff things into logs, and shuffling modules isn't an option. I hope it's useful for someone else, too.

Dependencies, assumptions and prerequisites:

Perl 5-ish
Pre-defined global variables:
- $logfile - a logfile that we can append to
- $msgprefix - a program or subroutine specific prefix
- $verbose - whether to print to STDOUT
Optional second argument to plog (plogwarn and plogdie set it for you):
- $level - log level (undef = normal, 1 = warn, 2 = die)
Preferably disabled output buffering

Usage:

&plog("Log this");
&plogwarn("Warn about and log this");
&plogdie("Log this and die");

The code:


sub plogwarn
{
   my $msg = shift;
   &plog ($msg,1);
}

sub plogdie
{
   my $msg = shift;
   &plog ($msg,2);
   exit 1;
}

sub plog
{
   my $msg = shift;
   my $level = shift;
   my @lt = localtime;
   # Format current datetime sensibly:
   my $dt = sprintf("%d-%02d-%02d %02d:%02d:%02d",
                    $lt[5]+1900,$lt[4]+1,
                    $lt[3],$lt[2],$lt[1],$lt[0]);
   warn "$dt $0: sub plog: No message!\n" unless defined $msg;
   # Three-argument open with a lexical filehandle; $! says why it failed.
   if (open(my $fh, '>>', $logfile)) {
       print {$fh} "$dt $msgprefix$msg\n";
       close $fh;
   } else {
       warn "$dt $0: sub plog: Failed to open logfile ($logfile) for append: $!\n";
   }
   if ($verbose) {
       unless (defined($level)) {
           print "$dt $msgprefix$msg\n";
       } elsif ($level == 1) {
           warn "$dt $msgprefix$msg\n";
       } elsif ($level == 2) {
           die "$dt $msgprefix$msg\n";
       }
   }
}
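
As an aside, the timestamp can also be built with strftime from the POSIX module, which ships with Perl, so no extra modules need shuffling. This is just a sketch; the sub name timestamp is my own, and it accepts an optional localtime-style list mainly so that it can be checked against a fixed date.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Same "YYYY-MM-DD HH:MM:SS" layout as the sprintf() in plog.
# With no arguments it formats the current local time; otherwise it
# expects a list in localtime order (sec, min, hour, mday, mon, year, ...).
sub timestamp {
    my @lt = @_ ? @_ : localtime;
    return strftime("%Y-%m-%d %H:%M:%S", @lt);
}

print timestamp(), "\n";
```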

Thursday, April 30, 2009

Schwartzian Transform - Perl 5 vs. Perl 6

This isn't quite news, but it's a cool little bit of code anyway.

Perl 5:


@sorted = map  { $_->[0] }
          sort { $a->[1] cmp $b->[1] }
          map  { [$_, foo($_)] }
               @unsorted;

Perl 6:


@sorted = @unsorted.sort: { .uc };

I'm willing to claim that Perl 6 makes this a bit more readable, in spite of the smoke, mirrors and curtains.

Read more about the Schwartzian Transform in Wikipedia.
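
To make the Perl 5 version above something you can paste and run, here is a small sketch. It assumes that foo() simply upper-cases its argument, to match the .uc in the Perl 6 line; the names sort_key and schwartzian_sort and the sample words are made up for the example.

```perl
use strict;
use warnings;

# Hypothetical stand-in for foo(): the sort key is the upper-cased string.
sub sort_key {
    my ($string) = @_;
    return uc $string;
}

# Compute each key once, sort on the cached key, then unwrap the values.
sub schwartzian_sort {
    my @unsorted = @_;
    return map  { $_->[0] }
           sort { $a->[1] cmp $b->[1] }
           map  { [ $_, sort_key($_) ] }
           @unsorted;
}

my @sorted = schwartzian_sort(qw(pear Apple banana Cherry));
print "@sorted\n";    # Apple banana Cherry pear
```

The point of the transform is that sort_key() runs once per element rather than once per comparison, which only starts to matter when the key is expensive to compute.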

Monday, April 27, 2009

Perl 6 - how to get started

Are you curious about Perl 6, and wonder how to get started?

Use proto, Carl Mäsak's Perl 6 installer (which will download parrot and rakudo for you).

Just make sure you have Perl 5 (5.8.8 or 5.10.0), git and svn installed first!

jani@knuth ~/prog/proto >./proto

*** CONFIG FILE CREATED ***

Greetings. I have created a file 'config.proto' that you may want to review.
Next time you run './proto', these settings will be used to bootstrap your
system into a working Perl 6 installation.

If you're new to this, or if configure settings scare you, you probably want
the default settings anyway. The most important ones are:
Rakudo   -> /ping/knuth/home0/jani/prog/rakudo
Projects -> /ping/knuth/home0/jani/prog

jani@knuth ~/prog/proto >./proto
Downloading Perl 6...downloaded
Building Perl 6...

This part may take a while; parrot is now building stuff for you, and when it's finished, you can run the perl6 binary:

Building Perl 6...built
jani@knuth ~/prog/proto >cd ../rakudo
jani@knuth ~/prog/rakudo >ls -F
CREDITS       MANIFEST  Test.pm  parrot/  perl6.o    perl6_s1.pbc  tools/
Configure.pl  Makefile  build/   perl6*   perl6.pbc  src/
LICENSE       README    docs/    perl6.c  perl6.pir  t/
jani@knuth ~/prog/rakudo >./perl6
> sub sum (*@numbers) { return [+] @numbers; }; say +(sum <1 2 3>)
6

See? Easy! Now get testing!

Tuesday, April 21, 2009

Enlightened Perl Iron Man Competition

Well, if this isn't inspiring ...

This week, I hope to dedicate some time to what I said I was going to do at the NPW 2009 Hackathon, which was to spec complex number representation and presentation.

Thanks to pmichaud, I've at least confirmed my suspicions about which parts of the spec I need to change.

Maybe I fail, maybe I don't, but I think Perl 6 is definitely worth the effort.

Sunday, April 19, 2009

Group Photos

LR:
ilmari, nothingmuch, claes, szabgab, krunen, ingy, baest, masak, TimToady, Ellen
trafl, jrockway, batman, pnu, jnthn, pmichaud, abigail, Gloria
mst, sjn, mberends, marcus, sadrak
Photographer: frettled

And, finally, as prompted by Ingy:

NPW 2009 Hackathon

The NPW 2009 Hackathon is well into its second day, and I've learned quite a bit about Perl 6 and more about Perl 5 than I expected.

I didn't originally intend to participate in the Hackathon, and I'm not doing much, but it's definitely worth it.

Yesterday, I stated a goal of adding to the spec a description of how complex numbers should be represented and presented, since that was apparently at least partially unclear.

This led me into a quagmire of other things I needed to do first, and the "ooh, shiny!" phenomenon led me astray a few times, ganging up with my general desire to have a functional working environment (Unicode strings in PODs didn't go down well with my current Latin 1-based working environment and the then-installed Perl version), as well as problems getting irssi on my side.
But I did get to participate and act as distraction in an entertaining and useful discussion regarding types in Perl 6, as well as host a social dinner for those who wanted a break from the hackathon.

Today, I'm picking up where I left off, trying to form a mental picture of the spec that's good enough to add and/or change relevant bits of it.

The rest of the guys are adding code and doing other useful things. :)