I had two gigs this week that both involved emergency analysis and repair on web applications that had been around for a few years without much trouble, only to blow up right around the time that a critical demo or business evaluation was about to take place.

Of course, in both cases, the app didn’t blow up just then – after looking into things, it became clear that things had been broken for quite some time.  One slipped off the radar because it was effectively on hiatus: no current customers, no active marketing, and, while there was an active signup page, no new “random” signups (though as it turned out, there may have been attempts that failed because the app was broken).  The other had a set of buffer mechanisms that kept a failure from being noticed for some time.

As it happens, both broke after a server move, even though both were supposedly tested afterwards.  The same thing can happen after an OS update, a change to a third party API (here’s something else that came up: Salesforce passwords can expire!) or other changes not directly linked to the application source code.  These kinds of errors generally won’t hit the home page, which is where a lot of supposed uptime monitors check the “health” of the site, usually with an HTTP request that triggers an alert if it doesn’t return a 200 (OK) response code.

For your consideration, a simple list of things to do to prevent “old age” outages:

Written test plans. Yes, they’re a pain to write.  Yes, they’re a pain to update.  Know what?  They’re the easiest things to delegate.  You can give a proper test plan to just about anyone, inside your group or remote, and they can run through it.  If your app has a decent amount of end user activity, this should only be necessary prior to an update (to the app or the server environment) since you’ll hear about outages quickly enough (though I’ve been amazed on some B2B applications how quiet customers can be,) but if your app is parked or only a percentage gets used in day to day operation, schedule a test run regularly.

Automated testing. As I mentioned above, most uptime monitoring sucks.  You can find out that your home page is loading (and I’ve seen apps that break the home page but still return OK) and you can monitor disk space, CPU load, etc, but why not take advantage of the modern UI testing tools like Selenium and have something hitting specific parts of your production website on a regular, scheduled basis? Note that this isn’t a substitute for actual documentation – if the whole team leaves, the new gang is going to have a hard time figuring out what the app is supposed to do in “normal” use, and sadly, might not be proactive in finding out (I was paid well this week simply because someone put an app in production without the first clue of how it worked or what resources it required.)
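As a much smaller step than a full Selenium suite, here is a minimal Ruby sketch of a scheduled check that looks past the status code and confirms that text only a working page would contain is actually there. The function names, the URL and the marker text are all made up for illustration; this is not code from either of the apps described above.

```ruby
require 'net/http'
require 'uri'

# Returns a list of problems found in a response; an empty list means the
# page looks healthy. `must_include` holds text that should only appear
# when the page really worked (all names here are hypothetical).
def page_problems(status, body, must_include)
  problems = []
  problems << "unexpected status #{status}" unless status.to_i == 200
  must_include.each do |marker|
    problems << "missing expected text: #{marker}" unless body.to_s.include?(marker)
  end
  problems
end

# Fetches a URL and runs the checks above. Meant to be called on a
# schedule (from cron, for example) against pages deeper than the home page.
def check_url(url, must_include)
  response = Net::HTTP.get_response(URI(url))
  page_problems(response.code, response.body, must_include)
end
```

Something like check_url('https://example.com/signup', ['Create your account']) would then return an empty array when the signup page renders properly, and a list of complaints to alert on when it doesn't.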

And really, that’s pretty much it to get started – I’ve worked with a number of clients over the years, and most don’t even do this stuff.  There’s obviously more that can be done, but if mere baby steps could be taken beyond “home page loads” I’d be happy for now.

I had a need to add some extra content to a WordPress home page for visitors from Google who were searching for something in particular.  In other content management systems, I’d just put some code in the template to check the HTTP_REFERER server parameter and emit my custom message directly.  This site was using WP-Cache, though, which I’m pretty sure doesn’t care about the referrer field, so depending on the moment the cache was filled, either everyone would have seen the message or nobody would, which doesn’t really help things.

I went for something quick and dirty that might not catch everyone, but it’s more than enough to meet my immediate goals – you’re welcome to expand on my solution, and hey, leave a comment if you do!

I opted to do it in JavaScript, since that would get handled by the client’s browser regardless of caching.  Basically, I read the referrer from the document.referrer property in JavaScript, which is the browser-side counterpart of the HTTP_REFERER server field (it might not be supported by all browsers, which is why we default to not showing the message to play it safe.)  Here’s the code:

<script type="text/javascript">
  // Reveal the hidden div when the referring URL carries q=your_keyword_here.
  if(document.referrer != '') {
    // Split on both "?" and "&" so q= is found even when it is the first parameter.
    var params = document.referrer.split(/[?&]/);
    for(var i = 0; i < params.length; i++) {
      if(params[i].length > 3 && params[i].substr(0, 2) == 'q=') {
        if(params[i].substr(2) == 'your_keyword_here') {
          document.getElementById("id_of_your_div").style.display = 'block';
        }
      }
    }
  }
</script>

Somewhere else on the page you’ll have a div with an inline style='display:none;' in it with the id that you’ll fill into the id_of_your_div spot above.  As I mentioned earlier, we default to display:none because there are some browsers that won’t be able to handle this for various reasons, so it’s better not to show them the special-case message if there’s a chance it won’t apply to them.

A few notes:

  • You can use this technique for any referrer, not just a specific search term on Google – for example, a “Welcome, Digg user!” banner if you get Dugg.
  • I don’t actually check if it’s coming from Google, because I figure some other referrer with a “q=” parameter is probably another search engine and so I lump the treatment all together.  Your needs may vary.
  • My comparison is case-sensitive, because it’s a single word term and clearly I’m too lazy to add a “.toLowerCase()” to the end of my first term.
  • When you’re testing this, you might not notice results at all, and that’s very likely because your page is still cached (I don’t know if WP-Cache checks the template file date or not; I did mine with a hook in the Thesis framework [affiliate link] and the cache overrode anything I changed.)  To get around this, just edit and update (without changing) the first post on the home page and the cache will reset.

Earlier this week I had an opportunity to catch a Windows Phone 7 briefing by Cory Fowler at the January Metro Toronto .NET User Group meeting.  As someone with mobile experience primarily in the iOS space but with strong ties to the Microsoft platform, I found it interesting to note the contrast between the two systems.

Historically, my opinion has been that Windows phone development has always seemed to target the hobbyist who has a day job using Microsoft tools – prior to app stores becoming all the rage, I knew many developers who yearned for one of the limited options running (at the time) Windows Mobile because they could make it do stuff.  Of course, very few actually bought a phone (usually an iPAQ) and far fewer still actually made anything even approaching a Hello World app.  I think there are a lot of reasons for that, but it doesn’t matter right now.

Now that there are so many stories about indie developers striking it rich (or winning the lottery, depending on your viewpoint) with the Apple app store, indie developers on the Microsoft side of the fence seem to have renewed incentive to get things started, and here’s the thing: that seems to be the only market Microsoft’s been targeting with their evangelism.

Sure, they did a lot of work pre-launch to get major developers on board with ports of their platforms, including music matching app Shazam and local heroes Polar Mobile, but there seems to be a massive segment missing here, and since it’s the one where I make most of my money, I think it’s kinda worth noting.

If you’re developing Windows Phone 7 applications on behalf of a client, either as a branded promotional app or a simple outsource job, getting feedback and approvals from clients sounds like it’s going to be hell.

Here’s how it works in the iPhone/iPad world: there’s this thing called ad hoc deployment, where you send me your device ID, I add it to the application as a tester (and I can add 100 of these IDs,) and I can just email you the files.  You drag the files into iTunes, sync your phone, and boom, you’ve got the app running.  You can try it out, give me feedback, and most importantly say “yes, this is great, ship it, and here’s a cheque.”

On Windows Phone 7, unless I’m missing something, it works a little differently.  If you want to load an app I send you to your phone, you need a developer account, which costs $99 a year.  And you have to “unlock” your phone (not like a GSM unlock; this just puts it into a developer mode which your carrier may or may not use as an excuse not to support you if your phone later has problems for any reason.)  And you need special software to load the app on.

In my experience, clients who commission apps don’t have the skills to do this reliably.  That’s not a knock against them – they’re good at other things like figuring out that they need a mobile app, and there’s a reason they hired me.  So then there’s option 2: they can come to my office (I hope they work nearby!) and I can either show them my phone or I can load the phone for them (which still requires unlocking but at least I can do the technical stuff) – but there’s a catch, in that you can only have 3 physical phones in your developer profile.

These restrictions are apparently in place to prevent people from just loading apps onto phones themselves without the app store, but in my opinion it kills a large sector of the app market – forget about outsourcing overseas, for example!

This is version 1 of the system, so hopefully this will change, but then there’s the chicken or egg issue that the phone line might get discontinued before it gets the critical mass of apps needed for a consumer-level success.  Don’t think it won’t happen – Microsoft killed their last mobile phone initiative pretty much at launch, and Paul Thurrott’s got some interesting insights on Microsoft’s near-term Windows deployment strategy (via Gruber) that suggest 7 might be the highest number we ever see.

On the plus side, I think that people familiar with Microsoft tooling are going to have an easier time making compelling apps in a hurry, which means the total development cost for Windows Phone 7 applications should be lower than an equivalent app on iOS.  The phone platform seemed a little rough to me (it’s more or less competitive with iOS 3.2, I reckon, which is almost 2 years old) but Visual Studio is a great environment with greater access to 3rd party libraries than Apple’s Xcode.

Now if we can just make apps that have been properly tested.

Password management with outsourced work

by admin on December 16, 2010 · 0 comments

The past week’s Gawker password mess is as good an opportunity as any to talk about password management when you’re outsourcing work to 3rd parties.  As a programmer for hire, I’ve seen just about every questionable practice you can imagine, and I’d like to think that it’s because I’m so darned trustworthy, but some of these incidents have happened a little too early in the relationship for my liking.  Here are a few tips for your company, whether you’re the outsourcer or the outsourcee.

It’s not about trust

No matter how trustworthy the person you’re giving the keys to might actually be, things happen. Maybe they’ve set their browser to remember passwords and then someone gets access to the computer through theft or temporary use.  Maybe they’re logging in on open WiFi and your service doesn’t use SSL, as demonstrated recently to be a huge problem.  Maybe they’ve got a virus that captures any password input on any web page.  There are lots of ways a secure password can get compromised without any actual action or intent.

Every service gets a unique password

Gawker’s breach also highlighted a common problem: people use the same passwords on many different sites all the time, or they use a consistent scheme like “add the number 55 to the end of the site name.”  With apps like 1Password, there’s no reason not to use as many passwords as you have logins anymore, and there’s also no reason to make them easy to spell.

Every service gets an account, where possible

If you’re using a service that lets you authorize multiple users to the same system, by all means, create a user account for each person who needs access, and if there are secondary access controls like privilege levels, only give out as much power as is needed to get the job done.  With a shared login ID, it’s hard to tell who did what in the event of problems, and it’s much more disruptive when you need to revoke access to the whole team rather than just one user.

When I’m granted access to a service by a client, the first thing I do is create my own credentials and then recommend they change the master password to something else.  That way, I’m covered after I leave the project: I assume (granted, sometimes it doesn’t happen) that my account’s been revoked, so any incidents that happen later aren’t anything to do with me.

Keep a list

With services that have multiple accounts like I mentioned above, this is easy, but for everything else, keep track of who has access to the master login and password.  You’ll want this for business continuity purposes in the event that you have to change the password (see below, but I’ve also heard a decent argument that by changing the password and only telling it to people who ask you’ve got a good way to clean house.)

Adhere to a change schedule

You should do this for your own passwords, but also for any shared accounts.  You want to minimize the exposure window in the event that there is a breach at some point.  Remember, if someone gets your password, depending on the service it might be more profitable to just sit back and read without changing anything that would tip you off.

Consider the worst case scenario

With all services, what’s the worst thing that could happen if someone got access who didn’t have your best interests at heart?  How could you change your business to reduce the impact of this scenario?  How could you monitor to make sure that you catch the problem as soon as possible in the event of an incident?

Avoid email

Try not to send logins and passwords by email.  There are lots of gaps in the chain where someone could get at them, and the messages tend to hang around longer than necessary.  I’m sure I’m not the only one who used to have a “passwords” mail folder (I don’t do this anymore…)

If not email, then how do you communicate them?  Without resorting to encryption, over the phone or in person is good, or you could invent some kind of mechanism to deliver it in parts over different services.  Honestly, I haven’t seen a technique that I really like yet, so I’m open to suggestions (PGP and other encryption schemes are still too tricky for most users.)  I like some of what I see here, and there’s a good suggestion in there to require users to change the password after they log in so no records are left (though obviously this doesn’t work with shared logins.)

Speaking of email, if your email account password is compromised, you’re in a heap of trouble, since most services send password reset messages via that account, which means if someone has access to your mail, they own your life.  You should change these passwords fairly often.

Do as I say…

Do I follow all of these rules?  Personally, not always.  I’m human. For clients, I try to follow protocol all the time – for their sakes and for mine.

Security covers a lot of areas, but basic access control at the password level is a good place to start that there’s really not much excuse for avoiding.  If you’re not following at least the above tips, I urge you to start, and be wary of any 3rd party that seems to think these ideas aren’t important.

How to update a Joomla component

by admin on December 4, 2010 · 0 comments

I’m doing some work for a client who has a number of Joomla-based websites, and he’d been hacked.  My usual tricks for figuring out the vector of attack failed me (the site had already been cleaned up by another firm, who may have disturbed my crime scene,) so all I was left with was the usual “find what holes I can, plug them, and wait for another intrusion to get more clues.”

Thanks to the work of @jeffchannell I didn’t have to look far for a starting point: the sh404sef SEO component had some vulnerabilities that go way, way back (Jeff tracked it to 1.0.20, but I saw it in a 1.0.11 install.)

This was my first Joomla component update for this client, so I took the opportunity to document the process for anyone new to Joomla work (or for myself, when I google “how to update a Joomla component” in 3 months…)

Get the site local

Do not update the site live and see what happens.  It’s going to be a mess.  Download the entire site and database dump, and set up a local website as a sandbox testing area.  Then put it in source control so you can revert easily while you try out your changes.

You’re going to want to pay attention to the PHP version in your sandbox.  If the Joomla version you’re using is before 1.5.15, PHP 5.3 won’t work, so you may need to install a different version on your development box, ideally the same as in production.  Yes, the next thing we’ll be doing is a core Joomla upgrade, but I want to spend a few more days getting to know how the site works before tackling that.

You’re also going to have to make some changes to your configuration.php file.  In theory you could change your host file to trick your computer into thinking that the production URL lives there, but you’ll want to be able to compare the test site and the live site, so to do that on the same computer you’ll have to at the very least update the $live_site variable to your local site URL.  For component updates you won’t be uploading the configuration, so this isn’t a big deal, and frankly, uploading the whole site from your test area isn’t wise either, in my opinion – read on…

Create and document a procedure

Try the update in your sandbox, writing down each and every step.  This is so you can replicate the process in production.  That’s right, you don’t want to update the site locally and just upload the whole thing, you want to do the process twice.  Why?  Maybe it’s a debatable thing, but your update might change something in the database you hadn’t thought of, or the component might update something remotely, or some kind of side-effect might show up that you can’t explain other than “it works on my dev box…”  Also, if you document the process you’ll be able to adapt and reuse it on other sites, which will save a ton of time, and someone else can do the deploy if you write it up properly.

Test test test

Having a local site means you can compare everything between dev and prod, and having the site in source control means you can revert as many times as you need (remember to store your database dump in source control too so you can revert that at will) to repeat a step in the process.

Prior to deployment, it’s a good idea to setup a virgin version of the sandbox again from your original downloads and try the process one last time – it’s amazing how many “oh, just patch that on the fly” events happen during an upgrade, and if you missed one it’ll break everything.  When you’re doing this deploy, work directly from the plan you wrote out, and pay close attention or you’ll find yourself skipping ahead.  You’re testing the component but you’re also testing your plan.  Even if you never use it again, this is a very important process that not enough developers adhere to – it’s brought joy to so many operations people I’ve worked with over the years, and that’s for a reason.

Deploy

Finally, run the procedure on your live system after backing up the whole site and the database.  Be sure to put the site in maintenance mode during the update so you can be sure that there’s no client impact (other than the site being offline) and that the database doesn’t get confused midway through the install from user activity around the component you’re replacing.  This is also key in the event you need to back out (replacing the site with your backup,) so no data is lost.

Lather, rinse, repeat

That’s how I do my upgrades for Joomla and just about everything else.  The sandbox setup is reusable, but you’ll want to re-download the full site and database every time you plan work to account for anything that might have changed on the site via public or other admin users.  It might seem like overkill, but in my honest opinion it’s a sign of a mature development cycle and the cost to do this is minimal compared to the cost of fixing problems on a live site after they happen.

Do you want this kind of care and attention for your website? Hire me!

QuickTime poster frames and aspect ratio

by admin on November 22, 2010 · 2 comments

When you’re embedding a QuickTime movie into a web page, you have to specify the width and height, and if it’s a movie that’s fully under your control, that’s not a problem, but if this is at all part of a CMS where users can and will upload things that don’t fit the specified design, you’re going to have a player that doesn’t look right.

The way to fix this is to use the scale parameter in your QuickTime video embed.  If you set it to “tofit” then the video will fit the bounding box specified by the width and height parameters, but it’ll also stretch and/or squish the video to cram it into that box, which can be less than ideal.

An alternative is to use a value of “aspect” for your scale parameter, which will respect both the bounding box and the aspect ratio of the video.  It won’t necessarily fit the box exactly, but it’ll look OK without breaking the rest of your page’s design.

HOWEVER.

If you use what’s called a “poster frame” in QuickTime, you can load an image that displays before the video plays, like some kind of “click to start” image, and when the user clicks the image, the movie starts.  This is done by setting the src parameter to the image file and adding an href parameter that’s the URL of your actual movie.

All great, except for this: the video won’t respect the scale parameter or the width and height.  It’ll cram itself into the box, but if it’s bigger than the box then the controller won’t show, nor will chunks to the left, right, and bottom of the video.

If you want to use a poster frame for your movie, you’re going to need to go outside of the plugin.  In my last project, I put an image in the markup with the poster image, and attached an onclick handler to it that removed the image and replaced it with the QuickTime embed on autoplay, which did the trick.

And yes, alternatives are to use a Flash player, HTML5 where supported, a back-end job to resize the movie file, and so on, but sometimes you’ve got to go with what the client wants.

Adding a prompt to select_tag in Rails

by admin on October 22, 2010 · 0 comments

Rails has a number of nifty form helpers,  but the select tag can throw you off your game at first.

For starters, there are two different sets of form helpers, one for models (for use in an edit form, for example) and one for standalone forms (perhaps in a filter.)  The FormOptionsHelper select method works great for models.  Here’s an example from the documentation:

select("post", "person_id", Person.all.collect {|p| [ p.name, p.id ] }, {:prompt => 'Select Person'})

Now, if you don’t have a model, you’re stuck with the select_tag helper in FormTagHelper:

select_tag "results_per_page", options_for_select([25, 50, 100])

The catch is that select_tag doesn’t have a handy :prompt option, which is key for validation.  In the example above, you could just add it to the array, but if you’ve got your options stored in your model or some other array, it might not be obvious how to add the prompt.

Since options_for_select takes an array, all you need is a little array arithmetic (extracted from a HAML view, but erbists should get the idea):

- sample_options = [['Option 1', '1'], ['Option 2', '2'], ['Option 3', '3']]
= select_tag "sample", options_for_select([['Select...', '']] + sample_options)

If you want to keep your code cleaner and avoid repetition, you can also hack select_tag like Taryn did (I haven’t tried it,) but I think the simple technique above is useful to help grasp the basics of what’s going on in the helpers.
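If the prompt shows up in more than one form, the array arithmetic can be pulled into a small helper. This is only a sketch in plain Ruby; with_prompt is a name I made up, not something Rails provides.

```ruby
# Hypothetical helper: prepend a blank-valued prompt to an array of
# [label, value] pairs, leaving the original array untouched.
def with_prompt(options, prompt = 'Select...')
  [[prompt, '']] + options
end

sample_options = [['Option 1', '1'], ['Option 2', '2'], ['Option 3', '3']]
options = with_prompt(sample_options)

# The prompt comes first and carries an empty value, so a presence
# validation will still reject it if the user never picks anything.
puts options.first.inspect
```

In a view, the result would be passed along the same way as before: select_tag "sample", options_for_select(with_prompt(sample_options)).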

Contractor smell: no source control

by admin on September 23, 2010 · 0 comments

When your code is locked up, who benefits the most?

When I get involved in a project, it’s usually to clean up a mess in an area I haven’t worked in before. One of the first things I need to do is grab the source code, which for web projects means either logging into the client’s source control system or pulling the site content down via FTP.

Guess which one I prefer?

OK, it’s a trick question, because unless the system has a fully automated deployment scenario (and some way to ensure it’s enforced) I don’t have a guarantee that what’s in source control is what’s on the server.  Still.

Source control gives me a few edges that speed up my work:

I can see the history. Code that’s more than a few hours old usually has gone through a few revisions, and in the age of object-oriented design and MVC, code for one change is often spread (sometimes badly) across a lot of files.  I can get a much better sense of the scope of a feature if I can track how it was added and later maintained, which means I don’t have to take as much time doing forensic work and instead can do the work I’m being paid for.

I can show my progress. If there’s ever a dispute about how I spend my time (which hasn’t happened to date, but I work remotely almost all the time so there’s potential) I have logs of what I’ve been doing.

I can prove I didn’t wreck stuff down the road. The work I do is often of the rescue variety from the aftermath of a bad outsourcing experience or something that went off the rails.  The next person to touch the code might be of the same level as the previous people, and it’s easier to point fingers than to accept responsibility for bad code.  If I’ve got logs of everything I’ve changed, that can be handy.

It doesn’t surprise me much that I usually don’t get source control access when I step into a code rescue project: there usually isn’t any, or if there is, it’s not up to date.  Keeping in mind that I tend to get involved when things are going badly, here are a few key reasons source control doesn’t get used:

(Hint: these are pretty much reworded versions of my “pro” reasons above)

History is more valuable if you’re the only one who knows it. If you’re insecure about your abilities, you might think that your domain knowledge is your job security, because if you hide things from your client, it’ll be way more expensive to bring a new person on than to keep giving you the work.  Personally, I think my ability to document and log work is a greater asset, because – and this is a paradox – I always work in a way that makes me replaceable, which ends up making me indispensable.

The client can see the lack of progress. Did you spend all your day playing Farmville but still need to pay the rent?  Not a problem if you can tell a good story.  Of course, days without a checkin kind of wrecks that story, and the source control log is your worst enemy in this case. I don’t want to paint everyone as dishonest – it’s also a project management nightmare when you can’t see any actual progress in a work item, and I’ve seen developers make up all kinds of stories about progress that didn’t pan out just to cover something they’re stuck on.

It’s harder to blame someone else. If there’s no log, and the client isn’t a programmer, you can always blame the last guy when something goes wrong.  Granted, that’s what I tend to do, but that’s because I’m being brought in because the last guy screwed up, and I don’t need a log for that.  But if there’s no record of what the last person did, there’s just the new guy’s word.

Get with the program

Seriously, if you’re not using source control in a professional engagement, there’s something wrong, and if I’m doing my job right, you’re going to get found out someday.

It doesn’t matter what’s in place when you sign on – if I’m involved in a project that doesn’t use source control, then surprise, it does now!  I’ll set up a local repository, both so I can deal with my own mistakes but also so I can get the benefits I mentioned above (I usually have a clause saying I’ll hold on to source code for x days after sign off to reduce setup costs later in case I’m needed again.)  The repository might be public to the client, or it might not (most of the people paying the bills aren’t on a technical level where they care about this directly, but they appreciate when I can pull logs if needed) but the key is that I’m operating as if it’s a shared public repository – which it might be at some point if I work with the team long enough.

Back when I was in a management role, I used to have amazing interviews with potential team members who would only have source control experience if some job made them do it.  I’d like to think that things have improved over the past few years, but I suspect they haven’t, based on my contract work this year.

If you think you don’t need to do it, I offer this as a final reason: eventually, you’re going to get asked why you don’t, and it’ll be in a context where the answer is going to cost you money.

(Photo by rpongsaj)

Storing phone numbers in the database

by admin on September 22, 2010 · 0 comments

Pizza Pizza

What? Testing makes me hungry.

I had a weird code rescue project the other day where the client was complaining that they couldn’t update profile phone numbers.  My test data for phone numbers is usually 416-967-1111 (a Canadian pizza chain,) and it was working fine.  I thought maybe there was an issue with brackets or dashes or something, but every permutation saved fine for me.

Then I looked at the schema.  The phone numbers were being stored as integers.

Ah.

There’s nothing wrong with storing numbers as numbers, and phone numbers are numbers, so let’s review the positives here: the storage requirements in the database are smaller than they’re going to be with string storage (char or varchar,) and it lets you take out the formatting information so that could, in theory, be moved to the presentation layer (in this case, it wasn’t, but let’s not dwell.)

On the other hand, you have to format the numbers every time (phone numbers in a web app are basically display-only, though mobile web has some interesting bits coming up,) and there’s an important gotcha that the previous developer missed:

Numbers typically don’t pad left with zeroes.

In other words, “001 234 89439” is going to be stored as 123489439. (No, I don’t understand international phone numbers. It’s an example, OK?)  This means that your parsing code can’t make assumptions about the length of the number if there’s a chance that zeroes will be at the beginning.

In this case, it got worse, and explained why I couldn’t recreate the problem: the code was parsing the individual components of the phone number, so for North American numbers (it knew the region already) it would break it down to 1-xxx-yyy-zzzz (with optional extensions added to the end) so a number stored as 4169671111 would render as 1-416-967-1111.

Except.  The zeroes.  In each section, now.

Basically, the parsing was using something like sscanf($number, '%1d%3d%3d%4d') (this was in PHP) to break it down and then there was some concatenation on the elements.

Unit testing would have been helpful here (I ended up adding some,) because then you might have noticed that 14160380042 would ultimately get displayed as 1-416-38-42, because the leading zeroes get parsed but not displayed.

Anyway, between some badly needed unit tests and changing each %d to a %s, the numbers displayed correctly.  What’s neat here is that everything was in fact saving fine, so no data was lost – it just didn’t render correctly, which was a relief to the client.
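The original code was PHP and I’m not reproducing it here, but the failure is easy to recreate. Below is a rough Ruby sketch under that assumption: format_lossy mimics the %d behaviour by turning each section into an integer, and format_padded keeps each section as text, which is roughly what switching to %s did. Both function names are invented for the example.

```ruby
# Splits an 11-digit North American number into its four sections.
def sections(number)
  digits = number.to_s
  [digits[0, 1], digits[1, 3], digits[4, 3], digits[7, 4]]
end

# Mimics the bug: each section becomes an integer, so leading zeroes vanish.
def format_lossy(number)
  sections(number).map(&:to_i).join('-')
end

# Mimics the fix: each section stays a string, zeroes and all.
def format_padded(number)
  sections(number).join('-')
end

puts format_lossy(14160380042)   # prints 1-416-38-42
puts format_padded(14160380042)  # prints 1-416-038-0042

# The storage-side gotcha: leading zeroes are gone once the value is an integer.
puts '00123489439'.to_i          # prints 123489439
```

A unit test built around a number with zeroes at the start of a section would have caught this right away, which the pizza number never will.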

I’ll be honest, I usually store phone numbers as text (I rarely work on systems with billions of phone numbers, so the space hit isn’t a big issue compared to other optimizations I could make,) and then I can strip and parse things as needed while keeping the original input as entered (it’s similar to how I like to store HTML unfiltered, but that’s a story for another day.)  After some reflection, storing the phone number as an integer isn’t as big a red flag as I thought it might be, but you definitely need to watch for the edge cases that your typical test cases (pizza, anyone?) won’t catch.

Many languages have a decent way to assign or output multiple lines of text without actually making you concatenate a bunch of strings and newlines. They’re usually called “here documents” or heredocs, and in Ruby, it works like this:

    expected = <<EXPECTED
                <object width="500" height="411">
                  <param name="wmode" value="transparent"></param>
                   wmode="transparent" width="500" height="411"></embed>
                </object>
EXPECTED

(That’s from a test I was working on involving modifying an embed, but I took the long lines out for formatting reasons. I’m not sure it’s the best approach to testing, but at least there was a test – thanks, autotest!)

In the above case, I’m saying “assign this string, and keep going until you see the EXPECTED symbol.” I could also have used “expected = <<-EXPECTED” and the dash would have told Ruby that I want to indent the closing token. You know, for looks.

But wait! That code has a bug! True story – it won’t run as I’ve typed it here, even though it looks totally valid.

The closing token can’t have any trailing whitespace.

If it does, you’ll get a “can’t find string ‘EXPECTED’ anywhere before EOF (SyntaxError)” error. And of course most editors aren’t set up to show invisible tokens, so you might find yourself wailing and gnashing your teeth if you haven’t run into this error before (I blog about my simple errors often, and I know they make me look a little dense at times, but it helps ensure I won’t do it again, and hopefully someone else who doesn’t have the benefit of a peer review will find it and save some time as well.)

After (finally!) figuring out why my code wasn’t working, I checked to see if the code would work if the trailing spaces matched on the setup and closing tokens – i.e. if there were two spaces after both instances of “EXPECTED” – but then you get a different error about how the first line didn’t end like it was supposed to.

Interestingly, in BBEdit, which is my editor of choice, the syntax colouring gets all screwed up when there are spaces, because BBEdit knows how heredocs are supposed to work even if you don’t, so there’s a clue too.

Ruby also has some features involving multiple heredocs starting on the same line, and some fun stuff with quotes around the tokens, but this isn’t the definitive Ruby heredoc post, just a “get out of heredoc whitespace hell” post that might prove useful to some of you.
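For reference, here is a small sketch of the indented-token form mentioned above, plus the squiggly form (<<~) that arrived later in Ruby 2.3 and isn’t covered in this post. The last function is a hypothetical helper of my own for hunting down closing tokens with trailing spaces or tabs when your editor won’t show them.

```ruby
# <<- allows an indented closing token; the body keeps its leading spaces.
dashed = <<-EXPECTED
  <object width="500" height="411">
  </object>
  EXPECTED

# <<~ (Ruby 2.3 and later) also strips the body's common leading indentation.
squiggly = <<~EXPECTED
  <object width="500" height="411">
  </object>
EXPECTED

# Hypothetical helper: returns the 1-based line numbers where the closing
# token is followed by trailing spaces or tabs.
def terminators_with_trailing_space(source, token)
  pattern = /\A\s*#{Regexp.escape(token)}[ \t]+$/
  source.each_line.with_index(1).select { |line, _| line.match?(pattern) }.map(&:last)
end

broken = "expected = <<EXPECTED\nbody\nEXPECTED  \n"
puts terminators_with_trailing_space(broken, 'EXPECTED').inspect
```

Running the helper over a file’s contents (File.read on the path, for example) gives you the line numbers to fix without having to turn on invisibles.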