November 6, 2006

Iron Puzzler

I've been extremely busy lately-- too busy even to blog. But not, apparently, too busy to devote an entire weekend to a new Seattle puzzle event, Iron Puzzler. Inspired by Iron Chef and a similar event in the Bay Area, Iron Puzzler has the competing teams create all the puzzles themselves. Four secret ingredients were announced at 9 AM on Saturday: CLOCK, L, MERCURY, and SPOON. Each team had 15 hours to create one paper puzzle and one non-paper puzzle, each using at least one of the theme ingredients. Sadly, neither Alton Brown nor Will Shortz was on hand for color commentary ("I see a bunch of ones and zeroes on the challenger's side, I believe he's going to turn that into Morse, a fairly traditional but versatile preparation.").

Fourteen teams participated, meaning each team had to turn in 30 copies of the paper puzzle (2 per team, including the organizers) and 15 copies of the non-paper one. Then, at midnight (as it turned out, closer to 1 AM), the puzzles were distributed and the 15-hour solving period began. Teams scored points for each puzzle solved (with a small bonus for being one of the first three teams to solve each puzzle), and each puzzle earned its creators points according to how many teams solved it (with the sweet spot at 10 teams). At the end of the event teams also awarded each puzzle points on quality/fun, presentation, and use of the ingredients.
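
To make the scoring rules concrete, here is a rough sketch in Python. The point values and the linear falloff around the sweet spot are purely hypothetical, chosen for illustration; the event's real numbers aren't given here. Only the first-three bonus, the sweet spot of 10 solves, and the copy counts come from the description above.

```python
# Hypothetical point values: the post does not state the real ones.
SOLVE_POINTS = 10        # assumed points for solving a puzzle
EARLY_BONUS = 2          # assumed bonus for the first three solving teams
SWEET_SPOT = 10          # from the post: creators score best at 10 solves
MAX_CREATOR_POINTS = 20  # assumed creator score at the sweet spot


def solver_points(solve_rank):
    """Points for the team that was the solve_rank-th (1-based) to solve a puzzle."""
    bonus = EARLY_BONUS if solve_rank <= 3 else 0
    return SOLVE_POINTS + bonus


def creator_points(num_solvers):
    """Creator points peak at SWEET_SPOT solves; the linear falloff is an assumption."""
    distance = abs(num_solvers - SWEET_SPOT)
    return max(0, MAX_CREATOR_POINTS - 2 * distance)


# Copies each team had to produce: 14 teams plus the organizers.
recipients = 14 + 1
paper_copies = 2 * recipients   # two paper copies per recipient
non_paper_copies = recipients   # one non-paper copy per recipient
```

Under these made-up numbers, a puzzle solved by all 13 other teams earns its creators less than one solved by exactly 10, which is the behavior the real system had.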

Creating a puzzle by committee under time pressure proved to be a challenge. Our group largely focused on the physical puzzle first, because we knew we'd have no trouble putting together a paper puzzle. Meanwhile one member of the team went off on her own and produced a terrific paper puzzle independently. In the end, only two teams solved our physical puzzle (which was a solid puzzle that we over-streamlined, removing a couple of hinting elements that we should have left in place) while everyone solved the paper puzzle and voted it their favorite. A rather disheartening result, actually, since it undermines what for me is the most interesting aspect of the event-- creating puzzles collaboratively. These results suggest that we'd be better off splitting into individuals or pairs and developing puzzles independently, then coming together to test them and pick the two strongest. That's a less interesting challenge than producing puzzles as a group. On the other hand, we could also interpret the results as an indication that next time we should value elegance less than solvability and internal hinting.

As expected, there were a lot of puzzles that involved periodic tables and clock faces, but surprisingly little semaphore. I think most teams thought as we did: "Oh my God, CLOCK-- everyone's going to do semaphore, so let's do something else." Interestingly, that kind of reasoning resulted in two spoonerism puzzles presented as bags of plastic spoons with parts of spoonerisms written on each spoon. We also got no fewer than four crosswords (one involving spoonerisms!), but sadly none of them cryptic.

The overall quality of the puzzles was surprisingly high. One of the things I found most interesting about the event was the different interpretations of the ingredients. Quite a few puzzles incorporated the periodic table or other chemical elements, for example, but none focused on the planets or car manufacturers. Analog clocks were prevalent, but nobody made a digital clock puzzle. And regrettably, nobody did a superhero battle cry puzzle ("Spoooooon!"). As a constructor, I also felt we could leverage the fact that everyone would view every puzzle through the CLOCK/L/MERCURY/SPOON filter, enabling them to make leaps that might otherwise seem unfair.

The event was put together in about a month. It was an undeniable success, but the rough edges show. The scoring system in particular needs some massaging. The number of teams solving a given puzzle does not seem like a good basis for awarding points. The premise that a puzzle solved by 10 of 13 teams is more desirable than one solved by all teams is specious. I'd argue that a puzzle that everyone solves is more desirable than one that only about 75% of teams solve. The catch is that you want puzzles to be a challenge-- you don't want everyone to solve them immediately. Nobody wants to see teams submit a six-letter anagram and call it done. But I don't think we need an entire scoring vector to capture that. If a puzzle is too easy or too hard, teams can reflect that in their ratings. If a puzzle is just right in difficulty but not fun to solve, teams can dock it points.

This event was created to fill a gap formed by the delay of the next MS Puzzle Hunt, but it was intriguing enough in its own right to warrant repeating. Since it requires far less effort to organize than traditional events, the chance of that happening seems high. In fact, if someone creates a back-end that allows teams to register their puzzles and answers and then allows teams to submit answers to solved puzzles electronically, organizers could play along, too. Allez cuisine!

Posted by Peter at November 6, 2006 2:29 PM