November 06, 2006

Iron Puzzler

I've been extremely busy lately-- too busy even to blog. But not, apparently, too busy to devote an entire weekend to a new Seattle puzzle event, Iron Puzzler. Inspired by Iron Chef and a similar event in the Bay Area, in Iron Puzzler all the puzzles are created by the competing teams themselves. Four secret ingredients were announced at 9 AM on Saturday: CLOCK, L, MERCURY, and SPOON. Each team had 15 hours to create one paper puzzle and one non-paper puzzle, each using at least one of the theme ingredients. Sadly, neither Alton Brown nor Will Shortz was on hand for color commentary ("I see a bunch of ones and zeroes on the challenger's side, I believe he's going to turn that into Morse, a fairly traditional but versatile preparation.").

Fourteen teams participated, meaning each team had to turn in 30 copies of the paper puzzle (2 per team, including the organizers) and 15 copies of the non-paper one. Then, at midnight (as it turned out, closer to 1 AM), the puzzles were distributed and the 15-hour solving period began. Teams scored points for each puzzle solved (with a small bonus for being one of the first three teams to solve each puzzle), and each puzzle earned its creators points according to how many teams solved it (with the sweet spot at 10 teams). At the end of the event teams also awarded each puzzle points on quality/fun, presentation, and use of the ingredients.
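
For the numerically inclined, here is a rough sketch of that arithmetic in Python. The copy counts come straight from the numbers above. The creator-scoring curve is only a guess at the shape: the one known fact is the sweet spot at 10 solves, and `PEAK_POINTS`, `LOSS_PER_MISSING`, and `LOSS_PER_EXTRA` are made-up values, not the organizers' actual ones.

```python
# Hypothetical point values: only the peak at 10 solves is known;
# the actual numbers used at the event are not.
PEAK_SOLVES = 10      # the "sweet spot"
PEAK_POINTS = 30      # assumed
LOSS_PER_MISSING = 3  # assumed: points lost per solve short of the peak
LOSS_PER_EXTRA = 5    # assumed: points lost per solve beyond the peak


def creator_points(solves: int) -> int:
    """Points a puzzle earns its creators, given how many teams solved it."""
    if solves < 0:
        raise ValueError("solves must be non-negative")
    if solves <= PEAK_SOLVES:
        penalty = LOSS_PER_MISSING * (PEAK_SOLVES - solves)
    else:
        penalty = LOSS_PER_EXTRA * (solves - PEAK_SOLVES)
    return max(0, PEAK_POINTS - penalty)


def copies_needed(competing_teams: int, copies_per_team: int) -> int:
    """Copies to print: one set per competing team, plus one for the organizers."""
    return (competing_teams + 1) * copies_per_team


paper_copies = copies_needed(14, 2)      # 30 copies of the paper puzzle
non_paper_copies = copies_needed(14, 1)  # 15 copies of the non-paper puzzle
```

With these invented penalties a puzzle solved by 13 teams earns its creators less than one solved by 10, which is the property that makes the sweet spot matter.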

Creating a puzzle by committee under time pressure proved to be a challenge. Our group largely focused on the physical puzzle first, because we knew we'd have no trouble putting together a paper puzzle. Meanwhile one member of the team went off on her own and produced a terrific paper puzzle independently. In the end, only two teams solved our physical puzzle (which was a solid puzzle that we overstreamlined, removing a couple of hinting elements that we should have left in place) while everyone solved the paper puzzle and voted it their favorite. A rather disheartening result, actually, since it undermines what for me is the most interesting aspect of the event-- creating puzzles collaboratively. These results suggest that we'd be better off splitting into individuals or pairs and developing puzzles independently, then coming together to test them and pick the two strongest. That's a less interesting challenge than producing puzzles as a group. On the other hand, we could also interpret the results as an indication that next time we should value elegance less than solvability and internal hinting.

As expected, there were a lot of puzzles that involved periodic tables and clock faces, but surprisingly little semaphore. I think most teams thought as we did: "Oh my God, CLOCK-- everyone's going to do semaphore, so let's do something else." Interestingly, that kind of reasoning resulted in two spoonerism puzzles presented as bags of plastic spoons with parts of spoonerisms written on each spoon. We also got no fewer than four crosswords (one involving spoonerisms!), but sadly none of them cryptic.

The overall quality of the puzzles was surprisingly high. One of the things I found most interesting about the event was the different interpretations of the ingredients. Quite a few puzzles incorporated the periodic table or other chemical elements, for example, but none focused on the planets or car manufacturers. Analog clocks were prevalent, but nobody made a digital clock puzzle. And regrettably, nobody did a superhero battlecry puzzle ("Spoooooon!"). Speaking as a constructor, it also seemed to me that we could leverage the fact that everyone would view every puzzle through the CLOCK/L/MERCURY/SPOON filter, enabling solvers to make leaps that might otherwise seem unfair.

The event was put together in about a month. It was an undeniable success, but the rough edges show. The scoring system in particular needs some massaging. The number of teams solving a given puzzle does not seem like a good basis for awarding points. The premise that a puzzle with 10 of 13 teams solving it is more desirable than one solved by all teams is specious. I'd argue that a puzzle that everyone solves is more desirable than one only 75% of teams solve. The catch is that you want puzzles to be a challenge-- you don't want everyone to solve them immediately. Nobody wants to see teams submit a six-letter anagram and call it done. But I don't think we need an entire scoring vector to capture that. If a puzzle is too easy or too hard, teams can reflect that in their ratings. If a puzzle is just right in difficulty but not fun to solve, teams can dock it points.

This event was created to fill a gap formed by the delay of the next MS Puzzle Hunt, but it was intriguing enough in its own right to warrant repeating. Since it requires far less effort to organize than traditional events, the chance of that happening seems high. In fact, if someone creates a back-end that allows teams to register their puzzles and answers and then allows teams to submit answers to solved puzzles electronically, organizers could play along, too. Allez cuisine!
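
To make that back-end idea a little more concrete, here is a minimal in-memory sketch in Python. Everything in it is an assumption made for illustration, since no such system exists yet: the names `PuzzleRegistry`, `register`, `submit`, and `solve_rank`, the rule that a team cannot submit against its own puzzle, and the choice to compare answers ignoring case, spaces, and punctuation.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


def normalize(answer: str) -> str:
    """Compare answers case-insensitively, ignoring spaces and punctuation."""
    return "".join(ch for ch in answer.upper() if ch.isalnum())


@dataclass
class Puzzle:
    author_team: str
    answer: str  # stored in normalized form
    solvers: List[str] = field(default_factory=list)  # teams, in solve order


class PuzzleRegistry:
    """Hypothetical store of puzzles, their answers, and who solved them."""

    def __init__(self) -> None:
        self._puzzles: Dict[str, Puzzle] = {}

    def register(self, team: str, puzzle_id: str, answer: str) -> None:
        """A creating team registers a puzzle and its answer."""
        if puzzle_id in self._puzzles:
            raise ValueError(f"puzzle {puzzle_id!r} is already registered")
        self._puzzles[puzzle_id] = Puzzle(team, normalize(answer))

    def submit(self, team: str, puzzle_id: str, guess: str) -> bool:
        """A solving team submits a guess; returns True if it is correct."""
        puzzle = self._puzzles[puzzle_id]
        if team == puzzle.author_team:
            raise ValueError("a team cannot solve its own puzzle")
        if normalize(guess) != puzzle.answer:
            return False
        if team not in puzzle.solvers:
            puzzle.solvers.append(team)  # remember the order of correct solves
        return True

    def solve_rank(self, team: str, puzzle_id: str) -> Optional[int]:
        """1-based solve position for a team, or None if it has not solved."""
        solvers = self._puzzles[puzzle_id].solvers
        return solvers.index(team) + 1 if team in solvers else None


registry = PuzzleRegistry()
registry.register("Team A", "team-a-paper", "Example Answer")  # made-up answer
first_try = registry.submit("Team B", "team-a-paper", "example-answer!")
second_try = registry.submit("Team C", "team-a-paper", "wrong guess")
```

Recording the order of correct solves would let the system hand out the first-three bonus on its own, and since the software checks the answers, the organizers would presumably never need to see them.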

Posted by Peter at November 6, 2006 02:29 PM
Comments

"The premise that a puzzle with 10 of 13 teams solving it is more desirable than one solved by all teams is specious. I'd argue that a puzzle that everyone solves is more desirable than one only 75% of teams solve."

Agreed. The 10-of-13 thing actually led me into grievous folly. We, mostly at my urging, made our puzzles much, much too hard--my theory was that since a puzzle with 5 solves scored just as well as a puzzle with 13, and since the narrowed field consisted of the strongest Hunt teams, it was safest to err towards difficulty. Wrong, wrong, wrong. Wrong for us and wrong for the poor teams who had to hack at our stupidly hard puzzle.

Making an electronic version of this event is a very, very interesting idea though...

Posted by: Stephen Beeman on November 6, 2006 10:50 PM

No autos or planets? You must have played in a different event than I did. The multi-tiered word search featured both. The spinny spoon thing featured the astronomical symbols of the first six planets. Our crossword, which had L as the primary focus, used Mercury as four or so of the themed entries. Our solar-system set of campaign posters used Mercury as the starting point; one of the four paths took solvers through the nine planets.

It's surprising you only got two solves for your clock thing. It truly was a beautiful puzzle. We got hung up for a while trying to pair up answers differently before hitting on the L connection. For instance, we thought that FANTASY and JEOPARDY went together, which gave us something completely different. After only finding two such pairs, someone finally hit upon the correct strategy.

Posted by: Derek on November 6, 2006 11:58 PM

I think I did play in a different event! I didn't work on the word search, the spoon sculpture, or the campaign posters, and I never had time to read through all the answer sheets for the puzzles I didn't do. I *DID* solve your crossword, but MERCURY was used as a clue four times, each time resolving to a different interpretation of the term. I'd say that crossword leveraged the (ahem) mercurial nature of "Mercury" rather than focusing on any one aspect. Had each theme entry resolved to a model of Mercury automobile, that'd be entirely different.

Posted by: Peter on November 7, 2006 01:28 AM

I was going to say, mercury as a planet was very important to the spoon sculpture puzzle, as you had to order the planets in distance from the sun.

As for your physical puzzle, I was really, really in like with it on the first step. Dug the whole clock thing, looking for the Total Recall and the like. We ate up the first step and then hit a giant wall. What now? How can I use the clock again to recurse? I was convinced that the number of clues was not a coincidence, that we would recurse twice and end up with a single item that was the answer. We started with 27, and consumed 3 items with each answer, so it seemed perfect. The solution I read seems to go off in a pretty different direction, just adding letters to each individual word. I see that, and kept coming back to Iron Chapel and trying to recurse. We eventually gave up and cursed it. Reading the solution, I find the puzzle itself to be very elegant and a nice go at it. I needed a hook into the next step.

Posted by: Dave Heberer on November 7, 2006 11:22 AM

"The premise that a puzzle with 10 of 13 teams solving it is more desirable than one solved by all teams is specious."

Perhaps that's true in our local puzzle community. I don't know whether that's the case in the puzzle community where the original idea came from, and we didn't have a good feel for whether it would apply here. It would no doubt take a couple iterations to tune this event more accurately for the local crowd.


Having said that, I'm not fully convinced that there's a need for this to be a regular event. PuzzleHunt allows a far greater number of people to participate, in a form that doesn't require an application to get in. Because there is more time for the organizers to create and test puzzles, there are likely to be fewer bugs in the puzzles, which ultimately should lead to less frustration all around. (Please don't take that as criticism of this weekend's puzzles though - when you're given only a few hours to create two puzzles, it's not reasonable to expect two dozen perfect puzzles.) And requiring people to have puzzle-writing experience is something of a Catch-22 bar to admission; you can't get in if you don't have experience, but it's hard to get experience if you don't have opportunities to write puzzles for general consumption.

I'd certainly play (or hope to play) if someone ran another Iron Puzzler around here, but I wonder whether it might be more desirable to channel that effort into making the twice-yearly puzzlehunt rotation happen, rather than trying to throw a new event into the mix.

Posted by: Jessica on November 7, 2006 06:11 PM

while i appreciate the effort that went into iron puzzler, the time, people, and effort required to run a puzzlehunt are staggering in comparison. it would be much easier to have one puzzlehunt a year and a few iron puzzlers. there aren't that many groups that can produce dozens of puzzles, but clearly there are many groups that can produce 2.

Posted by: dana on November 9, 2006 08:57 AM