# Mechanical computer uses matchboxes and beans to learn Tic-Tac-Toe

James sez, "I just completed a working build of Donald Michie's MENACE (Matchbox Educable Noughts And Crosses Engine), an early (1960) example of machine learning. MENACE uses 304 matchboxes to play Noughts and Crosses (or Tic Tac Toe in the US) - and learns over time to play it better. I built it for a talk at the UK games conference Playful, about Awesomeness and Miracles, particularly focussing on the work of Charles Babbage - and culminating in a surprisingly large version for playing Go..."

MENACE is a machine that plays noughts and crosses, built out of 304 matchboxes. Each matchbox corresponds to one of the 304 board layouts that the opening player might face (there are actually 19,683 possible board layouts, but we only need to calculate the opening player's first four moves, and many are rotationally or reflectively identical). In turn, each matchbox contains a number of glass beads corresponding to each possible next move. When it is MENACE's turn to play, the operator simply selects the matchbox corresponding to the current state of play, shakes it, and opens it to see which move has been chosen. Each matchbox contains a small nook into which one bead falls--and MENACE plays in the square corresponding to that bead.

But what's really clever is that MENACE learns. Every time it wins a game, an additional bead is added to each matchbox played, corresponding to each winning move. Likewise, every time it loses, a bead corresponding to each losing move is removed. As a result, over time, MENACE becomes more likely to play moves that have previously resulted in wins and less likely to play moves that have resulted in losses.

A New THEORY of AWESOMENESS and MIRACLES Being NOTES and SLIDES on a talk given at PLAYFUL 09, concerning CHARLES BABBAGE, HEATH ROBINSON, MENACE and MAGE

### 22

1. Klaus Ã†. Mogensen says:

IIRC, that’s the method used in Fred Saberhagen’s first Berserker Story “”Without a Thought” (a.t. “Fortress Ship”) from 1963.

2. Church says:

Wasn’t this method used in one of Saberhagen’s Berserker stories? IIRC, it was used to teach a sort of alien monkey the rules of a game so that it could defeat a Berserker (the plot points that made this a necessity escape me.)

3. Felix Mitchell says:

Eventually each box will only contain one type of bean, which will represent perfect play.

So this approach only works for games where a perfect strategy exists… and surely Go is not one of those? My mind boggles at how many matchboxes you’d need for Go.

4. Tynam says:

@Felix: Actually, Go is one of those.

Formally: Go is a zero-sum finite perfect
information game; all such games have an equilibrium point in pure strategies.

Skipping the game theory jargon: Go has a perfect strategy, and in principle it’s findable by MEGoE, the Matchbox Educable Go Engine. Of course, Go is very deep and would require very many matchboxes. So we have no idea what the winning strategy is. But there is one. Either Black can always win, or White can, or both players can guarantee a draw.

5. Felix Mitchell says:

Tynam:

I’m guessing you’d need a shit ton of beans too – since each matchbox must contain enough beans for each subsequent path to be tried once.

The first matchbox would need as many beans as there are games of Go.

1. octopod says:

a hill of beans indeed. number of boxes would be ~10^170 vs number of atoms in the universe 10^81

6. chazlarson says:

I made one of these for the elementary school Math Night a couple years ago. The kids had fun with it. Virtually no one believed that a bunch of matchboxes would learn in short order to be unbeatable, but it was fun to watch it happen as the kids took turns playing games against it.

I learned about it from a Martin Gardner book.

7. MattF says:

I guess I’ll be Mr. Wet Blanket here. What’s the big deal? Given that tic-tac-toe is very finite, any plausible strategy with a memory will eventually figure it out– and in less than one human lifetime. In fact, this particular strategy looks to me to be a good candidate for ‘least optimal method that will always work in a finite period of time’. Am I missing something?

1. SamSam says:

Um, I don’t think it’s supposed to be that surprising to you. More that it’s surprising to kids that a bunch of matchboxes might be able to “learn” a game.

It also teaches the basics of machine learning, the idea that “memory” might be implemented using beans, and even a rudimentary idea of an evolutionary algorithm (the original set with all the beans represents a large genetic pool; the fitness is the ability to win the game; and the system evolves to find the perfect solution).

Also, it’s like a mechanical computer. A difference engine. What’s not to like?

8. k012957 says:

I remember making one of these while in Jr. High school in the late sixties. Only I remember it being a four-by-four grid with pawns. Unfortunately, I don’t remember what magazine it was in.

1. SamSam says:

I remember this being applied to a different game, and it may also have been the one that k012957 referred to. Actually, over the years I’ve kept remembering it and wishing I could recall how the game was played.

Basically, the game was a simpler version of tic tac toe, so one didn’t need to make 304 different match boxes in order to make a learning machine. Instead it was a project that probably only required 20-30 matchboxes, making it much simpler for an after-school group or kid on a rainy day to make and learn about.

If anyone can recall the simpler version of the game, I’d love to hear it.

9. Anonymous says:

I am very grateful to whoever posted this…

My scholarship is personal and self directed and I sometimes take a long time to discover what I’m looking for. I have imagined but failed to produce something quite like this for years…

why bother? because this method is extensible. If that doesn’t tickle your brain then take up gardening or something, because the promise(threat?) of machine learning speaks for itself to me. This example is of unusual explicative quality because many people can puzzle out the strategy of TTT, connecting to a ruleset which needs no teaching.

I’m still hoping to actually get a peek at one of these old paper computers I sometimes hear of as well.

It all seems anchored around the experience of reading Magister Ludi while I was out to sea once…

10. Anonymous says:

The Fred Saberhagen story mention is WIthout A Thought also published under the alternate title Fortress Ship. The story appeared in a number of publications including the collection BERSERKER and in the Prentice Hall Anthology of Science Fiction eBook.

Joan Saberhagen

11. krs says:

I built something this when I was in elementary school around 1972. The school library had a copy of the book We Built Our Own Computers. Their matchbox computer used glass beads to play a game of Hexapawn. I don’t remember much else about the book, except for the drum memory made from an oatmeal carton and metallic tape.

12. Omir the Storyteller says:

Martin Gardner described a similar “computer” that learned a game called “Hexapawn.” There’s a Wikipedia article on Hexapawn, but it doesn’t mention the matchbox computer that he described in conjunction with the game.

13. SamSam says:

Thanks guys. I think Hexapawn is definitely the game that I first saw discussed in this context.

14. andygates says:

MENACE is very, very sneaky. Nice work!

(Makes me want to build Lego logic gates…)

15. Anonymous says:

I remember something similar called Socrates and Mr. Hound, involving a deck of cards where each suit was a cardinal directional move for Socrates the fox, and Mr. Hound plodded along a square track near the edge of the board. Whenever Mr. Hound caught Socrates, the last n cards would be removed and reshuffed. Eventually, Socrates would learn to outwit Mr. Hound. I built it out of Lego in the early 1970’s.

Anybody remember what book this was from?

16. Anonymous says:

“there are actually 19,683 possible board layouts”
Wrong. 19,683 is 3^9, but it includes impossible things like:
XXX
XXX
OOO
and so on
And MENACE will fail at the very beginning in case of correct moves folowed by the wrong ones: correct beads will be removed as well as wrong beads, so where will no possibility to make correct move (correct beads absent) in the future