Solving classic NES games computationally

Dr. Tom Murphy VII presented a research paper, "The First Level of Super Mario Bros. is Easy with Lexicographic Orderings and Time Travel . . . after that it gets a little tricky" (PDF) (source code), at SIGBOVIK 2013, in which he sets out a computational method for solving classic NES games. He devised two libraries for this: learnfun (learning function) and playfun (playing function). In this accompanying video, he chronicles the steps and missteps he took on the way to a pretty clever destination.

learnfun & playfun: A general technique for automating NES games (via O'Reilly Radar)


24 Responses to “Solving classic NES games computationally”

  1. Mark Dow says:

    The learned Tetris strategy is brilliant, a direct refutation of Searle’s Chinese room argument for mind-information processing dualism.

    • retchdog says:

      i agree that it’s amusing, but how does it refute the chinese room? i think the chinese room is stupid in the first place, but i don’t see how this is a good argument against it.

      • Mark Dow says:

        The machine knows the meaning of the symbols in the room — to beat the game, use the pause symbol. And we humans watch the output and also understand (if we are trying to learn Tetris/Chinese). Who but Searle could argue that the program does not understand something of Tetris/Chinese?

        •  Or you could say that it only knows that by pressing the start button it freezes the objective function and that all other inputs have a negative effect on that function. It doesn’t ‘know’ what pause is, just that it has a way to not decrease the objective function.

        • retchdog says:

          As Tyler said more politely, anyone who understands what an objective function and a search algorithm are would argue that the program doesn’t understand tetris.

          The reason why this is just a chinese room is, simply, that it’s a formally defined system. End of story. The fact that you find it surprisingly capable is, philosophically, meaningless. I’m impressed by Tom7’s work too.

          However, a good intuitive argument for why this work is, in fact, a chinese room is that if it were to play a new version of tetris, one which was coded independently of the one used in the training but identical in design, it would have to learn everything from scratch because the memory map of the new tetris would almost certainly be completely different from the old one. That is, it won’t generate a meaningful analogy between the two, as humans do automatically.

          The fact that this isn’t a counterargument to the chinese room is exactly what’s wrong with the chinese room. By removing all consideration of the mechanism in its hypothetical, the chinese room begs the question and is basically unachievably narrow.

          • Mark Dow says:

            “…anyone who understands what an objective function and a search algorithm are would argue that the program doesn’t understand tetris.”

            Not anyone. I don’t believe there is a qualitative difference between the algorithmic and what I do.

          • retchdog says:

            okay. to put this another way, it’s using a lexicographic ordering to search the move space. the program is literally trying every combination in sequence (the sequence is kinda clever, but the program doesn’t know that) to see which one gets the most points.

            i wouldn’t call that understanding if a human did it, and i don’t call it understanding if a machine does it.

            you’re capable of that, yes, but as a human being you’re also capable of much more.

          • Mark Dow says:

            Yes, I’m good with that. Much more, but not qualitatively different. Is an algorithm.

            I would argue that the playfun algorithm’s ‘look into 40 futures’ capability is much more than mine.

          • retchdog says:

            yes, of course, but is it understanding?

          • Mark Dow says:

            Ask it. Like a twitter char limit, the question must be < 2k bytes and the answer is deterministic.

          • retchdog says:

            thanks. i’ll take that as a no.

    • merreborn says:

      The Chinese Room comparison is definitely apt: most attempts at building software to play video games involve building in far more semantic context. The software has a concept of game state; it models enemies, hazards, etc. On the other hand, this software only identifies a handful of variables, and attempts to find an input sequence that increases those variables. It has no notion of game state. It just pushes buttons until the score goes up.
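To make the "handful of variables" idea concrete, here is a minimal Python sketch of a brute-force search that ranks outcomes by a lexicographic key. It is not Tom7's learnfun/playfun code: the toy `step` function, the `(x, score)` state, and the button names are all hypothetical, and the search simply tries every button sequence of a fixed length, as the thread describes it.

```python
from itertools import product

def step(state, button):
    """Hypothetical toy game: 'R' moves right, 'A' scores when x is even."""
    x, score = state
    if button == "R":
        x += 1
    elif button == "A" and x % 2 == 0:
        score += 1
    return (x, score)

def objective(state):
    """Lexicographic key: compare score first, then position.

    Python tuples already compare lexicographically, so (1, 0) > (0, 99).
    """
    x, score = state
    return (score, x)

def best_sequence(state, buttons, depth):
    """Try every button sequence of length `depth`; keep the best final key."""
    best_seq, best_key = None, None
    for seq in product(buttons, repeat=depth):
        s = state
        for b in seq:
            s = step(s, b)
        key = objective(s)
        if best_key is None or key > best_key:
            best_seq, best_key = seq, key
    return best_seq, best_key

seq, key = best_sequence((0, 0), "RA", 2)
print(seq, key)
```

Nothing in the search knows what "score" or "position" mean; it only compares tuples of numbers, which is the point being argued above.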

  2. Wow, haven’t seen Tom7 on the internets in ages. Loved his VSTs back in the early 2000s; he also created two Album-A-Days. Glad to see he’s still doing the crazy stuff.

  3. Michael Banck says:

    The paper is dated April 1st, though?

    • Lindsey Kuper says:

      “Hi! This is my software for SIGBOVIK 2013, an April 1 conference that usually publishes fake research. Mine is real! It’s software that learns how to play NES games and plays them automatically, using an aesthetically pleasing technique.”

  4. stumo says:

    The other factor I suspect is that in Tetris the next block is random (I assume?) so any learning pattern that doesn’t take into account at least the current block will fail (and one as good as a human will look at the next block too).

    • retchdog says:

      afaik, the nes doesn’t have a hardware random number generator or even an external clock, which means that all of the randomness comes from player input, which is then mapped through chaos.

      this means that a search algorithm like this one could, in theory, play in such a way as to manipulate the RNG to give it the best shapes. :)

    • Ryan Cousineau says:

      Tetris is…funny. First, to be clear, the NES version he tested against apparently uses a truly random pattern. 

      But modern implementations of Tetris are built against reference rules that are defined by the Tetris Company (yes, really). Part of those rules is the “Random Generator.” From the Tetris wiki: “Random Generator generates a sequence of all seven one-sided tetrominoes (I, J, L, O, S, T, Z) permuted randomly, as if they were drawn from a bag. Then it deals all seven tetrominoes to the piece sequence before generating another bag.”
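The quoted "bag" rule can be sketched in a few lines of Python. This is an illustration of the wiki description only, not code from any official Tetris implementation; the names `PIECES` and `bag_sequence` are made up for the example.

```python
import random

PIECES = "IJLOSTZ"  # the seven one-sided tetrominoes named in the quote

def bag_sequence(rng=None):
    """Yield pieces forever: shuffle a full bag of seven, deal it, repeat."""
    if rng is None:
        rng = random.Random()
    while True:
        bag = list(PIECES)
        rng.shuffle(bag)  # permute the bag randomly
        yield from bag    # deal all seven before starting a new bag

pieces = bag_sequence(random.Random(0))
first_two_bags = [next(pieces) for _ in range(14)]
print("".join(first_two_bags))
```

Each run of seven pieces contains every shape exactly once, so a player never waits more than 12 pieces between two of the same shape.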

  5. Mohammady Mahdy says:

    This is just brilliant :) love the pause!

  6. Try it on Kaizo Mario World.

  7. space says:

    There are actually a lot of humans who are dumb enough to use that same Tetris strategy to avoid losing.

    Ever played Mario Kart DS online? “If I can’t win, nobody will. *disconnect*”

  8. Jellodyne says:

    Interesting algorithmic gymnastics.