Plagiarism claims roil chess coding scene

Rybka, a powerful chess program, was stripped last year of its titles and its author publicly disgraced. Declared a plagiarist by the International Computer Games Association, Vasik Rajlich was also handed a lifetime ban on competition and ordered to return thousands of dollars in prize money. But the investigation's conclusions are now being challenged, opening a fissure in the computer chess community.

Debate centers on chess-playing algorithms found both in certain versions of Rybka and another program, Fruit. Both programs emerged in the mid-2000s, outpacing established competitors in short order. But while Fruit appeared first, it was Rybka that came out on top, claiming world championships from 2007-2011 and forging a path to commercial success.

The rancor shows how traditional ideas of plagiarism blur when a development community is built around a set of technical problems so specific it's nigh-impossible to avoid following the leader—and where a limited market makes open source a dangerous place to put cutting-edge ideas.

Backed by a technical report in which Rybka's binary code was disassembled to reveal its inner workings, the ICGA found in June 2011 that Rajlich infringed others' copyright and failed to disclose the origins of his software on competition entry forms.

"Not a single panel member believed him innocent," the report stated. "Vasik Rajlich's claims of complete originality are contrary to the facts."

The ICGA, founded in 1977, organizes international computer chess tournaments and publishes a trade journal. Since last year's findings by its 34-member panel, however, furious exchanges between Rajlich's critics and supporters have turned some of chess programming's popular forums into war zones. The imbroglio culminated last week in a 31-page rebuttal published by ChessBase, which sells Rybka at its online store.

"The ICGA's findings were misleading and its ruling lacked any sense of proportion," wrote its author, Dr. Søren Riis, a computer scientist at Queen Mary, University of London. " … It is clear that Rybka is an original program by any reasonable standard."

In his report, published in four parts, Riis argues that the investigation exaggerated the infraction's significance and was informed by fabricated "recreations" of Rybka code presented as the real thing. He also suggests that the ICGA's actions served some panelists' commercial interests.

"Rybka competitors, individuals with obvious conflicts of interest, and individuals who had publicly expressed their predetermined conclusion of guilt were allowed to join the investigation," Riis wrote. "… This attitude, I think, is a classic example of losing the plot."

Within days, ICGA president Dr. David Levy issued a fresh rebuttal, insisting that Riis' objections missed the mark.

"[Riis] does nothing in my view to make the case for a miscarriage of justice to have taken place," Levy wrote. "It is, put simply, biased reporting."

The issue is muddied by the highly specific nature of the chess-playing code at hand, which seems to hover at the point where ideas and algorithms meet the copyright-protected expression of computer software. For his part, Rajlich continued to insist that his work on the latter was entirely original, while saying that he had never denied using Fruit's ideas in his work.

"This document is horribly bogus," Rajlich said of the original report. "All that 'Rybka code' [in it] isn't Rybka code, it's just someone's imagination."

If its critics are right, the ICGA hung Rajlich on a vague technicality to serve the interests of his commercial rivals, even though many of them may well have benefited from similar practices.

"I don't think Dr. Levy was in an easy situation," Riis said. "His organization would vanish without chess programmers, and most of the active chess programmers in his organization wanted him to take action against Rybka."

If the ICGA is right, however, the most powerful program in the game is built on a fraud, by a man who misled his peers and parlayed others' open-source work into a proprietary and profitable empire.

"How would we view an Olympic athlete found guilty of taking performance enhancing drugs if he performed superbly, winning races by huge margins, breaking world records and taking gold medals?" writes Levy, in his most recent missive. "Would he be forgiven his drug taking just because his performances were outstanding? No, of course not!"

Dr. Levy said he would be unavailable for further comment this week, but planned to respond soon.

YOU SURE DON'T TALK LIKE A MACHINE

Chess is a simple game of marvelous depth. Though it uses a small set of deterministic rules and is played out on a board of just 8×8 squares, even the most powerful computers cannot see far into the maze of possibilities that fan out from a complex position.

As a result, computers cannot use brute tactical force to defeat a human master's capacity for long-term strategy and pattern recognition. Instead, heuristics are required, in the form of code used to explore possibilities, recognize powerful piece formations, and so on. Chess programs are exquisitely tuned to discard unlikely lines and to ignore material values in favor of gambits and subtle plans.
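
To make the idea concrete, here is a minimal sketch in Python of one textbook technique, alpha-beta pruning, which searches a tree of possible moves and skips any line that provably cannot change the outcome. The toy tree, its scores and the function name are invented for illustration. Nothing here is taken from Fruit, Rybka or any real engine, whose pruning heuristics are far more aggressive.

```python
# Toy game tree (hypothetical): a list is a position with further moves,
# an int is a heuristic score seen from the first player's side.
TREE = [[3, 5], [2, 9], [0, 1]]


def alphabeta(node, alpha=float("-inf"), beta=float("inf"),
              maximizing=True, visited=None):
    """Return the best achievable score, skipping lines that cannot matter."""
    if not isinstance(node, list):
        # Leaf: record that we looked at it, then return its score.
        if visited is not None:
            visited.append(node)
        return node

    if maximizing:
        best = float("-inf")
        for child in node:
            best = max(best, alphabeta(child, alpha, beta, False, visited))
            alpha = max(alpha, best)
            if alpha >= beta:
                break  # the opponent would never allow this line
        return best

    best = float("inf")
    for child in node:
        best = min(best, alphabeta(child, alpha, beta, True, visited))
        beta = min(beta, best)
        if alpha >= beta:
            break  # we already have a better option elsewhere
    return best


leaves_seen = []
best_score = alphabeta(TREE, visited=leaves_seen)
```

On this toy tree the search settles on a score of 3 after examining only four of the six leaves. The other two are discarded unseen.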

It's one such heuristic in Fruit—its algorithm for evaluating the strength of a chess position—that early versions of Rybka allegedly stole.
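
An evaluation function, in its simplest form, reduces a position to a single number. The sketch below is an illustration built on assumptions, not Fruit's or Rybka's algorithm: it adds up conventional textbook piece values, counted in hundredths of a pawn, and applies a made-up bonus for occupying the center. The board format, the bonus size and all names are hypothetical.

```python
# Conventional textbook piece values in centipawns (not any engine's tuning).
PIECE_VALUES = {"P": 100, "N": 300, "B": 300, "R": 500, "Q": 900, "K": 0}

# Hypothetical positional term: a small reward for sitting on a central square.
CENTER = {"d4", "e4", "d5", "e5"}
CENTER_BONUS = 10


def evaluate(position):
    """Score a position from White's side; positive means White is better.

    `position` maps a square name to a piece letter, uppercase for White
    and lowercase for Black.
    """
    score = 0
    for square, piece in position.items():
        value = PIECE_VALUES[piece.upper()]
        if square in CENTER:
            value += CENTER_BONUS
        score += value if piece.isupper() else -value
    return score


# White: king, central pawn, knight. Black: king, pawn, rook.
position = {"e1": "K", "e4": "P", "f3": "N", "e8": "k", "d6": "p", "a8": "r"}
score = evaluate(position)
```

Here Black's extra rook outweighs White's knight and well-placed pawn, so the score comes out negative. Real engines weigh dozens of such terms, and it is in the choice and tuning of those terms that much of their playing strength lies.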

Fruit, created by Fabien Letouzey, emerged in 2004, and was initially an open-source project, its code clearly visible for others to read. It placed second in the 13th World Computer Chess Championship in Reykjavik in 2005—a striking result that put it in contention with top-flight commercial engines such as Fritz, Shredder and Junior.

Enter Rajlich, a U.S. and Czech citizen born in 1971 in Cleveland, Ohio. Though he attained International Master status, Rajlich realized that he would never reach the game's top echelons. Determining instead to be the world's best chess programmer, he set out in 2003 to create an engine able to compete with the greatest.

An early version performed poorly, finishing near the bottom in a crowded 2004 tournament. But at the 5th International Paderborn Computer Chess Championship, in December 2005—a few months after Fruit's strong debut—the strength of Rajlich's program increased dramatically. Within a year or so, Rybka was the strongest chess machine in the world.

"Rybka did not merely win nearly every tournament it entered," Riis wrote. "It won them with a near-90% success rate. It is difficult to overstate the degree of superiority that the Rybka team exhibited…"

It took a while for others to accept the new normal. In 2007, according to the New York Times, a clash between Deep Junior and Deep Fritz was billed as "The Ultimate Computer Challenge", even though by then Rybka was clearly superior to both.

And there were suspicions almost from the beginning, stoked by the evident similarities between Fruit and Rybka and the fact that the latter was a closed-source product, its code unavailable for inspection.

Rajlich, who now lives in Warsaw, Poland, always insisted that his work does not include "game-playing code" written by others: "Rybka is and always was completely original code, with the exception of various low-level snippets which are in the public domain," he said in 2007, responding to early rumors.

Rybka stayed on top until the appearance of Robert Houdart's Houdini, which defeated it in a series of 40 games in February 2011. The match-up led to renewed scrutiny of both programs, as Houdini itself was apparently derived from Ippolit, an engine claimed to be derived from Rybka. The claimant? Vasik Rajlich himself.

If all this sounds to you like quite a tangle, you're not alone. On Feb. 19, 2011, Levy published an article titled "Attack of the Clones", in which he complained about the growing trend of ripped-off chess engines. Of those identified, however, "The Rybka-Fruit Case" took center stage. Levy cited Rajlich's program as an example of a "sophisticated cloning effort" in which efforts were made to "obscure the original source of the algorithms." Shortly thereafter, Letouzey and 15 other chess programmers published an open letter asserting likewise.

"By using and deriving code, data and structure from Fruit 2.1, Vasik Rajlich was able to make dramatic and huge progress with 'his' program Rybka to the detriment of his fellow competitors," wrote the accusers. "In our view this has made competitions involving Rybka grossly unfair. As chess programmers we find this overwhelming evidence compelling. We believe Rybka is a Fruit derivative albeit an advanced one."

The ICGA investigation was soon underway.

EXIT. DEPART. LEAVE.

It's easy to check a computer program for plagiarism if you have access to the source code. But Rybka isn't open-source, and Rajlich says he did not keep the source code of its early versions in any case, so such a straightforward comparison is impossible. The only evidence, beyond Rybka's habits at the chessboard, is the data locked away in the compiled, executable app.
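
For a sense of what a straightforward source-code comparison can look like, consider the Python sketch below. It is not the method used by the ICGA or anyone else in this dispute, and the code fragments being compared are invented, not taken from Fruit or Rybka. The sketch assumes a crude approach: replace every variable name with a placeholder, so that simple renaming cannot hide a copy, then measure how much of the two token sequences match using the standard library's difflib.

```python
import difflib
import re

# Words kept as-is; every other identifier is collapsed to a placeholder.
KEYWORDS = {"if", "else", "for", "while", "return", "int"}

# Identifiers, numbers, or any single non-space symbol.
TOKEN = re.compile(r"[A-Za-z_]\w*|\d+|\S")


def normalize(source):
    """Split source text into tokens, hiding the programmer's chosen names."""
    tokens = []
    for tok in TOKEN.findall(source):
        if re.match(r"[A-Za-z_]", tok) and tok not in KEYWORDS:
            tokens.append("ID")
        else:
            tokens.append(tok)
    return tokens


def similarity(a, b):
    """Return a ratio between 0.0 and 1.0; 1.0 means identical token streams."""
    matcher = difflib.SequenceMatcher(None, normalize(a), normalize(b),
                                      autojunk=False)
    return matcher.ratio()


# Invented fragments for illustration only.
original = "int score = 0; for (int i = 0; i < n; i++) score += value[i];"
renamed = "int total = 0; for (int k = 0; k < count; k++) total += weights[k];"
unrelated = "while (x) { x = next(x); }"
```

With these fragments, `similarity(original, renamed)` comes out at 1.0 despite every name having been changed, while the unrelated loop scores lower. A check like this says nothing about who wrote what, or whether two programmers arrived at the same structure independently, which is part of why the arguments in this case run to so many pages.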

Disassembling Rybka does not get you an exact rendition of the original code. It does, however, reveal the cutting-edge heuristics which determine the machine's playing strength. The inspection was carried out by Mark Watkins, a Research Fellow at the School of Mathematics and Statistics at the University of Sydney. And according to Watkins, it revealed the evidence of algorithm-cloning that the ICGA sought.

"We are convinced that the evidence against Vasik Rajlich is both overwhelming in its volume and beyond reasonable question in its nature," wrote the authors of the report, published June 28, 2011. "Vasik Rajlich is guilty of plagiarizing the programs Crafty and Fruit, and has violated the ICGA's tournament rules."

The sanctions disqualified Rajlich and Rybka from each World Computer Chess Championship they had competed in; banned Rajlich for life from all ICGA events; and awarded the earlier championship prizes to the runners-up. Rajlich was told to return his trophies and prize money.

NO DISASSEMBLE

Supported by technical work by programmers Ed Schröder and Sven Schüle, Riis now offers several objections to the ICGA's conclusions.

First, he claims that the parts of Rybka found to be similar to Fruit were relatively insignificant, and that the tournament rule Rajlich allegedly violated is vague and obsolete.

Pointing out that Rajlich admitted using Fruit's ideas from the beginning—such as in a 2005 interview where he reported that he scrutinized the Fruit source code "forwards and backwards and took many things"—Riis says that any infraction of the rules could not possibly have been an intentional effort to deny Rybka's pedigree.

Riis also claimed that the status quo in chess programming is just as Levy feared it, with disassembly and algorithm-borrowing so widespread that Rybka, as leader of the pack, is more victim than perpetrator.

His most forceful criticism, however, centered on "fabricated" C code included in Watkins's technical report. Presented as "Rybka", the code is in fact a recreation based upon the disassembled machine code.

"The ICGA used a report relying on hypothetical code presented as Rybka code to convince others that copying occurred, when only the algorithms could have been copied," Riis said.

Riis also speculated that Rajlich was being attacked due to the competitive success of his work, and that the investigation's unauthorized disassembly of it was itself clearly illegal: "The whole process was an unprofessional disgrace."

Within a week, the ICGA responded to Riis's report, in the form of a letter penned by Levy and a further technical analysis from Watkins.

Levy pointed out that, all else notwithstanding, Rajlich still failed to declare Fruit on the competition entry forms, thereby breaking the tournament rules. He also reaffirmed the strength of the forensic case against Rajlich, as revealed by Watkins and others, and said that the panelists knew the report's recreated code was merely for illustrative purposes.

"He greatly minimizes the magnitude of Fruit/Rybka overlap", Watkins wrote, and "fails to address a number of additional Fruit/Rybka congruences that were detailed by the investigation."

"Riis omits any mention of the fact that Rajlich had previously plagiarized Crafty in private 2004 versions of Rybka, and furthermore that these versions had little internal similarity to the 2005 Rybka. The latter fact played a significant role in the Panel deliberations, strongly implying 2005 Rybka was a re-write, at the least."

Riis admits that Watkins uncovered some problems with his response, but remains unmoved: "Spin or rhetoric cannot change the fact that the most basic conditions for a fair trial were absent and the process by which Rajlich was convicted was seriously flawed."

FRANKIE, YOU BROKE THE UNWRITTEN LAW

Gone are the days when human champions could evoke the tension of the cold war. But the chess world's exceptionally intense personages are still not to be trifled with.

In their long, footnoted reports, the ICGA and Riis segue from complex chess programming issues to insults and back again without skipping a beat. In his four-part feature, for example, Riis embarks on a detour just to mock one Rajlich critic for his relentless torrent of forum posts on the matter. The latest "critical analysis" published by the ICGA likewise finds time to ridicule Riis for his "bleating."

And what of Rajlich himself? Usually willing to let others argue it out, he recently offered thanks to Riis and other supporters, and said 2012 would see the release of Rybka 5.

"Soren did a great job detailing the shenanigans pulled during the ICGA's investigation, from stacking the jury to premature public accusations to a comprehensive fabrication of evidence," he wrote at the official Rybka website. "Of course there is a clear influence of Fruit on Rybka. I haven't tried to quantify this influence or compare it to other engines from Rybka's generation. What I can say is that Rybka is original at the level of source code. In the context of source code, original means that the author either typed his own code or typed the code which generated his own code. For the super-geeks, yes, that can be applied recursively."

This pattern, whereby Rajlich's critics tend to speak of "algorithms" while he speaks of "source code", seems to reflect each side's understanding of copyright law's relevance to the case. Whereas the ICGA considers the algorithms used in chess software to pass tests of copyright-worthiness (explicitly mentioning the Abstraction-Filtration-Comparison Test), Rajlich instead focuses on their expression, insisting that his code is original—an approach that might prevail in a U.S. courtroom, if not the ICGA star chamber.

"The implementation of ideas and algorithms learned from other programs is universal practice in chess programming," Riis wrote, later adding via email: "You can certainly copyright code containing a formula or an algorithm, or a writeup containing these, but if someone goes in and just steals your formula or algorithm sans code, you can't claim copyright protection."

This distinction notwithstanding, the ICGA's conception of authorship clearly extends further down, to the algorithms. This is understandable, given that chess is a game with few rules, played under time pressure, where small, efficient programs are the currency of success. Even tiny details may confer an insurmountable advantage on the board.

In this respect, chess algorithms may be similar to typefaces, recipes and dance choreography: realms largely unprotected by copyright law, but where moral authorship is central to the professional credibility of those who make a living there.

But if there's one thing both sides seem to agree on, it's that authorship can be a somewhat vaporous thing in the world of chess programming. The ICGA investigation, after all, was heralded by Levy's exasperation at the amount of illicit "cloning."

"[It] not only damages the commercial opportunities for the original programmers, it also steals the kudos of tournament successes," Levy wrote. "Genuinely achieving a great result in a top level chess tournament requires years of painstaking effort by a highly skilled and highly motivated programmer or team of programmers, yet the creation of a clone steals the glory and public acclaim from its rightful owner."

To Riis, though, the cat has been out of the bag too long, resulting in a "paradigm shift." Computer chess engines now quickly absorb one another's innovations: "Everyone in the top tier of chess programming learned from Fruit. Retrospectively, what now seems clear is that Fruit also unwittingly triggered a revolution in the whole ethos of chess programming. From the emergence of Fruit and going forward, the premise within the programming community was that it was perfectly fine to re-use and share ideas and algorithms from leading programs whether they were open source or not."

When everyone copies everyone, like members of a highly-competitive research team working together on a single problem, everyone benefits. If the collaborative methods are rotten, however, the balance of power is always vulnerable to accusations of unfairness, betrayal and theft.

Fruit was open-source, but Rybka was not. Fruit itself went closed-source soon after its initial success. Though Rybka was suspected early of foul play, it took five years for action to be taken; five years in which a shifty world of inconsistently-tolerated disassembly and idea-cloning, outlined by Levy and Riis alike, took hold.

"It's impossible to write a modern chess program without borrowing extensively from other programs or algorithms," said Riis. " … No serious chess programmer would program a chess engine from scratch … To the best of my knowledge all of today's top programs either a) started up by taking some existing program apart or b) have liberally used other programs to enhance their own design."

If as many other leading engines have incorporated Rybka's innovations into their own code as Riis implies, they too may be in a compromised position if the spotlight ever falls upon them: "Rajlich's original ideas have been lifted from various reverse-engineered editions of Rybka again and again and again—his work has been pilfered as comprehensively as anyone's in all of computer chess history."

"I am completely unconcerned about the reverse-engineering that has been done," wrote one Rajlich critic, Dr. Robert Hyatt, during a forum debate. "Seems like a fair way to 'even the playing field' by forcing a secretive author to expose secrets he has desperately tried to hide."

But now even that option is off the menu: in its latest versions, Rybka works as an internet service instead of as an application downloaded to a personal computer. As a result, it's now harder to look under the hood for a peek at its inner workings. This could threaten to upend the chess AI scene permanently, and it's easy to understand how frustrating this would be to rivals who tolerated Rybka's loose attitude to authorship so long as they could do likewise.

Asked if he thought it would be possible to make a competitive, commercial chess engine that remained open-source, Rajlich said that he did: "Of course competitors would quickly catch up, but that isn't much different than releasing an executable. An executable isn't that far from source code in terms of having value to competitors."

If all this is even slightly true, the most interesting part of Rajlich's punishment is not his disgrace and exile, but the act of public scrutiny forced upon his work. It is a reminder, to everyone, that Letouzey may have been right to begin with.