What do chess and data cleanup projects have in common?
At first glance, not much. One is a game that has been played for over a thousand years and is often treated as a symbol of human intelligence and creativity, with champions earning worldwide fame and recognition. The other involves staring at a computer screen for hours on end, fixing typos and other minor errors, and is often relegated to high school interns.
In a 2010 article in the New York Review of Books, Garry Kasparov, the famed chess grandmaster, makes several observations about chess that apply just as well to data cleanup projects.
Kasparov begins the article by detailing some of his famous matches against supercomputers, including his loss to Deep Blue in 1997. He remarks that while the artificial intelligence community was certainly happy about the computer’s win, its members were a little disappointed with how Deep Blue won.
“Instead of a computer that thought and played chess like a human, with human creativity and intuition, they got one that played like a machine, systematically evaluating 200 million possible moves on the chess board per second and winning with brute number-crunching force.
…It was an impressive achievement, of course, and a human achievement by the members of the IBM team, but Deep Blue was only intelligent the way your programmable alarm clock is intelligent. Not that losing to a $10 million alarm clock made me feel any better.”
This rings true for almost anyone who has used a computer.
Computers are great at brute-force computations.
Humans are great at thinking of creative solutions.
But rather than pitting humans against computers, what happens when they work together as a team?
That question was put to the test in 2005, when the website playchess.com held an online “freestyle” chess tournament. The normal anti-cheating rules weren’t in place. In fact, participants were encouraged to work in groups and use any chess software they liked.
Even a few grandmasters and chess-specific supercomputers (notably Hydra Scylla and Hydra Chimera) entered the tournament under anonymous screen names, all drawn by the lure of fame and a substantial cash prize for the winner.
The tournament results were interesting, to say the least.
Both Hydra computers were out of the running before the quarterfinals, handily beaten by human experts with cheaper laptops. According to Kasparov, “Human strategic guidance combined with the tactical acuity of a computer was overwhelming.”
There was an even bigger surprise at the end of the tournament. The winners weren’t grandmasters. They were two amateurs using three computers at once.
According to Kasparov, “Their skill at manipulating and ‘coaching’ their computers to look very deeply into positions effectively counteracted the superior chess understanding of their grandmaster opponents and the greater computational power of other participants. Weak human + machine + better process was superior to a strong computer alone and, more remarkably, superior to a strong human + machine + inferior process.”
As in chess, the best way to approach a data cleanup project is to combine a good process, a good algorithm, and human creativity. It doesn’t require a supercomputer. It doesn’t require a data cleanup expert. Even better, it doesn’t require you to become a data cleanup expert. It just requires solid algorithms for detecting matches, human decision-making for fuzzy matches, and a great process for tying it all together.
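To make that division of labor concrete, here is a minimal sketch in Python. It is an illustration only, not a description of LeanData's actual process. The function names (`normalize`, `similarity`, `triage`) and the threshold values are assumptions chosen for the example, and the similarity score comes from `difflib.SequenceMatcher` in Python's standard library, which stands in for whatever matching algorithm a real project would use. Exact matches are merged automatically, fuzzy matches are queued for a person to decide, and everything else is left alone.

```python
from difflib import SequenceMatcher


def normalize(name):
    """Lowercase and collapse whitespace so trivial differences don't hide a match."""
    return " ".join(name.lower().split())


def similarity(a, b):
    """Return a score from 0.0 (nothing in common) to 1.0 (identical after normalizing)."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def triage(pairs, auto_threshold=1.0, review_threshold=0.8):
    """Split candidate pairs into three buckets.

    Both thresholds are illustrative assumptions, not recommended values.
    """
    merged, review, distinct = [], [], []
    for a, b in pairs:
        score = similarity(a, b)
        if score >= auto_threshold:
            merged.append((a, b))      # the algorithm handles the obvious cases
        elif score >= review_threshold:
            review.append((a, b))      # a human decides the fuzzy ones
        else:
            distinct.append((a, b))    # clearly different records, left untouched
    return merged, review, distinct


if __name__ == "__main__":
    candidates = [
        ("Acme Corp", "acme  corp"),
        ("Jon Smith", "John Smith"),
        ("Acme Corp", "Globex Inc"),
    ]
    merged, review, distinct = triage(candidates)
    print("Merged automatically:", merged)
    print("Needs human review:", review)
    print("Left as distinct:", distinct)
```

The algorithm here is deliberately simple. As in the freestyle tournament, most of the value comes from the process around it: deciding which cases the machine settles on its own and which ones go to a person.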
Want to learn more about how LeanData’s process works? Visit the Solutions page.