This is written in response to a Quora question asking for a layman's explanation of the difference between adversarial and evil AI. Feel free to vote on my answer on Quora.

This is an excellent question! For starters, in my opinion, current AI relies heavily on statistical learning methods, which are rather basic. For this reason, it is nowhere near producing sufficiently intelligent machines, let alone machines that have feelings, emotions, free will, etc. There are algorithmic and hardware limitations, which I cover in my blog post (also available as a Quora answer).

Modern AI cannot be evil in the traditional human sense of the word; however, it can cause a lot of harm, like any other immature technology. For example, despite the famous claim by Turing Award winner G. Hinton that we would have to stop training radiologists roughly by today, there is mounting evidence that deep learning methods for image analysis do not always work well.

Furthermore, statistical methods (aka AI) are becoming ubiquitous decision-making tools (money lending, job search, and even jailing people). However, statistical learning methods are not inherently fair and can be biased against certain groups of people. From this perspective, AI can be considered evil. Of course, humans are biased too, but human opinions are diverse and we, humans, tend to improve. Having a single black-box, uncontrollable decision algorithm that becomes more and more biased is a scary prospect.
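To make the bias concern concrete, here is a minimal sketch (not from the original post, using entirely made-up data) of one simple way such bias is measured: the "demographic parity" gap, i.e., the difference in approval rates between two groups in a hypothetical lending system.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical loan decisions (True = approved) for two groups, A and B.
group = rng.choice(["A", "B"], size=1000)

# Simulate a biased decision model that approves group A more often.
approved = np.where(group == "A",
                    rng.random(1000) < 0.7,   # ~70% approval for A
                    rng.random(1000) < 0.4)   # ~40% approval for B

rate_a = approved[group == "A"].mean()
rate_b = approved[group == "B"].mean()
print(f"Approval rate A: {rate_a:.2f}, B: {rate_b:.2f}")
print(f"Demographic parity gap: {abs(rate_a - rate_b):.2f}")
```

A large gap does not by itself prove the model is unfair, but it is the kind of red flag that a black-box decision system can quietly accumulate.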

Modern AI is unreliable and immature: it works only in very constrained environments. Why is that? Because statistical learning is a rear-view-mirror approach that makes future decisions based on patterns observed in the past (aka training data). Once the actual (test) data diverges from the training data in terms of its statistical properties, the performance of modern AI degrades quite sharply.
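This effect is easy to demonstrate. Below is a minimal sketch with synthetic data (my own illustration, not from the original post): a classifier is trained on one distribution and evaluated both on data drawn from the same distribution and on data whose statistics have shifted.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(42)

def make_data(n, shift=0.0):
    """Two Gaussian classes; `shift` moves the whole distribution."""
    x0 = rng.normal(loc=0.0 + shift, scale=1.0, size=(n, 2))
    x1 = rng.normal(loc=2.0 + shift, scale=1.0, size=(n, 2))
    X = np.vstack([x0, x1])
    y = np.array([0] * n + [1] * n)
    return X, y

X_train, y_train = make_data(500)            # training distribution
X_same, y_same = make_data(500)              # test: same distribution
X_shift, y_shift = make_data(500, shift=3.0) # test: shifted distribution

clf = LogisticRegression().fit(X_train, y_train)
print("Accuracy, same distribution:   ", clf.score(X_same, y_same))
print("Accuracy, shifted distribution:", clf.score(X_shift, y_shift))
```

On the unshifted test set the model scores near perfectly; on the shifted one it collapses toward chance, because the decision boundary it learned from past data no longer matches the new data.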

In fact, it is often possible to tweak the data only slightly and yet sharply decrease the performance of an AI system. This is called an adversarial attack. For example, there is research showing that adding distractor phrases does not confuse humans much, but it completely "destroys" the performance of a natural language understanding system. For reference, the modern history of adversarial examples started with the famous paper by Szegedy et al. (2013), who showed that image perturbations too small to be noticed by humans completely confuse deep neural networks.
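As an illustration of how simple such an attack can be, here is a sketch of the Fast Gradient Sign Method (FGSM) of Goodfellow et al. (2014), a well-known follow-up to Szegedy et al.'s work (the model and input below are toy placeholders, not a real image classifier): each input feature is nudged by a tiny amount in the direction that increases the model's loss.

```python
import torch
import torch.nn as nn

def fgsm_attack(model, x, y, epsilon=0.01):
    """Return an adversarially perturbed copy of input `x`."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = nn.functional.cross_entropy(model(x_adv), y)
    loss.backward()
    # Step in the sign of the gradient: a small per-feature change,
    # but chosen specifically to increase the loss.
    return (x_adv + epsilon * x_adv.grad.sign()).detach()

# Toy usage with a hypothetical linear "image" classifier.
model = nn.Linear(28 * 28, 10)
x = torch.rand(1, 28 * 28)   # a fake flattened image
y = torch.tensor([3])        # its (assumed) true label
x_adv = fgsm_attack(model, x, y, epsilon=0.1)
print("Max per-pixel change:", (x_adv - x).abs().max().item())
```

The perturbation is bounded by `epsilon` per pixel, which is why the change can be invisible to a human while still flipping the model's prediction.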

In summary, adversarial AI has nothing to do with evil AI. It is primarily concerned with devising methods to fool modern statistical learning methods using (adversarial) examples, as well as with methods to defend against such attacks. Clearly, we want models that can withstand adversarial attacks. This is a difficult objective, and a lot of researchers specialize in so-called adversarial AI.