Will natural language processing engineers find it hard to get work in the future (once computers are capable of near-perfect text and speech processing)?

Blog

Directory

Submitted by srchvrs on Mon, 07/16/2018 - 23:08

This is written in response to a Quora question, which asks if most NLP engineers will be out of jobs once computers are capable of near-perfect text and speech processing. Feel free to vote there for my answer on Quora! There is also an interesting recent blog post by a Carnegie Mellon professor Jeffrey P. Bigham, where he expresses similar concerns and argues that in the near future artificial intelligence should be treated as a human-computer interaction problem rather than purely an algorithmic problem. I fully agree with this point of view!

So, will NLP be engineers be out of jobs? Yes, absolutely! Future is highly uncertain. As believed by Marvin Minsky, an imperfect human race will create a new robotic race that will not suffer from human limitations and which will inherit the Earth and surrounding planets. As humans have outcompeted and replaced many other species, these robots will outcompete and replace humans. We should not, however, fear the future but fulfill our evolutionary destination and welcome our robot overlords.

As impressive as they are, existing AI systems only seem to be intelligent. We do not know how many dozens, hundreds, and, possibly, thousands of years it will take to create truly intelligent machines. In fact, we do not truly know what it means to be intelligent and what is required to be intelligent. A recent high-profile paper: “Building Machines That Learn and Think Like People”, 2016 by Lake et al., tries to find some answers, but its conclusions are far from being definitive.

Ray Kurzweil famously predicted singularity to happen in 2045 based on the exponential growth of computational capacity. However, the size of the transistor is already about 100x the size of an atom. My guess would be that the current technology has a potential of a 10x increase in capacity. It also seems that there is no production-ready immediate replacement on the horizon. In particular, it is not clear when (and if) 3-d chips will be available.

At the same time, the best GPUs have about 20 billion transistors, while the human brain has 100 billion neurons each of which has 10K connections (synapses) on average. How many transistors are necessary to create an artificial neuron? One of the most advanced custom neural chips TrueNorth implements one million spiking neurons and 256 million synapses on a chip with 5.5 billion transistors with a typical power draw of 70 milliwatts.

Thus, it takes about 20 transistors per synapse. Even if we assume that an artificial neuron is as powerful as the real one (which is likely very far from truth), the current technology is six freaking orders of magnitude behind a human brain! Size notwithstanding, power consumption is also an enormous challenge. According to the above cited report, if TrueNorth is scaled up to the size of the human brain it would require 10,000 times more energy!

Furthermore, it is highly unrealistic to assume that an artificial neuron is nearly as complex as a real one. For example, the following book argues that even a single-cell organism (albeit a rather large one) can exhibit extremely complex behaviors, which include sensing and hunting: Wetware: A Computer in Every Living Cell: Dennis Bray. C elegans has about 500 neural cells, but it has basic sensory system and muscle control! It can reproduce and mate.

As my co-author and friend Daniel Lemire noted, our planes do not fly like birds and submarines do not swim like fishes. We do not have to mimic human brain to solve artificial intelligence tasks. We may not even need a brain-like structure to create a truly thinking machine. However, I would argue that we—using the phrase of the Turing award winner Richard Hamming—simply do not have an attack, i.e., a reasonable way to approach this difficult problem.

Another good observation from Daniel Lemire is that we should expect the unexpected because experts can be easily wrong. For example, there were some predictions about impossibility of flight in early 20th century. Although we should expect breakthroughs anytime, I do not think that impossibility of flight for heavier than air machines is a good example. The first gliders appeared well before the first propelled planes. In fact, some people had very clear ideas about how planes should and could fly. This is not true for the general artificial intelligence and we have not built the first gliders yet.

Traveling to the stars is clearly a difficult problem. Few people would argue with that. However, for some reason everybody thinks that artificial intelligence is something that is just a few (dozens) years away. Well, it could be so. But it could also be harder than interstellar travel.

Even if we can create a human-size neural network, we do not know how to program it efficiently. A state-of-the-art approach to training a model consists in collecting a huge amount of data and making a neural network that finds a mapping from inputs to outputs. This approach truly revolutionizes speech and vision and improves to some degree text processing. However, it might be just a gigantic “fuzzy” memory.

This approach is also incredibly brittle and data greedy. We do not know if we can scale it from hundreds to millions of layers. There are a number of recent papers showing that it is very easy to “poison” training data. For example, in the IMDB sentiment dataset the error rate can be driven from 12% to 23% by adding only 3% poisoned data. State-of-the-art CNNs fail (accuracy drops from 90+% to 10-%) for color modified CIFAR-10 images that are easily classified by humans.

Another big success of neural networks is speech recognition. Perhaps, it is the biggest success so far. For clean speech we can get near human recognition rates. However, on noisy data and especially when multiple speakers are present (an infamous cocktail party setting) the results are quite subhuman. The cocktail party setting is especially bad. It is a big success if you can reduce the word error rate from 90% down to 50% or to 30% (i.e., a computer misses every second or third word).

One clear issue with the current approaches is that clean training data can be quite expensive to obtain. For the existing not-so-clean data (collected in a semi-supervised fashion), there can be only little benefit by scaling (an already huge) training set by further two (!) orders of magnitude.

For example, a recent work by researchers from Google and Carnegie Mellon university has showed that a 300x (!) increase in the number of training examples only modestly improves performance. There is a lot of hope that reinforcement learning will solve these issues, but it does not seem to work yet.

All in all, judging by a good number of publications and blog posts that I have been reading in the last six years, we can now do well in a number of constrained domains. However, the success depends mostly on the existence of human-created training data and tons of engineering effort. In that, I suspect that the success of end-to-end systems (i.e., no engineering effort to modularize the problem and synthesize a system from multiple sometimes handcrafted models) is still limited.

Extending existing techniques to new domains requires many years of work from skilled engineers and scientists. I do not see how this can change in the near future. I actually expect that we will need many more scientists and engineers to continue making good progress. Brace yourself, it looks there is megatons of work ahead.

srchvrs's blog

You are here