Recently, the company I worked for was acquired by another company, and the new owner decided to close the Denver office. Suddenly, I had to face the unpleasant necessity of selling myself on the job market again. The last time I did that was five years ago, and since then the programming job landscape has changed. The most advanced and fastest-growing companies now talk about machine learning, big data analysis, and big data processing. Five years ago I did not even hear these words during a job interview.
Luckily for me, I had done some of it over the last few years. We built models that predicted user behavior based on historical data. The basic approach is pretty straightforward. You collect user data (say, each user is represented by 30 parameters, like name, age, address, education level, etc.) and you mark each user with a flag indicating whether s/he performed the action of interest: bought a certain product (true/false), clicked a certain advertisement (true/false), spent more than 5 seconds reading the advertisement (true/false), and so on. Then you feed these data into an algorithm and generate a model, which gives you a score for each user (the probability that the user will perform this or that action). Then you test the model on another period of the historical data, tweak it if needed (to make it more precise), and use it to predict a new user's behavior: show the product s/he will most probably buy and push aside the products least likely to be bought. I was surprised to see how much sales revenue grows with such a simple technique.
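To make the workflow concrete, here is a minimal sketch in Python with scikit-learn. The file name, column names, the split into training and test periods, and the choice of logistic regression are all illustrative assumptions, not the actual system we built.

```python
# Minimal sketch of the approach described above: label historical users with a
# true/false flag, fit a model, and score users by the probability of the action.
# File name, column names, and the "period" split are hypothetical.
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

history = pd.read_csv("users_history.csv")      # one row per user, ~30 parameters
label = "bought_product"                        # assumed name of the true/false flag

# Numeric parameters assumed here; real fields like name or address would need
# encoding into numbers first.
features = [c for c in history.columns if c not in ("user_id", "period", label)]

# Train on an earlier period of the history, test on a later one.
train = history[history["period"] == "2016-Q1"]
test = history[history["period"] == "2016-Q2"]

model = LogisticRegression(max_iter=1000)
model.fit(train[features], train[label])

# The score: probability that each user in the test period performs the action.
scores = model.predict_proba(test[features])[:, 1]
print("AUC on the hold-out period:", roc_auc_score(test[label], scores))
```

The same predict_proba call is what would rank products or advertisements for a new user.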
That is what basic machine learning functionality amounts to. There is a lot of mathematics around it. And you have to understand your data too: do some data properties depend on each other? How reliable are the data? Which combination of fields identifies a user uniquely? Are there duplicates? Which algorithms are better for the kind of prediction you need? And you have to know the limitations and parameters of the algorithms. Then pure programming challenges come in: how do you process a big amount of data quickly enough? You might not be able to get away with one computer, even a very powerful one. For example, our company used the Apache Hadoop framework to coordinate twenty-seven multi-core servers to meet the required time constraints. And companies like Google and Amazon use many thousands of computers for that.
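A couple of those questions can be answered with a few lines of code before any modeling starts. Here is a small, hypothetical sketch in pandas; the column names are invented for illustration.

```python
# Small, hypothetical checks on the raw user table before modeling.
# Column names ('user_id' and the rest) are invented.
import pandas as pd

users = pd.read_csv("users_history.csv")

# Do some data properties depend on each other?  A correlation matrix of the
# numeric columns highlights pairs that carry almost the same information.
corr = users.select_dtypes("number").corr()
print(corr.round(2))

# Are there duplicates?  Here we assume 'user_id' should identify a user uniquely.
dupes = users[users.duplicated(subset=["user_id"], keep=False)]
print(f"{len(dupes)} rows share a user_id with another row")
```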
The adjective “big” appeared during the last few years and spread like a ground fire during a dry season on the prairie. The latest version of Java includes language constructs that support parallel computing and asynchronous processing. New, non-blocking servers were created that take multi-threading upon themselves, but you, as a programmer, have to structure your data and algorithms so that they can be used in parallel. “Reactive”, “dynamic”, and “functional” have become standard adjectives attached to the word “programming”.
Traditional sequential programming is history. Now huge amounts of data are processed in parallel by huge numbers of independent agents (microservices). The results are picked up by a swarm of other agents (microservices), and so on, until the final result is reached.
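As a toy illustration of that fan-out/fan-in idea (not any particular framework; the data and the work done by each "agent" are invented), independent workers can each process a chunk of data while a later stage combines their partial results:

```python
# Toy fan-out/fan-in sketch: independent workers process chunks of data in
# parallel, and a final step picks up and combines their partial results.
from concurrent.futures import ProcessPoolExecutor

def score_chunk(chunk):
    # Stand-in for whatever one "agent" does to its slice of the data.
    return sum(x * x for x in chunk)

def main():
    data = list(range(1_000_000))
    chunks = [data[i:i + 100_000] for i in range(0, len(data), 100_000)]

    # Fan out: each chunk is handled by an independent worker process.
    with ProcessPoolExecutor() as pool:
        partial_results = list(pool.map(score_chunk, chunks))

    # Fan in: a later stage combines the partial results into the final one.
    print("combined result:", sum(partial_results))

if __name__ == "__main__":
    main()
```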
The most exciting aspect of this brave new world for me was the discovery that my human intuition could not compete with machine learning algorithms anymore. I used to think that computers just do faster the job a human can do, and that a programmer can predict the result in principle, if not with high precision. What I find more and more is that my intuition does not give me even a hint of the direction that an algorithm later identifies as the most fruitful. I am not even talking here about deep learning, where a computer decides which data parameters to use and which to drop. Even a basic machine learning algorithm can find trends in the data you never even hoped existed.
My friends, I admit my total surrender to the power of artificial intelligence. It is not even a question of my good will. It is just the simple and straightforward fact that faces everybody who works in this area. The race is over. Humans can rule, or have the illusion that they do, but we cannot compete with machines anymore in understanding what is going on. Well, by “understanding” I mean the ability to discover an actionable result. So far, we do not allow machines too much freedom to act, but let us see what happens just a few years from now.
Then I thought about the latest trends in physics. The science used to be about the nature of things around us. Physicists talked about the structure of the material world, about forces, causes, and effects. Then we were forced to admit that we did not know what was actually “out there.” All we can do is predict the outcome, using formulas (called “laws” if they are not broken too often). Given the input, we can compute the results, but we do not know why everything happens the way it happens. Even when we invoke the notion of “God,” we still cannot explain why s/he decided to do it this way or that. Poor schoolchildren! I remember how we struggled to explain physical processes, while all we were actually required to do was memorize the words and sentences that were approved in lieu of explanation.
In the course of the last decade this understanding, once heretical and dissenting, became public knowledge. Many physicists have accepted the notion of a “computational universe”: physics describes the behavior of the world around us not as a system of cause and effect, but as an interplay of various states, whose nature remains unknown and unknowable to us in principle.
As many of you know, I started my professional life as a physicist. Then I met Valeriy Popov, also a physicist, a graduate of the Moscow Engineering Physics Institute, who tempted me with his vision of modeling real-life objects and processes in a digital software system. That’s how I became a programmer. To model complex systems better, we tried to introduce object-oriented design into a fundamentally procedural programming language.
That is why, when Ada, then C++ and Java hit the market, I was ready and easily converted to this new programming paradigm of many independent objects interacting and performing calculations over their states in the quiet darkness of a digital space. And now, over the last five years, big data has forced us to move even further down that path, introducing even more independent agents that process flows of data (states) and identify the dependencies between them. Doesn’t that sound like a scientific activity already? It does to me.
If that is the case, then I hope that physics and programming will converge and I will have a chance to become a physicist again, at least to a degree. I am looking forward to it! Well, it might not happen very soon, but Aubrey De Grey promises to let us live 1,000 years or more (even if my friend John Graham calls this promise “hogwash”; did I spell it correctly, John?).
P.S. It took me two weeks from the start of my search to get an offer for a new job, which is not bad at all. Another good thing is that the new job requires using all the latest stuff I mentioned above. I will tell you later how it goes.
But humor will always remain exclusively human!
There was a recent paper, “We Are Humor Beings: Understanding and Predicting Visual Humor,” that shows how machine learning can be used to allow a computer to recognize humor.