The Rise of the Algos

By Bogdan Teleuca

As the world moves towards the big data era, society will undergo a major shift. Big data is already transforming many aspects of our lives and forcing us to reconsider basic principles as we evaluate how best to utilise big data while preventing potential harm. Simple changes to existing rules will not be sufficient to temper big data’s dark side.

In their book, Big Data (Houghton Mifflin Harcourt, 2013), Viktor Mayer-Schonberger and Kenneth Cukier propose a new role called “algorithmist.” The “algos,” as I like to call them, will need to be equipped to deal with the major issues of privacy and information transparency that are at the core of this societal shift.

For instance, consider the issue of privacy in relation to data collection, where much of the value typically comes from secondary uses. Since these uses were not known when the data was initially collected, it is hard to see how the producer of the data could have been asked for individual consent.

In protecting privacy, human freedom will become paramount. New institutions and a new breed of professionals, the algos, will need to emerge to interpret complex algorithms and to advocate for people who might be harmed by big-data usage.

The responsibility will shift from the data producer (a person who tweets, for example) to the data user (a developer of a tweet-based sentiment index algorithm). This makes sense: it is the data user who knows (or should know) how they intend to exploit the collected data, and they are best positioned to know whether privacy rules will be broken by the applications they are about to develop.

An algo could help data users assess whether the privacy framework is respected and individual privacy rights are protected. Moreover, the algo can help a company devise protections that blur the data (make it fuzzier) without destroying its value. For example, a pharmaceutical company wishing to test its models on external data about persons treated for epilepsy within a certain geographical area may be satisfied with a more aggregated answer (people living in the Brussels Region) instead of granular data (a full address or postcode) that could identify the individual and hence break the privacy framework.

The algo should be able to understand the usage and the privacy frameworks in order to devise such forms of blurring, or fuzziness.
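The kind of blurring described above can be sketched in a few lines. This is a minimal illustration, assuming a hypothetical mapping from postcode prefixes to regions; the field names and mapping are made up for the example, not drawn from any real dataset:

```python
# Minimal sketch of geographic generalisation: the identifying postcode
# is replaced by a coarser region before the record is shared, trading
# precision for privacy. The mapping and field names are illustrative
# assumptions.

# Hypothetical mapping from postcode prefixes to regions.
POSTCODE_TO_REGION = {
    "10": "Brussels Region",
    "11": "Brussels Region",
    "12": "Brussels Region",
    "20": "Antwerp Province",
}

def blur_record(record):
    """Return a copy of the record with the postcode blurred to a region."""
    blurred = dict(record)
    prefix = blurred.pop("postcode")[:2]
    blurred["region"] = POSTCODE_TO_REGION.get(prefix, "Unknown")
    return blurred

patient = {"condition": "epilepsy", "postcode": "1040"}
print(blur_record(patient))  # {'condition': 'epilepsy', 'region': 'Brussels Region'}
```

The granular value never leaves the function: the pharmaceutical company receives a record that is still useful for modelling but no longer identifies the individual.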

Predictions

The concept of justice is based on the principle that humans are responsible for their own actions. In the big data era, there is a strong temptation to predict which people will commit a crime and subject them to “special treatment,” or to calculate, based on historical data, a probability of what their future actions might be. For example, one might employ a model to determine which inmates released on parole will eventually return to prison.

Suppose a “target” determined by the algorithm has a name, let’s say Jimmy. Jimmy is a person, and he has a life. By looking at the data gathered about his past, the algorithm decides “there is a high probability” that Jimmy will commit a small felony in the next three years.

Like any human, Jimmy has hopes, dreams and plans. Should the Department of Justice assign a social worker to keep a close eye on Jimmy, perhaps even try to steer him towards the right path (which could be seen as a kind of harassment)? Or should it respect the fundamental pillar of human rights and leave him alone, since Jimmy has not, in fact, committed any crime yet?

A tough question, and I am sure there are many different opinions on it.

A fundamental pillar of big-data governance must be a guarantee that we judge people based on their behaviour and observed acts, not by crunching data to qualify them as potential wrongdoers. Statistically, a model of historical data is not always the best predictor of a future state.

It must be the role of the algo to raise a flag when a judgment made on propensity might actually harm a person.
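A minimal sketch of such a flag, assuming a hypothetical decision record that states whether a judgment rests on observed acts or only on a modelled propensity (the fields and the rule itself are assumptions for illustration, not a real system):

```python
# Illustrative sketch: flag any judgment that rests on a modelled
# propensity rather than on observed acts. The record fields and the
# flagging rule are assumptions, not a real system.

def review_decision(decision):
    """Return 'ok' for judgments based on observed acts, or a warning
    flag for judgments based only on a predicted propensity."""
    if decision["based_on_observed_acts"]:
        return "ok"
    return ("FLAG: judgment rests on a predicted propensity of "
            f"{decision['predicted_propensity']:.0%}, not on behaviour")

jimmy = {"based_on_observed_acts": False, "predicted_propensity": 0.82}
print(review_decision(jimmy))
# FLAG: judgment rests on a predicted propensity of 82%, not on behaviour
```

The point of the sketch is the asymmetry: observed acts pass through silently, while any purely predictive basis triggers a human review, which is exactly the guarantee argued for above.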

The Black Box

Computer systems generally base their decisions on rules they have been designed to follow; others use classes of algorithms that perform what is called unsupervised learning. So when a system makes a crazy decision, somebody can open and inspect the computer code and (try to) understand it.

Suppose an order to sell a million shares at half the market price was suddenly routed for execution, causing panic and sending trading algos into fire-sale mode. Or a plane’s automated pilot activated without warning, descending the plane 2,000 ft in a matter of seconds. The code can be opened, the logs (supposing there are any) and code inspected, and eventually the program improved.

In recent years, regulatory bodies in the financial industry have placed a lot of emphasis on the transparency and traceability of data transformation and calculation processes. Basically, they ask to be able to open the black box, or to have it built from transparent materials in the first place.

The main risk of big-data applications is that they will become black boxes that elude accountability and traceability, erode confidence, and lack explanatory power. To prevent this, big data will require monitoring, transparency and a new type of expertise and intuition.
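As a concrete (if simplified) illustration of that traceability, every automated decision can be appended to an audit log together with its inputs, model version and output, so the black box can be opened after the fact. The record structure below is an assumption for the sketch, not a regulatory standard:

```python
# Illustrative sketch of decision traceability: each automated decision
# is recorded with its inputs, the model version that produced it, and
# the output, so it can be reconstructed later. The record structure is
# an assumption, not a regulatory standard.
import datetime

def log_decision(audit_log, model_version, inputs, output):
    """Append one traceable decision record to the audit log."""
    audit_log.append({
        "timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "model_version": model_version,
        "inputs": inputs,
        "output": output,
    })

audit_log = []
log_decision(audit_log, "order-router v2.1",
             {"shares": 1_000_000, "limit_price": 12.50},
             "route_to_execution")
print(audit_log[-1]["model_version"], "->", audit_log[-1]["output"])
```

With such a log, the rogue sell order above stops being a mystery: the exact inputs and model version behind it are on record.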

The algos must be courageous enough to take on this role and to devise effective governance for these models.

Algo - The Professional

Let us now see how bright the future might look for an algo, or algorithmist (if you prefer the long academic name), and what role they might play. The algos must have expertise in:

Computer Science: proficient in SQL and NoSQL, in ETL and NoETL, in data warehousing and document-management databases, and able to understand small, structured data and big, unstructured data alike. One might ask: how can one be an expert in NoETL, NoSQL, NoSomething? Welcome to 2015, when you can make a pretty good living out of it…

Mathematics and Statistics: not trusting them blindly will be the first condition. Statistics cannot be smarter than the people using them, but in some cases stats can make very smart people do very dumb things.

Industry experience: a decent number of years in various roles and at different organisations. They need to understand that people do not really care about infrastructure, algorithms, models and software (although some do); people mostly care about building relationships with other people.

Algos will evaluate the selection of data sources, their quality, and the choice of analytical and predictive tools. Algos will help people interpret the results. In the event of a dispute, they will be given access to the algorithms, statistical approaches and datasets that determined a decision.

Algos will perform audits for companies that want expert support, and they may certify the soundness of big-data applications (anti-fraud or stock-trading systems, for example). Finally, external algorithmists will advise organisations and the public sector on how to make use of data in their domain.

The algos will adhere to a code of conduct and this new profession will regulate itself.

The algos’ impartiality, confidentiality, competence and professionalism will be enforced by tough liability rules. They will also need to be called upon by courts as big-data experts.

Finally, any person who believes they have been harmed by big-data predictions, for instance a patient rejected for surgery, a candidate denied a job, an employee fired, an inmate denied parole or a person denied credit, will be able to turn to the algos for help in understanding, and appealing, those decisions.

Only time will tell if the algos will become a self-regulated professional body and be the guardians of the bridge between people and data. Till then, keep calm and keep coding, it's business as usual...

Bogdan Teleuca is a Senior Risk Consultant at Business & Decision Belgium. He has diverse experience in risk management, information management and technology within financial services, in particular banking, insurance and funds management.