Theo Bass – DECODE
The internet runs on personal data. It is the price we pay for apparently free services and for seamless integration. That’s a bargain most have been willing to make – or at least one which we feel we have no choice but to consent to. But the consequences of personal data powering the internet reverberate ever more widely, and much of the value has been captured by a small number of large companies.
That doesn’t just have the effect of making Google and Facebook very rich, it means that other potential approaches to managing – and getting value from – personal data are made harder, or even impossible. This post explores some of the challenges and opportunities that creates – and, perhaps more importantly, serves as an introduction to a much longer document – Me, my data and I: The future of the personal data economy – which does an excellent job both of surveying the current landscape and of telling a story about how the world might look in 2035 if ideas about decentralisation and personal control were to take hold – and what it might take to get there.
Dirk Helbing, Bruno S. Frey, Gerd Gigerenzer, Ernst Hafen, Michael Hagner, Yvonne Hofstetter, Jeroen van den Hoven, Roberto V. Zicari, Andrej Zwitter – Scientific American
There is plenty of evidence that data-driven political manipulation is on the increase, with recent coverage of issues ranging from secretly targeted Facebook ads and bulk twitterbots to wholesale data manipulation. As with so much else, what is now possible online is an amplification of political chicanery which long pre-dates the internet – but, as with so much else, the extreme difference of degree becomes a difference of kind. This portmanteau article comes at the question of whether that puts democracy itself under threat from a number of directions, giving it a pretty thorough examination. But there is a slight sense of technological determinism, which leads both to some sensible suggestions about how to ensure continuing personal, social and democratic control – and to some slightly hyperbolic ideas about the threat to jobs and the imminence of super-intelligent machines.
Hila Mehr – Harvard: ASH Center for Democratic Governance and Innovation [pdf]
The first half of this paper is a slightly breathless and primarily US-focused survey of the application of AI to government – concentrating more on the present and near future than on more distant and more speculative developments.
The second half sets out six “strategies” for making it happen, starting with the admirably dry observation that, “For many systemic reasons, government has much room for improvement when it comes to technological advancement, and AI will not solve those problems.” It’s not a bad checklist of things to keep in mind and the paper as a whole is a good straightforward introduction to the subject, but is very much an overview, not a detailed exploration.
It’s a surprisingly common mistake to design things on the assumption that they will always operate in a benign environment. But the real world is messier and more hostile than that. This is an article about why self-driving vehicles will be slower to arrive and more vulnerable once they are here than it might seem from focusing just on the enabling technology.
But it’s also an example of a much more general point. ‘What might an ill-intentioned person do to break this?’ is a question which needs to be asked early enough in designing a product or service that the answers can be addressed in the design itself, rather than, for example, ending up with medical devices which lack any kind of authentication at all. Getting the balance right between making things easy for the right people and as close to impossible as can be managed for the wrong people is never straightforward. But there is no chance of getting it right if the problem isn’t acknowledged in the first place.
And as an aside, there is still the question of whether vehicles are anywhere close to becoming autonomous in the first place.
Simson Garfinkel – MIT Technology Review
This is a long, but fast-moving and very readable, essay on why AI will arrive more slowly and do less than some of its more starry-eyed proponents assert. It’s littered with thought-provoking examples and weaves together a number of themes touched on here before – the inertial power of the installed base, the risk of confusing task completion with intelligence (and still more so general intelligence), the difference between tasks and jobs, and just how long it takes to get from proof of concept to anything close to real world practicality. There are some interesting second order thoughts as well. There is a tendency, for example, to assume that technologies (particularly digital technologies) will keep improving. But though that may well be true over a period, it’s very unlikely to be true indefinitely: in the real world, S-curves are more common than exponential growth.
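That last point is easy to see with a purely illustrative sketch (my own, not from the essay): exponential and logistic (S-curve) trajectories are nearly indistinguishable early on, which is exactly why extrapolating from a period of rapid improvement is so tempting – and so risky.

```python
import math

def exponential_growth(t, rate=0.5):
    """Unbounded exponential growth from a starting value of 1."""
    return math.exp(rate * t)

def logistic_growth(t, rate=0.5, ceiling=100.0):
    """Logistic (S-curve) growth: tracks the exponential at first,
    then flattens out as it approaches the ceiling."""
    return ceiling / (1 + (ceiling - 1) * math.exp(-rate * t))

# Early on the two curves are close; over time they diverge sharply
for t in (0, 4, 20):
    print(t, round(exponential_growth(t), 1), round(logistic_growth(t), 1))
```

At t = 0 both start at 1 and at t = 4 they are still within ten per cent of each other; by t = 20 the exponential has passed 20,000 while the S-curve has flattened out just below its ceiling of 100.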
A straightforward and very useful post which does exactly what the title says – a step by step explanation of what AI can now do, what it might be able to do and what there is currently no prospect of its doing (for economic as well as technical reasons).
It is – or should be – well known that 82.63% of statistics are made up. Apparent precision gives an authority to numbers which is sometimes earned, but sometimes completely spurious. More generally, this short article argues that humans have long experience of detecting verbal nonsense, but are much less adept at spotting nonsense conveyed through numbers – and suggests a few rules of thumb for reducing the risk of being caught out.
Much of the advice offered isn’t specific to the big data of the title – but it does as an aside offer a neat encapsulation of one of the very real risks of processes based on algorithms, that of assuming that conclusions founded on data are objective and neutral, “machines are as fallible as the people who program them—and they can’t be shamed into better behaviour”.
Michelle Nijhuis – the New Yorker
Were the Beatles average? This is Matthew Taylor in good knockabout form on a spectacular failure to use data analysis to understand what takes a song to the top of the charts and, even more bravely, to construct a chart-topping song. The fact that such an endeavour should fail is not surprising (though there are related questions where such analysis has been much more successful, so it’s not taste as such which is beyond the penetration of machines), but it does again raise the question of whether too much attention is being given to what might be possible at the expense of making full use of what is already possible. Or as Taylor puts it, “We are currently too alarmist in thinking about technology but too timid in actually taking it up.”
This is an extract from a new book, The Mathematical Corporation: Where Machine Intelligence and Human Ingenuity Achieve the Impossible (out last month as an ebook, but not available on paper until September). The extract focuses on the ethics of data, with a simple explanation of differential privacy and some equally simple philosophical starting points for thinking about ethical questions.
There is nothing very remarkable in this extract, but it is perhaps worth a look for two reasons. The first is that the book from which it comes has a lot of promise; the second is the trenchant call to arms in its final line: ethical reasoning is about improving strategic decision making.
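For readers unfamiliar with the idea, the core of differential privacy can be sketched in a few lines (my illustration, not the book’s): calibrated random noise is added to a query’s answer so that any one individual’s presence in the data has only a bounded effect on what is published.

```python
import random

def private_count(true_count, epsilon):
    """Answer a counting query with epsilon-differential privacy by
    adding Laplace noise of scale 1/epsilon. A counting query has
    sensitivity 1: adding or removing one person changes it by at most 1."""
    # A Laplace(0, 1/epsilon) sample is the difference of two
    # exponential samples with rate epsilon
    noise = random.expovariate(epsilon) - random.expovariate(epsilon)
    return true_count + noise

# Smaller epsilon means more noise, and so stronger privacy
noisy_answer = private_count(1000, epsilon=0.1)
```

The single parameter epsilon makes the privacy–accuracy trade-off explicit, which is part of why the mechanism is attractive as an ethical as well as a technical tool.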
Josh Sullivan and Angela Zutavern – Sloan Review
This is a short sharp summary of how biases affect AI design and what to do about them, reaching the conclusion that government oversight is essential (though not, of course, sufficient). There are interesting parallels with Google’s in house rules for working on AI, so worth reading the two together.
Joanna Bryson – Adventures in NI
What’s the best way to arrange the nearly 3,000 names on a memorial to the victims of 9/11 to maximise the representation of real world connectedness?
Starting with that arresting example, this intriguing essay argues that collection, computation and representation of data all form part of a system, and that it is easy for things to go wrong when the parts of that system are not well integrated. Focus on algorithms and the distortions they can introduce is important – but so is understanding the weaknesses and limitations of the underlying data and the ways in which the consequences can be misunderstood and misrepresented.
Jer Thorp – Medium
If machine learning is not the same as human learning, and if machine learning can end up encoding the weaknesses of human decision making as much as its strengths, perhaps we need some smarter ways of doing AI. That’s the premise for a new Google initiative on what they are calling human centred machine learning, which seems to involve bringing in more of the insights and approaches of human-centred design together with a more sophisticated understanding of what counts as a well-functioning AI system – including recognising the importance of both Type I and Type II errors.
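As a purely illustrative aside (not part of the Google initiative), for a binary classifier the two error types are just false positive and false negative rates – and a system tuned to minimise one will usually inflate the other:

```python
def error_rates(y_true, y_pred):
    """Return (type_i, type_ii) error rates for binary labels:
    Type I  = false positives / actual negatives,
    Type II = false negatives / actual positives."""
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    negatives = sum(1 for t in y_true if t == 0)
    positives = sum(1 for t in y_true if t == 1)
    return fp / negatives, fn / positives

# Toy data: one false positive and one false negative
type_i, type_ii = error_rates([1, 1, 0, 0], [1, 0, 1, 0])
```

Which error matters more depends entirely on context – a false alarm and a missed diagnosis carry very different costs – which is exactly the kind of judgement a well-functioning system has to make explicit.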
Katharine Schwab – Co.Design
Artificial intelligence is more artificial than we like to think. The idea that computers are like very simple human brains has been dominant pretty much since the dawn of computing. But it is critically important not to be trapped by the metaphors we use: the ways in which general purpose computers are not like human brains are far more significant than the ways in which they are. It follows that machine learning is not like human learning; and we should not mistake the things such a system does as simply a faster and cheaper version of what a human would do.
Robert Epstein – Aeon
Does the power of big data combined with location awareness result in our being supported by butlers or harassed by stalkers? There’s a fine line (or perhaps not such a fine line) between being helpful and being intrusive. Quite where it will be drawn is a function of commercial incentives, consumer responses and legal constraints (not least the new GDPR). In the public sector, the balance of those forces may well be different, but versions of the same factors will be in play. All of that, of course, is ultimately based on how we answer the question of whose data it is in the first place and whether we will switch much more to sharing state information rather than the underlying data.
Nicola Millard – mycustomer
If it’s hard to explain how the outputs of complex systems relate to the inputs in terms of sequential process steps, because of the complexity of the model, then perhaps it makes sense to come at the problem the other way round. Neural networks are very crude representations of human minds, and the way we understand human minds is through cognitive psychology – so what happens if we apply approaches developed to understand the cognitive development of children to understanding black box systems?
That’s both powerful and worrying. Powerful because the approach seems to have some explanatory value and might be the dawn of a new discipline of artificial cognitive psychology. Worrying because if our most powerful neural networks learn and develop in ways which are usefully comparable with the ways humans learn and develop, then they may mirror elements of human frailties as well as our strengths.
Sam Ritter and David G.T. Barrett – DeepMind
This is a neat summary of questions and issues around the explicability of algorithms, in the form of an account of a recent academic conference. The author sums up his own contribution to the debate pithily and slightly alarmingly:
Modern machine learning: We train the wrong models on the wrong data to solve the wrong problems & feed the results into the wrong software
There is a positive conclusion that there is growing recognition of the need to study the social impacts of machine learning – which is clearly essential from a public policy perspective – but with concern expressed that multidisciplinary research in this area lacks a clear home.
Zachary Lipton – Approximately Correct
It’s often said that decisions generated by algorithms are inexplicable or lack transparency. This post asks what that challenge really means, and argues that there is nothing distinctive about decisions made by algorithms which makes them intrinsically less explicable than decisions made by human brains. Of the four meanings Felten considers, the argument from complexity seems the most prevalent and relevant – and where his counter argument is weakest. But this is an important and useful way of clarifying and exploring the problem.
Ed Felten – Freedom to Tinker
‘Digital’ risks becoming ever more shapeless as a word – as it increasingly means everything, it necessarily means nothing. In a post three years ago, Catherine Howe brought some rigour to at least one aspect of the issue, identifying seven tribes of digital. Now she has felt the need to add an eighth – the robot army, reflecting the shift she sees from large-scale automation being an interesting theory to its becoming a practical reality.
Catherine Howe – Curious?
An interesting take on the problem of algorithmic reliability: treat it as an economic problem and apply economic analysis tools to it. Algorithms are adopted because they are expected to create value; a rational algorithm adopter will choose algorithms which maximise value. One dimension of that is that if the consequential cost of errors is low, the value of an improved algorithm will also be low (setting an inconvenient appointment time matters less than making an incorrect entitlement decision). More generally, decision making value is maximised when the marginal value of an extra decision equals its marginal cost.
One consequence of taking an economics-based approach which this article doesn’t cover is the importance of externalities: the decision maker about an algorithm (typically an organisation) may give insufficient weight to the costs and benefits experienced by the subject of a decision (often an individual), so producing a socially sub-optimal outcome.
Juan Mateos-Garcia – NESTA
This article on the biases of algorithmic decision making is notable for two reasons. The first is that it comes from a national newspaper, suggesting that the issue is becoming more visible to less specialised audiences. The second is that it includes a superbly pithy statement of the problem:
Our past dwells within our algorithms.
There is also an eight minute TEDx talk, mostly covering the same ground, but putting a strong emphasis in the final minute on the need for diversity and inclusion in coding and introducing the Algorithmic Justice League, a collective which aims to highlight and address algorithmic bias.
Joy Buolamwini – The Observer