Jan 08 2018

Book reviews: Christmas Extravaganza!

I’ve given up writing my full length posts for my Christmas book haul which, as I show below, was rather fine. This is in part a result of the type of books one gets for Christmas, and in part the conditions under which they are read – in a Christmas cake induced haze, for my part.

A Philosophy of Walking by Frédéric Gros

A Philosophy of Walking by Frédéric Gros was, like the best presents, something I wouldn’t have got for myself but nevertheless enjoyed.

The book interleaves chapters on various walking related thoughts with some walking oriented biographical content. This covers Nietzsche, Rimbaud, Rousseau, Nerval, Kant and Gandhi. The predominant feeling from these biographies is a bit grim: several of the protagonists died young or after prolonged illness, and Nerval committed suicide. Their walking feels compulsive. Gandhi lived to a ripe old age but was ultimately assassinated. Kant took the same walk every day; he lived to 80 but it sounds pretty dull!

The chapter on pilgrimage struck a chord with me. I’ve been meditating for a while, which often involves focusing on a mantra or physical manifestation, like breathing. Some pilgrims take a similar approach, combining walking with a prayer-like mantra.

Somehow the author has missed our family favourite walking habit – humming the Imperial March from Star Wars as a rhythm to walk to along broad well-made paths in the Lake District.

The book is translated from French. On reading a footnote I learned that the French word témoin, which I knew meant “witness”, also refers to the baton in a relay race.

New Views by Alastair Bonnett

New Views by Alastair Bonnett is a different manner of Christmas book, a coffee table book – as are the rest of the books in this post.

New Views is a collection of world maps illustrating different data in three broad areas which could be described as physical, human and animal, and trade. The pattern is the same in each case – a double page contains a map with key, and on the following double page is some text describing the context of the map and another, different graphic. The maps are very much on the global scale; cities may be mentioned here and there but the overwhelming impression is of the world as a whole, not individual countries.

I liked the map of lightning strikes which highlights odd areas, particularly in the east of the Democratic Republic of Congo which has the highest rate of lightning strikes in the world. The maps of amphibian and bird diversity are fun too – they map out features of the underlying geography like rivers and mountains but in different ways.

I was surprised to learn just how big an exporter of nuts the US was. I should have known this, since a constant in my last job was the monthly scrape of the reports of the Almond Board of California for a customer. Also I learned that Brazil exports no Brazil nuts because they don’t grow there!

Sometimes the colour keys are a bit cryptic, that’s to say I couldn’t distinguish between two categories on the scale. On another map countries where there is no data are omitted completely, which makes the map difficult to parse unless you have a photographic recall of the shapes of the countries of the world. I was puzzled to learn that the viper was the only poisonous snake in the United Kingdom – I always called them “adders”.

This is a creditable work of this genre.

Bird by Andrew Zuckerman

Bird by Andrew Zuckerman is an immense book, composed entirely of photographs of birds shown against a pure white background. There are a few words, and a pictorial index which names the birds, at the back but the main body of the book is completely wordless.

The pictures are gorgeous but I found myself wanting more having flicked through to the end of the book. The style is intentional and is contrasted with that of Audubon who included much more context in his famous paintings of birds. Technically the photographs are very good to excellent.

Zuckerman has produced a number of books in this style, I’m most interested to see his works on flowers and creatures.

Vermeer: The Complete Works by Karl Schütz

Vermeer: The Complete Works by Karl Schütz. I was surprised to read that there are only 35 works attributed to Vermeer. This may be because he fell out of popularity after his death in the 17th century and interest was not revived until the 19th century.

Canaletto & the art of Venice by Rosie Razzall and Lucy Whitaker

Canaletto & the art of Venice by Rosie Razzall and Lucy Whitaker. The authors’ names are very discreetly displayed on this volume. I’m a fan of Canaletto – I love the almost CAD-like precision of his architectural paintings.

Dec 28 2017

Book review: Fraud analytics by B. Baesens, V. Van Vlasselaer and W. Verbeke

This next book is rather work oriented: Fraud Analytics using descriptive, predictive and social network techniques: A guide to data science for fraud detection by Bart Baesens, Veronique van Vlasselaer and Wouter Verbeke.

Fraud analytics starts with an introductory chapter on the scale of the fraud problem, and some examples of types of fraud. It also provides an overview of the chapters that are to come. In the UK fraud losses stand at about £73 billion per annum, typically fraud losses are anything up to 5%. There are many types of fraud: credit card fraud, insurance fraud, healthcare fraud, click fraud, identity theft and so forth.

There then follows a chapter on data preparation, sampling and preprocessing. This includes some domain related elements such as the importance of the so-called RFM attributes: Recency, Frequency, and Monetary which are the core variables for financial transactions. Also covered are missing values and data quality which are more general issues in statistics.
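As a rough illustration of the RFM attributes, here is a minimal sketch which is not taken from the book. It assumes the usual reading of the three terms: days since a customer’s last transaction, the number of transactions, and the total amount spent. The function name `rfm`, the tuple layout and the sample data are all hypothetical.

```python
from datetime import date

def rfm(transactions, today):
    """Summarise (customer, date, amount) tuples per customer.

    Returns {customer: (recency_days, frequency, monetary)}.
    """
    summary = {}
    for customer, when, amount in transactions:
        last, count, total = summary.get(customer, (None, 0, 0.0))
        if last is None or when > last:
            last = when  # keep the most recent transaction date
        summary[customer] = (last, count + 1, total + amount)
    return {
        customer: ((today - last).days, count, total)
        for customer, (last, count, total) in summary.items()
    }

# Hypothetical transactions, purely for illustration.
transactions = [
    ("c1", date(2017, 12, 1), 30.0),
    ("c1", date(2017, 12, 20), 20.0),
    ("c2", date(2017, 11, 1), 5.0),
]
attributes = rfm(transactions, today=date(2017, 12, 28))
```

In practice these three numbers would be just some of the input columns for the techniques described in the later chapters.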

The core of the book is three long chapters on descriptive statistics, predictive analysis and social networks.

Descriptive statistics concerns classical statistical techniques, from the detection of outliers using the z-score (the number of standard deviations a value lies from the mean) through to clustering techniques such as k-means. These clustering techniques fall into the category of unsupervised machine learning. The idea here is that fraudulent transactions are different to non-fraudulent ones. This may be a temporal separation (i.e. a change in customer behaviour may indicate that their account has been compromised and used nefariously) or it might be a snapshot across a population where fraudulent actors have different behaviour than non-fraudulent ones. Clustering techniques and outlier detection seek to identify these “different” transactions, usually for further investigation – that’s to say automated methods are used as a support for human investigators, not a replacement. This means that ranking transactions for potential fraud is key. Obviously fraudsters are continually adapting their behaviour to avoid standing out, and so fraud analytics is an arms-race.
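To make the outlier idea concrete, here is a minimal sketch (not code from the book) which scores each transaction amount by its z-score and ranks the most unusual first, ready for a human investigator. The function names and the amounts are hypothetical, and a real system would use far more than one variable.

```python
from statistics import mean, pstdev

def z_scores(values):
    """Return (value - mean) / standard deviation for each value."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        return [0.0 for _ in values]  # all values identical: nothing stands out
    return [(v - mu) / sigma for v in values]

def rank_by_abs_z(values):
    """Return (index, value, z) tuples, most unusual first."""
    scored = [(i, v, z) for i, (v, z) in enumerate(zip(values, z_scores(values)))]
    return sorted(scored, key=lambda row: abs(row[2]), reverse=True)

# Hypothetical transaction amounts; the last one is deliberately odd.
amounts = [20.0, 25.0, 22.0, 19.0, 24.0, 500.0]
ranked = rank_by_abs_z(amounts)
```

Note that a single extreme value inflates the standard deviation it is measured against, which is one reason the z-score is only a starting point.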

Predictive analysis is more along the lines of regression, classification and machine learning. The idea here is to develop rules for detecting fraud from training sets containing example transactions which are known to be fraudulent or non-fraudulent. Whilst not providing an in depth implementation guide, Fraud Analytics gives a very good survey of the area. It discusses different machine learning algorithms, including their strengths and weaknesses, particularly with regard to model “understandability”. Also covered are a wide range of model evaluation methods, and the importance of an appropriate training set. A particular issue here is that fraud is relatively uncommon, so care needs to be taken in sampling training sets such that algorithms have a chance to identify fraud. These are perennial issues in machine learning and it is good to see them summarised here.
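One common way of handling the rarity of fraud in a training set is to undersample the non-fraudulent transactions. The sketch below shows that idea only; it is an assumption that this matches anything the book recommends, and the `is_fraud` field, the `ratio` parameter and the data are hypothetical.

```python
import random

def undersample(rows, label_key="is_fraud", ratio=1.0, seed=0):
    """Keep every fraud row plus a random subset of non-fraud rows.

    ratio is the number of non-fraud rows to keep per fraud row.
    """
    rng = random.Random(seed)  # fixed seed so the sample is repeatable
    fraud = [r for r in rows if r[label_key]]
    legit = [r for r in rows if not r[label_key]]
    n_keep = min(len(legit), int(round(ratio * len(fraud))))
    sample = fraud + rng.sample(legit, n_keep)
    rng.shuffle(sample)
    return sample

# Hypothetical data: 1,000 transactions of which 1% are fraudulent.
transactions = [{"id": i, "is_fraud": i % 100 == 0} for i in range(1000)]
balanced = undersample(transactions, ratio=2.0)
```

The evaluation set should still reflect the true fraud rate, otherwise the measured performance will be flattering.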

The chapter on social networks clearly presents an active area of research in fraud analytics. It is worth highlighting here that the term “social” is meant very broadly; it is only marginally about social networks like Twitter and Facebook. It is much more about networks of entities such as the claimant, the loss adjustor, the law enforcement official and the garage carrying out repairs. Also relevant are networks of companies and their directors, set up to commit corporate frauds. Network (aka graph) theory is the appropriate, efficient way to handle such systems. In this chapter, network analytic ideas such as “betweenness” and “centrality” are combined with machine learning involving non-network features.
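As an illustration of a network measure, the sketch below computes betweenness centrality (how many shortest paths between other nodes pass through a node) for a tiny, entirely hypothetical claims network. It uses Brandes’ algorithm for unweighted, undirected graphs; this is a choice made for the sketch and not something stated in the book.

```python
from collections import deque

def betweenness(graph):
    """Unnormalised betweenness for {node: set_of_neighbours} (undirected)."""
    score = {v: 0.0 for v in graph}
    for s in graph:
        # Breadth-first search from s, counting shortest paths to each node.
        order = []
        preds = {v: [] for v in graph}
        n_paths = {v: 0 for v in graph}
        dist = {v: -1 for v in graph}
        n_paths[s] = 1
        dist[s] = 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if dist[w] < 0:  # first time w is reached
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:  # v lies on a shortest path to w
                    n_paths[w] += n_paths[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, accumulating dependencies.
        dep = {v: 0.0 for v in graph}
        for w in reversed(order):
            for v in preds[w]:
                dep[v] += n_paths[v] / n_paths[w] * (1 + dep[w])
            if w != s:
                score[w] += dep[w]
    # Each undirected path has been counted once from each end.
    return {v: c / 2 for v, c in score.items()}

# Hypothetical network: three claimants share a garage, which shares an
# adjuster with a fourth claimant.
edges = [
    ("claimant_A", "garage"), ("claimant_B", "garage"),
    ("claimant_C", "garage"), ("garage", "adjuster"),
    ("adjuster", "claimant_D"),
]
graph = {}
for a, b in edges:
    graph.setdefault(a, set()).add(b)
    graph.setdefault(b, set()).add(a)
scores = betweenness(graph)
```

Here the garage scores highest because every path between the first three claimants and the rest of the network runs through it. A score like this could then sit alongside non-network features in a model.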

The book finishes with chapters on fraud analytics in operation, and a wider view. How do you use these models in production? When do you update them? How do you update them? The wider view includes some discussion of data anonymisation prior to handing it over to data scientists. This is an important area: data protection regulations across the EU are tightening up, and breaches of personal data can have serious consequences for the companies involved. Anonymisation may also provide some protection against producing biased models, i.e. those that discriminate unfairly against people on the basis of race, gender and economic circumstances, although this area should attract more active concern.

A topic not covered but mentioned a couple of times is natural language processing, for example analysing the text of claims against insurance policies.

It is best to think of this book as a guide to various topics in statistics and data science as applied to the analysis of fraud. The coverage is more in the line of an overview, rather than an in depth implementation guide. It is pitched at the level of the practitioner rather than the non-expert manager. Aside from some comments at the end on label-based security access control (relating to SQL) and some screenshots from SAS products it is technology agnostic.

Occasionally the English in this book slips from being fully idiomatic; it is still fully comprehensible – it simply reads a little oddly. Not a fun read but an essential starter if you’re interested in fraud and data science.

Dec 26 2017

Review of the year: 2017

As I finish work for the year, and we await Christmas Day, it is time for me to start writing my “Review of the year”. This is a somewhat partial view of the world, as seen through the pages of my blog which these days is almost entirely book reviews, you can see a list of my blog posts for the year here. My Goodreads account tells me I have read 32 books this year.

Linked to reading, I wrote a post on Women Writers – I’ve been making an effort to read more books written by women over the last couple of years. This has worked out really well for my fiction reading, where I’ve found some new sci-fi authors to enjoy, and some, like Ursula Le Guin, who have been around a while. Le Guin’s The Left Hand of Darkness is certainly in contention for my favourite novel ever. On non-fiction I’ve not had as much success – a chunk of my non-fiction reading is in technology and the number of women published in this area is tiny. I found the acknowledgements section of books by men a useful place to find women to follow on Twitter.

This year I read Pandora’s Breeches by Patricia Fara – about women in science from about 1600 to 1850. I also read Hidden Figures by Margot Lee Shetterly, about the African-American women who worked as “human computers” for the organisation which was to become NASA. I think this told me more about being an African-American than being a woman. I hadn’t appreciated previously the sheer effort and determination required for African-Americans to progress; changing the laws to end legally-sanctioned discrimination was simply the first step (resisted at every turn by white supremacists).

I read some fairly academic history of science too, Inventing Temperature and Leviathan and the air-pump. Inventing Temperature is about the history of the measurement of temperature. Temperature is important to most physical scientists in one way or another, perhaps more so for ones like I once was. This book covers the less-told history, and re-surfaces some of the assumptions that these days are no longer taught or certainly don’t stick in the mind.  Leviathan and the air-pump is about the foundation of the experimental method as it is (roughly) seen today. I liked these two books because they didn’t follow the “great man” narrative which is what you get from reading scientific biographies – a much more common genre in the wider history of science.

I also read a few books on the history of Chester, following on from reading about Roman Chester last year. Two things struck me in this. One was the image of post-Roman Britons living in the ruins of the Roman occupation. Evidence from this period immediately following the Roman occupation is thin: in Chester it amounts to a thin dark layer of material in the Roman barracks which could well be pigeon droppings! The second stand-out was the fact that Chester’s mint/money making operation was bigger than London’s in the 9th century. I was also interested in the “Pentice”, a curious timber structure attached to St Peter’s church by the cross in the centre of town that appears to have been Chester’s administrative centre since the medieval period (it was demolished in the early 19th century).

In news outside the world of books, we had an election in the UK, the result was a bit of a surprise but we can probably agree we are not in a great position now politically with a weak government steadfastly refusing to even countenance ending the Brexit process and an official “opposition” in the Labour Party supporting them in this.

Surprise hit of the year was the ARK exhibition of sculpture at Chester Cathedral. I wouldn’t describe myself as a connoisseur of art, particularly not sculpture but I loved this exhibition. The exhibits were scattered through the cathedral and its grounds. A life-sized ceramic horse, and three very large egg-shaped objects making a very public sign of what lay within. It turns out that sculpture works really well in an old cathedral, there are so many shapes and textures to pick up on. This picture encapsulates it for me:

On the technology front I read about Scala. I also wrote a post about setting up my work PC to use Scala, which requires a bit of wrangling. I read about behaviour driven testing, the potential downsides of data science from a social point of view, and game theory.

A final mention goes to Ed Yong’s “I contain multitudes”, one of the first books I read this year, which is all about the interaction between microbes and the hosts they live with – including you and me. Possibly this is my favourite book of the year, but looking down the list I don’t think there was any book I regretted reading and a fair few of them were thoroughly excellent.

No holiday post this year, we were back in Portinscale, on the outskirts of Keswick again – notable achievement: getting Thomas (5) up several peaks – starting with Cat Bells! Embarrassment prevents me from writing much about my Pokemon Go obsession; in my defence I will say that it is educational for Thomas and encourages him to walk places!

Dec 06 2017

Book review: Leviathan and the air-pump by Steven Shapin & Simon Schaffer

Leviathan and the air-pump by Steven Shapin & Simon Schaffer has been recommended to me by a number of people. The book discusses the dispute between Thomas Hobbes, author of Leviathan, published in 1651, and Robert Boyle, who published his first scientific works involving the air-pump in 1660. It is about the foundation of the scientific experimental method.

Leviathan and the air-pump was first published in 1985; I read the 2011 second edition, which has a lengthy introduction discussing reactions to the first edition of the book.

The aim of the book is to use this quite narrow case study to learn more about the rise of the “experiment” as a central activity in the way science is done. The book also explores a different way of doing the history of science, certainly when it was originally published in 1985.

I feel I am falling amongst philosophers and sociologists in reading this book, the ideas of Wittgenstein on “language-games” and “forms of life” are familiar to Mrs SomeBeans in her study for a doctorate in education.

Leviathan and the air-pump focuses on two of Boyle’s experiments in particular: his recreation of Torricelli’s experiment, which sees what we now know to be a partial vacuum form above mercury in an upturned, closed cylinder, and an experiment on the adhesion of smooth surfaces in a vacuum. The word “vacuum” turns out to be pivotal in the dispute with Hobbes. Hobbes held the philosophical view that there could be no such thing as a vacuum, whilst Boyle held a more mechanistic view that he did a thing which produced a space devoid of air (or much reduced in it) which he would call a “vacuum”. The book could do with a little more explanation of the modern view of these experiments. The adhesion of smooth surfaces experiment, in particular, I believe is probing a different phenomenon to that which Boyle believed.

Shapin and Schaffer’s account of Boyle’s work covers both the mechanics of the experiments and the role of such experiments in generating “matters of fact”. This rests on three pillars: doing the experiments in public, a goal of replication and an experimental write up, along the lines of the modern form. The air-pump was a relatively early scientific instrument which allows some dissociation between the experimenter and the audience. Criticism of the device is not criticism of the experimenter.

Hobbes attacked Boyle on various fronts; fundamentally he did not hold with experimentation as a route to discovering the underlying causes of things. That role fell to philosophising and pure, rational, thought. Geometry was Hobbes’ model for that manner of discovery. Shapin & Schaffer discuss, briefly, other critics of Boyle. Franciscus Linus gets a somewhat patronising treatment: he is in favour of experimentation and actually does some himself but Boyle is not impressed. Henry More believes in experiments, but only to demonstrate the need for God in explaining the world.

Hobbes and the Royal Society, of which Boyle was a key figure, bore the scars of the recent English Civil War, they were desperate for peace but they sought it in different ways. The Royal Society were collegiate and sought discussion followed by agreement over matters of fact. Hobbes, on the other hand, wanted peace by authority – there was a correct answer and that should be accepted through authority. Boyle and the Royal Society wanted to demonstrate that the experimental method that they were developing allowed the generation of beneficial knowledge without rancour. I wonder whether reports of the extreme disputatiousness of Isaac Newton are a continuation of the Hobbes/Boyle argument.

It is easy to believe that this discussion between Boyle and Hobbes is long in the past but visit a physics department and see the interaction between experimental and theoretical physicists. There is a strong whiff of the Hobbesian about some theoretical physicists. Some theories pass because they are considered too beautiful to be wrong, deviations between theory and experiment are sometimes seen as a problem with the experiment (that’s not to say the experiments are perfect!). Experimentalists are seen, to a degree, as crude mechanicals.

Replication, discussed in this book, is a still-present issue. In the early years of the air-pump replication was only achieved, principally by Huygens, by those that had visited London and seen the original in action. No-one replicated the air-pump based solely on written reports. This is, to a degree, still true today. A secondary issue here is that the rewards of replication are minimal, particularly in the biological sciences where so-called p-hacking means that any experiment can produce a “significant” result that won’t be replicable.

I enjoyed Leviathan and the air-pump, for me as a modern scientist, the detail of the dispute is fascinating. I can see the book being somewhat controversial amongst historians of science since it likely gives Hobbes more of a hearing, and more impact than previously. It also gives the political climate of the time a leading role in the creation of the experimental method, and by its narrow focus makes Boyle feel like the “inventor” of the modern experimental method. Overall, the book is pretty readable although it stretched my vocabulary in places – I found the preface to the second edition less readable than the original book.

Oct 28 2017

Book review: The Art of Strategy by Avinash K. Dixit and Barry J. Nalebuff

Next up, some work related reading. The Art of Strategy: A Game Theorist’s Guide to Success in Business and Life by Avinash K. Dixit and Barry J. Nalebuff.

The Art of Strategy is about game theory, a branch of economics / mathematics which considers such things as the “ultimatum game”, where one player chooses how to split $100 (e.g. keeping $60 and giving away $40) and a second player decides to accept or reject the split. In the latter case neither of them gets any money; in the former case they get the offered split.

In the “prisoner’s dilemma” two prisoners are each offered the opportunity to give evidence against the other. If one of them does this, and the other doesn’t, then they will be set free, whilst their fellow prisoner serves a sentence. If both betray the other then they will both serve a longer sentence than if they had both kept quiet.

These examples represent the two main types of game in their simplest form: the ultimatum game is an example of a sequential game (where one player makes a decision followed by the other) whilst the prisoner’s dilemma is an example of a simultaneous game (where players make their decisions simultaneously). In real life, chess is an example of a sequential game and a sealed bid auction is a simultaneous game. Games are rarely played as a single instance: simultaneous games may be repeated (“the best out of 3”), and sequential games may involve many moves. This repetition enables the development of strategies such as “tit for tat” and punishment.

The ultimatum game and the prisoner’s dilemma provide a test bed for game theory, normally illustrating that real humans don’t act as the rational agents that economics intended! For example, in the ultimatum game players really should accept any non-zero offer since the alternative is getting nothing; in practice players will reject offers even as high as $10 or $20 as unjust.
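The $100 split game (usually called the ultimatum game) is simple enough to write down directly. In this minimal sketch the responder is reduced to a single hypothetical number, `min_acceptable`, the smallest offer they will accept. A “rational” responder has a threshold of $1, whilst the $30 threshold is an invented stand-in for a sense of fairness.

```python
def play_ultimatum(offer, min_acceptable, pot=100):
    """Return (proposer_payoff, responder_payoff) for one round."""
    if not 0 <= offer <= pot:
        raise ValueError("offer must be between 0 and the pot")
    if offer >= min_acceptable:
        return pot - offer, offer  # split accepted
    return 0, 0  # split rejected: nobody gets anything

# A responder who accepts any non-zero offer.
rational = play_ultimatum(offer=1, min_acceptable=1)

# A responder who rejects anything under $30 as unjust (hypothetical threshold).
fair_minded = play_ultimatum(offer=20, min_acceptable=30)
```

The second responder gives up $20 to punish the proposer, which is exactly the behaviour the purely rational model does not predict.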

Sequential games are modelled using “game trees”, which are like “decision trees”. Simultaneous games are modelled with payoff tables. The complexity of real sequential games, such as chess, means we cannot inspect all possible paths in the game tree, even with high power computing.

The first part of The Art finishes with some strategies for simultaneous games. These are to look for dominant strategies where they are available, i.e. those which are the best strategy regardless of what the other players do. If this isn’t possible, eliminate dominated strategies, i.e. those which always do worse than another of your own strategies whatever your opponent does. Nash equilibria are those combinations of moves where no player could do better by changing their own move, even given knowledge of their opponent’s move. There can be multiple Nash equilibria in a game, which means if strategies are not explicitly stated then the players must guess which strategy the other player is using and act accordingly. This section also covers how social context influences play, and ideas of “punishment”.
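These ideas can be checked mechanically on a payoff table. The sketch below uses the prisoner’s dilemma with invented sentences (payoffs are years in prison, negated so that bigger is better) and looks for a strictly dominant move and for pure-strategy Nash equilibria. The numbers and function names are assumptions for illustration, not taken from the book.

```python
from itertools import product

MOVES = ("quiet", "betray")

# PAYOFFS[(row_move, col_move)] = (row_payoff, col_payoff)
# Hypothetical sentences in years, negated so that higher is better.
PAYOFFS = {
    ("quiet", "quiet"): (-1, -1),
    ("quiet", "betray"): (-10, 0),
    ("betray", "quiet"): (0, -10),
    ("betray", "betray"): (-5, -5),
}

def dominant_move(payoffs, moves):
    """Row player's strictly dominant move, or None if there isn't one."""
    for candidate in moves:
        others = [m for m in moves if m != candidate]
        if all(
            payoffs[(candidate, col)][0] > payoffs[(alt, col)][0]
            for alt in others
            for col in moves
        ):
            return candidate
    return None

def pure_nash_equilibria(payoffs, moves):
    """Move pairs where neither player gains by changing only their own move."""
    equilibria = []
    for row, col in product(moves, repeat=2):
        row_pay, col_pay = payoffs[(row, col)]
        row_ok = all(payoffs[(alt, col)][0] <= row_pay for alt in moves)
        col_ok = all(payoffs[(row, alt)][1] <= col_pay for alt in moves)
        if row_ok and col_ok:
            equilibria.append((row, col))
    return equilibria

best = dominant_move(PAYOFFS, MOVES)
equilibria = pure_nash_equilibria(PAYOFFS, MOVES)
```

With these numbers betraying is dominant and mutual betrayal is the only equilibrium, even though both prisoners would be better off keeping quiet.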

The second part of the book looks at how the strategies described in the first part are used in action, although these examples are sometimes somewhat hypothetical. This part also introduces randomness (called “mixed strategies”) as a component of strategies.

The final part of the book covers applications of game theory in the real world, including auctions, bargaining and voting. I was interested to learn of the several sorts of auction: the English, Dutch, Japanese and Vickrey. The English auction is perhaps the one we are the most familiar with: participants signal when they wish to make a bid, and the bid rises with time. The Japanese auction is similar in that the bid is always rising but in this case all bidders start in the auction with their hands raised (indicating they are bidding) and put their hands down when the price is too high. A Dutch auction is one in which the price starts high, and drops; the winner is the one who first makes a bid. Finally, a Vickrey auction is a sealed-bid auction where the winner is the one who makes the highest bid, but they pay the second highest value.
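The Vickrey rule is the easiest of these to express in code. A minimal sketch, with hypothetical bidders and amounts; the tie-break (the earliest of equal top bids wins) is an assumption of the sketch.

```python
def vickrey_auction(bids):
    """bids is {bidder: amount}. Return (winner, price_paid).

    The highest bidder wins but pays the second highest bid.
    """
    if len(bids) < 2:
        raise ValueError("need at least two bids")
    # sorted() is stable, so equal bids keep their original order.
    ranked = sorted(bids.items(), key=lambda item: item[1], reverse=True)
    winner = ranked[0][0]
    price = ranked[1][1]
    return winner, price

# Hypothetical sealed bids.
result = vickrey_auction({"alice": 120, "bob": 100, "carol": 90})
```

Because the price paid does not depend on the winner’s own bid, there is nothing to gain from bidding below what the item is worth to you.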

Auctions are big money, the UK 3G spectrum auction in 2000 raised £22.5 billion from the participants. It’s worth spending some money to get the very best game theorists to help if you are participating in such an auction. The section on bargaining is relevant in the UK at the moment given the Brexit negotiations, particularly the idea of the Best Alternative to a Negotiated Agreement (BATNA). Players must determine their pay off relative to the BATNA, and must convince their opponents that the BATNA is as good as possible.  

I found the brief descriptions of concrete applications of game theory, such as in the various “spectrum” auctions for mobile phone systems and the formation of price fixing cartels, the most compelling part of the book.

Game theory is a central topic in at least parts of economics, as witnessed by the award of the pseudo-Nobel Prize for Economics in this area – there is a handy list here (http://lcm.csa.iisc.ernet.in/gametheory/nobel.html), if you are interested.

The Art of Strategy has some overlap with books I have read previously, the decision tree/game trees have some relevance to Risk Assessment and Decision Analysis with Bayesian Networks by Fenton and Neil (which uses the Monty Hall problem as an illustration). The Undercover Economist by Tim Harford discusses game theory and its relevance to the mobile frequency auctions in the UK, as well as the example of information in buying second hand cars. The Signal and the Noise by Nate Silver has some discussion of gaming statistics.
