Nov 04 2019

Book review: Deep Learning with Python by François Chollet

Deep Learning with Python by François Chollet is the third book I have reviewed on deep learning neural networks. Despite these reviews spanning only a couple of years, it feels like the area is moving on rapidly. The biggest innovations I see in this book are the use of pre-trained networks, and the dominance of the Keras/TensorFlow/Python ecosystem for doing deep learning.

Deep learning is a type of artificial intelligence based on many-layered neural networks. This is where the “deep” comes in – it refers to the number of layers in the networks. The area has boomed in the last few years with the availability of massive datasets on which to train, improvements in numerical algorithms for training neural networks, and the use of GPUs to further accelerate deep learning. Neural networks have been used in production since the 1990s – by the US postal service for reading handwritten zip codes.

Chollet works on artificial intelligence at Google and is the author of the Keras deep learning library. Google is also the home of TensorFlow, a lower-level library which is often used as a backend to Keras. This is a roundabout way of saying we should expect Chollet to be expert and authoritative in this area.

The book starts with some nice background to machine learning. I liked Chollet’s description of machine learning (deep learning included) being about finding a representation of data which makes the problem at hand trivial to solve. Imagine taking two pieces of coloured paper, placing them one on top of the other and then crumpling them into a ball. Machine learning is the process of un-crumpling the ball.

As an introduction to the field, Deep Learning with Python runs through some examples of deep learning applied to various classes of problem, including movie review sentiment analysis, classifying newswire articles and predicting house prices, before going back to discuss some issues these problems raise. A recurring theme is the problem of overfitting. Deep learning models can learn their training data too well: essentially they memorise the answers to questions, and so when they are faced with questions they have not seen before they perform badly. Overfitting can be addressed with a range of techniques.
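The memorisation point can be illustrated without any deep learning at all. Here is a toy sketch (my own, not from the book) of a “model” that simply stores its training data verbatim – the extreme case of overfitting: perfect recall on examples it has seen, and no generalisation to anything it has not:

```python
def train(examples):
    # examples: list of (input, label) pairs; "training" is just
    # memorising them verbatim in a lookup table
    return dict(examples)

def predict(model, x, default="unknown"):
    # Exact-match lookup only: no generalisation whatsoever
    return model.get(x, default)

model = train([("great film", "positive"), ("terrible film", "negative")])
print(predict(model, "great film"))   # recalls its training data perfectly
print(predict(model, "great movie"))  # a near-identical unseen input: fails
```

A real overfitted network is subtler than this, but the failure mode – strong performance on training data, weak performance on unseen data – is the same.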

One twist I had not seen before is the division of the labelled data used in machine learning into three parts, not two: training, validation and test. The use of training and validation parts is commonplace: the training set is used for training, and the validation set is used to test the quality of a model after training. The third component which Chollet introduces is the “test” set. This is like the validation set, but it is only used when your model is about to go into production, to see how it will perform in real life. The problem it addresses is that machine learning involves a large number of hyperparameters (things like the type of machine learning model, the number of layers in a deep network, or the form of the activation function) which are not changed during training but are changed by the data scientist, quite possibly automatically and systematically. The hyperparameters can be overfitted to the validation set, hence a model can perform well on validation data (which it has seen before) but not on test data, which represents real life.
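The three-way split can be sketched in a few lines of plain Python. The 60/20/20 proportions and the function name here are illustrative assumptions of mine, not from the book:

```python
import random

def three_way_split(data, val_frac=0.2, test_frac=0.2, seed=42):
    """Shuffle labelled data and split it into training, validation
    and test sets. The test set is held back until the model is about
    to go into production, so hyperparameters cannot be tuned to it."""
    data = list(data)
    random.Random(seed).shuffle(data)
    n = len(data)
    n_test = int(n * test_frac)
    n_val = int(n * val_frac)
    test = data[:n_test]                # touched once, at the very end
    val = data[n_test:n_test + n_val]   # used to tune hyperparameters
    train = data[n_test + n_val:]       # used to fit the model
    return train, val, test

train, val, test = three_way_split(range(100))
print(len(train), len(val), len(test))  # 60 20 20
```

In practice libraries provide this (scikit-learn users often just call a two-way splitter twice), but the discipline is the same: the test set is only read once.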

A second round of examples looks at deep learning in computer vision, using convolutional neural networks (convnets). These are related to the classic computer vision processes of convolution and image morphology. Also introduced here are recurrent neural networks (RNNs) for applications in processing sequences such as time series data and language. RNNs carry memory from one step of a sequence to the next, which dense and convolutional networks don’t; this makes them effective for problems where the order of the data is important.

The final round of examples is in generative deep learning including generating text, the DeepDream system, image style transfer and generating images of faces.

The book ends with some thoughts on the future. Chollet comments that he doesn’t like the term neural networks, which implies the ability to reason and abstract in the way that humans do. One of the limitations of deep learning is that, as currently used, it does not have the ability to abstract or generate programmatic descriptions of solutions. You would not use deep learning to launch a rocket – we have detailed knowledge of the physics of rockets, gravity and the atmosphere which makes a physics-based approach far better.

As I read I realised that keeping up with what is new in machine learning is a critical and challenging task. Chollet addresses this directly, suggesting three approaches to keeping abreast of new developments:

  1. Kaggle – the machine learning competition site;
  2. ArXiv – the preprint server, in particular http://www.arxiv-sanity.com/ which is a curated view of the machine learning part of arXiv;
  3. Keras – keeping up with developments in the Keras ecosystem;

If you’re going to read one book on deep learning this should probably be the one: it is readable, covers the field pretty well, and Chollet is an authority in this area who, in my view, has particularly acute insight into deep learning.

Sep 04 2019

Book review: Superior by Angela Saini

Next I turn to Superior: The Return of Race Science by Angela Saini, having recently read Inferior by the same author. Inferior discusses how men of science have been obsessed with finding differences in all manner of human abilities on the basis of gender. Superior does the same for race.

In both cases largely male, white scientists spend inordinate amounts of time and effort trying to demonstrate the superiority of white males. There is a still pervasive view amongst scientists that what they do is objective and somehow beyond the reach of society. However, there is a choice to be made in what is studied which goes beyond the bounds of science. Unlike Inferior, Superior reveals explicit funding and political support for racist ideas which stretch to the present day.

For Saini this is somewhat personal since she is of Indian origin, and considered a “Black member” by the NUJ. This highlights one of the core issues with race. The limited palette of races introduced in the 18th century ignored the huge variations across Africa and India to render the world down to White, Black, Indian and Chinese.

Furthermore, the genetic variations within races are bigger than those between races. Race was a construct invented long before we knew anything about genes, and it was a construct assembled for specific geopolitical purposes. The fundamental problem with race science is that it is literally skin deep: you might as well try to establish the superiority or otherwise of people with brown eyes, or red hair. The variations in genes amongst redheads are as large as those between redheads and blondes.

The account is historical, starting with the first “research” into race, when Britain, France and other countries were building empires by colonisation and the slave trade was burgeoning. It became important to rationalise the mistreatment of people from other countries, and race was the way to do it. White scientists neatly delineated races, and asserted that the white race was at the top of the pile, and thus had the right to take the land of other races, who were not using it correctly, and subjugate them as slaves.

Darwin’s work on evolution in the 19th century gave race science new impetus: white superiority could be explained in terms of survival of the fittest – a natural law. These ideas grew into the science of eugenics, which had the idea of improving human stock through breeding. This wasn’t a science practised in the margins: still-renowned figures at the heart of statistics and biology were eugenicists.

Eugenics increased in importance prior to the Second World War, but the behaviour of Hitler and the Nazis meant it fell out of favour thereafter. This is not to say race science disappeared. In 1961 a number of academics set up the journal Mankind Quarterly, funded by Wickliffe Draper’s Pioneer Fund. This had the appearance of a respectable academic journal but was in fact an echo chamber for what were essentially white supremacists. Similar echo chambers were set up by the tobacco and oil industries for smoking and climate change. They look sufficiently like academic science to fool outsiders, and for politicians to cite them in times of need, but the rest of their parent fields look on in horror. Mankind Quarterly is still published to this day; in fact, within the last couple of years Toby Young was forced to resign as director of the Office for Students having attended meetings at University College London organised by Mankind Quarterly. University College London has a troubled relationship with race science.

This isn’t to say that all race science is maliciously racist. The human genome project led to plans to establish the diversity of the human species by sequencing the DNA of “isolated” groups, which typically meant indigenous people. Those promoting this diversity work were largely liberal and well-meaning, if somewhat paternalistic, but their work was riven by ethical concerns and the natural concerns of the indigenous people they sought to sample.

Touching on origins, Saini observes that once Neanderthal DNA was found in white Western Europeans, the species experienced something of a revival in reputation. Once a byword for, well, being Neanderthal, they are now seen as rather more sophisticated. It turns out that the 10,000-year-old Cheddar Man was surprisingly dark-skinned, certainly to those Britons wishing to maintain that their ancestry was white. The key revelation for me in this section was the realisation that large-scale migration in prehistoric times was an ongoing affair, not a one-off. Waves of people moved around the continents, replacing and merging with their predecessors.

It has been observed that some races are prone to particular medical conditions (at least if they are living in certain countries, which should be a clue), and therefore we seek a genetic explanation for these differences. This approach is backed by the FDA’s approval of a drug combination specifically marketed to African Americans for hypertension. Essentially this was a marketing ploy: African Americans experience significant environmental challenges which are risk factors for hypertension, and hypertension is a complex condition for which there is no simple genetic explanation.

Even for sickle cell anaemia, for which there is a strong genetic basis, using race as a proxy is far from ideal – the rate of the sickle cell anaemia gene varies a great deal across Africa and is also common in Saudi Arabia and India. A former colleague of mine from Northern Italy had the condition.

For a middle-aged, white, Western European male scientist, Superior is salutary reading. As with Inferior, men like me have repeatedly asked “What makes us so special?” – it is long past time to stop.

Aug 08 2019

Book review: Gods and Robots by Adrienne Mayor

This is a review of Gods and Robots: Myths, Machines, and Ancient Dreams of Technology by Adrienne Mayor. The book is about myths, mainly from ancient Greece, and how they relate to modern technology. Material from the Etruscans, and to a lesser extent the Indians, Chinese and Romans, plays a part.

The central characters in the book are Talos, Medea, Daedalus, Hephaestus, Pandora, Prometheus with a small part for Pygmalion.

Talos was an animated bronze statue, made by Hephaestus – the god of the forge, to guard the island of Crete. He appears in a number of stories but most notably in that of Jason and the Argonauts. He is killed by Medea, who promises him eternal life but instead kills him by removing a bolt from his ankle which releases the vital fluid that fills his body. Talos appears in mythology in the 5th century BC. Medea is a witch who offers Pelias eternal life but instead tricks his children into killing him.

Prometheus was one of the mythical Titans, and was responsible for creating humans, and giving them fire – for which he was punished by Zeus. Depictions of his act of creation vary, with earlier representations showing him manufacturing humans in the way one might assemble an automaton, whereas later representations show him animating human figures.

On my first reading Daedalus appeared to be a semi-mythical character, that’s to say a historical figure whose actions were extended in the telling. However reading more widely it seems he was mythical. His actions, designing the labyrinth for King Minos, making wings for himself and Icarus to escape the King and making highly realistic statues are on the boundaries of the possible.

Pandora was constructed by Hephaestus at Zeus’s direction to bring evil to the world. Again she is a constructed being, and representations of her show her as being different from other figures with something of an “uncanny valley” look to her. Pygmalion carved a statue of a beautiful woman which Aphrodite brought to life. I was intrigued to discover that the self-propelled mops and buckets in Disney’s Fantasia have antecedents in Goethe in the late 18th century and Lucian of Samosata in 150AD.

Bronze statuary seems to be a recurring theme in the book; for the period when these myths arose, bronze was seen as an almost magical metal. Artisans had learnt the lost-wax process for casting, which enabled them to create hyper-realistic statues of people and animals. In myth these bronze statues were associated with heat: Talos kills his victims by crushing them in a burning embrace, and Hephaestus’s Colchis bulls were made from bronze and had burning skin.

Another theme of the book is immortality, something that mortals seek to gain from the gods – reflected today in modern life-extension research.

What Greek mythology I know, I believe I learnt as a fairly young child from films like Jason and the Argonauts (cited in this book) and, I think, the book of the Clash of the Titans film, amongst other modern reproductions of these tales. What is missing from this education is an appreciation that the characters in these stories are not singular: in the historical record there are multiple versions of each character stretching back over many hundreds of years.

As someone with no classical training I am very curious as to how the stories in this book reach us. As I understand it, the earliest original writing on paper is something like 1,500 years old, which leaves a thousand or so years back to some of the stories in this book. Mary Beard’s book SPQR touches on this briefly. Gods and Robots does not explicitly address this question; it shows us pictures of ancient vases and talks about works that have been lost but are recorded by later writers, but it does not join the dots.

What do these antecedents of robots in ancient myths mean to us today? Perhaps the striking thing is that the robots we see in cinema, dating from the early 20th century and being realised now, have precursors dating back thousands of years. We have been imagining robots for a very long time; similarly, we have been imagining life extension for a very long time. However, machines for computation and factories do not figure in this book. The Antikythera mechanism makes a brief appearance – it dates from something like 100 BC – but it is more an illustration of how sophisticated Greek artisans had become. In some ways the striking thing about the Antikythera mechanism is its uniqueness: it is the only device of its type from this period, and we don’t even see simpler devices, as far as I am aware.

The book finishes with some instances of mythological devices appearing in real (ancient) life. It has to be said that a number of these are of a gruesome nature, more than one tyrant took pleasure in forcing people to jump from great heights with Daedalus-like wings leading to their death. Phalaris used a bronze bull to roast people alive.

This is a fairly short book. I think the main thing it leaves me with is a desire to know more about how the myths of the ancient world have reached us, rather than the central topic of the book.

Jul 24 2019

Book review: Designing Data-Intensive Applications by Martin Kleppmann

Designing Data-Intensive Applications by Martin Kleppmann does pretty much what it says in the title. The book provides a lot of detail on how various types of databases and database functionality work, and how these can be plumbed together to build applications. It is reminiscent of Seven Databases in Seven Weeks by Eric Redmond and Jim R. Wilson, in the sense that it provides a broad overview of a range of different data systems which are specialised for different applications. It is authoritative and well-written. Seven Databases is more concerned with the specifics of particular NoSQL databases, whilst Designing Data-Intensive Applications is concerned with data applications rather than just the underlying database.

The book is divided into three broad sections covering foundations of data systems, distributed data and derived data. Each chapter starts with a cartoon map of the territory, which I thought would be a bit gimmicky but it serves as a nice summary of what the chapter covers particularly in terms of the software available.

The section on data systems talks about reliability, scalability and maintainability before going on to discuss types of database (i.e. relational, graph and so forth) and some of the low-level implementation of data storage systems such as hash indexes and B-trees.
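The hash-index idea can be sketched very compactly. Below is my own illustrative toy, in the spirit of the append-only, hash-indexed storage engines the book describes, not a production design: writes only ever append to a log, and an in-memory hash table maps each key to the offset of its latest record.

```python
class LogWithHashIndex:
    """Toy storage engine: an append-only log plus an in-memory
    hash index mapping each key to its most recent log offset."""

    def __init__(self):
        self.log = []     # append-only list of (key, value) records
        self.index = {}   # key -> offset of the latest record for that key

    def set(self, key, value):
        self.index[key] = len(self.log)  # remember where we are writing
        self.log.append((key, value))    # never overwrite, only append

    def get(self, key):
        # One index lookup plus one read - no scan of the whole log.
        offset = self.index[key]
        return self.log[offset][1]

db = LogWithHashIndex()
db.set("name", "Ada")
db.set("name", "Grace")   # the old record stays in the log; the index moves on
print(db.get("name"))     # Grace
```

A real engine of this kind also compacts the log to reclaim the space taken by superseded records, which this sketch omits.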

Reliability is about a system returning responses in a timely fashion: Amazon has observed sales drop by 1% for every 100 ms of delay, and others have reported a 16% drop in customer satisfaction with a one-second slowdown. The old academic in me twitches at providing these statistics without citing the reference; however, Designing Data-Intensive Applications is heavily referenced.

There is some interesting historical detail, including the IMS database which IBM built for the Apollo space program in the late 1960s (and which is still available today), and the CODASYL database model for graph-like databases from a little later. It’s interesting to see how some of these models have been revisited recently in light of the advent of fast, large memory in place of slow disk or even tape drives.

I was introduced to databases rather late in my career; they are not really a core part of the scientific computing background I have. Learning the distinction between OLAP (analytics) and OLTP (transactions) databases was useful. Briefly, transactional databases are optimised to work on single rows and provide fairly strong guarantees of transactional integrity. The access pattern for analytics databases is different: typically analytical workflows want to take the contents of an entire column and carry out aggregations and calculations over the whole column. Transactions are not so important on such databases, but consistency is – a query may take a long time to run, but it should provide results as if it ran on the database at a single point in time. These workflows are better serviced by so-called column stores such as Vertica.
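The row-versus-column distinction can be sketched in plain Python (an illustrative toy of mine, not how Vertica or any real store is implemented): the same table stored row-wise suits transactional access, and stored column-wise suits analytical aggregation.

```python
# Row store: each record is kept together - good for OLTP, where you
# read or update one whole row at a time.
rows = [
    {"id": 1, "product": "widget", "price": 10},
    {"id": 2, "product": "gadget", "price": 25},
    {"id": 3, "product": "widget", "price": 12},
]

# Column store: each column is kept together - good for OLAP, where a
# query scans one column across every row.
columns = {
    "id": [1, 2, 3],
    "product": ["widget", "gadget", "widget"],
    "price": [10, 25, 12],
}

# OLTP-style access: fetch a single complete row by key.
row = next(r for r in rows if r["id"] == 2)

# OLAP-style access: aggregate over one whole column without touching
# the others.
total = sum(columns["price"])
print(row["product"], total)  # gadget 47
```

On disk the columnar layout also compresses far better, since values within a column tend to resemble each other.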

The section on distributed data systems covers replication, partitioning, transactions and consensus. The problem with distributed systems is that you can never be sure whether things have failed for good, and it’s difficult to know what order things happened in. This reminds me a bit of teaching special relativity to physics undergraduates long ago.

It is hard to even be able to rely on timekeeping on servers. I found this a bit surprising, when we put our minds to measuring time we can be incredibly accurate. GPS time signals have an accuracy significantly better than microseconds, yet servers synced well using NTP (Network Time Protocol) achieve something like 100 milliseconds – a factor of thousands poorer. And this accuracy is only achieved if everything is configured correctly. This is important because we therefore cannot rely on timestamps to provide a unique order for events across multiple servers, nor can we even rely on timestamps synced with NTP to be always increasing!
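Python exposes both kinds of clock, which makes a small sketch of the point possible (my own illustration, not from the book): the wall clock can be stepped backwards by an NTP correction, while the monotonic clock is guaranteed never to decrease – though it only orders events on a single machine.

```python
import time

# time.time() reads the wall clock: it can jump backwards if NTP
# corrects the system clock, so successive readings may not increase.
wall_a = time.time()
wall_b = time.time()

# time.monotonic() is guaranteed never to go backwards, making it the
# safe choice for measuring elapsed time - but only on one machine;
# it says nothing about ordering events across servers.
mono_a = time.monotonic()
mono_b = time.monotonic()

assert mono_b >= mono_a          # always holds, by specification
print(f"elapsed: {mono_b - mono_a:.9f}s")
```

This is exactly why code that times out requests or orders events should never subtract two wall-clock readings.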

The two big themes in terms of databases are transactions and consensus. These are the concepts that provide the best assurance on the integrity of operations and their success over distributed systems. I used the word “assurance” rather than “guarantee” deliberately because reading Designing Data-Intensive Applications it becomes clear that perfection is hard to achieve and there are always trade-offs. It also highlights the problems of the language used to describe features. Some terms are used in different ways in different contexts.

The derived data section starts with praise for the Unix way of piping data between simple command-line scripts; Data Science at the Command Line covers this area in much more detail. It then goes on to discuss the MapReduce ecosystem and the differences between batch and stream processing. This feels like a section I will be returning to.

The book finishes with some speculation as to the future of the field. Two thoughts stuck with me. The first is the idea of federated databases: systems which use a common query language to interface with multiple different datastores. The second is that of unbundling functionality so that, for example, data may be stored in a standard SQL database for unique-ID-based queries and in Elasticsearch for full-text search queries – in some ways these are simply different facets of the same idea.

Designing Data-Intensive Applications is a big book with no padding; it is packed with detail, including many references, but remains readable. Across a fair number of titles this is definitely one of the better technology books I have read.

Jun 02 2019

Book review: Sprint by Jake Knapp

Sprint: How to Solve Big Problems and Test New Ideas in Just Five Days by Jake Knapp with John Zeratsky and Braden Kowitz is another book in my business-oriented stream of reading.

The sprint is a five-day programme for planning and running a consumer test of a prototype, starting on Monday with the consumer test on Friday. The programme is laid out in huge detail – even lunch times and break times, with suggested menus, are proposed – along with a maximum size for the sprint team of seven. It is something in the spirit of a “sprint” in the Agile sense, but not the same thing.

The book arises from the authors’ experiences at Google Ventures, a venture capital firm, and their work, for the most part, with startups. I suspect this has a bearing on the cited success of the process: startups are typically compact organisations, and at the beginning they really need to get something in front of customers. This looks like a great way of doing that; I can see it being more challenging in a mature organisation. Knapp does provide some examples from more mature organisations as well, and mentions at the end that college lecturers have adapted it for courses.

Knapp sets up the sprint as being in contrast to a conventional brainstorming session, where everyone has an equal say and no idea is too stupid. The drawback of the brainstorming method is that typically a huge number of ideas are generated in the session, many of questionable quality, and then nothing happens afterwards.

Sprint is strong on the idea of a Decider, someone who will make the ultimate decision at points through the programme. The Decider is typically someone like the CEO, but if the CEO can’t be available all week then they can delegate to someone else. The Decider can be influenced by spot-votes of the other participants, but they have the casting vote. Spot-voting is when participants indicate preferences by placing sticky dots on items. The higher-level implication of the Decider is that there is someone committed to the sprint who has the power to make things happen after the sprint has happened.

The five days of the Sprint are as follows:

  • Monday – define the challenge and come up with a target;
  • Tuesday – come up with solutions;
  • Wednesday – plan out the prototype;
  • Thursday – build the prototype;
  • Friday – run the consumer test.

My experience of brainstorming is that typically the challenge / target stage is done elsewhere, and the main action is in the “come up with solutions” stage. In this programme the “come up with solutions” part is more of an individual exercise than a group one.

The prototype is planned out as a storyboard of around 15 frames which represent the screens someone might see on a website or app as they conducted the core task. The key initial frame might be a fake news article linking to the prototype website.

The prototype is typically implemented as a facade: it is a fake of a website or app built largely in Keynote (Apple’s presentation software). Initially I bristled at this, since my special skill is building fairly functional prototypes in short order, but even I would struggle to do that in one working day. Knapp provides a few examples where the prototype is something else: they worked with a health clinic in the US which tried out a family-friendly clinic arrangement in one of their existing clinics, a pump manufacturer who made a sales brochure for a new pump rather than a model pump, and a robotics company who had the majority of a prototype hotel delivery robot already built.

The commitment of time is large: attendees are expected 10am–5pm all five days, actually 9am–5pm on Friday. There is some scope to allow the Decider to make appearances intermittently, and Monday includes an “Ask the experts” session where outsiders can be brought in for 30 minutes or so. I can see that in a larger company it would be hard to carve out the required time. Also, in a larger company it is unlikely you would get a genuine Decider on board; the output of the sprint process would go into competition with other priorities.

The book finishes with a summary of the five-day programme, a shopping list – indicating the exact number of packs of Post-it notes you should provide – and some questions and answers. To a degree I like this, these are my type of people, but I can imagine that for many the level of detail and control will be oppressive.

Sprint is a quick and easy read; it is chatty in style and littered with little stories from sprints Knapp and his team have taken part in. I’m probably not in a position where I’d be able to implement the sprint programme in its entirety, but it provides a lot of food for thought and little ways of changing things.
