In the 1980s Robert Axelrod ran a computer tournament for which people submitted strategies for the Iterated Prisoner's Dilemma. This is a game in which two players repeatedly choose either to cooperate with or to defect against each other. Since then, similar work has been used to understand how cooperative behaviour evolves.
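To make the game concrete, here is a minimal sketch of an iterated match written in plain Python. It does not use the Axelrod library; the payoff values (3, 0, 5, 1) are the conventional ones and are an assumption here, and the names `PAYOFFS`, `tit_for_tat`, `always_defect` and `play_match` are purely illustrative.

```python
# Payoffs as (player A, player B). The values R=3, S=0, T=5, P=1 are the
# conventional choice and are assumed here, not taken from the talk.
PAYOFFS = {
    ("C", "C"): (3, 3),  # both cooperate
    ("C", "D"): (0, 5),  # A cooperates, B defects
    ("D", "C"): (5, 0),  # A defects, B cooperates
    ("D", "D"): (1, 1),  # both defect
}


def tit_for_tat(own_history, opponent_history):
    """Cooperate on the first turn, then copy the opponent's last move."""
    return opponent_history[-1] if opponent_history else "C"


def always_defect(own_history, opponent_history):
    """Defect on every turn, whatever the opponent does."""
    return "D"


def play_match(strategy_a, strategy_b, turns=10):
    """Play a fixed number of turns and return the two total scores."""
    history_a, history_b = [], []
    score_a = score_b = 0
    for _ in range(turns):
        # Each strategy sees its own history first, then the opponent's.
        move_a = strategy_a(history_a, history_b)
        move_b = strategy_b(history_b, history_a)
        payoff_a, payoff_b = PAYOFFS[(move_a, move_b)]
        score_a += payoff_a
        score_b += payoff_b
        history_a.append(move_a)
        history_b.append(move_b)
    return score_a, score_b


if __name__ == "__main__":
    print(play_match(tit_for_tat, always_defect))  # (9, 14)
    print(play_match(tit_for_tat, tit_for_tat))    # (30, 30)
```

In this sketch a strategy is just a function of the two move histories, which is enough to express both memoryless strategies and ones that look far back into the match.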
Like a large amount of early (and sadly ongoing) research code, the code from Robert Axelrod's work was lost. Similar work today is often carried out with poor software sustainability practices. In 2015, a Python library aiming to reproduce Axelrod's work was put on GitHub (the Axelrod library). It has since accumulated more than 200 strategies, with contributions from academics and hobbyists alike.
This vast, open treasure trove of game theoretic tools is now being used in a number of research projects. One of them looks at an evolutionary process called a Moran process: a model of a population in which the makeup of future generations depends on how well individuals of the current generation perform.
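As a rough illustration of the idea, the sketch below runs a simplified Moran process in plain Python. It assumes each type of individual has a fixed fitness value, whereas in the work described here fitness would come from how individuals score against each other in the game; the function name `moran_process` and its parameters are hypothetical and not part of the Axelrod library.

```python
import random


def moran_process(population, fitness, rng):
    """Run birth-death steps until a single type has taken over.

    population: list of type labels, e.g. ["A", "B", "B"]
    fitness: dict mapping each label to a non-negative fitness
             (fixed values are a simplifying assumption of this sketch)
    rng: a random.Random instance, passed in for reproducibility
    """
    population = list(population)
    while len(set(population)) > 1:
        # Birth: pick a parent with probability proportional to fitness.
        weights = [fitness[individual] for individual in population]
        parent = rng.choices(population, weights=weights, k=1)[0]
        # Death: pick an individual uniformly at random to be replaced.
        dying_index = rng.randrange(len(population))
        population[dying_index] = parent
    return population[0]


if __name__ == "__main__":
    rng = random.Random(2015)
    start = ["A"] * 2 + ["B"] * 8
    winner = moran_process(start, {"A": 1.5, "B": 1.0}, rng)
    print("Type that took over:", winner)
```

The population size stays constant throughout, so the only thing that changes is its makeup; repeating the run many times gives an estimate of how often each type takes over.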
In 2012 a piece of research claimed that there was no advantage to having a long memory of interactions. The work this talk will describe demonstrates that this is not true in evolutionary dynamics. Indeed, complex strategies have been trained to perform particularly well, using reinforcement learning and the huge number of strategies available through the Axelrod library.
Interesting behavioural aspects also emerge: without external input, the strong strategies evolve "handshakes". These handshakes allow them to recognise friend or foe in the population and act accordingly.
This work not only has implications for game theory and reinforcement learning but can also help us understand how and why complex behaviour can emerge in evolutionary settings.