-
Research Profile
On 24th October 2009 Christopher D. Clack was awarded the Doctor of Science (ScD) degree from the University of Cambridge - the university’s highest degree, awarded for distinction in the advancement of science, and conferred on scientists "with a proven record of internationally recognised scholarship, including substantial and sustained contributions to scientific knowledge".
This award recognises Clack's contribution to Computer Science in a research career at UCL that started in 1984 with the development of the world's first parallel graph reduction computer system made from stock hardware. Since then, Clack's research has covered four areas:
- Functional Programming;
- Genetic Computation;
- Agent-Based Simulation; and
- Financial Computing.
Distinguishing features of Clack's research have been his ability to engage with scientists of other disciplines, to explore how research developments in Computer Science can benefit science more broadly, and to validate the impact of his research via engagement with industry.
In 2009 Clack secured funding from industry, the Economic and Social Research Council, the Natural Environment Research Council, and the Technology Strategy Board to launch a national Knowledge Transfer Network for financial services.
-
The InterDyne Simulator (2011-)
The InterDyne Simulator is designed to support exploration of interaction dynamics and feedback effects in non-smooth complex systems. It is a general-purpose tool that can be applied to a wide variety of systems, though at UCL its primary use has been to simulate interaction dynamics in the financial markets.
- InterDyne is a discrete-time simulator, and simulations proceed for a stated number of time steps. The mapping of a simulator time step to a period of real time is a semantic issue for the designer of the experiment being run on InterDyne.
- InterDyne is an intrinsically deterministic simulator - a simulation will always behave the same way every time it is run (unless the programmer expressly includes nondeterminism). This determinism greatly assists the understanding of the low-level interactions that cause complex behaviour, i.e. it facilitates determination of the causal pathway of a particular behaviour. Despite being intrinsically deterministic, InterDyne permits two types of non-determinism to be expressed: (i) the programmer may include non-deterministic (or pseudo-non-deterministic) elements in the code for a component; and (ii) where a simulation component receives many data items in one time step from many other components, InterDyne may be instructed to provide those data items either sorted according to the sender's identity or sorted pseudo-randomly. The pseudo-random behaviour will be identical each time the simulation is run; if it is desired to run a simulation multiple times, each time with a different behaviour, then the pseudo-random behaviour can be provided with a different seed on each run (a generic illustration of seeded shuffling follows this list).
- InterDyne interaction is effected via communication between components; InterDyne supports both one-to-one communication and one-to-many communication.
- InterDyne supports the precise definition of communication topology between components, to determine which interactions (communications) are permitted and which are not. This facilitates the design and implementation of simulations; an InterDyne simulation is a directed graph where the nodes are components (such as a trader, a news feed, or an exchange) and the edges are communication links.
- InterDyne supports the specification of a separate information delay for each possible interaction path defined in the communication topology; these delays are applied to both one-to-one communications and one-to-many communications.
- InterDyne permits components to be modelled at differing levels of detail. For example, one component may represent a trading algorithm modelled in great detail including internal data structures, interacting with another component that is modelled as a simple function generating orders according to a required statistical distribution.
- InterDyne simulations are programmed using a functional language - the most recent version uses Haskell. This facilitates rapid development of simulations, and permits such simulations to be viewed as executable specifications.
- InterDyne simulations may be interpreted either as executable specifications or as agent-based simulations.
- The primary output from an InterDyne simulation is a trace file, suitable for further analysis.
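The seeded pseudo-random ordering described above can be illustrated with a generic sketch (this is not InterDyne's API; shuffleWithSeed below is a standalone example using the standard System.Random library):

import System.Random (StdGen, mkStdGen, randomR)

-- Pseudo-randomly permute a list. For a fixed seed the result is the
-- same on every run, so the simulation remains repeatable; changing the
-- seed changes the delivery order.
shuffleWithSeed :: Int -> [a] -> [a]
shuffleWithSeed seed xs0 = go (mkStdGen seed) xs0
  where
    go _ [] = []
    go g xs = z : go g' (ys ++ zs)
      where
        (i, g')    = randomR (0, length xs - 1) g
        (ys, z:zs) = splitAt i xs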
Development of the InterDyne simulator
Development of the InterDyne Simulator began in 2011, initially implemented using the functional language Miranda and then ported to Haskell. Several UCL students and researchers have been involved either in the development of InterDyne, or in using InterDyne to run experiments in Interaction Dynamics.
InterDyne documents
A draft InterDyne User Manual is available; this provides an introduction to the basic features of InterDyne. Other working papers and project dissertations related to InterDyne are also available.
-
InterDyne Development
Development of the InterDyne Simulator began in 2011, initially implemented using the functional language Miranda (link)(link to book).
To improve performance and to take advantage of improved profiling tools, InterDyne has since been ported to the functional language Haskell (link). The Glasgow Haskell Compiler (GHC) generates optimised and efficient native code.
Several UCL students and researchers have been involved either in the development of InterDyne, or in using InterDyne to run experiments in Interaction Dynamics. These have included (with apologies for any omissions):
- Elias Court, who made significant contributions to the Miranda version of InterDyne, and to our understanding of interaction dynamics between HFT market makers
- Richard Everett, who helped explore mechanisms for state-space analysis
- Kyle Liu, who undertook the initial port of InterDyne from Miranda to Haskell
- Dmitrijs Zaparanuks, who helped resolve initial problems with the Haskell port, and contributed greatly to the analysis and understanding of interaction dynamics between HFT market makers
- Justin Moser, who undertook initial experiments to explore the HFT "front-running" claims made by Michael Lewis
- Aman Chopra, who conducted a more detailed exploration of Lewis's HFT "front-running" claims and also helped to resolve some subtle problems with the Haskell port
- Vikram Bakshi, who substantially improved InterDyne's internal infrastructure and contributed new agents and a FIX messaging engine (link)
- Saagar Hemrajani, who has explored the impact of Reg NMS on Lewis's HFT "front-running" claims (link)
- Florian Obst, who has developed a visualisation tool for InterDyne trace files, to assist in detecting emergent behaviour
Prior to the development of InterDyne, other approaches to simulating Interaction Dynamics were explored. InterDyne occupies a "Goldilocks position" that neither models at a level of abstraction that is too high (e.g. probabilistic modelling) nor at a level that is too low (e.g. process modelling). Several student projects contributed to previous approaches, and these have included:
- Timothy Cowlishaw
- Peter Hilton (link to dissertation - draft)
- Gernot Rotzer (link to dissertation - draft)
- Zuhaib Sharif-Butt
- Jakub Kozlowski (link to dissertation - draft)
- Andrei Airimitoaie
- Dragan Jovanovic
-
InterDyne Documents
A draft InterDyne User Manual is available; this provides an introduction to the basic features of InterDyne:
The following working paper illustrates how InterDyne's dynamic interaction model can be used to gain in-depth understanding of financial market instability:
The InterDyne Simulator is not yet properly documented. However, several student projects at UCL have been based on either the development of the underlying technology or the use of the simulator, or both. Some related project dissertations are provided below:
- Link to Elias Court's dissertation (draft) 2013
- Link to Yifei (Kyle) Liu's dissertation (draft) 2014
- Link to Aman Chopra's dissertation 2015
- Link to Vikram Bakshi's dissertation 2015
Vikram Bakshi implemented a fairly comprehensive FIX engine (so that components in a simulation of financial markets can interact via messages that contain information in a way that corresponds fairly closely to reality) and also implemented a component that simulates a subset of the behaviour of the BATS BYX exchange. Note, however, that although an exchange may include a FIX tag within one of its defined input or output messages, it might accept only a subset of the qualifiers defined by FIX. Vikram therefore designed a mechanism by which exchange-specific messages can be defined and can be tested for correctness by the compiler (using the type system). The mechanism for interaction with the BATS BYX exchange component is explained briefly in the following document: (link to document).
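That document is not reproduced here, but the general idea can be sketched as follows (a hypothetical illustration, not the actual types used in InterDyne): the exchange declares its own type for each restricted field, so an order carrying an unsupported qualifier cannot even be constructed, and conversion to the full FIX vocabulary is a total function.

-- The full FIX qualifier set for time-in-force (a representative subset shown)
data FixTimeInForce = Day | GTC | IOC | FOK

-- The subset that this particular exchange accepts
data ByxTimeInForce = ByxDay | ByxIOC

-- Total conversion: every exchange-specific qualifier maps to a FIX
-- qualifier, and e.g. FOK simply cannot appear in a ByxOrder
toFix :: ByxTimeInForce -> FixTimeInForce
toFix ByxDay = Day
toFix ByxIOC = IOC

data ByxOrder = ByxNewOrder
  { byxPrice :: Double
  , byxQty   :: Int
  , byxTif   :: ByxTimeInForce  -- the compiler rejects unsupported qualifiers
  }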
-
InterDyne Modelling
An InterDyne model can be viewed either as an analytically-tractable executable specification or as an agent-based model. Taking the latter approach, the model consists of a number of agents communicating via the passing of messages. By defining in advance which agents may communicate with which other agents, a graph structure is created where the nodes are agents and the edges are permissible one-to-one communication paths. Agents may also send a one-to-many message to a broadcast channel; each agent declares in advance the broadcast channels to which it will listen.
InterDyne is run by executing the function "sim" applied to appropriate arguments. The function sim has type sim :: Int -> [Arg_t] -> [(Agent_t, [Int])] -> IO (), where the first argument is the number of time steps for the simulation, the second argument is a list of (key, value) pairs ("runtime arguments") that are made available to every agent in the system, and the third argument is a list of information about each agent (namely, a two-tuple containing the agent function of type Agent_t and a list of broadcast channel IDs to which the agent will listen). Output is sent to a file. Each agent is uniquely identified by its position in the list of agent information: the first agent has ID=1, the second has ID=2, and so on. ID=0 is reserved for the "simulator harness" function that mediates messaging and controls the passage of time during simulation. The agent identifiers are used to specify the source and destination of all one-to-one messages.
Agents are functions that consume a potentially-infinite list of inbound messages and generate a potentially-infinite list of outbound messages. At each time step an agent must consume one item from the inbound list and must generate one new item on the outbound list. Each item in these lists is itself a list, so that at each time step an agent may receive multiple inbound messages and may generate multiple outbound messages. If an agent does not have any messages to receive or send at a given time step then it will either receive or generate an empty list. Optionally an agent may distinguish between an output item that is empty by mistake and one that is empty by design - it does this by generating an output item that is a list containing only the distinguished empty message called a "Hiaton".
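For illustration, here is a sketch of an agent that is quiet by design at every time step (assuming, as the description above suggests, that Hiaton is a constructor of the message type Msg_t):

-- At each time step this agent consumes one inbound item and emits an
-- item containing only the distinguished empty message, marking the
-- silence as deliberate rather than accidental.
quiet :: Agent_t
quiet st args ((t, msgs, bcasts) : rest) myid
  = [Hiaton] : (quiet st args rest myid)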
-
InterDyne Modelling Example
In early experiments all messages were sent using agent identifiers. It is still possible to do this, but we have since found it useful to refer to agents by name rather than by ID number, and we now recommend including in the list of "runtime arguments" a function that, when executed, will convert an ID into a name or vice versa. This is included in the runtime arguments so that it can be used by any agent and so that the mapping of names to identifiers is under the control of the experimenter. The following example illustrates how the function sim can be called:
import Simulations.RuntimeArgFunctions as RTAFuns

exampleExperiment :: IO ()
exampleExperiment
  = do
      sim 60 myargs (map snd myagents)
  where
    myargs   = [ convert ]
    myagents = [ ("Trader",   (traderWrapper,   [1])),
                 ("Broker",   (brokerWrapper,   [3])),
                 ("Exchange", (exchangeWrapper, [2,3]))
               ]
    convert  = RTAFuns.generateAgentBimapArg myagents

In the above example sim is instructed to run for 60 time steps; there is only one runtime argument, called convert, and the third argument to sim is a list of agents and the broadcast channels on which they will listen. The convert function is a partial application of the library function generateAgentBimapArg (available in InterDyne v0.25 and later) to myagents; the result is a function that will convert a name to an ID and vice versa. The first agent subscribes to broadcast channel 1, the second subscribes to channel 3, and the third subscribes to channels 2 and 3. This example does not illustrate how to define an output file for the results, nor how to use names instead of integers for broadcast channels, nor how to specify the legal communications topology and the delays that should be applied to each communication link. It does however indicate the parsimonious style that can be achieved when using Haskell.
Agents are typically (but not always) written in two parts: (i) a "wrapper" function that manages the consumption of inbound messages, the generation of outbound messages, and the update of local state, and (ii) a "logic" function that is called by the wrapper function and which calculates the messages to be sent. The "wrapper" function is the true agent function, and it must be of type Agent_t.
Here is a simple agent wrapper that does nothing (at each time step it consumes an inbound item, and creates an empty outbound item). It does not call a logic function:
f :: Agent_t
f st args ((t, msgs, bcasts) : rest) myid
  = [] : (f st args rest myid)

The agent function is recursively defined and loops once per time step. It takes four arguments: st is a local state variable (in this example it is never inspected and never changed); args is a copy of the runtime arguments (every agent is passed a copy of all the runtime arguments); the third argument is the list of inbound items, where each item is a 3-tuple containing (i) the current time (an integer), (ii) a list of all one-to-one messages sent to this agent by other agents, and (iii) a list of all broadcast messages available at this time step on all the broadcast channels to which this agent is subscribed; and the last argument, myid, is the ID of this agent (which is decided by the simulator and should never be changed).
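For illustration, here is a sketch of the two-part wrapper/logic style described earlier (the logic function is hypothetical, and the local state is assumed to be an integer used as a message counter):

f :: Agent_t
f st args ((t, msgs, bcasts) : rest) myid
  = out : (f st' args rest myid)
  where
    -- the logic function computes this step's outbound messages and the
    -- updated local state from the inbound messages
    (st', out) = logic st t msgs bcasts

-- hypothetical logic: send nothing, but count inbound one-to-one messages
logic st t msgs bcasts = (st + length msgs, [])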
-
InterDyne Topology & Delays
When considering dynamical interaction between system components, we may wish to (i) define in advance the topology of legal interactions, and (ii) define the interaction delay that should be applied to such legal interactions (noting that delays may be asymmetric - i.e. that the delay from component A to component B may not be the same as the delay from B to A). InterDyne does not require the topology and delays to be defined, but provides support for such definition (from version 0.24 onwards).
To define topology and delays, two runtime arguments must be passed to the "sim" function (both arguments must be present): first, a function that takes two agent IDs (integers that uniquely specify the start point and end point of an interaction) and returns an integer delay in timesteps; and second, the maximum delay in the system. From InterDyne version 0.25 the delay function argument uses the data constructor DelayArg and the identifying string "DelayArg"; for backwards compatibility the maximum delay is a Double, but it should represent a whole number of time steps.
The experimenter has complete freedom to define the delay function, and if the interaction specified by the two agent IDs is illegal then the delay function should raise an error. Here is a very simple example for three agents:
import Simulations.RuntimeArgFunctions as RTAFuns

exampleExperiment :: IO ()
exampleExperiment
  = do
      sim 60 myargs (map snd myagents)
  where
    myargs   = [ convert,
                 (Arg (Str "maxDelay", maxDelay)),
                 (DelayArg (Str "DelayArg", delay)) ]
    myagents = [ ("Trader",   (traderWrapper,   [1])),
                 ("Broker",   (brokerWrapper,   [3])),
                 ("Exchange", (exchangeWrapper, [2,3]))
               ]
    convert  = RTAFuns.generateAgentBimapArg myagents
    delay 1 2 = 1
    delay 1 x = error "illegal interaction"
    delay 2 x = 2
    delay 3 2 = 3
    delay 3 x = error "illegal interaction"
    maxDelay = fromIntegral 3

InterDyne experimenters often find it convenient to drive the delay function from an adjacency matrix (a sketch follows below). After setting the delays as shown above, the experimenter need do no more; one-to-one messages are automatically delayed by the stated number of timesteps, and broadcast messages are split into separate messages for each recipient and delayed by the amount stated for each link.
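As a sketch of that adjacency-matrix style (illustrative only; it reproduces the delays defined above, with a zero entry marking an illegal interaction):

delayMatrix :: [[Int]]
delayMatrix = [ [0, 1, 0]    -- delays from agent 1
              , [2, 2, 2]    -- delays from agent 2
              , [0, 3, 0] ]  -- delays from agent 3

delay :: Int -> Int -> Int
delay from to
  | d == 0    = error "illegal interaction"
  | otherwise = d
  where
    d = delayMatrix !! (from - 1) !! (to - 1)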
-
Financial Computing (2006-)
Clack's research focuses on the application of Computer Science in Finance. For example, Financial Computing research has drawn on Clack's expertise in Genetic Computation (GC), Multi-Agent Simulation (MAS), and Functional Programming:
- to develop new techniques for stock-picking that provide robust performance in volatile environments;
- to develop new techniques for managing multi-objective investment portfolios;
- to improve the performance of financial time series analysis;
- to explore the behaviour of markets; and
- to model and analyse dynamic coupling and feedback loops in financial markets.
-
Dynamic Interaction in Financial Markets
Following the May 6, 2010 Flash Crash there has been increasing interest in the dynamic behaviour of financial markets.
Given that the majority of exchange-based trading is computer-based, often with computers deciding how much to trade at what prices and at which venues, and given the rise of High Frequency Traders both as proprietary traders and as market makers, we are particularly interested in the dynamic behaviour of interacting financial algorithms. This includes, inter alia, market-making algorithms, benchmark algorithms, arbitrage algorithms, momentum algorithms, contrarian algorithms and the matching algorithms used in the exchanges.
Our research is proceeding along the following lines:
- To model dynamic interaction in discrete time. We focus on discrete-time modelling because in reality all interaction between computers is conducted in discrete time (albeit very fast discrete time, as data is latched into and out of the interfaces of digital computers), and because attempting to apply a continuous-time model to a discrete-time system can be problematic (cf Brigo and Mercurio 2000).
- To model (i) detailed algorithms and strategies derived from published academic work; (ii) unpublished algorithms derived from private communication and personal experience; and (iii) representative algorithms that distil certain characteristics of real algorithms.
- To explore the dynamics of simple dynamical systems by detailed specification of the associated discrete-time equations, by tracing the results of those interacting equations, and by analysing various state spaces of those systems. Of particular interest are issues such as information delay and feedback networks.
- To explore the dynamics (including delay and feedback) of complex dynamical systems via simulation. To this end we have developed a Simulator for Interaction Dynamics (our working name for this simulator is "InterDyne"). The simulator is currently used to simulate interaction dynamics in financial markets, but the InterDyne framework is potentially applicable to other domains as well. The InterDyne Simulator supports precise control over communication topology and each communication link may have an associated information delay.
-
InterDyne Messages
InterDyne has a very rich messaging system that supports a wide range of messages from the very simple (for which examples are given below) to the very complex (for example, detailed FIX messages between components of a financial system). The messaging system is of central importance to InterDyne since messages are the medium for interaction between components.
Messages are all of type Msg_t and are either (i) one-to-one messages, or (ii) broadcast messages. All messages include a pair of integers that indicate (i) the IDs of the sending and receiving agents for one-to-one messages, or (ii) the sending agent ID and the receiving broadcast channel ID for broadcast messages. Examples include:
- Message (Int, Int) [Arg_t] - a one-to-one message, sending a list of (key, value) pairs
- Ordermessage (Int, Int) Order - a one-to-one message, sending an order (typically to a component that models an exchange)
- Datamessage (Int, Int) [Char] - a one-to-one message sending a string
- Debugmessage (Int, Int) [Char] - a one-to-one message sending a string (for debugging purposes)
- Broadcastmessage (Int, Int) Broadcast_t - a broadcast message sent to a broadcast channel and containing data whose type is further defined within Broadcast_t
If a message is sent to agent ID 0 it receives special treatment. Agent ID 0 is reserved for the simulator harness, and a message sent to ID 0 is printed to an output file. This is the primary mechanism for defining the output of the simulator. Currently the main output file is the "trace file" that records all messages sent to ID 0, except for those with constructor Datamessage - these messages are instead output to a special file "data.csv", and this provides a mechanism for structured output that can be viewed as a spreadsheet.
Here is a simple agent wrapper that sends a debug message to the output trace file (via the simulator harness) at every time step. It does not call a logic function:
f :: Agent_t
f st args ((t, msgs, bcasts) : rest) myid
  = [m] : (f st args rest myid)
  where
    m = Debugmessage (myid, 0) ("Debug " ++ (show t))
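In the same style, here is a sketch of an agent that emits structured output to "data.csv" via a Datamessage at each time step (the row layout shown is illustrative):

g :: Agent_t
g st args ((t, msgs, bcasts) : rest) myid
  = [m] : (g st args rest myid)
  where
    -- one CSV row per time step: the time and the inbound message count
    m = Datamessage (myid, 0) (show t ++ "," ++ show (length msgs))
-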
Genetic Investment
The financial markets are subject to large and unpredictable changes. Computer algorithms that interact with financial markets have three choices: (i) adapt rapidly to change, (ii) be robust to change, or (iii) fail. But how should a Genetic Computation (GC) stock-picking system respond to volatile markets? After a market shift, the GC will need to re-train on new data, but initially there is insufficient data for re-training. Clack's research team has worked to evolve solutions that are robust to market changes:
- Behavioural GP Diversity for Dynamic Environments: an application in hedge fund investment (Yan & Clack, in Proceedings GECCO 2006, pp 1817-1824, ACM, 2006)
- Evolving robust GP solutions for hedge fund stock selection in emerging markets (Yan & Clack, in Proceedings GECCO 2007, pp 2234-2241, ACM, 2007) - (Best Paper award).
- Diverse committees vote for dependable profits (Yan & Clack, in Proceedings GECCO 2007, pp 2226-2233, ACM, 2007)
- ALPS evaluation in Financial Portfolio Optimisation (Patel & Clack, in Proceedings IEEE CEC 2007, pp 813-819, IEEE, 2007)
- Learning to optimize profits beats predicting returns - comparing techniques for financial portfolio optimisation (Yan, Sewell & Clack, in Proceedings GECCO 2008, pp 1681-1688, ACM Press, 2008)
- GP Age-layer and Crossover Effects in Bid-Offer Spread Prediction (Willis, Patel & Clack, in Proceedings GECCO 2008, pp 1665-1672, ACM Press, 2008)
- Robustness of multiple objective GP stock-picking in unstable financial markets: real-world applications track (Hassan & Clack, in Proceedings GECCO 2009, pp 1513-1520, ACM, 2009)
- Behavioural GP diversity for adaptive stock selection (Yan & Clack, in Proceedings GECCO 2009, pp 1641-1648, ACM, 2009)
The following paper was nominated for a Best Paper award; it is now cited in Hitoshi Iba's recent book surveying the field of Evolutionary Computation in Financial Engineering; it was supported by Thomson Reuters; and it prompted seven invited talks at industry conferences as well as two invited talks at subsequent GECCO conferences. This novel work is the first to identify a significant instability whereby cautious investment strategies can unwittingly become adventurous, and vice versa, following a shift in market regime.
- Multiobjective Robustness for Portfolio Optimization in Volatile Environments (Hassan & Clack, in Proceedings GECCO 2008, 1507-1514, ACM Press, 2008).
The following invited paper, in the journal special issue on “Bio-Inspired Learning and Intelligent Systems”, presents a Genetic Programming technique for portfolio optimisation: the first such technique to propose preserving diversity of dynamic phenotypic behaviour, and to optimise investment returns rather than predict prices. It is robust to market volatility, reducing down-time while gathering data for retraining, and demonstrates a threefold performance improvement. The technique is being used by SIAM Capital, is included in Hitoshi Iba's recent book on Evolutionary Computation in Financial Engineering, and has been shown to have superior performance to Support Vector Machines (see above).
- Evolving robust GP solutions for hedge fund stock selection in emerging markets (Yan & Clack, Soft Computing 15(1):37-50, Springer 2011)
-
Genetic TSA
Financial time series analysis by Genetic Algorithms (GA) is characterised by large datasets with many non-linear dependencies. The default assumption that all genes are non-linearly linked leads to very large search times and intractability. Linkage learners can improve performance by identifying non-linear linkage and dividing the search space where linkage does not occur; however, these learners are slow. The first paper below presents a novel perturbation-based linkage-learning algorithm that focuses on the semantics of composability, with consequential speed improvements, and the second paper applies linkage detection to financial time series (a generic sketch of the perturbation idea follows these papers):
- gLINC: Identifying Composability using Group Perturbation (Coffin & Clack, in Proceedings Genetic and Evolutionary Computing Conference (GECCO'06), pp 1133-1140, ACM, 2006)
- Nonlinearity linkage detection for financial time series analysis (Chiotis & Clack, in Proceedings Genetic and Evolutionary Computation Conference (GECCO'07), pp 1179-1186, ACM, 2007)
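For illustration, here is a minimal sketch of pairwise perturbation-based linkage detection (a simplification: gLINC itself works with group perturbation rather than pairs). Genes i and j are flagged as nonlinearly linked when perturbing both together changes fitness by other than the sum of the two individual effects:

-- fit is the fitness function over a bit-string chromosome x
linked :: ([Bool] -> Double) -> [Bool] -> Int -> Int -> Bool
linked fit x i j = abs (dij - (di + dj)) > 1e-9
  where
    flipGene k v = [ if n == k then not g else g | (n, g) <- zip [0..] v ]
    f0  = fit x
    di  = fit (flipGene i x) - f0                -- effect of flipping i alone
    dj  = fit (flipGene j x) - f0                -- effect of flipping j alone
    dij = fit (flipGene i (flipGene j x)) - f0   -- effect of flipping both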
-
Market Simulation
Combining Multi-agent Simulation and Genetic Computation improves on simple analysis of historical data by incorporating feedback and learning. The following papers demonstrate how a simulation of hedging pressure in futures markets can cast doubt on predictions from previous analyses:
- Evolutionary simulation of hedging pressure in futures markets (Duke & Clack, in Proceedings IEEE Congress on Evolutionary Computing (CEC'07), pp 782-789, IEEE, 2007)
- Using an evolutionary agent-based simulation to explore hedging pressure in futures markets (Duke & Clack, in Proceedings Genetic and Evolutionary Computation Conference (GECCO'07), pp 2257, ACM, 2007)
-
Agent Based Simulation (2004-)
Agent Based Simulation (ABS) uses many autonomous rule-based "agents" interacting via co-operation and competition to simulate a real-world system; a particular challenge is to understand those interactions that lead to emergent and adaptive behaviours. The work undertaken by Clack with his colleagues has provided novel and effective solutions to research challenges in two areas:
- The use of ABS techniques in hypothesis formulation for complex behaviour in biological systems;
- The development of ABS theoretical techniques to model, detect and analyse emergent behaviour in dynamic environments.
-
Biological Systems
Biological systems are highly complex, and understanding how they respond and adapt to a dynamic environment is particularly challenging. Agent-based simulation can provide a convenient medium for domain experts to express and communicate complex ideas during hypothesis formulation. The following paper provides a high-level introduction to Bioscience Computing:
- BioScience Computing and the role of computational simulation in biology and medicine (Clack, in Intelligent Algorithms in Ambient and Biomedical Computing, eds. W.Verhaegh, E. Aarts and J.Horst, Philips Research Book Series, Vol. 7, pp 3-19, ISBN 1-4020-4953-6, Springer 2006).
Multi-agent simulation (MAS) techniques support exploration of non-evolutionary adaptations within a single lifetime. The following papers explore the computational processes that underlie specific biological mechanisms and, for example, demonstrate how a reasonably simple MAS can exhibit lifetime adaptation with multiple goal-oriented behaviours as emergent behaviour:
- The Artificial Cytoskeleton for Lifetime Adaptation of Morphology (Bentley and Clack, in SODANS Workshop Proceedings of the Ninth International Conference on the Simulation and Synthesis of Living Systems (ALIFE IX), pp 13-16, 2004).
- Morphological Plasticity: Environmentally Driven Morphogenesis (Bentley and Clack, LNCS 3630:118-127, ISSN 0302-9743, Springer 2005)
- Multi-level behaviours in agent-based simulation: colonic crypt cell populations (Chen, Nagl and Clack, in Proceedings Seventh International conference on Complex Systems, paper 22, New England Complex Systems Institute (NECSI) and Interjournal, 2008)
Clack collaborated with Eileen Cox (Natural History Museum) to supervise Katie Bentley's work in developing a novel agent-based technique to simulate morphological dynamics in living tissue; the following paper addresses diatom morphology, and the technique has since been applied to cancer tumour morphology. This work has led to the founding of a morphogenesis modelling laboratory at Harvard. Unlike previous models, this simulation recreates detailed mechanisms and chemical interactions, and the following paper (JCR impact factor 2.071) provides unexpected evidence that a single mechanism produces two different types of morphogenesis. Rib-domination phenomena are faithfully reproduced, and the new model, experimental methodology, and results were assessed against state-of-the-art techniques and SEM data by the Natural History Museum.
- Diatom colony formation: A computational study predicts a single mechanism can produce both linkage and separation valves due to an environmental switch (Bentley, Clack and Cox, Journal of Phycology 48(3):716-728, Phycological Society of America, 2012)
-
ABS Theory
How can complex behaviour in Biology and the Financial Markets be modelled, detected and analysed? Theoretical work in this area develops the new concept of "complex events" that extend in both time and space:
- Context sensitivity in individual-based modeling (Chen, Clack and Nagl, BMC Systems Biology 1, Supplement 1, pp 44, ISSN 1752-0509, BMC 2007)
- Specifying, Detecting and Analysing Emergent Behaviours in Multi-Level Agent-Based Simulations (Chen, Nagl and Clack, in Proceedings Summer Computer Simulation Conference (SCSC'07), pp 969-976, ISBN 1-56555-316-0, Society for Modeling and Simulation International, 2007)
- A calculus for multi-level emergent behaviours in component-based systems and simulations (Chen, Nagl and Clack, in Proceedings Emergent Properties in Natural and Artificial Complex Systems (EPNACS'2007), in ECCS'07 European Conference on Complex Systems, pp 35-51, 2007)
- Multi-level behaviours in agent-based simulation: colonic crypt cell populations (Chen, Nagl and Clack, in Proceedings Seventh International conference on Complex Systems, paper 22, New England Complex Systems Institute (NECSI) and Interjournal, 2008)
- A method for validating and discovering associations between multi-level emergent behaviours in agent-based simulations (Chen, Nagl and Clack, LNCS 4953:1-10, Springer, 2008)
- Complexity and Emergence in Engineering Systems (Chen, Nagl and Clack, Studies in Computational Intelligence 168:99-128, Springer 2009)
- A formalism for multi-level emergent behaviours in designed component-based systems and agent-based simulations (Chen, Nagl and Clack, Understanding Complex Systems 44:101-114, Springer 2009)
The novel Complex-Event formalism described in the following paper has since been applied by oncologists to model cancerous tumour growth, with specific success in the field of colonic crypt tumorigenesis. The journal is ranked 3rd out of 14 journals covering the field of computer simulation. The formalism supports the modelling of phenomena that are distributed simultaneously across different physical and temporal levels; superiority over alternative modelling techniques has been independently validated, and as further validation this paper demonstrates how it extends the established X-Machines formalism.
- Identifying Multi-Level Emergent Behaviors in Agent-Directed Simulations using Complex Event Type Specifications (Chen, Clack and Nagl, Simulation 86:41-51, Sage, 2010)
-
Genetic Computation (1997-2000)
Genetic Programming (GP) systems were originally cumbersome and unable to evolve complex programs. But are these necessary impediments? Can GP be modified to become more efficient in the evolution of complex programs, including polymorphism, recursion and higher-order functions? Building on earlier work in Functional Programming, a novel approach investigated the use of pure lambda-expression parse trees, augmented the GP system with a polymorphic higher-order type system, and incorporated recursion over finite lists. The results show substantial benefits - increased expressiveness, decreased parse-tree size, and improved performance:
- Recursion, Lambda Abstraction and Genetic Programming (Yu and Clack, in Proceedings Third Genetic Programming Conference, eds. J.R. Koza et al, pp 422-431, ISBN 1-55860-548-7, Morgan Kaufman, 1998)
- PolyGP: A Polymorphic Genetic Programming System in Haskell (Yu and Clack, in Proceedings Third Genetic Programming Conference, eds. J.R. Koza et al, pp 416-421, ISBN 1-55860-548-7, Morgan Kaufman, 1998)
- Performance-Enhanced Genetic Programming (Clack and Yu, LNCS 1213:87-100, ISSN 0302-9743, Springer, 1997)
Traditional Genetic Algorithm (GA) behaviour was also unsatisfactory: to achieve good performance a crossover operator had to be chosen to match the chromosome encoding. By using explicit and detailed history about which previous modifications were beneficial and which were not, can a general-purpose GA crossover operator be devised that is insensitive to encoding? Selective Crossover is a novel extended GA representation with an enhanced crossover procedure: it monitors contributions to fitness on a per-gene basis and guarantees good behaviour regardless of problem encoding. It works exceptionally well on problems of high epistasis, where it finds a better solution than the conventional crossover operators and does so with the fewest evaluations (a repeatable 25% improvement). A generic sketch of the idea follows the papers below:
- Royal Road Encodings and Schema Propagation in Selective Crossover (Vekaria and Clack, in Proceedings WSC4: 4th Online World Conference on Soft Computing in Industrial Applications, eds. Y.Suzuki et al, pp 281-292, ISBN 185233293X, Springer, 2000)
- Biases Introduced by Adaptive Recombination Operators (Vekaria and Clack, in Proceedings Genetic and Evolutionary Computation Conference (GECCO'99), eds. Banzhaf et al, pp 670-677, ISBN 1558606114, Morgan Kaufman, 1999)
- Selective Crossover in Genetic Algorithms: An Empirical Study (Vekaria and Clack, LNCS 1498:438-447, ISSN: 0302-9743, Springer 1998)
- Selective Crossover in Genetic Algorithms (Vekaria and Clack, in Proceedings Third Genetic Programming Conference, eds. J.R. Koza et al, pp 609, ISBN 1-55860-548-7, Morgan Kaufman, 1998)
- Genetic Programming with Gene Dominance (Vekaria and Clack, in Late Breaking Papers at the Genetic Programming 1997 Conference, pp 300, ISBN 0-18-206995-8, Stanford University Bookstore, 1997)
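For illustration, here is a minimal sketch of the selective-crossover idea (a simplification of the published operator): each gene carries a record of its contribution to fitness, and the child takes, position by position, the gene with the higher recorded contribution, independently of how the problem is encoded:

-- a gene is an allele value paired with its recorded fitness contribution
type Gene = (Double, Double)

-- encoding-insensitive crossover: dominance is decided per gene by the
-- recorded contributions, not by position-based cut points
selectiveCrossover :: [Gene] -> [Gene] -> [Gene]
selectiveCrossover = zipWith pick
  where
    pick g1@(_, c1) g2@(_, c2) = if c1 >= c2 then g1 else g2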
-
Strictness Analysis
Strictness Analysis (SA) determines during compilation whether functions definitely evaluate their arguments, and is essential for Parallel Functional Programming (PFP) systems that retain normal-order termination properties. However, SA has exponential worst-case time complexity. Is it nonetheless possible to find the fixed point efficiently for the common, well-behaved functions? The following two papers show how a simple lattice representation of finite height, with a monotonicity rule and a synergistic search algorithm, leads to a new combinatorially-implosive technique based on tracking "frontiers" in the lattice:
- "Finding fixpoints in abstract interpretation" (Peyton Jones and Clack,in Abstract Interpretation of Declarative Languages eds. S.Abramsky and C. Hankin, Chapter 11, pp 246-265, ISBN 0-7458-0109-9, 1987)
- "Strictness analysis - a practical approach" (Clack and Peyton Jones, LNCS 201:35-49, ISSN 0302-9743, 1985)
-
Functional Programming (1985-1999)
At its outset, Parallel Functional Programming (PFP) promised an elegant way to program parallel systems: if a sequential program were correct then correctness on a parallel system would be guaranteed - if a large number of practical implementation issues could be resolved. Prior to Clack's research, PFP was in its infancy and substantially inhibited by several issues, most notably how to obtain efficient lazy evaluation semantics. The research undertaken by Clack with his colleagues provided effective and seminal solutions to a wide range of research challenges in PFP:
- how to detect needed ("strict") computations statically, efficiently and automatically?
- how to use strictness to provide lazy evaluation semantics in a parallel system?
- how to guarantee a run-time system provides efficient management of dynamic parallel tasks?
Theoretical advances were demonstrated through the design and construction of real parallel hardware, with all the required software infrastructure to support PFP applications.
- "Foundations" (Michaelson, Hammond and Clack, in Research Directions in Parallel Functional Programming, Chapter 2, pp 31–61, ISBN 1-85233-092-9. 1999) introduces FP and implementation issues, and
- "Realisations for Non-Strict Languages" (Clack, in Research Directions in Parallel Functional Programming, Chapter 6, pp 149-187, ISBN 1-85233-092-9. 1999) gives a retrospective survey and comparative evaluation of non-strict PFP research to 1998.
Additionally, Clack's functional programming research has contributed to FP language design, and the wider field of Memory Management (MM):
- how to profile lazy FP run-time performance?
- can object-oriented semantics combine with FP?
- can dynamic MM be fast and small?
-
GRIP Parallel Processing Architecture
How can system architecture be best designed as an efficient target for a PFP compiler? GRIP (Graph Reduction In Parallel) explored the design space using stock processors and custom hardware such as intelligent memory units, and new compilation and run-time system technology were successfully demonstrated.
Elements of GRIP were eventually transferred to industry and incorporated into the ICL "Goldrush" parallel database server.
The GRIP architecture and its programming paradigm for PFP are described in the following papers:
- GRIP: the GRIP Multiprocessor (Clack, in Parallel Computing Principles and Practice, ed. T. J. Fountain, Chapter 7, pp 266-275, ISBN 0-521-45131-0, CUP 1990).
- High Performance Parallel Graph Reduction (Peyton Jones, Clack and Salkild, LNCS 365:193-206, ISSN 0302-9743, Springer 1989). This paper is also reprinted as:
- High Performance Parallel Graph Reduction (Peyton Jones, Clack and Salkild, in Programming Languages for Parallel Processing, eds. D. Skillicorn and D.Talia, Chapter 5, pp 234-247, ISBN 0-8186-6502-5, IEEE Computer Society Press 1994). Described as a collection of reprints of "the most important parallel-programming languages designed in the last decade".
- Functional Programming on the GRIP multiprocessor (Peyton Jones, Clack, Salkild and Hardie, in Proceedings IEE International Specialist Seminar on Design and Application of Parallel Digital Processors, pp 116-127, available electronically from the IEEE at this link, IEEE, 1988)
- GRIP - a parallel graph reduction machine (Peyton Jones, Clack and Salkild, ICL Technical Journal 5(3):595-599, ISSN 0142-1557, OUP 1987).
- GRIP - a high performance architecture for parallel graph reduction (Peyton Jones, Clack, Salkild and Hardie, LNCS 274:98-112, ISSN 0302-9743, Springer 1987). This paper is also reprinted (plus an additional two pages) as:
- GRIP - a high performance architecture for parallel graph reduction (Peyton Jones, Clack, Salkild and Hardie, in Multiprocessor Computer Architecture, eds. T. J. Fountain and M. J. Shute, Chapter 5, pp 101-120, ISBN 0-444-88215-4, Elsevier 1990)
-
Four Stroke Reduction Engine
At its outset, Parallel Functional Programming (PFP) offered "painless parallelism" due to the lack of side effects. However, the existing ALICE system at Imperial was disappointingly slow. Did this presage some fundamental problem, or was ALICE simply immature? How could a compiler bridge the gap between the highly abstract functional program and the details of parallel execution? And how could this be done efficiently?
The four-stroke reduction engine (4SRE) was one of the earliest abstract machines for PFP, the first to be implemented on GRIP (see separate section), and the progenitor of several novel features found in subsequent PFP systems: e.g. the "evaluate and die" model for transparent communication of results from child to parent tasks; the encoding of the dump in the heap; several novel mechanisms for throttling parallelism (including "discardable sparks" and preferential resumption of suspended tasks); and the lack of explicit communication between tasks. The four-stroke reduction engine is described in the following paper:
- "The four-stroke reduction engine" (Clack and Peyton Jones, in Proceedings of the 1986 ACM Conference on Lisp and Functional Programming, pp 220-232, ISBN 0-89791-200-4,1986).
This paper is also reprinted in the following book of seminal papers:
- Selected Reprints on Dataflow and Reduction Architectures, ed S.S.Thakkar, Chapter 7, pp 327-339, ISBN 0-8186-0759-9, 1987.
-
Lexical Profiling
Manual optimisation of run time and memory usage is problematic for lazy Functional Programming (FP) languages. FP programmers do not specify detailed memory allocations, and function evaluation may be interleaved dynamically; the mapping between the source code and what happens at run-time is complex. Is it possible to provide meaningful and intuitive profiling information in such a context? And if so, how? This target behaviour was achieved by attributing time and space costs to portions of the source code as though strict evaluation had occurred, whilst reporting the quantitative measures of the actual lazy evaluation. The following paper provided the design and the first implementation of this technique for an interpreted language, together with suggestions for how to implement it for a compiled language. The technique was adopted and further developed by the profiler for the popular GHC functional language compiler.
- Lexical Profiling: Theory and Practice (Clack, Clayman and Parrott, Journal of Functional Programming 5(2):225-277, ISSN 0956-7968, Cambridge University Press, 1995)
-
Dynamic Memory Management
Dynamic Memory Management (DMM) is essential to the implementation of functional languages and many other modern programming languages. Prior to Clack's research, DMM could be either fast or small (i.e. have low fragmentation) but not both. DMM is a very mature research area, and to identify a new algorithm is a considerable challenge. By using a tree bitmap representation and altering the search semantics, a best-fit free-list memory allocator was created that provides both a tenfold improvement in worst-case performance and excellent best-case performance. The allocator was patented:
- A Data Structure, Memory Allocator and Memory Management System (Clack, UK Patent Application number 9924061.6, filed 11th October 1999; PCT application number WO0231660, filed 10th October 2001; UK and EU Patent Offices).
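For illustration only (a generic device, not the patented data structure): a binary tree whose leaves hold free-block sizes and whose internal nodes cache the maximum size beneath them lets a fit search skip whole subtrees, which is the kind of pruning that makes tree-indexed free-list search fast:

data SizeTree = Leaf Int                    -- size of one free block
              | Node Int SizeTree SizeTree  -- cached maximum of the subtree

maxSize :: SizeTree -> Int
maxSize (Leaf n)     = n
maxSize (Node m _ _) = m

-- index of the leftmost free block of at least the requested size
findFit :: Int -> SizeTree -> Maybe Int
findFit want t
  | maxSize t < want = Nothing
  | otherwise        = Just (go 0 t)
  where
    go i (Leaf _) = i
    go i (Node _ l r)
      | maxSize l >= want = go i l            -- a fit exists on the left
      | otherwise         = go (i + size l) r -- skip the left subtree entirely
    size (Leaf _)     = 1
    size (Node _ l r) = size l + size r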
-
PhD students supervised (first supervisor)
- Leo Carlos-Sandberg - 2017-2023 - "Investigation of Causality Pattern Structure for the Exploration of Dynamic Time-Varying Behaviour"
- Jose Gonzalez (part time) - 2017 - withdrew
- Yang Gu - 2011-2013 - withdrew
- Samet Gogus (Barclays Treasury, part-time) - 2007-2009 - Credit risk - withdrew
- Ghada Hassan (Egyptian Culture Bureau) - "Multiobjective Genetic Programming for Financial Portfolio Management in Dynamic Environments" PhD awarded 2010 - now a lecturer at Cairo University
- Chih-Chun Chen (EPSRC DTA) - "Complex Event Types for Agent-Based Simulation" PhD awarded 2009 - now a postdoctoral researcher at Centre d'analyse et de mathématique sociales
- David Coffin (EPSRC) - 2004-2006 - Genetic Algorithms search space decomposition - subsequently changed research direction and supervisor in 2006
- Wei Yan (industry) - "Evolving Robust Genetic Programming Solutions in Dynamic Environments - with a case study in hedge fund stock selection in an emerging market" PhD awarded
- Katie Bentley (industry / DTA) - "Adaptive Behaviour through Morphological Plasticity in Natural and Artificial Systems" PhD awarded 2006 - now Assistant Professor, Computational Biology Laboratory, Beth Israel Deaconess Medical Center, Harvard Medical School
- Lee Braine (EPSRC CASE) - "Design and Implementation of an Object-Oriented Functional Language" PhD awarded 2000 - now Managing Director, Chief Technology Office, Barclays
- Kanta Vekaria (EPSRC 97701326) - "Selective Crossover as an Adaptive Strategy for Genetic Algorithms" PhD awarded 2000 (link to thesis) - now principal engineer at Nokia
- Tina Yu - "An Analysis of the Impact of Functional Programming Techniques on Genetic Programming" PhD awarded 1999 (link to thesis) - now Associate Professor at the Memorial University of Newfoundland, Canada
- Simon Courtenage (EPSRC 90311629) - "The Analysis of Resource Use in the Lambda-Calculus by Type Inference" PhD awarded 1995 (link to thesis) - now Principal Lecturer at the University of Westminster
- Stuart Clayman (SERC) - "Developing and Measuring Parallel Rule-Based Systems in a Functional Programming Environment" PhD awarded 1993 (link to thesis) - now Senior Research Fellow at UCL
- David Parrott - "Synthesising Parallel Functional Programs to Improve Dynamic Scheduling" PhD awarded 1993 - now Chief Architecture Officer at Thomson Reuters
Other academic supervision
- 2006-2010 Nuno Rocha Nene - PhD student - 2nd supervisor
- 2005-2008 Manish Patel - PhD student - 2nd supervisor
- 2004-2005 Yoav Zibin - RA - principal investigator
- 2004-2005 Xiaowen Liu - RA - principal investigator
- 1995-1997 Steven Gould - RA - principal investigator
- 1995-1997 Peter Lidwell - RA - principal investigator
-
PhD students examined
- Wenbin Wu - Price Discovery via Secondary Markets in Stable Coins: An Agent-Based Approach (External Examiner, KCL, 2024)
- Ross Phillips - The predictive power of social media within cryptocurrency markets (Internal Examiner, UCL, 2019)
- Qasim Nasar-Ullah - High Performance Parallel Financial Derivatives Computation (Internal Examiner, UCL, 2014)
- Ayub Hanif - Computational Intelligence Sequential Monte Carlos for recursive Bayesian Estimation (Internal Examiner, UCL, 2013)
- Yasin Rosowsky - Spurious regions in financial space: pattern recognition for foreign exchange forecasting (Internal Examiner, UCL, 2013)
- Gary Doctors - Towards patient-specific modelling of cerebral blood flow using lattice-Boltzmann methods (Internal Examiner, UCL, 2011)
- Behzad Behzadan - MINES: Mutual Information Neuro-Evolutionary System (Internal examiner, UCL, 2010)
- Andrew Cheadle - Soft Real-time Garbage Collection for Dynamic Dispatch Languages (Internal examiner, Imperial College London, 2010)
- Sebastien Marion - Using Class-Level Static Properties and Data Mining to Predict Object Lifetimes (External examiner, University of Kent, 2009)
- Trevor A. Graham - The expansion of mutant clones in tumorigenesis (Internal examiner, UCL, 2009)
- Kun (Max) Jiang - MILCS: A Mutual Information based Learning Classifier System (Internal examiner, UCL, 2009)
- Adrian Ramirez-Cabrera - An Extension of the Strong Typing Genetic Programming Model (External examiner, MPhil, University of Manchester, 2000)
- Benoit Lanaspre - Static Analysis for Distributed Prograph (External examiner, University of Southampton, 1997)
- Jin Yang - Co-ordination Based Structured Parallel Programming (Internal examiner, Imperial College London, 1997)
- Patrick J. Wright - Optimised Redundant Cell Collection for Graph Reduction (Internal examiner, UCL, 1993)
- David Lillie - Polymorphic Subtype Inference in Combinator Languages (Internal examiner, Imperial College London, 1992)
- Guido K. Jouret - Exploiting Data-Parallelism in Functional Languages (Internal examiner, Imperial College London, 1991)