I was born in Scotland and brought up in France, and the general opinion is that I speak both English and French with a foreign accent. I obtained my PhD in Statistics at Imperial College London in 2008, supervised by Prof. Andrew Walden. I was awarded a Heilbronn research fellowship in Data Science in 2012, which I first held at the University of Bristol and then at the University of Oxford. I became assistant professor of Statistics at the University of Bristol in July 2017, and was promoted to full professor in July 2022. As of Jan 2024, I am chair of Statistical Learning at the University of Edinburgh. My research interests include data exploration, graph embedding and unsupervised learning.
data exploration; statistical testing; clustering; anomaly detection; embedding; graph analytics; behaviour analytics; manifold learning; topological data analysis; non-parametric statistics; high-dimensional statistics; representation learning; unsupervised learning; machine learning.
In recent years, there has been a significant opportunity for innovation in Statistics. When I was doing my PhD, it was not easy to find data. Now, you can pretty much pick any subject and find a relevant data source, complete with data processing pipelines and documentation. As a result, it's a lot easier to combine mathematical thinking with real-world data to make something useful.
Working with data can give you mathematical ideas; for example, large-scale cyber-security problems led me to think about how to do graph embedding with disassortative networks[1], where similar entities tend not to connect. Conversely, purely mathematical ideas can often find applications in real data; for example, a fairly long path of analysis eventually led us to the notion of a "continuous tree" that could potentially be recovered from high-dimensional data, and indeed we found this was relevant for stem cell research[2].
More generally, my research is about discovering structure (for example, correlations, clusters, hierarchy, trends, or manifold structure) in complex data such as large relational databases, dynamic networks, or high-dimensional data (e.g. tables with many columns, text, images). One goal is to empower humans, in an increasingly digital world, by supporting data exploration, or "looking at data to see what it seems to say" (Tukey, 1977). Another is to develop ways of refining data to improve machine learning, e.g. in terms of accuracy and robustness.
The applications of my research are quite wide-ranging, and I have won funding (over £7M from government and industry) for applications in biosciences, healthcare, (cyber-)security, societal resilience, environmental protection and more. For example, Microsoft uses unfolded spectral embedding[3] for anti-corruption[4].
Multiple PhD and postdoctoral research positions will be opening for NeST. Please contact me if you want to discuss these or other NeST research or industrial collaboration opportunities.
More generally, I'm always happy to hear from students interested in doing research, e.g. a PhD. Fundamentally, you'll need to enjoy doing maths — everything else you can learn.