Neal, thanks for taking the time to meet with me today. It would be great if you could give us an introduction to yourself.
I joined Monzo about 2 years ago (I think! I’ve lost track of time) and my background is in Computer Science. I did my undergrad at UCL, here in London, and then moved on to a PhD studying recommender systems in 2006. Back then, Netflix had launched a competition to improve their recommender system and I used that data set for my PhD research. From there, I stayed in academia for a little while as a postdoctoral researcher, and my research moved away from recommender systems. My last project – about mobile health apps – was spun out into a small startup in Cambridge, which unfortunately failed. I then joined Skyscanner, which was around the time when they got acquired by C-Trip and also when they were acquiring other cool companies. It was interesting to see such an engineering-driven organisation that was already really big and very profitable.
Around that time, my sister introduced me to Monzo (it was back when Monzo was only a prepaid card) and I thought it looked really interesting – banking struck me as an area that had a lot of potential, because banking for me personally had always been… a bit s**t! I then found out that Monzo was hiring for a Data Scientist and met up with Dimitri, who is now our VP of Data, to find out a bit more. I interviewed and joined shortly after that. I started off in more of a generalist role (as there were only 5 of us in the data team when I joined) and my work was a little bit of Machine Learning, a little bit of Operations Analytics and a little bit of data infrastructure.
Today, Monzo has over 30 Data Scientists who, broadly, cover 4 areas. The biggest of them is Product Analytics, where the focus is on helping the business and our product engineering teams make the right decisions. The second big area is all the analytics you may naturally associate with banking: financial crime analytics, lending analytics for overdrafts and loans, and operations analytics for customer service. Finally, we have two squads that focus on Data Engineering and Machine Learning – and I am leading the Machine Learning Squad.
That’s great. Your career with Monzo has progressed quite far since joining, what does your role now encompass as the Machine Learning Lead?
The transition from being primarily an individual contributor to becoming a lead has been super interesting for me, and I’ve had a lot of great support from others here at Monzo throughout the process.
I’ve moved away from thinking “can I build this thing and get it out there” to thinking about “can I build a team that creates great things.” It’s now about whether we’re working on the right things, whether we’re working well together, how we share knowledge, how we enable each other and how we break down big problems into smaller problems. Having said that, we’re still at the size where I get to write some code, so I haven’t fully moved away from that – which I’m happy about.
You’ve recently written an article about building Machine Learning teams for both speed and success – where’s that all come from?
Before I joined Monzo, I was interviewing with other companies and one quite large company that I visited paired me with a Machine Learning Researcher for the day while I was visiting. Naturally, I was chatting with him the whole day about what he does and one of the stories he told was about how a product team came to him with an idea that involved automated text summarisation. He took the idea away and worked on it for a year and even published his research in some of the best conferences. I then asked “well, did this new machine learning work you came up with see the light of day?” and the answer was no. This was because, by the time he’d finished with his research and took it back to the product team, they said that they’d moved on from the idea and they weren’t actually interested in it anymore as their problem had advanced already.
For some people, that’s fine, as they care about doing the research, but one of the things I’ve found disappointing is how much machine learning research we have out there, and how little of it is in customers’ hands and making a real difference for them.
Do you find that situation across the whole machine learning industry?
I’ve seen it across many places. For example, when I was doing my PhD, I would go to the ACM Recommender Systems conference, which is a really great conference because it attracts a mixture of academics and people from industry. It’s always amazing to see people in academia proposing improvements to the state-of-the-art, and people in industry discussing how they’re applying these ideas in real products.
For example, one of my mentors moved to Netflix shortly after the Netflix competition finished and he’s written a lot of stuff about the lessons they learned. One of them was about the winning solution for the competition: it combined literally hundreds of machine learning models together in order to cross the target accuracy threshold. And again, the key question is did Netflix ship this stuff? The answer: they didn’t ship all of it, they only took the best bits of the solution. Clearly, they don’t just need to think about accuracy – they need to decide whether the accuracy gain is worth the engineering effort. The competition data set they had released had nearly half a million users and about 20,000 films, but Netflix’s actual data is many millions of users and many more thousands of films – and their recommendations are not based on rating films anymore.
This is a big point for me when thinking about how I want to set up my team at Monzo. I’m trying to balance two things. One is that state-of-the-art research is moving at the speed of light, and every few months there’s something super cool being published. The other is that, inside Monzo, spending months looking at that research and applying it to our problems would not be very impactful at all – the actual impact comes from whether we can take the best bits out of it and put them into the app quickly, or into our internal tools for customer support quickly. So this is what has shaped my thinking about speed.
One way we are thinking about this is by enabling Machine Learning Scientists to ship their own models – which is a bit of a debated point in the industry. Some companies don’t regard Machine Learning Scientists as Engineers: Scientists shouldn’t ship models because maybe they won’t do it correctly, or maybe they don’t have the experience or knowledge to ship production code.
The approach that we’ve taken has been ‘can we make it incredibly easy and safe to do it, so that a Machine Learning Scientist can deploy a model?’. This has made the process faster, but has also meant that people who join the team spend some time upskilling in backend engineering – which takes a little bit of time, commitment and support.
When you look for Machine Learning Scientists to join the team, do you look for someone who can already pump out production-ready code?
No, not at all. We split the interview process into a few steps: CV review, initial call, take-home test and then an on-site interview.
In that initial call, the main thing we’re looking for is an impact mentality. So are you the type of Machine Learning Scientist who just wants to tweak machine learning models ad nauseam, or wants to see things through to customer impact? As a team, we’re not at the stage where we are squeezing small gains out of existing areas – we’re at a stage where we spend more of our time breaking new ground. It’s not ‘can we get 1% improvement?’ but ‘can we ship something that demonstrates value’ or ‘can we try new ideas?’ and so in that initial call it’s all about how candidates measure their own impact: we look for people who think that impact is measured in testing things with real customers, rather than ‘I just made the model and it looks fine’.
We do have a take-home test where they do some coding, but the main bit is that we give them a broad problem and ask them to write some code to solve it. The key thing we’re looking for is how they define the problem, and less so whether their code is 100% resilient.
With speed being such a big factor in what you do, how do you balance it with quality? Sometimes quality is sacrificed for speed.
When we think about speed, we don’t think about it in terms of cutting corners or producing substandard work. It is about finding the areas in our work where we can speed things up by taking a more holistic view of the problems.
One very classical example of this is that every time a machine learning team gets an idea, there’s often a sort of reset. Everyone sits back and thinks ‘Here’s a new idea, what data do we have?’, then you need to analyse the data, build some features and then train a model, ship the model and A/B test it. What I saw in other places I’ve worked is that every time that would happen, you would start again from scratch. We’ve been taking a different approach – where we focus on families of problems. For example, at Monzo, we do all of our customer support via the in-app chat; naturally, that means we have a lot of ideas about things we want to test to improve the customer support experience.
So, every time we start working on a new problem that involves text data, we shouldn’t be starting from scratch and thinking ‘where’s the data?’. One of the ways we’ve been thinking about it is in terms of reusable tools: if you’re going to tackle a new supervised learning problem for text data, you don’t need to build a text processing pipeline again.
There are other areas where we re-use existing models to build something new. A very simple example would be, in the app, you can go to the help screen and search for what you need, and it points you to an article that can help you find the answer to your questions. Under the hood, that’s a neural network that is encoding the text you’ve written in the search bar and is seeing which of our help articles is the most similar to the text you’ve written (e.g. ‘I’ve forgotten my pin’ will direct you to the article ‘forgot your pin?’). A second problem is when people write to our customer support, we want to give our Customer Support Agents recommendations on what to reply to you. If you write to us saying “I’ve forgotten my pin”, the Agent shouldn’t have to try to find a page on how to reset your pin, then type out a full reply. So, we have a bunch of saved responses that can be accessed by a shortcut, which then populates the box with instructions on how to reset your pin. Now, this is a very similar problem to the help screen one because we have the same input data and the only difference is instead of looking at what the most similar help article is, we’re looking at what the most similar saved response is – so we use the same machine learning model on both. By virtue of that, we can develop one machine learning model and build two things and we’ve just doubled our speed that way. Because the core problem is still the same, if we improve the model, we improve two product features.
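As a rough illustration of the idea described above (not Monzo's actual code), the sketch below shows one encoder being reused for two lookups: the most similar help article and the most similar saved response. Everything in it is an assumption made for the example: the `encode` function is a toy bag-of-words stand-in for the neural network encoder mentioned in the interview, and the article titles and saved responses are made up.

```python
import math
import re
from collections import Counter

def encode(text):
    """Toy stand-in for a neural text encoder: lower-cased word counts.
    (Assumption for illustration; the interview describes a neural network.)"""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def most_similar(query, candidates):
    """Return the candidate text whose encoding is closest to the query's."""
    query_vec = encode(query)
    return max(candidates, key=lambda c: cosine(query_vec, encode(c)))

# Hypothetical data: two different candidate sets, one shared encoder.
HELP_ARTICLES = [
    "Forgot your PIN?",
    "Order a replacement card",
    "Set up an overdraft",
]
SAVED_RESPONSES = [
    "Here is how to reset your pin.",
    "Your replacement card is on its way.",
    "Your overdraft limit has been updated.",
]

message = "I've forgotten my pin"
article = most_similar(message, HELP_ARTICLES)      # help screen search
response = most_similar(message, SAVED_RESPONSES)   # agent reply suggestion
```

Because both lookups go through the same `encode` function, swapping the toy encoder for a better one would change the results of both features at once, which is the point made in the answer above.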
This is how we’re thinking about speed. We look at the whole process that we would follow if it were one Data Scientist by themselves and think about how we can build tools so that, when the next Data Scientist comes along, they’re not starting from scratch. And, when we train a model, we think about which other problems we can use it for.
What advice would you give to other Machine Learning leaders that want to implement speed into their teams?
One thing which is at the forefront of my mind (as the quarter is just finishing!) is thinking about the balance between breadth and depth. If you have a team of three Machine Learning Scientists, like we do, then you have a very difficult choice on whether you get them to work alone on three separate problems, or bring them all together to make them think about collaboration and re-using tools and models but work on fewer problems. My view right now is I’d rather say no to a couple of problems in order to really set us up for speed on one really important problem, rather than saying “okay, there’s 10 different problems – you take two, you take two and I’ll take two and everyone goes away and works alone.” I remember, in other places I’ve worked, we had these meetings with everyone working on machine learning, and everyone would present. Every single person that stood up to present would be presenting something completely different. One person would be talking about ranking, one about recommendation, one about forecasting models and one about anomaly detection. All of it would be useful in terms of finding out all the things everyone’s working on, but it would be less useful for giving constructive and useful feedback because you’re so far away from the problem.
The other thing we’ve been doing is thinking about how we share knowledge. We had this classic problem where we would have a meeting to give feedback on a specific project, and the main person working on the project would present what they’d done. What we found is everyone else in the room would be giving them ideas that the person had already tried (and hadn’t presented because they didn’t work). We found these meetings weren’t very useful because all of our feedback was ‘have you tried this – yeah, it didn’t work’ and ‘have you tried this – yeah, that didn’t work either’ so we flipped it around a little bit. Right now, Machine Learning Scientists who work on a problem keep a written diary of everything they’ve done. This helps us to share negative results, so when we get to the meeting, we have visibility of everything that didn’t work, and the feedback becomes more about ‘have you tried this because it could work’. The people working on something leave the meeting with a list of things to try, which has been a powerful thing for us. Granted, it has been a little more effort because you have to write everything down, but it helps massively when working on existing problems or very similar problems.
If you’re all focusing on the same problems, how do you prioritise the work you do?
At Monzo, we’ve adopted a company structure similar to the one Spotify made famous, with tribes and squads, but instead of tribes we have collectives (which sounds a bit more sci-fi!). All of these collectives set their own goals, and then the challenge we have is to look at all of these goals and think about where machine learning could help the most. That’s a combination of a few different things: how new is the problem, how much data do we have, is it a problem that looks like machine learning could really help and how big is the problem. This is how we then prioritise problem spaces. At the moment, our big focus is on NLP for customer support – we don’t necessarily prioritise specific topics in customer support, but we’ll know our focus area is on that part of the business.
Do you or any of your team specialise in NLP? Or is this something you’ve picked up along the way?
It’s definitely something we’ve picked up. I don’t think anyone in the team has a specific background in NLP – I certainly don’t! But having said that, NLP has been a very trendy area in machine learning for the last couple of years and the number of tools, pre-trained models, research papers, and blog posts has gone through the roof. Specifically, we’re using the transformers library from Hugging Face and they’ve made it super easy to train and fine-tune models. Again, because NLP has become so popular, the tooling has matured a lot so we’re not having to start from scratch on a lot of these problem spaces and implement every last bit of a training pipeline. We can pick up the best models that Google is training and fine-tune them to Monzo-specific problems.
In the long term, we’ll have to see how this model works as Monzo grows. Clearly, we’ll have to start focusing on other areas and we haven’t decided yet on how we’re going to do that. We may end up following the same path that the rest of Data Science has taken, which is about putting Data Scientists in each bit of the business, or we may continue to pursue this path of having a group of Machine Learning Scientists who swarm on different problem spaces. Ask me in a year and we’ll see how it’s going!
What plans does the Machine Learning team have in store for Monzo next year?
The main focus is on our Customer Support and, related, the app’s help screen. Before, if you were to go to the chat screen in the app you would only be chatting with a human Customer Support Agent. We’re now thinking about how we can use machine learning to mediate the conversation between two people.
For example, if you want to chat to us about something very simple or straightforward, we can give you automated recommendations – if you say “help, I’ve lost my card”, we can suggest a number of articles that can help you and tell you how to order a new card in the app. In some other cases, we can try to give you the answer you need directly. This means you don’t have to wait for someone to talk to.
Where machine learning comes into play is in trying to find these answers for you, for the things we’re comfortable trying to give you automated answers for (e.g. ‘I’ve forgotten my pin’). But if you come to us and say ‘I think I’ve been defrauded’ then the role of machine learning is not to give you an answer, it’s to help an agent know that it’s urgent and offer the help on that side.
One final question to wrap everything up – what is one book you would recommend?
I’ll have to cheat on this one! I recently read the ‘Dark Forest’ trilogy, which is a fun sci-fi collection written by Liu Cixin. I’d recommend it as everyone that works in data should read stuff that’s got nothing to do with data. You need to escape from the data world every now and again and put your day to day behind you and entertain yourself a little bit.