How Data Scientists Transform the Financial Industry with Geoffrey Horrell from Refinitiv
Geoffrey Horrell is Director of Applied Innovation, London Lab. His current focus is helping asset managers with digital transformation using the Knowledge Graph, Data Fusion and Intelligent Tagging NLP capabilities.
Over the last 15 years, Geoff has built a long track record of launching innovative content products, including Value Chain, TRBC, Events Platform, Estimates Delta and Knowledge Direct. Geoff is based in London and has a Master’s in Economics from Edinburgh University.
Geoffrey Horrell’s LinkedIn: https://www.linkedin.com/in/geoffhorrell/
Geoffrey Horrell’s Twitter: @GeoffHorrell
Geoffrey Horrell’s Website: https://www.refinitiv.com/en
Podcast website: https://www.humainpodcast.com
YouTube Full Episodes: https://www.youtube.com/channel/UCxvclFvpPvFM9_RxcNg1rag
Support and Social Media:
– Check out the sponsors above; it’s the best way to support this podcast
– Support on Patreon: https://www.patreon.com/humain/creators
– Twitter: https://twitter.com/dyakobovitch
– Instagram: https://www.instagram.com/humainpodcast/
– LinkedIn: https://www.linkedin.com/in/davidyakobovitch/
– Facebook: https://www.facebook.com/HumainPodcast/
– HumAIn Website Articles: https://www.humainpodcast.com/blog/
Here are the timestamps for the episode:
(00:00) – Introduction
(01:33) – Refinitiv was a global provider of data and workflow solutions. And, as you said, APIs, hopefully something to reach out to the developer community who wanted to get more data into their applications and drive their strategies within wealth management, investment management, trading, risk, all these different sectors that we were serving. We’re now serving the whole financial ecosystem in one company, which is exciting from a market service and a growth opportunity perspective. Being able to build new kinds of analytics and really serve customers at each part of their investment life cycle is something that’s keeping us in the labs very busy and keeping us really excited.
(04:12) – Even though it’s called London Stock Exchange Group, we’re serving customers all around the world. In fact, London’s a great place to be, because you kind of have one foot in the Asia time zone and one foot in the North America time zone.
(05:41) – There are different labs doing different kinds of things. So some labs out there are really partnering with FinTechs to incubate them and grow them. We do customer research and we bring that lean startup approach, which is to build something rapidly, test it with a few customers and iterate quickly. We built a capability called the Data Science Accelerator, which combined large sample data sets with tutorials, with examples, with Jupyter notebooks.
(11:27) – We ran the survey both to understand what’s happening with the role of data science itself, but also what data scientists need. It’s an emerging industry. It’s an emerging capability. So, what do people want? What services do they need? What tools do they need? What kind of data do they need? What kind of projects are they working on? What we’ve seen in the headline of this is that data scientists within the financial sector are really on the rise in a big way.
(13:52) – The different use cases: market risk, credit risk, those are areas that traditionally had quants and sort of senior analytics managers in them. What you’ve seen is other functions: reporting and compliance, portfolio management, investment research, idea generation, trade execution, pre-trade. There are about a dozen different types of use cases. What you’ve seen data scientists having to do is not just crunch the numbers and build models, but also advise on how you should set this project up. How do we break down the business problem on the one side? So that kind of strategic direction of: How do we do this well? The strategic role of the data scientist is not just in how do I build this model, but also in how a company sets itself up to understand the end-to-end flow around it.
(16:38) – What you need for pre-trade execution is very different from what you need for risk and compliance. So data scientists are being embedded within those groups. That’s the model that we’ve seen.
(18:47) – There is definitely an evolution on the engineering side, ML operations. Operations is kind of a dirty word in engineering, even though that’s the stuff that is required to make sure things work. But with the sort of ML ops or ML engineering, we definitely see a growth there. Specialization there is difficult because you’re trying to get somebody who understands enough about data science models and stats and governance, but also is spending all day every day on the engineering side. So that’s a really interesting hybrid. It’s massive. There’s a big shortage, actually, in data engineering in the financial services industry.
(22:45) – There’s both regulation and the threat of regulation that is going to come around in these areas. That’s critical, but beyond that, there’s the ethics of AI. Is my model fair? But do I really understand my data source? Where has it come from? Has it been sourced in an ethical way?
(26:11) – What you’ve seen is the data scientist being the one who’s taking the lead in evaluating, in testing. Data scientists are saying that 83% of the time they are the ones who are involved in trialing the data. But over 50% of the time, they’re also the one who makes the final decision on the data. At the time they were involved in a third of the time, they are the one who makes the final choice.
(28:17) – That’s even accelerating that full end-to-end digitization. So when we do the survey next year, we’ll see it move even further. But the survey said that 72% of the businesses we talked to said that ML is a core component of their business strategy.
(30:24) – You’ll see NLP move front and center into the mainstream and it won’t be seen as an alternative thing or a niche thing. It’s going to be a core capability. Linking the data, enriching data, identifying outliers, filtering: all the different steps that are actually incredibly valuable. How do we get better tooling, better standards around how we work with that data? I see a lot of investment, a lot of new startups, a lot of seed capital going into that area.
(33:10) – The approach you think you have to take is perhaps moving to more of a synthetic data approach.
(37:32) – You can find the report on refinitiv.com, on our website. And there’s a labs page there as well, where you can see all the details of our different projects. refinitiv.com/mlreport2020.