The future is model-centric.

The future of decision-making and operational control is model-based.

Models of every size and complexity will permeate applications, dashboards and interfaces to support decision flow and extend operational awareness.

Recent years have seen a deep recognition of the profound value of data – “it’s the new oil”, as the saying goes – yet data is almost worthless without analysis and meaningful summarisation. Data needs to be harnessed and channelled through models to mine its real value. To amplify its value.

The advent of LLMs (like ChatGPT) has not changed this – in large organisations, LLMs will primarily act as a declarative, linguistic and interpretative layer over the top of models, data and services in the form of APIs. They are not the models themselves.

Models are immensely powerful ...

Models are powerful abstractions that can compress enormous volumes of information and predictive utility into a tiny, almost dimensionless space.

And yet, in the most superficial sense, when everything is in place, they can be ‘built’ in seconds.

Models extend operational awareness and push organizations towards optimality.

Depending on the structure, they can predict, classify, control, guide, estimate, explain, identify … and many other things.

Models shape execution.

... and yet ... surprisingly fragile.

Models don’t exist in a vacuum: they require deep infrastructure and great care for them to add value to core operations.

And you need to rebuild them. Constantly.

The central dilemma: speed-quality trade-offs.

Demand for models is soaring, far outstripping the ability of current approaches to supply them.

To meet these demands, models need to be built and deployed at far greater speeds than they are now.

However! At the same time, pressure is also increasing for models to come with safety and performance guarantees.

The world is waking up to ‘model risk’: the risks and dangers arising from poor model quality, poor modelling practices, and poor operational discipline in model use.

This fundamental trade-off between development speed and model quality is the central dilemma facing modern organisations that seek to build and use models for mission-critical applications.

We have placed ourselves directly in the path of this storm. We are building our whole company around this dilemma.

To make matters worse ...

... modelling in a regulated environment is maximally difficult.

If building predictive models wasn’t challenging enough … regulated sectors such as banking impose a large number of additional requirements beyond the model itself.

These requirements are broadly codified in law; in plain terms, they include the following:

Data standards
Data needs to meet rigorous standards in terms of coverage, volume, quality and overall representativeness.

Model architectures
Exotic, black-box model architectures are not acceptable – in fact, models need to exhibit a very high level of explainability.

Modelling process
The modelling process needs to be extensively documented – modellers need to show [1] a strong understanding of advanced parameter estimation and model diagnostics; and [2] that the model is compliant with the law.

Stakeholder approvals
A group of key stakeholders needs to review and sign off every intermediate stage of modelling and key artefacts.

Rigorous model selection processes
Selection of a final model needs to be exhaustive and systematic, using a range of methods – and special subsets of the data – to prove that the model will perform well (a short illustrative sketch follows this list).

Independent review
A dedicated function (‘Group Risk’ in many banks) reviews and approves the model to exacting standards, providing a rigorous challenge process.

Audit transparency
The overall process is subject to intensive audit at any point in time by the Group Audit function.
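
To make two of these requirements concrete – explainable architectures and validation on special subsets of the data – here is a minimal, illustrative sketch. It is not our platform: the data is synthetic, and the split and model choice are purely hypothetical.

    # Illustrative only: synthetic data, hypothetical split, not our platform.
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import roc_auc_score

    rng = np.random.default_rng(0)

    # Synthetic monthly portfolio data: two risk drivers and a binary outcome.
    n = 5_000
    X = rng.normal(size=(n, 2))
    y = (X @ np.array([1.2, -0.8]) + rng.normal(size=n) > 1.0).astype(int)
    month = rng.integers(1, 25, size=n)          # observation month 1..24

    # Out-of-time split: fit on months 1-18, validate on months 19-24.
    train, oot = month <= 18, month > 18

    # A transparent architecture: every coefficient can be inspected and documented.
    model = LogisticRegression(max_iter=1000).fit(X[train], y[train])
    print("coefficients:", model.coef_, "intercept:", model.intercept_)

    # Performance has to hold up on data from a later period the model never saw.
    auc_oot = roc_auc_score(y[oot], model.predict_proba(X[oot])[:, 1])
    print(f"out-of-time AUC: {auc_oot:.3f}")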


The resolution? Industrialise modelling.

For the ‘central dilemma’ to be resolved – especially in a regulated environment, where the constraints are high – the approach to modelling needs to be radically restructured.

The core operational concept needs to shift away from modelling as ‘deep quantitative work by irreplaceable experts’ and towards the efficient and rapid production of models to high standards.

There needs to be a more definitive split between:

   [1] complex, manually-intensive methodology work, and

   [2] model production.

Our platform is based on robust, heavily-standardised patterns of model development that do not sacrifice the flexibility that modellers expect from a development environment.

We are re-framing the modelling process, matching advanced methods of parameter estimation with equally advanced interfaces and patterns for an accelerated build process.
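
One way to picture that split is sketched below. This is a hypothetical illustration, not our platform’s actual interface: the methodology is written and reviewed once as a standardised build routine, and producing a new model becomes a small declarative specification fed into it.

    # Hypothetical sketch of the methodology / production split (not our real API).
    from dataclasses import dataclass

    @dataclass
    class ModelSpec:
        name: str
        target: str
        features: list[str]
        estimation: str = "logistic"      # methodology choices constrained ...
        validation: str = "out_of_time"   # ... to approved, pre-reviewed options

    def build(spec: ModelSpec, data) -> dict:
        """[1] Methodology: one standardised path - estimate, diagnose, document."""
        # estimation, diagnostics and documentation live here, written and
        # reviewed once by experts, then reused for every production build
        return {"spec": spec, "artefacts": ["estimates", "diagnostics", "model_doc"]}

    # [2] Production: a new model becomes a declaration, not a bespoke coding project.
    pd_model = ModelSpec(name="retail_pd_2025", target="default_12m",
                         features=["ltv", "dti", "months_on_book"])
    artefacts = build(pd_model, data=None)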

Problem: code is a bottleneck to industrialisation

Raw code is too fragile a resource type for a high-performance production task.

In fact it’s more fundamental than that … raw code is the wrong level of abstraction altogether for accelerated model production.

Heresy? Maybe.

The problem is that a typical industrial modelling project involves hundreds, if not thousands, of lines of code. And this code is spread across dozens of separate text files and notebooks. Nested inside complex folder structures.

Worse still, the key parameters governing its behaviour are physically spread throughout the code. This makes them difficult to access efficiently, and the overall modelling process difficult to control. This exceedingly common, almost invisible pattern is a recipe for error and poor efficiency.

The point here is that modelling code is a loose, open, fragmented, text-based resource. It’s a gigantic, complex error surface embedded in a colossal state space. This is not a medium conducive to accelerated model production.
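
To make the pattern concrete, here is a hypothetical illustration – the file names and parameters are invented – of behaviour-governing parameters scattered across a project, versus the same parameters held in one declarative, inspectable place.

    # Hypothetical illustration; file names and values are invented.
    # The usual shape of project code: key parameters buried in separate files.
    #
    #   prep.py:      WINSOR_PCT = 0.01
    #   features.py:  MIN_BIN_SIZE = 500
    #   train.py:     model = LogisticRegression(C=0.7, max_iter=200)
    #   select.py:    AUC_FLOOR = 0.72
    #
    # Each value silently governs behaviour; none is visible from any one place.

    # The alternative: every behaviour-governing parameter in a single,
    # versionable declaration that the rest of the code merely reads.
    PARAMS = {
        "winsor_pct":       0.01,
        "min_bin_size":     500,
        "regularisation_C": 0.7,
        "max_iter":         200,
        "auc_floor":        0.72,
    }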

And the LLM isn’t saving the day. Not yet, anyway.

Sometimes a cool-looking short-cut is actually the long way.

It doesn’t matter that LLMs can write code faster than a human. That code still has to be read and internalised by the human modeller; analysed for correctness; probably rewritten to get it running; refined to integrate it into existing project code; documented for reproducibility; and run multiple times per modelling stage by humans.

And that’s assuming companies are happy to have human modellers working with code generated by an engine that has questionable reasoning skills* and hallucinates with little provocation.

The other option, of course, is simply to give the whole thing to the parrot. Let it write AND run the code. Right. Well, if your goal was to radically increase OpRisk as part of your automation plan, then you will have chosen well.

Despite their almost miraculous power, and the genuine step up in performance and autonomy over previous model architectures … LLMs are still not an appropriate engine or platform for delegating tasks that require high precision, high transparency and high safety.

Critical observations aside, we DO believe that LLMs will ultimately play a role in modelling operations, but not necessarily in the way that’s currently projected by exuberant AI tutorials. LLMs clearly need to be paired with other elements to reach their full industrial potential.
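
As one hypothetical illustration of such a pairing (the function names and figures below are invented): the LLM interprets the request and phrases the answer, while every number comes from a deterministic, validated model service that it is only allowed to call.

    # Hypothetical pairing: LLM as the interpretative layer, validated services as
    # the source of truth. The LLM call itself is stubbed out here.

    def interpret_request(text: str) -> dict:
        """LLM layer (stubbed): map natural language to a known intent + arguments."""
        return {"intent": "score_application", "args": {"application_id": "A-1042"}}

    def score_application(application_id: str) -> float:
        """Validated, versioned model service: the only source of the actual number."""
        return 0.031  # produced by an approved model, not by the LLM

    REGISTRY = {"score_application": score_application}

    def handle(text: str) -> str:
        call = interpret_request(text)
        result = REGISTRY[call["intent"]](**call["args"])
        # The LLM may phrase the answer; it never invents the figure itself.
        return f"Estimated 12-month PD for {call['args']['application_id']}: {result:.1%}"

    print(handle("What's the risk on application A-1042?"))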

To Summarise.

Demand for models, and AI, is increasing well beyond the ability of current modelling approaches to supply these powerful assets.