We need data, but we also need to encapsulate our understanding of cities in theories, and then to represent these theories in models. Alongside ‘Nullius in verba’, we might adopt ‘Evolvere theoria et intellectum’ as a subsidiary motto: develop theory and understanding.

Can we ever have theory in the way that physicists have theory? Of course not: we have too many uncertainties at the micro-scale – in the behaviours of individuals and organisations – and these create uncertainties at broader scales. We can’t calculate ‘constants’ (parameters) to umpteen decimal places and our data points do not fit precisely onto smooth lines or curves. But if we ask the right questions, we can develop theories in relation to those questions and then, from a quantitative perspective – and there are others! – we might say that a typical error level (or measure of uncertainty) is around 10%, say. This is much better than not having the theory, particularly if it enables us to predict, at least for the short run.

A good example of the ‘right question’ in my own experience is the use of Boltzmann mathematics to estimate spatial interaction – flows from origins to destinations. This can be illustrated by a snooker analogy. A white snooker ball is fired into a pack of balls. If the question is: what are the trajectories of each of the balls following impact, then this is too complicated for Newtonian dynamics. If the question is: how many times on average does a ball hit a cushion, then Boltzmann takes over and the question can be tackled. In the social sciences, likewise, there are questions about individual behaviour that can’t be answered in models, while questions that average over behaviour to produce an aggregate quantity – the number of trips from A to B – can be handled.
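To make the aggregate question concrete, here is a minimal sketch of one standard form of an entropy-maximising spatial interaction model, the doubly-constrained version, in which the flow from origin i to destination j is T_ij = A_i O_i B_j D_j exp(-beta c_ij). The function name, the two-zone totals, the cost matrix and the value of beta are all illustrative assumptions, not data or code from any real study.

```python
import numpy as np

def doubly_constrained_flows(origins, destinations, cost, beta, iters=500, tol=1e-12):
    """Estimate flows T_ij = A_i O_i B_j D_j exp(-beta * c_ij).

    The balancing factors A_i and B_j are found by simple iteration so that
    row sums match the origin totals and column sums match the destination
    totals. The two sets of totals are assumed to have the same grand total.
    """
    O = np.asarray(origins, dtype=float)
    D = np.asarray(destinations, dtype=float)
    f = np.exp(-beta * np.asarray(cost, dtype=float))  # deterrence function
    A = np.ones_like(O)
    B = np.ones_like(D)
    for _ in range(iters):
        A_new = 1.0 / (f @ (B * D))        # enforces the origin constraints
        B_new = 1.0 / (f.T @ (A_new * O))  # enforces the destination constraints
        converged = (np.max(np.abs(A_new - A)) < tol
                     and np.max(np.abs(B_new - B)) < tol)
        A, B = A_new, B_new
        if converged:
            break
    return (A * O)[:, None] * (B * D)[None, :] * f

# Hypothetical two-zone system: trips leaving each origin, trips arriving at
# each destination, and travel costs between them (all values made up).
O_example = [100.0, 200.0]
D_example = [150.0, 150.0]
c_example = [[1.0, 2.0],
             [2.0, 1.0]]
T = doubly_constrained_flows(O_example, D_example, c_example, beta=0.5)
print(T.round(2))
```

Nothing here says which individual travels where; the model only estimates the aggregate number of trips in each origin-destination cell, consistent with the known totals and a given sensitivity to cost.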

In urban analysis, we have a pretty good qualitative understanding of much of why cities are as they are, and how they are evolving over time. Much of this understanding is captured in formal models – and indeed there is an iterative process: the understanding is deepened through the models. So what do we have? First, at a point in time – the statics. At the aggregate scale, we have pretty good demographic and economic (input-output) models based on accounts (though they are not systematically applied where they could be). At the meso scale, we have models of where people choose to live and work, and of the journey to work; we have some models of employment location; and pretty good models of transport and retail flows. There is a long and distinguished history of integrating these models into comprehensive land-use transport interaction models (LUTI models) – cf. the Lowry model.

At the micro level, there are theoretical foundations in terms of utility maximising (for individuals), profit maximising (for firms) and what might be called ‘benefit maximising’ for the public sector. These perspectives are essentially economic and need refinement through psychology, business studies and public policy disciplines for example – and to an extent this is beginning to happen.

The theory-building and model-building tasks become more difficult when the time dimension is added and we shift from statics to dynamics. Cities are nonlinear complex systems and any realistic models need to reflect this and be able to represent multiple equilibria, path dependence and phase changes. A tall order!

This very brief sketch makes theory- and model-building sound reasonably straightforward. It isn’t! There are many different approaches. Researchers usually put themselves into a particular camp (or silo) and find it very difficult to take on other perspectives. This alone makes integration difficult, and it is inherently challenging anyway. The upside of this situation, however, is that there are terrific research opportunities!

So why is it so difficult and what can we do about it?

As we saw in a previous entry (systems thinking), an important starting point is system definition; the choice of scales to be used, and of the methods to be deployed to build a model from theory and hypotheses, are critical elements of this. It is important to make these choices explicitly, and to be in a position to compare one’s own decisions with those of others. That alone can resolve many disputes about ‘what is best’: often it’s a matter of ‘difference’ in the first instance, and if this is understood, an analysis can reveal whether there is a ‘best’ or whether the approaches are in some sense complementary.

Consider economists on scale: the usual distinction is between micro economics and macro economics – individuals and organisations versus regional or national economies. From an urban perspective, what is important is the geography – a meso scale, coarser than micro, finer than macro, and showing spatial structure. Urban economists have shifted to this scale but then – perhaps because of the tradition of the micro focus? – tend to use a continuous space formulation rather than a discrete zone system. This matters, because the discrete zone system favours averaging and because the mathematics of continuous space is much more difficult to handle than that for discrete zone systems. These kinds of issues are typically not discussed explicitly or consciously.

Then consider disciplines more widely. Model building tends to be within certain traditions of particular disciplines – indeed sometimes traditions within disciplines. Consider some examples.

- The first example relates to the methods toolkit: statistics or mathematics? The former is essentially inductive – ‘let the data speak’; and the latter is deductive – ‘here is an hypothesis, and let’s test it’. Usually, in the development of theory, statistical analysis comes first followed by mathematical model building. However, in some cases, the two camps remain distinct: consider econometrics vs mathematical economics. (Some researchers probably don’t even realise that they are in a camp – or tribe!)
- There can be too much focus on optimisation – maximisation of utility or profits – and the associated models are then unrealistic. What is needed is some way of ‘blurring’ – becoming realistically and plausibly sub-optimal. This is what entropy maximising does; alternatively, and often equivalently, the same effect can be achieved by continuing to maximise utility but adding assumptions about the distribution of utility functions.
- Economists have not been good at urban dynamics. Brian Arthur (cf. entry on combinatorial evolution) notes this and recommends the work of economic historians in this respect!
- Some disciplines have largely lost touch with the quantitative approaches that were once at their core – geography being a prime example, and possibly sociology, though that always had less of the quantitative.
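
The ‘blurring’ point in the list above can be sketched numerically. In the snippet below, the share of trips going to each destination is proportional to exp(-beta × cost), the form that entropy maximising produces and that also arises as the multinomial logit model when utility is maximised with a suitably distributed random term. The function name, the three costs and the beta values are assumptions chosen purely for illustration.

```python
import numpy as np

def destination_shares(costs, beta):
    """Share of trips to each destination, proportional to exp(-beta * cost).

    A large beta approaches strict cost minimisation (everyone picks the
    cheapest option); beta = 0 ignores cost entirely; values in between
    'blur' the optimum into a plausible spread of choices.
    """
    c = np.asarray(costs, dtype=float)
    w = np.exp(-beta * (c - c.min()))  # subtract the minimum for numerical stability
    return w / w.sum()

costs = [1.0, 1.5, 3.0]                            # hypothetical travel costs
sharp = destination_shares(costs, beta=50.0)       # close to pure optimisation
blurred = destination_shares(costs, beta=1.0)      # realistically sub-optimal
flat = destination_shares(costs, beta=0.0)         # cost plays no role

for label, shares in [("sharp", sharp), ("blurred", blurred), ("flat", flat)]:
    print(label, shares.round(3))
```

The parameter beta then becomes something to be estimated from data rather than assumed, which is one way in which the statistical and mathematical toolkits meet.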

So what conclusions can we draw?

- ‘Data science’ should not develop independently of theory.
- There has been much progress in theory- and model-building but the field is very fragmented, both through the variety of system definition decisions and through researchers working within disciplinary silos.
- There is a need to understand this variety and to integrate where possible. This is a good way of exposing new research opportunities.
- More researchers should be encouraged to engage with theory and model building. It is too often taken as something that is given. Someone once came up to me after a talk and asked me where I got my equations from. I said that they were my equations – I’d invented them. My questioner repeated the question, and simply didn’t understand that this was possible: ‘equations came from books etc’.

We do need to reflect on divisions of labour in research. Theoretical physicists probably form quite a small proportion of physicists as a whole. The same is probably going to be true of theorists in any field, but in our case, I am arguing that the proportion is almost certainly too small!

Alan Wilson, April 2015