Explanatory variables for predictive models

Question

Closed this issue a year ago · 2 comments

A post to keep track of the variables to be used in the predictive models.

Answer 1 · 2022-11-29T10:54:13.000Z

Copernicus Urban Atlas: https://land.copernicus.eu/local/urban-atlas/urban-atlas-2018

Answer 2 · 2023-01-20T15:15:11.000Z

I am trying to select variables that I assume may have explanatory power in predicting one or more of our indicators and that are, at the same time, relatively easy to model. This can serve as a first selection to kickstart the discussion.

Data used in the Urban Grammar

population (ONS estimates)
workplace population (per aggregated class as used in UG)
CORINE Land cover classification
- here we will be mostly interested in a subset
  - Industrial or commercial units
  - Sport and leisure facilities
  - Green urban areas
  - Discontinuous urban fabric
  - Continuous urban fabric
  - Construction sites
  - Non-irrigated arable land
  - Pastures
  - plus some forest classes that could potentially be aggregated together
points of interest
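
A minimal sketch of how the CORINE subset above could be prepared, assuming each spatial unit comes with a mapping from class name to its share of the unit's area. The function name `aggregate_land_cover`, the choice of the three level-3 forest classes to merge, and the catch-all `Other` bucket are assumptions made for illustration, not something decided in this thread.

```python
from collections import defaultdict

# Subset of CORINE classes named in the list above.
KEPT_CLASSES = {
    "Industrial or commercial units",
    "Sport and leisure facilities",
    "Green urban areas",
    "Discontinuous urban fabric",
    "Continuous urban fabric",
    "Construction sites",
    "Non-irrigated arable land",
    "Pastures",
}

# CORINE level-3 forest classes; merging exactly these three is an assumption.
FOREST_CLASSES = {"Broad-leaved forest", "Coniferous forest", "Mixed forest"}


def aggregate_land_cover(shares):
    """Collapse per-class area shares into the subset used for modelling.

    `shares` maps a CORINE class name to its share of a spatial unit's area.
    Forest classes are summed into "Forest"; classes outside the subset are
    summed into "Other" (a hypothetical catch-all so shares still add up).
    """
    out = defaultdict(float)
    for name, share in shares.items():
        if name in FOREST_CLASSES:
            out["Forest"] += share
        elif name in KEPT_CLASSES:
            out[name] += share
        else:
            out["Other"] += share
    return dict(out)


# Made-up shares for a single spatial unit.
example = {
    "Continuous urban fabric": 0.5,
    "Broad-leaved forest": 0.2,
    "Mixed forest": 0.1,
    "Water bodies": 0.2,
}
result = aggregate_land_cover(example)
```

Keeping an explicit `Other` bucket means the shares of each unit still sum to one, which makes it easier to spot units dominated by classes outside the subset.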

Using individual morphometric characters would be really tricky in the modelling part, but there is a trick we can use here. If we design scenarios of the form "I would like to have that signature type here instead of this one", we can use the overall values characterising that signature as an input for prediction. That way we would be able to use form (which I know can predict house prices, at least) without having to bother with actual morphometric processing during the modelling phase. We can either take the means we have now or extract some probability sampling function from the tessellation-level data to get some variation. If that seems reasonable, we can look at a subset of morphometric characters to work with.
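
The scenario idea above can be sketched roughly as follows. Everything here is hypothetical: the signature names (`type_a`, `type_b`), the two morphometric characters, the numbers, and the function names are placeholders, and the "probability sampling function" is approximated by simple resampling from the tessellation-level values, which is only one possible choice.

```python
import random
from statistics import mean

# Hypothetical tessellation-level values per signature type (made-up numbers).
TESSELLATION_VALUES = {
    "type_a": {"building_area": [120.0, 150.0, 180.0], "street_width": [8.0, 10.0, 12.0]},
    "type_b": {"building_area": [400.0, 500.0, 600.0], "street_width": [18.0, 20.0, 22.0]},
}


def signature_means(signature):
    """Mean of each morphometric character for the given signature type."""
    return {char: mean(vals) for char, vals in TESSELLATION_VALUES[signature].items()}


def signature_sample(signature, rng):
    """Draw one observed value per character from the tessellation-level data."""
    return {char: rng.choice(vals) for char, vals in TESSELLATION_VALUES[signature].items()}


def apply_scenario(cell, new_signature, rng=None):
    """Return a copy of `cell` with its form variables swapped to `new_signature`.

    Without `rng` the signature means are used; with `rng` the values are
    resampled, which gives some variation between scenario runs.
    """
    if rng is None:
        form = signature_means(new_signature)
    else:
        form = signature_sample(new_signature, rng)
    updated = dict(cell)
    updated["signature"] = new_signature
    updated.update(form)
    return updated


# A cell currently classified as type_a, with one non-form variable.
cell = {"signature": "type_a", "population": 950, **signature_means("type_a")}

scenario_mean = apply_scenario(cell, "type_b")                        # means only
scenario_draw = apply_scenario(cell, "type_b", rng=random.Random(0))  # with variation
```

Note that this sketch samples each character independently, so it ignores any correlation between characters within a signature. Sampling whole tessellation cells instead would preserve that correlation.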

From SPC

population (or from ONS estimates/census)
households
age
socioeconomic class