Data, metadata, digital vector boundaries, and functionalities for querying and mapping related to the UK Census 2021.
As of Jan-23, data at small area level are available only for England and Wales.
The package at the moment only contains metadata, organized data, geographic references, and boundaries. Other functionalities will (hopefully) follow.
Being more than 3GB in size, `RcensusUK` is not available on CRAN, and probably never will be.
You can only install it from here using:
# install.packages('devtools')
devtools::install_github('lvalnegri/RcensusUK')
All data values from the Census are contained in data.tables with lower-case names starting with `dt_`. All other lower-case objects are metadata and geography data.tables, while upper-case objects are all boundaries in sf format.
Unless stated otherwise (see the list below or the table `tables`), the data included in each table are available only at the smallest level, the Output Area (`OA`). Functions to query the data are still being worked on; see below for how to get data for the upper levels.
For detailed information about all the included variables, see the table `vars`, while in the `summaries` table you can find the Nation's and Countries' totals.
The table `vars_refs` gives, for each lower-level variable, its upper-level reference variable(s), so that proper proportions and rates can be computed. The table `main_refs` is also needed if you want all the included data to square exactly, which is probably unnecessary for most purposes.
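As a minimal sketch of how a reference table can be used to build proportions, the toy example below mimics the idea with made-up data. The column names (`OA`, `var_id`, `value`, `ref_id`) and the variable ids are assumptions for illustration only; check `vars` and `vars_refs` for the actual structure.

```r
library(data.table)

# Toy stand-in for a dt_ table (hypothetical columns and ids)
dt_toy <- data.table(
  OA     = rep(c("E00000001", "E00000002"), each = 3),
  var_id = rep(c("v_total", "v_male", "v_female"), times = 2),
  value  = c(100, 40, 60, 200, 110, 90)
)

# Toy stand-in for vars_refs: each variable with its reference (denominator) variable
refs_toy <- data.table(
  var_id = c("v_male", "v_female"),
  ref_id = c("v_total", "v_total")
)

# Attach the reference id to each lower-level variable (inner join drops v_total)
num <- merge(dt_toy, refs_toy, by = "var_id")

# Denominator values, keyed by area and reference variable
den <- dt_toy[, .(OA, ref_id = var_id, ref_value = value)]

# Join numerator and denominator, then compute the percentage
out <- merge(num, den, by = c("OA", "ref_id"))
out[, pct := 100 * value / ref_value]
```

Each row of `out` then carries a variable, its reference value, and the resulting percentage for one area.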
The following lists display:
- the domains into which all univariate data are partitioned, at a somewhat finer degree than the ones set out by the ONS
- the name of the associated `dt_` data.table in the package
- the list of corresponding ONS tables in the domain
- the smallest available level when different from `OA`, one of [`MSOA`] or [`LTLA`]
- TS001 Number of usual residents
- TS002 Legal Partnership Status
- TS006 Population density
- TS003 Household composition
- TS017 Household size
- TS010 Living arrangements [`MSOA`]
- TS011 Households by deprivation dimensions
- TS007 Age [`MSOA`]
- TS008 Sex
- TS009 Sex by Age [`LTLA`]
- TS004 Country of birth
- TS012 Country of birth (detailed) [`LTLA`]
- TS005 Passports held
- TS013 Passports held (detailed) [`MSOA`]
- TS015 Year of arrival in UK
- TS018 Age of arrival in the UK
- TS016 Length of residence
- TS019 Migrant Indicator
- TS020 Number of non-UK short-term residents by sex
- TS041 Number of Households
- TS021 Ethnic group
- TS022 Ethnic group (detailed) [`MSOA`]
- TS023 Multiple ethnic group
- TS027 National identity - UK
- TS028 National identity (detailed) [`MSOA`]
- TS024 Main language (detailed) [`LTLA`]
- TS025 Household language
- TS029 Proficiency in English
- TS026 Multiple main languages in households
- TS030 Religion
- TS031 Religion (detailed) [`MSOA`]
- TS075 Multi Religion households
- TS066 Economic Activity Status
- TS063 Occupation
- TS064 Occupation - minor groups [`MSOA`]
- TS062 NS-SeC
- TS060 Industry [`MSOA`]
- TS059 Hours worked
- TS065 Unemployment History
- TS058 Distance travelled to Work
- TS061 Method of Travel to Work
- TS044 Accommodation type
- TS054 Tenure
- TS051 Number of rooms
- TS053 Occupancy rating for rooms
- TS050 Number of bedrooms
- TS052 Occupancy rating for bedrooms
- TS046 Central heating
- TS045 Car or van availability
- TS077 Sexual orientation [`MSOA`]
- TS079 Sexual orientation (detailed) [`LTLA`]
- TS078 Gender identity [`MSOA`]
- TS070 Gender identity (detailed) [`LTLA`]
- TS067 Highest level of qualification
- TS068 Schoolchildren and full-time students
- TS037 General health
- TS038 Disability
- TS039 Provision of unpaid care
- TS040 Number of disabled people in household
As detailed above, the data are provided only for the lowest level available from the ONS, which is usually the smallest area, `OA`. Using the table `lookups` for the matches and the table `zone_types` for the correct parent and child levels, the correct figures for all the Zones in the upper levels can be obtained by simple summation.
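A minimal sketch of this aggregation, using toy tables rather than the package objects. The column names (`OA`, `LSOA`, `var_id`, `value`) are assumptions for illustration; the actual `lookups` and `dt_` tables may be organised differently.

```r
library(data.table)

# Toy stand-in for the lookups table (hypothetical columns)
lookups_toy <- data.table(
  OA   = c("E00000001", "E00000002", "E00000003"),
  LSOA = c("E01000001", "E01000001", "E01000002")
)

# Toy stand-in for a dt_ table at OA level (hypothetical columns)
dt_toy <- data.table(
  OA     = c("E00000001", "E00000002", "E00000003"),
  var_id = "v_total",
  value  = c(100, 200, 50)
)

# Attach the parent code to each OA, then sum by parent zone and variable
lsoa_toy <- merge(dt_toy, lookups_toy, by = "OA")[
  , .(value = sum(value)), by = .(LSOA, var_id)
]
```

The same pattern applies to any upper level, provided the parent column is a valid parent of the level the data are stored at.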
All names and geographic characteristics for each Zone in all hierarchies are listed in the `zones` table. In particular:
- `LSOA` and `MSOA` names are not the originals from the ONS, but they correspond to the more readable format released by the House of Commons Library
- the coordinates listed as `w` are the simple weighted centroids, calculated using the total population in the 30-metre micro grid provided by the Meta Data For Good project
- the coordinates listed as `p` describe the Visual Center or Pole of Inaccessibility, "the location in a geographical area that is the furthest away from all its borders". This point is useful in mapping as it is often the best location to put a text label or a tooltip on a polygon to minimize the risks of overlapping and improve readability.
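To illustrate what a population-weighted centroid is, here is a small sketch on made-up grid cells. The columns (`zone`, `x`, `y`, `pop`) are hypothetical, and this is not necessarily the exact procedure used to build the `w` coordinates in the package.

```r
library(data.table)

# Toy grid cells (hypothetical columns): zone code, cell coordinates, cell population
cells <- data.table(
  zone = c("A", "A", "A", "B", "B"),
  x    = c(0, 10, 20, 100, 110),
  y    = c(0, 0, 30, 50, 50),
  pop  = c(10, 30, 10, 5, 15)
)

# Population-weighted mean of the cell coordinates within each zone
w_toy <- cells[
  , .(wx = weighted.mean(x, pop), wy = weighted.mean(y, pop)), by = zone
]
```

Averaging coordinates this way is only an approximation when they are expressed in longitude and latitude, although the error is small for zones of limited extent.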
Because `WARD` and, to a greater extent, `PAR` zones are sometimes smaller than an `OA`, some of them are missing from the lookups: when more than one Zone falls within a single `OA`, the `lookups` table references only the Zone covering the largest area, and the `missing` table lists the other conflicting Zones. This also implies that the Census values for these Zones are not exact (for completeness, I will probably add a general table with the correct values for these Zones once the ONS publishes them).
The package also contains the following self-describing tables, mainly for search purposes in applications:
- `postcodes`, which includes the lookups between the 2.620 million postcode units (`PCU`) in the UK and the `OA`. For more details see the ONS Postcode Directory on the ONS Geography Portal, the official publisher of the lookups, or my other package RpostcodesUK, which also includes lookups and boundaries for the upper levels of the Postal hierarchy: Postcode Sectors, Postcode Districts, Post Towns, Postcode Areas.
- `localities`, a list of all Places in England and Wales. For more details see the Index of Place Names.
Finally, the table `neighbours` contains all the adjacent Zones for each Zone in each hierarchy, as an aid for some simple spatial analysis.
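As an example of the kind of simple spatial analysis an adjacency table allows, the sketch below computes, for each zone, the mean value of its neighbours. Both tables are toy data and the column names (`zone`, `neighbour`, `value`) are assumptions, not the actual layout of `neighbours`.

```r
library(data.table)

# Toy adjacency list for three zones in a row: A - B - C (hypothetical columns)
neigh_toy <- data.table(
  zone      = c("A", "B", "B", "C"),
  neighbour = c("B", "A", "C", "B")
)

# Toy values for each zone
vals_toy <- data.table(zone = c("A", "B", "C"), value = c(10, 20, 60))

# Attach to each (zone, neighbour) pair the neighbour's value,
# then average those values by zone
lag_toy <- merge(neigh_toy, vals_toy, by.x = "neighbour", by.y = "zone")[
  , .(neighbour_mean = mean(value)), by = zone
]
```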
The package includes quite a few Digital Vector Boundaries. They carry only the ONS code of each Zone, so they need to be joined with data and metadata from some of the above tables.
They can be partitioned into the following two main hierarchies:
The Census Hierarchy consists of the following items, each nesting perfectly into the next from the lowest level `OA` up to `CTRY`:
- `OA` Output Areas (n = 178,605 + 10,275 = 188,880)
- `LSOA` Lower-Layer Super Output Areas (n = 33,755 + 1,917 = 35,672)
- `MSOA` Middle-Layer Super Output Areas (n = 6,856 + 408 = 7,264)
- `LTLA` Lower-Tier Local Authorities (n = 309 + 22 = 331)
- `UTLA` Upper-Tier Local Authorities (n = 152 + 22 = 174)
- `RGN` Regions (n = 9 + 1 = 10)
- `CTRY` Countries (n = 2)
Building the boundaries starts with downloading the original OA boundaries from the ONS in two formats, Full Clipped EW (BFC) and Generalised Clipped EW (BGC). The former is kept as is for geographical operations, while the latter is simplified at 20% and included in the package after being transformed to the WGS84 reference system (EPSG:4326).
The other boundaries are obtained by dissolving this simplified version of the Output Areas, using the lookup tables OA to LSOA and MSOA, and OA to LTLA to UTLA to RGN to CTRY.
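The dissolve and reprojection steps can be sketched with `sf` on toy polygons. The squares, codes, and the `rect()` helper below are made up for illustration, the simplification step is left out, and this is not necessarily how the package boundaries were actually produced.

```r
library(sf)

# Hypothetical helper: an axis-aligned rectangle as an sf polygon
rect <- function(x0, y0, w, h) {
  st_polygon(list(rbind(
    c(x0, y0), c(x0 + w, y0), c(x0 + w, y0 + h), c(x0, y0 + h), c(x0, y0)
  )))
}

# Four toy "Output Areas" in British National Grid (EPSG:27700),
# each already tagged with the code of its parent zone
oa_toy <- st_sf(
  OA   = c("a", "b", "c", "d"),
  LSOA = c("L1", "L1", "L2", "L2"),
  geometry = st_sfc(
    rect(530000, 180000, 100, 100), rect(530100, 180000, 100, 100),
    rect(530000, 180100, 100, 100), rect(530100, 180100, 100, 100),
    crs = 27700
  )
)

# Dissolve: union the geometries of all OAs sharing the same parent code
lsoa_bng <- do.call(rbind, lapply(split(oa_toy, oa_toy$LSOA), function(g) {
  st_sf(LSOA = g$LSOA[1], geometry = st_union(st_geometry(g)))
}))

# Reproject the dissolved boundaries to WGS84
lsoa_wgs <- st_transform(lsoa_bng, 4326)
```

Dissolving in the projected system and reprojecting afterwards keeps the union a plain planar operation.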
This is a hierarchy with elements not directly related to ONS Census products, but still important for some purposes. The spatial operation used to build the lookups against the Output Areas is max area in polygon: the `OA` boundaries are overlaid on the hierarchy and each OA is associated with the Zone that covers most of its area. This is considered a safer process than the more classic point in polygon operation, because the centroid of a zone in the finer hierarchy does not necessarily fall inside the corresponding zone.
- `PCON` Westminster Parliamentary Constituencies (n = 533 + 40 = 573), last updated Dec-21
- `WARD` Wards (n = 6,904 + 762 = 7,666, with 14 missing in the lookups in England only), last updated Dec-22
- `PAR` Parished and Non Civil Parished Areas (n = 10,689 + 878 = 11,567, with 1,022 missing in the lookups in England only), last updated Dec-22
- `CCG` Clinical Commissioning Group, as `LOC` Sub Integrated Care Board Locations in England (n = 106), last updated Jul-22, and Local Health Boards in Wales (n = 7), last updated Apr-22, for a total of 113
Note that these are NOT official boundaries. The child level listed in the `zone_types` table nests exactly in each Zone, which in turn nests exactly in its parent (at least with respect to the codes included in the `lookups` table), but some statistical error is involved when dealing with them, in particular for `PAR` and, to a lesser extent, `WARD`.
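The max area in polygon operation described above can be sketched with `sf::st_join()` and its `largest = TRUE` argument. The geometries, codes, and the `rect()` helper are toy assumptions, and this is not necessarily the exact code used to build the package lookups.

```r
library(sf)

# Hypothetical helper: an axis-aligned rectangle as an sf polygon
rect <- function(x0, y0, w, h) {
  st_polygon(list(rbind(
    c(x0, y0), c(x0 + w, y0), c(x0 + w, y0 + h), c(x0, y0 + h), c(x0, y0)
  )))
}

# Two toy Output Areas side by side (projected CRS, so areas are planar)
oa_toy <- st_sf(
  OA = c("a", "b"),
  geometry = st_sfc(rect(0, 0, 100, 100), rect(100, 0, 100, 100), crs = 27700)
)

# Three toy wards that do not align with the OAs; W3 is a thin strip inside OA "b"
ward_toy <- st_sf(
  WARD = c("W1", "W2", "W3"),
  geometry = st_sfc(
    rect(0, 0, 130, 100), rect(130, 0, 60, 100), rect(190, 0, 10, 100),
    crs = 27700
  )
)

# Each OA takes the ward covering the largest share of its area
lk_toy <- suppressWarnings(st_join(oa_toy, ward_toy, largest = TRUE))

# Wards that never win an OA end up outside the lookup
missing_toy <- setdiff(ward_toy$WARD, lk_toy$WARD)
```

In this toy case the strip `W3` never covers the largest share of any OA, which is the situation the `missing` table is meant to record.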
- Contains OS data © Crown copyright and database rights 2023
- Source: Office for National Statistics licensed under the Open Government Licence v.3.0
- Contains Parliamentary information licensed under the Open Parliament Licence v3.0
- Facebook Connectivity Lab and Center for International Earth Science Information Network - CIESIN - Columbia University. 2016. High Resolution Settlement Layer (HRSL). Source imagery for HRSL © 2016 DigitalGlobe. Accessed 15 Dec 2022.