/fetcha

Talk to SSB using Python.

Primary LanguagePythonGNU General Public License v3.0GPL-3.0

fetcha

Talk to SSB using Python.

import fetcha as fetcha
import logging
# Turn off INFO-warnings
logging.getLogger().setLevel(logging.WARNING)

Installation

# >> pip install git+https://github.com/dafeda/fetcha.git --upgrade
# Instantiate object with specific table_id that refers to a SSB-table.
# 10945 refers to Monetary aggregates M1, M2 and M3:
# https://www.ssb.no/en/statbank/table/10945
ssb_10945 = fetcha.SSB("10945", language="en")
# Number of rows in table.
ssb_10945.nrows_tot()
1422
# Number of rows per period.
ssb_10945.nrows_period()
9
# Get all available periods
periods = ssb_10945.periods()
periods[-7:]
['2020M08', '2020M09', '2020M10', '2020M11', '2020M12', '2021M01', '2021M02']
# Fetch latest period.
# Returns a pandas dataframe with its index set with verify_integrity set to True.
# If the dataframe is lacking an index, it means that the index columns do not make up a unique combination.
df_latest = ssb_10945.fetch()
df_latest.head()
value
contents month
Monetary aggregate M1. Stocks (NOK million) 2021M02 2526071.0
Monetary aggregate M2. Stocks (NOK million) 2021M02 2695383.0
Monetary aggregate M3. Stocks (NOK million) 2021M02 2697783.0
Monetary aggregate M1. Transactions last 12 months (NOK million) 2021M02 359029.0
Monetary aggregate M2. Transactions last 12 months (NOK million) 2021M02 343687.0
# Fetch list of periods
df_periods = ssb_10945.fetch(["2019M12", "2020M01", "2020M02"])
df_periods.head()
value
contents month
Monetary aggregate M1. Stocks (NOK million) 2019M12 2159770.0
2020M01 2182450.0
2020M02 2175681.0
Monetary aggregate M2. Stocks (NOK million) 2019M12 2345545.0
2020M01 2364841.0
# Fetch whole year of data
df_year = ssb_10945.fetch("2020")
df_year.head()
value
contents month
Monetary aggregate M1. Stocks (NOK million) 2020M01 2182450.0
2020M02 2175681.0
2020M03 2300443.0
2020M04 2340381.0
2020M05 2374607.0
# Fetch multiple years
df_years = ssb_10945.fetch(["2019", "2020"])
df_year.head()
value
contents month
Monetary aggregate M1. Stocks (NOK million) 2020M01 2182450.0
2020M02 2175681.0
2020M03 2300443.0
2020M04 2340381.0
2020M05 2374607.0
# Reset index before pivoting
df_year = df_year.reset_index().pivot(index="month", columns="contents")
df_year.head()
value
contents Monetary aggregate M1. 12-month growth (per cent Monetary aggregate M1. Stocks (NOK million) Monetary aggregate M1. Transactions last 12 months (NOK million) Monetary aggregate M2. 12-month growth (per cent) Monetary aggregate M2. Stocks (NOK million) Monetary aggregate M2. Transactions last 12 months (NOK million) Monetary aggregate M3. 12-month growth (per cent) Monetary aggregate M3. Stocks (NOK million) Monetary aggregate M3. Transactions last 12 months (NOK million)
month
2020M01 3.1 2182450.0 66236.0 3.9 2364841.0 87622.0 3.7 2368402.0 84912.0
2020M02 3.2 2175681.0 66037.0 3.8 2360484.0 86360.0 3.7 2364033.0 83138.0
2020M03 7.0 2300443.0 148469.0 7.5 2489403.0 170692.0 7.3 2492801.0 167960.0
2020M04 9.8 2340381.0 205486.0 9.5 2522315.0 216155.0 9.4 2525731.0 214558.0
2020M05 10.9 2374607.0 232311.0 10.2 2552508.0 234581.0 10.1 2555817.0 232003.0
ssb_10948 = fetcha.SSB("10948", language="en")
df_10948 = ssb_10948.fetch("2020")
df_10948.head()
value
holding sector contents month
Money holding sector Monetary aggregate M3. Stocks, seasonally adjusted (NOK million) 2020M01 2374459.0
2020M02 2387955.0
2020M03 2499994.0
2020M04 2543868.0
2020M05 2580435.0
# Fetch and join
# Get another table so we have something to join with.
ssb_10948 = fetcha.SSB("10948", language="en")
df_10948 = ssb_10948.fetch("2020")
df_10948 = df_10948.reset_index().pivot_table(
    index="month", columns="contents", aggfunc="mean"
)

df_10948.join(df_year).head()
value
contents 1-month growth, seasonally adjusted (per cent) Monetary aggregate M3. Stocks, seasonally adjusted (NOK million) Transactions last month, seasonally adjusted (NOK million) Monetary aggregate M1. 12-month growth (per cent Monetary aggregate M1. Stocks (NOK million) Monetary aggregate M1. Transactions last 12 months (NOK million) Monetary aggregate M2. 12-month growth (per cent) Monetary aggregate M2. Stocks (NOK million) Monetary aggregate M2. Transactions last 12 months (NOK million) Monetary aggregate M3. 12-month growth (per cent) Monetary aggregate M3. Stocks (NOK million) Monetary aggregate M3. Transactions last 12 months (NOK million)
month
2020M01 -10.02 949783.6 -1329.8 3.1 2182450.0 66236.0 3.9 2364841.0 87622.0 3.7 2368402.0 84912.0
2020M02 3.32 955182.0 3182.0 3.2 2175681.0 66037.0 3.8 2360484.0 86360.0 3.7 2364033.0 83138.0
2020M03 541.24 999997.4 38556.4 7.0 2300443.0 148469.0 7.5 2489403.0 170692.0 7.3 2492801.0 167960.0
2020M04 19.36 1017547.0 18928.0 9.8 2340381.0 205486.0 9.5 2522315.0 216155.0 9.4 2525731.0 214558.0
2020M05 14.82 1032174.2 17398.2 10.9 2374607.0 232311.0 10.2 2552508.0 234581.0 10.1 2555817.0 232003.0
# SSB has a limit of 300k rows per transaction.
# Some tables have more than that in one period.
ssb_10261 = fetcha.SSB("10261", language="en")
# Gives warning and returns None.
df_10261 = ssb_10261.fetch()
WARNING:fetcha.ssb:Query exceeds SSB limit of 300k rows per transaction. Current query tries to fetch 607104 rows. User a filter
# Can pass filter to fetch(), but first we need to choose what we want.
# Use variable levels to see which options you have.
ssb_10261.levels
0    {'code': 'Region', 'text': 'region', 'values':...
1    {'code': 'Kjonn', 'text': 'sex', 'values': ['0...
2    {'code': 'Alder', 'text': 'age', 'values': ['9...
3    {'code': 'Diagnose3', 'text': 'diagnosis: Chap...
4    {'code': 'ContentsCode', 'text': 'contents', '...
5    {'code': 'Tid', 'text': 'year', 'values': ['20...
Name: variables, dtype: object
# We limit the region to "The whole country".
ssb_10261.levels.iloc[0]
{'code': 'Region',
 'text': 'region',
 'values': ['0',
  '30',
  '01',
  '02',
  '03',
  '34',
  '04',
  '05',
  '06',
  '38',
  '07',
  '08',
  '42',
  '09',
  '10',
  '11',
  '46',
  '12',
  '14',
  '15',
  '50',
  '16',
  '17',
  '18',
  '54',
  '19',
  '20',
  'F00',
  '9',
  'H03',
  'H04',
  'H05',
  'H12',
  'Uoppgitt'],
 'valueTexts': ['The whole country',
  'Viken',
  'Østfold (-2019)',
  'Akershus (-2019)',
  'Oslo',
  'Innlandet',
  'Hedmark (-2019)',
  'Oppland (-2019)',
  'Buskerud (-2019)',
  'Vestfold og Telemark',
  'Vestfold (-2019)',
  'Telemark (-2019)',
  'Agder',
  'Aust-Agder (-2019)',
  'Vest-Agder (-2019)',
  'Rogaland',
  'Vestland',
  'Hordaland (-2019)',
  'Sogn og Fjordane (-2019)',
  'Møre og Romsdal',
  'Trøndelag - Trööndelage',
  'Sør-Trøndelag (-2017)',
  'Nord-Trøndelag (-2017)',
  'Nordland',
  'Troms og Finnmark - Romsa ja Finnmárku',
  'Troms - Romsa (-2019)',
  'Finnmark - Finnmárku (-2019)',
  'Total',
  'Uoppgitt',
  'Helseregion Vest',
  'Helseregion Midt-Norge',
  'Helseregion Nord',
  'Helseregion Sør-Øst',
  'Unknown'],
 'elimination': True}
fltr = [{"code": "Region", "values": ["0"]}]
df_10261 = ssb_10261.fetch(fltr=fltr)
df_10261.shape
(17856, 1)
df_10261.sample(10)
value
region sex age diagnosis: Chapter in ICD-10 contents year
The whole country Males 40-59 years Influenza and pneumonia Number of day cases 2019 76.0
Females 20-39 years Injuries of upper extremities Patients with day cases 2019 613.0
60-69 years CONGENITAL MALFORMATIONS Number of bed-days 2019 290.0
Both sexes 20-39 years Cardiac dysrhythmias Patients with day cases 2019 276.0
PREGNANCY, CHILDBIRTH AND THE PUERPERIUM Number of out-patient consultations 2019 109637.0
Females 60-69 years Glaucoma Number of day cases 2019 253.0
Both sexes Years, total Other maternal disorders predominantly related to pregnancy In-patients 2019 1478.0
60-69 years Diabetes mellitus In-patients 2019 488.0
70-79 years Other diseases of oesophagus, stomach and duodenum Number of bed-days 2019 3265.0
0-9 years Malignant neoplasms of female genital organs Out-patients 2019 1.0