ipeaGIT/geobr

A way to convert `code_muni` into any other higher level aggregations

rafalopespx opened this issue · 17 comments

Hello there,

Is there a quick way to find out, from a code_muni, which health region (macro or micro) the municipality belongs to?

This is tricky if you rely only on the code. Although code_muni carries state- and region-level information, no part of it maps to a code_health_region or code_health_macrorregion; you cannot simply take the first 5 digits of code_muni.

A workaround that I developed is using one of the relational tables from dataSUS to add the code_health_region and/or code_health_macrorregion to a data.frame with code_muni
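A minimal sketch of that workaround in base R, assuming the relational table has already been read into a data.frame. The object names and the health-region codes below are hypothetical placeholders, not real DataSUS values; only the code_muni values are real IBGE codes.

```r
# Municipalities identified by their 7-digit IBGE code
munis <- data.frame(
  code_muni = c(1100015L, 1100023L, 1100031L),
  name_muni = c("Alta Floresta D'Oeste", "Ariquemes", "Cabixi")
)

# HYPOTHETICAL relational table: these health-region codes are made up
muni_to_health <- data.frame(
  code_muni = c(1100015L, 1100023L, 1100031L),
  code_health_region = c("99001", "99002", "99003")
)

# The state code can be read off the first two digits of code_muni...
munis$code_state <- substr(as.character(munis$code_muni), 1, 2)

# ...but the health region needs an explicit lookup: a left join on code_muni
out <- merge(munis, muni_to_health, by = "code_muni", all.x = TRUE)
out
```

With all.x = TRUE, a municipality missing from the relational table is kept with NA in code_health_region instead of being dropped silently.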

Maybe this could be implemented in the read_municipality function, as a parameter the user passes to return codes other than code_muni.

Thanks for the amazing package!

Adding to this: DataSUS's FTP has a registry of municipalities that were moved from one health region to another, as well as a table that records these changes and relates health regions to municipalities by code_muni and code_health_region.

I have the relational database and can send it to anyone who works on this conversion; just email me.

Hi @rafalopespx . Thanks for opening this issue. I like the idea of a table that associates each code_muni to the codes of other geographical units. Currently, the closest we have to this is this:


df <- geobr::lookup_muni(name_muni = 'all')

head(df)

#>   code_muni             name_muni code_state name_state abbrev_state code_micro
#> 1   1100015 Alta Floresta D'Oeste         11   Rondônia           RO      11006
#> 2   1100023             Ariquemes         11   Rondônia           RO      11003
#> 3   1100031                Cabixi         11   Rondônia           RO      11008
#> 4   1100049                Cacoal         11   Rondônia           RO      11006
#> 5   1100056            Cerejeiras         11   Rondônia           RO      11008
#> 6   1100064     Colorado do Oeste         11   Rondônia           RO      11008
#>          name_micro code_meso         name_meso code_immediate name_immediate
#> 1            Cacoal      1102 Leste Rondoniense         110005         Cacoal
#> 2         Ariquemes      1102 Leste Rondoniense         110002      Ariquemes
#> 3 Colorado do Oeste      1102 Leste Rondoniense         110006        Vilhena
#> 4            Cacoal      1102 Leste Rondoniense         110005         Cacoal
#> 5 Colorado do Oeste      1102 Leste Rondoniense         110006        Vilhena
#> 6 Colorado do Oeste      1102 Leste Rondoniense         110006        Vilhena
#>   code_intermediate name_intermediate
#> 1              1102         Ji-Paraná
#> 2              1101       Porto Velho
#> 3              1102         Ji-Paraná
#> 4              1102         Ji-Paraná
#> 5              1102         Ji-Paraná
#> 6              1102         Ji-Paraná

We could probably try to add the code of health regions to this output. Note however, that this output refers to the year 2010. I'm planning to update the function to include the Census 2022 soon.

Hi everyone! I am working on something similar here, trying to make all Brazilian Territorial Divisions (DTB) from IBGE compatible.

https://github.com/rfsaldanha/rdtb

It is at a very early stage of development, but the goal may be to track municipality changes over time and space.

Hi @rfsaldanha , thanks for the ping. It looks like you are trying to create a correspondence table for each year. Correct?

Yes, to have the corresponding DTB for each year. The problem is that the official IBGE DTB does not agree with the also official IBGE spatial dataset of municipalities of the same year. :-(

That's a problem, indeed. Which one should we trust, the spatial data or the table data?

I think the spatial data is more widely used, and therefore more trusted…

One thing I have also run into: sometimes the spatial data does not agree with other levels or years of the same spatial data. If we take all municipalities, they should cover the country shapefile or the state shapefiles, but this does not hold for some years.

A "tidy" geobr with topological validation of spatial features would be interesting.

Hi both. Before we make any data available in geobr, we process it to harmonize column names, projections, etc., and we already "fix" the topology by applying sf::st_make_valid(). I understand this only fixes topological errors to some extent, though.

Here's an example of the problem mentioned by @rafalopespx. The total area of the country polygon is not the same as the sum of the areas of the states. This is an inconsistency (in this case a small one) in the original IBGE data, and there is not much we can do about it. My impression is that any attempt to solve this inconsistency should be made by IBGE in the original raw data.

library(geobr)
library(sf)
options(scipen = 999)

c <- read_country(year = 2010, simplified = FALSE)
s <- read_state(year = 2010, simplified = FALSE)

area_c <- st_area(c)
area_s <- st_area(s) |> sum()

area_c
#> 8535238245979 [m^2]
area_s 
#> 8535240429377 [m^2]
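To put the size of that mismatch in perspective, here is a small sketch that reuses the two areas printed above as plain numbers (so it runs without downloading anything). The variable names are illustrative only.

```r
# Areas in m^2, copied from the output above
area_country <- 8535238245979  # country polygon
area_states  <- 8535240429377  # sum of the state polygons

# Absolute difference, in m^2 and km^2
diff_m2  <- area_states - area_country
diff_km2 <- diff_m2 / 1e6

# Difference relative to the country area
rel_diff <- diff_m2 / area_country

diff_m2   # 2183398 m^2, i.e. roughly 2.18 km^2
rel_diff  # well below one part per million
```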

I totally agree with you on this, @rafapereirabr. As far as I remember, I'm not so sure that the topological data is better than the table data. I also totally agree that the problem has to be fixed at IBGE.

One thing that would be really helpful is table data relating any code_muni to the codes of higher spatial levels, as mentioned before, e.g. code_health_macrorregion and code_health_region. I think the fastest solution is to add these columns to the lookup table, which would allow generating and relating codes across different aggregations.

I agree, ideally, we would have all columns added to the output of the lookup table.

Note: do you know where to find the list of municipalities in each health region and macro region? I haven't found this table anywhere. It is possible to determine this using a spatial join operation, but the original boundaries of health regions don't match those of municipalities, which creates some strange results (like a municipality from one state 'included' in the macro health region of another state).

I have it and can send it to you. Where should I send it?
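A toy sketch of why the spatial join misbehaves, using made-up rectangles with no CRS instead of real boundaries. The object names are hypothetical, and the largest-overlap option shown at the end is only a common heuristic from sf, not something geobr does and not a substitute for an official table.

```r
library(sf)

# Helper: axis-aligned rectangle as a polygon (toy coordinates)
rect <- function(xmin, ymin, xmax, ymax) {
  st_polygon(list(rbind(
    c(xmin, ymin), c(xmax, ymin), c(xmax, ymax), c(xmin, ymax), c(xmin, ymin)
  )))
}

# One municipality, and two health regions whose shared border is
# slightly misaligned, so region "A" cuts a thin sliver of the municipality
muni <- st_sf(code_muni = 1L, geometry = st_sfc(rect(0, 0, 1, 1)))
regions <- st_sf(
  code_health_region = c("A", "B"),
  geometry = st_sfc(rect(-1, 0, 0.05, 1), rect(0.05, 0, 2, 1))
)

# Naive join: the municipality intersects both regions and is duplicated
naive <- st_join(muni, regions, join = st_intersects)

# Heuristic: keep only the region with the largest overlapping area
best <- st_join(muni, regions, join = st_intersects, largest = TRUE)
```

Here naive has one row per intersecting region, while best keeps a single row per municipality. The heuristic still guesses wherever the misalignment is large, which is why a tabular source is preferable.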

Thanks, @rafalopespx . You can send it to rafa.pereira.br [at] gmail.com.

However, ideally, we would prefer to have an official document or dataset with this info at a URL, so we can refer to it.

Okay, I have found the relational data table in a few different services from the Ministry of Health. You can download it directly from the DataSUS FTP server, which points to the latest version: ftp://ftp.datasus.gov.br/territorio/tabelas/base_territorial.zip

Or you can download it from here: https://datasus.saude.gov.br/transferencia-de-arquivos/
There, select Base Territorial under Fonte, then Bases Territoriais under Modalidade, and finally Bases Territoriais under the Tipo de Arquivo field.

If neither works, I can send you the latest one I downloaded, but I think downloading from there guarantees you get the official and latest version, and it can be easily incorporated into a function.
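As a rough sketch of what such a function could look like in base R: the function name and its arguments are hypothetical, the URL is the FTP address given above, and I am assuming the server is reachable and the archive layout is whatever DataSUS currently ships.

```r
# Sketch only: hypothetical helper, not part of geobr
download_base_territorial <- function(
    url = "ftp://ftp.datasus.gov.br/territorio/tabelas/base_territorial.zip",
    exdir = file.path(tempdir(), "base_territorial")) {
  zipfile <- tempfile(fileext = ".zip")
  # mode = "wb" keeps the binary zip file intact on Windows
  utils::download.file(url, destfile = zipfile, mode = "wb")
  dir.create(exdir, showWarnings = FALSE, recursive = TRUE)
  # unzip() returns the paths of the extracted files
  utils::unzip(zipfile, exdir = exdir)
}
```

Calling files <- download_base_territorial() would return the extracted file paths. Which file holds the municipality-to-health-region mapping is not stated in this thread, so the list should be inspected before anything is read.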

Hey, we keep an updated directory of codes that a municipality can be related to at Base dos Dados.

It also brings other codes besides health regions.

You can download it either from the website or from R:

```r
install.packages("basedosdados")
library("basedosdados")

# Set your Google Cloud project
set_billing_id("<YOUR_PROJECT_ID>")

# Load the data directly into R
query <- bdplyr("br_bd_diretorios_brasil.municipio")
df <- bd_collect(query)
```