pachadotdev/economiccomplexity

Balassa index for more years

CristinaProchazkkova opened this issue · 1 comments

Hi Mauricio,
To keep things organized, I am posting my question here: how can I calculate the Balassa Index by year?
I am using your package economiccomplexity in R, and I wonder whether a year parameter could be added alongside country and product.
Thank you very much for your help! Cristina

Dear Cristina

@CristinaProchazkkova

I see that you are new to GitHub, so I've edited the question and added the key parts of what you asked me by email. Thanks for agreeing to share this, so that other users can benefit from the question and its answer.

This package cannot filter by year on its own: balassa_index() takes country, product and value columns, but no year. The code below shows how to work around that with real trade data, using dplyr, purrr and tidyr together with economiccomplexity. It splits the data into the years 2016, 2017 and 2018, computes the Balassa Index for each year separately, and then binds the results into a single table.
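
For reference, this is the usual textbook definition of the Balassa index (revealed comparative advantage), written with an explicit year subscript to make clear what "by year" means. The notation is my own and not taken from the package: x_{c,p,t} stands for the exports of product p by country c in year t.

```latex
\mathrm{RCA}_{c,p,t} =
  \frac{x_{c,p,t} \,/\, \sum_{p'} x_{c,p',t}}
       {\sum_{c'} x_{c',p,t} \,/\, \sum_{c'} \sum_{p'} x_{c',p',t}}
```

Every sum runs within a single year t, which is why the data has to be split by year before the index is computed. As far as I understand the discrete argument, setting discrete = FALSE keeps these continuous values instead of turning them into a 0/1 indicator.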

# Packages ----

# this is for the example itself
library(dplyr)
library(purrr)
library(tidyr)
library(economiccomplexity)

# this is for the data for the example
library(tradestatistics)

# Data ----

# exports from all reporters to the World, by [Y]ear, [R]eporter and product [C]ode
d <- ots_create_tidy_data(years = 2016:2018, reporters = "all", table = "yrc")

# unique years
unique(d$year)

# balassa_index() has no year argument, so it cannot tell the years in d apart;
# we use purrr + dplyr to filter the data one year at a time and build a
# new dataset with the index computed per year

# Filter by year + compute Balassa Index by year ----

# create a new function balassa_index_by_year()
balassa_index_by_year <- function(d) {
  # unique years 
  years <- unique(d$year)
  
  # filter one year at a time, compute the index, repeat and paste
  # all the pieces
  map_df(
    years,
    function(t) {
      d2 <- d %>% filter(year == t)
      
      d2 <- balassa_index(
        data = d2,
        country = "reporter_iso",
        product = "product_code",
        value = "export_value_usd",
        discrete = FALSE
      )
      
      # to use map_df(), we need d2 to be a data.frame/tibble, not a matrix;
      # otherwise, tweak the example and use map(), which in this case
      # returns a list of matrices
      d2 <- as.data.frame(as.matrix(d2))
      d2 <- tibble::rownames_to_column(d2, "country_iso")
      d2 <- d2 %>% 
        gather(product_code, balassa_index, -country_iso) %>% 
        mutate(year = t) %>% 
        select(year, country_iso, everything()) %>% 
        as_tibble()
      
      return(d2)
    }
  )
}

d2 <- balassa_index_by_year(d)

# Final comments ----

# table d already includes a column export_ which is different from the index obtained here.
# why? because the tradestatistics database uses the weighted average of 3 years of
# exports to obtain a more stable Balassa Index
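
To sanity-check the split-by-year logic without downloading any data, here is a minimal sketch in base R on a made-up toy table. The values, the column names and the helper rca_one_year() are all hypothetical and only for illustration; this is not how economiccomplexity computes the index internally, just the textbook formula applied one year at a time.

```r
# Toy data: two years, two countries, two products (made-up values)
toy <- data.frame(
  year = rep(c(2016, 2017), each = 4),
  country = rep(c("A", "A", "B", "B"), times = 2),
  product = rep(c("p1", "p2"), times = 4),
  value = c(10, 30, 20, 40, 50, 50, 25, 75)
)

# Hypothetical helper: Balassa index for a data frame holding a single year
rca_one_year <- function(df) {
  # country x product matrix of export values
  m <- xtabs(value ~ country + product, data = df)
  # share of each product in each country's total exports
  country_share <- m / rowSums(m)
  # share of each product in total world exports
  world_share <- colSums(m) / sum(m)
  # divide every column by the matching world share
  rca <- sweep(country_share, 2, world_share, "/")
  # back to long format: one row per country-product pair
  out <- as.data.frame(as.table(rca), responseName = "balassa_index")
  out$year <- df$year[1]
  out
}

# split by year, compute the index for each piece, bind the pieces together
toy_rca <- do.call(rbind, lapply(split(toy, toy$year), rca_one_year))
toy_rca
```

With numbers this small the result can be checked by hand. For country A, product p1 in 2016, the index is (10/40) / (30/100).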

Please let me know if you have more questions, and if you are using this package for a publication please cite as:

Mauricio Vargas (2020). economiccomplexity: Computational Methods for Economic Complexity. R
  package version 1.1. https://CRAN.R-project.org/package=economiccomplexity

A BibTeX entry for LaTeX users is:

  @Manual{vargas2020,
    title = {economiccomplexity: Computational Methods for Economic Complexity},
    author = {Mauricio Vargas},
    year = {2020},
    note = {R package version 1.1},
    url = {https://CRAN.R-project.org/package=economiccomplexity},
  }