Error running the "Real-world example"

Question

JulienPascal opened this issue 5 years ago · 14 comments

Hi,

Thanks for the great package; it's a big time-saver. I tried to run the Real-world example from the documentation:

#--------
# Options
#--------
# Set to true to add wealth data:
wealth = FALSE
small = FALSE

# ipak function: install and load multiple R packages.
# check to see if packages are installed. Install them if they are not, then load them into the R session.
# source: https://gist.github.com/stevenworthington/3178163
ipak <- function(pkg){
  new.pkg <- pkg[!(pkg %in% installed.packages()[, "Package"])]
  if (length(new.pkg)) 
    install.packages(new.pkg, dependencies = TRUE)
  sapply(pkg, require, character.only = TRUE)
}

# Load the required packages
# If you don't have them already installed, it may take a while 
packages <- c("psidR")
ipak(packages)

require(psidR)
require(rjson)
require(data.table)

# ################################################
# Real-world example: not run because takes long.
# Build panel with income, wage, age and education
# optionally: add wealth supplements!
# ################################################
# The package is installed with a list of variables
# Alternatively, search for names with getNamesPSID()
# This is the body of function build.psid()
# (so why not call build.psid() and see what happens!)
r = system.file(package="psidR")
if (small){
  f = fread(file.path(r,"psid-lists","famvars-small.txt"))
  i = fread(file.path(r,"psid-lists","indvars-small.txt"))
  if (wealth){
    w = fread(file.path(r,"psid-lists","wealthvars-small.txt"))
  }
} else {
  f = fread(file.path(r,"psid-lists","famvars.txt"))
  i = fread(file.path(r,"psid-lists","indvars.txt"))
  if (wealth){
    w = fread(file.path(r,"psid-lists","wealthvars.txt"))
  }
}

setkey(i,"name")
setkey(f,"name")
if (wealth) setkey(w,"name")
i = dcast(i[,list(year,name,variable)],year~name)
f = dcast(f[,list(year,name,variable)],year~name)
if (wealth) {
  w = dcast(w[,list(year,name,variable)],year~name)
  d = build.panel(datadir=data_dir,fam.vars=f,
                  ind.vars=i,wealth.vars=w, 
                  heads.only =TRUE,sample="SRC",
                  design="all")
  save(d, file= paste0(data_dir, "/psid.RData"))
} else {
  d = build.panel(datadir=data_dir,fam.vars=f,
                  ind.vars=i, 
                  heads.only =TRUE,sample="SRC",
                  design="all")
  save(d, file= paste0(data_dir, "/psid_no_wealth.RData"))
}

I encountered the following error message:

INFO [2019-11-27 23:24:40] psidR: currently working on data for year 1969
Error in `[.data.table`(yind, , `:=`((as.character(ind.nas)), NA)) : 
  Can't assign to the same column twice in the same query (duplicates detected).
In addition: Warning message:
In `[.data.table`(tmp, , `:=`((nanames), NA_real_), with = FALSE) :
  with=FALSE ignored, it isn't needed when using :=. See ?':=' for examples.

Is there any easy fix for this problem?
This issue is related to #32
Thanks.
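
For context, data.table raises this kind of error when the left-hand side of `:=` names the same column more than once. The following is a minimal sketch that is independent of psidR. It assumes the vector handed to `:=` (the made-up `na_cols` below) contains a repeated name; whether that is what actually happens inside build.panel for 1969 is not established in this thread.

```r
library(data.table)

# Toy table; the column names are illustrative only.
dt <- data.table(id = 1:3, age = c(30L, 41L, 52L))

# Hypothetical vector of columns to blank out, with a repeated entry.
na_cols <- c("age", "age")

# Assigning to the same column twice in one `:=` call is rejected;
# the error message is captured instead of stopping the script.
res <- tryCatch(
  dt[, (na_cols) := NA],
  error = function(e) conditionMessage(e)
)
print(res)

# De-duplicating the names first lets the assignment go through.
dt[, (unique(na_cols)) := NA]
print(dt)
```

If duplicated names turn out to be the cause, wrapping the name vector in unique() before the assignment is one possible workaround; the proper fix belongs wherever the duplicates are created.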

floswald commented 5 years ago

FYI #36

Answer 1 · 2019-11-28T07:40:39.000Z

Extra info:

When I set small = TRUE, everything works fine.
Is the problem specific to the year 1969?

INFO [2019-11-28 08:38:52] found FAM2013ER.rda already downloaded
INFO [2019-11-28 08:38:52] found FAM2015ER.rda already downloaded
INFO [2019-11-28 08:38:52] everything already downloaded. Build dataset now
INFO [2019-11-28 08:38:52] psidR: Loading Family data from .rda files
INFO [2019-11-28 08:38:56] psidR: loaded individual file: /media/julien/TOSHIBA EXT/DATASETS/PSID_data/IND2015ER.rda
INFO [2019-11-28 08:38:56] psidR: total memory load in MB: 1400
INFO [2019-11-28 08:38:56] psidR: currently working on data for year 2013
INFO [2019-11-28 08:38:56] full 2013 sample has 80666 obs
INFO [2019-11-28 08:38:56] you selected 33940 obs belonging to SRC
INFO [2019-11-28 08:38:56] dropping non-heads leaves 5450 obs
INFO [2019-11-28 08:38:57] psidR: currently working on data for year 2015
INFO [2019-11-28 08:38:57] full 2015 sample has 80666 obs
INFO [2019-11-28 08:38:57] you selected 33940 obs belonging to SRC
INFO [2019-11-28 08:38:57] dropping non-heads leaves 5318 obs
INFO [2019-11-28 08:38:58] End of build.panel

> head(d)
   faminc hours hvalue mortgage own state interview ID1968 pernum sequence relation.head age educ
1:  10950    40      0        0   5     4         1    860      1        1            10  72   14
2:  40942     0 148000        0   1    41         2    459      1        1            10  79   12
3:  52300     0  90000    35000   1     9         3    581      3        1            10  62   10
4:  26400  2096  90000        0   1     1         4   1438    187        1            10  34   12
5:   8520     0      0        0   8    42         5   1034      3        1            10  62   12
6:  43050  1520      0        0   5    48         6    691     33        1            10  23   14
   empstat weight     pid year
1:       4 54.070  860001 2013
2:       4 52.431  459001 2013
3:       4 86.861  581003 2013
4:       1  0.000 1438187 2013
5:       5 12.756 1034003 2013
6:       1 25.676  691033 2013

Answer 2 · 2019-11-28T07:52:07.000Z

Another follow-up:

There are <NA> values in i = fread(file.path(r,"psid-lists","indvars-small.txt")):

 > head(i)
   year     age    educ empstat  weight
1: 1968 ER30004 ER30010    <NA> ER30019
2: 1969 ER30023    <NA>    <NA> ER30042
3: 1970 ER30046 ER30052    <NA> ER30066
4: 1971 ER30070 ER30076    <NA> ER30090
5: 1972 ER30094 ER30100    <NA> ER30116
6: 1973 ER30120 ER30126    <NA> ER30137
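
To see at a glance which variables have no code in which year, the wide table can be reshaped back to long format and filtered. This is only a sketch on a toy copy of the first three rows printed above; `long` and `missing` are names introduced here for illustration.

```r
library(data.table)

# Toy copy of the first rows printed above (wide format: one row per year).
i <- data.table(
  year    = c(1968L, 1969L, 1970L),
  age     = c("ER30004", "ER30023", "ER30046"),
  educ    = c("ER30010", NA, "ER30052"),
  empstat = NA_character_,
  weight  = c("ER30019", "ER30042", "ER30066")
)

# Reshape to long format and keep only the (year, name) pairs with no code.
long <- melt(i, id.vars = "year", variable.name = "name", value.name = "variable")
missing <- long[is.na(variable), .(year, name)]
print(missing)
```

On the toy rows this lists educ for 1969 and empstat for all three years, which makes it easier to relate a failing year to the variables that are undefined in it.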

I am using the version from CRAN. I will reinstall the package from GitHub and try again with:

install.packages('devtools')
devtools::install_github("floswald/psidR")

Edit:
It's not a matter of CRAN versus the latest version on GitHub. I have just reinstalled the package and the problem is still there:

install.packages('devtools')
require(devtools)
install_github(repo = "https://github.com/floswald/psidR")

Answer 3 · 2019-11-28T07:58:36.000Z

it's hard to debug without downloading the whole slew of data again (doing that now). what's the earliest year where it works (i.e. after 1969)? the problem is each time they update the data.table API, some functionality in here breaks...

Answer 4 · 2019-11-28T08:49:07.000Z

my guess is that the problem is the wealth supplement. this has changed in the PSID, so the handling here is currently wrong. the wealth variables have been moved to the family files from 1999 onwards, so you should just select those in your fam.vars data.frame. try setting wealth=FALSE, and including the wealth vars in fam.vars for 1999 onwards. for the years before that, the current approach (download separate supplement) still seems to be valid; however, the code will fail when it attempts to download the supplement for 1999 (which is gone, AFAIK)
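
A rough sketch of that suggestion, assuming the variable list is kept in the long format the example reads (columns year, name, variable). Every code below is a placeholder, not a real PSID variable code, and `f_long` and `wealth_rows` are names made up for this illustration.

```r
library(data.table)

# Toy long-format variable list with the same columns the example reads
# (year, name, variable). All codes are placeholders, not real PSID codes.
f_long <- data.table(
  year     = c(1997L, 1999L, 2001L),
  name     = "faminc",
  variable = c("FAMINC_1997", "FAMINC_1999", "FAMINC_2001")
)

# Hypothetical wealth entries for 1999 onwards, to be looked up in the
# family files instead of a separate supplement.
wealth_rows <- data.table(
  year     = c(1999L, 2001L),
  name     = "wealth",
  variable = c("WEALTH_1999", "WEALTH_2001")
)

f_long <- rbind(f_long, wealth_rows)

# Same reshape as in the example, with value.var spelled out.
f <- dcast(f_long, year ~ name, value.var = "variable")
print(f)
```

The resulting `f` would then be passed as fam.vars with wealth = FALSE, as in the second branch of the example; years before 1999 simply carry NA in the wealth column.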

Answer 5 · 2019-11-28T08:50:33.000Z

ping @SchroederAdrian

Answer 6 · 2019-11-28T08:54:58.000Z

try setting wealth=FALSE, and including the wealth vars in fam.vars for 1999 onwards.

I will try to do that.

what's the earliest year where it works (i.e. after 1969)?

I am on it. I will restrict the sample until it works.

Answer 7 · 2019-11-28T09:17:02.000Z

The data freshly downloaded (152 MB) is here: https://www.dropbox.com/sh/oprzmrjx8xkp48s/AAAcPpdS62S2Iy5xnAQP8yMJa?dl=0

thanks! actually it's almost done (fast internet today!) will check later today.

Answer 8 · 2019-11-28T09:51:04.000Z

With wealth=FALSE, when I drop 1968 and 1969, the real-world example works:

# Clear the workspace:
rm(list = ls())

#install.packages('devtools')
#require(devtools)
#install_github(repo = "https://github.com/floswald/psidR")

#------
# Paths
#------
# Path to main file:
# !Adjust this to your own setting!
path_to_main = "/home/julien/Documents/REPOSITORIES/PSIDPanelBuilder"
# Where the data is stored:
#data_dir = "/home/julien/MEGA/Dataset/PSID"
data_dir = "/media/julien/TOSHIBA EXT/DATASETS/PSID_data"
  
  #--------
  # Options
  #--------
  # Set to true to add wealth data:
  wealth = FALSE
  # Set to true to work with the small dataset
  small = FALSE
  first_year = 1970
  
  # ipak function: install and load multiple R packages.
  # check to see if packages are installed. Install them if they are not, then load them into the R session.
  # source: https://gist.github.com/stevenworthington/3178163
  ipak <- function(pkg){
    new.pkg <- pkg[!(pkg %in% installed.packages()[, "Package"])]
    if (length(new.pkg)) 
      install.packages(new.pkg, dependencies = TRUE)
    sapply(pkg, require, character.only = TRUE)
  }
  
  # Load the required packages
  # If you don't have them already installed, it may take a while 
  packages <- c("psidR")
  ipak(packages)
  
  require(psidR)
  require(rjson)
  require(data.table)
  
  # Read the PSID login details from a JSON file
  login <- fromJSON(file = paste0(path_to_main, "/login_psid.json"))
  
  # ################################################
  # Real-world example: not run because takes long.
  # Build panel with income, wage, age and education
  # optionally: add wealth supplements!
  # ################################################
  # The package is installed with a list of variables
  # Alternatively, search for names with getNamesPSID()
  # This is the body of function build.psid()
  # (so why not call build.psid() and see what happens!)
  r = system.file(package="psidR")
  if (small){
    f = fread(file.path(r,"psid-lists","famvars-small.txt"))
    i = fread(file.path(r,"psid-lists","indvars-small.txt"))
    if (wealth){
      w = fread(file.path(r,"psid-lists","wealthvars-small.txt"))
    }
  } else {
    f = fread(file.path(r,"psid-lists","famvars.txt"))
    i = fread(file.path(r,"psid-lists","indvars.txt"))
    if (wealth){
      w = fread(file.path(r,"psid-lists","wealthvars.txt"))
    }
  }
  
  # Selected years only
  i <- i[which(i$year>=first_year), ]
  f <- f[which(f$year>=first_year), ]
  
  setkey(i,"name")
  setkey(f,"name")
  if (wealth) setkey(w,"name")
  i = dcast(i[,list(year,name,variable)],year~name)
  f = dcast(f[,list(year,name,variable)],year~name)
  if (wealth) {
    w = dcast(w[,list(year,name,variable)],year~name)
    d = build.panel(datadir=data_dir,fam.vars=f,
                    ind.vars=i,wealth.vars=w, 
                    heads.only =TRUE,sample="SRC",
                    design="all")
    save(d, file= paste0(data_dir, "/psid.RData"))
  } else {
    d = build.panel(datadir=data_dir,fam.vars=f,
                    ind.vars=i, 
                    heads.only =TRUE,sample="SRC",
                    design="all")
    save(d, file= paste0(data_dir, "/psid_no_wealth.RData"))
  }

head(d)

> head(d)
   empstat_ faminc hours hvalue mortgage own state interview bankrupt1_7_13 bankrupt1_enddebt
1:        1   9244  1976  12000     9116   1    12         1             NA                NA
2:        1  17000  2040  18000    10000   1    24         2             NA                NA
3:        3   2552   360   2000        0   1    12        13             NA                NA
4:        1   5500  1880   1500        0   1    12        14             NA                NA
5:        1   4815   420  15000        0   1    32        15             NA                NA
6:        3   6229    73   3500      400   1    32        16             NA                NA
   bankrupt1_initdebt bankrupt1_property bankrupt1_property_val bankrupt1_repay_amt
1:                 NA                 NA                     NA                  NA
2:                 NA                 NA                     NA                  NA
3:                 NA                 NA                     NA                  NA
4:                 NA                 NA                     NA                  NA
5:                 NA                 NA                     NA                  NA
6:                 NA                 NA                     NA                  NA
   bankrupt1_repay_done bankrupt1_repay_length bankrupt1_repay_per bankrupt1_where bankrupt1_why1
1:                   NA                     NA                  NA              NA             NA
2:                   NA                     NA                  NA              NA             NA
3:                   NA                     NA                  NA              NA             NA
4:                   NA                     NA                  NA              NA             NA
5:                   NA                     NA                  NA              NA             NA
6:                   NA                     NA                  NA              NA             NA
   bankrupt1_why2 bankrupt1_why3 bankrupt1_year bankrupt2_7_13 bankrupt2_enddebt bankrupt2_initdebt
1:             NA             NA             NA             NA                NA                 NA
2:             NA             NA             NA             NA                NA                 NA
3:             NA             NA             NA             NA                NA                 NA
4:             NA             NA             NA             NA                NA                 NA
5:             NA             NA             NA             NA                NA                 NA
6:             NA             NA             NA             NA                NA                 NA
   bankrupt2_property bankrupt2_property_val bankrupt2_repay_amt bankrupt2_repay_done
1:                 NA                     NA                  NA                   NA
2:                 NA                     NA                  NA                   NA
3:                 NA                     NA                  NA                   NA
4:                 NA                     NA                  NA                   NA
5:                 NA                     NA                  NA                   NA
6:                 NA                     NA                  NA                   NA
   bankrupt2_repay_length bankrupt2_repay_per bankrupt2_where bankrupt2_why1 bankrupt2_why2
1:                     NA                  NA              NA             NA             NA
2:                     NA                  NA              NA             NA             NA
3:                     NA                  NA              NA             NA             NA
4:                     NA                  NA              NA             NA             NA
5:                     NA                  NA              NA             NA             NA
6:                     NA                  NA              NA             NA             NA
   bankrupt2_why3 bankrupt2_year bankrupt_91 bankrupt_num debt wealth ID1968 pernum sequence
1:             NA             NA          NA           NA   NA     NA    543      1        1
2:             NA             NA          NA           NA   NA     NA    794      1        1
3:             NA             NA          NA           NA   NA     NA   1688      1        1
4:             NA             NA          NA           NA   NA     NA   2202      1        1
5:             NA             NA          NA           NA   NA     NA    721      1        1
6:             NA             NA          NA           NA   NA     NA    752      1        1
   relation.head age educ weight empstat     pid year
1:             1  66    0   27.4      NA  543001 1970
2:             1  35    0   27.4      NA  794001 1970
3:             1  75    0   26.3      NA 1688001 1970
4:             1  35    0   26.3      NA 2202001 1970
5:             1  69    0   23.8      NA  721001 1970
6:             1  59    0    7.0      NA  752001 1970

With wealth=TRUE, the script fails:

INFO [2019-11-28 10:52:01] full 1984 sample has 80666 obs
INFO [2019-11-28 10:52:01] you selected 33940 obs belonging to SRC
INFO [2019-11-28 10:52:01] dropping non-heads leaves 3729 obs
Error in `[.data.table`(tmp, , codes, with = FALSE) : 
  column(s) not found: NA
In addition: There were 15 warnings (use warnings() to see them)
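
The "column(s) not found: NA" message is what data.table reports when an NA sits among the column names being selected with with = FALSE. A small sketch on a toy table, assuming the vector of codes contains a missing entry for that year; `tmp` and `codes` here are stand-ins and not the package's internal objects.

```r
library(data.table)

tmp <- data.table(V1 = 1:2, V2 = 3:4)   # toy stand-in for one year's file
codes <- c("V1", NA, "V2")              # hypothetical codes, one missing

# Selecting with an NA among the names fails; capture the message.
res <- tryCatch(
  tmp[, codes, with = FALSE],
  error = function(e) conditionMessage(e)
)
print(res)

# Dropping NA entries with a logical mask is safe even when nothing is NA.
keep <- codes[!is.na(codes)]
sub <- tmp[, keep, with = FALSE]
print(sub)
```

A logical mask is used rather than negative indexing because `codes[-idx]` returns an empty vector when `idx` has length zero, which would silently drop every column in years where no code is missing.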

Answer 9 · 2019-11-28T12:09:22.000Z

excellent, that means my lead in #36 is correct. so: build without wealth supplements for now (select the wealth vars from family files for 1999 onwards!)

Answer 10 · 2019-11-29T02:26:17.000Z

Thanks for being on this, guys! I manually compiled the files using lodown and your psidR helper functions, but it is definitely handy to have a second dataset to compare to now. Great package besides that! The PSID is only useful when it's easily accessible, and your functions have saved me a lot of time already.

Answer 11 · 2019-11-29T08:16:51.000Z

the 1969 and wealth issues are separate bugs, I think. there is no wealth supplement in 1969, so that cannot be the reason for the error. For completeness, here is the full output:

> d = build.panel(datadir="~/data/psid",fam.vars=f,
+                 ind.vars=i,wealth.vars=w, 
+                 heads.only =TRUE,sample="SRC",
+                 design="all")
INFO [2019-11-28 08:56:18] will download as WEALTH1984ER.rda
INFO [2019-11-28 08:56:18] will download as WEALTH1989ER.rda
INFO [2019-11-28 08:56:18] will download as WEALTH1994ER.rda
INFO [2019-11-28 08:56:18] will download as WEALTH1999ER.rda
INFO [2019-11-28 08:56:18] will download as WEALTH2001ER.rda
INFO [2019-11-28 08:56:18] will download as WEALTH2003ER.rda
INFO [2019-11-28 08:56:18] will download as WEALTH2005ER.rda
INFO [2019-11-28 08:56:18] will download as WEALTH2007ER.rda
INFO [2019-11-28 08:56:18] Will download missing datasets now
INFO [2019-11-28 08:56:18] will download family files: 1968, 1969, 1970, 1971, 1972, 1973, 1974, 1975, 1976, 1977, 1978, 1979, 1980, 1981, 1982, 1983, 1984, 1985, 1986, 1987, 1988, 1989, 1990, 1991, 1992, 1993, 1994, 1995, 1996, 1997, 1999, 2001, 2003, 2005, 2007, 2009, 2011, 2013, 2015
INFO [2019-11-28 08:56:18] will download: IND2015ER
INFO [2019-11-28 08:56:18] will download missing wealth files.
This can take several hours/days to download.
 want to go ahead? give me 'yes' or 'no'.yes
please enter your PSID username: *****
please enter your PSID password: 
INFO [2019-11-28 08:56:47] downloading file ~/data/psid/FAM1968ER
INFO [2019-11-28 08:56:48] now reading and processing SAS file ~/data/psid/FAM1968ER into R
INFO [2019-11-28 08:57:12] downloading file ~/data/psid/FAM1969ER          
INFO [2019-11-28 08:57:15] now reading and processing SAS file ~/data/psid/FAM1969ER into R
INFO [2019-11-28 08:57:42] downloading file ~/data/psid/FAM1970ER          
INFO [2019-11-28 08:57:43] now reading and processing SAS file ~/data/psid/FAM1970ER into R
INFO [2019-11-28 08:58:12] downloading file ~/data/psid/FAM1971ER          
INFO [2019-11-28 08:58:13] now reading and processing SAS file ~/data/psid/FAM1971ER into R
INFO [2019-11-28 08:58:42] downloading file ~/data/psid/FAM1972ER          
INFO [2019-11-28 08:58:43] now reading and processing SAS file ~/data/psid/FAM1972ER into R
INFO [2019-11-28 08:59:15] downloading file ~/data/psid/FAM1973ER          
INFO [2019-11-28 08:59:16] now reading and processing SAS file ~/data/psid/FAM1973ER into R
INFO [2019-11-28 08:59:34] downloading file ~/data/psid/FAM1974ER          
INFO [2019-11-28 08:59:40] now reading and processing SAS file ~/data/psid/FAM1974ER into R
INFO [2019-11-28 08:59:59] downloading file ~/data/psid/FAM1975ER          
INFO [2019-11-28 09:00:01] now reading and processing SAS file ~/data/psid/FAM1975ER into R
INFO [2019-11-28 09:00:29] downloading file ~/data/psid/FAM1976ER          
INFO [2019-11-28 09:00:30] now reading and processing SAS file ~/data/psid/FAM1976ER into R
INFO [2019-11-28 09:01:25] downloading file ~/data/psid/FAM1977ER          
INFO [2019-11-28 09:01:26] now reading and processing SAS file ~/data/psid/FAM1977ER into R
INFO [2019-11-28 09:01:57] downloading file ~/data/psid/FAM1978ER          
INFO [2019-11-28 09:01:59] now reading and processing SAS file ~/data/psid/FAM1978ER into R
INFO [2019-11-28 09:02:34] downloading file ~/data/psid/FAM1979ER          
INFO [2019-11-28 09:02:35] now reading and processing SAS file ~/data/psid/FAM1979ER into R
INFO [2019-11-28 09:03:12] downloading file ~/data/psid/FAM1980ER          
INFO [2019-11-28 09:03:14] now reading and processing SAS file ~/data/psid/FAM1980ER into R
INFO [2019-11-28 09:03:54] downloading file ~/data/psid/FAM1981ER          
INFO [2019-11-28 09:03:56] now reading and processing SAS file ~/data/psid/FAM1981ER into R
INFO [2019-11-28 09:04:39] downloading file ~/data/psid/FAM1982ER          
INFO [2019-11-28 09:04:41] now reading and processing SAS file ~/data/psid/FAM1982ER into R
INFO [2019-11-28 09:05:21] downloading file ~/data/psid/FAM1983ER          
INFO [2019-11-28 09:05:23] now reading and processing SAS file ~/data/psid/FAM1983ER into R
INFO [2019-11-28 09:06:12] downloading file ~/data/psid/FAM1984ER          
INFO [2019-11-28 09:06:14] now reading and processing SAS file ~/data/psid/FAM1984ER into R
INFO [2019-11-28 09:07:32] downloading file ~/data/psid/FAM1985ER          
INFO [2019-11-28 09:07:34] now reading and processing SAS file ~/data/psid/FAM1985ER into R
INFO [2019-11-28 09:09:15] downloading file ~/data/psid/FAM1986ER          
INFO [2019-11-28 09:09:17] now reading and processing SAS file ~/data/psid/FAM1986ER into R
INFO [2019-11-28 09:10:41] downloading file ~/data/psid/FAM1987ER          
INFO [2019-11-28 09:10:45] now reading and processing SAS file ~/data/psid/FAM1987ER into R
INFO [2019-11-28 09:11:57] downloading file ~/data/psid/FAM1988ER          
INFO [2019-11-28 09:12:02] now reading and processing SAS file ~/data/psid/FAM1988ER into R
INFO [2019-11-28 09:13:36] downloading file ~/data/psid/FAM1989ER          
INFO [2019-11-28 09:13:39] now reading and processing SAS file ~/data/psid/FAM1989ER into R
INFO [2019-11-28 09:15:09] downloading file ~/data/psid/FAM1990ER          
INFO [2019-11-28 09:15:12] now reading and processing SAS file ~/data/psid/FAM1990ER into R
INFO [2019-11-28 09:17:01] downloading file ~/data/psid/FAM1991ER          
INFO [2019-11-28 09:17:06] now reading and processing SAS file ~/data/psid/FAM1991ER into R
INFO [2019-11-28 09:19:04] downloading file ~/data/psid/FAM1992ER          
INFO [2019-11-28 09:19:06] now reading and processing SAS file ~/data/psid/FAM1992ER into R
INFO [2019-11-28 09:21:07] downloading file ~/data/psid/FAM1993ER          
INFO [2019-11-28 09:21:12] now reading and processing SAS file ~/data/psid/FAM1993ER into R
INFO [2019-11-28 09:24:10] downloading file ~/data/psid/FAM1994ER          
INFO [2019-11-28 09:24:13] now reading and processing SAS file ~/data/psid/FAM1994ER into R
INFO [2019-11-28 09:28:09] downloading file ~/data/psid/FAM1995ER           
INFO [2019-11-28 09:28:11] now reading and processing SAS file ~/data/psid/FAM1995ER into R
INFO [2019-11-28 09:32:01] downloading file ~/data/psid/FAM1996ER           
INFO [2019-11-28 09:32:03] now reading and processing SAS file ~/data/psid/FAM1996ER into R
INFO [2019-11-28 09:35:29] downloading file ~/data/psid/FAM1997ER          
INFO [2019-11-28 09:35:31] now reading and processing SAS file ~/data/psid/FAM1997ER into R
INFO [2019-11-28 09:38:12] downloading file ~/data/psid/FAM1999ER          
INFO [2019-11-28 09:38:14] now reading and processing SAS file ~/data/psid/FAM1999ER into R
INFO [2019-11-28 09:42:38] downloading file ~/data/psid/FAM2001ER          
INFO [2019-11-28 09:42:41] now reading and processing SAS file ~/data/psid/FAM2001ER into R
INFO [2019-11-28 09:47:16] downloading file ~/data/psid/FAM2003ER          
INFO [2019-11-28 09:47:20] now reading and processing SAS file ~/data/psid/FAM2003ER into R
INFO [2019-11-28 09:51:59] downloading file ~/data/psid/FAM2005ER          
INFO [2019-11-28 09:52:11] now reading and processing SAS file ~/data/psid/FAM2005ER into R
INFO [2019-11-28 09:56:49] downloading file ~/data/psid/FAM2007ER          
INFO [2019-11-28 09:56:53] now reading and processing SAS file ~/data/psid/FAM2007ER into R
INFO [2019-11-28 10:04:25] downloading file ~/data/psid/FAM2009ER          
INFO [2019-11-28 10:04:29] now reading and processing SAS file ~/data/psid/FAM2009ER into R
INFO [2019-11-28 10:12:08] downloading file ~/data/psid/FAM2011ER          
INFO [2019-11-28 10:12:10] now reading and processing SAS file ~/data/psid/FAM2011ER into R
INFO [2019-11-28 10:20:19] downloading file ~/data/psid/FAM2013ER          
INFO [2019-11-28 10:20:26] now reading and processing SAS file ~/data/psid/FAM2013ER into R
INFO [2019-11-28 13:12:58] downloading file ~/data/psid/FAM2015ER          
INFO [2019-11-28 13:13:01] now reading and processing SAS file ~/data/psid/FAM2015ER into R
INFO [2019-11-28 13:22:37] downloading file ~/data/psid/WEALTH1984ER       
INFO [2019-11-28 13:22:38] now reading and processing SAS file ~/data/psid/WEALTH1984ER into R
INFO [2019-11-28 13:22:42] downloading file ~/data/psid/WEALTH1989ER       
INFO [2019-11-28 13:22:42] now reading and processing SAS file ~/data/psid/WEALTH1989ER into R
INFO [2019-11-28 13:22:47] downloading file ~/data/psid/WEALTH1994ER       
INFO [2019-11-28 13:22:48] now reading and processing SAS file ~/data/psid/WEALTH1994ER into R
INFO [2019-11-28 13:22:54] downloading file ~/data/psid/WEALTH1999ER       
INFO [2019-11-28 13:22:54] now reading and processing SAS file ~/data/psid/WEALTH1999ER into R
INFO [2019-11-28 13:23:00] downloading file ~/data/psid/WEALTH2001ER       
INFO [2019-11-28 13:23:00] now reading and processing SAS file ~/data/psid/WEALTH2001ER into R
INFO [2019-11-28 13:23:06] downloading file ~/data/psid/WEALTH2003ER       
INFO [2019-11-28 13:23:06] now reading and processing SAS file ~/data/psid/WEALTH2003ER into R
INFO [2019-11-28 13:23:11] downloading file ~/data/psid/WEALTH2005ER       
INFO [2019-11-28 13:23:12] now reading and processing SAS file ~/data/psid/WEALTH2005ER into R
INFO [2019-11-28 13:23:17] downloading file ~/data/psid/WEALTH2007ER       
INFO [2019-11-28 13:23:18] now reading and processing SAS file ~/data/psid/WEALTH2007ER into R
INFO [2019-11-28 13:23:23] downloading file ~/data/psid/IND2015ER          
INFO [2019-11-28 13:23:31] now reading and processing SAS file ~/data/psid/IND2015ER into R
INFO [2019-11-28 14:02:04] finished downloading files to ~/data/psid/       
INFO [2019-11-28 14:02:04] continuing now to build the dataset
INFO [2019-11-28 14:02:04] psidR: Loading Family data from .rda files
INFO [2019-11-28 14:02:13] psidR: loaded individual file: ~/data/psid/IND2015ER.rda
INFO [2019-11-28 14:02:13] psidR: total memory load in MB: 1400
INFO [2019-11-28 14:02:13] psidR: currently working on data for year 1968
INFO [2019-11-28 14:02:14] full 1968 sample has 80666 obs
INFO [2019-11-28 14:02:14] you selected 33940 obs belonging to SRC
INFO [2019-11-28 14:02:14] dropping non-heads leaves 2930 obs
INFO [2019-11-28 14:02:14] psidR: currently working on data for year 1969
Error in `[.data.table`(yind, , `:=`((as.character(ind.nas)), NA)) : 
  Can't assign to the same column twice in the same query (duplicates detected).
In addition: Warning messages:
1: In if (!wlth.down) { :
  the condition has length > 1 and only the first element will be used
2: In `[.data.table`(tmp, , `:=`((nanames), NA_real_), with = FALSE) :
  with=FALSE ignored, it isn't needed when using :=. See ?':=' for examples.
> traceback()
3: `[.data.table`(yind, , `:=`((as.character(ind.nas)), NA)) at build.panel.r#435
2: yind[, `:=`((as.character(ind.nas)), NA)] at build.panel.r#435
1: build.panel(datadir = "~/data/psid", fam.vars = f, ind.vars = i, 
       wealth.vars = w, heads.only = TRUE, sample = "SRC", design = "all")
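
The first warning in the output above comes from an `if` whose condition is a vector with more than one element; since R 4.2.0 this is an error rather than a warning. A minimal sketch of collapsing such a vector to a single logical first. The vector `wlth_down` is a hypothetical stand-in for the package's `wlth.down`, and whether all() or any() matches the intended logic is an assumption.

```r
# Hypothetical flags: TRUE where a wealth file is already on disk.
wlth_down <- c(TRUE, TRUE, FALSE)

# Collapse to a single logical before branching.
need_download <- !all(wlth_down)

if (need_download) {
  message("some wealth files are missing")
}
```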

Answer 12 · 2020-02-27T08:42:16.000Z

well, took me only 3 months, but that should work now. @SchroederAdrian @JulienPascal

Answer 13 · 2022-09-15T17:30:06.000Z

my guess is that the problem is the wealth supplement. this has changed in the PSID, so the handling here is currently wrong. the wealth variables have been moved to the family files from 1999 onwards, so you should just select those in your fam.vars data.frame. try setting wealth=FALSE, and including the wealth vars in fam.vars for 1999 onwards. for the years before that, the current approach (download separate supplement) still seems to be valid; however, the code will fail when it attempts to download the supplement for 1999 (which is gone, AFAIK)

Hi, I tried putting the wealth variables in the famvars.txt file for a 1999-2009 panel. However, I keep getting this error:

Error in `[.data.table`(tmp, , codes[-na], with = FALSE) :
  column(s) not found: s402, s407, s417

Is there anything I can do to solve this?
Thanks!
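
One way to narrow this down is to compare the requested codes with the column names actually present in the loaded file for that year. The sketch below uses made-up column names; it only illustrates the check, and the idea that letter case might matter is a guess rather than something confirmed in this thread.

```r
# Made-up column names of one loaded family file (illustration only).
tmp_names <- c("ID", "S402", "S417")

# Requested codes, as in the error message.
codes <- c("s402", "s407", "s417")

# Codes with no exact match among the column names.
missing_exact <- setdiff(codes, tmp_names)
print(missing_exact)

# Codes still missing when letter case is ignored.
missing_nocase <- codes[!(toupper(codes) %in% toupper(tmp_names))]
print(missing_nocase)
```

If the second vector is shorter than the first, the mismatch is at least partly a matter of case; any code left in both is simply absent from that file.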