Error running the "Real-world example"
JulienPascal opened this issue · 14 comments
Hi,
Thanks for the great package. It's a big time-saver. I have tried to run the Real-world example from the documentation:
#--------
# Options
#--------
# Set to true to add wealth data:
wealth = FALSE
small = FALSE
# ipak function: install and load multiple R packages.
# check to see if packages are installed. Install them if they are not, then load them into the R session.
# source: https://gist.github.com/stevenworthington/3178163
ipak <- function(pkg){
  new.pkg <- pkg[!(pkg %in% installed.packages()[, "Package"])]
  if (length(new.pkg))
    install.packages(new.pkg, dependencies = TRUE)
  sapply(pkg, require, character.only = TRUE)
}
# Load the required packages
# If you don't have them already installed, it may take a while
packages <- c("psidR")
ipak(packages)
require(psidR)
require(rjson)
require(data.table)
# ################################################
# Real-world example: not run because takes long.
# Build panel with income, wage, age and education
# optionally: add wealth supplements!
# ################################################
# The package is installed with a list of variables
# Alternatively, search for names with \\code{\\link{getNamesPSID}}
# This is the body of function build.psid()
# (so why not call build.psid() and see what happens!)
r = system.file(package="psidR")
if (small){
  f = fread(file.path(r,"psid-lists","famvars-small.txt"))
  i = fread(file.path(r,"psid-lists","indvars-small.txt"))
  if (wealth){
    w = fread(file.path(r,"psid-lists","wealthvars-small.txt"))
  }
} else {
  f = fread(file.path(r,"psid-lists","famvars.txt"))
  i = fread(file.path(r,"psid-lists","indvars.txt"))
  if (wealth){
    w = fread(file.path(r,"psid-lists","wealthvars.txt"))
  }
}
setkey(i,"name")
setkey(f,"name")
if (wealth) setkey(w,"name")
i = dcast(i[,list(year,name,variable)],year~name)
f = dcast(f[,list(year,name,variable)],year~name)
if (wealth) {
  w = dcast(w[,list(year,name,variable)],year~name)
  d = build.panel(datadir=data_dir,fam.vars=f,
                  ind.vars=i,wealth.vars=w,
                  heads.only =TRUE,sample="SRC",
                  design="all")
  save(d, file= paste0(data_dir, "/psid.RData"))
} else {
  d = build.panel(datadir=data_dir,fam.vars=f,
                  ind.vars=i,
                  heads.only =TRUE,sample="SRC",
                  design="all")
  save(d, file= paste0(data_dir, "/psid_no_wealth.RData"))
}
I encountered the following error message
INFO [2019-11-27 23:24:40] psidR: currently working on data for year 1969
Error in `[.data.table`(yind, , `:=`((as.character(ind.nas)), NA)) :
Can't assign to the same column twice in the same query (duplicates detected).
In addition: Warning message:
In `[.data.table`(tmp, , `:=`((nanames), NA_real_), with = FALSE) :
with=FALSE ignored, it isn't needed when using :=. See ?':=' for examples.
Is there any easy fix for this problem?
This issue is related to #32
Thanks.
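For reference, the data.table error quoted above can be reproduced outside psidR. This is only a minimal sketch on toy data: the table `dt` and the vector `missing_vars` are made up for illustration and are not psidR internals. It shows that `:=` refuses a left-hand side that names the same column twice, and that de-duplicating the names avoids the error.

```r
library(data.table)

# Toy data; not taken from the PSID
dt <- data.table(age = c(34L, 62L, 23L))

# Hypothetical vector of column names to fill with NA; "educ" appears twice
missing_vars <- c("educ", "educ")

# Try the assignment on a copy so that `dt` itself is left untouched
err <- tryCatch(
  {
    copy(dt)[, (missing_vars) := NA]
    NULL
  },
  error = function(e) conditionMessage(e)
)
print(err)  # the error message raised by data.table, if any

# De-duplicating the names first lets the assignment go through
dt[, (unique(missing_vars)) := NA]
print(names(dt))
```

Whether this is what happens inside build.panel for 1969 is a guess on my part; the sketch only illustrates the class of error.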
Extra info:
- If I set the variable small = TRUE, everything works fine. Is there a problem with the year 1969 only?
INFO [2019-11-28 08:38:52] found FAM2013ER.rda already downloaded
INFO [2019-11-28 08:38:52] found FAM2015ER.rda already downloaded
INFO [2019-11-28 08:38:52] everything already downloaded. Build dataset now
INFO [2019-11-28 08:38:52] psidR: Loading Family data from .rda files
INFO [2019-11-28 08:38:56] psidR: loaded individual file: /media/julien/TOSHIBA EXT/DATASETS/PSID_data/IND2015ER.rda
INFO [2019-11-28 08:38:56] psidR: total memory load in MB: 1400
INFO [2019-11-28 08:38:56] psidR: currently working on data for year 2013
INFO [2019-11-28 08:38:56] full 2013 sample has 80666 obs
INFO [2019-11-28 08:38:56] you selected 33940 obs belonging to SRC
INFO [2019-11-28 08:38:56] dropping non-heads leaves 5450 obs
INFO [2019-11-28 08:38:57] psidR: currently working on data for year 2015
INFO [2019-11-28 08:38:57] full 2015 sample has 80666 obs
INFO [2019-11-28 08:38:57] you selected 33940 obs belonging to SRC
INFO [2019-11-28 08:38:57] dropping non-heads leaves 5318 obs
INFO [2019-11-28 08:38:58] End of build.panel
> head(d)
faminc hours hvalue mortgage own state interview ID1968 pernum sequence relation.head age educ
1: 10950 40 0 0 5 4 1 860 1 1 10 72 14
2: 40942 0 148000 0 1 41 2 459 1 1 10 79 12
3: 52300 0 90000 35000 1 9 3 581 3 1 10 62 10
4: 26400 2096 90000 0 1 1 4 1438 187 1 10 34 12
5: 8520 0 0 0 8 42 5 1034 3 1 10 62 12
6: 43050 1520 0 0 5 48 6 691 33 1 10 23 14
empstat weight pid year
1: 4 54.070 860001 2013
2: 4 52.431 459001 2013
3: 4 86.861 581003 2013
4: 1 0.000 1438187 2013
5: 5 12.756 1034003 2013
6: 1 25.676 691033 2013
Another follow-up:
- there are <NA> entries in i:
i = fread(file.path(r,"psid-lists","indvars-small.txt"))
> head(i)
year age educ empstat weight
1: 1968 ER30004 ER30010 <NA> ER30019
2: 1969 ER30023 <NA> <NA> ER30042
3: 1970 ER30046 ER30052 <NA> ER30066
4: 1971 ER30070 ER30076 <NA> ER30090
5: 1972 ER30094 ER30100 <NA> ER30116
6: 1973 ER30120 ER30126 <NA> ER30137
- I am using the version from CRAN, I will re-install the package and try again with:
install.packages('devtools')
install_github("psidR",username="floswald")
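To see at a glance which year/variable combinations are <NA> in a wide variable table like the one printed above, something along these lines may help. It is a sketch on a hand-typed copy of the first three rows of that output; the object names `i_wide`, `i_long` and `missing_by_year` are my own and not part of psidR.

```r
library(data.table)

# First three rows of the wide variable table shown above, typed in by hand
i_wide <- data.table(
  year    = c(1968, 1969, 1970),
  age     = c("ER30004", "ER30023", "ER30046"),
  educ    = c("ER30010", NA, "ER30052"),
  empstat = NA_character_,
  weight  = c("ER30019", "ER30042", "ER30066")
)

# Back to long format: one row per (year, name) pair
i_long <- melt(i_wide, id.vars = "year",
               variable.name = "name", value.name = "variable")

# Count the missing variable codes per year and list their names
missing_by_year <- i_long[
  is.na(variable),
  .(n_missing = .N, names = paste(name, collapse = ", ")),
  keyby = year
]
print(missing_by_year)
```

In the rows shown, 1969 is the only year with two missing entries (educ and empstat), which may or may not be related to the error.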
Edit:
It's not an issue of CRAN versus the latest version on GitHub. I have just reinstalled the package and the problem is still there:
install.packages('devtools')
require(devtools)
install_github(repo = "https://github.com/floswald/psidR")
it's hard to debug without downloading the whole slew of data again (doing that now). what's the earliest year where it works (i.e. after 1969)? the problem is each time they update the data.table API, some functionality in here breaks...
my guess is that the problem is the wealth supplement. this has changed in the PSID, so currently wrong here. the wealth variables have been moved to the family files from 1999 onwards, so you should just select those in your fam.vars data.frame. try setting wealth=FALSE, and including the wealth vars in fam.vars for 1999 onwards. for the years before the current approach (download separate supplement) seems still to be valid, however, the code will fail when it attempts to download the supplement for 1999 (which is gone, AFAIK)
ping @SchroederAdrian
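A rough sketch of what "including the wealth vars in fam.vars" could look like, assuming the long year/name/variable layout used by the lists in the example script. All variable codes below are placeholders, not real PSID names; the actual codes have to be looked up in the PSID documentation (or with getNamesPSID).

```r
library(data.table)

# Family variables in long format: one row per (year, name).
# The codes are placeholders, NOT real PSID variable names.
f_long <- data.table(
  year     = c(1997, 1999, 2001),
  name     = "faminc",
  variable = c("FAMINC_1997", "FAMINC_1999", "FAMINC_2001")
)

# Wealth variables, listed only for the years in which they are
# part of the family file (1999 onwards, per the comment above)
wealth_long <- data.table(
  year     = c(1999, 2001),
  name     = "wealth",
  variable = c("WEALTH_1999", "WEALTH_2001")
)

# Stack both lists and reshape to one row per year, one column per name
f_all  <- rbind(f_long, wealth_long)
f_wide <- dcast(f_all, year ~ name, value.var = "variable")
print(f_wide)
```

Years without a wealth entry simply get NA in the wealth column. The resulting table would then be passed as fam.vars to build.panel, leaving out the wealth.vars argument.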
try setting wealth=FALSE, and including the wealth vars in fam.vars for 1999 onwards.
I will try to do that.
what's the earliest year where it works (i.e. after 1969)?
I am on it. I will restrict the sample until it works.
The data freshly downloaded (152 MB) is here: https://www.dropbox.com/sh/oprzmrjx8xkp48s/AAAcPpdS62S2Iy5xnAQP8yMJa?dl=0
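Restricting the sample until it works can be automated with a small loop. This is only a sketch: `find_first_working_year` is a hypothetical helper, and `fake_build` is a stand-in that fails for early years purely for illustration. In practice the stand-in would be replaced by a function that filters the variable lists to `year >= first_year` and calls build.panel.

```r
# Hypothetical helper: return the earliest start year for which
# `build_fn(first_year)` runs without raising an error
find_first_working_year <- function(years, build_fn) {
  for (y in sort(years)) {
    ok <- tryCatch(
      {
        build_fn(y)
        TRUE
      },
      error = function(e) FALSE
    )
    if (ok) return(y)
  }
  NA_integer_  # no candidate year worked
}

# Stand-in for the real build step; the failure is simulated
fake_build <- function(first_year) {
  if (first_year < 1970) stop("simulated failure")
  invisible(first_year)
}

first_ok <- find_first_working_year(1968:1972, fake_build)
print(first_ok)
```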
thanks! actually it's almost done (fast internet today!) will check later today.
- With wealth=FALSE, when I drop 1968 and 1969, the real-world example works:
# Clear the workspace:
rm(list = ls())
#install.packages('devtools')
#require(devtools)
#install_github(repo = "https://github.com/floswald/psidR")
#------
# Paths
#------
# Path to main file:
# !Adjust this to your own setting!
path_to_main = "/home/julien/Documents/REPOSITORIES/PSIDPanelBuilder"
# Where the data is stored:
#data_dir = "/home/julien/MEGA/Dataset/PSID"
data_dir = "/media/julien/TOSHIBA EXT/DATASETS/PSID_data"
#--------
# Options
#--------
# Set to true to add wealth data:
wealth = FALSE
# Set to true to work with the small dataset
small = FALSE
first_year = 1970
# ipak function: install and load multiple R packages.
# check to see if packages are installed. Install them if they are not, then load them into the R session.
# source: https://gist.github.com/stevenworthington/3178163
ipak <- function(pkg){
  new.pkg <- pkg[!(pkg %in% installed.packages()[, "Package"])]
  if (length(new.pkg))
    install.packages(new.pkg, dependencies = TRUE)
  sapply(pkg, require, character.only = TRUE)
}
# Load the required packages
# If you don't have them already installed, it may take a while
packages <- c("psidR")
ipak(packages)
require(psidR)
require(rjson)
require(data.table)
# Let's read PSID data from json file
login <- fromJSON(file = paste0(path_to_main, "/login_psid.json"))
# ################################################
# Real-world example: not run because takes long.
# Build panel with income, wage, age and education
# optionally: add wealth supplements!
# ################################################
# The package is installed with a list of variables
# Alternatively, search for names with \\code{\\link{getNamesPSID}}
# This is the body of function build.psid()
# (so why not call build.psid() and see what happens!)
r = system.file(package="psidR")
if (small){
  f = fread(file.path(r,"psid-lists","famvars-small.txt"))
  i = fread(file.path(r,"psid-lists","indvars-small.txt"))
  if (wealth){
    w = fread(file.path(r,"psid-lists","wealthvars-small.txt"))
  }
} else {
  f = fread(file.path(r,"psid-lists","famvars.txt"))
  i = fread(file.path(r,"psid-lists","indvars.txt"))
  if (wealth){
    w = fread(file.path(r,"psid-lists","wealthvars.txt"))
  }
}
# Selected years only
i <- i[which(i$year>=first_year), ]
f <- f[which(f$year>=first_year), ]
setkey(i,"name")
setkey(f,"name")
if (wealth) setkey(w,"name")
i = dcast(i[,list(year,name,variable)],year~name)
f = dcast(f[,list(year,name,variable)],year~name)
if (wealth) {
  w = dcast(w[,list(year,name,variable)],year~name)
  d = build.panel(datadir=data_dir,fam.vars=f,
                  ind.vars=i,wealth.vars=w,
                  heads.only =TRUE,sample="SRC",
                  design="all")
  save(d, file= paste0(data_dir, "/psid.RData"))
} else {
  d = build.panel(datadir=data_dir,fam.vars=f,
                  ind.vars=i,
                  heads.only =TRUE,sample="SRC",
                  design="all")
  save(d, file= paste0(data_dir, "/psid_no_wealth.RData"))
}
head(d)
> head(d)
empstat_ faminc hours hvalue mortgage own state interview bankrupt1_7_13 bankrupt1_enddebt
1: 1 9244 1976 12000 9116 1 12 1 NA NA
2: 1 17000 2040 18000 10000 1 24 2 NA NA
3: 3 2552 360 2000 0 1 12 13 NA NA
4: 1 5500 1880 1500 0 1 12 14 NA NA
5: 1 4815 420 15000 0 1 32 15 NA NA
6: 3 6229 73 3500 400 1 32 16 NA NA
bankrupt1_initdebt bankrupt1_property bankrupt1_property_val bankrupt1_repay_amt
1: NA NA NA NA
2: NA NA NA NA
3: NA NA NA NA
4: NA NA NA NA
5: NA NA NA NA
6: NA NA NA NA
bankrupt1_repay_done bankrupt1_repay_length bankrupt1_repay_per bankrupt1_where bankrupt1_why1
1: NA NA NA NA NA
2: NA NA NA NA NA
3: NA NA NA NA NA
4: NA NA NA NA NA
5: NA NA NA NA NA
6: NA NA NA NA NA
bankrupt1_why2 bankrupt1_why3 bankrupt1_year bankrupt2_7_13 bankrupt2_enddebt bankrupt2_initdebt
1: NA NA NA NA NA NA
2: NA NA NA NA NA NA
3: NA NA NA NA NA NA
4: NA NA NA NA NA NA
5: NA NA NA NA NA NA
6: NA NA NA NA NA NA
bankrupt2_property bankrupt2_property_val bankrupt2_repay_amt bankrupt2_repay_done
1: NA NA NA NA
2: NA NA NA NA
3: NA NA NA NA
4: NA NA NA NA
5: NA NA NA NA
6: NA NA NA NA
bankrupt2_repay_length bankrupt2_repay_per bankrupt2_where bankrupt2_why1 bankrupt2_why2
1: NA NA NA NA NA
2: NA NA NA NA NA
3: NA NA NA NA NA
4: NA NA NA NA NA
5: NA NA NA NA NA
6: NA NA NA NA NA
bankrupt2_why3 bankrupt2_year bankrupt_91 bankrupt_num debt wealth ID1968 pernum sequence
1: NA NA NA NA NA NA 543 1 1
2: NA NA NA NA NA NA 794 1 1
3: NA NA NA NA NA NA 1688 1 1
4: NA NA NA NA NA NA 2202 1 1
5: NA NA NA NA NA NA 721 1 1
6: NA NA NA NA NA NA 752 1 1
relation.head age educ weight empstat pid year
1: 1 66 0 27.4 NA 543001 1970
2: 1 35 0 27.4 NA 794001 1970
3: 1 75 0 26.3 NA 1688001 1970
4: 1 35 0 26.3 NA 2202001 1970
5: 1 69 0 23.8 NA 721001 1970
6: 1 59 0 7.0 NA 752001 1970
- With wealth=TRUE, the script fails:
INFO [2019-11-28 10:52:01] full 1984 sample has 80666 obs
INFO [2019-11-28 10:52:01] you selected 33940 obs belonging to SRC
INFO [2019-11-28 10:52:01] dropping non-heads leaves 3729 obs
Error in `[.data.table`(tmp, , codes, with = FALSE) :
column(s) not found: NA
In addition: There were 15 warnings (use warnings() to see them)
excellent, that means my lead in #36 is correct. so: build without wealth supplements for now (select the wealth vars from family files for 1999 onwards!)
Thanks for being on this, guys! I manually compiled the files using lodown and your psidR helper functions, but it's definitely handy to have a second dataset to compare to now. Great package besides that! The PSID is only useful when it's easily accessible, and your functions have saved me a lot of time already.
the 1969 and wealth issues are separate bugs I think. there is no wealth supplement in 1969, so that cannot be the cause of the error. For completeness, here is the full output
> d = build.panel(datadir="~/data/psid",fam.vars=f,
+ ind.vars=i,wealth.vars=w,
+ heads.only =TRUE,sample="SRC",
+ design="all")
INFO [2019-11-28 08:56:18] will download as WEALTH1984ER.rda
INFO [2019-11-28 08:56:18] will download as WEALTH1989ER.rda
INFO [2019-11-28 08:56:18] will download as WEALTH1994ER.rda
INFO [2019-11-28 08:56:18] will download as WEALTH1999ER.rda
INFO [2019-11-28 08:56:18] will download as WEALTH2001ER.rda
INFO [2019-11-28 08:56:18] will download as WEALTH2003ER.rda
INFO [2019-11-28 08:56:18] will download as WEALTH2005ER.rda
INFO [2019-11-28 08:56:18] will download as WEALTH2007ER.rda
INFO [2019-11-28 08:56:18] Will download missing datasets now
INFO [2019-11-28 08:56:18] will download family files: 1968, 1969, 1970, 1971, 1972, 1973, 1974, 1975, 1976, 1977, 1978, 1979, 1980, 1981, 1982, 1983, 1984, 1985, 1986, 1987, 1988, 1989, 1990, 1991, 1992, 1993, 1994, 1995, 1996, 1997, 1999, 2001, 2003, 2005, 2007, 2009, 2011, 2013, 2015
INFO [2019-11-28 08:56:18] will download: IND2015ER
INFO [2019-11-28 08:56:18] will download missing wealth files.
This can take several hours/days to download.
want to go ahead? give me 'yes' or 'no'.yes
please enter your PSID username: *****
please enter your PSID password:
INFO [2019-11-28 08:56:47] downloading file ~/data/psid/FAM1968ER
INFO [2019-11-28 08:56:48] now reading and processing SAS file ~/data/psid/FAM1968ER into R
INFO [2019-11-28 08:57:12] downloading file ~/data/psid/FAM1969ER
INFO [2019-11-28 08:57:15] now reading and processing SAS file ~/data/psid/FAM1969ER into R
INFO [2019-11-28 08:57:42] downloading file ~/data/psid/FAM1970ER
INFO [2019-11-28 08:57:43] now reading and processing SAS file ~/data/psid/FAM1970ER into R
INFO [2019-11-28 08:58:12] downloading file ~/data/psid/FAM1971ER
INFO [2019-11-28 08:58:13] now reading and processing SAS file ~/data/psid/FAM1971ER into R
INFO [2019-11-28 08:58:42] downloading file ~/data/psid/FAM1972ER
INFO [2019-11-28 08:58:43] now reading and processing SAS file ~/data/psid/FAM1972ER into R
INFO [2019-11-28 08:59:15] downloading file ~/data/psid/FAM1973ER
INFO [2019-11-28 08:59:16] now reading and processing SAS file ~/data/psid/FAM1973ER into R
INFO [2019-11-28 08:59:34] downloading file ~/data/psid/FAM1974ER
INFO [2019-11-28 08:59:40] now reading and processing SAS file ~/data/psid/FAM1974ER into R
INFO [2019-11-28 08:59:59] downloading file ~/data/psid/FAM1975ER
INFO [2019-11-28 09:00:01] now reading and processing SAS file ~/data/psid/FAM1975ER into R
INFO [2019-11-28 09:00:29] downloading file ~/data/psid/FAM1976ER
INFO [2019-11-28 09:00:30] now reading and processing SAS file ~/data/psid/FAM1976ER into R
INFO [2019-11-28 09:01:25] downloading file ~/data/psid/FAM1977ER
INFO [2019-11-28 09:01:26] now reading and processing SAS file ~/data/psid/FAM1977ER into R
INFO [2019-11-28 09:01:57] downloading file ~/data/psid/FAM1978ER
INFO [2019-11-28 09:01:59] now reading and processing SAS file ~/data/psid/FAM1978ER into R
INFO [2019-11-28 09:02:34] downloading file ~/data/psid/FAM1979ER
INFO [2019-11-28 09:02:35] now reading and processing SAS file ~/data/psid/FAM1979ER into R
INFO [2019-11-28 09:03:12] downloading file ~/data/psid/FAM1980ER
INFO [2019-11-28 09:03:14] now reading and processing SAS file ~/data/psid/FAM1980ER into R
INFO [2019-11-28 09:03:54] downloading file ~/data/psid/FAM1981ER
INFO [2019-11-28 09:03:56] now reading and processing SAS file ~/data/psid/FAM1981ER into R
INFO [2019-11-28 09:04:39] downloading file ~/data/psid/FAM1982ER
INFO [2019-11-28 09:04:41] now reading and processing SAS file ~/data/psid/FAM1982ER into R
INFO [2019-11-28 09:05:21] downloading file ~/data/psid/FAM1983ER
INFO [2019-11-28 09:05:23] now reading and processing SAS file ~/data/psid/FAM1983ER into R
INFO [2019-11-28 09:06:12] downloading file ~/data/psid/FAM1984ER
INFO [2019-11-28 09:06:14] now reading and processing SAS file ~/data/psid/FAM1984ER into R
INFO [2019-11-28 09:07:32] downloading file ~/data/psid/FAM1985ER
INFO [2019-11-28 09:07:34] now reading and processing SAS file ~/data/psid/FAM1985ER into R
INFO [2019-11-28 09:09:15] downloading file ~/data/psid/FAM1986ER
INFO [2019-11-28 09:09:17] now reading and processing SAS file ~/data/psid/FAM1986ER into R
INFO [2019-11-28 09:10:41] downloading file ~/data/psid/FAM1987ER
INFO [2019-11-28 09:10:45] now reading and processing SAS file ~/data/psid/FAM1987ER into R
INFO [2019-11-28 09:11:57] downloading file ~/data/psid/FAM1988ER
INFO [2019-11-28 09:12:02] now reading and processing SAS file ~/data/psid/FAM1988ER into R
INFO [2019-11-28 09:13:36] downloading file ~/data/psid/FAM1989ER
INFO [2019-11-28 09:13:39] now reading and processing SAS file ~/data/psid/FAM1989ER into R
INFO [2019-11-28 09:15:09] downloading file ~/data/psid/FAM1990ER
INFO [2019-11-28 09:15:12] now reading and processing SAS file ~/data/psid/FAM1990ER into R
INFO [2019-11-28 09:17:01] downloading file ~/data/psid/FAM1991ER
INFO [2019-11-28 09:17:06] now reading and processing SAS file ~/data/psid/FAM1991ER into R
INFO [2019-11-28 09:19:04] downloading file ~/data/psid/FAM1992ER
INFO [2019-11-28 09:19:06] now reading and processing SAS file ~/data/psid/FAM1992ER into R
INFO [2019-11-28 09:21:07] downloading file ~/data/psid/FAM1993ER
INFO [2019-11-28 09:21:12] now reading and processing SAS file ~/data/psid/FAM1993ER into R
INFO [2019-11-28 09:24:10] downloading file ~/data/psid/FAM1994ER
INFO [2019-11-28 09:24:13] now reading and processing SAS file ~/data/psid/FAM1994ER into R
INFO [2019-11-28 09:28:09] downloading file ~/data/psid/FAM1995ER
INFO [2019-11-28 09:28:11] now reading and processing SAS file ~/data/psid/FAM1995ER into R
INFO [2019-11-28 09:32:01] downloading file ~/data/psid/FAM1996ER
INFO [2019-11-28 09:32:03] now reading and processing SAS file ~/data/psid/FAM1996ER into R
INFO [2019-11-28 09:35:29] downloading file ~/data/psid/FAM1997ER
INFO [2019-11-28 09:35:31] now reading and processing SAS file ~/data/psid/FAM1997ER into R
INFO [2019-11-28 09:38:12] downloading file ~/data/psid/FAM1999ER
INFO [2019-11-28 09:38:14] now reading and processing SAS file ~/data/psid/FAM1999ER into R
INFO [2019-11-28 09:42:38] downloading file ~/data/psid/FAM2001ER
INFO [2019-11-28 09:42:41] now reading and processing SAS file ~/data/psid/FAM2001ER into R
INFO [2019-11-28 09:47:16] downloading file ~/data/psid/FAM2003ER
INFO [2019-11-28 09:47:20] now reading and processing SAS file ~/data/psid/FAM2003ER into R
INFO [2019-11-28 09:51:59] downloading file ~/data/psid/FAM2005ER
INFO [2019-11-28 09:52:11] now reading and processing SAS file ~/data/psid/FAM2005ER into R
INFO [2019-11-28 09:56:49] downloading file ~/data/psid/FAM2007ER
INFO [2019-11-28 09:56:53] now reading and processing SAS file ~/data/psid/FAM2007ER into R
INFO [2019-11-28 10:04:25] downloading file ~/data/psid/FAM2009ER
INFO [2019-11-28 10:04:29] now reading and processing SAS file ~/data/psid/FAM2009ER into R
INFO [2019-11-28 10:12:08] downloading file ~/data/psid/FAM2011ER
INFO [2019-11-28 10:12:10] now reading and processing SAS file ~/data/psid/FAM2011ER into R
INFO [2019-11-28 10:20:19] downloading file ~/data/psid/FAM2013ER
INFO [2019-11-28 10:20:26] now reading and processing SAS file ~/data/psid/FAM2013ER into R
INFO [2019-11-28 13:12:58] downloading file ~/data/psid/FAM2015ER
INFO [2019-11-28 13:13:01] now reading and processing SAS file ~/data/psid/FAM2015ER into R
INFO [2019-11-28 13:22:37] downloading file ~/data/psid/WEALTH1984ER
INFO [2019-11-28 13:22:38] now reading and processing SAS file ~/data/psid/WEALTH1984ER into R
INFO [2019-11-28 13:22:42] downloading file ~/data/psid/WEALTH1989ER
INFO [2019-11-28 13:22:42] now reading and processing SAS file ~/data/psid/WEALTH1989ER into R
INFO [2019-11-28 13:22:47] downloading file ~/data/psid/WEALTH1994ER
INFO [2019-11-28 13:22:48] now reading and processing SAS file ~/data/psid/WEALTH1994ER into R
INFO [2019-11-28 13:22:54] downloading file ~/data/psid/WEALTH1999ER
INFO [2019-11-28 13:22:54] now reading and processing SAS file ~/data/psid/WEALTH1999ER into R
INFO [2019-11-28 13:23:00] downloading file ~/data/psid/WEALTH2001ER
INFO [2019-11-28 13:23:00] now reading and processing SAS file ~/data/psid/WEALTH2001ER into R
INFO [2019-11-28 13:23:06] downloading file ~/data/psid/WEALTH2003ER
INFO [2019-11-28 13:23:06] now reading and processing SAS file ~/data/psid/WEALTH2003ER into R
INFO [2019-11-28 13:23:11] downloading file ~/data/psid/WEALTH2005ER
INFO [2019-11-28 13:23:12] now reading and processing SAS file ~/data/psid/WEALTH2005ER into R
INFO [2019-11-28 13:23:17] downloading file ~/data/psid/WEALTH2007ER
INFO [2019-11-28 13:23:18] now reading and processing SAS file ~/data/psid/WEALTH2007ER into R
INFO [2019-11-28 13:23:23] downloading file ~/data/psid/IND2015ER
INFO [2019-11-28 13:23:31] now reading and processing SAS file ~/data/psid/IND2015ER into R
INFO [2019-11-28 14:02:04] finished downloading files to ~/data/psid/
INFO [2019-11-28 14:02:04] continuing now to build the dataset
INFO [2019-11-28 14:02:04] psidR: Loading Family data from .rda files
INFO [2019-11-28 14:02:13] psidR: loaded individual file: ~/data/psid/IND2015ER.rda
INFO [2019-11-28 14:02:13] psidR: total memory load in MB: 1400
INFO [2019-11-28 14:02:13] psidR: currently working on data for year 1968
INFO [2019-11-28 14:02:14] full 1968 sample has 80666 obs
INFO [2019-11-28 14:02:14] you selected 33940 obs belonging to SRC
INFO [2019-11-28 14:02:14] dropping non-heads leaves 2930 obs
INFO [2019-11-28 14:02:14] psidR: currently working on data for year 1969
Error in `[.data.table`(yind, , `:=`((as.character(ind.nas)), NA)) :
Can't assign to the same column twice in the same query (duplicates detected).
In addition: Warning messages:
1: In if (!wlth.down) { :
the condition has length > 1 and only the first element will be used
2: In `[.data.table`(tmp, , `:=`((nanames), NA_real_), with = FALSE) :
with=FALSE ignored, it isn't needed when using :=. See ?':=' for examples.
> traceback()
3: `[.data.table`(yind, , `:=`((as.character(ind.nas)), NA)) at build.panel.r#435
2: yind[, `:=`((as.character(ind.nas)), NA)] at build.panel.r#435
1: build.panel(datadir = "~/data/psid", fam.vars = f, ind.vars = i,
wealth.vars = w, heads.only = TRUE, sample = "SRC", design = "all")
well, took me only 3 months, but that should work now. @SchroederAdrian @JulienPascal
my guess is that the problem is the wealth supplement. this has changed in the PSID, so currently wrong here. the wealth variables have been moved to the family files from 1999 onwards, so you should just select those in your fam.vars data.frame. try setting wealth=FALSE, and including the wealth vars in fam.vars for 1999 onwards. for the years before the current approach (download separate supplement) seems still to be valid, however, the code will fail when it attempts to download the supplement for 1999 (which is gone, AFAIK)
Hi, I tried putting the wealth variables in the famvars.txt file for a 1999-2009 panel. However, I keep getting this error:
Error in `[.data.table`(tmp, , codes[-na], with = FALSE) :
column(s) not found: s402, s407, s417
Is there anything I can do to solve this?
Thanks!
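One way to narrow down a "column(s) not found" error like the one above is to compare the requested codes with the column names of the loaded table before selecting. The sketch below uses a toy table; `tmp`, its columns and their values are invented for illustration, and only the three codes are taken from the error message. It separates codes that differ from an existing column only by letter case from codes that are absent altogether. It does not claim to explain why psidR fails here.

```r
library(data.table)

# Toy stand-in for a loaded family file; columns and values are invented
tmp <- data.table(interview = 1:2, S402 = c(1, 2), s417 = c(3, 4))

# Codes requested, as listed in the error message above
codes <- c("s402", "s407", "s417")

# Codes with no exact match among the column names
missing_exact <- setdiff(codes, names(tmp))

# Of those, codes that would match if letter case were ignored
case_only <- missing_exact[toupper(missing_exact) %in% toupper(names(tmp))]

# Codes that are absent whichever way they are spelled
truly_missing <- setdiff(missing_exact, case_only)

print(list(case_only = case_only, truly_missing = truly_missing))

# Selecting only the codes that exist avoids the error
found <- tmp[, intersect(codes, names(tmp)), with = FALSE]
print(names(found))
```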