mikeizbicki/cmc-csci143

Incorrect file sizes


Hello,

After running nohup sh load_tweets_parallel.sh for about 3 hours, the data directories have unexpected sizes. I have already fixed the schema issues.

docker-compose exec pg_denormalized sh -c 'du -hd0 $PGDATA'
49G

docker-compose exec pg_normalized_batch sh -c 'du -hd0 $PGDATA'
49G

Does anyone know why or how I could fix it?

It looks like you have probably inserted data twice into the pg_denormalized database, and your insert into pg_normalized_batch was interrupted for some reason.

The most correct thing to do is to restart from scratch: delete your existing databases and reinsert the data. Since that will take a long time, however, I will waive the requirement that the pg_normalized_batch test cases pass for you, so you can begin working on the CREATE INDEX commands. I can't do that for pg_denormalized, because you'll need the test cases to know whether the SQL SELECT statements you've written are correct. For that database, you will have to delete everything and start over.
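
For reference, a minimal sketch of what "delete everything and start over" could look like. It assumes the databases keep their data in Docker volumes declared in docker-compose.yml; if your compose file bind-mounts a host directory instead, `down -v` will not remove it and you would have to delete that directory yourself. The helper names `reset_databases` and `data_size` and the `COMPOSE` variable are made up for this example and are not part of the repo.

```shell
# Hypothetical helpers, not part of the course repo.
# COMPOSE can be overridden, e.g. COMPOSE="docker compose" on newer installs.
COMPOSE="${COMPOSE:-docker-compose}"

# Remove the containers together with their volumes, then start fresh ones.
# Assumption: the data lives in named or anonymous volumes, which `down -v`
# deletes. Bind-mounted host directories are left untouched.
reset_databases() {
    $COMPOSE down -v &&
    $COMPOSE up -d
}

# Print the on-disk size of one service's PostgreSQL data directory.
# The single quotes keep $PGDATA from being expanded on the host, so it is
# resolved inside the container.
data_size() {
    $COMPOSE exec "$1" sh -c 'du -hd0 $PGDATA'
}
```

After sourcing this, you would run `reset_databases`, start the load again with `nohup sh load_tweets_parallel.sh &`, and check progress with `data_size pg_denormalized`. Note that `down -v` resets every service in the compose file, so this wipes pg_normalized_batch as well.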