mikeizbicki/cmc-csci143

Error when loading in pg_denormalized data (invalid input syntax for type json)

Closed this issue · 0 comments

Has anyone else gotten this error when loading their pg_denormalized data? I am trying to figure out what causes it. I didn't get this error previously when running nohup ./load_tweets_parallel.sh &; it only appeared after I deleted all my pg_denormalized data and loaded it again.

================================================================================
load pg_denormalized
================================================================================
/data/tweets/geoTwitter21-01-02.zip
ERROR:  invalid input syntax for type json
DETAIL:  Expected end of input, but found ",".
CONTEXT:  JSON data, line 1: ..."follow_request_sent":null,"notifications":null},...
COPY tweets_jsonb, line 147797, column data: "{"created_at":"Sat Jan 02 01:08:02 +0000 2021","id":1345174980166885377,"id_str":"134517498016688537..."
/data/tweets/geoTwitter21-01-01.zip
ERROR:  invalid input syntax for type json
DETAIL:  Token "\" is invalid.
CONTEXT:  JSON data, line 1: ...":"GrandCrossoverService","indices":[11352327681\...
COPY tweets_jsonb, line 149022, column data: "{"created_at":"Fri Jan 01 00:54:05 +0000 2021","id":1344809084135108611,"id_str":"134480908413510861..."
/data/tweets/geoTwitter21-01-09.zip
ERROR:  invalid input syntax for type json
DETAIL:  "\u" must be followed by four hexadecimal digits.
CONTEXT:  JSON data, line 1: ...00000","profile_background_image_url":"http:\/\ur...
COPY tweets_jsonb, line 146592, column data: "{"created_at":"Sat Jan 09 00:57:29 +0000 2021","id":1347709042064502788,"id_str":"134770904206450278..."
/data/tweets/geoTwitter21-01-10.zip
ERROR:  invalid input syntax for type json
DETAIL:  Expected "," or "}", but found ":".
CONTEXT:  JSON data, line 1: {"created_at":"Sun Jan 10 01:03:23 +0000 20ay_url":...
COPY tweets_jsonb, line 144168, column data: "{"created_at":"Sun Jan 10 01:03:23 +0000 20ay_url":"twitter.com\/i\/web\/status\/1\u2026","indices":..."
/data/tweets/geoTwitter21-01-08.zip
ERROR:  invalid input syntax for type json
DETAIL:  Token "DDEEF6" is invalid.
CONTEXT:  JSON data, line 1: ...rs_count":203,"friends_count":218,"listed:"DDEEF6...
COPY tweets_jsonb, line 141092, column data: "{"created_at":"Fri Jan 08 00:58:48 +0000 2021","id":1347346983611203586,"id_str":"134734698361120358..."
/data/tweets/geoTwitter21-01-04.zip
ERROR:  invalid input syntax for type json
DETAIL:  Expected "," or "}", but found ":".
CONTEXT:  JSON data, line 1: ..._tw_video_thumb\/1304990340932349956\/pu_offset":...
COPY tweets_jsonb, line 145883, column data: "{"created_at":"Mon Jan 04 01:00:38 +0000 2021","id":1345897896542154753,"id_str":"134589789654215475..."
/data/tweets/geoTwitter21-01-05.zip
ERROR:  invalid input syntax for type json
DETAIL:  Expected "," or "}", but found ":".
CONTEXT:  JSON data, line 1: ...user_mentions":[{"screen_name":"purpleloveplusr":...
COPY tweets_jsonb, line 145297, column data: "{"created_at":"Tue Jan 05 01:04:07 +0000 2021","id":1346261161503825921,"id_str":"134626116150382592..."
/data/tweets/geoTwitter21-01-06.zip
ERROR:  invalid input syntax for type json
DETAIL:  The input string ended unexpectedly.
CONTEXT:  JSON data, line 1: ..."low","lang":"en","timestamp_ms":"1609894818535"}
/data/tweets/geoTwitter21-01-03.zip
ERROR:  invalid input syntax for type json
DETAIL:  Token "eriowallace" is invalid.
CONTEXT:  JSON data, line 1: ...34,"friends_count":407,"listed_count":eriowallace...
COPY tweets_jsonb, line 150236, column data: "{"created_at":"Sun Jan 03 01:01:50 +0000 2021","id":1345535810532241411,"id_str":"134553581053224141..."
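
One way to narrow this down is to check whether the lines in the zip files are themselves malformed, or whether they only get corrupted on the way into COPY. The following is a minimal diagnostic sketch, not part of the course scripts: it assumes each archive member holds one JSON object per line, and the helper names `find_bad_json_lines` and `check_zip` are hypothetical.

```python
import io
import json
import zipfile


def find_bad_json_lines(lines):
    """Return (line_number, error_message) for each line that fails to parse as JSON."""
    bad = []
    for lineno, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # blank lines are skipped, not reported
        try:
            json.loads(line)
        except json.JSONDecodeError as exc:
            bad.append((lineno, exc.msg))
    return bad


def check_zip(path):
    """Check every file inside a zip archive (assumption: one JSON object per line)."""
    results = {}
    with zipfile.ZipFile(path) as archive:
        for name in archive.namelist():
            if name.endswith("/"):
                continue  # directory entry, nothing to parse
            with archive.open(name) as raw:
                # Decode leniently so a stray bad byte does not abort the scan.
                text = io.TextIOWrapper(raw, encoding="utf-8", errors="replace")
                results[name] = find_bad_json_lines(text)
    return results
```

Calling `check_zip("/data/tweets/geoTwitter21-01-02.zip")` would return a dictionary mapping each member name to its list of bad lines. If every list comes back empty, the source files are probably fine and the corruption is more likely introduced during loading. Note that Python's parser and PostgreSQL's are not identical (for example, Python accepts `NaN`, and `jsonb` rejects `\u0000`), so a clean result here does not guarantee that COPY will accept every line.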