Issue w/ loading pg_normalized_batch
Closed this issue · 3 comments
Hi all,
I am having an issue loading the pg_normalized_batch data. I am receiving the following error:
Traceback (most recent call last):
File "/home/runner/work/twitter_postgres_parallel/twitter_postgres_parallel/load_tweets_batch.py", line 441, in <module>
connection = engine.connect()
File "/home/runner/.local/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 3325, in connect
return self._connection_cls(self, close_with_result=close_with_result)
File "/home/runner/.local/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 96, in __init__
else engine.raw_connection()
File "/home/runner/.local/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 3404, in raw_connection
return self._wrap_pool_connect(self.pool.connect, _connection)
File "/home/runner/.local/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 3374, in _wrap_pool_connect
Connection._handle_dbapi_exception_noconnection(
File "/home/runner/.local/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 2208, in _handle_dbapi_exception_noconnection
util.raise_(
File "/home/runner/.local/lib/python3.10/site-packages/sqlalchemy/util/compat.py", line 211, in raise_
raise exception
File "/home/runner/.local/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 3371, in _wrap_pool_connect
return fn()
File "/home/runner/.local/lib/python3.10/site-packages/sqlalchemy/pool/base.py", line 327, in connect
return _ConnectionFairy._checkout(self)
File "/home/runner/.local/lib/python3.10/site-packages/sqlalchemy/pool/base.py", line 894, in _checkout
fairy = _ConnectionRecord.checkout(pool)
File "/home/runner/.local/lib/python3.10/site-packages/sqlalchemy/pool/base.py", line 493, in checkout
rec = pool._do_get()
File "/home/runner/.local/lib/python3.10/site-packages/sqlalchemy/pool/impl.py", line 145, in _do_get
with util.safe_reraise():
File "/home/runner/.local/lib/python3.10/site-packages/sqlalchemy/util/langhelpers.py", line 70, in __exit__
compat.raise_(
File "/home/runner/.local/lib/python3.10/site-packages/sqlalchemy/util/compat.py", line 211, in raise_
raise exception
File "/home/runner/.local/lib/python3.10/site-packages/sqlalchemy/pool/impl.py", line 143, in _do_get
return self._create_connection()
File "/home/runner/.local/lib/python3.10/site-packages/sqlalchemy/pool/base.py", line 273, in _create_connection
return _ConnectionRecord(self)
File "/home/runner/.local/lib/python3.10/site-packages/sqlalchemy/pool/base.py", line 388, in __init__
self.__connect()
File "/home/runner/.local/lib/python3.10/site-packages/sqlalchemy/pool/base.py", line 690, in __connect
with util.safe_reraise():
File "/home/runner/.local/lib/python3.10/site-packages/sqlalchemy/util/langhelpers.py", line 70, in __exit__
compat.raise_(
File "/home/runner/.local/lib/python3.10/site-packages/sqlalchemy/util/compat.py", line 211, in raise_
raise exception
File "/home/runner/.local/lib/python3.10/site-packages/sqlalchemy/pool/base.py", line 686, in __connect
self.dbapi_connection = connection = pool._invoke_creator(self)
File "/home/runner/.local/lib/python3.10/site-packages/sqlalchemy/engine/create.py", line 574, in connect
return dialect.connect(*cargs, **cparams)
File "/home/runner/.local/lib/python3.10/site-packages/sqlalchemy/engine/default.py", line 598, in connect
return self.dbapi.connect(*cargs, **cparams)
File "/home/runner/.local/lib/python3.10/site-packages/psycopg2/__init__.py", line 122, in connect
conn = _connect(dsn, connection_factory=connection_factory, **kwasync)
sqlalchemy.exc.OperationalError: (psycopg2.OperationalError) connection to server at "localhost" (::1), port 13107 failed: Connection refused
Is the server running on that host and accepting TCP/IP connections?
connection to server at "localhost" (127.0.0.1), port 13107 failed: Connection refused
Is the server running on that host and accepting TCP/IP connections?
I am not sure why I am receiving this, since I believe I have set up the ports correctly and both the pg_normalized and pg_denormalized data load correctly. I have also changed the port multiple times and encountered the same issue. The relevant parts of my docker-compose.yml and load_tweets_parallel.sh are below:
docker-compose.yml
pg_normalized_batch:
  build: services/pg_normalized_batch
  volumes:
    - ./:/tmp/db
    - pg_normalized_batch:/var/lib/postgresql/data
  environment:
    - POSTGRES_USER=postgres
    - POSTGRES_PASSWORD=pass
    - PGUSER=postgres
  ports:
    - 13107:5432
load_tweets_parallel.sh
python3 -u load_tweets_batch.py --db=postgresql://postgres:pass@localhost:13107/ --inputs
Any help with this issue would be greatly appreciated!
The last lines of your error
sqlalchemy.exc.OperationalError: (psycopg2.OperationalError) connection to server at "localhost" (::1), port 13107 failed: Connection refused
Is the server running on that host and accepting TCP/IP connections?
connection to server at "localhost" (127.0.0.1), port 13107 failed: Connection refused
Is the server running on that host and accepting TCP/IP connections?
suggest that the database is not running. You can verify that with docker ps. If that is the case, then you should run docker-compose logs to figure out why it's not running and fix the problem.
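As a complement to docker ps, the same "Connection refused" condition can be reproduced without SQLAlchemy by probing the TCP port directly. The following is a minimal sketch, not part of the original loader: the helper names port_is_open and check_db_url are hypothetical, and it assumes the database URL has the same shape as the one passed to --db.

```python
import socket
from urllib.parse import urlsplit


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers ConnectionRefusedError, timeouts, and name-resolution errors.
        return False


def check_db_url(db_url):
    """Extract host and port from a database URL and probe that port."""
    parts = urlsplit(db_url)
    host = parts.hostname or "localhost"
    port = parts.port or 5432  # PostgreSQL's default port
    return host, port, port_is_open(host, port)
```

Calling check_db_url("postgresql://postgres:pass@localhost:13107/") returns the host, the port, and whether anything accepted the connection. A False result points at the container or its port mapping rather than at the Python code.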
After running docker-compose up -d and then docker ps, I get:
CONTAINER ID   IMAGE                                           COMMAND                  CREATED         STATUS         PORTS                                         NAMES
7c9c7f1de6e5   twitter_postgres_parallel_pg_normalized_batch   "docker-entrypoint.s…"   6 seconds ago   Up 3 seconds   0.0.0.0:17368->5432/tcp, :::13107->5432/tcp   twitter_postgres_parallel_pg_normalized_batch_1
2035622ab669   twitter_postgres_parallel_pg_normalized         "docker-entrypoint.s…"   6 seconds ago   Up 4 seconds   0.0.0.0:13106->5432/tcp, :::13106->5432/tcp   twitter_postgres_parallel_pg_normalized_1
c28a48245aeb   twitter_postgres_parallel_pg_denormalized       "docker-entrypoint.s…"   6 seconds ago   Up 2 seconds   0.0.0.0:13105->5432/tcp, :::13105->5432/tcp   twitter_postgres_parallel_pg_denormalized_1
I then ran docker-compose logs as suggested, just in case I was missing anything, and got:
pg_normalized_1 |
pg_normalized_1 | PostgreSQL Database directory appears to contain a database; Skipping initialization
pg_normalized_1 |
pg_normalized_1 | 2024-04-11 03:52:42.968 UTC [1] LOG: starting PostgreSQL 16.2 (Debian 16.2-1.pgdg110+2) on x86_64-pc-linux-gnu, compiled by gcc (Debian 10.2.1-6) 10.2.1 20210110, 64-bit
pg_normalized_1 | 2024-04-11 03:52:42.968 UTC [1] LOG: listening on IPv4 address "0.0.0.0", port 5432
pg_normalized_1 | 2024-04-11 03:52:42.968 UTC [1] LOG: listening on IPv6 address "::", port 5432
pg_normalized_1 | 2024-04-11 03:52:42.969 UTC [1] LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
pg_normalized_1 | 2024-04-11 03:52:42.973 UTC [29] LOG: database system was shut down at 2024-04-11 03:52:29 UTC
pg_normalized_1 | 2024-04-11 03:52:42.979 UTC [1] LOG: database system is ready to accept connections
pg_denormalized_1 |
pg_denormalized_1 | PostgreSQL Database directory appears to contain a database; Skipping initialization
pg_denormalized_1 |
pg_denormalized_1 | 2024-04-11 03:52:44.941 UTC [1] LOG: starting PostgreSQL 13.14 (Debian 13.14-1.pgdg120+2) on x86_64-pc-linux-gnu, compiled by gcc (Debian 12.2.0-14) 12.2.0, 64-bit
pg_denormalized_1 | 2024-04-11 03:52:44.941 UTC [1] LOG: listening on IPv4 address "0.0.0.0", port 5432
pg_denormalized_1 | 2024-04-11 03:52:44.941 UTC [1] LOG: listening on IPv6 address "::", port 5432
pg_denormalized_1 | 2024-04-11 03:52:44.942 UTC [1] LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
pg_denormalized_1 | 2024-04-11 03:52:44.945 UTC [27] LOG: database system was shut down at 2024-04-11 03:52:29 UTC
pg_denormalized_1 | 2024-04-11 03:52:44.952 UTC [1] LOG: database system is ready to accept connections
pg_normalized_batch_1 |
pg_normalized_batch_1 | PostgreSQL Database directory appears to contain a database; Skipping initialization
pg_normalized_batch_1 |
pg_normalized_batch_1 | 2024-04-11 03:52:43.916 UTC [1] LOG: starting PostgreSQL 16.2 (Debian 16.2-1.pgdg110+2) on x86_64-pc-linux-gnu, compiled by gcc (Debian 10.2.1-6) 10.2.1 20210110, 64-bit
pg_normalized_batch_1 | 2024-04-11 03:52:43.917 UTC [1] LOG: listening on IPv4 address "0.0.0.0", port 5432
pg_normalized_batch_1 | 2024-04-11 03:52:43.917 UTC [1] LOG: listening on IPv6 address "::", port 5432
pg_normalized_batch_1 | 2024-04-11 03:52:43.918 UTC [1] LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
pg_normalized_batch_1 | 2024-04-11 03:52:43.922 UTC [29] LOG: database system was shut down at 2024-04-11 03:52:29 UTC
pg_normalized_batch_1 | 2024-04-11 03:52:43.929 UTC [1] LOG: database system is ready to accept connections
It seems that the databases are being brought up just fine, so I am not sure why I am still getting the Connection refused error.
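One detail in the docker ps output above may be worth checking: the pg_normalized_batch row lists host port 17368 for the IPv4 binding and 13107 for the IPv6 binding, while the other two rows show the same host port for both. Whether that is a copy-paste slip or the real state of the container is not clear from the thread. The sketch below, with hypothetical helper names parse_ports and host_ports_agree, shows one way to compare the published host ports in a PORTS cell.

```python
import re

# Matches one published-port entry such as "0.0.0.0:17368->5432/tcp"
# or ":::13107->5432/tcp" (the latter is the IPv6 wildcard "::").
_PORT_ENTRY = re.compile(
    r"^(?P<addr>.*):(?P<host_port>\d+)->(?P<container_port>\d+)/(?P<proto>\w+)$"
)


def parse_ports(ports_column):
    """Parse a docker ps PORTS cell into (address, host_port, container_port) tuples."""
    mappings = []
    for entry in ports_column.split(","):
        match = _PORT_ENTRY.match(entry.strip())
        if match:
            mappings.append(
                (match["addr"], int(match["host_port"]), int(match["container_port"]))
            )
    return mappings


def host_ports_agree(ports_column):
    """Return True if every listed address publishes the same host port."""
    return len({host_port for _, host_port, _ in parse_ports(ports_column)}) <= 1
```

For the pg_normalized_batch row, host_ports_agree("0.0.0.0:17368->5432/tcp, :::13107->5432/tcp") is False, which would be consistent with a connection to 127.0.0.1 on port 13107 being refused.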
FIXED: Brought down containers, deleted volumes, and then removed all existing containers. Built the containers again and it fixed the issue.
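For anyone hitting the same problem, the reset described above can be sketched roughly as follows. The exact commands used were not posted, so the flags here are an assumption: docker-compose down -v --remove-orphans followed by docker-compose up -d --build. The helper names reset_commands and reset_stack are hypothetical. Note that -v deletes the named volumes, so any data already loaded into the databases is lost.

```python
import subprocess


def reset_commands(compose="docker-compose"):
    """Return the command sequence for a full teardown and rebuild (assumed flags)."""
    return [
        # Stop and remove containers and named volumes; this deletes database data.
        [compose, "down", "-v", "--remove-orphans"],
        # Rebuild the images and start all services in the background.
        [compose, "up", "-d", "--build"],
    ]


def reset_stack(dry_run=True):
    """Print each command, and run it unless dry_run is set."""
    for cmd in reset_commands():
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)
```

Calling reset_stack() only prints the commands; reset_stack(dry_run=False) actually runs them from the directory containing docker-compose.yml.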