How to handle the ConnectionResetError

Question

How to handle the ConnectionResetError

Closed this issue 3 months ago · 11 comments

Hello, I'm new to neptune-client and am using version 1.10.4. I'm trying to revise the code of "genetic expert guided learning", which is based on the legacy API. When I only update the code that calls the Neptune API, the benchmark program completes normally. However, after I additionally introduce my revised scoring function, the benchmark program raises a ConnectionResetError halfway through. I have set NEPTUNE_ALLOW_SELF_SIGNED_CERTIFICATE=True, but the problem remains unresolved.
What are the possible causes of the error, and how should I handle it?
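
While the root cause is being investigated, one generic way to keep a long benchmark alive through transient network resets is to retry the failing call with a backoff. The following is a minimal sketch using only the standard library. `call_with_retries` is a hypothetical helper, not part of neptune-client, and it only helps if the ConnectionResetError is raised in the calling thread rather than in a background thread.

```python
import time


def call_with_retries(func, *args, attempts=3, base_delay=1.0, **kwargs):
    """Call func, retrying on ConnectionResetError with exponential backoff.

    Hypothetical helper for illustration; not part of neptune-client.
    """
    for attempt in range(1, attempts + 1):
        try:
            return func(*args, **kwargs)
        except ConnectionResetError:
            if attempt == attempts:
                raise  # out of retries: surface the original error
            # Wait 1x, 2x, 4x, ... base_delay before the next attempt.
            time.sleep(base_delay * 2 ** (attempt - 1))
```

A network-bound call such as a scoring or logging step could then be wrapped as `call_with_retries(my_step, arg)`, where `my_step` stands for whatever function raises the error.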

Answer 1 · 2024-10-03T09:28:25.000Z

Hey @ganfisher 👋

Thank you for reporting this issue! Could you please provide us with more details:

1. Describe the bug.

A clear and concise description of the problem you're experiencing.

2. Reproduction

Please provide steps to reproduce the issue or code snippet in as much detail as possible.

3. Expected behavior

A clear and concise description of what you expected to happen.

4. Traceback

If applicable, add traceback or log output/screenshots to help explain your problem.

5. Environment info

The output of pip list:
The operating system you're using:
The output of python --version:

6. Additional context

Add any other context about the problem here.
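
The environment details requested in item 5 can be gathered in one go. This is a small sketch using only the standard library; the function name `collect_environment_info` and the dictionary keys are illustrative choices, not a Neptune convention.

```python
import platform
import subprocess
import sys


def collect_environment_info() -> dict:
    """Gather the Python version, OS description, and `pip list` output."""
    # Run pip through the current interpreter so the right environment is listed.
    pip = subprocess.run(
        [sys.executable, "-m", "pip", "list"],
        capture_output=True,
        text=True,
        check=False,
    )
    return {
        "python_version": platform.python_version(),
        "operating_system": platform.platform(),
        # Fall back to stderr so a missing pip is still visible in the report.
        "pip_list": pip.stdout if pip.returncode == 0 else pip.stderr,
    }


def format_report(info: dict) -> str:
    """Render the collected information as plain text for pasting into an issue."""
    return "\n\n".join(f"{key}:\n{value}" for key, value in info.items())
```

Calling `print(format_report(collect_environment_info()))` prints a block that can be pasted into the issue.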

Answer 2 · 2024-10-07T03:19:39.000Z

Hello, sorry for the delayed reply. I have migrated the code from the old API to the new API, and the ConnectionResetError no longer appears. However, the run may get stuck after a few epochs without raising any exception. The hang resembles the behaviour described in issue #1151, but in my case the Neptune logger and PyTorch Lightning are not called explicitly. I'm still trying to solve the problem myself.

Answer 3 · 2024-10-07T03:20:33.000Z

I will try the offline mode first.

Answer 4 · 2024-10-07T13:36:52.000Z

Hey @ganfisher ,

Please let me know if the offline mode works.
If it does, the amount of metadata logged asynchronously might be too much for your network.
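
For reference, a run can be started in offline mode roughly as follows. This is a sketch assuming neptune 1.x, where `neptune.init_run` accepts `mode="offline"`. The project name and the `train/loss` field are placeholders, and `offline_run_kwargs` and `log_losses_offline` are hypothetical helpers.

```python
def offline_run_kwargs(project: str) -> dict:
    """Build keyword arguments for neptune.init_run in offline mode."""
    # In offline mode nothing is sent over the network while the script runs;
    # metadata is stored locally and can be uploaded later with `neptune sync`.
    return {"project": project, "mode": "offline"}


def log_losses_offline(project: str, losses) -> None:
    """Log a sequence of loss values to an offline run (requires neptune 1.x)."""
    import neptune  # imported lazily so the helper above works without neptune

    run = neptune.init_run(**offline_run_kwargs(project))
    try:
        for loss in losses:
            run["train/loss"].append(loss)  # placeholder field name
    finally:
        run.stop()  # always close the run so local data is flushed to disk
```

If the script finishes in offline mode but hangs in the default asynchronous mode, that points to the upload path rather than the training code.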

Answer 5 · 2024-10-07T15:00:40.000Z

Yeah, the offline run finished successfully. However, an HTTPError is raised from time to time when I try to synchronise the offline results with "neptune sync". Considering that the "ring" symbol meaning "synchronising" exists, the HTTPError may not matter?

Answer 6 · 2024-10-07T15:04:49.000Z

Is there any way to visualize the logged metadata locally?

Answer 7 · 2024-10-09T12:19:14.000Z

> Considering that the "ring" symbol meaning "synchronising" exists, the HTTPError may not matter?

Could you post a screenshot of the traceback?

> Is there any way to visualize the logged metadata locally?

No, unfortunately.

Answer 8 · 2024-10-09T12:27:37.000Z

Sure, the screenshots are posted here.

Answer 9 · 2024-10-09T12:33:13.000Z

Yeah, that's fine.
As long as the sync completes successfully, there should not be any cause for concern ✅
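
If the sync is interrupted by intermittent HTTP errors, one option is to rerun it until it exits cleanly. The sketch below reruns an arbitrary command until it returns exit code 0. `run_until_success` is a hypothetical helper, and it is an assumption here that `neptune sync` returns a non-zero exit code when the upload fails, so check the command output as well.

```python
import subprocess
import time


def run_until_success(command, attempts=5, delay=10.0):
    """Rerun a command until it exits with code 0; return the attempt number.

    Hypothetical helper for illustration. Raises RuntimeError if every
    attempt fails.
    """
    for attempt in range(1, attempts + 1):
        result = subprocess.run(command, check=False)
        if result.returncode == 0:
            return attempt
        if attempt < attempts:
            time.sleep(delay)  # give the network a moment before retrying
    raise RuntimeError(f"{command!r} failed after {attempts} attempts")


# Example usage (assumes the neptune CLI is installed and on PATH):
# run_until_success(["neptune", "sync"])
```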

Answer 10 · 2024-10-09T12:37:32.000Z

Thanks. I will close the issue once all the data has been uploaded successfully.

Answer 11 · 2024-10-14T08:33:35.000Z