Timestamp must either be string or int
Hi,
I'm following the tutorial and get this error when running git2net.get_coediting_network:
AssertionError: Timestamp must either be string or int
It is raised in temporal_network.py, in add_edge, at line 377.
The code that triggers it:
t, node_info, edge_info = git2net.get_coediting_network(sqlite_db_file)
On further inspection, the variable ts does have a value, but it is an np.float64: 1548953662.0
I have run it on two different repos, with the following setup:
- linux
- git version 2.12.3
- python 3.7 & 3.9
- jupyterlab==3.2.4
- pygit2==1.7.1
- python-Levenshtein==0.12.2
- gambit-disambig==1.0.3
- git2net==1.5.2
- gitdb==4.0.9
- GitPython==3.1.24
Maybe I'm using the wrong version of one of the dependencies?
Thanks!
Hi Lisette,
From the error message, the most likely source for this issue would be your version of pathpy. Could you tell me which version you are running?
Cheers,
Christoph
Hi Christoph,
I have pathpy2==2.2.0
which is the one git2net requires (it got automatically installed when I installed git2net).
thanks!
Thanks, pathpy2 should work fine. Can you confirm that the tutorial works correctly for you? Or do you also get the same issues there? If so, could you tell me which repository you are getting the error on? Then I can try to replicate the issue.
(Just in case: the error occurs in git2net.get_coediting_network.)
I did not clone this repo to use the tutorial.
I created my own notebook, on a new conda env using py39.
I first installed pygit2 and then git2net.
Everything ran correctly until git2net.get_coediting_network (the first code cell under 'Network Analysis and visualization').
I first tried https://github.com/mocnik-science/osm-python-tools.
I got the error, and then tried https://github.com/gotec/git2net.git, as shown in the tutorial.
I also tried on py37 and the error persisted.
I just realized that the problem is in pathpy (/pathpy/classes/temporal_network.py) and not in git2net.
But it is still not clear to me where it gets the list of edges and timestamps from. I guess from the sqlite_db_file.
As a quick and dirty workaround, I added this at line 377 (/pathpy/classes/temporal_network.py):
ts = int(ts) if type(ts) == _np.float64 else ts
Then, I could reproduce everything as shown in your tutorial.
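For reference, a slightly more defensive version of that cast could look like the sketch below. The name normalize_timestamp is hypothetical and is not part of pathpy or git2net. It relies on np.float64 being a subclass of Python's float, and unlike the one-line cast it refuses to silently truncate values such as 1.5.

```python
import numbers


def normalize_timestamp(ts):
    """Return ts as an int or str, converting whole-number floats to int.

    Hypothetical helper for illustration; not part of pathpy or git2net.
    """
    if isinstance(ts, str):
        return ts
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(ts, bool):
        raise TypeError(f"cannot use {ts!r} as a timestamp")
    # Plain ints and integer types registered with numbers.Integral.
    if isinstance(ts, numbers.Integral):
        return int(ts)
    # numpy.float64 subclasses float, so this branch also covers it.
    if isinstance(ts, float) and ts.is_integer():
        return int(ts)
    raise TypeError(f"cannot use {ts!r} as a timestamp")


print(normalize_timestamp(1548953662.0))  # 1548953662
```

Applying something like this before the value reaches add_edge would keep the assertion satisfied without patching the installed pathpy files.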
However, when I re-ran a previous cell, I got another error (I will open a separate issue for it).
Just in case, I'm sharing the list of packages that my environment has:
requirements.txt
Thanks!
Unfortunately, I am not able to reproduce your issue, which makes solving it challenging :)
Running the following code yields correct temporal networks for both repositories on my setup:
import git2net
# repo_path = 'git2net4analysis'
# db_path = 'git2net_mined.db'
repo_path = 'osm-python-tools'
db_path = 'osm-python-tools_mined.db'
git2net.mine_git_repo(repo_path, db_path)
net, _, _ = git2net.get_coediting_network(db_path)
net
The only thing I did before running the code was to manually clone the repositories to the respective folders.
Can you check if this code works for you? Could you share a minimal example of your setup so I can try to replicate the issue with that? Maybe I misunderstood something from your descriptions.
I could not find any differences in the packages either.
Cheers,
Christoph
Hi, thanks for looking into this.
I tried what you suggested:
- Clone the repo manually (not using the function from pygit2).
- Run the code you suggested. Here I got an error that the author ids were not set, and that I needed to run the disambiguation first:
  Exception: The author_id is not yet computed. To use author_id as identifier, please run git2net.disambiguate_aliases_db on the database before visualisation.
- I added the disambiguation call, ran it again, and the error about the timestamp still showed up :-(
This is my code:
import git2net
import os
local_directory = '../datasets/github/git2net/'
repo_path = os.path.join(local_directory,'osm-python-tools')
db_path = os.path.join(local_directory,'osm-python-tools.db')
git2net.mine_git_repo(repo_path, db_path)
git2net.disambiguate_aliases_db(db_path)
net, _, _ = git2net.get_coediting_network(db_path)
net
Somewhere in between, the timestamp is being converted to float.
pandas sometimes does that with integer columns, for example when a column contains missing values. My guess is that this might be the issue.
Just in case, I'm using pandas==1.3.4
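To illustrate the pandas behaviour I have in mind, here is a minimal sketch. The timestamp values are made up for the example and are not read from the git2net database, and this only shows one way an integer column can end up as float. It does not confirm that this is what happens inside git2net.

```python
import pandas as pd

# Hypothetical commit timestamps; not taken from the mined database.
clean = pd.Series([1548953662, 1548953700])
with_gap = pd.Series([1548953662, None])

# A column without missing values keeps an integer dtype.
print(pd.api.types.is_integer_dtype(clean))    # True
# A single missing value makes pandas store the column as float64.
print(pd.api.types.is_float_dtype(with_gap))   # True

# Dropping the gap and casting back restores integer timestamps.
restored = with_gap.dropna().astype("int64")
print(restored.iloc[0])                        # 1548953662
```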
Meanwhile, I will use my "quick and dirty" solution of casting the value to int if it is a float.
Thanks!
Hi Lisette,
I think I've figured out why I couldn't replicate this issue. I was using the current version of git2net from GitHub rather than the version on PyPI. I wrongly assumed they were identical; however, it appears that I have already fixed the issue in the GitHub version.
Before I submit a new version of git2net to PyPI, could you try the following and confirm the result:
- Uninstall git2net from your machine (pip uninstall git2net).
- Clone the git2net repository into a local folder (git clone https://github.com/gotec/git2net).
- Navigate to the folder where you cloned git2net (cd git2net).
- Install git2net from the local folder (pip install -e .).
- Now run your code from yesterday again:
import git2net
import os
local_directory = '../datasets/github/git2net/'
repo_path = os.path.join(local_directory,'osm-python-tools')
db_path = os.path.join(local_directory,'osm-python-tools.db')
git2net.mine_git_repo(repo_path, db_path)
git2net.disambiguate_aliases_db(db_path)
net, _, _ = git2net.get_coediting_network(db_path)
net
Please let me know if this works. In that case I will commit a new version with the fixes to PyPI later today.
Cheers,
Christoph
Resolved in git2net 1.5.3.
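For anyone landing here later, a quick way to check whether the installed git2net already includes the fix might look like the sketch below. The helpers release_tuple and includes_fix are hypothetical names introduced for this example, and the comparison assumes a plain numeric version string such as 1.5.3. importlib.metadata is in the standard library from Python 3.8 onwards; on Python 3.7 the importlib_metadata backport would be needed instead.

```python
from importlib.metadata import PackageNotFoundError, version


def release_tuple(version_string):
    """Turn '1.5.3' into (1, 5, 3); assumes a plain numeric release string."""
    return tuple(int(part) for part in version_string.split(".")[:3])


def includes_fix(installed, fixed_in="1.5.3"):
    """Return True if the installed release is at least the fixed release."""
    return release_tuple(installed) >= release_tuple(fixed_in)


try:
    installed = version("git2net")
    print(installed, includes_fix(installed))
except PackageNotFoundError:
    print("git2net is not installed in this environment")
```

If this reports an older release, upgrading with pip install --upgrade git2net should pull in the fixed version.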