catalyst-cooperative/pudl-usage-metrics

Clean Intake metrics

bendnorman opened this issue · 0 comments

  • Geocode IPs
  • Create db schema for the intake logs
  • Parse out which catalogs people are accessing.
  • Fix clobber behavior. Currently, passing the clobber argument to a database manager deletes every table in the database, not just the one being written, which is not ideal. I think the usage_metrics_metadata.drop_all(engine) call just needs to move into the append_df_to_table() method and specify the table to be dropped.
  • Create a location column that combines remote_ip_city, remote_ip_region, and remote_ip_country into the full location information.
  • Add integration tests.
  • Update the "Add new data" section in the README to reflect the changes to how repositories are organized.
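
A rough sketch of what the clobber fix and the location column might look like, to make the two items above concrete. This is not the repository's actual code: the table definitions, the function signature of append_df_to_table(), the add_location_column() helper, and the ", " separator are all assumptions for illustration. The only library behavior relied on is that SQLAlchemy's MetaData.drop_all() and MetaData.create_all() accept a `tables` argument that limits them to specific tables.

```python
import pandas as pd
import sqlalchemy as sa

# Hypothetical stand-in for the project's metadata object.
usage_metrics_metadata = sa.MetaData()

# Hypothetical tables; the real intake schema is still to be designed.
intake_logs = sa.Table(
    "intake_logs",
    usage_metrics_metadata,
    sa.Column("remote_ip", sa.String),
    sa.Column("location", sa.String),
)
other_logs = sa.Table(
    "other_logs",
    usage_metrics_metadata,
    sa.Column("id", sa.Integer),
)


def add_location_column(df):
    """Combine city, region, and country into one location column.

    The ", " separator is an assumption; missing parts are skipped.
    """
    parts = df[["remote_ip_city", "remote_ip_region", "remote_ip_country"]]
    out = df.copy()
    out["location"] = parts.apply(
        lambda row: ", ".join(row.dropna().astype(str)), axis=1
    )
    return out


def append_df_to_table(engine, df, table_name, clobber=False):
    """Append df to one table, optionally dropping only that table first."""
    table = usage_metrics_metadata.tables[table_name]
    if clobber:
        # Restrict the drop to the target table instead of the whole database.
        usage_metrics_metadata.drop_all(engine, tables=[table])
    # Create the table if it does not exist yet (or was just dropped).
    usage_metrics_metadata.create_all(engine, tables=[table])
    records = df.to_dict(orient="records")
    if records:
        with engine.begin() as conn:
            conn.execute(table.insert(), records)
```

With this shape, calling append_df_to_table(engine, df, "intake_logs", clobber=True) would replace the contents of intake_logs and leave every other table untouched.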