IFTTT/polo

How are you using Polo?

nettofarah opened this issue · 7 comments

I'm curious to know what cool stuff people are using Polo for.

  • What environments are you generating data for?
  • What version of Rails?
  • How big is your final .sql file?
  • What database are you using?
  • How many database tables is Polo touching?
  • How hard was it to find a good sample size?
  • Are you using any advanced features?
  • How are you running Polo in prod? rake task, rails runner, rails console, etc.
  • Do you have an automated process to generate the files?
  • How do you transfer your data across environments? Publishing artifacts, rsync, scp...

No need to answer everything, but I would love to know how people are using the library so we know what to prioritize.

Thank you for using Polo <3
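For anyone new to the thread: Polo takes a seed record, travels through its ActiveRecord associations, and generates INSERT statements for everything it reaches (the gem's entry point is `Polo.explore`). Here's a minimal plain-Ruby sketch of that idea — the chefs/recipes schema and all data below are hypothetical stand-ins, with no ActiveRecord involved:

```ruby
# Sketch of Polo's core idea: start from a seed record, follow its
# associations, and emit an INSERT statement for everything reached.
# RECORDS and ASSOCIATIONS stand in for real tables and AR associations.

RECORDS = {
  chefs:   { 1 => { id: 1, name: "Netto" } },
  recipes: { 7 => { id: 7, chef_id: 1, title: "Feijoada" } },
}

ASSOCIATIONS = {
  chefs: ->(chef) do
    RECORDS[:recipes].values
      .select { |r| r[:chef_id] == chef[:id] }
      .map { |r| [:recipes, r] }
  end,
}

def insert_sql(table, row)
  cols = row.keys.join(", ")
  vals = row.values.map { |v| v.is_a?(String) ? "'#{v}'" : v }.join(", ")
  "INSERT INTO #{table} (#{cols}) VALUES (#{vals});"
end

def explore(table, id)
  seed = RECORDS[table][id]
  inserts = [insert_sql(table, seed)]
  (ASSOCIATIONS[table]&.call(seed) || []).each do |child_table, child|
    inserts << insert_sql(child_table, child)
  end
  inserts
end

# One INSERT for the seed chef, plus one per associated recipe.
puts explore(:chefs, 1)
```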

We're going to use it to allow customers to dump their data from our web application (www.runrun.it).
Rails 3.2
Depends on customer account size.
PostgreSQL
10 tables
We are running it as a background job.
We upload the data to S3 and generate a link.
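A hedged sketch of the flow described above — a background job generates the dump, uploads it, and hands back a link. All class and method names here are hypothetical; `FakeS3` stands in for a real client such as aws-sdk-s3, and `generate_dump` stands in for the actual Polo call:

```ruby
# Hypothetical background-job flow: dump an account's data, upload the
# .sql file, return a download link for the customer.

class FakeS3
  def initialize; @objects = {}; end

  def upload(key, body)
    @objects[key] = body
    # A real implementation would return a presigned URL instead.
    "https://example-bucket.s3.amazonaws.com/#{key}"
  end
end

class CustomerExportJob
  def initialize(s3); @s3 = s3; end

  def perform(account_id)
    sql = generate_dump(account_id) # real code would call Polo here
    @s3.upload("exports/account-#{account_id}.sql", sql)
  end

  private

  # Placeholder for the Polo-generated INSERTs for this account.
  def generate_dump(account_id)
    "INSERT INTO accounts (id) VALUES (#{account_id});"
  end
end

link = CustomerExportJob.new(FakeS3.new).perform(42)
puts link
```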

@khani3s that is really cool!
Good luck.

  • What environments are you generating data for? Depends on the task.
  • What version of Rails? rails-api-0.4.0
  • How big is your final .sql file? Varies on the task.
  • What database are you using? mysql-enterprise-5.6.25
  • How many database tables is Polo touching? Varies on the task.
  • How hard was it to find a good sample size? N/A
  • Are you using any advanced features? Not officially.
  • How are you running Polo in prod? rake task, rails runner, rails console, etc. Not yet.
  • Do you have an automated process to generate the files? https://gist.github.com/belt/c8f7b1c45834ce6fa485
  • How do you transfer your data across environments? Publishing artifacts, rsync, scp... rsync & scp
  • Generating compact and scrubbed DBs for development environments.
  • Our smallest working set is an 8 KB .sql file, but it's growing slowly.
  • MySQL DB
  • We're using Rails 3.2 but migrating to Rails 4.2+.
  • Getting good data for a full runnable db subset was rather easy.
  • I was using the obfuscation features mentioned in pull request #28
  • Running Polo in production soon with rake
  • Fully automated and pushes to S3

Polo is great, thanks!

  • What environments are you generating data for?

Staging and development

  • What version of Rails?

4.2.3

  • How big is your final .sql file?

50MB

  • What database are you using?

Postgres

  • How many database tables is Polo touching?

15

  • How hard was it to find a good sample size?

A tiny bit tricky, because one model in particular (Project) can belong to two separate models. When I first used this, I picked a random sample of each table and specified the Project dependency in only one of them, and as a result ended up with quite a few orphaned records. I don't know the best way to sample so that you're not omitting the dependent records in your sample.

  • Are you using any advanced features?

Not yet.

  • How are you running Polo in prod? rake task, rails runner, rails console, etc.

Rake task.

  • Do you have an automated process to generate the files?

Just the rake task. I run it manually right now.

  • How do you transfer your data across environments? Publishing artifacts, rsync, scp...

None of the above. I'm just connecting to the staging DB and running the SQL script.
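One way to avoid the orphaned-records problem mentioned above is to stop sampling tables independently: pick only root records, and let the crawl collect every dependent row. A plain-Ruby sketch with a hypothetical teams/clients/projects schema (with Polo itself you'd express this by seeding from the roots and listing the dependencies to `Polo.explore`):

```ruby
# Hypothetical schema: a project can belong to either a Team or a Client.
# Sampling each table at random can orphan projects; sampling only the
# roots and collecting reachable projects cannot.

TEAMS    = [{ id: 1 }, { id: 2 }]
CLIENTS  = [{ id: 9 }]
PROJECTS = [
  { id: 100, team_id: 1,   client_id: nil },
  { id: 101, team_id: nil, client_id: 9 },
  { id: 102, team_id: 2,   client_id: nil },
]

def sample_from_roots(teams, clients)
  team_ids   = teams.map { |t| t[:id] }
  client_ids = clients.map { |c| c[:id] }
  projects = PROJECTS.select do |p|
    team_ids.include?(p[:team_id]) || client_ids.include?(p[:client_id])
  end
  { teams: teams, clients: clients, projects: projects }
end

# Seeding from team 1 and client 9 pulls in exactly their projects,
# so no project in the sample is missing its parent.
sample = sample_from_roots([TEAMS.first], CLIENTS)
puts sample[:projects].map { |p| p[:id] }.inspect
```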

I've just open-sourced Brillo, a tool largely built around Polo that we've been using internally for a month or two.

It uses Polo to make scrubbed prod DB dumps, uploads them to S3, downloads them to dev machines, and loads the DB.

It used to take us over an hour to load a "lightweight" copy of our DB with 4% of our biggest tables on a dev machine. Now with Polo we just take the last 1000 records from a few tables, and crawl their associations. Loads are down to < 10 minutes.

Thank you for making an awesome gem, and getting me to contribute to open source myself for the first time ever :)

  • What environments are you generating data for?
    Development
  • What version of Rails?
    4.2.7
  • How big is your final .sql file?
    ~ 100MB unzipped
  • What database are you using?
    MySQL
  • How many database tables is Polo touching?
    28
  • How hard was it to find a good sample size?
    Reasonably easy. By default we only fetch a fairly shallow copy of recent data (eg we find recent posts and copy those and the posts' authors, but not the posts' authors' followers or other associated metadata). Devs can manually specify individual entities when they do want to get a deep copy of data. Otherwise the dataset ballooned in size very quickly.
  • Are you using any advanced features?
    Don't think so.
  • How are you running Polo in prod? rake task, rails runner, rails console, etc.
    A Capistrano task, which then uses rails runner to generate the dump on a remote machine, e.g.:
    • cap production db:fetch FETCH=all to get a relatively shallow copy of recent data
    • cap production db:fetch FETCH=user=jon to get a deep copy of a specific entity & associations.
  • Do you have an automated process to generate the files?
    No, we just run capistrano by hand, on demand.
  • How do you transfer your data across environments? Publishing artifacts, rsync, scp...
    Capistrano (so basically scp)
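The FETCH argument convention above can be sketched as a tiny parser — "all" means a shallow copy of recent data, "model=key" means a deep copy of one entity. The names here are hypothetical, not actual Capistrano or Polo API:

```ruby
# Parse a FETCH argument like "all" or "user=jon" into a fetch plan.
def parse_fetch(arg)
  return { mode: :shallow } if arg == "all"

  model, key = arg.split("=", 2)
  { mode: :deep, model: model, key: key }
end

# FETCH=all      -> shallow copy of recent data
# FETCH=user=jon -> deep copy of one user and its associations
p parse_fetch("all")
p parse_fetch("user=jon")
```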