stopstalk/stopstalk-deployment

[DB-DESIGN] Database design needs to be more flexible to add new sites support and multiple user handles support

sandywadhwa opened this issue · 3 comments

Current Database design has two blockers for site extensibility.

  1. Every "new site support addition" needs new columns to be added to USER database table (NEWSITE_handle and NEWSITE_lr) for adding user handle and last retrieval time.
  2. Many a times a user ends up having more than one handle on online sites (e.g. During college someone created an id using college roll number as user handle). But after graduating, user created another handle. As a user if someone wants his progress from two accounts to be merged, he needs to create two separate stopstalk accounts. A re-design can address both scenarios to make site more elegant.

CURRENT DESIGN
user
codeforces_handle
codeforces_lr
codechec_handle
codeforces_lr
spoj_handle
spoj_lr
......

PROPOSED DESIGN
Add a new table

coding_profiles
user_id : (reference field to user table)
site_name: (string: one of -- CODEFORCES, CODECHEF, SPOJ, ...)
site_handle: user's handle of site
last_retrieved: DateTime of last retrieval

This will allow adding new sites support easy and elegant and allows a user to add his multiple handles into his single account.

Lastly, this also allows allows LOSE COUPLING between crawlers and user table or sites. e.g. a Crawler can be written to just fetch one site specific handles and crawl just for that rather than crawling for all user handles of a single user at once.

That's a great idea @sandywadhwa :) Would be the biggest change in terms of code to get this up and running, not sure if can be picked right away.

This can be done in backward compatible way with low cost of code churn or codebreak (by having redundant data for some time).

  1. A new table for coding profiles can be added.
  2. A db script can be written to create entries for every existing user profile (from user table to to coding profiles table).
  3. User registration code and Update Details code paths can be enhanced to make updates/entries into new table along with old table.
  4. APIs to send user profile details to front end can be enhanced to read from new table and sent to front end.
  5. Once changes are thoroughly tested a switch can be made from old code table and apis to new table and apis.

This should reduce cost of changes and any regressions. While providing flexibility to anytime switch back to old code paths.

Yep, that's the release plan if going with it - the issue is urgency/bandwidth not feasibility.
Feel free to raise a PR :)