wlonk/wheretofind.me

Infer Name from URL (URL-first design)

Opened this issue · 1 comment

If the Name field is empty when a user enters or pastes a URL into the URL field, try to guess a suitable Name. Several techniques could be used, perhaps tried in order.

  1. Load the page, grab the <title>, strip off any well-known parts (typically the site name), and suggest the result. Site-specific knowledge helps strip junk, but this works okay even without it. Downsides:

    • Requests need throttling so the feature can't be used to DoS the linked site or WTF.M itself.
    • Outbound connections need a timeout so that slow or hanging connections can't be used to DoS WTF.M.
    • The download size needs a cap to avoid being DoSed with large files.
    • The <title> length needs a cap for the same reason.
    • Anything that isn't text/html should probably be skipped.

  2. Look for a well known name/title in the URL. Simple, but requires researching a bunch of sites.

    • Example: https://plus.google.com/+AlanDeSmet could use "AlanDeSmet"
    • Example: https://grrm.livejournal.com/ could use "grrm"
  3. Just use the hostname. Very simple.
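
To make the safeguards listed under technique 1 concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration, not WTF.M code: the limits (`MAX_BYTES`, `MAX_TITLE_CHARS`, `TIMEOUT_SECONDS`), the `SITE_NAMES` set, and all function names are hypothetical. Throttling is not shown and would have to live in whatever calls this.

```python
import re
import urllib.request
from html.parser import HTMLParser
from urllib.parse import urlsplit

MAX_BYTES = 64 * 1024     # assumed cap on how much of the page is read
MAX_TITLE_CHARS = 100     # assumed cap on the suggested Name
TIMEOUT_SECONDS = 5       # assumed timeout for each blocking socket operation
SITE_NAMES = {"google+", "livejournal"}  # hypothetical "well-known parts"


class TitleParser(HTMLParser):
    """Collects the text of the first <title> element."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.done = False
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "title" and not self.done:
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self.in_title:
            self.in_title = False
            self.done = True

    def handle_data(self, data):
        if self.in_title:
            self.parts.append(data)


def extract_title(html_text):
    """Return the whitespace-normalised, length-capped <title> text."""
    parser = TitleParser()
    parser.feed(html_text)
    parser.close()  # flush any buffered text if the page was cut off
    title = " ".join("".join(parser.parts).split())
    return title[:MAX_TITLE_CHARS]


def strip_site_name(title, site_names=SITE_NAMES):
    """Drop separator-delimited segments that are known site names."""
    parts = re.split(r"\s+[-|:]\s+", title)
    kept = [p for p in parts if p.lower() not in site_names]
    return " - ".join(kept) if kept else title


def suggest_name_from_page(url):
    """Fetch at most MAX_BYTES of an HTML page and suggest a Name, or None."""
    if urlsplit(url).scheme not in ("http", "https"):
        return None  # never follow file:, ftp:, and similar schemes
    request = urllib.request.Request(url, headers={"User-Agent": "name-sketch"})
    try:
        with urllib.request.urlopen(request, timeout=TIMEOUT_SECONDS) as response:
            if response.headers.get_content_type() != "text/html":
                return None
            charset = response.headers.get_content_charset() or "utf-8"
            text = response.read(MAX_BYTES).decode(charset, errors="replace")
    except (OSError, ValueError, LookupError):
        return None  # network failure, bad URL, or unknown charset
    return strip_site_name(extract_title(text)) or None
```

Note that the `timeout` passed to `urlopen` bounds individual socket operations rather than the whole request, so a real deployment would likely also want an overall deadline.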

People are unlikely to discover this if the Name field sits above the URL field; most will enter the Name first and never see the behavior. Putting the URL field before the Name field would let more people discover and benefit from it.

wlonk commented

Proposal 1 is interesting because it moves the background work into the client, which makes it simpler than, say, #3, which requires server-side worker processes.

Proposal 2 could play well with our existing "guess Icon from URL" logic.

Proposal 3 is a decent fallback, but wouldn't be my first choice.
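
As a rough illustration of Proposals 2 and 3 chained together, the following Python sketch tries per-site patterns first and falls back to the hostname. The `SITE_PATTERNS` table and the `name_from_url` function are hypothetical; they cover only the two examples from the issue and are not tied to the existing icon-guessing code.

```python
import re
from urllib.parse import urlsplit

# Hypothetical per-site rules; each regex is matched against "host/path".
SITE_PATTERNS = [
    # https://plus.google.com/+AlanDeSmet -> "AlanDeSmet"
    re.compile(r"^plus\.google\.com/\+(?P<name>[^/?#]+)"),
    # https://grrm.livejournal.com/ -> "grrm" (but not "www")
    re.compile(r"^(?!www\.)(?P<name>[^./]+)\.livejournal\.com(?:/|$)"),
]


def name_from_url(url):
    """Guess a Name from the URL alone, without any network access."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if not host:
        return None  # not something we can parse as a URL with a host
    target = host + parts.path
    for pattern in SITE_PATTERNS:
        match = pattern.match(target)
        if match:
            return match.group("name")  # Proposal 2: site-specific rule
    return host  # Proposal 3: plain hostname as the fallback
```

For example, `name_from_url("https://grrm.livejournal.com/")` yields `"grrm"`, while a URL for a site with no rule, such as `https://example.org/about`, falls through to `"example.org"`.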