hasgeek/hasjob

Improve search engine ranking for pages on Hasjob

shreyas-satish opened this issue · 18 comments

Search behaviour

To improve Hasjob's search engine ranking, it helps to understand what people search for. Here's a sample of the suggestions Google's AdWords Keywords tool provides for a given set of keywords:

  • Marketing jobs in Bangalore (1K - 10K)
  • Logistics jobs in Bangalore (1K - 10K)
  • Project manager jobs in Bangalore (1K - 10K)
  • Machine learning startups in Bangalore (100 - 1K)
  • Machine learning jobs in Bangalore (100 - 1K)
  • Android developer Bangalore (100 - 1K)
  • Android app developers in Bangalore (100 - 1K)
  • Fintech companies in Bangalore (100 - 1K)
  • Healthcare companies in Bangalore (100 - 1K)

The numbers in brackets indicate the range of monthly searches.

Currently, Hasjob doesn't rank well for the above search patterns.

Guidelines

Google's guidelines suggest the following (among others):

Ensure that all pages on the site can be reached by a link from another findable page. The referring link should include either text or, for images, an alt attribute, that is relevant to the target page.

Provide a sitemap file with links that point to the important pages on your site. Also provide a page with a human-readable list of links to these pages (sometimes called a site index or site map page).

Proposal

As a first milestone towards improving search engine ranking, I propose we:

  1. Make landing pages based on search patterns.
  • Startup jobs in location (this is already done, eg: Startup jobs in Bangalore)
  • technology jobs in location
  • technology companies in location
  • role jobs in location
  • industry jobs in location
  • industry companies in location
  • technology jobs
  • role jobs
  • industry jobs

Here, technology refers to Python, JavaScript, Machine Learning and so on, and role refers to Project Manager, Frontend developer and so on. We currently have category as a filter, but the listed categories (eg: Programming, Interaction design) are too broad and don't capture the granularity mentioned in the job posts (eg: Full stack developer, Devops engineer).

Here's one way we can categorize job posts with better granularity. We add a Role model, populate it based on past Hasjob data and match the job posts against these roles.

Regarding industry, we can use the same tactic. We can add an Industry model, populate it with past data (eg: Fintech, Healthcare) and tag job posts with the appropriate industry.

Open question: how do we make filtering accessible to search engines? This will help address the first guideline laid out by Google.

  2. Add the above landing page links to sitemap.xml.

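The role-matching idea from step 1 could be sketched roughly as follows. This is only an illustration: the `ROLES` list and the `match_roles` helper are hypothetical names, and the role names are placeholders for whatever would actually be derived from past Hasjob data. It does plain phrase matching on the headline, nothing smarter.

```python
import re

# Hypothetical role names; in practice these would be derived from past job posts.
ROLES = ["Full stack developer", "Devops engineer", "Project manager", "Frontend developer"]


def _normalize(text):
    """Lowercase and collapse punctuation so 'Full-Stack' matches 'full stack'."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def match_roles(headline, roles=ROLES):
    """Return the roles whose name appears as a whole phrase in the headline."""
    normalized = _normalize(headline)
    matches = []
    for role in roles:
        # Word boundaries keep 'manager' from matching inside 'managerial'.
        pattern = r"\b" + re.escape(_normalize(role)) + r"\b"
        if re.search(pattern, normalized):
            matches.append(role)
    return matches
```

The same approach would apply to an Industry list, with the caveat that exact phrase matching will miss synonyms and abbreviations.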
jace commented

We need a FilteredView model to complement the Location model. It holds the equivalent of a selection of search filters and terms. Just as with Location (at /in/<location>), going to the filter URL (/f/<filter>) renders a filtered view with all filters populated in the UI (pending #302 being resolved).

Filtered views are not included in the regular sitemap. They will be in the access-controlled sitemap. The other suggested models are overthinking this problem and not necessary at this time.

jace commented

Instead of complementing categories with roles, we could just add more categories and re-categorise existing jobs (in the past 30 days).

jace commented

"Industry" applies to employers, not jobs, and should be a separate ticket.

Filtered views are not included in the regular sitemap. They will be in the access-controlled sitemap.

They should not be in sitemap.xml? I'm not sure how search engines will crawl the filtered views if they aren't in sitemap.xml.

jace commented

if authorized_sitemap:
    # Add domains to sitemap
    for domain in Domain.query.filter(
            Domain.title != None,
            Domain.description != None,
            Domain.description != ''
            ).order_by('updated_at desc').all():  # NOQA
        sitemapxml += ' <url>\n'\
            ' <loc>%s</loc>\n' % domain.url_for(_external=True) + \
            ' <lastmod>%s</lastmod>\n' % (domain.updated_at.isoformat() + 'Z') + \
            ' <changefreq>monthly</changefreq>\n'\
            ' </url>\n'

So, the description can be as follows?

`type` `q` `category` jobs in `location`

Eg: "Full-time python programming jobs in Bangalore" or "Internship jobs in Mumbai" or "Testing jobs".
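As a rough sketch of that pattern, assuming the filters arrive as a plain dict with optional `type`, `q`, `category` and `location` keys (the `describe` helper and the dict shape are assumptions for illustration, not existing Hasjob code):

```python
def describe(filters):
    """Build a description of the form '<type> <q> <category> jobs in <location>'.

    Any of the four parts may be missing; missing parts are simply skipped.
    """
    parts = [filters.get(key) for key in ("type", "q", "category")]
    text = " ".join(part for part in parts if part)
    text = (text + " jobs") if text else "Jobs"
    if filters.get("location"):
        text += " in " + filters["location"]
    # Capitalise only the first character so 'python' or 'Bangalore' keep their case.
    return text[0].upper() + text[1:]
```

Called with `{"type": "Internship", "location": "Mumbai"}` this yields "Internship jobs in Mumbai", matching the second example above.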

jace commented

Yes, but something more elaborate may be better for SEO.

What if we start by generating sitemap links dynamically based on the available jobs? That is, we make permutations of type, category, keyword and location and the description can follow a pattern similar to "type q category jobs in location".

keyword is the tricky part, since we probably shouldn't limit it to keywords searched for by users. Also, treating the job posts as the source of truth as opposed to keywords entered by users seems to make more sense to me. Does it make sense to use the entries in the Tag model for the keyword part?
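To make the permutation idea concrete, here is a minimal sketch using `itertools.product`, where each filter may also be left unset. The `filter_combinations` function and its arguments are hypothetical; the real values would come from the job posts (or the Tag model, if that is used for keywords).

```python
from itertools import product


def filter_combinations(types, categories, keywords, locations):
    """Yield every non-empty combination of the four filters as a dict.

    Each filter may be left unset, which is modelled by adding None to its options.
    """
    keys = ("type", "category", "q", "location")
    options = [[None] + list(values) for values in (types, categories, keywords, locations)]
    for combo in product(*options):
        filters = {key: value for key, value in zip(keys, combo) if value is not None}
        if filters:  # skip the combination where nothing is selected
            yield filters


combos = list(filter_combinations(
    ["Full-time", "Internship"], ["Programming"], ["python"], ["Bangalore", "Mumbai"]))
print(len(combos))  # 3 * 2 * 2 * 3 - 1 = 35
```

The count is the product of (options + 1) per filter, minus one, so it grows quickly as types, keywords and locations are added.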

jace commented

Can we not over-engineer this, please? This is an audience manager's tool. It's meant to empower them to reach appropriate audiences. Let the manager decide what traffic to attract.

Just as with Location (at /in/<location>), going to the filter URL (/f/<filter>) renders a filtered view with all filters populated in the UI.

How should <filter> be represented here? Should it be something like /f/programming/jobs/in/bangalore for instance?

jace commented

/f/programming-jobs-in-bangalore. It's readable text written by the audience manager.

It seems like this feature could also be useful for implementing email alerts for candidates: a candidate could set up email alerts for filtered views.

jace commented

It's similar, but not the same. There's nothing in filtered views to detect a new job matching the spec. We could refresh all filtered views, but there will be hundreds of them.

Getting reliable notification infra in place is an entire other ticket.

For the FilteredView model, does this schema look okay to start with?

class FilteredView(BaseScopedNameMixin, db.Model):
    __tablename__ = 'filtered_view'

    description = db.Column(db.UnicodeText, nullable=False)
    filters = db.Column(JsonDict, nullable=False, server_default='{}')
    board_id = db.Column(None, db.ForeignKey('board.id'), nullable=False, primary_key=True, index=True)
    board = db.relationship(Board, backref=db.backref('filtered_views', lazy='dynamic', cascade='all, delete-orphan'))

jace commented

All filters will have to be join tables, just like in the board model. That's how we implement detection for when a new job is posted.

jace commented

If you can get indexes to work within a JSONB column, however, that's usable as well.

jace commented

Actually, you need foreign keys from the JSONB column to the respective type/category/tag/domain models. If PostgreSQL doesn't support foreign keys in JSON, JSON is the wrong data type.
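The join-table approach can be illustrated with a small self-contained sketch. It uses SQLite from the standard library purely for demonstration (Hasjob itself runs on PostgreSQL with SQLAlchemy), and the table names, the `views_matching_tags` helper and the "any tag matches" rule are all assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when asked
conn.executescript("""
CREATE TABLE tag (id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL);
CREATE TABLE filtered_view (id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL);
-- Join table: each row links one filtered view to one tag via real foreign keys
CREATE TABLE filtered_view_tag (
    filtered_view_id INTEGER NOT NULL REFERENCES filtered_view(id),
    tag_id INTEGER NOT NULL REFERENCES tag(id),
    PRIMARY KEY (filtered_view_id, tag_id)
);
""")
conn.executemany("INSERT INTO tag (id, name) VALUES (?, ?)", [(1, "python"), (2, "android")])
conn.execute("INSERT INTO filtered_view (id, name) VALUES (1, 'python-jobs')")
conn.execute("INSERT INTO filtered_view_tag (filtered_view_id, tag_id) VALUES (1, 1)")


def views_matching_tags(conn, tag_names):
    """Return names of filtered views linked to any of the given tags.

    This is the kind of lookup that could run when a new job is posted.
    """
    if not tag_names:
        return []
    placeholders = ", ".join("?" for _ in tag_names)
    rows = conn.execute(
        "SELECT DISTINCT fv.name FROM filtered_view fv "
        "JOIN filtered_view_tag fvt ON fvt.filtered_view_id = fv.id "
        "JOIN tag t ON t.id = fvt.tag_id "
        "WHERE t.name IN (%s) ORDER BY fv.name" % placeholders,
        list(tag_names),
    ).fetchall()
    return [row[0] for row in rows]
```

Because the links are foreign keys, inserting a row that points at a non-existent tag fails with an integrity error, which is the guarantee a bare JSON column would not give.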

Just making sure I understand correctly. For a given route like /f/programming-jobs-in-bangalore, would it be sufficient to call and return index() with the specified filters for this filtered view?
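To make the question concrete, here is a framework-free sketch of that flow. Everything in it is a stand-in: `FILTERED_VIEWS` replaces a database lookup on the FilteredView model, and this `index()` is a dummy that only echoes the filters it was given, not Hasjob's actual view. Whether delegating like this is sufficient is exactly the open question.

```python
# Hypothetical in-memory stand-in for FilteredView rows, keyed by name.
FILTERED_VIEWS = {
    "programming-jobs-in-bangalore": {
        "description": "Programming jobs in Bangalore",
        "filters": {"category": "programming", "location": "bangalore"},
    },
}


def index(filters=None):
    """Dummy stand-in for the real index() view; reports what it would render."""
    return {"template": "index", "filters": filters or {}}


def filtered_view(name):
    """Resolve /f/<name> to its stored filters and delegate to index()."""
    view = FILTERED_VIEWS.get(name)
    if view is None:
        return None  # a real view would respond with a 404 here
    return index(filters=view["filters"])
```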