(inspired by https://github.com/busbud/coding-challenge-backend-c)
https://felixpelletier-coveo-challenge.herokuapp.com/suggestions
The "suggestion engine" uses multiple metrics with configurable weights. Each metric is given the full query and some information about a city. They assign a score from 0.0 to 1.0 for each city. Scores are then combined according to the metric's given weight.
This metric assigns a perfect score to a query which is exactly the beginning of a city's name. It is ignored from the score calculation if this is not the case. This skews the results towards the right city if the user writes it perfectly.
Ex: With query "Queb", the city "Quebec" gets a perfect score.
The score is computed according to number of edits needed to transform the query into the city's name.
The score is computed according to the city's distance to the location given in the query, if one was given.
This metric favors more populated cities, as they may be more likely to be searched by the user.
Each metrics can be configured using the metrics_config.yaml file.
There is currently only one parameter (the weight).
Here's an example:
NameStartsWith:
weight: 10
LevenshteinCityNameSimilarity:
weight: 7
HaversineLocationDistance:
weight: 3
LogarithmicPopulation:
weight: 3
- This wasn't my first Python project, but my I was a little rusty.
This was my first time:
- Trying out the type hints capabilities of Python.
- Trying out Googles import style (full import path, import only modules).
- Using Flask.
- Using Heroku.
- Using Yaml.
- Configuring Travis CI (more used to Jenkins).
- Have the metrics be a little more modular, with the ability to add some new ones simply by importing a new package.
- Adding the ability to configure each metric with their own specific options.
- Adding some logic between the metrics, such as checking distances only if the name is similar enough.
If necessary:
- Have some way of pruning cities quickly. Maybe have some threshold on some metrics. If below a certain threshold, just prune the city.
- Add a cache for common queries.
- Trying out a static type checker, such as mypy or Microsoft's pyright.
- Trying out a load tester, such as Locust.
- Adding a style checker and some linter (ex: pylint) as commit tests.
- Automating the requirements.txt generation on commit.
(inspired by https://github.com/busbud/coding-challenge-backend-c)
Design a REST API endpoint that provides auto-complete suggestions for large cities.
- The endpoint is exposed at
/suggestions
- The partial (or complete) search term is passed as a querystring parameter
q
- The caller's location can optionally be supplied via querystring parameters
latitude
andlongitude
to help improve relative scores - The endpoint returns a JSON response with an array of scored suggested matches
- The suggestions are sorted by descending score
- Each suggestion has a score between 0 and 1 (inclusive) indicating confidence in the suggestion (1 is most confident)
- Each suggestion has a name which can be used to disambiguate between similarly named locations
- Each suggestion has a latitude and longitude
- You can use the language and technology of your choosing. It's OK to try something new (tell us if you do), but feel free to use something you're comfortable with. We don't care if you use something we don't; the goal here is not to validate your knowledge of a particular technology.
- End result should be deployed on a public Cloud (Heroku, AWS etc. all have free tiers you can use).
- Try to design and implement your solution as you would do for real production code. Show us how you create clean, maintainable code that does awesome stuff. Build something that we'd be happy to contribute to. This is not a programming contest where dirty hacks win the game.
- Documentation and maintainability are a plus, and don't you forget those unit tests.
- We don’t want to know if you can do exactly as asked (or everybody would have the same result). We want to know what you bring to the table when working on a project, what is your secret sauce. More features? Best solution? Thinking outside the box?
If you wish, it's OK to use external systems such as a database, an Elastic index, etc. in your solution. But this is certainly not required to complete the basic requirements of the challenge. Keep in mind that our goal here is to see some code of yours; if you only implement a thin API on top of a DB we won't have much to look at.
Our advice is that if you choose to use an external search system, you had better be doing something really truly awesome with it.
These responses are meant to provide guidance. The exact values can vary based on the data source and scoring algorithm
Near match
GET /suggestions?q=Londo&latitude=43.70011&longitude=-79.4163
{
"suggestions": [
{
"name": "London, ON, Canada",
"latitude": "42.98339",
"longitude": "-81.23304",
"score": 0.9
},
{
"name": "London, OH, USA",
"latitude": "39.88645",
"longitude": "-83.44825",
"score": 0.5
},
{
"name": "London, KY, USA",
"latitude": "37.12898",
"longitude": "-84.08326",
"score": 0.5
},
{
"name": "Londontowne, MD, USA",
"latitude": "38.93345",
"longitude": "-76.54941",
"score": 0.3
}
]
}
No match
GET /suggestions?q=SomeRandomCityInTheMiddleOfNowhere
{
"suggestions": []
}
- Geonames provides city lists Canada and the USA http://download.geonames.org/export/dump/readme.txt
Begin by forking this repo and cloning your fork. GitHub has apps for Mac and Windows that make this easier.