Spring app for simple url shortening
Pre-requisites:
- JDK >=11
- MySQL >=5
- Maven >=3
Instructions:
- Clone the repo.
- Create a local MySQL database named
urlshortener
. - Update the
spring.datasource
insrc/main/resources/application.properties
with your local db username + password. (optional) 3b. Also configure the spring.jpa.hibernate.ddl-auto to "update" instead of "create-drop". - Run
mvn clean install
. - Run
mvn spring-boot:run
to start the application.
Route | Description |
---|---|
POST /api/v1/shorten |
Takes a JSON object of { "url": "https://example.org" } as a body and will respond with a shortened url |
POST /api/v1/shorten/multiple |
Takes a JSON array of url strings and returns with each one shortened as a single response |
GET /api/v1/url/{urlId} |
Retrieves the corresponding URL for the given urlId request parameter |
GET /u/{urlId} |
Retrieves and directly redirects to the given URL from the urdId request parameter |
This URL shortener algorithm will average 6 characters, maxing at 9 (depending on the lifetime of the applications database).
It does this by first splitting the protocol and host name from the actual path/route name and query parameters. These two components are stored separately in the database using a hexacondrabimal id instead a normal decimal id (essentially an identifier that uses 0-9, all uppercase, and all lowercase characters meanings 62 character possibilities). These two identifiers are joined by a .
to create a complete url id.
By doing this, it ensures that the first part of the url id (the protocol + host name) will be very unlikely to be more than 4 characters as the number of unique domains in the database would need to reach 14,776,336 before it because 5 characters (62^4=14,776,336). It also means that the second part of the url id (path/route name + query parameters) would need to reach 238,328 before it became 4 characters (62^3=238,328).
An average of 3 (4 once it passes 238,328 host names) characters for the host name and an average of 2 characters for the route/path name means 6-7 character average of the url id (also counting the .
). This maxes at 9 (possbily 10 in the future) as there are currently only ~13 million registered domains world wide, adding 1 for the .
and 4 at most for the route/path name (not accounting for the possibility of a double registering for both the HTTP and HTTPS protocol which would count for two separate entries in the host name database table)
The algorithm for converting a number to a hexacondrabimal can be found in the IdShortener class.