This repository contains the implementation of a mini-application requested by Paperpile as part of its recruiting process.
- https://www.notion.so/Paperpile-full-stack-test-project-7953d77a7fe64de0a3c9c0bc9a2fa313
- cf. Figma file.
In short, the application consists of a front-end that lets a user select a Microsoft Word file (docx) and upload it to a server, which corrects possible spelling mistakes. The server corrects the mistakes, if any, and returns a URL for the corrected document.
Install the dependencies, then run the start script.
There is a front-end and a back-end.
- React/JavaScript stack
- no build or module system is used, for the sake of simplicity and velocity
- Types are added in specific places through JSDoc
- unit tests are available with good old QUnit, and run in the browser (so far tested only in Chrome)
- the front-end implementation is architected around an event-state-action paradigm, as popularized by Elm, Redux, and a few others:
  - the application receives events, which are turned into commands, which are then executed
  - React is used mostly as a rendering library that executes render commands
  - the pattern makes it possible to unit-test user scenarios
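The event-state-action flow described above can be illustrated with a minimal sketch. The event names (`FILE_SELECTED`, `UPLOAD_SUCCEEDED`, `UPLOAD_FAILED`), the command names (`RENDER`, `UPLOAD`), and the `update` and `runScenario` helpers are hypothetical and chosen for illustration only; the actual implementation may be organized differently.

```javascript
// Hypothetical sketch: names below are assumptions, not the repository's API.
const initialState = { screen: "select" };

/**
 * Pure transition function: maps (state, event) to a new state
 * plus a list of commands to execute.
 * @param {{screen: string}} state
 * @param {{type: string}} event
 */
function update(state, event) {
  switch (event.type) {
    case "FILE_SELECTED":
      return {
        state: { screen: "uploading" },
        commands: [
          { type: "RENDER", screen: "uploading" },
          { type: "UPLOAD", file: event.file },
        ],
      };
    case "UPLOAD_SUCCEEDED":
      return {
        state: { screen: "done", url: event.url },
        commands: [{ type: "RENDER", screen: "done", url: event.url }],
      };
    case "UPLOAD_FAILED":
      return {
        state: { screen: "error" },
        commands: [{ type: "RENDER", screen: "error" }],
      };
    default:
      // Unknown events leave the state untouched and produce no commands.
      return { state, commands: [] };
  }
}

/** Folds a user scenario (a sequence of events) through `update`. */
function runScenario(events) {
  let state = initialState;
  const commands = [];
  for (const event of events) {
    const result = update(state, event);
    state = result.state;
    commands.push(...result.commands);
  }
  return { state, commands };
}
```

Because `update` is a pure function, a unit test can replay a user scenario as a list of events and assert on the resulting commands, without rendering anything or contacting a server.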
Entry file: public/index.html
Where we felt the requirements could be improved, we made some decisions:
- we added an error screen that gives feedback to the user when a request to process the spelling mistakes in the document has failed
- we also added limits to the documents that can be received to spare the backend (10 MB in the current implementation)
- the file picker accepts non-Word files: the `accept` property does not seem to work. We haven't found a way to instruct the browser's file input widget to only accept docx files.
- for the reason previously mentioned, a docx file that has been turned into a zip file (by replacing the .docx extension with .zip) will be spell-checked correctly. However, clicking the download link will then trigger the download of a zip file.
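One possible workaround for the limitations above is to validate the selected file in JavaScript before uploading it. The sketch below is illustrative only: `validateSelectedFile` and its `reason` codes are hypothetical names, the 10 MB limit is assumed to mean 10 * 1024 * 1024 bytes, and a file whose size is exactly at the limit is assumed to be accepted.

```javascript
// ASSUMPTION: 10 MB is interpreted as 10 * 1024 * 1024 bytes.
const SIZE_LIMIT = 10 * 1024 * 1024;

/**
 * Checks the extension and size of a selected file.
 * Hypothetical helper, not part of the repository.
 * @param {{name: string, size: number}} file
 * @param {number} [sizeLimit]
 * @returns {{ok: boolean, reason: string | null}}
 */
function validateSelectedFile(file, sizeLimit = SIZE_LIMIT) {
  // Extension check only: this does not prove the content is a real docx.
  if (!file.name.toLowerCase().endsWith(".docx")) {
    return { ok: false, reason: "not-docx" };
  }
  // ASSUMPTION: a file exactly at the limit is still accepted.
  if (file.size > sizeLimit) {
    return { ok: false, reason: "too-large" };
  }
  return { ok: true, reason: null };
}
```

In the browser, the function could be called with the `File` object taken from the input's `files` list, since `File` exposes both `name` and `size`.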
- Express stack
- router, file upload, logger, cors Express add-ons
Where we felt the requirements could be improved, we made some decisions:
- The requirement "Read and extract the text of the first paragraph of the first page" may not be strictly implemented. The program looks for spelling mistakes everywhere in the document. The requirement is therefore fulfilled only if it was not meant to restrict reading and extraction to the first paragraph of the first page.
- we do not check that the file that is being uploaded is indeed a docx file.
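As an illustration of what a first-level check on uploads could look like, the sketch below tests whether the uploaded bytes start with the ZIP local file header signature, since a docx file is a ZIP container. The helper name `looksLikeZipArchive` is hypothetical, and this is not what the back-end currently does.

```javascript
// ZIP local file header signature: the bytes "PK\x03\x04".
const ZIP_LOCAL_FILE_HEADER = [0x50, 0x4b, 0x03, 0x04];

/**
 * Returns true if the given bytes start like a ZIP archive.
 * Hypothetical helper; accepts a Uint8Array or a Node.js Buffer.
 * @param {Uint8Array} bytes
 * @returns {boolean}
 */
function looksLikeZipArchive(bytes) {
  if (bytes.length < ZIP_LOCAL_FILE_HEADER.length) return false;
  return ZIP_LOCAL_FILE_HEADER.every((byte, i) => bytes[i] === byte);
}
```

This check only rejects uploads that are clearly not ZIP containers. It cannot tell a docx file from any other ZIP archive; a stricter check would need to open the archive and look for the parts a Word document typically contains, such as `word/document.xml`.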
Tests are available for both front-end and back-end in the `tests` directory. Cf. `tests/README.md`.
- remove downloadable files after some time has passed, or based on some other criterion, to free server space
- could do more to validate incoming data:
  - post request (we do not check that the file that is being uploaded is indeed a docx file)
  - spell checking response
- could run some extra tests for files with unsafe characters
- tests with a larger variety of docx files
  - we made some assumptions (marked with `ASSUMPTIONS` in comments) that may be proved wrong, so more tests would help support or dispel these assumptions
  - variety means a test set with miscellaneous:
    - length: . < SIZE_LIMIT, . = SIZE_LIMIT, . > SIZE_LIMIT
    - content: text with spelling mistakes in paragraphs, titles, image captions, with revision marks, etc. The idea is to cover the full docx markup to surface possible inconsistencies
    - content: in particular we want to check Word's trimming behavior, so we don't inadvertently remove spaces from the input document
- tests (e2e)
- we do not check that the file that is being uploaded is indeed a docx file
- tests (e2e)
- It would be great if the Figma file provided actual HTML/CSS that could be plugged into the implementation. Designers using a design system may facilitate the handoff process.