- Rename
.env.example
to.env.local
. - Fill
GOOGLE_CLIENT_ID
andGOOGLE_CLIENT_SECRET
. yarn build && yarn start
yarn dev
doesn't work because we use an internal next
server and in a dev
mode it doesn't support module sharing (we share a default task manager).
Open http://localhost:3000 in a browser
- User click
Analyze my emails
-> - Server redirects him to the google oauth ->
- Google sends him back to the callback url ->
- We create an instance of the email provider (async iterator with a few decorators) ->
- We create a combined processor (each processor analyzes emails in a different way) ->
- Create a task with appropriate command and send it to the task manager ->
- Redirect user to the result page where he will wait for the task result
- We know that we process emails. Can we take advantage of this info? For example, we can analyze email parts differently (subject, header, body)
- What is the average size of the email?
- How many emails do we need to analyze per user?
- Do we need to have all emails at the same time or we can process them one by one or use a batching?
- How much time/memory do we need to process emails for one user?
- How to split text correctly? (n-grams, ?)
- If we found similar sentences, how to check that they are not in similar paragraphs?
- Do we need to clean up somehow text before comparing?
- What algorithms do we have for text comparison?
- Do we need to use an exact comparison, or should we also find almost identical texts? (NLP, cosine similarity ?)
- Can we use something like Elasticsearch/Solr for finding similar texts?
- ...