Reference Materials for System Design Interview

Chapter 1: Introduction and Overview

[1] Data warehouse. https://cloud.google.com/learn/what-is-a-data-warehouse.
[2] Structured vs. unstructured data. https://signal.onepointltd.com/post/102gjab/machine-learning-libraries-for-tabular-data-problems.
[3] Bagging technique in ensemble learning. https://en.wikipedia.org/wiki/Bootstrap_aggregating.
[4] Boosting technique in ensemble learning. https://aws.amazon.com/what-is/boosting/.
[5] Stacking technique in ensemble learning. https://machinelearningmastery.com/stacking-ensemble-machine-learning-with-python/.
[6] Interpretability in machine learning. https://blog.ml.cmu.edu/2020/08/31/6-interpretability/.
[7] Traditional machine learning algorithms. https://machinelearningmastery.com/a-tour-of-machine-learning-algorithms/.
[8] Sampling strategies. https://www.scribbr.com/methodology/sampling-methods/.
[9] Data splitting techniques. https://machinelearningmastery.com/train-test-split-for-evaluating-machine-learning-algorithms/.
[10] Class-balanced loss. https://arxiv.org/pdf/1901.05555.pdf.
[11] Focal loss paper. https://arxiv.org/pdf/1708.02002.pdf.
[12] Focal loss. https://medium.com/swlh/focal-loss-an-efficient-way-of-handling-class-imbalance-4855ae1db4cb.
[13] Data parallelism. https://www.telesens.co/2017/12/25/understanding-data-parallelism-in-machine-learning/.
[14] Model parallelism. https://docs.aws.amazon.com/sagemaker/latest/dg/model-parallel-intro.html.
[15] Cross entropy loss. https://en.wikipedia.org/wiki/Cross_entropy.
[16] Mean squared error loss. https://en.wikipedia.org/wiki/Mean_squared_error.
[17] Mean absolute error loss. https://en.wikipedia.org/wiki/Mean_absolute_error.
[18] Huber loss. https://en.wikipedia.org/wiki/Huber_loss.
[19] L1 and L2 regularization. https://www.analyticssteps.com/blogs/l2-and-l1-regularization-machine-learning.
[20] Entropy regularization. https://paperswithcode.com/method/entropy-regularization.
[21] K-fold cross validation. https://en.wikipedia.org/wiki/Cross-validation_(statistics).
[22] Dropout paper. https://jmlr.org/papers/volume15/srivastava14a/srivastava14a.pdf.
[23] Overview of optimization algorithms. https://ruder.io/optimizing-gradient-descent/.
[24] Stochastic gradient descent. https://en.wikipedia.org/wiki/Stochastic_gradient_descent.
[25] AdaGrad optimization algorithm. https://optimization.cbe.cornell.edu/index.php?title=AdaGrad.
[26] Momentum optimization algorithm. https://optimization.cbe.cornell.edu/index.php?title=Momentum.
[27] RMSProp optimization algorithm. https://optimization.cbe.cornell.edu/index.php?title=RMSProp.
[28] ELU activation function. https://ml-cheatsheet.readthedocs.io/en/latest/activation_functions.html#elu.
[29] ReLU activation function. https://ml-cheatsheet.readthedocs.io/en/latest/activation_functions.html#relu.
[30] Tanh activation function. https://ml-cheatsheet.readthedocs.io/en/latest/activation_functions.html#tanh.
[31] Sigmoid activation function. https://ml-cheatsheet.readthedocs.io/en/latest/activation_functions.html#softmax.
[32] FID score. https://en.wikipedia.org/wiki/Fr%C3%A9chet_inception_distance.
[33] Inception score. https://en.wikipedia.org/wiki/Inception_score.
[34] BLEU metrics. https://en.wikipedia.org/wiki/BLEU.
[35] METEOR metrics. https://en.wikipedia.org/wiki/METEOR.
[36] ROUGE score. https://en.wikipedia.org/wiki/ROUGE_(metric).
[37] CIDEr score. https://arxiv.org/pdf/1411.5726.pdf.
[38] SPICE score. https://arxiv.org/pdf/1607.08822.pdf.
[39] Quantization-aware training. https://pytorch.org/docs/stable/quantization.html.
[40] Model compression survey. https://arxiv.org/pdf/1710.09282.pdf.
[41] Shadow deployment. https://christophergs.com/machine%20learning/2019/03/30/deploying-machine-learning-applications-in-shadow-mode/.
[42] A/B testing. https://en.wikipedia.org/wiki/A/B_testing.
[43] Canary release. https://blog.getambassador.io/cloud-native-patterns-canary-release-1cb8f82d371a.


Chapter 2: Visual Search System

[1] Visual search at Pinterest. https://arxiv.org/pdf/1505.07647.pdf.
[2] Visual embeddings for search at Pinterest. https://medium.com/pinterest-engineering/unifying-visual-embeddings-for-visual-search-at-pinterest-74ea7ea103f0.
[3] Representation learning. https://en.wikipedia.org/wiki/Feature_learning.
[4] ResNet paper. https://arxiv.org/pdf/1512.03385.pdf.
[5] Transformer paper. https://arxiv.org/pdf/1706.03762.pdf.
[6] Vision Transformer paper. https://arxiv.org/pdf/2010.11929.pdf.
[7] SimCLR paper. https://arxiv.org/pdf/2002.05709.pdf.
[8] MoCo paper. https://openaccess.thecvf.com/content_CVPR_2020/papers/He_Momentum_Contrast_for_Unsupervised_Visual_Representation_Learning_CVPR_2020_paper.pdf.
[9] Contrastive representation learning methods. https://lilianweng.github.io/posts/2019-11-10-self-supervised/.
[10] Dot product. https://en.wikipedia.org/wiki/Dot_product.
[11] Cosine similarity. https://en.wikipedia.org/wiki/Cosine_similarity.
[12] Euclidean distance. https://en.wikipedia.org/wiki/Euclidean_distance.
[13] Curse of dimensionality. https://en.wikipedia.org/wiki/Curse_of_dimensionality.
[14] Curse of dimensionality issues in ML. https://www.mygreatlearning.com/blog/understanding-curse-of-dimensionality/.
[15] Cross-entropy loss. https://en.wikipedia.org/wiki/Cross_entropy.
[16] Vector quantization. http://ws.binghamton.edu/fowler/fowler%20personal%20page/EE523_files/Ch_10_1%20VQ%20Description%20(PPT).pdf.
[17] Product quantization. https://towardsdatascience.com/product-quantization-for-similarity-search-2f1f67c5fddd.
[18] R-Trees. https://en.wikipedia.org/wiki/R-tree.
[19] KD-Tree. https://kanoki.org/2020/08/05/find-nearest-neighbor-using-kd-tree/.
[20] Annoy. https://towardsdatascience.com/comprehensive-guide-to-approximate-nearest-neighbors-algorithms-8b94f057d6b6.
[21] Locality-sensitive hashing. https://web.stanford.edu/class/cs246/slides/03-lsh.pdf.
[22] Faiss library. https://github.com/facebookresearch/faiss/wiki.
[23] ScaNN library. https://github.com/google-research/google-research/tree/master/scann.
[24] Content moderation with ML. https://appen.com/blog/content-moderation/.
[25] Bias in AI and recommendation systems. https://www.searchenginejournal.com/biases-search-recommender-systems/339319/#close.
[26] Positional bias. https://eugeneyan.com/writing/position-bias/.
[27] Smart crop. https://blog.twitter.com/engineering/en_us/topics/infrastructure/2018/Smart-Auto-Cropping-of-Images.
[28] Better search with GNNs. https://arxiv.org/pdf/2010.01666.pdf.
[29] Active learning. https://en.wikipedia.org/wiki/Active_learning_(machine_learning).
[30] Human-in-the-loop ML. https://arxiv.org/pdf/2108.00941.pdf.


Chapter 3: Google Street View Blurring System

[1] Google Street View. https://www.google.com/streetview.
[2] DETR. https://github.com/facebookresearch/detr.
[3] RCNN family. https://lilianweng.github.io/posts/2017-12-31-object-recognition-part-3.
[4] Fast R-CNN paper. https://arxiv.org/pdf/1504.08083.pdf.
[5] Faster R-CNN paper. https://arxiv.org/pdf/1506.01497.pdf.
[6] YOLO family. https://pyimagesearch.com/2022/04/04/introduction-to-the-yolo-family.
[7] SSD. https://jonathan-hui.medium.com/ssd-object-detection-single-shot-multibox-detector-for-real-time-processing-9bd8deac0e06.
[8] Data augmentation techniques. https://www.kaggle.com/getting-started/190280.
[9] CNN. https://en.wikipedia.org/wiki/Convolutional_neural_network.
[10] Object detection details. https://dudeperf3ct.github.io/object/detection/2019/01/07/Mystery-of-Object-Detection.
[11] Forward pass and backward pass. https://www.youtube.com/watch?v=qzPQ8cEsVK8.
[12] MSE. https://en.wikipedia.org/wiki/Mean_squared_error.
[13] Log loss. https://en.wikipedia.org/wiki/Cross_entropy.
[14] Pascal VOC. http://host.robots.ox.ac.uk/pascal/VOC/voc2008/index.html.
[15] COCO dataset evaluation. https://cocodataset.org/#detection-eval.
[16] Object detection evaluation. https://github.com/rafaelpadilla/Object-Detection-Metrics.
[17] NMS. https://en.wikipedia.org/wiki/NMS.
[18] PyTorch implementation of NMS. https://learnopencv.com/non-maximum-suppression-theory-and-implementation-in-pytorch/.
[19] Recent object detection models. https://viso.ai/deep-learning/object-detection/.
[20] Distributed training in TensorFlow. https://www.tensorflow.org/guide/distributed_training.
[21] Distributed training in PyTorch. https://pytorch.org/tutorials/beginner/dist_overview.html.
[22] GDPR and ML. https://www.oreilly.com/radar/how-will-the-gdpr-impact-machine-learning.
[23] Bias and fairness in face detection. http://sibgrapi.sid.inpe.br/col/sid.inpe.br/sibgrapi/2021/09.04.19.00/doc/103.pdf.
[24] AI fairness. https://www.kaggle.com/code/alexisbcook/ai-fairness.
[25] Continual learning. https://towardsdatascience.com/how-to-apply-continual-learning-to-your-machine-learning-models-4754adcd7f7f.
[26] Active learning. https://en.wikipedia.org/wiki/Active_learning_(machine_learning).
[27] Human-in-the-loop ML. https://arxiv.org/pdf/2108.00941.pdf.


Chapter 4: YouTube Video Search

[1] Elasticsearch. https://www.tutorialspoint.com/elasticsearch/elasticsearch_query_dsl.htm.
[2] Preprocessing text data. https://huggingface.co/docs/transformers/preprocessing.
[3] NFKD normalization. https://unicode.org/reports/tr15/.
[4] Tokenizer summary. https://huggingface.co/docs/transformers/tokenizer_summary.
[5] Hash collision. https://en.wikipedia.org/wiki/Hash_collision.
[6] Deep learning for NLP. http://cs224d.stanford.edu/lecture_notes/notes1.pdf.
[7] TF-IDF. https://en.wikipedia.org/wiki/Tf%E2%80%93idf.
[8] Word2Vec models. https://www.tensorflow.org/tutorials/text/word2vec.
[9] Continuous bag of words. https://www.kdnuggets.com/2018/04/implementing-deep-learning-methods-feature-engineering-text-data-cbow.html.
[10] Skip-gram model. http://mccormickml.com/2016/04/19/word2vec-tutorial-the-skip-gram-model/.
[11] BERT model. https://arxiv.org/pdf/1810.04805.pdf.
[12] GPT-3 model. https://arxiv.org/pdf/2005.14165.pdf.
[13] BLOOM model. https://bigscience.huggingface.co/blog/bloom.
[14] Transformer implementation from scratch. https://peterbloem.nl/blog/transformers.
[15] 3D convolutions. https://www.kaggle.com/code/shivamb/3d-convolutions-understanding-use-case/notebook.
[16] Vision Transformer. https://arxiv.org/pdf/2010.11929.pdf.
[17] Query understanding for search engines. https://www.linkedin.com/pulse/ai-query-understanding-daniel-tunkelang/.
[18] Multimodal video representation learning. https://arxiv.org/pdf/2012.04124.pdf.
[19] Multilingual language models. https://arxiv.org/pdf/2107.00676.pdf.
[20] Near-duplicate video detection. https://arxiv.org/pdf/2005.07356.pdf.
[21] Generalizable search relevance. https://livebook.manning.com/book/ai-powered-search/chapter-10/v-10/20.
[22] Freshness in search and recommendation systems. https://developers.google.com/machine-learning/recommendation/dnn/re-ranking.
[23] Semantic product search by Amazon. https://arxiv.org/pdf/1907.00937.pdf.
[24] Ranking relevance in Yahoo search. https://www.kdd.org/kdd2016/papers/files/adf0361-yinA.pdf.
[25] Semantic product search in E-Commerce. https://arxiv.org/pdf/2008.08180.pdf.


Chapter 5: Harmful Content Detection

[1] Facebook’s inauthentic behavior. https://transparency.fb.com/policies/community-standards/inauthentic-behavior/.
[2] LinkedIn’s professional community policies. https://www.linkedin.com/legal/professional-community-policies.
[3] Twitter’s civic integrity policy. https://help.twitter.com/en/rules-and-policies/election-integrity-policy.
[4] Facebook’s integrity survey. https://arxiv.org/pdf/2009.10311.pdf.
[5] Pinterest’s violation detection system. https://medium.com/pinterest-engineering/how-pinterest-fights-misinformation-hate-speech-and-self-harm-content-with-machine-learning-1806b73b40ef.
[6] Abusive detection at LinkedIn. https://engineering.linkedin.com/blog/2019/isolation-forest.
[7] WPIE method. https://ai.facebook.com/blog/community-standards-report/.
[8] BERT paper. https://arxiv.org/pdf/1810.04805.pdf.
[9] Multilingual DistilBERT. https://huggingface.co/distilbert-base-multilingual-cased.
[10] Multilingual language models. https://arxiv.org/pdf/2107.00676.pdf.
[11] CLIP model. https://openai.com/blog/clip/.
[12] SimCLR paper. https://arxiv.org/pdf/2002.05709.pdf.
[13] VideoMoCo paper. https://arxiv.org/pdf/2103.05905.pdf.
[14] Hyperparameter tuning. https://cloud.google.com/ai-platform/training/docs/hyperparameter-tuning-overview.
[15] Overfitting. https://en.wikipedia.org/wiki/Overfitting.
[16] Focal loss. https://amaarora.github.io/2020/06/29/FocalLoss.html.
[17] Gradient blending in multimodal systems. https://arxiv.org/pdf/1905.12681.pdf.
[18] ROC curve vs precision-recall curve. https://machinelearningmastery.com/roc-curves-and-precision-recall-curves-for-classification-in-python/.
[19] Bias introduced by human labeling. https://labelyourdata.com/articles/bias-in-machine-learning.
[20] Facebook’s approach to quickly tackling trending harmful content. https://ai.facebook.com/blog/harmful-content-can-evolve-quickly-our-new-ai-system-adapts-to-tackle-it/.
[21] Facebook’s TIES approach. https://arxiv.org/pdf/2002.07917.pdf.
[22] Temporal interaction embedding. https://www.facebook.com/atscaleevents/videos/730968530723238/.
[23] Building and scaling human review system. https://www.facebook.com/atscaleevents/videos/1201751883328695/.
[24] Abusive account detection framework. https://www.youtube.com/watch?v=YeX4MdU0JNk.
[25] Borderline content. https://transparency.fb.com/features/approach-to-ranking/content-distribution-guidelines/content-borderline-to-the-community-standards/.
[26] Efficient harmful content detection. https://about.fb.com/news/2021/12/metas-new-ai-system-tackles-harmful-content/.
[27] Linear Transformer paper. https://arxiv.org/pdf/2006.04768.pdf.
[28] Efficient AI models to detect hate speech. https://ai.facebook.com/blog/how-facebook-uses-super-efficient-ai-models-to-detect-hate-speech/.


Chapter 6: Video Recommendation System

[1] YouTube recommendation system. https://blog.youtube/inside-youtube/on-youtubes-recommendation-system.
[2] DNN for YouTube recommendation. https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/45530.pdf.
[3] CBOW paper. https://arxiv.org/pdf/1301.3781.pdf.
[4] BERT paper. https://arxiv.org/pdf/1810.04805.pdf.
[5] Matrix factorization. https://developers.google.com/machine-learning/recommendation/collaborative/matrix.
[6] Stochastic gradient descent. https://en.wikipedia.org/wiki/Stochastic_gradient_descent.
[7] WALS optimization. https://fairyonice.github.io/Learn-about-collaborative-filtering-and-weighted-alternating-least-square-with-tensorflow.html.
[8] Instagram multi-stage recommendation system. https://ai.facebook.com/blog/powered-by-ai-instagrams-explore-recommender-system/.
[9] Exploration and exploitation trade-offs. https://en.wikipedia.org/wiki/Multi-armed_bandit.
[10] Bias in AI and recommendation systems. https://www.searchenginejournal.com/biases-search-recommender-systems/339319/#close.
[11] Ethical concerns in recommendation systems. https://link.springer.com/article/10.1007/s00146-020-00950-y.
[12] Seasonality in recommendation systems. https://www.computer.org/csdl/proceedings-article/big-data/2019/09005954/1hJsfgT0qL6.
[13] A multitask ranking system. https://daiwk.github.io/assets/youtube-multitask.pdf.
[14] Benefiting from negative feedback. https://arxiv.org/abs/1607.04228?context=cs.


Chapter 7: Event Recommendation System

[1] Learning to rank methods. https://livebook.manning.com/book/practical-recommender-systems/chapter-13/53.
[2] RankNet paper. https://icml.cc/2015/wp-content/uploads/2015/06/icml_ranking.pdf.
[3] LambdaRank paper. https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/lambdarank.pdf.
[4] LambdaMART paper. https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/MSR-TR-2010-82.pdf.
[5] SoftRank paper. https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/SoftRankWsdm08Submitted.pdf.
[6] ListNet paper. https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/tr-2007-40.pdf.
[7] AdaRank paper. https://dl.acm.org/doi/10.1145/1277741.1277809.
[8] Batch processing vs stream processing. https://www.confluent.io/learn/batch-vs-real-time-data-processing/#:~:text=Batch%20processing%20is%20when%20the,data%20flows%20through%20a%20system.
[9] Leveraging location data in ML systems. https://towardsdatascience.com/leveraging-geolocation-data-for-machine-learning-essential-techniques-192ce3a969bc#:~:text=Location%20data%20is%20an%20important,based%20on%20your%20customer%20data.
[10] Logistic regression. https://www.youtube.com/watch?v=yIYKR4sgzI8.
[11] Decision tree. https://careerfoundry.com/en/blog/data-analytics/what-is-a-decision-tree/.
[12] Random forests. https://en.wikipedia.org/wiki/Random_forest.
[13] Bias/variance trade-off. http://www.cs.cornell.edu/courses/cs578/2005fa/CS578.bagging.boosting.lecture.pdf.
[14] AdaBoost. https://en.wikipedia.org/wiki/AdaBoost.
[15] XGBoost. https://xgboost.readthedocs.io/en/stable/.
[16] Gradient boosting. https://machinelearningmastery.com/gentle-introduction-gradient-boosting-algorithm-machine-learning/.
[17] XGBoost in Kaggle competitions. https://www.kaggle.com/getting-started/145362.
[18] GBDT. https://blog.paperspace.com/gradient-boosting-for-classification/.
[19] An introduction to GBDT. https://www.machinelearningplus.com/machine-learning/an-introduction-to-gradient-boosting-decision-trees/.
[20] Introduction to neural networks. https://www.youtube.com/watch?v=0twSSFZN9Mc.
[21] Bias issues and solutions in recommendation systems. https://www.youtube.com/watch?v=pPq9iyGIZZ8.
[22] Feature crossing to encode non-linearity. https://developers.google.com/machine-learning/crash-course/feature-crosses/encoding-nonlinearity.
[23] Freshness and diversity in recommendation systems. https://developers.google.com/machine-learning/recommendation/dnn/re-ranking.
[24] Privacy and security in ML. https://www.microsoft.com/en-us/research/blog/privacy-preserving-machine-learning-maintaining-confidentiality-and-preserving-trust/.
[25] Unique challenges of two-sided marketplaces. https://www.uber.com/blog/uber-eats-recommending-marketplace/.
[26] Data leakage. https://machinelearningmastery.com/data-leakage-machine-learning/.
[27] Online training frequency. https://huyenchip.com/2022/01/02/real-time-machine-learning-challenges-and-solutions.html#towards-continual-learning.


Chapter 8: Ad Click Prediction on Social Platforms

[1] Addressing delayed feedback. https://arxiv.org/pdf/1907.06558.pdf.
[2] AdTech basics. https://advertising.amazon.com/library/guides/what-is-adtech.
[3] SimCLR paper. https://arxiv.org/pdf/2002.05709.pdf.
[4] Feature crossing. https://developers.google.com/machine-learning/crash-course/feature-crosses/video-lecture.
[5] Feature extraction with GBDT. https://towardsdatascience.com/feature-generation-with-gradient-boosted-decision-trees-21d4946d6ab5.
[6] DCN paper. https://arxiv.org/pdf/1708.05123.pdf.
[7] DCN V2 paper. https://arxiv.org/pdf/2008.13535.pdf.
[8] Microsoft’s deep crossing network paper. https://www.kdd.org/kdd2016/papers/files/adf0975-shanA.pdf.
[9] Factorization Machines. https://www.jefkine.com/recsys/2017/03/27/factorization-machines/.
[10] Deep Factorization Machines. https://d2l.ai/chapter_recommender-systems/deepfm.html.
[11] Kaggle’s winning solution in ad click prediction. https://www.youtube.com/watch?v=4Go5crRVyuU.
[12] Data leakage in ML systems. https://machinelearningmastery.com/data-leakage-machine-learning/.
[13] Time-based dataset splitting. https://www.linkedin.com/pulse/time-based-splitting-determining-train-test-data-come-manraj-chalokia/?trk=public_profile_article_view.
[14] Model calibration. https://machinelearningmastery.com/calibrated-classification-model-in-scikit-learn/.
[15] Field-aware Factorization Machines. https://www.csie.ntu.edu.tw/~cjlin/papers/ffm.pdf.
[16] Catastrophic forgetting problem in continual learning. https://www.cs.uic.edu/~liub/lifelong-learning/continual-learning.pdf.


Chapter 9: Similar Listings on Vacation Rental Platforms

[1] Instagram’s Explore recommender system. https://ai.facebook.com/blog/powered-by-ai-instagrams-explore-recommender-system.
[2] Listing embeddings in search ranking. https://medium.com/airbnb-engineering/listing-embeddings-for-similar-listing-recommendations-and-real-time-personalization-in-search-601172f7603e.
[3] Word2vec. https://en.wikipedia.org/wiki/Word2vec.
[4] Negative sampling technique. https://www.baeldung.com/cs/nlps-word2vec-negative-sampling.
[5] Positional bias. https://eugeneyan.com/writing/position-bias/.
[6] Random walk. https://en.wikipedia.org/wiki/Random_walk.
[7] Random walk with restarts. https://www.youtube.com/watch?v=HbzQzUaJ_9I.
[8] Seasonality in recommendation systems. https://www.computer.org/csdl/proceedings-article/big-data/2019/09005954/1hJsfgT0qL6.


Chapter 10: Personalized News Feed

[1] News Feed ranking in Facebook. https://engineering.fb.com/2021/01/26/ml-applications/news-feed-ranking/.
[2] Twitter’s news feed system. https://blog.twitter.com/engineering/en_us/topics/insights/2017/using-deep-learning-at-scale-in-twitters-timelines.
[3] LinkedIn’s News Feed system. https://engineering.linkedin.com/blog/2020/understanding-feed-dwell-time.
[4] BERT paper. https://arxiv.org/pdf/1810.04805.pdf.
[5] ResNet model. https://arxiv.org/pdf/1512.03385.pdf.
[6] CLIP model. https://openai.com/blog/clip/.
[7] Viterbi algorithm. https://en.wikipedia.org/wiki/Viterbi_algorithm.
[8] TF-IDF. https://en.wikipedia.org/wiki/Tf%E2%80%93idf.
[9] Word2vec. https://en.wikipedia.org/wiki/Word2vec.
[10] Serving a billion personalized news feeds. https://www.youtube.com/watch?v=Xpx5RYNTQvg.
[11] Mean absolute error loss. https://en.wikipedia.org/wiki/Mean_absolute_error.
[12] Mean squared error loss. https://en.wikipedia.org/wiki/Mean_squared_error.
[13] Huber loss. https://en.wikipedia.org/wiki/Huber_loss.
[14] A news feed system design. https://liuzhenglaichn.gitbook.io/system-design/news-feed/design-a-news-feed-system.
[15] Predict viral tweets. https://towardsdatascience.com/using-data-science-to-predict-viral-tweets-615b0acc2e1e.
[16] Cold start problem in recommendation systems. https://en.wikipedia.org/wiki/Cold_start_(recommender_systems).
[17] Positional bias. https://eugeneyan.com/writing/position-bias/.
[18] Determine retraining frequency. https://huyenchip.com/2022/01/02/real-time-machine-learning-challenges-and-solutions.html#towards-continual-learning.


Chapter 11: People You May Know

[1] Clustering in ML. https://developers.google.com/machine-learning/clustering/overview.
[2] PYMK on Facebook. https://youtu.be/Xpx5RYNTQvg?t=1823.
[3] Graph convolutional neural networks. http://tkipf.github.io/graph-convolutional-networks/.
[4] GraphSAGE paper. https://cs.stanford.edu/people/jure/pubs/graphsage-nips17.pdf.
[5] Graph attention networks. https://arxiv.org/pdf/1710.10903.pdf.
[6] Graph isomorphism network. https://arxiv.org/pdf/1810.00826.pdf.
[7] Graph neural networks. https://distill.pub/2021/gnn-intro/.
[8] Personalized random walk. https://www.youtube.com/watch?v=HbzQzUaJ_9I.
[9] LinkedIn’s PYMK system. https://engineering.linkedin.com/blog/2021/optimizing-pymk-for-equity-in-network-creation.
[10] Addressing delayed feedback. https://arxiv.org/pdf/1907.06558.pdf.