This guide details a learning path for high school students looking to explore the field of Machine Learning & Artificial Intelligence.

Learning Artificial Intelligence and Machine Learning as a High Schooler

Hi, I'm Karan, a high school student based in Singapore. Having spent the last year exploring the field of Artificial Intelligence (AI) and Machine Learning (ML), I believe that there does not exist a learning path in this field that is built specifically for High School students. This is my attempt to create one.

Since I started my journey into this area, I've tried to spend a couple of hours every day understanding as much as I can, whether it be watching Youtube videos, undertaking personal projects or simply reading books. I've been guided by older peers who've had far more experience than me, but know that such guidance is not available to everyone - so this is my attempt to relay all the learnings into one concrete document.

All the information that I have compiled in this guide is intended for high schoolers wishing to excel in this up-and-coming field. It is intended to be followed chronologically and, unlike most guides/learning paths that I've come across, doesn't require an understanding of linear algebra, partial derivatives and other complex mathematical concepts that aren't found in a high school syllabus. However, it does include a course which covers the fundamentals of the essential math for Machine Learning - the level of which I'd consider comparable to high school maths. If you work through this path on a regular basis, I believe that you could get to a reasonably proficient level in about three months. Beyond that, this learning path provides content that can keep you learning for the rest of your time in high school.

So, let's get to it.

1. Learn the basics of programming in Python.

I strongly suggest Python as a starting point, as it's a language that ticks most boxes when it comes to being used in the AI/ML domain - not only is it extremely easy to learn, it also provides libraries and frameworks for pretty much every basic algorithm known in the field. While R is useful, I find that Python is far more suitable for high school students due to its readability and learnability. Besides basic programming, for Machine Learning in particular, the libraries that are most useful are NumPy, Pandas and Matplotlib.

  • For those of you who have never coded before, I suggest taking a course provided by the University of Toronto (one of the best universities for AI/ML right now). It will take you a few weeks, but it's well worth your time - most of the knowledge you gain through this course can be applied to any other programming language, the only difference being the syntax. The course is free, and can be found here.
  • For those of you who have coding experience in a language besides Python, just skim through this tutorial for a basic understanding of Python syntax - it shouldn't take you more than a day.
  • ML and AI are built on mathematical principles like Calculus, Linear Algebra, Probability, Statistics, and Optimization - many hopeful AI practitioners (like myself) find this daunting. This course on edX, Essential Math for Machine Learning: Python Edition by Microsoft, is not designed to make you a mathematician. Rather, it aims to help you learn some essential foundational concepts and the notation used to express them. The course provides a hands-on approach to working with data and applying the techniques you’ve learned in real-world problem settings. Financial aid is available for those who need it.
  • Now, after you've learnt the basics of Python, you need to understand the two fundamental libraries used in the field - NumPy and Pandas, which are used primarily for data manipulation, representation and storage. Matplotlib, the third 'core' library in the area, is used to visualize this data through graphs and diagrams - but we'll get to that later. These two courses together shouldn't take more than a couple of days: NumPy and Pandas.
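
To give a feel for what these two libraries do, here is a minimal sketch. It is not taken from either course; the student names and heights are made-up values for illustration only.

```python
import numpy as np
import pandas as pd

# NumPy: arrays support element-wise maths without writing loops.
heights_cm = np.array([150.0, 160.0, 170.0, 180.0])  # made-up sample values
heights_m = heights_cm / 100        # divides every element at once
mean_height = heights_m.mean()      # average of the whole array

# Pandas: a DataFrame is a labelled table built on top of NumPy arrays.
students = pd.DataFrame({
    "name": ["Ana", "Ben", "Chen", "Dev"],  # hypothetical names
    "height_cm": heights_cm,
    "grade": [9, 10, 11, 12],
})
students["height_m"] = students["height_cm"] / 100  # add a derived column
seniors = students[students["grade"] >= 11]         # keep rows matching a condition

print(mean_height)
print(seniors)
```

The pattern to notice is that both libraries operate on whole columns at a time, which is how most Machine Learning code handles data.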

With this in your back pocket, you should now be set with the core programming needed to learn Machine Learning and Artificial Intelligence.

2. Understand the fundamentals of Machine Learning.

If there's one universal course for Machine Learning, it has to be Andrew Ng's. It may seem slightly challenging for high school students, as it refers to concepts such as partial derivatives - but I firmly believe that understanding these isn't required to gain tangible knowledge from the course. I found it particularly beneficial to re-watch some lectures in Weeks 3 to 5 - these topics are advanced, so they may feel a bit fast the first time you watch them. Don't be too worried if you can't fully follow the core mathematics, especially with respect to the calculus - some of this certainly requires university-level math knowledge. It is more important that you are able to follow the thought process that Prof. Ng uses when relaying his knowledge, as this enables you to gain an understanding of what is going on under the hood of Machine Learning processes.

I would encourage you to take notes during the course, as writing down what you learn helps ensure that you are truly understanding the information relayed. Completing the programming tutorials and exercises is not essential, as these are done in Matlab - which (in my experience) can be tricky to grasp, as it is a matrix-based language. But don't worry, we will be doing the very same (and far more advanced) algorithms in Python in just a short amount of time.
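
As a taste of what "under the hood" means, here is a minimal sketch of gradient descent fitting a straight line, written in plain Python rather than Matlab. It is my own illustration and not code from the course: the toy data, the function name `fit_line` and the settings are all assumptions chosen to keep the example small.

```python
# Toy data generated from y = 2x + 1 (made up for illustration).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]


def fit_line(xs, ys, learning_rate=0.05, steps=2000):
    """Fit y = w*x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Prediction errors with the current w and b.
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Partial derivatives of the mean squared error with respect to w and b.
        grad_w = (2 / n) * sum(e * x for e, x in zip(errors, xs))
        grad_b = (2 / n) * sum(errors)
        # Take a small step "downhill", against the gradient.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


w, b = fit_line(xs, ys)
print(round(w, 3), round(b, 3))
```

The printed values should come out close to 2 and 1, the slope and intercept used to create the data. The two `grad_` lines are the only place partial derivatives appear, and you can follow the loop without being able to derive them yourself.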

This free course can be found here.

3. Gain exposure to an assortment of Machine Learning algorithms, and implement them in real-world scenarios.

Implementing ML algorithms without the university-level math knowledge that powers the nuts and bolts of these algorithms sounds like a paradoxical task - however, a team from Australia set out to do just this.

Kirill Eremenko and Hadelin de Ponteves, a pair of researchers from the 'SuperDataScience' team, are absolutely fantastic at finding relevant ways to apply simple algorithms in real life. Furthermore, they go into enough depth for you to understand how each algorithm works, but without the complex maths that a high school student would not be able to understand. Their course covers both Python and R, though I would not worry about R at this point - simply go through the Python tutorials. Also, if you find that they are going a bit too slow, play this course at 1.25x speed (I did that and found it much more suitable to my learning).

Their course is on Udemy, and is only offered as a paid version, though Udemy regularly has discounts of 90% or more on their courses. It can be found here, and is usually around $10. It covers everything from basic regression algorithms to deep neural networks, the latter of which are the core architecture used in many modern-day applications like ChatGPT and AlphaFold. If you wish to explore even more advanced areas, their Deep Learning course is offered at the end of the Machine Learning course for a 90% discount.

If you're unwilling to pay for this course, you can check out Google's free Deep Learning course here or University of Michigan's free course here. In my opinion though, these are far from as well-rounded as the SuperDataScience team's courses.

For these courses, taking notes isn't a necessity - there are tons of 'algorithm cheat sheets' online, which offer a quick intuition for how the algorithms work. This website lists a few.

4. Explore, explore and explore.

Now that you've covered a wide range of machine learning concepts, it's time for you to independently use this knowledge to complete some projects. I'd suggest exploring Kaggle and the UCI Machine Learning Repository - find a dataset you have an interest in, and model some solutions to problems that they relate to. Play around with different algorithms and work towards optimizing performance.

Ensure that the datasets you use are simple and clean in nature - i.e. they shouldn't require too much pre-processing or domain-specific knowledge to work with. Some easy datasets off the top of my head are the Iris, Wine, Breast Cancer Wisconsin, Autism Screening, Congress Voting, Handwritten Digits MNIST and Fashion MNIST ones.
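
To make "play around with different algorithms" concrete, here is a minimal sketch of one such experiment on the Iris dataset. It assumes you have scikit-learn installed; the choice of the two models and their settings is an arbitrary starting point of mine, not a recommendation from any of the courses above.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Load the Iris dataset: 150 flowers, 4 measurements each, 3 species.
X, y = load_iris(return_X_y=True)

# Hold back 30% of the rows so the models are scored on data they haven't seen.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Two candidate models to compare (settings are arbitrary starting points).
models = {
    "k-nearest neighbours": KNeighborsClassifier(n_neighbors=5),
    "logistic regression": LogisticRegression(max_iter=1000),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)            # learn from the training rows
    predictions = model.predict(X_test)    # predict the held-back rows
    scores[name] = accuracy_score(y_test, predictions)
    print(f"{name}: {scores[name]:.2f}")
```

From here, a natural next step is to swap in another dataset or another model and see how the scores change.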

If you ever come across a roadblock, Stack Overflow is your best friend - it has an answer to almost any question that you'd have. If it doesn't, just post one - you should get replies within a couple of hours! There's not much more to this step - when you find that you've become comfortable with the whole modelling process from start to finish, feel free to move on!

5. Find a niche and dive deeper.

Now you should not only have a great and broad understanding of all the basics, but also have an ability to apply it to some real-world data problems. However, it's important to understand that these basics don't span the whole world of ML/AI - rather, many of them have been known ways of tackling such data problems for years, but unfortunately only more recently were computers powerful enough to truly leverage them in a reasonable amount of runtime. Most modern work in the area focuses on improving these in novel ways, and on building systems around them that extend and enhance the underlying algorithms. Thus, I suggest you find an area of interest in the broader field of Machine Learning, and delve deeper into it in order to become more experienced with the state of the art of that field as it is today. You probably won't have time to become an expert in all of the areas I outline below during your high school tenure, but try and conquer one or two.

Before getting into any of these areas, I'd recommend truly understanding what it involves - a simple Youtube search for a high-level explanation will give you all you need. So let's get to it.

  • Computer Vision: This area pertains to making computers see and understand things using a special type of neural network. Stanford publishes its course on this here, with lectures, course notes and assignments available online. Go through this, but don't worry about the math being too complicated at times - the course is intended primarily to deepen your knowledge, which it inevitably will do. You can also look to OpenCV, a computer vision library that does a lot of the complex stuff for you. A great tutorial can be found here. Once you're done with these, look at more advanced image datasets on Kaggle and UCI, or even enter some Kaggle Competitions.

  • Natural Language Processing: Understanding how computers learn to speak is also a prominent topic today. Once again, Stanford offers a great course that's online and can be found here. If you don't understand some of the math concepts, don't worry, just gain an understanding of how this domain works. For implementations, you could undertake this Udemy course. Alternatively, you could go through some of the videos by well-known Machine Learner Siraj Raval. Once you've done these, try undertaking simple, well-known projects like building a chatbot, sentiment analysis or creating lyrics to a song - simple Youtube searches should help you out. More modern applications like ChatGPT and Claude are built on neural network-based systems called Large Language Models, which are primarily based on the Transformer architecture - this course by Andrew Ng is a good starting point.

  • Reinforcement Learning: This domain focuses on how machines learn to act in a particular setting, and its most popular application is in the field of video games. Siraj Raval has a pretty good playlist on this, which can be found here. If you are looking for implementation-based tutorials specifically using a high-level package like TensorFlow, Denny Britz has a solid set of tutorials which can be found here. David Silver's UCL course is also great, though beginners may find it a bit tricky - it can be found here. Once you're done with these, it's pretty logical to just start downloading base projects or games online, and adding an element of AI to govern how the agents act. Simple walkthroughs can again be found via a simple Youtube search.

  • Data Science & Analytics: This field is a budding domain with tons of exciting job opportunities, and is used extensively in most modern corporations to derive insights from the hoards of data being collected in order to inform business decisions. I suggest undertaking either SuperDataScience's paid course or UC San Diego's Python-based free course, though you can find specific learning paths for data science with a simple Google search. You can also use the following links to learn SQL and Matplotlib, which are tangential tools used for a lot of modern data analytics. The advantage in learning this at a student level is employability - I have numerous friends in high school who've been offered data science internships, as the insights gained from work in this discipline can instantly be monetized by companies. Data-driven decision making is really the only form of decision making in today's corporate world.

  • There are also areas like Boltzmann Machines (used for recommendation systems), Adversarial Networks (AI improving AI) and Genetic Algorithms (improving a solution to a problem in a way similar to natural evolution), but in my opinion, the combination of their niche applicability and requiring more advanced levels of math makes them less desirable as a starting point. Do feel free to explore these if you have a particular passion for one of them, though they also aren't as well documented as the other areas, which may make mastering them slightly more tricky.
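
If the idea of improving a solution "in a way similar to natural evolution" sounds abstract, the sketch below shows one very simple form it can take. It is a toy example of my own: the problem (evolving a list of bits towards all 1s), the function names and every setting are assumptions chosen for illustration, and real genetic algorithms vary a lot in how they select, cross over and mutate.

```python
import random


def fitness(genome):
    """Score a candidate solution: here, simply the number of 1s it contains."""
    return sum(genome)


def evolve(genome_length=20, population_size=30, generations=60,
           mutation_rate=0.02, seed=0):
    rng = random.Random(seed)  # fixed seed so every run gives the same result
    population = [[rng.randint(0, 1) for _ in range(genome_length)]
                  for _ in range(population_size)]
    for _ in range(generations):
        # Selection: keep the fitter half of the population as parents.
        population.sort(key=fitness, reverse=True)
        parents = population[:population_size // 2]
        children = []
        while len(parents) + len(children) < population_size:
            mum, dad = rng.sample(parents, 2)
            # Crossover: splice the start of one parent onto the end of the other.
            cut = rng.randrange(1, genome_length)
            child = mum[:cut] + dad[cut:]
            # Mutation: occasionally flip a bit to keep some variety.
            child = [1 - bit if rng.random() < mutation_rate else bit
                     for bit in child]
            children.append(child)
        population = parents + children
    return max(population, key=fitness)


best = evolve()
print(fitness(best), "out of", len(best))
```

Because the parents are carried over unchanged into the next generation, the best score can never get worse from one generation to the next. To adapt the sketch to a different problem, the main thing to change is the `fitness` function.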

BONUS (though still extremely important): Broaden your AI/ML horizons.

If you want to work in this field in the long run, it's crucial to understand it from a more holistic perspective - by this I mean learning about groundbreaking discoveries, the discourse around how it should be applied and its general implications for society. You should start doing things listed in this section as soon as you have the necessary understanding of how the technology works - I believe a good starting point is as you begin Section 4 of this learning path. This kind of information may not specifically help with implementing algorithms for data problems, but for a technology that is so integral to today's world, it really helps shape a more robust understanding of its role, true potential and limits.

There are a few things that a high schooler can do to deepen their general understanding of the field and become more knowledgeable:

  • Start reading research papers: I'd like to emphasize these really aren't as challenging as they sound. While the math that governs modern techniques may be very advanced, simply gaining exposure to what's going on in the front lines of the industry is never a bad thing. If you ever come across one you don't understand, just put it down - there are more than enough alternatives to keep you busy. This website offers a host of great papers to start with, though after you finish those, this offers a lengthier list - simply read ones you're interested in or related to your area of 'expertise' from Section 5. It's helpful to keep a small diary of learnings from each paper. If you are unable to truly understand many of these research papers, try going through this guide I wrote which provides more digestible breakdowns of some recent innovations. This Youtube channel also has a host of more introductory explanations of papers, each covered in just two minutes.
  • Follow pioneers: People like Andrew Ng, Ian Goodfellow and Yann LeCun are regularly interviewed, providing the perspective of the 'founders' of what we know as AI & ML today. This Youtube channel gathers the best of these talks, and compiles them into a central resource - watch one a night, and I guarantee that you'll feel like an expert within weeks.
  • Stay up-to-date with the field: Wired is one of the best platforms for anyone interested in tech. It publishes multiple AI-related stories every day (though doesn't everyone these days?), which can be found here. It's a quick and engaging way to understand the trends of the time. Alternatively, subscribe to TechCrunch's Facebook Messenger bot - it often has interesting AI-related articles, and prompts you with information every day.
  • Understand the implications: There's no better way to do this than listening to TED talks. Their speakers are extremely knowledgeable in the field, and there is an increasing emphasis on AI in their speeches. A collection of videos can be found here.
  • The Philosophy: AI has its supporters and its opposers. The philosophy behind it, however, is intriguing. My favourite books that explore this area, and are suitable for high school students, include 'How to Create a Mind' by Ray Kurzweil and 'Life 3.0' by Max Tegmark - do give them a shot. They look at the more long-term trajectory of AI, which may not feel as relevant on a day-to-day basis, but helps understand the wider context of the technology as a whole.
  • Contributing: If you're the kind of person who likes to learn from others' experience, check out avenues of discourse such as this Facebook group, where people regularly post insightful articles and papers relating to advances in the area. Alternatively, for more casual conversations, check out subreddits on AI like this one.
  • Delve into the math: Yes, you do need university-level math fundamentals for these, but if you're a strong math student, there's nothing stopping you from taking some online courses. Microsoft has a free course that I've heard good things about, which requires just high school-level maths. This Quora thread also has some great resources that you should check out. 3Blue1Brown is a famous name in the community too, as his Youtube videos are fantastic for learning the maths (primarily linear algebra and calculus) behind some of the more complicated concepts.

Conclusion

I've heard far too many people tell me that learning Machine Learning and Artificial Intelligence is too much of a stretch for a high schooler to not write this - with a well-paved learning path, it can be studied by anyone. And with that, I wish everyone the best of luck in undertaking this learning path.

If you have additions or possible improvements to this guide, feel free to make a PR to this repository. And for feedback, collaborations or just general queries, feel free to write to me @ kj.jaisingh@gmail.com.
