Lessons on how to become a Software Engineer / Data Scientist

Acing the Software Engineering Interview

For a while now I have wanted to pursue a career in the software industry as a Data Scientist. I did not roll into Computer Science with a degree. Instead, I initially pursued a degree in Business before I made the switch. I discovered the wonderful world of Machine Learning and slowly drifted towards becoming a Data Scientist / Software Engineer. Yet, since leaving college I have not done as much with either as I would have liked. I have a high-paying job at a defense contractor, but due to the nature of working on an established application my abilities are under-utilized. I am mostly a software developer, not an engineer! I believe it is time to do something about that! In early 2017 I came across John Washam's blog and GitHub page on how to 'Ace the Interview' at a big tech company and decided I wanted to do the same. However, John readily admits he made a few mistakes. I intend to learn from them, and I hope you will too. This list of topics will prepare you for becoming a Software Engineer / Data Scientist. Just like John's 'aim high' attitude, that includes Google, Apple, Amazon and Microsoft.

Essentially this list is structured into four parts: preparation for the Software Engineering Interview and the basics of Software Engineering, followed by shorter sections on Machine Learning and Cybersecurity.

The list includes references to texts, web pages, YouTube videos, MOOCs and books. I intend to go easy on reading books and just give you the essential list. Reading takes up a lot of time and I fear knowledge is not retained. The scope of this list is also pared down as much as possible. However, I advise you to practice coding as much as possible. It is a good way to find your niche. You may become disheartened by this list, but try to take heart nonetheless. If all of this is new then at least try to integrate these topics into your work or hobby. That way this information will become more meaningful.

Short description of Goals

Things I want to achieve

- Go through all the relevant subject material - even stuff I already know. Proceeding through all the material over a period of several months may help me understand the big picture of Computer Science.
- Code a lot. I want to implement all the relevant examples of algorithms, data structures and applications to gain practical experience.
- Test a lot. Code testing is laborious and boring, but absolutely necessary. As this guide primarily deals with Java I will write test code using JUnit.
- Retain as much knowledge as possible. Try and find a way to retain the knowledge.
- Create showcase projects for future reference.
- Learn practical tips for becoming a Software Engineer. This means practicing for the interview, dealing with issues such as fear and my natural introversion, and learning how to work in an Agile / Scrum way.

Things I do not want to do

- Spend all of my free time on becoming a Software Engineer. John Washam admits he spent far too much time on his project. Currently I have a job and I do not need a burnout.
- Lose overview. There is such a thing as doing too much. Knowing when to stop is important and I will put that to the test.
- Be afraid to re-invent the wheel. The goal is not to do something new; the goal is to learn to do something new.

Preparing for 2021

- After 2020 and COVID I intend to adjust my goals. Right now I have achieved seniority in management, so that might give me leverage to pursue new goals. For 2021 I still need to fully establish those.

Part 1. Preparation for the Software Engineering Interview

If you want to become a software engineer there are non-CS skills and habits you need to develop, often called soft skills. This section on preparing for the interview lists the most important ones.

Interview Process & General Interview Prep

Here are some quick tutorials and tips to prepare for the interview process. I think it will give you a quick impression of what you do not know!

After learning many of the topics described in the sections on Software Engineering and Machine Learning you should have a well-rounded knowledge base. To actually get a job at the company of your choice you will need to ace the interview. This section will concentrate on all that is necessary to accomplish just that. If you want to continue learning take a look at the section below entitled Other Topics.

Coding Exercises and habits

When I started this page I really did not know how good I was at programming, let alone at being a software engineer. In any case, becoming the latter is a never-ending journey. To discover how good you are and to show your skills to others, the following good habits should be adopted.

Pick a language

For the interview you will need to select one programming language. My choice is Java. However, you will need to know more. You will also need to know a scripting language. The obvious choice is Python, which is also popular for Machine (Deep) Learning. As such there are coding examples in both Java and Python.

Book List

Despite my reluctance to give you an endless list of books to read, there are a number that can be considered essential. If you read them, delve into them. Don't just rush through cover to cover, but do the exercises and answer the practice questions. That is the best way to retain the knowledge they contain. An important book listed below is 'Algorithms' by Sedgewick and Wayne. It assumes you know how to program in Java, but other than that it is the most important book listed on this page and on its own contains maybe a quarter of the knowledge required to pass an interview - maybe more if only general topics are discussed. It took me about a year to read it, having stopped along the way numerous times.

Java

  • Algorithms (Sedgewick and Wayne)
    • I have already touted its importance above. Note that plenty of videos on this page discuss the same topics - especially those on Data Structures and Algorithms.

Python

Machine / Deep Learning

When it comes to Machine Learning, and these days more likely Deep Learning, there are only two books that will really make a difference.

Part 2. Basics of Software Engineering: Data Structure, Algorithms and Graphs

This part covers the basics of what a software engineer should know by heart.

Google tips

The people at Google have supplied a handy list of topics you need to master should you want to apply.

Algorithmic complexity / Big-O / Asymptotic analysis

Data Structures

For the study of data structures and algorithms a three-pronged approach is used. First, a short descriptive video is listed (usually YouTube). Second, a simple implementation is referenced which the reader can examine. Third, a small task is proposed for the reader. Usually this means creating your own implementation.

Algorithms

Sorting algorithms

Searching algorithms

Both search algorithms below are simple to implement, perhaps even trivial. Yet they naturally flow into the topic of Graphs.

  • Linear Searching
  • Binary Search
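To make the two searches above concrete, here is a minimal Java sketch. The class name `SearchDemo` and its method names are my own for illustration and do not come from the referenced implementations. Linear search inspects every element in turn, while binary search assumes the array is already sorted and halves the search range on each step.

```java
// Hypothetical illustration class; names are not taken from any referenced code.
class SearchDemo {

    // Linear search: check each element in order. O(n) time.
    static int linearSearch(int[] values, int target) {
        for (int i = 0; i < values.length; i++) {
            if (values[i] == target) {
                return i;
            }
        }
        return -1; // not found
    }

    // Binary search: requires the array to be sorted ascending. O(log n) time.
    static int binarySearch(int[] sorted, int target) {
        int lo = 0;
        int hi = sorted.length - 1;
        while (lo <= hi) {
            int mid = lo + (hi - lo) / 2; // written this way to avoid int overflow
            if (sorted[mid] < target) {
                lo = mid + 1;  // target can only be in the upper half
            } else if (sorted[mid] > target) {
                hi = mid - 1;  // target can only be in the lower half
            } else {
                return mid;
            }
        }
        return -1; // not found
    }

    public static void main(String[] args) {
        int[] data = {2, 3, 5, 7, 11, 13};
        System.out.println(linearSearch(data, 7));  // prints 3
        System.out.println(binarySearch(data, 7));  // prints 3
        System.out.println(binarySearch(data, 4));  // prints -1
    }
}
```

A good exercise is to write the recursive variant of `binarySearch` yourself and compare it with the loop above.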

Graphs

Graphs are an important aspect in Computer Science. Many problems can be explained and solved as Graphs.

Graphs can be represented as

Graphs can be traversed with algorithms such as:

Graphs are a complicated topic. The algorithms to traverse them are not easy to implement. If you find it difficult to use the videos and texts above then sign up for a MOOC. Two years ago I did Object Oriented Java Programming: Data Structures and Beyond. It really helped me see the big picture. The third course in the specialization (Advanced Data Structures in Java) is on Graphs.
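As a starting point before diving into a course, the sketch below shows one common way to represent a graph (an adjacency list) and one common way to traverse it (breadth-first search). The class `GraphDemo` and its methods are hypothetical names chosen for this example; they are not taken from the course or the texts above.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

// Hypothetical illustration class: an undirected graph stored as an adjacency list.
class GraphDemo {
    private final Map<Integer, List<Integer>> adjacency = new HashMap<>();

    // Adds an undirected edge by recording each node in the other's neighbor list.
    void addEdge(int a, int b) {
        adjacency.computeIfAbsent(a, k -> new ArrayList<>()).add(b);
        adjacency.computeIfAbsent(b, k -> new ArrayList<>()).add(a);
    }

    // Breadth-first search: visits nodes in order of their distance from start.
    List<Integer> breadthFirst(int start) {
        List<Integer> order = new ArrayList<>();
        Set<Integer> visited = new HashSet<>();
        Queue<Integer> queue = new ArrayDeque<>();
        visited.add(start);
        queue.add(start);
        while (!queue.isEmpty()) {
            int node = queue.remove();
            order.add(node);
            for (int next : adjacency.getOrDefault(node, Collections.<Integer>emptyList())) {
                // Set.add returns false if the node was already seen.
                if (visited.add(next)) {
                    queue.add(next);
                }
            }
        }
        return order;
    }

    public static void main(String[] args) {
        GraphDemo graph = new GraphDemo();
        graph.addEdge(1, 2);
        graph.addEdge(1, 3);
        graph.addEdge(2, 4);
        graph.addEdge(3, 4);
        graph.addEdge(4, 5);
        System.out.println(graph.breadthFirst(1)); // prints [1, 2, 3, 4, 5]
    }
}
```

Swapping the queue for a stack (or using recursion) turns the same skeleton into a depth-first search, which makes a nice follow-up exercise.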

Mathematics

Logarithms

Part 3. Advanced Topics: Computer Science, Network coding, Design Patterns and Multi-Threading

In this part I cover the advanced topics. I used Java because it is the language I am most comfortable with, but try and code in Python if you can.

Advanced Java

Advanced Python

Some topics are not easily explained in Java, so I have created this separate section.

Network coding

A basic understanding of network coding is needed if you are expected to connect applications. Luckily Java is very powerful and makes network coding easy!
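To give an impression of how little code a connection takes, here is a minimal sketch of an echo server and client that talk over a TCP socket on the same machine. It uses only the standard `java.net` and `java.io` classes. The class name `EchoDemo`, the method names and the "echo: " prefix are assumptions made for this example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical illustration class: one-shot echo server plus client.
class EchoDemo {

    // Server side: accept a single client, read one line, send it back.
    static void serveOnce(ServerSocket server) {
        try (Socket client = server.accept()) {
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(client.getInputStream()));
            PrintWriter out = new PrintWriter(client.getOutputStream(), true); // autoflush
            out.println("echo: " + in.readLine());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Client side: start the server in a background thread, send a message,
    // and return the reply.
    static String roundTrip(String message) {
        try (ServerSocket server = new ServerSocket(0)) { // port 0 = any free port
            new Thread(() -> serveOnce(server)).start();
            try (Socket socket = new Socket(InetAddress.getLoopbackAddress(),
                                            server.getLocalPort())) {
                PrintWriter out = new PrintWriter(socket.getOutputStream(), true);
                BufferedReader in = new BufferedReader(
                        new InputStreamReader(socket.getInputStream()));
                out.println(message);
                return in.readLine();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("hello")); // prints "echo: hello"
    }
}
```

Both ends run in one process here to keep the example self-contained. Splitting `serveOnce` and the client code into two programs, and eventually onto two machines, is the natural next step.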

Now you should have the basic skills to create applications. If you want to prepare thoroughly for the interview I suggest you also watch and read the following tutorials. Try to create small programs to improve your understanding. It is not necessary to have two computers, but it is recommended to try.

Basics of Internet, OSI model

Remote Method Invocation

Arguably Remote Method Invocation straddles Network coding and application programming, but if you want a job at a large company you may need to know it.

Web servers and servlets

To generate webpages you could use Apache, but there are also web servers written in Java, such as Jetty. This section continues the theme of network programming, but now through the use of HTTP. We will start off easy with some introductory reading, followed by some in-depth videos, before moving on to some suggested programming assignments.

Garbage collection

Lambda functions

Lambda functions were introduced with Java 8 in 2014. Lambda functions are anonymous functions that can be passed as an argument or even returned as a value. They form a major part of Functional Programming, but can be tricky to understand. Below is a link to the first video in a series of 25 by Koushik in which he explains their relevance and use. The videos are short and to the point. So grab a coffee and start your introduction to Lambda.
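The two properties mentioned above, passing a lambda as an argument and returning one as a value, can be sketched in a few lines. The class `LambdaDemo` and its methods are hypothetical names for this illustration; `Function` and the stream calls are part of the standard library since Java 8.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical illustration class for lambda expressions.
class LambdaDemo {

    // Returns a lambda as a value: a function that adds n to its argument.
    static Function<Integer, Integer> adder(int n) {
        return x -> x + n;
    }

    // Accepts a lambda as an argument and applies it to every element.
    static List<Integer> applyToAll(List<Integer> values, Function<Integer, Integer> f) {
        return values.stream().map(f).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 2, 3);
        System.out.println(applyToAll(numbers, x -> x * x)); // prints [1, 4, 9]
        System.out.println(applyToAll(numbers, adder(10)));  // prints [11, 12, 13]
    }
}
```

Note that `adder` captures the variable `n`; Java only allows this when the captured variable is effectively final.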

Parallel Programming

With parallel programming you can write algorithms that make efficient use of all the processing cores of a computer. This topic makes use of lambda expressions. In Java there are now APIs that give programmers easy access to this concept.
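One such API is the Streams library, where a single call to `parallel()` asks the runtime to split the work across the available cores. The sketch below computes a sum of squares both ways; the class `ParallelDemo` and its method names are made up for this example, and whether the parallel version is actually faster depends on the workload and the machine.

```java
import java.util.stream.LongStream;

// Hypothetical illustration class comparing sequential and parallel streams.
class ParallelDemo {

    // Sequential: one thread walks the whole range 1..n.
    static long sumOfSquaresSequential(long n) {
        return LongStream.rangeClosed(1, n)
                .map(x -> x * x)
                .sum();
    }

    // Parallel: the range is split into chunks that are processed on multiple
    // threads and then combined. The lambda must not rely on shared mutable state.
    static long sumOfSquaresParallel(long n) {
        return LongStream.rangeClosed(1, n)
                .parallel()
                .map(x -> x * x)
                .sum();
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquaresSequential(1000)); // prints 333833500
        System.out.println(sumOfSquaresParallel(1000));   // prints 333833500
    }
}
```

For a range this small the parallel version will likely be slower because of the coordination overhead, so try much larger values of `n` and time both methods to see where the crossover lies on your machine.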

For those eager to do a deep dive, there is a Coursera specialization: Parallel, Concurrent, and Distributed Programming in Java. I just finished the first course and I believe I now have a much firmer grasp of this topic. As always, the lectures and coding examples allow for a much deeper understanding than a YouTube video.

Compilers

More to follow

Operating Systems

More to follow

Graphical User Interfaces

GUIs are simple enough to implement but hard to master. Below are several quick videos on how to start implementing GUIs in Python with the PyQt5 framework.

Programming Languages

Papers

Below is a list of papers (scientific and otherwise). They are not mandatory reading, but they do discuss the latest developments in IT. I find them inspiring to read.

Part 3. Machine (Deep) Learning & Computer Vision

This part covers the most interesting parts, the very reason I continue to study computer science. Note that a lot of the material is new and not well established. That is what makes it so exciting.

Google Machine Learning Crash course

Again the people at Google are the best organized and have come up with a small online course that covers most of the relevant topics.

Deeplearning.ai

Currently I am doing this specialization on Coursera given by renowned AI specialist Andrew Ng. It is a successor to his earlier course offered by Stanford University, which I did in 2014. This new specialization is highly recommended. It will give you skills you can immediately apply, though not the experience.

Books

Deep Learning

Deep Learning is quickly becoming a craze, and I think it is deservedly so. For an excellent but easy introduction read Python Deep Learning. There is a link above. For specific applications I have posted links below.

At this moment there are a number of ways to implement Deep Learning on datasets: Theano, TensorFlow and Keras to name a few. Below is an introductory video.

Keras

The easiest way to quickly learn how to program Deep Learning solutions is through Keras. Developed by Francois Chollet at Google, it is a layer on top of Theano or TensorFlow - the two main frameworks used. Below is a set of easy-to-follow tutorials that culminate in a solution designed to be scalable and which can be accessed through a REST API.

All topics above are about using Deep Learning with Python. That shows you why Python is so important.

Vectorization

With Vectorization we can greatly increase the number of operations per instruction - that is, reduce the number of loop iterations because we do more per loop. With Machine Learning relying on Linear Algebra this has become vital to optimize resources.

Let's get dirty with vectorization and use the Python NumPy library.

NumPy

NumPy is an important library that any Data Scientist should be familiar with. Use it to vectorize computations, apply functions across arrays and otherwise perform operations on data.

Part 4. Penetration Testing & Security

I will fill out this section soon enough.

If you have gone through the list above you may become disheartened at the volume of knowledge. And yet there is plenty more to learn. How well versed are you in using Linux? What about JavaScript? Below is a list of topics I could not fit elsewhere. I think they are optional, but I cannot judge that without knowing what kind of job you want in IT.

More Books

There are always more books to read. However, if you have gotten this far I think books on Programming and Programming Languages become superfluous. You should concentrate on the big picture. Below are several books on hacking that discuss a wide variety of topics. They won't make you a hacker but they will give you insight into how your software might be vulnerable.

More Programming languages

The coding interviews conducted by tech companies usually concentrate on just one language. The interviewee can choose from a small selection: usually C, Java or Python. However, there is a much larger world involving databases and web programming. Knowing other languages becomes vital, just not for the interview. As an option, choose one of the languages below to round out your skills. I chose the Go programming language.

  • Go: Google's multi-paradigm language. Despite being compiled it feels a lot like a scripting language such as Python.
  • R: A language that focuses on statistical analysis, but it has lost most of its importance in Data Science in favor of Python.
  • Swift
  • Ruby
  • Julia: A new general-purpose language aimed at scientific computing.
  • JavaScript

Cloud computing and containers

Currently I am pursuing the Coursera specialization Developing Applications with Google Cloud Platform. Being able to deploy your application and machine learning models quickly is a definite must.

Web Programming

More on REST APIs here

Learning Linux

I already covered a lot of this in my book - Linux, Programming and Hacking for Beginners - but it is something you will need to concentrate on.

The following videos are worth watching if your job will involve Linux.

Databases and SQL

NoSQL

Although relational databases are the bread and butter of data management, there are other variants, such as NoSQL.

Loose ends

Here is a list of skills I am working on to better represent myself. This list focuses on soft skills, neat office skills and some psychology.

  • EdClub
    • It may surprise you that I never learned touch typing. I mostly type with my index and middle fingers. My typing speed is not bad, but people have remarked on it.
  • Hot keys

Goals for 2021

In this final section I have created a wish list of things I want to investigate over the course of the following year (or months). I will focus my attention on developing the four major themes of this page.

  • [Preparation for the Software Engineering Interview]
  • [Basics of Software Engineering: Data Structure, Algorithms and Graphs]
  • [Machine (Deep) Learning & Computer Vision]
  • [Penetration Testing & Security]

For 2021 I will focus more on actually creating implementations and beefing up my coding skills, so to speak. The following projects will be my focus. They can be found on my GitHub page.

  • [A.I/O - Hacking / Coding game implemented with Java]
  • [AutoKeras - Deep Learning application to quickly create, train and deploy models with Keras and TensorFlow]
  • [Face Recognition application - a secretive project I have been working on and not currently available]
  • [Penetration Testing application & guide - my own take on how to perform enumeration]

Monthly Goals

January

  • []
  • [Breadth-first search]
  • [Decision Trees]
  • [Nmap & Enumeration]

February

  • []
  • [Depth-first search]
  • [Deep Learning Algorithm taxonomy]
  • [Viruses & worms]

March

  • []
  • [Dijkstra's Algorithm]
  • [Density-based clustering]
  • []

April

  • []
  • [Geometric Algorithms]
  • [CNN]
  • []

May

  • []
  • [String searching]
  • [Bayesian Networks]
  • []

June

  • []
  • [Multi-Threading]
  • [RNN]
  • []

July

  • []
  • [Bitwise Operations]
  • [Support Vector Machines]
  • []

August

  • []
  • [Object-Oriented Programming]
  • [GANs]
  • []

September

  • []
  • [Design Patterns]
  • [Neural Networks]
  • []

October

  • []
  • [Functional Programming]
  • [Gradient Descent]
  • []

November

  • []
  • [Compilers]
  • [Hierarchical Clustering]
  • []

December

  • []
  • [Operating Systems]
  • [Reinforcement Learning]
  • []