Opinionated advice for recurring topics

I want to...

There are a lot of awesome-* things, but what do you actually use?

... pull data from multiple data sources and play around with them

Use Spark

... process data in a reproducible fashion without losing track of spreadsheet cell dependencies

Use Pandas
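
A minimal sketch of what that can look like, assuming pandas is installed. The order data, column names, and the build_report helper are made up for illustration; real input would more likely come from a file via pd.read_csv. Each derived value is a named step in the script, so the dependency chain that a spreadsheet hides inside cells is readable from top to bottom.

```python
import io

import pandas as pd

# Hypothetical order data, inlined so the sketch is self-contained.
RAW_CSV = """item,unit_price,quantity
widget,2.50,4
gadget,10.00,1
widget,2.50,2
"""


def build_report(csv_text: str) -> pd.DataFrame:
    """Return total revenue per item, largest first."""
    orders = pd.read_csv(io.StringIO(csv_text))
    # Derived column: an explicit step instead of a formula hidden in a cell.
    orders["total"] = orders["unit_price"] * orders["quantity"]
    return (
        orders.groupby("item", as_index=False)["total"]
        .sum()
        .sort_values("total", ascending=False)
        .reset_index(drop=True)
    )


report = build_report(RAW_CSV)
print(report)
```

Rerunning the script on new input repeats exactly the same steps, which is the reproducibility part.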

... orchestrate a set of actions across several machines

Sorry, nothing quite fits, though google/zx and Ansible come close.

... recreate my servers' configuration from the ground up in case of a failure, a change of IaaS provider, or an OS update

Use Ansible. Better yet, roll out Kubernetes

... throw together a script to perform some recurring task using a bunch of Unix tools I know well, in an understandable and maintainable way

Sorry, nothing quite fits, though google/zx and Tcl feel promising.

... create data visualizations fast

Use Jupyter and matplotlib
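
A quick sketch of the kind of cell this usually means, assuming matplotlib is installed. The sine data and the sine.png file name are arbitrary choices for illustration. In a Jupyter notebook the figure typically renders inline under the cell; saving it to a file is optional.

```python
import math

import matplotlib.pyplot as plt

# Sample data: one period of a sine wave (illustrative only).
xs = [i / 10 for i in range(63)]
ys = [math.sin(x) for x in xs]

fig, ax = plt.subplots()
ax.plot(xs, ys, label="sin(x)")
ax.set_xlabel("x")
ax.set_ylabel("sin(x)")
ax.legend()

# Optional: keep a copy on disk (hypothetical file name).
fig.savefig("sine.png")
```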

... keep my applications running despite flaky machines

Use Kubernetes

... understand Artificial Intelligence/Machine Learning/Deep Learning/Data Science/Big Data

In short:

  • DS = ways of working with data in general
  • AI = ways of making machines intelligent
  • DL = the technique of training neural networks, which epitomize the connectionist approach to artificial intelligence; hyped because it pushed that approach forward considerably and spawned a wide spectrum of research into which network architectures suit which specific problems
  • ML = any technique where a machine learns something; part of both AI and DS, and includes DL
  • Big Data = tools and approaches for dealing with large quantities of data (roughly: more than fits on a single machine) on commodity hardware
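
To make "a machine learns something" concrete, here is a toy sketch in plain Python. The data, the hidden rule y = 2x + 1, and the fit_line helper are all invented for illustration. The program is never told the rule; it only adjusts two numbers to shrink its prediction error, which is the core loop that DL scales up to millions of parameters.

```python
# Toy data generated from a hidden rule (y = 2x + 1) that the program never sees.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x + 1.0 for x in xs]


def fit_line(xs, ys, learning_rate=0.05, steps=2000):
    """Fit y = w*x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        # Step both parameters a little way downhill.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


w, b = fit_line(xs, ys)
# The learned values should land very close to the hidden 2 and 1.
print(f"learned w={w:.3f}, b={b:.3f}")
```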

... explore DL

Read

Play around with any one of TensorFlow, Keras, or PyTorch. Hugging Face, maybe?

... explore self-driving

Go through Udacity's Artificial Intelligence for Robotics

... know which requests some pesky app is making

Use mitmproxy
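
A minimal addon sketch for logging requests, assuming the app's traffic is already routed through mitmproxy and its CA certificate is trusted on the device. The RequestLogger class and the log_requests.py file name are illustrative; the request hook and the module-level addons list are how mitmproxy picks up addons from a script.

```python
"""Log the method and URL of every request passing through mitmproxy.

Save as log_requests.py (hypothetical name) and run: mitmdump -s log_requests.py
"""


class RequestLogger:
    def __init__(self):
        self.seen = []  # every logged line, kept for later inspection

    def request(self, flow):
        # mitmproxy calls this hook once for each client request.
        line = f"{flow.request.method} {flow.request.pretty_url}"
        self.seen.append(line)
        print(line)


# mitmproxy loads the addons listed here when the script is passed via -s.
addons = [RequestLogger()]
```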

... stop wasting time figuring out how to run an application

Use Docker

... stop wasting time figuring out how to build an application from the source code

Use make and build inside Docker

... keep some private data

Use VeraCrypt

... write some automation without caring about deployment or about making sure it runs reliably

Use repl.it, maybe? Or Heroku? Or AWS Lambda? Keywords: no-code, low-code, serverless

... send notifications from my machine to my email

Use Postfix with a custom transport built on gmail-oauth2-tools

... shrink disk space used by duplicate virtual environments

Develop inside Docker. Produce images tailored to specific purposes (e.g. one image for data science, one for Django development, one for Go development, etc.)

... keep track of tasks to do

Use org-mode & beorg

... develop a habit

Use Nomie

... manage a PKI

Use Vault