Pinned Repositories
kinesis-analytics-data-generator
Shows how to use the Kinesis Data Generator to send thousands of test records per second to Kinesis Analytics and run real-time SQL processing on the data stream
marketdatadownloading
An ebook describing how to download financial market data from Bloomberg, Reuters, Markit and the web
read-big-file-aws-athena-glue
Continuing my case study on reading a big data file, this is the fifth part of my trilogy :-) on how I got on reading a biggish file with C, Python, spark-python and spark-scala, AWS Elastic MapReduce and AWS Athena.
read-big-file-with-python
The first part of a case study in reading a large (21GB) text file with Python.
read-big-parquet-file-aws-athena
A comparison between reading a big text data file and its Parquet equivalent using AWS Athena
read-file-with-s3-select
How to read a text file on S3 using AWS S3 Select
read-write-kinesis-stream
A small example of reading and writing an AWS Kinesis stream with Python Lambdas
spark-performance-tip-avoid-inferSchema
Increase the performance of reading big text files in Spark SQL by avoiding the inferSchema option
spark-tip-find-malformed-records
Finding and dealing with malformed records when processing CSV data in Spark SQL
using-aws-step
Shows the use of AWS Step Functions and how you can call them from a Python Lambda
taupirho's Repositories
taupirho/read-write-kinesis-stream
A small example of reading and writing an AWS Kinesis stream with Python Lambdas
taupirho/read-big-file-with-python
The first part of a case study in reading a large (21GB) text file with Python.
taupirho/spark-performance-tip-avoid-inferSchema
Increase the performance of reading big text files in Spark SQL by avoiding the inferSchema option
taupirho/spark-tip-find-malformed-records
Finding and dealing with malformed records when processing CSV data in Spark SQL
taupirho/read-file-with-s3-select
How to read a text file on S3 using AWS S3 Select
taupirho/marketdatadownloading
An ebook describing how to download financial market data from Bloomberg, Reuters, Markit and the web
taupirho/read-big-file-with-spark-python
The second part of a case study in reading a big data file. This time we read the same big file we read with Python, but using Spark.
taupirho/using-aws-step
Shows the use of AWS Step Functions and how you can call them from a Python Lambda
taupirho/calculate-a-hedged-index-simple
Calculate a hedged version of a simplified index
taupirho/ml-decisiontree
IPython notebook showing a use case for decision trees
taupirho/read-big-file-with-amazon-emr
Continuing my case study on reading a big data file, this is the fourth part in my increasingly inaccurately named trilogy on how I got on reading a biggish file with C, Python, spark-python and spark-scala.
taupirho/read-big-file-with-spark-scala
The third and last of my series on how I got on reading a big file with C, Python, spark-python and, this time, spark-scala
taupirho/scala-spark-on-pc
Setting up your PC for Scala/Spark development and running your first code
taupirho/kinesis-analytics-data-generator
Shows how to use the Kinesis Data Generator to send thousands of test records per second to Kinesis Analytics and run real-time SQL processing on the data stream
taupirho/read-big-file-aws-athena-glue
Continuing my case study on reading a big data file, this is the fifth part of my trilogy :-) on how I got on reading a biggish file with C, Python, spark-python and spark-scala, AWS Elastic MapReduce and AWS Athena.
taupirho/read-big-parquet-file-aws-athena
A comparison between reading a big text data file and its Parquet equivalent using AWS Athena
taupirho/aws-data-wrangler
Pandas on AWS
taupirho/calculate-a-hedged-index-complex
A Java program to hedge an index in a forward currency
taupirho/cirrusdata.github.io
taupirho/dream-catcher
A Streamlit app that interprets users' dreams
taupirho/factorial
Testing some CI/CD with GitHub Actions
taupirho/getting-fluent-in-vi
A few useful commands to help you become fluent in the Unix text editor vi
taupirho/mesop
Build delightful web apps quickly in Python
taupirho/ml-super-mult
A bunch of supervised learning techniques applied to a credit card default data set
taupirho/read-twitter-java
Read a Twitter feed using Java
taupirho/setting-up-aws-billing-alerts
How to set up AWS billing alerts so you don't get faced with an unexpected bill
taupirho/spark-tip-using-ranking-analytics
Using ranking analytic functions in PySpark SQL
taupirho/stock_prices
taupirho/text-to-parquet-with-aws-glue
A zero-coding approach to converting text files to Parquet format using AWS Glue
taupirho/text-to-parquet-with-spark
Shows how to convert text files to columnar Parquet format using Spark and Scala