Pinned Repositories
druid
Apache Druid: a high performance real-time analytics database.
DySparkExtensionOS
Make Spark bring back the partitioner when reading from HDFS/S3
elasticsearch-spark-offline
Elasticsearch snapshot preparation with Spark
every-programmer-should-know
A collection of (mostly) technical things every software developer should know
hadoop-hbase-spark-playground
Hadoop, HBase, Spark playground. Contains a simple Vagrant (usable on Windows) + Fabric file that sets up a single-machine cluster (needs the vagrant-fabric plugin)
spark-gotchas
A few things we ran into during our Spark-based ETL project
sublime-avro
spark-tfrecord
Read and write Tensorflow TFRecord data from Apache Spark.
IgorBerman's Repositories
IgorBerman/spark-gotchas
A few things we ran into during our Spark-based ETL project
IgorBerman/every-programmer-should-know
A collection of (mostly) technical things every software developer should know
IgorBerman/sublime-avro
IgorBerman/druid-complexaggs-extension
Druid extension to count occurrences of some metric as a complex object
IgorBerman/dyn-allocation
Technical report on setting up dynamic allocation in Apache Spark for production jobs
IgorBerman/spark-bucketing
Technique to optimise or remove shuffles in Spark
IgorBerman/bigquery-object-mapper
The BigQueryObjectMapper helps to map a POJO to a BQ Row and generate a BQ schema based on the POJO using reflection
IgorBerman/burry.sh
Cloud Native Infrastructure BackUp & RecoveRY
IgorBerman/cassandra_exporter
Apache Cassandra® metrics exporter for Prometheus
IgorBerman/directavro
Direct output committer for the Avro key-value format, for Spark's newHadoopApi
IgorBerman/druid
Apache Druid: a high performance real-time analytics database.
IgorBerman/flink
Mirror of Apache Flink
IgorBerman/google-research
Google Research
IgorBerman/IgorBerman.github.io
IgorBerman/jupyterhub
Multi-user server for Jupyter notebooks
IgorBerman/ksqldbdemo
ksqlDB playground
IgorBerman/lmdb-go
Bindings for the LMDB C library
IgorBerman/mssh-zsh
Open multiple ssh with tmux in zsh or iTerm
IgorBerman/package_control_channel
Default channel file for Package Control. Follow the directions at:
IgorBerman/parquet-mr
Apache Parquet
IgorBerman/RSA-Regression
Regularization Self-Attention Regression
IgorBerman/scipy_con_2019
Tutorial Sessions for SciPy Con 2019
IgorBerman/spark
Mirror of Apache Spark
IgorBerman/spark-tfrecord
Read and write Tensorflow TFRecord data from Apache Spark.
IgorBerman/sparkwatchdogs
Simple implementation of Spark watchdogs for ETL processing
IgorBerman/statsd
Daemon for easy but powerful stats aggregation
IgorBerman/system-design-interview
System design interview for IT companies
IgorBerman/tensorflow-1-public
IgorBerman/TimeSeriesAnalysisWithPython
IgorBerman/Trigger-POC