ThachNgocTran
I'm a software engineer who majored in Communication and Computer Security, with experience in Big Data technologies and a background in Data Science.
iATROS GmbH, Munich, Germany
Pinned Repositories
ApacheSparkRDDPersist
Benchmark the RDD with and without using persist().
AutomaticWebsiteBooking
A tool that helps automate availability checks and bookings. It is website-specific.
CodingPractice
Practice solving small coding puzzles in a variety of programming languages.
ConnectionHandler
Connection handler for clients connecting to a variety of servers (Amazon S3, Amazon DynamoDB, MongoDB, Rserve, SQL Server, Elasticsearch).
ExtractExcelDataInParallel
A small script to extract Excel data in parallel into DataFrames, using xlwings and Python's multiprocessing.
GetSystemInfo
A quick way to get common system information.
HyperLogLogInApacheHive
Integrates HyperLogLog into Apache Hive in the form of a User-Defined Aggregate Function (UDAF). Written in Java.
KatzBackOffModelImplementationInR
Katz's Backoff Model implementation in R
PredictLiverPatientWithRandomForestAndLogisticRegression
Predicts whether a person is a liver patient, using random forest and logistic regression. Written in Python (with scikit-learn, Pandas, Seaborn).
YelpDataset2Neo4j
Imports the Yelp Dataset into a Neo4j graph database.