## About Tom Tang
### Using Python:
- Implemented a pixel server that tracks 40+ million pageviews/day across all Newscorp digital properties (news.com.au, dailytelegraph, taste.com.au, etc.).
- Implemented the Audience API, which enables Newscorp to target customers on their second pageview.
- The pixel server and Audience API are a core part of our realtime user-data / event pipeline, which is the foundation of the targeted-advertising business and user customization.
- Implemented S3 cleanup scripts using multiple processes. Early versions of awscli hung when an S3 bucket contained too many files.
- Maintain and enhance the user ID graph modeling tool, which groups all related ID mappings.
- Implemented a DynamoDB ingestion tool that ships segment attributes at 900 TPS on an EC2 t2.small instance.
- Implemented a DynamoDB table clone tool that tops out at 3,500 TPS on a single EC2 m3.medium instance (tested on a table with 50 million records), whereas the official AWS Data Pipeline (beta version) failed to clone the tables on an EMR cluster.
- Generated a couple of daily reports that track the accuracy of third-party data partners.
- Adlog reporting, including identifying DFP frequency-cap issues and internal identifier issues.
- Probably the first to use AWS Kinesis for a project: he found a blocking bug in boto (the official AWS Python SDK), fixed it, and submitted a pull request to boto on GitHub, which was accepted.
- These projects all run on AWS EC2 and make heavy use of AWS services (SQS / Kinesis / DynamoDB / S3 / EC2-ELB / EC2-Autoscalegroup / SNS / RDS), and some of them run in Docker containers on top of EC2.
- Most of what is mentioned above was implemented by himself.

Domain | Projects | Comments |
---|---|---|
devops | agile_conf | CFN generation using Jinja2 templates |
devops | provisioner | Continuous server provisioning using S3 / gsutil |
tools | dynoclone | DynamoDB clone tool |
tools | senv | Secure your environment variables using Mac keychain |

Here are my other GitHub projects and my slides.
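
To illustrate the ID grouping mentioned in the list above, here is a minimal sketch of how related identifiers could be collected into groups. It is an assumption for illustration only: the source does not describe how the actual graph modeling tool works, so the union-find approach, the `IdGraph` class, and the example ID strings are all hypothetical.

```python
from collections import defaultdict


class IdGraph:
    """Union-find over identifiers; linked IDs end up in one group.

    Hypothetical illustration, not the actual tool.
    """

    def __init__(self):
        self._parent = {}

    def _find(self, node):
        # Register unseen IDs as their own root.
        self._parent.setdefault(node, node)
        root = node
        while self._parent[root] != root:
            root = self._parent[root]
        # Path compression: point every visited node straight at the root.
        while self._parent[node] != root:
            self._parent[node], node = root, self._parent[node]
        return root

    def link(self, id_a, id_b):
        """Record that two IDs belong to the same user."""
        root_a, root_b = self._find(id_a), self._find(id_b)
        if root_a != root_b:
            self._parent[root_b] = root_a

    def groups(self):
        """Return the groups of related IDs as a list of sets."""
        grouped = defaultdict(set)
        for node in list(self._parent):
            grouped[self._find(node)].add(node)
        return list(grouped.values())


# Example ID values are made up for the sketch.
graph = IdGraph()
graph.link("cookie:abc", "email_hash:123")
graph.link("email_hash:123", "device:xyz")
graph.link("cookie:def", "device:qrs")
```

Calling `graph.groups()` here yields two sets: one holding the three IDs chained through `email_hash:123`, and one holding the remaining pair. Union-find is used because each new mapping can be merged in nearly constant time, without rebuilding the whole graph.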