##About Tom Tang
###Using Python:
Implemented a pixel server that tracks 40+ million pageviews/day across all Newscorp digital properties (news.com.au, dailytelegraph, taste.com.au, etc.).
Implemented the Audience API, which enables Newscorp to target customers on their second pageview.
The pixel server and Audience API are a core part of our real-time user-data / event pipeline, which is the foundation of the targeted advertising business and user customization.
Implemented S3 cleanup scripts using multiple processes, because early versions of awscli hung when an S3 bucket held too many files.
Maintained and enhanced the user ID graph modeling tool, which groups all related ID mappings.
Implemented a DynamoDB ingestion tool that ships segment attributes at 900 TPS on an EC2 t2.small instance.
Implemented a DynamoDB table clone tool that tops out at 3500 TPS on a single EC2 m3.medium instance (tested on a table with 50 million records), whereas the official AWS Data Pipeline (beta version) failed to clone the tables on an EMR cluster.
Generated a couple of daily reports that track the accuracy of 3rd-party data partners.
Adlog reporting, including identifying DFP freq-cap (frequency capping) issues and internal identifier issues.
Probably the first to use AWS Kinesis for a project: he found a blocking bug in boto (the official AWS Python SDK), fixed it, and submitted a pull request to boto on GitHub, which was accepted.
These all run on AWS EC2 and make heavy use of AWS services (SQS / Kinesis / DynamoDB / S3 / EC2-ELB / EC2-Autoscalegroup / SNS / RDS); some of the projects run in Docker containers on top of EC2.
Most of what is mentioned above was implemented by him alone.
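
To make the "group all related ID mappings" item above more concrete, here is a minimal sketch of one common way to do that kind of grouping: a union-find over pairs of linked identifiers. It is only an illustration, not the actual tool's code. The function name `group_ids` and the sample identifiers (`cookie:1`, `email:a`, ...) are hypothetical.

```python
from collections import defaultdict


def group_ids(pairs):
    """Group identifiers into connected sets, given (id_a, id_b) mapping pairs.

    Hypothetical helper for illustration only; not the real tool's API.
    """
    parent = {}

    def find(x):
        # Register unseen IDs as their own root.
        parent.setdefault(x, x)
        # Walk up to the root of x's group.
        root = x
        while parent[root] != root:
            root = parent[root]
        # Path compression: point every node on the path at the root.
        while parent[x] != root:
            parent[x], x = root, parent[x]
        return root

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two groups

    groups = defaultdict(set)
    for node in list(parent):
        groups[find(node)].add(node)
    # Sorted output keeps the result deterministic.
    return sorted(sorted(group) for group in groups.values())


if __name__ == "__main__":
    sample = [
        ("cookie:1", "email:a"),
        ("email:a", "device:x"),
        ("cookie:2", "device:y"),
    ]
    for group in group_ids(sample):
        print(group)
```

Each pair says "these two IDs belong to the same user", and IDs linked through any chain of pairs end up in the same group.
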

| Domain | Projects | Comments |
| --- | --- | --- |
| devops | agile_conf | CFN generation using Jinja2 templates |
| devops | provisioner | Continuous server provisioning using S3 / gsutil |
| tools | dynoclone | DynamoDB clone tool |
| tools | senv | Secure your environment variables using the Mac keychain |

Here are my other GitHub projects and my slides.