Uthaipon (Tao) Tantipongpipat Personal Site

Research and Work Experience

I am a Lead Data Scientist at Agoda, working on customer-facing LLM products and on ranking and recommendations. My work spans analyzing raw data, building data pipelines, developing machine learning models, and deploying models to production with engineers. I also strategize and communicate with managers and product owners to plan and drive the team's quarterly milestones.

My research topics at Twitter include Fairness in Machine Learning and Human-Centered Computing (HCI). The work involves identifying, evaluating, and mitigating ML model bias; developing new rigorous statistical analyses, including for uncommon situations; providing guidelines for engineers across the company to utilize fairness metrics; and communicating internally, cross-functionally, and externally through company blogs and academic publications.

I graduated from the Algorithms, Combinatorics, and Optimization (ACO) PhD program at Georgia Institute of Technology, based in the School of Computer Science. My research interests include machine learning algorithms, combinatorial optimization, differential privacy, and fairness in machine learning. I am grateful to be advised by Mohit Singh. Our current research focuses on finding better (randomized and deterministic) polynomial-time approximation algorithms for optimal design problems in statistics. The first algorithmic work is joint with Aleksandar (Sasho) Nikolov and Vivek Madan. My colleagues Vivek Madan, Mohit Singh, Weijun Xie, and I were also the first to prove theoretical guarantees for commonly used heuristics, namely local search (Fedorov exchange) and the greedy algorithm, in this work.

In addition, I work on differential privacy. The first project is on growing databases with Rachel Cummings and Sara Krehbiel. Part of the work was presented at TPDP2017. Rachel and I were also part of the team that won first prize and the people's choice award ($20000 total) in NIST's privacy challenge. Our proposed solution generates synthetic data with differential privacy via GANs and was presented at TPDP2018. More recently, I have been exploring the practicality of employing differential privacy in training large deep learning models (during a summer 2019 internship, hosted by Janardhan Kulkarni and Sergey Yekhanin).

My other direction of work is fairness in machine learning. In particular, my coauthors and I defined a notion of fairness in PCA (the blog on this notion) and proposed algorithms for two groups. We later extended the result to handle fairness over multiple groups and more general objectives. The algorithms have proven theoretical guarantees and scale to large datasets. The implementation is publicly available on GitHub.

Background

As an undergraduate, I studied mathematics at University of Richmond. My undergraduate research was in discrete geometry and algebraic combinatorics, mainly on bent functions (coding theory) and partial difference sets.

My undergraduate thesis, under the supervision of James Davis, was in the areas of algebraic combinatorics, coding theory, and discrete geometry, specifically on Cameron-Liebler line classes and partial difference sets, and includes a new non-existence result for partial difference sets in a certain class of abelian groups.

I am originally from Bangkok, Thailand. I graduated high school from Bangkok Christian College. During middle and high school, I was involved in and very much enjoyed national and international mathematics competitions and the serious training that came with them.

Publications

For authors with *, the author order is alphabetical or authors have equal contributions.

County-level Algorithmic Audit of Racial Bias in Twitter's Home Timeline
Luca Belli, Kyra Yee, Uthaipon Tantipongpipat, Aaron Gonzales, Kristian Lum, Moritz Hardt.
Preprint, 2023.
Disparate Outcomes of Content Recommendation Algorithms with Distributional Inequality Metrics
Tomo Lazovich, Luca Belli, Aaron Gonzales, Amanda Bower, Uthaipon Tantipongpipat, Kristian Lum, Ferenc Huszar, Rumman Chowdhury.
Patterns Journal, 2022.
Proportional Volume Sampling and Approximation Algorithms for A-Optimal Design
* Aleksandar Nikolov, Uthaipon Tantipongpipat, and Mohit Singh.
Mathematics of Operations Research 2022; ACM-SIAM Symposium on Discrete Algorithms (SODA) 2019.
Image Cropping on Twitter: Fairness Metrics, their Limitations, and the Importance of Representation, Design, and Agency
* Kyra Yee, Uthaipon Tantipongpipat, and Shubhanshu Mishra.
Computer-Supported Cooperative Work and Social Computing (CSCW), 2021.
Twitter Blog | Code | CSCW Presentation | WIRED | Reuters | Protocol | Platformer | ZDNET | Houston 2 (TV) | CNN
Fast and Memory Efficient Differentially Private-SGD via JL Projections
* Zhiqi Bu, Sivakanth Gopi, Janardhan Kulkarni, Yin Tat Lee, Judy Hanwen Shen, and Uthaipon Tantipongpipat.
Conference on Neural Information Processing Systems (NeurIPS), 2021.
λ-Regularized A-Optimal Design and its Approximation by λ-Regularized Proportional Volume Sampling
Uthaipon Tantipongpipat.
Preprint, 2020.
Differentially Private Mixed-Type Data Generation For Unsupervised Learning
Uthaipon Tantipongpipat, Chris Waites, Digvijay Boob, Amaresh Ankit Siva, and Rachel Cummings.
International Conference on Information, Intelligence, Systems and Applications (IISA) 2021.
Code on GitHub | 1st Place NIST privacy challenge
Maximizing Determinants under Matroid Constraints
* Vivek Madan, Aleksandar Nikolov, Mohit Singh, and Uthaipon Tantipongpipat.
Symposium on Foundations of Computer Science (FOCS) 2020.
Multi-Criteria Dimensionality Reduction with Applications to Fairness
Uthaipon Tantipongpipat, Samira Samadi, Jamie Morgenstern, Mohit Singh, and Santosh Vempala.
Conference on Neural Information Processing Systems (NeurIPS) 2019, spotlight (top 2.5% of submitted papers)
Code on GitHub | GT Press Release
Combinatorial Algorithms for Optimal Design
* Vivek Madan, Mohit Singh, Uthaipon Tantipongpipat, and Weijun Xie.
Conference on Learning Theory (COLT) 2019.
The Price of Fair PCA: One Extra Dimension
Samira Samadi, Uthaipon Tantipongpipat, Jamie Morgenstern, Mohit Singh, and Santosh Vempala.
Conference on Neural Information Processing Systems (NeurIPS) 2018.
Website on fair PCA | Code on GitHub | Poster | GT Press Release
Differential Privacy for Growing Databases
* Rachel Cummings, Sara Krehbiel, Kevin Lai, and Uthaipon Tantipongpipat.
Conference on Neural Information Processing Systems (NeurIPS) 2018.
A Combinatorial Approach to Ebert's Hat Game with Many Colors
Uthaipon Tantipongpipat.
The Electronic Journal of Combinatorics 21.4 (2014): P4-33.

Experiences

Microsoft Research intern, Redmond, WA. Summer 2019.
Supervisors: Janardhan Kulkarni and Sergey Yekhanin
Research topic: implementation of differential privacy in large-scale deep machine learning models (NLP) and privacy analysis of correlation clustering.

Theses

Fair and Diverse Data Representation in Machine Learning
PhD thesis, May 2020. Link to the university thesis page.
Cameron-Liebler Line Classes and Partial Difference Sets
Undergraduate thesis, May 2016. Link to the university thesis page.