2025
Hierarchical Retrieval: The Geometry and a Pretrain-Finetune Recipe
With Chong You, Ananda Theertha Suresh, Robin Nittka, Felix Yu, and Sanjiv Kumar.
NeurIPS 2025
Approximating High-Dimensional Earth Mover's Distance as Fast as Closest Pair
With Lorenzo Beretta, Vincent Cohen-Addad, and Erik Waingarten.
FOCS 2025
Metric Embeddings Beyond Bi-Lipschitz Distortion via Sherali-Adams
With Ainesh Bakshi, Vincent Cohen-Addad, Samuel B. Hopkins, and Silvio Lattanzi.
COLT 2025
Randomized Dimensionality Reduction for Euclidean Maximization and Diversity Measures
With Jie Gao, Benedikt Kolbe, Shay Sapir, Chris Schwiegelshohn, Sandeep Silwal, and Erik Waingarten.
ICML 2025
Unleashing Graph Partitioning for Large-Scale Nearest Neighbor Search
With Laxman Dhulipala, Lars Gottesbüren, and Jakub Łącki.
VLDB 2025
Near-Optimal Spectral Density Estimation via Explicit and Implicit Deflation
With Rajarshi Bhattacharjee, Cameron Musco, Christopher Musco, and Archan Ray.
SODA 2025
Massively Parallel Minimum Spanning Tree in General Metric Spaces
With Amir Azarmehr, Soheil Behnezhad, Jakub Łącki, Vahab Mirrokni, and Peilin Zhong.
SODA 2025
2024
MUVERA: Multi-Vector Retrieval via Fixed Dimensional Encodings
With Laxman Dhulipala, Majid Hadian, Jason Lee, and Vahab Mirrokni.
NeurIPS 2024
Efficient Centroid-Linkage Clustering
With MohammadHossein Bateni, Laxman Dhulipala, Willem Fletcher, Kishen N Gowda, D Ellis Hershkowitz, and Jakub Łącki.
NeurIPS 2024
Metric Clustering and MST with Strong and Weak Distance Oracles
With MohammadHossein Bateni, Prathamesh Dharangutte, and Chen Wang.
COLT 2024
Parallel and Sequential Hardness of Hierarchical Graph Clustering
With MohammadHossein Bateni, Laxman Dhulipala, Kishen Gowda, D Ellis Hershkowitz, and Jakub Łącki.
ICALP 2024
Dynamic PageRank: Algorithms and Lower Bounds
With Jakub Łącki, Slobodan Mitrović, Krzysztof Onak, and Piotr Sankowski.
ICALP 2024
Data-Dependent LSH for the Earth Mover's Distance
With Erik Waingarten and Tian Zhang.
STOC 2024
HyperAttention: Long-Context Attention in Near-Linear Time
With Insu Han, Amin Karbasi, Vahab Mirrokni, David Woodruff, and Amir Zandieh.
ICLR 2024
Massively Parallel Algorithms for High-Dimensional Euclidean Minimum Spanning Tree
With Vahab Mirrokni, Shyam Narayanan, and Peilin Zhong.
SODA 2024
Fully Dynamic Consistent k-Center Clustering
With Christoph Grunau, Bernhard Haeupler, Jakub Łącki, and Václav Rozhoň.
SODA 2024
Streaming Algorithms with Few State Changes
With David Woodruff and Samson Zhou.
PODS 2024
2023
A Near-Linear Time Algorithm for the Chamfer Distance
With Ainesh Bakshi, Piotr Indyk, Sandeep Silwal, and Erik Waingarten.
NeurIPS 2023
Streaming Euclidean MST to a Constant Factor
With Vincent Cohen-Addad, Xi Chen, Amit Levi, and Erik Waingarten.
STOC 2023
Optimal Fully Dynamic k-Centers Clustering
With MohammadHossein Bateni, Hossein Esfandiari, and Vahab Mirrokni.
SODA 2023
Merged with Hendrik Fichtenberger, Monika Henzinger, and Andreas Wiese.
Differentially Oblivious Relational Database Operators
With Lianke Qin, Elaine Shi, Zhao Song, Danyang Zhuo, and Shumo Chu.
VLDB 2023
2022
Stars: Tera-Scale Graph Building for Clustering and Learning
With CJ Carey, Jonathan Halcrow, Vahab Mirrokni, Warren Schudy, and Peilin Zhong.
NeurIPS 2022
New Streaming Algorithms for High Dimensional EMD and MST
With Xi Chen, Amit Levi, and Erik Waingarten.
STOC 2022
Truly Perfect Samplers for Data Streams and Sliding Windows
With David Woodruff and Samson Zhou.
PODS 2022
2021
An Optimal Algorithm for Triangle Counting in a Stream
With John Kallaugher.
APPROX 2021
Learning and Testing Junta Distributions with Subcube Conditioning
With Xi Chen, Amit Levi, and Erik Waingarten.
COLT 2021
In-Database Regression in Input Sparsity Time
With Alireza Samadian, David Woodruff, and Peng Ye.
ICML 2021
When is Approximate Counting for Conjunctive Queries Tractable?
With Marcelo Arenas, Luis Alberto Croquevielle, and Cristian Riveros.
STOC 2021
2020
Testing Positive Semi-Definiteness via Random Submatrices
With Ainesh Bakshi and Nadiia Chepurko.
FOCS 2020
A Framework for Adversarially Robust Streaming Algorithms
With Omri Ben-Eliezer, David Woodruff, and Eylon Yogev.
PODS 2020 & Journal of the ACM
PODS Best Paper Award 2020
Invited to Journal of the ACM
2021 SIGMOD Research Highlight
Invited to HALG 2021
Span Recovery for Deep Neural Networks with Applications to Input Obfuscation
With Qiuyi Zhang and David Woodruff.
ICLR 2020
2019
Optimal Sketching for Kronecker Product Regression and Low Rank Approximation
With Huaian Diao, Zhao Song, Wen Sun, and David Woodruff.
NeurIPS 2019
Towards Optimal Moment Estimation in Streaming and Distributed Models
With David Woodruff.
APPROX 2019
Learning Two Layer Rectified Neural Networks in Polynomial Time
With Ainesh Bakshi and David Woodruff.
COLT 2019
Efficient Logspace Classes for Enumeration, Counting, and Uniform Generation
With Marcelo Arenas, Luis Alberto Croquevielle, and Cristian Riveros.
PODS 2019 & Journal of the ACM
PODS Best Paper Award 2019
Invited to Journal of the ACM
2021 SIGMOD Research Highlight
Weighted Reservoir Sampling from Distributed Streams
With Gokarna Sharma, Srikanta Tirthapura, and David P. Woodruff.
PODS 2019