Rajesh Jayaram | Google Research NYC

I am a Research Scientist at Google NYC in the Algorithms and Optimization Group. I received my PhD in theoretical computer science at Carnegie Mellon in the summer of 2021, where I was fortunate to be advised by David Woodruff. Prior to that, I received my bachelor's from Brown University in May of 2017.

Research: I am interested primarily in sublinear algorithms and high-dimensional geometry, specifically sketching, streaming, and distributed algorithms for large scale computational problems. More broadly, I am interested in dimensionality reduction methods: namely, to what extent can we compress the significant components of an enormous, noisy data-set? My work also spans the areas of property testing, machine learning, and optimization.

My CV, Google Scholar, DBLP.

Teaching:

I taught as an Adjunct Professor at NYU's Tandon School of Engineering.

Spring 2022: NYU CS-GY 6763 - Algorithmic Machine Learning and Data Science

Workshops:

I co-organized the first Workshop on Algorithms for Large Data (Online) (WALDO 2021), which took pace on Monday, August 23 through Wednesday, August 25.

I co-organized the Workshop on Robust Streaming, Sketching, and Sampling in STOC 2021. A full recording of the workshop can be found here.

Dissertation:

Sketching and Sampling Algorithms for High-Dimensional Data. PhD Thesis, May 2021
Commitee: David Woodruff (advisor), Anupam Gupta (CMU), Andrej Risteski (CMU), Alexandr Andoni (Columbia), Jelani Nelson (Berkeley)

Preprints:

A Quasi-Polynomial Time Algorithm for Multi-Dimensional Scaling via LP Hierarchies
With Ainesh Bakshi, Vincent Cohen-Addad, Sameul B. Hopkins, and Silvio Lattanzi.
Preprint [arXiv]
Unleashing Graph Partitioning for Large-Scale Nearest Neighbor Search
With Laxman Dhulipala, Lars Gottesbüren, and Jakub Lacki.
Preprint [arXiv]
Metric Clustering and MST with Strong and Weak Distance Oracles
With MohammadHossein Bateni, Prathamesh Dharangutte, and Chen Wang.
Preprint [arXiv]

Publications:

Parallel and Sequential Hardness of Hierarchical Graph Clustering
With Mohammad Hossein Bateni, Laxman Dhulipala, Kishen Gowda, D Ellis Hershkowitz, and Jakub Lacki.
ICALP 2024
Dynamic PageRank: Algorithms and Lower Bounds
With Jakub Łącki, Slobodan Mitrović, Krzysztof Onak, and Piotr Sankowski.
ICALP 2024
Data-Dependent LSH for the Earth Mover’s Distance
With Erik Waingarten and Tian Zhang.
STOC 2024 [arXiv]
HyperAttention: Long-Context Attention in Near-Linear Time
With Insu Han, Amin Karbasi, Vahab Mirrokni, David Woodruff, and Amir Zandieh.
ICLR 2024 [arXiv]
Massively Parallel Algorithms for High-Dimensional Euclidean Minimum Spanning Tree
With Vahab Mirrokni, Shyam Narayanan, and Peilin Zhong.
SODA 2024 [arXiv]
Fully Dynamic Consistent k-Center Clustering
With Christoph Grunau, Bernhard Haeupler, Jakub Łącki, and Václav Rozhoň.
SODA 2024 [arXiv]
Streaming Algorithms with Few State Changes
With David Woodruff and Samson Zhou.
PODS 2024
A Near-Linear Time Algorithm for the Chamfer Distance
With Ainesh Bakshi, Piotr Indyk, Sandeep Silwal, and Erik Waingarten.
NeurIPS 2023 [arXiv]
Streaming Euclidean MST to a Constant Factor
With Vincent Cohen-Addad, Xi Chen, Amit Levi, and Erik Waingarten.
STOC 2023 [arXiv]
Optimal Fully Dynamic k-Centers Clustering
With MohammadHossein Bateni, Hossein Esfandiari, and Vahab Mirrokni.
SODA 2023 [arXiv]
[Merged Paper] with Hendrik Fichtenberger, Monika Henzinger, and Andreas Wiese
Differentially Oblivious Relational Database Operators
With Lianke Qin, Elaine Shi, Zhao Song, Danyang Zhuo, Shumo Chu.
VLDB 2023 [arXiv]
Stars: Tera-Scale Graph Building for Clustering and Learning
With CJ Carey, Jonathan Halcrow, Vahab Mirrokni, Warren Schudy, and Peilin Zhong.
NeurIPS 2022 [arXiv]
New Streaming Algorithms for High Dimensional EMD and MST
With Xi Chen, Amit Levi, and Erik Waingarten.
STOC 2022 [arXiv], [Talk @ CMU Theory Lunch]
Truly Perfect Samplers for Data Streams and Sliding Windows
With David Woodruff and Samson Zhou.
PODS 2022 [arXiv]
An Optimal Algorithm for Triangle Counting in a Stream
With John Kallaugher.
APPROX 2021 [arXiv]
Learning and Testing Junta Distributions with Subcube Conditioning
With Xi Chen, Amit Levi, and Erik Waingarten.
COLT 2021 [arXiv]
In-Database Regression in Input Sparsity Time
With Alireza Samadian, David Woodruff, and Peng Ye.
ICML 2021 [arXiv]
When is Approximate Counting for Conjunctive Queries Tractable?
With Marcelo Arenas, Luis Alberto Croquevielle, and Cristian Riveros.
STOC 2021 [arXiv]
Testing Positive Semi-Definiteness via Random Submatrices
With Ainesh Bakshi and Nadiia Chepurko.
FOCS 2020 [arXiv], [Extended Abstract], [Talk @ WOLA'20], [Longer Talk]
A Framework for Adversarially Robust Streaming Algorithms
With Omri Ben-Eliezer, David Woodruff, and Eylon Yogev.
PODS 2020 and Journal of the ACM [arXiv]
PODS Best Paper Award, 2020
Invited to the Journal of the ACM
2021 ACM SIGMOD Research Highlight Award
Invited to HALG 2021
Span Recovery for Deep Neural Networks with Applications to Input Obfuscation
With Qiuyi Zhang and David Woodruff.
ICLR 2020 [arXiv], [Short Talk]
Optimal Sketching for Kronecker Product Regression and Low Rank Approximation
With Huaian Diao, Zhao Song, Wen Sun, and David Woodruff.
NeurIPS 2019 [arXiv] [Poster]
Towards Optimal Moment Estimation in Streaming and Distributed Models
With David Woodruff.
APPROX 2019 [arXiv]
Learning Two Layer Rectified Neural Networks in Polynomial Time
With Ainesh Bakshi and David Woodruff.
COLT 2019 [arXiv], [Talk @ COLT]
Efficient Logspace Classes for Enumeration, Counting, and Uniform Generation
With Marcelo Arenas, Luis Alberto Croquevielle, and Cristian Riveros.
PODS 2019 and Journal of the ACM [arXiv], [Talk @ CMU Theory Lunch], [SIGMOD Technical Perspective]
PODS Best Paper Award, 2019
Invited to the Journal of the ACM
2021 ACM SIGMOD Research Highlight Award
Weighted Reservoir Sampling from Distributed Streams
With Gokarna Sharma, Srikanta Tirthapura, and David P. Woodruff.
PODS 2019 [arXiv]
Perfect L_p Sampling in a Data Stream
With David Woodruff.
FOCS 2018 and SIAM Journal on Computing [arXiv]
Data Streams with Bounded Deletions
With David Woodruff.
PODS 2018 [arXiv]
Approximating Language Edit Distance Beyond Fast Matrix Multiplication: Ultralinear Grammars Are Where Parsing Becomes Hard!
With Barna Saha.
ICALP 2017 [Full Version], [Conference Version]

Miscellaneous:

Learning Stochastically Evolving Networks via Local Probing
Rajesh Jayaram, advised by Eli Upfal (Undergrad Thesis)
[pdf] [Defense Slides]

Prior Teaching

In the fall of 2019, I TA'd CS15-859 – Algorithms for Big Data at CMU, taught by David Woodruff.

In the spring of 2019, I TA'd CS15-451/651 – Algorithms at CMU, taught by David Woodruff and Anupam Gupta.

Teaching at Brown

In the fall of 2016, I was the Head TA for CS157 – Design and Analysis of Algorithms, taught by Paul Valiant.

In the spring of 2016, I was a TA for CS22 – Discrete Structures and Probability .