I am a research scientist at Google Research. I obtained my Ph.D. in Computer Science from the University of Southern California, where I was fortunate to be advised by Aram Galstyan and Greg Ver Steeg. Prior to that, I received my M.S. and B.S. degrees in Applied Mathematics and Computer Science from Yerevan State University.
I do both applied and theoretical research on various aspects of deep learning, often from an information-theoretic perspective. My main research direction is studying the information stored in neural network weights or activations, and its connections to generalization, memorization, stability, and learning dynamics. More broadly, I am interested in learning theory, generalization under domain shift, unsupervised/self-supervised representation learning, and the generalization phenomenon in deep neural networks.
- [July 17, 2023] Excited to share that I have joined Google Research NYC as a research scientist.
- [June 16, 2023] I have graduated from USC with a PhD in Computer Science!
- [Jan 21, 2023] Our work “Supervision Complexity and its Role in Knowledge Distillation” was accepted to ICLR 2023.
- [Jan 11, 2023] I have been invited to the Rising Stars in AI Symposium 2023 at KAUST in Saudi Arabia (Feb. 19-21).
- [Aug 3, 2022] Our work “Formal limitations of sample-wise information-theoretic generalization bounds” was accepted to the 2022 IEEE Information Theory Workshop (ITW).
- [May 16, 2022] Started a summer internship at Google Research, New York. Will be working with Ankit Singh Rawat and Aditya Menon.
- [March 2, 2022] Our work “Failure Modes of Domain Generalization Algorithms” was accepted to CVPR 2022.
- [Sept. 28, 2021] Our work “Information-theoretic generalization bounds for black-box learning algorithms” was accepted to NeurIPS 2021.
- [May 17, 2021] Started a summer internship at AWS Custom Labels team. Will be working with Alessandro Achille and Avinash Ravichandran.
- [Jan. 12, 2021] Our work "Estimating informativeness of samples with Smooth Unique Information" was accepted to ICLR 2021.
- [Oct. 20, 2020] Received a free NeurIPS 2020 registration for being among the top 10% of high-scoring reviewers.
- [June 3, 2020] Our work "Improving generalization by controlling label-noise information in neural network weights" was accepted to ICML 2020.
- [May 18, 2020] Starting a summer internship at AWS Custom Labels team. Going to work with Alessandro Achille, Avinash Ravichandran, and Orchid Majumder!
- [Jan. 3, 2020] I will be TA-ing CSCI 270: "Introduction to Algorithms and Theory of Computing," taught by Prof. Shahriar Shamsian, this spring.
- [Oct. 1, 2019] Our work "Reducing overfitting by minimizing label information in weights" was accepted to the NeurIPS 2019 Information Theory and Machine Learning workshop.
- [Sept. 3, 2019] Our paper "Fast structure learning with modular regularization" was accepted to NeurIPS 2019 as a spotlight presentation.
- [Aug. 15, 2019] I will be the teaching assistant for CSCI 670: "Advanced Analysis of Algorithms," taught by Prof. Shang-Hua Teng, this fall.
Publications and preprints
Supervision Complexity and its Role in Knowledge Distillation
ICLR 2023 [paper, BibTeX]
Formal limitations of sample-wise information-theoretic generalization bounds
IEEE Information Theory Workshop 2022 [arXiv, BibTeX]
Failure Modes of Domain Generalization Algorithms
CVPR 2022 [arXiv, code 1, 2, BibTeX]
Information-theoretic generalization bounds for black-box learning algorithms
NeurIPS 2021 [arXiv, code, BibTeX]
Estimating informativeness of samples with smooth unique information
ICLR 2021 [arXiv, code, BibTeX]
Improving generalization by controlling label-noise information in neural network weights
ICML 2020 [arXiv, code, BibTeX]
Fast structure learning with modular regularization
NeurIPS 2019 [arXiv, code, BibTeX]
Efficient Covariance Estimation from Temporal Data
arXiv preprint [arXiv, code, BibTeX]
MixHop: Higher-order graph convolution architectures via sparsified neighborhood mixing
ICML 2019 [arXiv, code, BibTeX]
Multitask learning and benchmarking with clinical time series data
Scientific Data 6(1), 96 [arXiv, code, BibTeX]
Disentangled representations via synergy minimization
Allerton 2017 [arXiv, BibTeX]