I am a PhD student in the Computer Science department at the University of Oxford, advised by Prof. Phil Blunsom and Prof. Varun Kanade. My PhD is generously supported by Google DeepMind. I am primarily interested in understanding and improving machine learning algorithms for modelling languages. My work focuses on analyzing the expressiveness and learning abilities of neural network architectures in order to gain insights that can help us develop more effective models.

Before I joined Oxford, I spent two amazing years as a Research Fellow at Microsoft Research India, where I worked with Dr. Navin Goyal. Prior to that, I spent a wonderful semester working with Dr. Partha Talukdar at the Indian Institute of Science. I used to occasionally solve problems on websites like Kaggle and answer questions on the stats.stackexchange forum. I graduated with a B.E. (Hons.) in Computer Science and an Int. M.Sc. (Hons.) in Biological Science from BITS Pilani, India in 2019. For more details, refer to my CV or drop me an email.
Google Scholar | Semantic Scholar

Separations in the Representational Capabilities of Transformers and Recurrent Architectures
Satwik Bhattamishra, Michael Hahn, Phil Blunsom, Varun Kanade
pdf abstract

Understanding In-Context Learning in Transformers and LLMs by Learning to Learn Discrete Functions
Satwik Bhattamishra, Arkil Patel, Phil Blunsom, Varun Kanade
ICLR'24 [Oral]
pdf code abstract

Simplicity Bias in Transformers and their Ability to Learn Sparse Boolean Functions
Satwik Bhattamishra, Arkil Patel, Varun Kanade, Phil Blunsom
pdf code abstract

DynaQuant: Compressing Deep Learning Training Checkpoints via Dynamic Quantization
Amey Agrawal, Sameer Reddy, Satwik Bhattamishra, Venkata Prabhakara Sarath Nookala, Vidushi Vashishth, Kexin Rong, Alexey Tumanov
pdf abstract

Revisiting the Compositional Generalization Abilities of Neural Sequence Models
Arkil Patel, Satwik Bhattamishra, Phil Blunsom, Navin Goyal
pdf code abstract

Are NLP Models really able to Solve Simple Math Word Problems?
Arkil Patel, Satwik Bhattamishra, Navin Goyal
pdf code abstract article

On the Practical Ability of RNNs to Recognize Hierarchical Languages
Best Short Paper Award
Satwik Bhattamishra, Kabir Ahuja, Navin Goyal
pdf code abstract

Unsung Challenges of Building and Deploying Language Technologies for Low Resource Language Communities
Pratik Joshi, Christian Barnes, Sebastin Santy, Simran Khanuja, Sanket Shah, Anirudh Srinivasan, Satwik Bhattamishra, Sunayana Sitaram, Monojit Choudhury, Kalika Bali
pdf abstract cite

Submodular Optimization-based Diverse Paraphrasing and its Effectiveness in Data Augmentation
Ashutosh Kumar*, Satwik Bhattamishra*, Manik Bhandari, Partha Talukdar
NAACL'19 [Oral]
pdf code abstract


An easy-to-use Python library with implementations of a set of tested optimization and regularization methods for NMF. Implemented algorithms include graph-regularized NMF, probabilistic NMF, a first-order primal-dual algorithm, and others.
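For a flavour of what such a library computes, here is a minimal NumPy sketch of plain NMF with the classic Lee-Seung multiplicative updates. This is an illustrative sketch only, not the library's actual API; the function and parameter names are made up for the example.

```python
import numpy as np

def nmf_multiplicative(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Factorize a non-negative matrix V (m x n) into W (m x rank) @ H (rank x n)
    by minimizing the Frobenius reconstruction error with multiplicative updates."""
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W = rng.random((m, rank))
    H = rng.random((rank, n))
    for _ in range(n_iter):
        # Multiplicative updates keep W and H non-negative by construction
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Usage: low-rank approximation of a small non-negative matrix
V = np.abs(np.random.default_rng(1).random((6, 5)))
W, H = nmf_multiplicative(V, rank=3)
```

The updates never subtract, so non-negativity of the factors is preserved at every step; regularized variants (e.g. graph-regularized NMF) add extra terms to the numerators and denominators of these same update rules.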

A Python package available on pip with modules for sampling from Determinantal Point Processes (DPPs). Contains implementations of algorithms to sample from DPPs, which encourage diversity when selecting a subset of points from a ground set.

Reviewer   ICML 2024, 2023   ACL 2023, 2022   NeurIPS 2023, 2022   ICLR 2022   ACL Rolling Review   EMNLP 2022, 2021, 2020   NAACL 2021
BITS Pilani
2014 - 2019
Google Summer of Code
A*STAR, Singapore
Indian Institute of Science
Microsoft Research India
2019 - 2021
University of Oxford
2021 - Present
  Template: Sebastin