I am a PhD student in the Computer Science department at the University of Oxford, advised by Prof. Varun Kanade and Prof. Phil Blunsom. My PhD is generously supported by Google DeepMind. I am primarily interested in understanding and improving machine learning algorithms for language modelling. My work focuses on analysing the expressiveness and learning abilities of neural network architectures in order to gain insights that can help us develop more effective models.

Before joining Oxford, I spent two amazing years as a Research Fellow at Microsoft Research India, where I worked with Dr. Navin Goyal. Prior to that, I spent a wonderful semester working with Dr. Partha Talukdar at the Indian Institute of Science. I used to occasionally solve problems on websites like Kaggle and answer questions on the stats.stackexchange forum. I graduated with a B.E. (Hons.) in Computer Science and an Int. M.Sc. (Hons.) in Biological Science from BITS Pilani, India in 2019. For more details, refer to my CV or drop me an email.

Selected Publications

Separations in the Representational Capabilities of Transformers and Recurrent Architectures
Satwik Bhattamishra, Michael Hahn, Phil Blunsom, Varun Kanade
NeurIPS 2024
pdf abstract

Understanding In-Context Learning in Transformers and LLMs by Learning to Learn Discrete Functions
Satwik Bhattamishra, Arkil Patel, Phil Blunsom, Varun Kanade
ICLR 2024 Oral
pdf code abstract

On the Ability and Limitations of Transformers to Recognize Formal Languages
Satwik Bhattamishra, Kabir Ahuja, Navin Goyal
EMNLP 2020
pdf code abstract

On the Practical Ability of RNNs to Recognize Hierarchical Languages
Best Short Paper Award
Satwik Bhattamishra, Kabir Ahuja, Navin Goyal
COLING 2020
pdf code abstract

Tools

LibNMF
An easy-to-use Python library with implementations of a set of tested optimization and regularization methods for NMF. Implemented algorithms include graph-regularized NMF, probabilistic NMF, a first-order primal-dual algorithm, and more.
Github
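To give a flavour of the kind of method the library covers, here is a minimal numpy sketch of plain NMF via Lee and Seung's multiplicative updates. This is a generic illustration, not LibNMF's actual API; the function and variable names below are made up for the example.

```python
import numpy as np

def nmf_multiplicative(V, rank, n_iter=200, eps=1e-10, seed=0):
    """Factor a non-negative matrix V ~= W @ H using Lee & Seung's
    multiplicative updates for the Frobenius-norm objective."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(n_iter):
        # Multiplicative updates keep W and H non-negative by construction;
        # eps guards against division by zero.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Approximate a random 20x30 non-negative matrix with a rank-5 factorization.
V = np.random.default_rng(1).random((20, 30))
W, H = nmf_multiplicative(V, rank=5)
err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
```

The regularized variants the library implements (e.g. graph-regularized NMF) modify these updates with extra penalty terms, but the basic alternating multiplicative structure is the same.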

PyDPP
A Python package, available via pip, with modules for sampling from Determinantal Point Processes (DPPs). It contains implementations of algorithms to sample from DPPs, which encourage diversity when selecting a subset of points from a ground set.
Github
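For intuition, below is a minimal numpy sketch of the standard eigendecomposition-based exact DPP sampler (in the style of Hough et al.). It is not PyDPP's actual interface; the names and the example kernel are illustrative only.

```python
import numpy as np

def sample_dpp(L, rng):
    """Draw one exact sample from a DPP with PSD kernel L
    using the eigendecomposition method."""
    vals, vecs = np.linalg.eigh(L)
    # Phase 1: keep eigenvector i independently with prob lambda_i / (1 + lambda_i).
    V = vecs[:, rng.random(len(vals)) < vals / (1 + vals)]
    sample = []
    while V.shape[1] > 0:
        # Item i is chosen with probability proportional to its squared row norm.
        p = np.sum(V**2, axis=1)
        p /= p.sum()
        i = rng.choice(len(p), p=p)
        sample.append(int(i))
        # Project the kept eigenvectors onto the subspace orthogonal to e_i,
        # then re-orthonormalize with QR.
        j = np.argmax(np.abs(V[i, :]))
        V = V - np.outer(V[:, j], V[i, :] / V[i, j])
        V = np.delete(V, j, axis=1)
        if V.shape[1] > 0:
            V, _ = np.linalg.qr(V)
    return sorted(sample)

# Ten points in the unit square with an RBF similarity kernel: nearby
# (similar) points rarely co-occur, so the sample tends to be spread out.
rng = np.random.default_rng(0)
X = rng.random((10, 2))
D = np.sum((X[:, None] - X[None, :]) ** 2, axis=-1)
L = np.exp(-D / 0.5)
S = sample_dpp(L, rng)
```

The diversity effect comes from the kernel: subsets of mutually similar points have near-singular kernel submatrices and hence near-zero probability.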

Service
Teaching
Reviewer   ICML 2024, 2023   ACL 2023, 2022   NeurIPS 2023, 2022   ICLR 2022   ACL Rolling Review   EMNLP 2022, 2021, 2020   NAACL 2021
Selected Talks
Language Modelling with Recurrent and State Space Architectures
Georgia Tech (SysML Guest Lecture) 11/2024
Simplicity Bias in Transformers
Formal Languages and Neural Networks (FLaNN) seminar 11/2023
On the Ability of Neural Sequence Models to Recognize Formal Languages
Google DeepMind NLP reading group 03/2022
MALL Lab at Indian Institute of Science 12/2020
BITS Pilani
2014 - 2019
Google Summer of Code
Summer 2016
Indian Institute of Science
Fall 2018
Microsoft Research India
2019 - 2021
University of Oxford
2021 - Present
  Template: Sebastin