I am a PhD student in the Computer Science department at the University of Oxford, advised by Prof. Varun Kanade and Prof. Phil Blunsom. My PhD is generously supported by Google DeepMind. I am broadly interested in the theoretical and scientific understanding of deep learning models. Most of my research focuses on analyzing the expressiveness and algorithmic learning abilities of neural network architectures, with the aim of gaining insights that can help us develop more effective models.

Keywords: Expressivity, Science of Deep Learning, Algorithmic Reasoning, Transformers, RNNs/SSMs

I am currently a student researcher at Google in Sunnyvale, where I am working on improving LLM agents. Over the last two summers, I interned at Cohere, where I worked on pretraining LLMs with non-Transformer architectures. Before joining Oxford, I spent two amazing years as a Research Fellow at Microsoft Research India, where I worked with Dr. Navin Goyal. Prior to that, I spent a wonderful semester working with Dr. Partha Talukdar at the Indian Institute of Science. I graduated with a B.E. (Hons.) in Computer Science and an Int. M.Sc. (Hons.) in Biological Science from BITS Pilani, India, in 2019. For more details, refer to my CV or drop me an email.

Selected Publications

Separations in the Representational Capabilities of Transformers and Recurrent Architectures
Satwik Bhattamishra, Michael Hahn, Phil Blunsom, Varun Kanade
NeurIPS 2024
pdf abstract

Understanding In-Context Learning in Transformers and LLMs by Learning to Learn Discrete Functions
Satwik Bhattamishra, Arkil Patel, Phil Blunsom, Varun Kanade
ICLR 2024 Oral
pdf code abstract

On the Ability and Limitations of Transformers to Recognize Formal Languages
Satwik Bhattamishra, Kabir Ahuja, Navin Goyal
EMNLP 2020
pdf code abstract

On the Practical Ability of RNNs to Recognize Hierarchical Languages
Best Short Paper Award
Satwik Bhattamishra, Kabir Ahuja, Navin Goyal
COLING 2020
pdf code abstract

Tools

LibNMF
An easy-to-use Python library implementing a set of tested optimization and regularization methods for non-negative matrix factorization (NMF). Implemented algorithms include graph-regularized NMF, probabilistic NMF, and a first-order primal-dual algorithm, among others (a minimal sketch of the core update rule is shown below).
Github
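LibNMF's actual interface may differ from what is shown here; purely as an illustration of the kind of method the library implements, the following is a minimal NumPy sketch of the classic Lee and Seung multiplicative-update algorithm for NMF. The function name nmf_multiplicative and all parameters are hypothetical, not LibNMF's API.

```python
import numpy as np

def nmf_multiplicative(X, rank, n_iters=200, eps=1e-10, seed=0):
    """Factor a non-negative matrix X (m x n) as W @ H, with W (m x rank)
    and H (rank x n) non-negative, via Lee & Seung multiplicative updates.
    Illustrative sketch only; not LibNMF's actual API."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    W = rng.random((m, rank))
    H = rng.random((rank, n))
    for _ in range(n_iters):
        # Alternate the two update rules; eps guards against division by zero.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Usage: factor a random non-negative 50 x 40 matrix at rank 5 and
# report the relative reconstruction error.
X = np.abs(np.random.default_rng(1).normal(size=(50, 40)))
W, H = nmf_multiplicative(X, rank=5)
print(np.linalg.norm(X - W @ H) / np.linalg.norm(X))
```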

PyDPP
A Python package, available via pip, with modules for sampling from Determinantal Point Processes (DPPs). Contains implementations of algorithms that sample subsets of a ground set in a way that encourages diversity among the selected points (a sketch of the standard sampling procedure is shown below).
Github
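PyDPP's actual interface may differ; purely to illustrate the underlying technique, here is a minimal NumPy sketch of the standard spectral DPP sampler (Hough et al.; Algorithm 1 in Kulesza and Taskar, 2012). The function name sample_dpp and all details are hypothetical, not the package's API.

```python
import numpy as np

def sample_dpp(L, seed=0):
    """Draw one sample from a DPP with PSD kernel L using the spectral
    algorithm of Hough et al. Illustrative sketch; not PyDPP's API."""
    rng = np.random.default_rng(seed)
    eigvals, eigvecs = np.linalg.eigh(L)
    # Phase 1: keep each eigenvector with probability lam / (1 + lam),
    # which selects an elementary DPP.
    keep = rng.random(len(eigvals)) < eigvals / (1.0 + eigvals)
    V = eigvecs[:, keep]
    sample = []
    # Phase 2: sample one item per remaining dimension of span(V).
    while V.shape[1] > 0:
        # Pick item i with probability proportional to the squared
        # norm of row i of V.
        probs = np.sum(V**2, axis=1)
        i = rng.choice(len(probs), p=probs / probs.sum())
        sample.append(i)
        # Project the columns of V onto the subspace orthogonal to e_i
        # (zeroing row i), drop one column, and re-orthonormalize.
        j = np.argmax(np.abs(V[i, :]))
        V = V - np.outer(V[:, j] / V[i, j], V[i, :])
        V = np.delete(V, j, axis=1)
        if V.shape[1] > 0:
            V, _ = np.linalg.qr(V)
    return sorted(sample)

# Usage: sample a diverse subset of ten 1-D points under an RBF kernel;
# nearby points are similar, so the sample tends to spread out.
pts = np.linspace(0, 1, 10)
L = np.exp(-((pts[:, None] - pts[None, :]) ** 2) / 0.05)
print(sample_dpp(L))
```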

Service
Teaching
Reviewer   ICML 2024, 2023   ACL 2023, 2022   NeurIPS 2023, 2022   ICLR 2022   ACL Rolling Review   EMNLP 2022, 2021, 2020   NAACL 2021
Selected Talks
Language Modelling with Recurrent and State Space Architectures
Georgia Tech (SysML Guest Lecture) 11/2024
Simplicity Bias in Transformers
Formal Languages and Neural Networks (FLaNN) seminar 11/2023
On the Ability of Neural Sequence Models to Recognize Formal Languages
Google DeepMind NLP reading group 03/2022
MALL Lab at Indian Institute of Science 12/2020
Timeline
BITS Pilani   2014 - 2019
Microsoft Research India   2019 - 2021
University of Oxford   2021 - Present
Cohere   Summers 2023 & 2024
Google   Fall 2024