Hey👋! I am Varun, a Pre-Doctoral SCAI-Center Fellow at Microsoft Research India, where I am fortunate to work with Dr. Kalika Bali and Dr. Sunayana Sitaram. Prior to this, I was a Master's student at AI4Bharat and IIT Madras, where I was co-advised by Dr. Raj Dabre and Prof. Mitesh Khapra. What now feels like eons ago, I did my undergraduate studies at BITS Hyderabad, where I graduated with a B.E. in Computer Science and a Minor in Physics.

Research Interests

I am broadly interested in Natural Language Processing (NLP), mainly Multilinguality coupled with Machine Translation, Model Efficiency, Reasoning, and Evaluation of Large Language Models. Some specific areas and research questions I am interested in are:

🗎 Document-Level Machine Translation How can we build or modify existing Machine Translation models to translate entire documents reliably and efficiently? In a recent work, I explored a rudimentary approach of swapping out the Positional Embeddings of a standard pretrained Transformer with ones that favour length generalization, and showed that this can lead to significant improvements in long-context understanding and generation with minimal finetuning. A consequent question I am pondering is how we can develop suitable evaluation metrics for document-level translation that capture the nuances of the task and provide a more reliable measure of model performance, with more emphasis on mitigating the "translationese" and "coreference-resolution" issues that plague current metrics. With the ubiquity of Large Language Models (LLMs) and Instruction-Tuning, it would also be interesting to explore how much of these problems can be solved by LLMs with "Steerable" Generations.
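
To make the idea concrete, here is a minimal PyTorch sketch of one such length-generalizing scheme, ALiBi-style attention biases, where position enters as a distance penalty on the attention logits rather than as embeddings added to the input. This is an illustrative toy under my own naming, not the code from the paper:

```python
# Minimal sketch of ALiBi-style positional biases (Press et al., 2022 flavour).
# All function names here are illustrative, not from any released codebase.
import math
import torch

def alibi_slopes(n_heads: int) -> torch.Tensor:
    # One slope per head, a geometric sequence (exact for power-of-2 head counts).
    start = 2 ** (-8 / n_heads)
    return torch.tensor([start ** (i + 1) for i in range(n_heads)])

def alibi_bias(seq_len: int, n_heads: int) -> torch.Tensor:
    # Linear penalty on query-key distance; 0 for future positions
    # (a causal mask, omitted here for brevity, would hide those anyway).
    pos = torch.arange(seq_len)
    dist = (pos[None, :] - pos[:, None]).clamp(max=0)      # (L, L), non-positive
    return alibi_slopes(n_heads)[:, None, None] * dist     # (H, L, L)

def attention_with_alibi(q, k, v):
    # q, k, v: (batch, heads, seq, dim). No positional embeddings on the
    # inputs; position enters only through the additive bias below.
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))
    scores = scores + alibi_bias(q.size(-2), q.size(1)).to(q.device)
    return torch.softmax(scores, dim=-1) @ v

# Usage: sequences longer than those seen in training need no new parameters.
q = k = v = torch.randn(1, 8, 128, 64)
out = attention_with_alibi(q, k, v)  # (1, 8, 128, 64)
```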

⚙️ Model Efficiency I was first introduced to this problem while working on my Master's Thesis on Knowledge Distillation for Multilingual Machine Translation models, in which I explored how simple architectural variants like Extreme Parameter Sharing, Language-Specific Parameter Augmentation, and the Width-vs-Depth trade-off can lead to various gains or drops in model performance at the cost of minimal parameter overhead (framework). Once again, with the advent of LLMs, I am interested in exploring methods like Adaptive Tokenization, KV-Cache Compression, Mixture-of-Experts, and Efficient Long-Range Attention mechanisms to make training and inference of LLMs faster and computationally less expensive.
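
For flavour, here is a minimal PyTorch sketch of the word-level knowledge distillation objective that underpins such student-teacher setups: the student is trained on a mix of the teacher's softened token distributions and the usual cross-entropy on references. The tensors and hyperparameters are illustrative placeholders, not my actual thesis framework:

```python
# Minimal word-level knowledge distillation loss (Hinton-style soft targets).
# Shapes and hyperparameters are illustrative only.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature: float = 2.0, alpha: float = 0.5):
    """student_logits, teacher_logits: (batch, seq, vocab); labels: (batch, seq)."""
    # Soft-target term: KL between teacher and student at temperature T,
    # rescaled by T^2 so gradients stay comparable across temperatures.
    t_probs = F.softmax(teacher_logits / temperature, dim=-1)
    s_logp = F.log_softmax(student_logits / temperature, dim=-1)
    soft = F.kl_div(s_logp, t_probs, reduction="batchmean") * temperature ** 2
    # Hard-target term: ordinary cross-entropy against the reference tokens.
    hard = F.cross_entropy(student_logits.flatten(0, 1), labels.flatten())
    return alpha * soft + (1 - alpha) * hard

# Usage with dummy shapes: 2 sentences, 10 tokens, 32k vocabulary.
s = torch.randn(2, 10, 32000, requires_grad=True)
t = torch.randn(2, 10, 32000)
y = torch.randint(0, 32000, (2, 10))
loss = distillation_loss(s, t, y)
loss.backward()
```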

⚖️ Evaluations The rampant consumption of data on the internet to train LLMs has led to the "contamination" of standard benchmarks (EvalEval 2024). Hence, it is vital to continually develop Evaluation Benchmarks and Metrics for holistic evaluation of these models. Standard metrics are unfit for a multi-dimensional assessment of creative and open-ended generations, which is when we have to resort to LLM-based Evaluators. However, in our previous works we found that there is a clear disparity in the performance of LLM-based evaluators across languages, tasks, and perturbations (EACL 2024, NAACL 2024, EMNLP 2024, ArXiv 2024). Therefore, it is important to develop more robust and "rational" multilingual evaluators. I believe that inducing strong reasoning abilities in LLMs could bolster the development of such evaluators, and I am interested in exploring how we can build and "meta-evaluate" models that reason over the output and provide a more reliable and interpretable evaluation.
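
As a sketch of what I mean by an LLM-based evaluator, here is a minimal judge loop with a parseable rubric. The rubric wording and the `call_llm` stub are placeholders for whatever model or API one plugs in, not a system from the papers above:

```python
# Minimal LLM-as-judge sketch; rubric, dimension, and `call_llm` are
# hypothetical placeholders, not any specific evaluator from our papers.
import re

RUBRIC = """You are evaluating a {language} response.
Rate the OUTPUT for the INPUT on fluency from 1 (worst) to 5 (best).
INPUT: {source}
OUTPUT: {candidate}
Reply with only: Score: <1-5>. Reason: <one sentence>."""

def call_llm(prompt: str) -> str:
    # Placeholder: swap in any chat-completion API or local model here.
    raise NotImplementedError

def judge(source: str, candidate: str, language: str) -> tuple[int, str]:
    reply = call_llm(RUBRIC.format(language=language, source=source,
                                   candidate=candidate))
    match = re.search(r"Score:\s*([1-5])", reply)
    if match is None:
        raise ValueError(f"Unparseable judgement: {reply!r}")
    return int(match.group(1)), reply

# Meta-evaluation idea from the text: perturb the candidate (e.g. shuffle
# words or switch language) and check the score drops, and correlate judge
# scores with human ratings across languages to expose disparities.
```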

For a full list of my publications you can have a look here. Please feel free to reach out to me over email if you have any questions about my research.

News

Teaching

Courses


Theme by Ankit Sultana. Website shamelessly copied verbatim from Kabir Ahuja
