Arnav Balyan

Senior Software Engineer | Uber


About Me


Senior Software Engineer at Uber, working on distributed systems and AI (2+ years). Collaborating with the MIT Media Lab, the Harvard Edge Computing Lab, UC Berkeley, and Stanford on AI and systems research.

  • Hold patents and have authored around 10 papers in systems, big data, and artificial intelligence.
  • Contributor to Apache Pinot, Gluten, Spark, and Velox; designed and developed systems that serve 500K+ RPS and 28M+ customers daily. Represented Uber as a speaker at several talks, engineering summits, and meetups.
  • National-level Government of India SIH award holder (<1%); made several closed-source contributions to Hadoop and Spark.

Experience



Uber Technologies

Sr. Software Engineer - - Present

Worked on patents for high-cardinality data and contribute to the open-source projects Apache Pinot, Spark, Gluten, and Velox. Creating AI solutions for cost efficiency in big data.

  • Designed and developed Uber's next-gen Data Observability stack, handling 2.5M+ compute jobs daily.
  • Developed novel AI and distributed solutions for big data compute and storage; created features for Uber's Spark and Hadoop stack serving 100K+ QPS and 20K+ data pipelines.
Distributed Systems, Big Data, Apache Spark, Hadoop, C++, Data Observability

Harvard University

Edge Computing Lab - Researcher [ - ]

Edge Computing Lab - Large Language Models: researching parameter-efficient fine-tuning and adapting LLMs to domain-specific use cases.

  • Built large language models for computer architecture use cases and collected data for the QuArch dataset, which represents 50+ years of computer architecture knowledge. (Paper in progress)
  • Contributed to the open-source frameworks Axolotl and Eval Harness; fine-tuned models such as Llama, Mistral, and Gemma.
Machine Learning, Large Language Models, Computer Architecture, Open Source, Python

Massachusetts Institute of Technology

MIT Media Lab - Visiting Student [ - ]

MIT Media Lab - Research in deep learning models for signal (EEG) data. Created SOTA models for classification of mental states.

  • Artificial intelligence for brain-computer interfaces (BCIs), supporting ease of communication for patients with neurological disorders such as amyotrophic lateral sclerosis (ALS).
  • Classification and detection of mental states using highly temporal and spatial EEG data. (Published 2 papers)
Machine Learning, BCI, Neuroscience, EEG

UC Berkeley

Skylab - Research Collaborator [ - ]

Research in distributed systems and cloud computing, creating cost-efficient solutions for big data.

  • Contributing to SkyShift, a compatibility layer over resource managers like Kubernetes and Slurm, addressing scheduling problems in distributed systems.
  • Created Spark jobs and load balancers and deployed them in a resource-manager-agnostic way.
Distributed Systems, Kubernetes, Spark, Resource Management, Cloud Computing

SGLang Project

Research Collaborator [ - Present]

Worked on SGLang, a framework for fast serving of large language models, with researchers from Stanford University and UC Berkeley.

  • Created custom memory pools for offloading KV cache from GPU to CPU, increasing serving throughput and reducing memory requirements for LLMs.
  • Improving throughput and serving capabilities to make LLMs more efficient in distributed settings.
Machine Learning, Large Language Models, GPU Programming, CUDA, PyTorch

Defence Research and Development Organisation (DRDO)

Intern - -

The Defence Research and Development Organisation is an agency of the Government of India responsible for military research and development. Researched language modelling.

  • Created AI-based systems such as natural-language-driven chatbots.
  • Integrated these with the main website and backend databases such as PostgreSQL.
Python, NLP, NLU, NLTK, DevOps

Brown Eagle B.V.

Machine Learning

Brown Eagle Inc. builds next-generation e-commerce platforms based on adaptive technology, with advanced personalized communications and recommendations.

  • Created customized web scraping systems to crawl target data-source websites.
  • Integrated the fully automated data acquisition layer with internal systems and dashboards.
Python, NLP, NLU, NLTK, DevOps

KPMG International Cooperative

Data Analytics Consulting Intern (Virtual) - -

KPMG International Cooperative is a multinational professional services network and one of the Big Four accounting organizations, based in Amstelveen, the Netherlands.

  • Analysed and cleaned data for over 40,000 customers and worked on multiple datasets.
  • Performed multivariate and predictive analysis on the data, then presented the resulting insights.
Python, Pandas, Tableau, Excel, Power BI

Decodr Technologies

Product Research Intern - -

Decodr Technologies empowers the younger generation with future technology skills. It provides training and assessment programs designed by industry experts to keep them relevant for students. It is based in New Delhi, India.

  • Created machine learning and AI projects such as handwritten digit identification, web scrapers, and object detection, using algorithms such as CNNs on datasets such as CIFAR and MNIST.
  • Researched existing products along similar lines in the industry.
Selenium, Python, Pandas, NumPy, Matplotlib, DevOps

Startup 201

Data Science Intern - -

An organisation that works on different modules related to startups and has in-depth knowledge of growing and expanding them. Its primary aim is to help startups grow on social media platforms. (Paper in progress)

  • Created web scrapers to gather and analyse data from professional websites such as LinkedIn.
  • Hardened the website against bots and crawlers by integrating techniques such as reCAPTCHA.
Beautiful Soup, Selenium, Python, HTML, CSS

Coding Ninjas

Teaching Assistant - -

Coding Ninjas is one of the largest online tech education companies in India, offering courses in C++, Java, Python, Android, machine learning, data science, web development, interview preparation, and tech aptitude.

  • Debugged code and troubleshot errors on a day-to-day basis.
  • Mentored students, monitored their performance, and answered their questions daily.
C, C++, Java, Python, Debugging

Projects


Health AI

Causal Language Modeling

Ad-hoc Networks

Realtime RCNNs

Object Detection

Web Scraping

Research Papers / Publications


Blockchain-based Rumor Detection Approach for COVID-19

Journal of Ambient Intelligence and Humanized Computing 15 (1), 435-449, 2024 | Impact Factor: 7.1

Blockchain, Rumor Detection, COVID-19, Artificial Intelligence

The Hybrid Deep Learning Model for Identification of Attention-Deficit/Hyperactivity Disorder Using EEG

Journal of Clinical EEG and Neuroscience 55 (1), 22-33, 2024

Deep Learning, EEG, ADHD, Neuroscience

Decoding Visual Imagery Using EEG/EOG Glasses: A Pilot Study

Proceedings of the Future Technologies Conference, pp. 415-432, 2022 | Best Paper Award

EEG, EOG, Visual Imagery, BCI

Identification of ADHD Disorder in Children Using EEG Based on Visual Attention Task by Ensemble Deep Learning

Proceedings of International Conference on Data Science and Applications, 2023

Ensemble Learning, EEG, ADHD, Children

A Secure Epidemic Routing Using Blockchain in Opportunistic Internet of Things

Data Analytics and Management: Proceedings of ICDAM, pp. 101-110, 2021

Blockchain, IoT, Epidemic Routing, Security

A Probabilistic Routing-Based Secure Approach for Opportunistic IoT Network Using Blockchain

2020 IEEE 17th India Council International Conference (INDICON), pp. 1-7, 2020

Blockchain, IoT, Routing Protocols, Security

Decoding Visual Covert Attention Shift from EEG for Use in BCI

ICT Systems and Sustainability: Proceedings of ICT4SD 2021, Volume 1, pp. 883-893, 2022

EEG, Attention Shift, BCI, Neuroscience

Target Speaker Detection with EEG/EOG Glasses: A Pilot Study

Proceedings of the Future Technologies Conference, pp. 433-446, 2022

EEG, EOG, Speaker Detection, Pilot Study

Education / Training


University of Delhi, New Delhi, India

Bachelor of Engineering - -

  • Bachelor of Engineering in Computer Engineering from Netaji Subhas Institute of Technology.

Tagore International School, Vasant Vihar, New Delhi, India

AISSCE/CBSE (Class XII) -

  • Completed Higher Secondary Education (Class XII).

Contact