Ayman Mahfuz

Computer Science & Mathematics Student | Aspiring Software Engineer & Machine Learning Researcher

About Me

Hey! I'm Ayman Mahfuz, a Computer Science and Mathematics student at the University of Texas at Austin, specializing in AI and Machine Learning. My passion for problem-solving and technology started early, and today, I focus on reinforcement learning, large-scale ML systems, and software engineering to build AI that is both intelligent and efficient.

I conduct research in multiagent reinforcement learning and high-performance machine learning, optimizing AI decision-making in real-time environments. My work spans deep reinforcement learning (DRL), distributed AI systems, and scalable ML architectures. Beyond research, I’ve built production-grade software—developing robust data pipelines, training state-of-the-art models, and deploying full-stack AI applications.

I bring a strong systems and AI engineering mindset, with expertise in Python, Java, C, C++, JavaScript, and SQL. I work with AI frameworks like PyTorch and TensorFlow, and I optimize AI for high-performance computing (HPC) on cloud platforms like AWS and GCP.

Outside of engineering and research, I play soccer, lift weights, and stay involved with the Muslim Students Association (MSA). I’m always looking for ways to push AI forward—whether through research, software, or real-world applications.

Education

The University of Texas at Austin

Bachelor of Science

Location: Austin, TX, USA

Double Major: Computer Science, Mathematics

Minor: Business

Concentration: Artificial Intelligence and Machine Learning

Relevant Coursework:

  • Science of High Performance Computing
  • Data Structures
  • Computer Architecture and Organization
  • Computer Systems and Operating Systems
  • Algorithms
  • Linear Algebra
  • Probability

Skills

Programming Languages

Python Java C JavaScript HTML/CSS Ruby C++ PHP

Frontend Development

React.js Node.js HTML/CSS

Backend Development

Flask Django Node.js

Data Science & Machine Learning

Pandas NumPy Scikit-learn

Databases

SQL PostgreSQL

Tools & Libraries

Git AWS Google Cloud Platform

Miscellaneous

ARM64 MATLAB

Work Experience

Research Assistant

University of Texas – Artificial Intelligence Lab, Texas Robotics | Austin, TX • Jan 2025 – Present

I develop multiagent reinforcement learning models for robotic soccer, optimizing AI coordination and decision-making in dynamic, real-time environments. My work integrates deep reinforcement learning (DRL) and supervised learning (SL) to train agents in 2D and 3D simulations, ensuring effective sim-to-real transfer for deployment on physical NAO robots. By enhancing distributed AI communication and optimizing system-level execution, I contribute to improving autonomous robotics strategies in high-speed, competitive settings.

  • Developing multiagent RL models for NAO robot soccer, applying deep reinforcement learning (DRL) and supervised learning (SL) to train agents in 2D & 3D simulations, then optimizing sim-to-real transfer for deployment on physical robots
  • Optimizing real-time decision-making in a 400K+ line C++ robotics codebase, improving distributed AI coordination for millisecond-level reactions while handling low-latency constraints & real-time system scheduling

Data Engineer Intern

The Sunwater Institute | North Bethesda, MD • Jan 2025 – Present

I work on the Legis-1 Platform. I build high-performance data pipelines to support AI-driven policy research, processing legislative documents at scale to power structured retrieval and automated analysis. My contributions include developing large-scale LLM pipelines for AI-generated news and policy insights, leveraging retrieval-augmented generation (RAG) and embeddings to analyze 500K+ legal records. By optimizing storage efficiency and retrieval speed, I enhance the AI-readiness of structured legislative data.

  • Building high-performance data pipelines for Legis-1, a legislative database with millions of legal documents, optimizing retrieval speed, storage efficiency, and AI-readiness for structured data.
  • Developing LLM pipelines to power AI-generated news and policy analysis, leveraging retrieval-augmented generation (RAG), embeddings, and scalable document processing across 500K+ records.

Software Engineer, Research Assistant

University of Texas – Center for Media Engagement | Austin, TX • Sep 2023 – Present

I conduct media research on how people interact with news, platforms, and each other, designing systems and using machine learning to evaluate political opinions, storytelling patterns, and societal divides. My contributions include building a 150-million-entry dataset, developing DistilBERT models for large-scale analysis, and designing the full system architecture for MTurk-integrated React games to track and analyze user behavior.

  • Engineered large-scale data pipelines to scrape, preprocess, and upload 50M+ news articles and 70M+ comments to BigQuery, using APIs, sitemaps, and Pandas, while developing dashboards with Python and SQL for real-time monitoring.
  • Led machine learning initiatives by fine-tuning multiple BERT models for key NLP tasks—including clickbait detection, story identification, entity recognition, and sentiment analysis—achieving up to 99% accuracy
  • Designed & deployed a research platform independently with React, Flask, and Firebase, featuring 3 interactive games, MTurk integration, real-time analytics tracking 15+ metrics, and 99.99% uptime, serving 1,000+ participants.

Machine Learning Engineer, Research Assistant

University of Texas – Oden Institute for Computational Engineering and Sciences | Austin, TX • Feb 2024 – Jan 2025

I applied vision-transformer segmentation models such as TransUNet and MedSAM2 to pancreas and organ segmentation in 3D MRI data. This work aimed to identify early indicators of diabetes and pancreatic cancer, enabling earlier and more accurate diagnoses for patients.

  • Trained and evaluated ViT-enhanced models (TransUNet, MedSAM2) for 3D MRI segmentation on TACC’s Lonestar6 supercomputer (H100 GPUs), leveraging Apptainer containerization, mixed precision training, and HPC optimizations to achieve ~0.82 Dice score for Dell Medical School.
  • Developed real-time training monitoring scripts to track loss curves, step efficiency, and model convergence, enabling rapid hyperparameter tuning and comparative model evaluation, while resolving I/O bottlenecks and HPC memory constraints.

Software Engineering Research Intern

University of Maryland – College Park • Remote • Jun 2023 – Present

I helped build an NLP-based chatbot to engage news readers and analyzed linguistic patterns to improve the interaction. My contributions focused on Python scripting and on analysis for a paper published at CHI 2024.

  • Project: Towards Designing a Question-Answering Chatbot for Online News
  • Led the development of an NLP-driven chatbot to augment online news reader engagement, employing deep learning and AI techniques. Directed comprehensive studies and analyses, culminating in findings on human-chatbot interaction dynamics.
  • Performed text analytics and data labeling in Python, including part-of-speech tagging, LIWC, and clustering on sentence embeddings, to identify linguistic patterns.
  • Collaborated with a cross-disciplinary team of professors and graduate students, driving content creation and ensuring methodological precision. Contributed core analytical insights and visualizations to the research paper published at the CHI 2024 conference.
  • Authored Python scripts for in-depth data analysis, generating insightful graphs and visualizations that formed the backbone of the research findings, illustrating complex human-chatbot interaction patterns.

Software Engineer Intern

Lockheed Martin – Remote • Jun 2022 – Oct 2022

At Lockheed Martin, I optimized CRM workflows, introduced RPA solutions, and cleaned up their Configuration Database to boost operational efficiency.

  • Spearheaded the development and optimization of Customer Relationship Management (CRM) workflows at Lockheed Martin, achieving a centralized device data framework that enhanced enterprise operational efficiency.
  • Engineered advanced CRM solutions by integrating JavaScript for flow enhancements and implementing Robotic Process Automation (RPA), streamlining the data de-duplication process and elevating data integrity.
  • Administered and refined the Configuration Database, successfully purging redundant records and bolstering data accuracy. Synthesized and presented data-driven insights to executives, highlighting the tangible impact on operational efficiency and guiding strategic decisions.

Summer Learning Academy

AT&T – Austin, TX • Jun 2021 – Aug 2021

As the youngest participant, I gained exposure to AI, business strategies, and professional development while collaborating on tech-focused initiatives with industry leaders.

Software Engineer Intern

City of Austin – Austin, TX • Jun 2021 – Aug 2021

I helped Austin’s post-COVID recovery by improving loan processing workflows for small businesses. My work included Python scripting and data visualization to streamline operations.

Projects

Inkwell: YouTube for Books

A dynamic book-sharing platform that allows users to explore and share books freely while empowering authors to earn more by bypassing traditional publishers.

Tech Stack: React, PostgreSQL, Django, AWS S3, Django Rest Framework

Click the title to visit the live site.

AI Dermatologist

An AI-powered tool designed to provide personalized skincare recommendations from a single facial scan using cutting-edge vision transformers and machine learning models.

Tech Stack: Python, TensorFlow, OpenCV, Kaggle Datasets

LeetCode Matchmaker

A web application that helps users discover LeetCode problems similar to a given one, aiding in interview preparation.

Tech Stack: React, Flask, Scikit-learn, PostgreSQL

Click the title to visit the live site.

Huffman Encoder (Java)

A data compression tool implementing Huffman coding with encoding/decoding functionality and a graphical user interface.

Tech Stack: Java, File I/O

System Emulator (C)

A low-level system emulator capable of simulating basic operations like instruction execution, memory management, and I/O handling.

Tech Stack: C

Pintos Operating System

Developed an operating system kernel, implementing core functionalities such as thread scheduling, synchronization, and virtual memory management.

Tech Stack: C

Hobbies & Interests

Weightlifting

I love spending time in the gym and hitting new maxes.

Soccer

I've been playing soccer since I could walk. If I'm not working, you can find me on the nearest field!

Family and Friends

I cherish spending quality time with friends and family, whether it's a casual hangout or a special gathering.

Startups

I spend a lot of time working on my startups. When I'm not coding, I'm asking people I'm close with for advice and feedback on my current projects.