Resume
Ayman Mahfuz

Computer Science Student | Software Engineer & Machine Learning Researcher

RoboCup 2025 Bronze | Patent Pending | 4 Research Labs | ML @ Arm

About Me

Hi, I'm Ayman. I'm a computer science student at UT Austin (Hook 'em) pursuing a concentration in AI and ML. I've worked across 4 labs at UT as an undergraduate research assistant, from building RL environments for robot soccer to implementing and benchmarking ViTs and LLMs to aid medical research. I currently do ML research at Arm to optimize post-silicon validation. I love building impactful projects and solving complex problems, and I'm extremely passionate about deep learning.

  • Patent-pending ML validation at Arm: Industry-first Bayesian optimization that uncovers worst-case CPU & memory stress in <1% of the search space.
  • RoboCup 2025 bronze: Worked on multi-agent reinforcement-learning skills and environments powering UT Austin Villa's 7-v-7 robot soccer team, where we won bronze in a global tournament.
  • 200M-entry media pipeline: Built data/ML stack & fine-tuned BERT models (99%+ accuracy) for UT's Center for Media Engagement.
  • Pancreas MRI segmentation on H100s using computer vision and transformers: Engineered 3D medical-image pipeline, achieving +12% Dice at the Oden Institute.
  • Youngest speaker, UT AI Health Symposium: Presented research on multi-agent LLM clinical reasoning.

Education

The University of Texas at Austin

Bachelor of Science

Location: Austin, TX, USA

Major: Computer Science

Concentration: Artificial Intelligence and Machine Learning

Relevant Coursework:

  • Generative Visual Computing
  • Natural Language Processing
  • Science of High Performance Computing
  • Data Structures
  • Computer Architecture and Organization
  • Computer Systems and Operating Systems
  • Algorithms
  • Linear Algebra
  • Probability

Experience

Arm

ML Research Engineer (Part-Time) | Austin, TX • May 2025 - Present

CURRENT

Independently proposed and built patent-pending ML framework using Bayesian Optimization to automatically discover worst-case hardware stress tests, achieving 99.8th-percentile stress levels while exploring <1% of configuration space. Automated 10,000+ hours of validation testing for next-generation AI platforms.

  • Invented dual-surrogate Bayesian Optimization pipeline using Random Forests to navigate vast, non-linear search space of hardware parameters
  • Achieved 99.8th-percentile hardware stress by intelligently exploring <1% of configuration space, automating 10,000+ validation hours
  • Earned executive-level (SVP) recognition and pending patent; framework now key part of Arm's validation strategy

University of Texas - AI Lab, Texas Robotics

Research Assistant | Austin, TX • Jan 2025 - Present

BRONZE

Developed AI-driven agent skills that helped secure 3rd place at RoboCup 2025. Built RL-based policies for walking, dribbling, and attacking in 400K-line C++ codebase, slashing training time 70% through GPU optimization.

  • Designed hierarchical RL policies for all attacker behaviors, proving more robust than classical methods
  • Reduced RL training time 70% through aggressive GPU optimization and C++ simulator tuning
  • Pioneered curriculum learning strategies and novel reward shaping (Pitch Control, xG) for multi-agent RL

The Sunwater Institute

Data Engineer Intern | North Bethesda, MD • Jan 2025 - Mar 2025

Built high-performance data pipelines for Legis-1 Platform, processing millions of legislative documents. Developed LLM pipelines using RAG and embeddings to analyze 500K+ legal records for AI-driven policy research.

  • Optimized retrieval speed, storage efficiency, and AI-readiness for legislative database with millions of documents
  • Developed LLM pipelines with RAG, embeddings, and scalable processing across 500K+ legal records

University of Texas - Center for Media Engagement

Software Engineer, Research Assistant | Austin, TX • Sep 2023 - May 2025

Built 150M-entry dataset processing 50M+ articles and 70M+ comments. Fine-tuned BERT models achieving 99% accuracy for NLP tasks. Designed React/Flask/Firebase platform serving 1,000+ participants with 99.99% uptime.

  • Engineered data pipelines to BigQuery using APIs, sitemaps, and Pandas; built dashboards with Python and SQL
  • Fine-tuned BERT models for clickbait detection, story ID, entity recognition, sentiment analysis (99% accuracy)
  • Built React/Flask/Firebase platform with 3 interactive games, MTurk integration, 15+ metrics tracking

University of Texas - Oden Institute

ML Research Assistant | Austin, TX • Feb 2024 - Jan 2025

Built containerized 3D pancreas MRI segmentation pipeline on H100 supercomputer using CNNs and transformers. Achieved +12% Dice gain matching SOTA performance across 1000+ scans with 5-fold cross-validation.

  • Engineered Apptainer/SLURM pipeline on TACC H100s, achieving +12% Dice gain with hybrid architectures
  • Benchmarked CNNs, vision transformers, and hybrids across 1000+ scans, finding PanSegNet excels for small organs
  • Resolved GPU memory bottlenecks, I/O lag, and mixed precision instability for stable large-scale training
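For readers unfamiliar with the Dice metric cited above, here is a minimal sketch of how it is computed for binary segmentation masks. It is an illustration only, not the lab's evaluation code; the function name `dice_score` and the toy masks are hypothetical.

```python
def dice_score(pred, target):
    """Dice = 2 * |P ∩ T| / (|P| + |T|) for flat binary masks (lists of 0/1)."""
    if len(pred) != len(target):
        raise ValueError("masks must contain the same number of voxels")
    # Overlap: voxels labelled 1 in both the prediction and the ground truth.
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    # By convention, two empty masks count as perfect agreement.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


# Toy example: one overlapping voxel, two predicted, one true.
predicted_mask = [1, 1, 0, 0]
reference_mask = [1, 0, 0, 0]
print(round(dice_score(predicted_mask, reference_mask), 3))
```

Because the score is normalized by the combined mask size, small organs such as the pancreas are penalized heavily for even a few misclassified voxels, which is why Dice is a common choice for this kind of task.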

University of Texas - School of Information

Research Assistant | Austin, TX • Feb 2024 - Jan 2025

Designed multi-agent LLM research project studying diagnostic consistency in medical reasoning. Applied Cohen's Kappa, Chi-square tests, and logistic regression to assess agreement and bias. Presented at UT AI Health Conference as youngest speaker.

  • Project on page 124 of this report
  • Tested multi-agent LLM consistency with demographic/symptom variations; analyzed inter-agent communication patterns
  • Applied Cohen's Kappa, Chi-square, logistic regression to assess agreement, accuracy, and bias across agents
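As a rough illustration of the agreement statistic named above, the sketch below computes Cohen's Kappa for two raters (for example, two LLM agents labelling the same cases). It is a generic textbook formulation, not the project's analysis code; the function name and the diagnosis labels are made up for the example.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa = (p_o - p_e) / (1 - p_e) for two raters on the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(labels_a)
    # Observed agreement: fraction of items where both raters chose the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters always used a single identical label
    return (observed - expected) / (1.0 - expected)


# Hypothetical diagnoses from two agents on four cases.
agent_1 = ["flu", "flu", "cold", "cold"]
agent_2 = ["flu", "cold", "cold", "cold"]
print(cohens_kappa(agent_1, agent_2))
```

Kappa corrects raw agreement for the agreement expected by chance, so two agents that both default to the most common diagnosis do not look artificially consistent.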

Lockheed Martin

Software Engineer Intern | Remote • Jun 2022 - Oct 2022

Optimized CRM workflows with JavaScript and RPA, achieving centralized device data framework. Refined Configuration Database, purging redundant records and presenting data-driven insights to executives.

  • Developed CRM workflows achieving centralized device data framework for enhanced enterprise efficiency
  • Implemented RPA for data de-duplication, streamlining processes and elevating data integrity

University of Maryland - College Park

Research Intern | Remote • Jun 2023 - May 2024

Built NLP-driven chatbot for online news engagement using deep learning. Executed text analytics with POS tagging, LIWC, and sentence embeddings. Published research at CHI 2024 conference.

  • Published at CHI 2024 conference
  • Led NLP chatbot development for news reader engagement, conducting studies on human-chatbot dynamics
  • Performed text analytics using POS tagging, LIWC, and clustering on sentence embeddings with Python

City of Austin

Software Engineer Intern | Austin, TX • Jun 2021 - Aug 2021

Improved post-COVID loan processing workflows for small businesses using Python scripting and data visualization to streamline operations.

AT&T

Summer Learning Academy | Austin, TX • Jun 2021 - Aug 2021

As youngest participant, gained exposure to AI, business strategies, and professional development while collaborating on tech-focused initiatives.

Projects

Deep technical work spanning AI systems, LLMs, and generative models

Live Demo | AI-First

Helm

A Jira alternative with AI-first workflows

Natural language ticket management, auto-generated daily briefings, and live meeting capture that converts discussions into tickets automatically. Built an agentic system that reasons over your entire project context.

25+ Callable Tools
Hybrid RAG Layer
Live Meeting Capture
  • Agentic LLM system via OpenAI function calling for contextual reasoning over tickets, GitHub PRs, commits, and transcripts
  • Hybrid RAG combining SQL filtering with vector embeddings across all project data
  • Auto-briefings that synthesize overnight activity, surface blockers, and recommend prioritized actions
OpenAI Function Calling | RAG | Vector Embeddings | React | PostgreSQL

From Scratch | 27.03 PPL

Modern LLM Training Pipeline

253M-parameter model outperforming GPT-2 on perplexity

Frontier-style LLM training pipeline from scratch with modern architectural choices and a complete alignment workflow: Pretrain → SFT → DPO → Verifier.

253M Parameters
-33% vs GPT-2 PPL
4 Training Stages
  • RoPE: Rotary Position Embeddings for better length extrapolation
  • RMSNorm: Root Mean Square LayerNorm for faster, stable training
  • SwiGLU: Gated Linear Units with Swish activation (2-4% better than GELU)
  • Attention Sinks: Learnable tokens for stable generation beyond training context
PyTorch | RoPE | RMSNorm | SwiGLU | DPO | RLHF
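To make two of the components listed above concrete, here is a minimal pure-Python sketch of RMSNorm and of the element-wise gating at the core of SwiGLU. It follows the standard published definitions and is not taken from the project's PyTorch code; the function names, the `eps` default, and the toy inputs are assumptions for illustration.

```python
import math


def rms_norm(x, gain, eps=1e-6):
    """RMSNorm: y_i = gain_i * x_i / sqrt(mean(x^2) + eps).

    Unlike LayerNorm, no mean is subtracted and no bias is added.
    """
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return [g * v / rms for g, v in zip(gain, x)]


def silu(v):
    """Swish / SiLU activation: v * sigmoid(v)."""
    return v / (1.0 + math.exp(-v))


def swiglu_gate(gate, value):
    """Element-wise SwiGLU gating: silu(gate) * value.

    In a full feed-forward block, `gate` and `value` would be two separate
    linear projections of the same input, followed by an output projection.
    """
    return [silu(g) * v for g, v in zip(gate, value)]


hidden = [3.0, 4.0]
normed = rms_norm(hidden, gain=[1.0, 1.0])
print(normed)
print(swiglu_gate(gate=[0.0, 1.0], value=[5.0, 2.0]))
```

After `rms_norm` with unit gain, the vector has a root mean square of approximately 1, which is the property that keeps activations at a stable scale during training.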

From Scratch | MNIST + CIFAR

Diffusion Models: DDPM & DDIM

Denoising diffusion from first principles

Complete DDPM and DDIM implementations in PyTorch without relying on Hugging Face or diffusers. Trained on MNIST and CIFAR-10 with full analysis of speed/quality trade-offs.

2 U-Net Architectures
DDPM + DDIM Samplers
EMA Weight Averaging
  • MNIST U-Net: Cosine noise schedule, sinusoidal time embeddings, self-attention bottleneck
  • CIFAR-10 U-Net: Multi-resolution self-attention, dropout, EMA for stable sampling
  • DDIM Sampling: Deterministic generation with arbitrary step counts (10-1000)
  • Full speed/quality analysis comparing DDPM vs DDIM across step counts
PyTorch | U-Net | DDPM | DDIM | EMA | SLURM/HPC
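The cosine noise schedule mentioned above can be sketched in a few lines. This version assumes the commonly used formulation with a small offset `s = 0.008` and betas clipped at 0.999; the project's exact constants are not stated, so treat the defaults and function names here as assumptions.

```python
import math


def cosine_alpha_bar(t, num_steps, s=0.008):
    """Unnormalized cumulative signal level: cos^2(((t/T + s) / (1 + s)) * pi/2)."""
    return math.cos(((t / num_steps + s) / (1.0 + s)) * math.pi / 2.0) ** 2


def cosine_betas(num_steps, s=0.008, max_beta=0.999):
    """Per-step noise variances: beta_t = 1 - alpha_bar(t+1) / alpha_bar(t)."""
    betas = []
    for t in range(num_steps):
        current = cosine_alpha_bar(t, num_steps, s)
        following = cosine_alpha_bar(t + 1, num_steps, s)
        # Clip so the final steps do not destroy the signal in a single jump.
        betas.append(min(1.0 - following / current, max_beta))
    return betas


betas = cosine_betas(1000)
print(betas[0], betas[500], betas[-1])
```

Compared with a linear schedule, the cosine schedule adds noise more gradually at the start, which tends to matter most for small, low-resolution images such as MNIST digits.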

~90% Acc | Multimodal

TikTok Video Auto-Sorter

Multimodal ML that organizes your saved videos

Automatically categorizes saved TikTok videos into user-defined folders by analyzing visual and audio content. Reduces TikTok's 3-tap save flow to a single confirmation, achieving ~90% accuracy with minimal labeled data.

~90% Accuracy
600+ Videos Sorted
2 Modalities
  • Visual: Sample 5 frames per video → encode with CLIP (ViT-B/32) → 512-d vector
  • Audio: Transcribe with Whisper → encode with CLIP text encoder → 512-d vector
  • Classifier: Two-layer MLP with class-weighted loss for imbalanced categories
  • Interactive labeling UI modeled after TikTok with top-3 predictions and active learning retrain loop
PyTorch | CLIP | Whisper | FastAPI | Transfer Learning
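One common way to derive the class weights for a class-weighted loss is inverse-frequency ("balanced") weighting, sketched below. The source does not say how the weights were actually chosen, so this heuristic, the function name, and the folder labels are all assumptions for illustration.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Inverse-frequency weights: weight_c = n_samples / (n_classes * count_c)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {c: n_samples / (n_classes * count) for c, count in counts.items()}


# Hypothetical folder labels with a strong imbalance.
folders = ["cooking"] * 6 + ["fitness"] * 3 + ["travel"] * 1
weights = balanced_class_weights(folders)
for name, weight in sorted(weights.items()):
    print(name, round(weight, 3))
```

Rare folders receive larger weights, so each mistake on them contributes more to the loss and the classifier is not rewarded for always predicting the most common folder.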

Skills

Programming Languages

Python Java C JavaScript HTML/CSS Ruby C++ PHP

Frontend Development

React.js Node.js HTML/CSS

Backend Development

Flask Django Node.js

Data Science & ML

Pandas NumPy Scikit-learn

Databases

SQL PostgreSQL

Tools & Libraries

Git AWS Google Cloud Platform

Miscellaneous

ARM64 MATLAB

Hobbies & Interests

Deep Learning

I'm passionate about learning and keeping up with the latest advancements in deep learning, from new architectures to real-world applications.

Weightlifting

I love spending time in the gym and hitting new maxes.

Soccer

I've been playing soccer since I could walk. If I'm not working, you can find me on the nearest field!

Family and Friends

I cherish spending quality time with friends and family, whether it's a casual hangout or a special gathering.

Startups

I spend a lot of time working on my startups. When I'm not coding, I'm asking people I'm close with for advice and feedback on my current projects.