Hey there! I’m Tuna [tu-nah]👋
I research generative AI at Virginia Tech, mainly figuring out how to tell image and video models exactly what to do (and why they do it). My code often runs after midnight, so I keep the coffee machine busy and the GPU fans even busier.
What I’m proud of
- CVPR 2024, ICML 2025 (oral), and ICCV 2025 (highlight) papers that tinker with diffusion models, personalization, and interpretability.
- Internships with Amazon AGI and Adobe Firefly, plus fun collabs with Google.
- Co-organizer of the P13N workshop on Personalization in Generative AI at ICCV 2025.
- Deployed image-generation services that millions of people actually used, without anything catching fire.
Why I do this
I want creators to treat generative models like trusty sidekicks, not black boxes. My work mixes theory, experiments, and a dose of engineering pragmatism so the results can ship, not just sit on arXiv.
Outside the lab
You’ll occasionally find fresh paper notes and open-source snippets on my blog or Twitter. Offline, I’m usually staring at my monitor, coffee in hand, while stress-testing newly trained models.
Let’s chat
If you’re into generative AI for vision, or just want to trade GPU tales, drop me a line!
📝 Latest from the Blog
Generating Pixels One by One
Research Interests
Autoregressive Vision Models
Developing next-generation autoregressive architectures for high-fidelity and efficient image and video synthesis.
Controllable Generation
Creating alignment objectives for diffusion and autoregressive models that enable precise user control without compromising generation quality.
Mechanistic Interpretability
Understanding how transformers process visual and textual information through token-level analysis and attention mechanism studies.
Zero-shot Editing
Enabling intuitive image and video manipulation through natural language interfaces and novel training-free approaches.
Recent Updates
I started at Amazon AGI as an Applied Scientist Intern in San Francisco, working on video foundation models.
Featured Publications

CLoRA: A Contrastive Approach to Compose Multiple LoRA Models
ICCV 2025 (Highlight)
CLoRA is a training-free, test-time method that uses contrastive learning to compose multiple concept and style LoRAs simultaneously.
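For a feel of what composing LoRAs means at the weight level, here is a minimal PyTorch sketch of the naive merge that approaches like CLoRA improve upon. Everything here (function name, shapes, scales) is an illustrative assumption; CLoRA itself keeps the LoRAs separate and steers them with a contrastive objective over attention maps, as described in the paper.

```python
import torch

def compose_loras(base_weight, loras, scales=None):
    # Naive weight-space merge of several LoRA deltas (A @ B) into one
    # base projection. This is a generic composition sketch, NOT CLoRA's
    # procedure: CLoRA keeps the LoRAs separate at test time and uses a
    # contrastive objective over attention maps to keep each concept or
    # style in its own region. All names and shapes are illustrative.
    if scales is None:
        scales = [1.0] * len(loras)
    merged = base_weight.clone()
    for (A, B), s in zip(loras, scales):
        merged += s * (A @ B)  # low-rank update; rank = A.shape[1]
    return merged

# Example: one 768x768 projection, two rank-8 LoRAs (concept + style)
W = torch.randn(768, 768)
lora_concept = (torch.randn(768, 8), torch.randn(8, 768))
lora_style = (torch.randn(768, 8), torch.randn(8, 768))
W_merged = compose_loras(W, [lora_concept, lora_style], scales=[0.8, 0.6])
```

The naive merge tends to blur concepts together when LoRAs overlap, which is exactly the failure mode a contrastive, attention-level composition is meant to avoid.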

ConceptAttention: Diffusion Transformers Learn Highly Interpretable Features
ICML 2025 (Oral)
Without requiring additional training, ConceptAttention repurposes the parameters of DiT attention layers to produce highly contextualized concept embeddings. Its key finding: linear projections in the output space of DiT attention layers yield significantly sharper saliency maps than commonly used cross-attention mechanisms.
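As a rough, unofficial illustration of that finding, here is a minimal PyTorch sketch that scores saliency by comparing image and concept tokens in an attention layer's output space rather than reading raw cross-attention weights. All shapes and the `attn_out_proj` module are assumptions for the sketch, not the paper's interface.

```python
import torch
import torch.nn.functional as F

def concept_saliency(image_tokens, concept_tokens, attn_out_proj):
    # Sketch of the core idea: compare image and concept tokens in the
    # OUTPUT space of a DiT attention layer instead of using raw
    # cross-attention scores. Shapes are illustrative assumptions:
    #   image_tokens:   (N_img, d) attention outputs for image patches
    #   concept_tokens: (N_cpt, d) attention outputs for concept words
    img = attn_out_proj(image_tokens)   # project patches to output space
    cpt = attn_out_proj(concept_tokens) # project concepts the same way
    # cosine similarity between every patch and every concept
    sal = F.normalize(img, dim=-1) @ F.normalize(cpt, dim=-1).T
    return sal                          # (N_img, N_cpt) saliency scores

proj = torch.nn.Linear(768, 768)  # stand-in for a DiT output projection
maps = concept_saliency(torch.randn(1024, 768), torch.randn(3, 768), proj)
```

Reshaping each saliency column back to the patch grid gives a per-concept heatmap over the image.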

CONFORM: Contrast is All You Need For High-Fidelity Text-to-Image Diffusion Models
CVPR 2024
Images produced by text-to-image diffusion models do not always faithfully represent the semantic intent of the text prompt: the model may overlook, or entirely fail to produce, certain objects. Recent studies propose various solutions, but they often require functions tailored to each specific failure, leading to sub-optimal results, especially for complex prompts. Our work introduces a novel perspective by tackling this challenge in a contrastive context. Our approach promotes the segregation of objects in attention maps while keeping pairs of related attributes close to each other.
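To make the contrastive framing concrete, here is a hedged, InfoNCE-style sketch over per-token cross-attention maps: related (object, attribute) pairs are pulled together while other tokens act as negatives, encouraging distinct objects to occupy separate regions. The function, shapes, and pairing scheme are illustrative assumptions; the layers, timesteps, and exact objective follow the paper.

```python
import torch
import torch.nn.functional as F

def contrastive_attention_loss(attn_maps, positive_pairs, temperature=0.07):
    # Hypothetical CONFORM-style objective: treat each token's flattened
    # cross-attention map as an embedding, pull related (object, attribute)
    # maps together, and push unrelated tokens apart via InfoNCE.
    #   attn_maps:      (T, H*W) one flattened map per prompt token
    #   positive_pairs: list of (i, j) token indices that should overlap
    z = F.normalize(attn_maps, dim=-1)
    sim = z @ z.T / temperature            # (T, T) pairwise similarities
    loss = 0.0
    for i, j in positive_pairs:
        logits = sim[i].clone()
        logits[i] = float("-inf")          # exclude self-similarity
        loss = loss + F.cross_entropy(logits[None], torch.tensor([j]))
    return loss / max(len(positive_pairs), 1)

# e.g. prompt tokens ["cat", "black", "dog", "white"]:
# pair each attribute with its object, everything else is a negative
maps = torch.rand(4, 256, requires_grad=True)
loss = contrastive_attention_loss(maps, [(0, 1), (2, 3)])
loss.backward()  # in test-time guidance, such a gradient steers the latents
```

Because the objective operates on attention maps, one loss covers the whole family of failures (missing objects, attribute leakage) that previously needed separately tailored fixes.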