news | Enmao Diao

May 1, 2026	OBCache: Optimal Brain KV Cache Pruning for Efficient Long-Context LLM Inference is published in ICML 2026. A Kinetic-Energy Perspective of Flow Matching is published in ICML 2026 (Spotlight).
Apr 7, 2026	From Local to Global: Revisiting Structured Pruning Paradigms for Large Language Models is published in ACL 2026.
Feb 24, 2026	Think Before You Prune: Self-Reflective Structured Pruning for Reasoning Language Models is published in DAC 2026.
Jan 26, 2026	Graph Tokenization for Bridging Graphs and Transformers is published in ICLR 2026.