news | 刁恩茂

May 1, 2026	OBCache: Optimal Brain KV Cache Pruning for Efficient Long-Context LLM Inference published at ICML 2026. A Kinetic-Energy Perspective of Flow Matching published at ICML 2026 (Spotlight).
April 7, 2026	From Local to Global: Revisiting Structured Pruning Paradigms for Large Language Models published at ACL 2026.
February 24, 2026	Think Before You Prune: Self-Reflective Structured Pruning for Reasoning Language Models published at DAC 2026.
January 26, 2026	Graph Tokenization for Bridging Graphs and Transformers published at ICLR 2026.