Beyond the Hype: How I See World Models Evolving in 2025

A summary of my personal opinions on world models in 2025, covering their current state, future prospects, and implications for embodied AI. Discusses 3D modeling approaches, data challenges, research directions, and the role of JEPA-style architectures in their evolution.

October 6, 2025 · 15 min · Nemo

Titans: Learning to Memorize at Test Time

This article introduces Titans, a novel architecture that acts as a meta in-context learner and learns to memorize at test time. It designs a long-term memory module and proposes three variants of Titans (MAC, MAG, MAL), achieving superior performance over Transformers and other baselines, especially on long-context tasks.

February 3, 2025 · 8 min · Nemo

Natural Language Processing: Part B. Modern Approaches

This is the second part of the Natural Language Processing series. It covers modern approaches to natural language processing, including RNNs, VAE-LMs, Transformers, BERT, GPT, GAN-LMs, In-Context Learning, CoT, RLHF, DPO, etc.

November 13, 2024 · 1 min · Nemo