Titans: Learning to Memorize at Test Time

This article introduces Titans, a novel architecture that acts as a meta in-context learner and learns to memorize at test time. Built around a long-term memory module and offered in three variants (MAC, MAG, MAL), Titans outperforms Transformers and other baselines, especially on long-context tasks.

February 3, 2025 · 8 min · Nemo

Life Hacks Series: 1. How to manage your Python Environment

This is the first article in the Life Hacks Series. It covers how to manage your Python environment: installing packages, creating a new environment, cloning an environment, and packing an environment. Special mention goes to Conda-Pack, which made my life a lot easier.

January 15, 2025 · 4 min · Nemo

Machine Learning Series: 5. Hyperparameter Selection

This is the fifth article in the Machine Learning Series. It covers classic approaches to Hyperparameter Selection, including Bayesian Optimization, Gradient Optimization, Random Search, Multi-Armed Bandits and Neural Architecture Search.

January 1, 2025 · 5 min · Nemo

Machine Learning Series: 4. Robust Machine Learning

This is the fourth article in the Machine Learning Series. It covers classic approaches to Robust Machine Learning, including Adversarial Attacks, Adversarial Training, Robust Features, Obfuscated Gradients and Provable Robust Certificates.

December 29, 2024 · 2 min · Nemo

Machine Learning Series: 3. Unsupervised Learning (II)

This is the third article in the Machine Learning Series. It covers the second part of unsupervised learning, including topics like Clustering, Spectral Graph Clustering, SimCLR, SNE and t-SNE.

December 3, 2024 · 7 min · Nemo

Machine Learning Series: 2. Unsupervised Learning (I)

This is the second article in the Machine Learning Series. It covers the first part of unsupervised learning, including topics like Dimension Reduction, PCA, k-NN, LSH and Metric Learning.

December 2, 2024 · 5 min · Nemo

Natural Language Processing: Part B. Modern Approaches

This is the second part of the Natural Language Processing Series. It covers modern approaches in natural language processing, including RNNs, VAE-LMs, Transformer, BERT, GPT, GAN-LMs, In-Context Learning, CoT, RLHF, DPO, etc.

November 13, 2024 · 1 min · Nemo

Machine Learning Series: 1. Optimization, Generalization and Supervised Learning

This is the first article in the Machine Learning Series. It covers the basics of optimization (GD, SGD, SVRG, Mirror Descent, Linear Coupling), generalization (No Free Lunch, PAC Learning, VC Dimension), and supervised learning (Linear Regression, Logistic Regression, Compressed Sensing).

November 9, 2024 · 22 min · Nemo

Natural Language Processing: Part A. Classical Methods

This is the first part of the Natural Language Processing Series. It covers classical methods in natural language processing, including CFG, LSA, HMM, N-gram, Word2Vec, etc.

September 26, 2024 · 1 min · Nemo