Thread: Interpretability with a chance of interesting features

It is human nature to ponder on the mechanism of things- that is how we got the wheel, agriculture, the vaccine for smallpox, the printing press and the transistor, to name a few inventions. This trait is beautifully put into words in Kurt Vonnegut’s 1963 novel Cat’s Cradle

Tiger got to hunt, bird got to fly; Man got to sit and wonder ‘why, why, why?’ Tiger got to sleep, bird got to land; Man got to tell himself he understand.

These lines beautifully capture the essence of humans- our never-ending quest for finding answers to questions. Mechanistic Interpretability tries to find answers to our questions on model internals and behaviour.

Posts in this Thread

A Litmus Test for Features

Through a neural microscope

Enjoy Reading This Article?

Here are some more articles you might like to read next:

  • Devlog 01: The Big Decisions
  • Through a neural microscope
  • A Litmus Test for Features
  • R-CNN
  • Devlog: LLM Pretraining