I think a lot of mechanistic interpretability research should find a home in academic labs because:

  1. Mech interp isn’t very expensive;
  2. Related academic research (e.g., sparsity, pruning) is strong;
  3. Mech interp should grow;
  4. Most academic safety research is less useful than mech interp.