- Four Failed Experiments in LoRA Attribution
  Post-hoc methods fail to untangle multi-LoRA generations.
- Do Model-free RL Agents Represent Each Other's Intent? (Part 1)
  Linear probes decode partner intent from RL hidden states, but architecture determines whether any of it is interpretable.