BRIDGING THE PERFORMANCE-INTERPRETABILITY GAP IN LARGE LANGUAGE MODELS: SPARSE AUTOENCODER DISENTANGLEMENT WITH PERFORMANCE ANCHORING