[https://drive.google.com/file/d/1N1AKKG88xuNHlW6hhrN1vIj-ecNxRV4_/view?usp=sharing](https://drive.google.com/file/d/1N1AKKG88xuNHlW6hhrN1vIj-ecNxRV4_/view?usp=sharing)
Found a task where the LLM’s stated reasoning did not match the actual reasons behind its answers. Studied ways to prompt the LLM to give true information.
Studied how dictionary learning scales with the size of the dictionary. Discovered many new interpretable features in transformers.
https://docs.google.com/presentation/d/1nSOL0pim1w7GKrexbSd_uINJ0wyVowovWW3W65SRstY/edit#slide=id.p
Studied how AIs make strategic decisions in the game of Go (using PyTorch). Worked on classifying structures in the Leela Zero neural network responsible for the strategic move of atari.