Concept encoding in in-context learning of LLMs
A study of the concept encoding phenomenon in in-context learning (ICL) of LLMs, and of how it can be used to interpret ICL success and failure modes.
Paper (soon to be archived): “Concept to Context: Concept Encoding in In-Context Learning of LLMs”
TLDR: We explore a mechanistic account of ICL success and failure modes by measuring how well certain concepts are encoded in intermediate representations.
Why does in-context learning (ICL) succeed on some tasks and fail on others? We study the mechanisms behind these success and failure modes and seek to quantify them. To this end, we introduce concept encoding: the phenomenon in which intermediate representations separate according to the latent concept underlying a task. We find that concept encoding occurs in the Llama 3 8B model, both for realistic latent concepts, such as verbs and nouns in part-of-speech tagging, and for logical operators, such as AND and OR in bitwise operations.
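To make the idea of a latent concept concrete, here is a minimal sketch of how an ICL prompt for the bitwise task might be built. The prompt layout, the `make_prompt` helper, and its parameters are assumptions for illustration only and are not taken from the paper. The point is that the operator never appears in the prompt, so the model has to infer it from the demonstrations.

```python
import random

# Hypothetical latent concepts: each maps a pair of bits to one output bit.
OPERATORS = {
    "AND": lambda a, b: a & b,
    "OR": lambda a, b: a | b,
}


def bits_to_str(bits):
    """Render a list of 0/1 integers as a compact string such as '0110'."""
    return "".join(str(b) for b in bits)


def make_prompt(concept, n_demos=4, n_bits=4, seed=0):
    """Build an ICL prompt whose latent concept is a bitwise operator.

    Returns (prompt, answer): the prompt holds n_demos solved examples plus
    one unsolved query; the answer is the expected completion of the query.
    The layout "x y -> z" is an assumed format, not the paper's.
    """
    rng = random.Random(seed)
    op = OPERATORS[concept]
    lines = []
    answer = None
    for i in range(n_demos + 1):
        x = [rng.randint(0, 1) for _ in range(n_bits)]
        y = [rng.randint(0, 1) for _ in range(n_bits)]
        out = [op(a, b) for a, b in zip(x, y)]
        if i < n_demos:
            # Solved demonstration: the operator itself is never named.
            lines.append(f"{bits_to_str(x)} {bits_to_str(y)} -> {bits_to_str(out)}")
        else:
            # Final query: the model must complete the right-hand side.
            lines.append(f"{bits_to_str(x)} {bits_to_str(y)} ->")
            answer = bits_to_str(out)
    return "\n".join(lines), answer
```

Calling `make_prompt("AND")` and `make_prompt("OR")` with the same seed produces prompts with identical inputs that differ only in their outputs, which isolates the latent concept as the only thing that changes.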
We conjecture that concept decodability, that is, how well a given concept is separated in intermediate representations, can predict ICL performance. Our observations indicate that this holds for both the part-of-speech (POS) tagging task and the bitwise operation tasks.
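One way to picture concept decodability is as the held-out accuracy of a simple probe that predicts the latent concept from a representation. The sketch below assumes a nearest-centroid probe and synthetic Gaussian vectors standing in for intermediate representations. Both are illustrative choices, not the metric or the data used in the paper, and the names `synthetic_reps` and `concept_decodability` are hypothetical.

```python
import math
import random


def synthetic_reps(separation, n_per_concept=200, dim=8, seed=0):
    """Generate labeled vectors for two concepts (stand-ins for real activations).

    The two concepts are shifted apart along the first axis by +/- separation;
    separation=0 means the concepts are indistinguishable.
    """
    rng = random.Random(seed)
    data = []
    for label, sign in ((0, 1.0), (1, -1.0)):
        for _ in range(n_per_concept):
            vec = [rng.gauss(0.0, 1.0) for _ in range(dim)]
            vec[0] += sign * separation
            data.append((vec, label))
    rng.shuffle(data)
    return data


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def concept_decodability(data):
    """Held-out accuracy of a nearest-centroid probe (an assumed proxy metric).

    The first half of the data fits one centroid per concept; the second half
    is classified by its nearest centroid in Euclidean distance.
    """
    split = len(data) // 2
    train, test = data[:split], data[split:]
    centroids = {
        label: centroid([v for v, lab in train if lab == label])
        for label in {lab for _, lab in train}
    }
    correct = 0
    for vec, label in test:
        pred = min(centroids, key=lambda c: math.dist(vec, centroids[c]))
        correct += int(pred == label)
    return correct / len(test)
```

With well-separated synthetic concepts, for example `concept_decodability(synthetic_reps(5.0))`, the score approaches 1, while `synthetic_reps(0.0)` gives a score near chance. With real data, the vectors would be hidden states taken from a chosen layer of the model, and the score could then be compared against ICL accuracy on the same task.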
We run intervention and causal experiments supporting the claim that concept encoding is causally related to ICL performance. Check the paper for details!