Back to Portfolio
ML Research

A Black Box Made Less Opaque (Part 2)

Syntax vs Semantics

A Black Box Made Less Opaque (Part 2)

Objective

Investigate whether models focus on syntax (symbols) or semantics (meaning) using sparse autoencoders.

Motivation

Understand whether models pay attention to style or substance — an important question with implications for both performance and safety.

About this installment

The second installment explores whether models focus on syntax (symbols) or semantics (meaning) using sparse autoencoders.

More in ML Research