Knowledge Extraction

Pulling structured symbolic descriptions out of trained neural networks

The central technical challenge of neurosymbolic AI — and the bottleneck of the cycle at scale.

What it is

Knowledge extraction is the procedure of producing a symbolic description (rules, logic programs, decision trees, or graphs) that approximates the behaviour of a trained neural network. It is step 2 of the neurosymbolic cycle and the source of most of the downstream properties of neurosymbolic AI (NSAI).

Common extraction techniques include:

Oracle querying — treating the network as a black box and probing it to derive logic programs
Distillation — training a smaller “student” network or decision tree to imitate the original
Activation probing — examining internal representations (via tools like Concept Activation Vectors) to recover concept structure
Fixed-point semantics — mapping stable network states onto the semantics of a known logic formalism
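
As a rough illustration of the first technique, oracle querying, the sketch below treats a model purely as a black box, queries it on every boolean input, and writes one rule per input the model accepts. The `toy_network` function, its weights, and the `extract_rules` helper are hypothetical stand-ins invented for this example, not part of any published extraction method or library.

```python
from itertools import product
from typing import Callable, List, Sequence, Tuple

Assignment = Tuple[bool, ...]


def toy_network(x: Assignment) -> bool:
    """Hypothetical stand-in for a trained network: one threshold unit."""
    weights = (2.0, 2.0, -4.0)  # assumed values, chosen only for illustration
    bias = -1.0
    activation = sum(w * float(v) for w, v in zip(weights, x)) + bias
    return activation > 0.0


def extract_rules(oracle: Callable[[Assignment], bool],
                  names: Sequence[str]) -> List[str]:
    """Query the oracle on every boolean input; emit one conjunction per accepted input."""
    rules = []
    for assignment in product([False, True], repeat=len(names)):
        if oracle(assignment):
            literals = [n if v else f"not {n}" for n, v in zip(names, assignment)]
            rules.append(" and ".join(literals))
    return rules


rules = extract_rules(toy_network, ["a", "b", "c"])
for body in rules:
    print(f"out :- {body}")
```

Exhaustive querying needs 2^n oracle calls for n boolean inputs, so this naive form only works for a handful of inputs. Practical extractors sample or prune the query space instead.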

Extraction quality is judged by fidelity — how closely the symbolic description tracks the network it came from.
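
One common way to make fidelity concrete is as an agreement rate: the fraction of inputs on which the symbolic description and the network give the same answer. The sketch below assumes that definition. The `network` and `extracted_rule` functions are hypothetical toy examples, and the rule is deliberately incomplete so that the score falls below 1.

```python
from itertools import product
from typing import Callable, Iterable, Tuple

Assignment = Tuple[bool, ...]


def network(x: Assignment) -> bool:
    """Hypothetical stand-in for a trained network: one threshold unit."""
    weights = (2.0, 2.0, -4.0)  # assumed values, chosen only for illustration
    bias = -1.0
    return sum(w * float(v) for w, v in zip(weights, x)) + bias > 0.0


def extracted_rule(x: Assignment) -> bool:
    """Hypothetical extracted rule 'a and not c'; it misses the case 'b and not c'."""
    a, _b, c = x
    return a and not c


def fidelity(net: Callable[[Assignment], bool],
             rule: Callable[[Assignment], bool],
             inputs: Iterable[Assignment]) -> float:
    """Fraction of inputs on which the rule agrees with the network."""
    points = list(inputs)
    agreements = sum(net(x) == rule(x) for x in points)
    return agreements / len(points)


all_inputs = list(product([False, True], repeat=3))
score = fidelity(network, extracted_rule, all_inputs)
print(f"fidelity = {score}")
```

Fidelity here is measured against the network's outputs, not against ground-truth labels: a rule can track a flawed network perfectly and still be wrong about the task.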

Why it matters

Extraction is the bottleneck of NSAI at scale. Pulling a usable description from a frontier-scale LLM is, in Garcez’s words, “daunting, if not impossible”: the network is simply too large for current techniques. The viable paths are to extract from small parts of the network during training, before scale takes over, or to work with modular architectures that admit per-module extraction.

Once extracted, the symbolic description grants the system the determinacy properties NSAI advertises: provable correctness within the fidelity error, counterfactual reasoning, knowledge reuse across tasks, and out-of-distribution extrapolation through universal quantification.
