# Extreme Learning

**Repository Path**: knifecms/x-learning

## Basic Information

- **Project Name**: Extreme Learning
- **Description**: 极致机器学习 (Extreme Machine Learning)
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2026-05-06
- **Last Updated**: 2026-05-09

## Categories & Tags

**Categories**: Uncategorized
**Tags**: None

## README

# X-Learning Prototype

This repository contains a minimal runnable prototype of the learning idea we developed in conversation: a unified learning mechanism that starts with weak structure, extracts event-like units from a stream, stores unresolved cases as residuals, and turns recurring residual structure into reusable schemas.

For a Chinese overview of the original motivation and prototype framing, see [docs/prototype_idea_zh.md](docs/prototype_idea_zh.md).

The design goal is not benchmark performance. The goal is to make the internal process visible:

- early encounters are costly and uncertain
- repeated structure becomes indexed knowledge
- later encounters become cheaper because prior schemas can be reused
- unresolved observations are not discarded; they are stored and revisited
- event summaries stay centered on the action that likely caused the change
- each interpretation now carries a lightweight explanation trace for inspection
- residual promotion now preserves critical stable structure such as `object_kind`, so similar dynamics from different object families do not collapse into one over-broad schema

## Core Idea

The prototype treats learning as a loop:

1. detect salient change in a sensory stream
2. segment the stream into event candidates
3. try to explain each event with existing schemas
4. store low-compatibility events as residuals
5. promote recurring residuals into new schemas
6. reuse schemas on future events, lowering interpretation cost

This keeps the "infant" and "mature" phases unified. The same mechanism runs throughout; the only thing that changes is how much structure is already available in memory.

## Current Prototype Notes

- action-driven episodes such as `observe -> push -> observe` are now summarized as a single `push` event rather than being split into misleading `observe/change` and `push/no_change` fragments
- spontaneous changes under pure observation are still segmented, so passive novelty is not lost
- interpretation reports include the best schema explanation, including matched changed features and invariant context
- unresolved events now also retain a small ranked set of candidate schema explanations, so we can inspect partial structure reuse instead of treating every residual as a total failure
- probe selection now tracks how many current candidate explanations a probe would help, so the next step can favor gaps that matter across several live hypotheses rather than only the single best partial match
- residual memory now keeps lightweight traces of the leading candidate schema and gap signature for each unresolved event, so repeated explanation failures can be inspected separately from raw event recurrence
- when repeated residuals do promote into a schema, the learner now records whether that schema looks like a fresh residual pattern or a differentiation from a repeatedly strained prior schema
- missing stable fields no longer count as immediate invariant conflicts during matching; they are exposed as `unseen_invariants` in the explanation trace
- the demo now includes a structure-reuse probe where surface features change and some prior fields disappear, testing whether schemas transfer by pattern rather than by exact attribute templates
- compatibility now mixes similarity with explicit penalties for outcome reversals, unexpected extra changes, and critical invariant mismatches such as `object_kind`,
  reducing brittle over-assimilation
- changed-feature compatibility is now support-weighted, so omitting a feature that almost always changes with a schema is treated as a stronger mismatch than omitting a rare side effect
- schemas now also expose a simple split between core changes and side effects, and probe selection can prioritize missing core changes before rarer incidental effects
- residual interpretations now emit probe suggestions, such as inspecting a conflicting invariant, observing a missing stable feature, repeating an action when outcomes disagree, or testing a possible new affordance
- the learner can now execute those probe suggestions in a minimal `ToyProbeWorld` and feed the resulting frames back through the same interpretive loop
- probe attempts, resolutions, and simple resolution-rate metrics are now tracked directly by the learner for early Phase 2 evaluation
- probe metrics are also broken down by probe type, so the prototype can compare the value of `repeat_action`, `observe_feature`, and other strategies
- the repository now includes a small ablation benchmark suite, so each new mechanism can be compared against simpler baselines instead of being judged only by a single demo trace
- that benchmark suite now also carries an explicit protocol registry, so each case records its failure mode, target mechanism, comparison baseline, and expected winner
- probe selection now emits a `policy_hint` and can lean toward historically more effective probe types when multiple next steps are plausible
- that probe-history bias is now evidence-calibrated, so one lucky success does not outweigh a structurally stronger probe candidate before enough local experience accumulates
- probe policy now tracks local success by uncertainty context as well, so `unexpected_change` and `unseen_invariant` situations can learn different preferred probing strategies
- that local tracking is now refined into failure signatures such as `outcome_disagreement:distance` or
  `unseen_invariant:temperature`, allowing the policy to learn probe preferences at a more structural level
- those signatures are now generated automatically from probe and explanation structure rather than being hand-authored case by case
- when exact signatures are too sparse, probe policy now backs off to abstract failure shapes such as `unseen_invariant:observe_feature:scalar_state`
- those feature-family labels now come from lightweight accumulated experience first, with hand-written heuristics only as a fallback when a feature is new
- the abstraction layer now also learns whether a feature usually acts like a descriptor, precondition, controllable effect, or observed effect, and folds that role into probe shapes
- controllability evidence is now tracked explicitly from both ordinary events and executed probes, so role assignment can reflect whether the learner can actually make a feature change rather than only whether it often changes
- the learner now also records action-feature pair tendencies such as `push -> distance` and `pull -> is_open`, giving a first explicit memory of local causal affordance structure
- that local affordance memory now feeds back into probing, so a residual observe-only change in `distance` can trigger a targeted `push` probe instead of only more passive observation
- probe selection now also uses a lightweight information-gain bias: when two schemas both partially fit but imply different actions over the same feature, the learner can choose a `targeted_action` probe to separate them
- probe candidates now carry an explicit expected-elimination estimate, so the learner can favor probes that are likely to rule out more current schema hypotheses instead of only following generic priority or historical success
- probe candidates also now include a tiny anticipated outcome tree, estimating which schemas would remain under a few plausible probe results and letting policy prefer probes with stronger expected candidate compression
- those
  outcome branches are now lightly weighted by real probe history when available, so branch planning can start to reflect what this learner has actually seen rather than staying purely heuristic
- that branch history is now also conditioned on a tiny local context such as `object_kind`, current action, probed feature, learned feature family, and learned feature role, so the learner can separate how similar probes behave under different local structural conditions
- beneath that exact local context, branch planning now also keeps an abstract branch-context shape keyed by action mode, probe context, probe type, feature family, and feature role, so branch memory can generalize without collapsing all object-specific experience together
- the learner now also builds a lightweight learned branch abstraction from field-level branch history, letting it prefer whichever context fields have actually been most predictive of branch outcomes instead of always relying on a fixed hand-authored context template
- that learned abstraction now also considers simple field pairs, but only keeps them when they have enough support and add predictive value beyond their constituent fields, so branch memory stays discriminative without fragmenting on accidental co-occurrence
- when exact value-level context looks too brittle, that abstraction can now back off to field-name or field-pair labels, letting branch memory aggregate over more stable local structure instead of overfitting to one object value
- the learner now also keeps explicit branch-field retention summaries, and uses those long-run signals as a light bias when deciding whether a field or field-pair abstraction should outrank a close value-level key
- when a field-pair abstraction is clearly stronger than the remaining single fields, the learner now stops expanding the abstraction early, keeping local branch context compact instead of appending weak tail labels
- the learner now also tracks which abstractions and abstraction
  labels are repeatedly selected over time, exposing simple stability summaries and using that history as a light tie-break bias when a near-stable abstraction is almost strong enough to win again
- precondition relevance is now tracked both globally and per parent schema, so stable-condition probe focus can stay aligned with the local schema family that is actually under explanatory strain rather than drifting toward a globally frequent but locally irrelevant mismatch
- residual clustering is now also guarded by critical stable invariants before schema promotion, which keeps cross-object analogies inspectable as separate residual hypotheses unless their stable structural context really aligns

## Files

- [x_learning/core.py](x_learning/core.py)
- [x_learning/benchmarks.py](x_learning/benchmarks.py)
- [x_learning/demo.py](x_learning/demo.py)
- [tests/test_core.py](tests/test_core.py)
- [docs/prototype_idea_zh.md](docs/prototype_idea_zh.md)

## Run

```bash
python -m x_learning.demo
```

## Test

```bash
python -m unittest discover -s tests -v
```

## Benchmark

Run the demo to see the current benchmark suite:

```bash
python -m x_learning.demo
```

The benchmark section currently compares the full learner against ablations on:

- curriculum learning cost and schema formation
- structure-reuse transfer under surface variation
- rejection of high-support near-miss events
- probe selection under sparse local history
- probe focus on core missing changes versus rare side effects
- preservation of cross-family separation under otherwise similar local dynamics
- preservation of conditioned same-family separation under repeated stable precondition mismatch
- preservation of schema-conditioned precondition focus when global relevance would otherwise bias the learner toward the wrong stable mismatch

Each benchmark case should now be read as a claim with structure:

- what failure mode it is meant to catch
- which mechanism is supposed to address that failure
- which baseline should lose on that case
- which metric should improve, and in which direction

This is intended to keep future optimization work persuasive rather than ad hoc: every new mechanism should add or extend a benchmark case before the mechanism itself is treated as validated.
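To make the Core Idea concrete, here is a minimal, self-contained sketch of the six-step loop (detect change, segment, explain with schemas, store residuals, promote recurring residuals, reuse). All names, thresholds, and data shapes here are illustrative assumptions, far simpler than the real `x_learning/core.py`; this is not the repository's actual API.

```python
from collections import Counter

PROMOTE_AFTER = 2        # assumed: residual recurrences needed before promotion
COMPAT_THRESHOLD = 0.5   # assumed: minimum compatibility to count as explained

def changed_features(before, after):
    """Step 1-2: detect which features changed between two observed frames."""
    keys = set(before) | set(after)
    return frozenset(k for k in keys if before.get(k) != after.get(k))

def compatibility(schema, event):
    """Step 3: Jaccard overlap between expected and observed changed features."""
    expected, observed = schema["changes"], event["changes"]
    union = expected | observed
    return len(expected & observed) / len(union) if union else 1.0

class Learner:
    def __init__(self):
        self.schemas = []           # promoted, reusable structure
        self.residuals = Counter()  # unresolved (action, changes) signatures

    def observe(self, action, before, after):
        event = {"action": action, "changes": changed_features(before, after)}
        # Steps 3 and 6: try to explain the event with an existing schema.
        candidates = [s for s in self.schemas if s["action"] == action]
        best = max(candidates, key=lambda s: compatibility(s, event), default=None)
        if best is not None and compatibility(best, event) >= COMPAT_THRESHOLD:
            return ("explained", best)
        # Step 4: keep the unresolved event as a residual instead of discarding it.
        key = (action, event["changes"])
        self.residuals[key] += 1
        # Step 5: recurring residual structure is promoted into a new schema.
        if self.residuals[key] >= PROMOTE_AFTER:
            schema = {"action": action, "changes": event["changes"]}
            self.schemas.append(schema)
            del self.residuals[key]
            return ("promoted", schema)
        return ("residual", None)

learner = Learner()
# First push is costly and unresolved, the recurrence promotes a schema,
# and the third encounter is explained cheaply by reuse.
print(learner.observe("push", {"distance": 0}, {"distance": 1})[0])  # residual
print(learner.observe("push", {"distance": 1}, {"distance": 2})[0])  # promoted
print(learner.observe("push", {"distance": 2}, {"distance": 3})[0])  # explained
```

The trace mirrors the stated design goal: the same mechanism runs throughout, and only the amount of structure already in memory changes the cost of interpretation.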
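The benchmark "claim with structure" described above can be sketched as a small protocol registry. Everything in this snippet (`BenchmarkClaim`, `REGISTRY`, the example case and its field values) is a hypothetical illustration of the idea, not the actual contents of `x_learning/benchmarks.py`.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class BenchmarkClaim:
    """One benchmark case expressed as a structured, falsifiable claim."""
    name: str
    failure_mode: str      # what goes wrong without the mechanism
    target_mechanism: str  # which mechanism is supposed to address that failure
    baseline: str          # which ablation should lose on this case
    metric: str            # which metric should improve
    direction: str         # "up" or "down"

# Hypothetical example entry, mirroring the structure-reuse case in the text.
REGISTRY = [
    BenchmarkClaim(
        name="structure_reuse_transfer",
        failure_mode="schemas fail to transfer when surface features vary",
        target_mechanism="pattern-based matching with unseen_invariants",
        baseline="exact attribute-template matching",
        metric="interpretation_cost_on_second_encounter",
        direction="down",
    ),
]

def is_validated(registry):
    """A mechanism only counts as benchmarked if every claim field is filled
    and the expected direction of improvement is explicit."""
    return all(
        claim.failure_mode and claim.target_mechanism and claim.baseline
        and claim.metric and claim.direction in ("up", "down")
        for claim in registry
    )

print(is_validated(REGISTRY))  # True
```

A registry like this keeps optimization work persuasive in exactly the sense the text asks for: a new mechanism without a registered claim is visibly unvalidated.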