# H_u / H_s analysis — qwen3-8b-base · medial · joint

This report compares an idiom dataset with a parallel dataset of literal verb phrases (non-idioms). All entropies are uniform Monte Carlo (MC) averages over each phrase's observed contexts. Scores are unnormalized per-token LM probabilities, reduced either by geometric mean or jointly (this run uses `joint`). See `README.md` and `code/SCORING_MATH.md` for the derivation.

> **How to read these magnitudes** (directions, why H_s is often +inf, the new finite synergy metrics): see [`INTERPRETATION.md`](../INTERPRETATION.md). Quick key: ↑`H_u/H(p)`, ↑`syn_frac`, and ↑`H_s^log` mean **more** synergy; ↑`H_s^reg` and ↑`H_s` (original) mean **less** synergy.

## Configuration

| field | idioms run | non-idioms run |
|---|---|---|
| model | Qwen/Qwen3-8B-Base | Qwen/Qwen3-8B-Base |
| reduction | joint | joint |
| medial_only | True | True |
| dtype | bfloat16 | bfloat16 |
| num_idioms | 18 | 18 |
| dataset | /home/prada/PID_evaluation/data/dataset.tsv | /home/prada/PID_evaluation/data/nonidioms_dataset.tsv |

The bound `H_u + H_s ≥ 2H(p) + 2 log 2 ≥ H(p)` holds for 18/18 idioms and 18/18 non-idioms.
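One way to see why the first inequality should hold is sketched below. It assumes the per-context definitions quoted in the metric captions of this report, with $m = \max(q, r)$, and is not a substitute for the derivation in `code/SCORING_MATH.md`.

$$
\begin{aligned}
\text{If } p > m > 0:\qquad m\,(p - m) &\le \frac{p^{2}}{4} \quad \text{(maximum at } m = p/2\text{)} \\
-\log m - \log(p - m) &\ge -2\log p + 2\log 2 .
\end{aligned}
$$

When $p > m$, $\min\{p, m\} = m$, so the left-hand side is the per-context contribution to `H_u + H_s`. When $p \le m$, the `H_s` term is +inf and the inequality holds trivially. Averaging uniformly over contexts preserves the inequality, which gives the bound on the reported per-phrase values.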

`H_s = +inf` (≥1 non-synergistic slot) for 10/18 idioms and 18/18 non-idioms. A *finite* H_s means **every** context of that phrase is synergistic (p > m everywhere, where m = max(q,r)).
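The effect of a single non-synergistic slot can be illustrated with a minimal sketch. It assumes the per-context definitions quoted in the metric captions of this report; the function name `phrase_metrics` and the toy log-probabilities are hypothetical and are not taken from the project code.

```python
import math

def phrase_metrics(log_p, log_m):
    """Uniform averages over contexts for one phrase (nats).

    log_p[i]: log p in context i; log_m[i]: log m, with m = max(q, r).
    Both inputs are hypothetical toy values in the example below.
    """
    n = len(log_p)
    h = h_u = h_s = h_s_log = 0.0
    n_syn = 0
    for lp, lm in zip(log_p, log_m):
        h += -lp                          # -log p
        h_u += -min(lp, lm)               # -log min{p, m}
        h_s_log += max(0.0, lp - lm)      # max{0, log p - log m}
        if lp > lm:
            n_syn += 1
            # -log(p - m), using log(p - m) = log p + log(1 - m/p)
            h_s += -(lp + math.log(-math.expm1(lm - lp)))
        else:
            h_s = math.inf                # -log 0: one such slot is enough
    return {
        "H": h / n,
        "H_u": h_u / n,
        "H_s": h_s / n,
        "H_s_log": h_s_log / n,
        "syn_frac": n_syn / n,
    }

# Second context has p < m, so the averaged H_s becomes +inf.
mixed = phrase_metrics([-40.0, -42.0, -38.0], [-50.0, -41.0, -45.0])
# Every context has p > m, so H_s stays finite.
all_syn = phrase_metrics([-40.0, -42.0], [-50.0, -47.0])
print(mixed["H_s"], mixed["syn_frac"])
print(all_syn["H_s"], all_syn["syn_frac"])
```

Working in log space avoids underflow, since probabilities around `exp(-50)` are still representable but products of them quickly are not.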

## Per-metric summary (idioms vs non-idioms)

Means and 95% bootstrap CIs (20k resamples, percentile method, phrase-level). Non-finite values are dropped per metric before bootstrapping; the drop count is shown.
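The procedure described above can be sketched as follows. This is an illustrative reimplementation, not the project's code: the function name `bootstrap_mean_ci`, the seed, and the nearest-rank percentile rule are assumptions. The input is the `H_s (orig)` column of the idioms per-phrase table.

```python
import math
import random
import statistics

def bootstrap_mean_ci(values, n_resamples=20_000, alpha=0.05, seed=0):
    """Percentile-bootstrap CI of the mean, resampling phrases with replacement.

    Returns (mean, (lo, hi), n_dropped); mean and the CI are None when fewer
    than two finite values remain.
    """
    finite = [v for v in values if math.isfinite(v)]
    dropped = len(values) - len(finite)
    if len(finite) < 2:
        return None, None, dropped
    rng = random.Random(seed)
    means = sorted(
        statistics.fmean(rng.choices(finite, k=len(finite)))
        for _ in range(n_resamples)
    )
    lo = means[int((alpha / 2) * n_resamples)]
    hi = means[int((1 - alpha / 2) * n_resamples) - 1]
    return statistics.fmean(finite), (lo, hi), dropped

# H_s (orig) for the 18 idioms, in table order; +inf entries are dropped.
inf = math.inf
h_s_idioms = [39.9585, 42.3799, 52.2142, 44.0661, 48.4638, inf, 59.5938, inf,
              56.7828, 49.0418, inf, inf, inf, inf, inf, inf, inf, inf]

mean, ci, dropped = bootstrap_mean_ci(h_s_idioms)
print(f"mean={mean:.4f} dropped={dropped} ci={ci}")
```

The mean (49.0626) and drop count (10) match the `H_s (original)` summary row. The interval depends on the random stream and the percentile convention, so it should land near the reported one rather than reproduce it exactly.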

### H(p)

*base entropy, uniform MC average over contexts (nats). ↓ smaller = phrase more concentrated*

| dataset | N finite | non-finite dropped | mean | median | min | max | 95% CI of mean |
|---|---:|---:|---:|---:|---:|---:|---|
| idioms | 18 | 0 | 49.3274 | 48.6726 | 35.0724 | 59.5938 | [46.2421, 52.2824] |
| non-idioms | 18 | 0 | 55.3342 | 56.3407 | 42.8452 | 71.7074 | [52.2267, 58.4833] |

**Cross-dataset gap** (idioms − non-idioms): Δ = -6.0068  (95% CI [-10.3449, -1.7124]) → idioms < non-idioms, **significant**.

### H_u

*unique / redundant entropy = -log min{p, max(q,r)} (nats); >= H(p)*

| dataset | N finite | non-finite dropped | mean | median | min | max | 95% CI of mean |
|---|---:|---:|---:|---:|---:|---:|---|
| idioms | 18 | 0 | 57.4272 | 56.5118 | 40.9675 | 71.6238 | [54.1295, 60.6560] |
| non-idioms | 18 | 0 | 57.3379 | 58.1767 | 45.6521 | 73.2747 | [54.1319, 60.5791] |

**Cross-dataset gap** (idioms − non-idioms): Δ = 0.0894  (95% CI [-4.4143, 4.6430]) → idioms > non-idioms, not significant.

### H_u / H(p)

*unique-information ratio (>= 1). ↑ bigger = MORE synergy. THE headline metric*

| dataset | N finite | non-finite dropped | mean | median | min | max | 95% CI of mean |
|---|---:|---:|---:|---:|---:|---:|---|
| idioms | 18 | 0 | 1.1682 | 1.1380 | 1.0788 | 1.2947 | [1.1338, 1.2043] |
| non-idioms | 18 | 0 | 1.0366 | 1.0259 | 1.0042 | 1.0979 | [1.0256, 1.0487] |

**Cross-dataset gap** (idioms − non-idioms): Δ = 0.1316  (95% CI [0.0948, 0.1697]) → idioms > non-idioms, **significant**.

### syn_frac

*synergy coverage in [0,1] = fraction of contexts with p > m, where m = max(q,r). ↑ bigger = MORE synergy (most intuitive)*

| dataset | N finite | non-finite dropped | mean | median | min | max | 95% CI of mean |
|---|---:|---:|---:|---:|---:|---:|---|
| idioms | 18 | 0 | 0.8333 | 0.8000 | 0.6000 | 1.0000 | [0.7556, 0.9111] |
| non-idioms | 18 | 0 | 0.5556 | 0.6000 | 0.2000 | 0.8000 | [0.4556, 0.6556] |

**Cross-dataset gap** (idioms − non-idioms): Δ = 0.2778  (95% CI [0.1556, 0.4111]) → idioms > non-idioms, **significant**.
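A cross-dataset gap of this kind could be computed as in the sketch below. The report does not spell out the procedure, so two things here are assumptions: the two datasets are resampled independently at the phrase level, and "significant" means the 95% percentile interval excludes zero. The function name `bootstrap_gap_ci` is hypothetical; the inputs are the `syn_frac` columns of the per-phrase tables.

```python
import random
import statistics

def bootstrap_gap_ci(a, b, n_resamples=20_000, alpha=0.05, seed=0):
    """Percentile-bootstrap CI for mean(a) - mean(b).

    Assumption: a and b are resampled independently, with replacement.
    Returns (delta, (lo, hi), significant).
    """
    rng = random.Random(seed)
    diffs = sorted(
        statistics.fmean(rng.choices(a, k=len(a)))
        - statistics.fmean(rng.choices(b, k=len(b)))
        for _ in range(n_resamples)
    )
    lo = diffs[int((alpha / 2) * n_resamples)]
    hi = diffs[int((1 - alpha / 2) * n_resamples) - 1]
    delta = statistics.fmean(a) - statistics.fmean(b)
    # Assumed criterion: the interval does not contain zero.
    return delta, (lo, hi), not (lo <= 0.0 <= hi)

# syn_frac per phrase, grouped by value (order does not affect the result).
idioms_syn = [1.0] * 8 + [0.8] * 5 + [0.6] * 5
non_idioms_syn = [0.8] * 6 + [0.6] * 5 + [0.4] * 4 + [0.2] * 3

delta, ci, significant = bootstrap_gap_ci(idioms_syn, non_idioms_syn)
print(f"delta={delta:.4f} ci={ci} significant={significant}")
```

The point estimate reproduces Δ = 0.2778. The interval will differ slightly from the reported one depending on the random stream and on whether the original analysis pairs phrases across datasets.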

### H_s^log

*log-space synergy = mean max{0, log p - log m} (nats). ↑ bigger = MORE synergy; finite always*

| dataset | N finite | non-finite dropped | mean | median | min | max | 95% CI of mean |
|---|---:|---:|---:|---:|---:|---:|---|
| idioms | 18 | 0 | 8.0998 | 6.6453 | 3.8424 | 13.6951 | [6.5686, 9.6600] |
| non-idioms | 18 | 0 | 2.0037 | 1.5680 | 0.2101 | 5.0984 | [1.4042, 2.6461] |

**Cross-dataset gap** (idioms − non-idioms): Δ = 6.0961  (95% CI [4.4248, 7.8081]) → idioms > non-idioms, **significant**.

### H_s^log / H(p)

*log-space synergy ratio. ↑ bigger = MORE synergy*

| dataset | N finite | non-finite dropped | mean | median | min | max | 95% CI of mean |
|---|---:|---:|---:|---:|---:|---:|---|
| idioms | 18 | 0 | 0.1682 | 0.1380 | 0.0788 | 0.2947 | [0.1338, 0.2043] |
| non-idioms | 18 | 0 | 0.0366 | 0.0259 | 0.0042 | 0.0979 | [0.0256, 0.0487] |

**Cross-dataset gap** (idioms − non-idioms): Δ = 0.1316  (95% CI [0.0948, 0.1697]) → idioms > non-idioms, **significant**.
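This gap matches the `H_u / H(p)` gap digit for digit. That is expected from the per-context identity −log min{p, m} = −log p + max{0, log p − log m}, which gives `H_u = H(p) + H_s^log` and therefore `H_u/H(p) = 1 + H_s^log/H(p)`. The short check below confirms this on a few rows copied from the per-phrase tables; the helper name `max_identity_residual` is hypothetical.

```python
def max_identity_residual(rows):
    """Largest |H_u - (H(p) + H_s^log)| over rows of (H(p), H_u, H_s^log)."""
    return max(abs(h_u - (h + h_s_log)) for h, h_u, h_s_log in rows.values())

# (H(p), H_u, H_s^log) copied from the per-phrase tables in this report.
rows = {
    "strike a chord": (39.9585, 51.7329, 11.7744),
    "rock the boat": (35.0724, 40.9675, 5.8951),
    "mean business": (50.6344, 54.6229, 3.9885),
    "cut hair": (52.0824, 57.1808, 5.0984),
    "strike a drum": (58.3868, 58.6309, 0.2441),
}

residual = max_identity_residual(rows)
print(f"max residual = {residual:.1e}")  # limited only by 4-decimal rounding
```

In practice this means `H_u / H(p)` and `H_s^log / H(p)` carry the same information, offset by 1, so only one of them needs to be reported as a headline number.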

### H_s^log signed

*signed log-space synergy = mean(log p - log m); can be negative (net anti-synergistic)*

| dataset | N finite | non-finite dropped | mean | median | min | max | 95% CI of mean |
|---|---:|---:|---:|---:|---:|---:|---|
| idioms | 18 | 0 | 7.5847 | 6.4630 | -0.6401 | 13.6951 | [5.7665, 9.4046] |
| non-idioms | 18 | 0 | 0.1055 | -0.2339 | -3.7096 | 3.6296 | [-0.9898, 1.2007] |

**Cross-dataset gap** (idioms − non-idioms): Δ = 7.4792  (95% CI [5.3282, 9.6420]) → idioms > non-idioms, **significant**.

### H_s^reg

*regularized H_s (eps-floored, finite, continuous). ↑ bigger = LESS synergy*

| dataset | N finite | non-finite dropped | mean | median | min | max | 95% CI of mean |
|---|---:|---:|---:|---:|---:|---:|---|
| idioms | 18 | 0 | 50.1147 | 49.1077 | 36.0297 | 59.5938 | [46.9903, 53.0898] |
| non-idioms | 18 | 0 | 57.5231 | 58.0673 | 44.7382 | 74.4969 | [54.5026, 60.6378] |

**Cross-dataset gap** (idioms − non-idioms): Δ = -7.4084  (95% CI [-11.7867, -3.1207]) → idioms < non-idioms, **significant**.

### H_s^reg / H(p)

*regularized synergy ratio (>= 1). ↑ bigger = LESS synergy*

| dataset | N finite | non-finite dropped | mean | median | min | max | 95% CI of mean |
|---|---:|---:|---:|---:|---:|---:|---|
| idioms | 18 | 0 | 1.0160 | 1.0164 | 1.0000 | 1.0410 | [1.0091, 1.0231] |
| non-idioms | 18 | 0 | 1.0407 | 1.0399 | 1.0162 | 1.0794 | [1.0317, 1.0501] |

**Cross-dataset gap** (idioms − non-idioms): Δ = -0.0247  (95% CI [-0.0365, -0.0131]) → idioms < non-idioms, **significant**.

### H_s (original)

*synergy entropy = -log max{0, p - max(q,r)} (nats); +inf if ANY slot non-synergistic (mostly +inf)*

| dataset | N finite | non-finite dropped | mean | median | min | max | 95% CI of mean |
|---|---:|---:|---:|---:|---:|---:|---|
| idioms | 8 | 10 | 49.0626 | 48.7528 | 39.9585 | 59.5938 | [44.6943, 53.6241] |
| non-idioms | 0 | 18 | — | — | — | — | (no finite values) |

**Cross-dataset gap**: insufficient finite values in one dataset.

### H_s / H(p)

*original synergy ratio (mostly +inf; use H_s^log or syn_frac instead)*

| dataset | N finite | non-finite dropped | mean | median | min | max | 95% CI of mean |
|---|---:|---:|---:|---:|---:|---:|---|
| idioms | 8 | 10 | 1.0006 | 1.0000 | 1.0000 | 1.0033 | [1.0000, 1.0014] |
| non-idioms | 0 | 18 | — | — | — | — | (no finite values) |

**Cross-dataset gap**: insufficient finite values in one dataset.

## Per-phrase detail

### Idioms (sorted by H_u/H, descending)

In the column headers below, `/H` abbreviates `/ H(p)`.

| phrase | H(p) | H_u | H_u/H | syn_frac | H_s^log | H_s^log/H | H_s^reg | H_s^reg/H | H_s (orig) | n_idiom | n_head | n_non |
|---|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|
| strike a chord | 39.9585 | 51.7329 | 1.2947 | 1.00 | 11.7744 | 0.2947 | 39.9585 | 1.0000 | 39.9585 | 5 | 5 | 5 |
| cut corners | 42.3795 | 54.6695 | 1.2900 | 1.00 | 12.2900 | 0.2900 | 42.3799 | 1.0000 | 42.3799 | 5 | 5 | 5 |
| break the mold | 52.2142 | 65.9093 | 1.2623 | 1.00 | 13.6951 | 0.2623 | 52.2142 | 1.0000 | 52.2142 | 5 | 5 | 5 |
| bite the dust | 44.0659 | 55.2332 | 1.2534 | 1.00 | 11.1672 | 0.2534 | 44.0661 | 1.0000 | 44.0661 | 5 | 5 | 5 |
| call the shots | 48.4637 | 60.7092 | 1.2527 | 1.00 | 12.2456 | 0.2527 | 48.4638 | 1.0000 | 48.4638 | 5 | 5 | 5 |
| spill the beans | 46.4255 | 57.7904 | 1.2448 | 0.80 | 11.3649 | 0.2448 | 47.3466 | 1.0198 | +inf | 5 | 5 | 5 |
| turn tail | 59.5938 | 71.6238 | 1.2019 | 1.00 | 12.0300 | 0.2019 | 59.5938 | 1.0000 | 59.5938 | 5 | 5 | 5 |
| rock the boat | 35.0724 | 40.9675 | 1.1681 | 0.80 | 5.8951 | 0.1681 | 36.0297 | 1.0273 | +inf | 5 | 5 | 5 |
| clear the air | 56.7030 | 65.1913 | 1.1497 | 1.00 | 8.4882 | 0.1497 | 56.7828 | 1.0014 | 56.7828 | 5 | 5 | 5 |
| have a ball | 48.8814 | 55.0601 | 1.1264 | 1.00 | 6.1786 | 0.1264 | 49.0418 | 1.0033 | 49.0418 | 5 | 5 | 5 |
| lose ground | 57.8436 | 64.9556 | 1.1230 | 0.80 | 7.1119 | 0.1230 | 58.7649 | 1.0159 | +inf | 5 | 5 | 5 |
| get the sack | 46.9970 | 52.5207 | 1.1175 | 0.60 | 5.5237 | 0.1175 | 48.8521 | 1.0395 | +inf | 5 | 5 | 5 |
| lead the field | 48.2326 | 53.2129 | 1.1033 | 0.80 | 4.9803 | 0.1033 | 49.1737 | 1.0195 | +inf | 5 | 5 | 5 |
| run the show | 53.4115 | 58.8127 | 1.1011 | 0.60 | 5.4012 | 0.1011 | 55.2627 | 1.0347 | +inf | 5 | 5 | 5 |
| pull strings | 55.6862 | 60.8649 | 1.0930 | 0.80 | 5.1787 | 0.0930 | 56.6206 | 1.0168 | +inf | 5 | 5 | 5 |
| make waves | 45.4763 | 49.3187 | 1.0845 | 0.60 | 3.8424 | 0.0845 | 47.3395 | 1.0410 | +inf | 5 | 5 | 5 |
| raise hell | 55.8539 | 60.4949 | 1.0831 | 0.60 | 4.6410 | 0.0831 | 57.6964 | 1.0330 | +inf | 5 | 5 | 5 |
| mean business | 50.6344 | 54.6229 | 1.0788 | 0.60 | 3.9885 | 0.0788 | 52.4777 | 1.0364 | +inf | 5 | 5 | 5 |

### Non-idioms (sorted by H_u/H, descending)

| phrase | H(p) | H_u | H_u/H | syn_frac | H_s^log | H_s^log/H | H_s^reg | H_s^reg/H | H_s (orig) | n_idiom | n_head | n_non |
|---|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|
| cut hair | 52.0824 | 57.1808 | 1.0979 | 0.80 | 5.0984 | 0.0979 | 53.0813 | 1.0192 | +inf | 5 | 5 | 5 |
| call the police | 57.7300 | 61.7194 | 1.0691 | 0.80 | 3.9893 | 0.0691 | 58.6795 | 1.0164 | +inf | 5 | 5 | 5 |
| build the boat | 42.8452 | 45.6521 | 1.0655 | 0.60 | 2.8069 | 0.0655 | 44.7382 | 1.0442 | +inf | 5 | 5 | 5 |
| raise children | 58.1920 | 61.6800 | 1.0599 | 0.80 | 3.4880 | 0.0599 | 59.4181 | 1.0211 | +inf | 5 | 5 | 5 |
| tie knots | 57.3572 | 60.6969 | 1.0582 | 0.60 | 3.3397 | 0.0582 | 59.2043 | 1.0322 | +inf | 5 | 5 | 5 |
| break the window | 59.4545 | 62.6200 | 1.0532 | 0.80 | 3.1655 | 0.0532 | 60.4239 | 1.0163 | +inf | 5 | 5 | 5 |
| clear the table | 56.3184 | 59.0227 | 1.0480 | 0.80 | 2.7043 | 0.0480 | 57.4105 | 1.0194 | +inf | 5 | 5 | 5 |
| make lunch | 44.7433 | 46.4206 | 1.0375 | 0.40 | 1.6772 | 0.0375 | 47.5176 | 1.0620 | +inf | 5 | 5 | 5 |
| eat the apple | 55.4916 | 57.0194 | 1.0275 | 0.60 | 1.5278 | 0.0275 | 57.4552 | 1.0354 | +inf | 5 | 5 | 5 |
| lose keys | 64.5103 | 66.0789 | 1.0243 | 0.80 | 1.5686 | 0.0243 | 65.5570 | 1.0162 | +inf | 5 | 5 | 5 |
| see the show | 56.3631 | 57.7224 | 1.0241 | 0.40 | 1.3593 | 0.0241 | 59.1715 | 1.0498 | +inf | 5 | 5 | 5 |
| turn dials | 71.7074 | 73.2747 | 1.0219 | 0.40 | 1.5674 | 0.0219 | 74.4969 | 1.0389 | +inf | 5 | 5 | 5 |
| get a present | 47.4197 | 48.3747 | 1.0201 | 0.40 | 0.9550 | 0.0201 | 50.3691 | 1.0622 | +inf | 5 | 5 | 5 |
| throw a ball | 53.2087 | 54.2396 | 1.0194 | 0.20 | 1.0310 | 0.0194 | 56.8940 | 1.0693 | +inf | 5 | 5 | 5 |
| spill the water | 52.7849 | 53.6427 | 1.0162 | 0.60 | 0.8577 | 0.0162 | 55.2111 | 1.0460 | +inf | 5 | 5 | 5 |
| lead the meeting | 59.9479 | 60.4243 | 1.0079 | 0.60 | 0.4764 | 0.0079 | 62.4043 | 1.0410 | +inf | 5 | 5 | 5 |
| remember details | 47.4718 | 47.6819 | 1.0044 | 0.20 | 0.2101 | 0.0044 | 51.2421 | 1.0794 | +inf | 5 | 5 | 5 |
| strike a drum | 58.3868 | 58.6309 | 1.0042 | 0.20 | 0.2441 | 0.0042 | 62.1409 | 1.0643 | +inf | 5 | 5 | 5 |

