# H_u / H_s analysis — qwen3-8b · full · joint

Idioms compared against a parallel dataset of literal verb phrases (non-idioms). All entropies are uniform Monte Carlo averages over each phrase's observed contexts. Scores are unnormalized LM probabilities, reduced over tokens either by geometric mean or jointly; this run uses the `joint` reduction (see Configuration). See `README.md` / `code/SCORING_MATH.md` for the derivation.

> **How to read these magnitudes** (directions, why H_s is +inf, the new finite synergy metrics): see [`INTERPRETATION.md`](../INTERPRETATION.md). Quick key — ↑`H_u/H(p)`, ↑`syn_frac`, ↑`H_s^log` mean **more** synergy; ↑`H_s^reg` and the original ↑`H_s` mean **less** synergy.

## Configuration

| field | idioms run | non-idioms run |
|---|---|---|
| model | Qwen/Qwen3-8B | Qwen/Qwen3-8B |
| reduction | joint | joint |
| medial_only | False | False |
| dtype | bfloat16 | bfloat16 |
| num_idioms | 18 | 18 |
| dataset | /home/prada/PID_evaluation/data/dataset.tsv | /home/prada/PID_evaluation/data/nonidioms_dataset.tsv |

The bound `H_u + H_s ≥ 2H(p) + 2 log 2 ≥ H(p)` (natural log, nats) holds for 18/18 idioms and 18/18 non-idioms. It is satisfied trivially wherever `H_s = +inf`.
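
One way to see where the first inequality comes from is sketched here. It assumes the per-context definitions quoted in the metric sections (`H_u = -log min{p, m}` and `H_s = -log(p - m)`, with `m = max(q,r)`) and a synergistic context `p > m`. It is not necessarily the derivation given in `code/SCORING_MATH.md`.

$$
\begin{aligned}
H_u + H_s &= -\log m - \log(p - m) = -\log\bigl(m\,(p - m)\bigr) \\
          &\ge -\log\frac{p^{2}}{4} = -2\log p + 2\log 2 = 2H(p) + 2\log 2 ,
\end{aligned}
$$

The inequality uses `m(p - m) ≤ p²/4` for `0 < m < p` (AM-GM, with equality at `m = p/2`). Averaging over contexts preserves it, and any context with `p ≤ m` makes the left-hand side `+inf`. The second inequality follows from `H(p) ≥ 0`.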

`H_s = +inf` (at least one non-synergistic context) for 16/18 idioms and 18/18 non-idioms. A *finite* `H_s` means **every** context of that phrase is synergistic, i.e. `p > max(q,r)` everywhere.
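
The sketch below shows how the per-phrase quantities could be averaged from per-context scores, following the formulas quoted in the metric sections. The function name `phrase_metrics`, the `(log_p, log_q, log_r)` input layout, and the dictionary keys are assumptions made for illustration, not the project's actual code. `H_s^reg` is left out because its eps-floor is not specified in this report.

```python
import math

def phrase_metrics(contexts):
    """Average per-context quantities for one phrase (illustrative only).

    contexts: iterable of (log_p, log_q, log_r) natural-log scores,
    one triple per observed context (hypothetical input layout).
    """
    contexts = list(contexts)
    n = len(contexts)
    h_p = h_u = h_s = h_s_log = h_s_signed = 0.0
    n_syn = 0
    for log_p, log_q, log_r in contexts:
        log_m = max(log_q, log_r)       # m = max(q, r)
        gap = log_p - log_m             # log p - log m
        h_p += -log_p
        h_u += -min(log_p, log_m)       # -log min{p, m}
        h_s_log += max(0.0, gap)        # clipped log-space synergy
        h_s_signed += gap               # signed variant, may go negative
        if gap > 0.0:
            n_syn += 1
            # -log(p - m) = -log p - log(1 - m/p), evaluated in log space
            h_s += -log_p - math.log(-math.expm1(-gap))
        else:
            h_s = math.inf              # one non-synergistic context makes H_s infinite
    return {
        "H_p": h_p / n,
        "H_u": h_u / n,
        "syn_frac": n_syn / n,
        "H_s_log": h_s_log / n,
        "H_s_log_signed": h_s_signed / n,
        "H_s": h_s / n,
    }

# Toy contexts with made-up probabilities; both are synergistic (p > m).
demo = [
    (math.log(0.4), math.log(0.1), math.log(0.2)),
    (math.log(0.3), math.log(0.05), math.log(0.1)),
]
metrics = phrase_metrics(demo)

# Adding one context with p < m drives H_s to +inf and lowers syn_frac.
mixed = phrase_metrics(demo + [(math.log(0.1), math.log(0.3), math.log(0.2))])
```

Working from log scores keeps `-log(p - m)` stable when `p` and `m` are both tiny, which matters for joint sequence probabilities of the magnitude reported here.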

## Per-metric summary (idioms vs non-idioms)

Means and 95% bootstrap CIs (20k resamples, percentile method, phrase-level). Non-finite values are dropped per metric before bootstrapping; the drop count is shown.
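
As a rough illustration of the procedure described above, the following sketch computes a percentile bootstrap CI of the mean after dropping non-finite values. The function name `bootstrap_mean_ci`, the seed, and the percentile indexing are assumptions; the actual analysis script may differ in these details.

```python
import math
import random

def bootstrap_mean_ci(values, n_resamples=20_000, alpha=0.05, seed=0):
    """Percentile bootstrap CI of the mean over phrases (illustrative only).

    Returns (mean, lo, hi, dropped), where `dropped` counts non-finite inputs.
    """
    finite = [v for v in values if math.isfinite(v)]
    dropped = len(values) - len(finite)
    if not finite:
        return None, None, None, dropped
    rng = random.Random(seed)
    n = len(finite)
    # Resample phrases with replacement and record the mean of each resample.
    means = sorted(sum(rng.choices(finite, k=n)) / n for _ in range(n_resamples))
    lo = means[round(alpha / 2 * n_resamples)]
    hi = means[round((1 - alpha / 2) * n_resamples) - 1]
    return sum(finite) / n, lo, hi, dropped

# H_s (original) for idioms, taken from the per-phrase table: 2 finite, 16 +inf.
h_s_idioms = [47.9989, 51.9934] + [math.inf] * 16
mean, lo, hi, dropped = bootstrap_mean_ci(h_s_idioms)
```

With only two finite values, every resample mean is one of three numbers, so the interval collapses to the observed minimum and maximum. This is why the `H_s (original)` row for idioms reports a CI equal to its min and max.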

### H(p)

*base entropy, uniform MC average over contexts (nats). ↓ smaller = phrase more concentrated*

| dataset | N finite | non-finite dropped | mean | median | min | max | 95% CI of mean |
|---|---:|---:|---:|---:|---:|---:|---|
| idioms | 18 | 0 | 55.3450 | 56.7370 | 46.9836 | 61.6924 | [53.3296, 57.2491] |
| non-idioms | 18 | 0 | 62.7486 | 62.6159 | 55.0481 | 70.4325 | [60.6214, 64.8251] |

**Cross-dataset gap** (idioms − non-idioms): Δ = -7.4036  (95% CI [-10.2650, -4.5376]) → idioms < non-idioms, **significant**.

### H_u

*unique / redundant entropy = -log min{p, max(q,r)} (nats); >= H(p)*

| dataset | N finite | non-finite dropped | mean | median | min | max | 95% CI of mean |
|---|---:|---:|---:|---:|---:|---:|---|
| idioms | 18 | 0 | 61.2728 | 61.8561 | 52.6485 | 67.6109 | [59.4123, 63.0551] |
| non-idioms | 18 | 0 | 64.1101 | 64.6548 | 55.1385 | 70.8711 | [62.0559, 66.0894] |

**Cross-dataset gap** (idioms − non-idioms): Δ = -2.8373  (95% CI [-5.5105, -0.1111]) → idioms < non-idioms, **significant**.

### H_u / H(p)

*unique-information ratio (>= 1). ↑ bigger = MORE synergy. THE headline metric*

| dataset | N finite | non-finite dropped | mean | median | min | max | 95% CI of mean |
|---|---:|---:|---:|---:|---:|---:|---|
| idioms | 18 | 0 | 1.1092 | 1.0945 | 1.0371 | 1.2158 | [1.0874, 1.1327] |
| non-idioms | 18 | 0 | 1.0221 | 1.0193 | 1.0016 | 1.0711 | [1.0150, 1.0305] |

**Cross-dataset gap** (idioms − non-idioms): Δ = 0.0871  (95% CI [0.0636, 0.1119]) → idioms > non-idioms, **significant**.

### syn_frac

*synergy coverage in [0,1] = fraction of contexts with p > m, where m = max(q,r). ↑ bigger = MORE synergy (most intuitive)*

| dataset | N finite | non-finite dropped | mean | median | min | max | 95% CI of mean |
|---|---:|---:|---:|---:|---:|---:|---|
| idioms | 18 | 0 | 0.7778 | 0.8000 | 0.3000 | 1.0000 | [0.6833, 0.8611] |
| non-idioms | 18 | 0 | 0.3333 | 0.2000 | 0.1000 | 0.7000 | [0.2500, 0.4222] |

**Cross-dataset gap** (idioms − non-idioms): Δ = 0.4444  (95% CI [0.3167, 0.5667]) → idioms > non-idioms, **significant**.
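
The cross-dataset gap lines can be approximated with a two-sample bootstrap. The sketch below assumes that the two datasets are resampled independently at the phrase level and that "significant" means the 95% CI excludes zero. Neither assumption is confirmed by this report, and `bootstrap_gap_ci` is a hypothetical name. The inputs are the `syn_frac` columns of the per-phrase tables.

```python
import random

def bootstrap_gap_ci(a, b, n_resamples=20_000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for mean(a) - mean(b) (illustrative only)."""
    rng = random.Random(seed)

    def mean(xs):
        return sum(xs) / len(xs)

    delta = mean(a) - mean(b)
    # Resample each dataset independently (an assumption, see lead-in).
    diffs = sorted(
        mean(rng.choices(a, k=len(a))) - mean(rng.choices(b, k=len(b)))
        for _ in range(n_resamples)
    )
    lo = diffs[round(alpha / 2 * n_resamples)]
    hi = diffs[round((1 - alpha / 2) * n_resamples) - 1]
    significant = lo > 0 or hi < 0      # CI excludes zero (assumed criterion)
    return delta, lo, hi, significant

# syn_frac per phrase, copied from the per-phrase detail tables.
idioms_syn = [1.00, 0.80, 0.80, 0.90, 0.90, 0.90, 0.90, 0.80, 0.80,
              1.00, 0.60, 0.90, 0.80, 0.90, 0.50, 0.80, 0.40, 0.30]
non_idioms_syn = [0.70, 0.30, 0.60, 0.60, 0.60, 0.50, 0.50, 0.40, 0.20,
                  0.20, 0.20, 0.20, 0.20, 0.20, 0.20, 0.20, 0.10, 0.10]

delta, lo, hi, significant = bootstrap_gap_ci(idioms_syn, non_idioms_syn)
```

The interval endpoints depend on the random seed and on the exact percentile convention, so they should only be expected to land near the reported values, not to match them digit for digit.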

### H_s^log

*log-space synergy = mean max{0, log p - log m}, with m = max(q,r) (nats). ↑ bigger = MORE synergy; always finite*

| dataset | N finite | non-finite dropped | mean | median | min | max | 95% CI of mean |
|---|---:|---:|---:|---:|---:|---:|---|
| idioms | 18 | 0 | 5.9278 | 5.4042 | 2.1451 | 10.3585 | [4.8381, 7.0678] |
| non-idioms | 18 | 0 | 1.3615 | 1.1785 | 0.0904 | 3.9812 | [0.9418, 1.8349] |

**Cross-dataset gap** (idioms − non-idioms): Δ = 4.5663  (95% CI [3.3730, 5.7892]) → idioms > non-idioms, **significant**.

### H_s^log / H(p)

*log-space synergy ratio. ↑ bigger = MORE synergy*

| dataset | N finite | non-finite dropped | mean | median | min | max | 95% CI of mean |
|---|---:|---:|---:|---:|---:|---:|---|
| idioms | 18 | 0 | 0.1092 | 0.0945 | 0.0371 | 0.2158 | [0.0874, 0.1327] |
| non-idioms | 18 | 0 | 0.0221 | 0.0193 | 0.0016 | 0.0711 | [0.0150, 0.0305] |

**Cross-dataset gap** (idioms − non-idioms): Δ = 0.0871  (95% CI [0.0636, 0.1119]) → idioms > non-idioms, **significant**.
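
The numbers in this section mirror the `H_u / H(p)` section shifted by exactly 1. This follows from the quoted definitions: per context, `-log min{p, m} = -log p + max{0, log p - log m}`, so `H_u = H(p) + H_s^log` and `H_u / H(p) = 1 + H_s^log / H(p)`. The short check below confirms this on three rows copied from the per-phrase tables; `identity_residual` is a hypothetical helper name.

```python
def identity_residual(h_p, h_u, h_s_log):
    """Return |H_u - (H(p) + H_s^log)| for one phrase."""
    return abs(h_u - (h_p + h_s_log))

# (phrase, H(p), H_u, H_s^log) copied from the per-phrase detail tables.
rows = [
    ("strike a chord",   47.9936, 58.3521, 10.3585),
    ("make waves",       57.8370, 59.9821, 2.1451),
    ("remember details", 55.0481, 55.1385, 0.0904),
]

# Allow for the four-decimal rounding used in the tables.
TOL = 2e-4
residuals = {phrase: identity_residual(h_p, h_u, h_s_log)
             for phrase, h_p, h_u, h_s_log in rows}
all_consistent = all(r < TOL for r in residuals.values())
```

Because the two ratios differ by a constant, their cross-dataset gaps and CI widths coincide, and either one can serve as the headline number.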

### H_s^log signed

*signed log-space synergy = mean(log p - log m); can be negative (net anti-synergistic)*

| dataset | N finite | non-finite dropped | mean | median | min | max | 95% CI of mean |
|---|---:|---:|---:|---:|---:|---:|---|
| idioms | 18 | 0 | 4.8181 | 4.8030 | -5.1795 | 10.3585 | [3.0756, 6.4248] |
| non-idioms | 18 | 0 | -1.9453 | -2.2716 | -6.4235 | 3.5640 | [-3.1426, -0.7131] |

**Cross-dataset gap** (idioms − non-idioms): Δ = 6.7634  (95% CI [4.6101, 8.8091]) → idioms > non-idioms, **significant**.

### H_s^reg

*regularized H_s (eps-floored, finite, continuous). ↑ bigger = LESS synergy*

| dataset | N finite | non-finite dropped | mean | median | min | max | 95% CI of mean |
|---|---:|---:|---:|---:|---:|---:|---|
| idioms | 18 | 0 | 56.4475 | 58.1380 | 47.4888 | 62.8073 | [54.2011, 58.5402] |
| non-idioms | 18 | 0 | 65.9048 | 65.5370 | 57.4388 | 74.5812 | [63.7076, 68.0875] |

**Cross-dataset gap** (idioms − non-idioms): Δ = -9.4573  (95% CI [-12.5586, -6.4056]) → idioms < non-idioms, **significant**.

### H_s^reg / H(p)

*regularized synergy ratio (>= 1). ↑ bigger = LESS synergy*

| dataset | N finite | non-finite dropped | mean | median | min | max | 95% CI of mean |
|---|---:|---:|---:|---:|---:|---:|---|
| idioms | 18 | 0 | 1.0195 | 1.0166 | 1.0001 | 1.0563 | [1.0131, 1.0267] |
| non-idioms | 18 | 0 | 1.0505 | 1.0561 | 1.0256 | 1.0762 | [1.0439, 1.0568] |

**Cross-dataset gap** (idioms − non-idioms): Δ = -0.0310  (95% CI [-0.0402, -0.0212]) → idioms < non-idioms, **significant**.

### H_s (original)

*synergy entropy = -log max{0, p - max(q,r)} (nats); +inf if ANY context is non-synergistic, which is the case for most phrases*

| dataset | N finite | non-finite dropped | mean | median | min | max | 95% CI of mean |
|---|---:|---:|---:|---:|---:|---:|---|
| idioms | 2 | 16 | 49.9961 | 49.9961 | 47.9989 | 51.9934 | [47.9989, 51.9934] |
| non-idioms | 0 | 18 | — | — | — | — | (no finite values) |

**Cross-dataset gap**: insufficient finite values in one dataset.

### H_s / H(p)

*original synergy ratio (mostly +inf; use H_s^log or syn_frac instead)*

| dataset | N finite | non-finite dropped | mean | median | min | max | 95% CI of mean |
|---|---:|---:|---:|---:|---:|---:|---|
| idioms | 2 | 16 | 1.0010 | 1.0010 | 1.0001 | 1.0019 | [1.0001, 1.0019] |
| non-idioms | 0 | 18 | — | — | — | — | (no finite values) |

**Cross-dataset gap**: insufficient finite values in one dataset.

## Per-phrase detail

### Idioms (sorted by H_u/H, descending)

| phrase | H(p) | H_u | H_u/H | syn_frac | H_s^log | H_s^log/H | H_s^reg | H_s^reg/H | H_s (orig) | n_idiom | n_head | n_non |
|---|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|
| strike a chord | 47.9936 | 58.3521 | 1.2158 | 1.00 | 10.3585 | 0.2158 | 47.9989 | 1.0001 | 47.9989 | 10 | 10 | 10 |
| break the mold | 53.3031 | 62.7322 | 1.1769 | 0.80 | 9.4291 | 0.1769 | 54.2326 | 1.0174 | +inf | 10 | 10 | 10 |
| cut corners | 49.4818 | 58.0839 | 1.1738 | 0.80 | 8.6021 | 0.1738 | 50.4067 | 1.0187 | +inf | 10 | 10 | 10 |
| call the shots | 51.8565 | 59.9725 | 1.1565 | 0.90 | 8.1161 | 0.1565 | 52.3806 | 1.0101 | +inf | 10 | 10 | 10 |
| bite the dust | 58.7131 | 67.3883 | 1.1478 | 0.90 | 8.6752 | 0.1478 | 59.1747 | 1.0079 | +inf | 10 | 10 | 10 |
| spill the beans | 56.6166 | 64.9785 | 1.1477 | 0.90 | 8.3619 | 0.1477 | 57.0937 | 1.0084 | +inf | 10 | 10 | 10 |
| rock the boat | 46.9836 | 52.6485 | 1.1206 | 0.90 | 5.6649 | 0.1206 | 47.4888 | 1.0108 | +inf | 10 | 10 | 10 |
| clear the air | 59.1815 | 65.7746 | 1.1114 | 0.80 | 6.5932 | 0.1114 | 60.1906 | 1.0171 | +inf | 10 | 10 | 10 |
| turn tail | 61.6924 | 67.6109 | 1.0959 | 0.80 | 5.9186 | 0.0959 | 62.8073 | 1.0181 | +inf | 10 | 10 | 10 |
| lead the field | 51.8955 | 56.7264 | 1.0931 | 1.00 | 4.8308 | 0.0931 | 51.9934 | 1.0019 | 51.9934 | 10 | 10 | 10 |
| lose ground | 56.8273 | 61.9707 | 1.0905 | 0.60 | 5.1434 | 0.0905 | 58.6711 | 1.0324 | +inf | 10 | 10 | 10 |
| have a ball | 56.8973 | 61.7416 | 1.0851 | 0.90 | 4.8443 | 0.0851 | 57.6048 | 1.0124 | +inf | 10 | 10 | 10 |
| pull strings | 59.8587 | 64.7706 | 1.0821 | 0.80 | 4.9119 | 0.0821 | 60.8200 | 1.0161 | +inf | 10 | 10 | 10 |
| mean business | 52.0365 | 55.5459 | 1.0674 | 0.90 | 3.5093 | 0.0674 | 52.8327 | 1.0153 | +inf | 10 | 10 | 10 |
| get the sack | 58.9753 | 62.5110 | 1.0600 | 0.50 | 3.5357 | 0.0600 | 61.3140 | 1.0397 | +inf | 10 | 10 | 10 |
| raise hell | 59.4137 | 62.6002 | 1.0536 | 0.80 | 3.1865 | 0.0536 | 60.5278 | 1.0188 | +inf | 10 | 10 | 10 |
| run the show | 56.6468 | 59.5199 | 1.0507 | 0.40 | 2.8731 | 0.0507 | 59.4225 | 1.0490 | +inf | 10 | 10 | 10 |
| make waves | 57.8370 | 59.9821 | 1.0371 | 0.30 | 2.1451 | 0.0371 | 61.0950 | 1.0563 | +inf | 10 | 10 | 10 |

### Non-idioms (sorted by H_u/H, descending)

| phrase | H(p) | H_u | H_u/H | syn_frac | H_s^log | H_s^log/H | H_s^reg | H_s^reg/H | H_s (orig) | n_idiom | n_head | n_non |
|---|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|
| call the police | 56.0024 | 59.9836 | 1.0711 | 0.70 | 3.9812 | 0.0711 | 57.4388 | 1.0256 | +inf | 10 | 10 | 10 |
| build the boat | 56.2608 | 58.5481 | 1.0407 | 0.30 | 2.2872 | 0.0407 | 59.4849 | 1.0573 | +inf | 10 | 10 | 10 |
| break the window | 63.6291 | 66.1147 | 1.0391 | 0.60 | 2.4855 | 0.0391 | 65.6485 | 1.0317 | +inf | 10 | 10 | 10 |
| cut hair | 62.1064 | 64.2968 | 1.0353 | 0.60 | 2.1904 | 0.0353 | 64.1365 | 1.0327 | +inf | 10 | 10 | 10 |
| clear the table | 62.1322 | 64.2344 | 1.0338 | 0.60 | 2.1023 | 0.0338 | 64.1074 | 1.0318 | +inf | 10 | 10 | 10 |
| raise children | 63.0995 | 65.0129 | 1.0303 | 0.50 | 1.9134 | 0.0303 | 65.4907 | 1.0379 | +inf | 10 | 10 | 10 |
| tie knots | 66.7942 | 68.2626 | 1.0220 | 0.50 | 1.4684 | 0.0220 | 69.2856 | 1.0373 | +inf | 10 | 10 | 10 |
| turn dials | 69.3639 | 70.8711 | 1.0217 | 0.40 | 1.5072 | 0.0217 | 72.4277 | 1.0442 | +inf | 10 | 10 | 10 |
| get a present | 64.0269 | 65.3492 | 1.0207 | 0.20 | 1.3223 | 0.0207 | 67.7114 | 1.0575 | +inf | 10 | 10 | 10 |
| make lunch | 57.9255 | 58.9603 | 1.0179 | 0.20 | 1.0348 | 0.0179 | 61.6108 | 1.0636 | +inf | 10 | 10 | 10 |
| see the show | 60.4304 | 61.3628 | 1.0154 | 0.20 | 0.9324 | 0.0154 | 64.2043 | 1.0625 | +inf | 10 | 10 | 10 |
| eat the apple | 65.4969 | 66.3884 | 1.0136 | 0.20 | 0.8915 | 0.0136 | 69.1835 | 1.0563 | +inf | 10 | 10 | 10 |
| throw a ball | 65.9650 | 66.6863 | 1.0109 | 0.20 | 0.7213 | 0.0109 | 69.6595 | 1.0560 | +inf | 10 | 10 | 10 |
| lead the meeting | 59.5685 | 60.0128 | 1.0075 | 0.20 | 0.4443 | 0.0075 | 63.2822 | 1.0623 | +inf | 10 | 10 | 10 |
| strike a drum | 61.7028 | 62.1012 | 1.0065 | 0.20 | 0.3984 | 0.0065 | 65.5834 | 1.0629 | +inf | 10 | 10 | 10 |
| lose keys | 69.4901 | 69.9032 | 1.0059 | 0.20 | 0.4131 | 0.0059 | 73.2058 | 1.0535 | +inf | 10 | 10 | 10 |
| spill the water | 70.4325 | 70.7548 | 1.0046 | 0.10 | 0.3223 | 0.0046 | 74.5812 | 1.0589 | +inf | 10 | 10 | 10 |
| remember details | 55.0481 | 55.1385 | 1.0016 | 0.10 | 0.0904 | 0.0016 | 59.2447 | 1.0762 | +inf | 10 | 10 | 10 |