# H_u / H_s analysis — gemma2-9b · medial · joint

This report compares an idiom dataset with a parallel dataset of literal verb phrases (non-idioms). All entropies are uniform Monte Carlo (MC) averages over each phrase's observed contexts and are reported in nats; scores are unnormalized per-token LM probabilities, reduced by geometric mean or, as in this run, jointly. See `README.md` / `code/SCORING_MATH.md` for the derivation.

> **How to read these magnitudes** (directions, why H_s is +inf, the new finite synergy metrics): see [`INTERPRETATION.md`](../INTERPRETATION.md). Quick key — ↑`H_u/H(p)`, ↑`syn_frac`, ↑`H_s^log` mean **more** synergy; ↑`H_s^reg` and the original ↑`H_s` mean **less** synergy.

## Configuration

| field | idioms run | non-idioms run |
|---|---|---|
| model | google/gemma-2-9b | google/gemma-2-9b |
| reduction | joint | joint |
| medial_only | True | True |
| dtype | bfloat16 | bfloat16 |
| num_idioms | 18 | 18 |
| dataset | /home/prada/PID_evaluation/data/dataset.tsv | /home/prada/PID_evaluation/data/nonidioms_dataset.tsv |

Bound `H_u + H_s ≥ 2H(p) + 2log2 ≥ H(p)` holds for 18/18 idioms and 18/18 non-idioms.
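The bound can be seen per context: when p > m = max(q,r), the two arguments m and p - m sum to p, so their product is at most p²/4, which gives -log m - log(p - m) ≥ -2 log p + 2 log 2; when p ≤ m, H_s is +inf and the bound is trivial. Averaging over contexts preserves the inequality. A minimal sketch that re-checks it on a few rows copied from the per-phrase tables of this report (the helper name `bound_holds` is hypothetical and is not taken from the project's code):

```python
import math

# (phrase, H(p), H_u, H_s) in nats, copied from the per-phrase tables.
ROWS = [
    ("cut corners", 49.2264, 64.9477, 49.2264),
    ("raise hell", 60.6754, 66.5900, 60.7928),
    ("cut hair", 57.2746, 64.1077, 57.2872),
    ("lose ground", 62.3357, 74.6578, math.inf),  # H_s = +inf row
]


def bound_holds(h_p, h_u, h_s):
    """Return True if H_u + H_s >= 2*H(p) + 2*log(2).

    The check is trivially True when H_s is +inf.
    """
    return h_u + h_s >= 2.0 * h_p + 2.0 * math.log(2.0)


results = {name: bound_holds(h_p, h_u, h_s) for name, h_p, h_u, h_s in ROWS}
print(results)
```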

`H_s = +inf` (at least one non-synergistic context, i.e. p ≤ max(q,r)) for 8/18 idioms and 17/18 non-idioms. A *finite* H_s means **every** context of that phrase is synergistic (p > max(q,r) everywhere).
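The per-context definitions quoted in the metric captions of this report can be sketched as follows. This is an illustration under stated assumptions, not the project's implementation: the function name `phrase_metrics`, its argument layout, and the dictionary keys are hypothetical. The form of the eps floor for `H_s^reg` is not stated in this report; the relative floor max{p - m, eps·p} with eps = 0.01 is a guess that happens to reproduce the 4.6052-nat offset between `H_s^reg` and `H(p)` in the two non-idiom rows with syn_frac = 0.

```python
import math


def phrase_metrics(log_p, log_q, log_r, eps=0.01):
    """Average per-context quantities for one phrase (all values in nats).

    log_p, log_q, log_r are equal-length lists of log-probabilities, one
    entry per observed context. eps is an ASSUMED relative floor for H_s^reg.
    """
    n = len(log_p)
    tot = {"H_p": 0.0, "H_u": 0.0, "H_s": 0.0,
           "H_s_log": 0.0, "H_s_log_signed": 0.0, "H_s_reg": 0.0}
    n_syn = 0
    for lp, lq, lr in zip(log_p, log_q, log_r):
        lm = max(lq, lr)                   # log m, with m = max(q, r)
        gap = lp - lm                      # log p - log m
        tot["H_p"] += -lp
        tot["H_u"] += -min(lp, lm)         # -log min{p, m}
        tot["H_s_log"] += max(0.0, gap)
        tot["H_s_log_signed"] += gap
        if gap > 0.0:                      # synergistic context: p > m
            n_syn += 1
            # log(p - m) = log p + log(1 - exp(-gap)), computed stably.
            log_diff = lp + math.log(-math.expm1(-gap))
        else:
            log_diff = -math.inf           # max{0, p - m} = 0
        tot["H_s"] += -log_diff            # becomes +inf after one such context
        # ASSUMED regularization: floor p - m at eps * p.
        tot["H_s_reg"] += -max(log_diff, lp + math.log(eps))
    out = {key: value / n for key, value in tot.items()}
    out["syn_frac"] = n_syn / n
    return out


# Toy input: the first context is synergistic, the second is not.
demo = phrase_metrics([-2.0, -3.0], [-4.0, -2.5], [-5.0, -6.0])
print(demo)
```

Working in log space avoids underflow when the joint probabilities are as small as the entropies in the tables imply.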

## Per-metric summary (idioms vs non-idioms)

Means and 95% bootstrap CIs (20k resamples, percentile method, phrase-level). Non-finite values are dropped per metric before bootstrapping; the drop count is shown.
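A minimal sketch of a phrase-level percentile bootstrap, assuming plain resampling with replacement and simple order-statistic percentiles. The function names, the seed, and the choice to resample the two datasets independently for the gap are assumptions, not details confirmed by this report. The sketch also assumes that "significant" means the 95% CI of the gap excludes 0, which is consistent with every gap line below.

```python
import math
import random


def bootstrap_mean_ci(values, n_boot=20_000, alpha=0.05, seed=0):
    """Return (mean, (lo, hi), n_dropped) using a percentile bootstrap."""
    rng = random.Random(seed)
    finite = [v for v in values if math.isfinite(v)]  # drop +inf / nan first
    n = len(finite)
    means = sorted(sum(rng.choices(finite, k=n)) / n for _ in range(n_boot))
    lo = means[int(alpha / 2 * n_boot)]
    hi = means[int((1 - alpha / 2) * n_boot) - 1]
    return sum(finite) / n, (lo, hi), len(values) - n


def bootstrap_gap_ci(a, b, n_boot=20_000, alpha=0.05, seed=0):
    """Return (delta, (lo, hi)) for mean(a) - mean(b); a and b are finite lists."""
    rng = random.Random(seed)
    gaps = sorted(
        sum(rng.choices(a, k=len(a))) / len(a)
        - sum(rng.choices(b, k=len(b))) / len(b)
        for _ in range(n_boot)
    )
    lo = gaps[int(alpha / 2 * n_boot)]
    hi = gaps[int((1 - alpha / 2) * n_boot) - 1]
    return sum(a) / len(a) - sum(b) / len(b), (lo, hi)


# syn_frac columns copied from the per-phrase tables of this report.
idiom_syn = [1.0] * 10 + [0.8] * 3 + [0.6] * 5
nonidiom_syn = [1.0, 0.6, 0.8, 0.8, 0.6, 0.4, 0.6, 0.4, 0.4,
                0.4, 0.6, 0.4, 0.4, 0.8, 0.8, 0.2, 0.0, 0.0]

mean, (lo, hi), dropped = bootstrap_mean_ci(idiom_syn)
delta, (gap_lo, gap_hi) = bootstrap_gap_ci(idiom_syn, nonidiom_syn)
significant = gap_lo > 0.0 or gap_hi < 0.0  # assumed meaning of "significant"
print(mean, (lo, hi), dropped, delta, (gap_lo, gap_hi), significant)
```

Exact interval endpoints depend on the random seed and on the percentile convention, so they are not expected to match the tables digit for digit. When only one finite value remains, every resample is identical and the interval collapses to a single point.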

### H(p)

*base entropy, uniform MC average over contexts (nats). ↓ smaller = idiom more concentrated*

| dataset | N finite | non-finite dropped | mean | median | min | max | 95% CI of mean |
|---|---:|---:|---:|---:|---:|---:|---|
| idioms | 18 | 0 | 57.5099 | 60.7351 | 42.0472 | 69.0590 | [54.3978, 60.3951] |
| non-idioms | 18 | 0 | 66.7677 | 66.3452 | 48.2443 | 85.7114 | [62.6151, 70.9262] |

**Cross-dataset gap** (idioms − non-idioms): Δ = -9.2578  (95% CI [-14.3553, -4.1754]) → idioms < non-idioms, **significant**.

### H_u

*unique / redundant entropy = -log min{p, max(q,r)} (nats); >= H(p)*

| dataset | N finite | non-finite dropped | mean | median | min | max | 95% CI of mean |
|---|---:|---:|---:|---:|---:|---:|---|
| idioms | 18 | 0 | 67.2416 | 66.8336 | 48.4116 | 82.9853 | [63.8965, 70.3949] |
| non-idioms | 18 | 0 | 69.4203 | 68.4838 | 53.0844 | 92.5047 | [65.4550, 73.6985] |

**Cross-dataset gap** (idioms − non-idioms): Δ = -2.1787  (95% CI [-7.4417, 2.9910]) → idioms < non-idioms, not significant.

### H_u / H(p)

*unique-information ratio (>= 1). ↑ bigger = MORE synergy. THE headline metric*

| dataset | N finite | non-finite dropped | mean | median | min | max | 95% CI of mean |
|---|---:|---:|---:|---:|---:|---:|---|
| idioms | 18 | 0 | 1.1730 | 1.1629 | 1.0585 | 1.3194 | [1.1392, 1.2086] |
| non-idioms | 18 | 0 | 1.0413 | 1.0294 | 1.0000 | 1.1193 | [1.0263, 1.0579] |

**Cross-dataset gap** (idioms − non-idioms): Δ = 0.1317  (95% CI [0.0945, 0.1712]) → idioms > non-idioms, **significant**.

### syn_frac

*synergy coverage in [0,1] = fraction of contexts with p > m, where m = max(q,r). ↑ bigger = MORE synergy (most intuitive)*

| dataset | N finite | non-finite dropped | mean | median | min | max | 95% CI of mean |
|---|---:|---:|---:|---:|---:|---:|---|
| idioms | 18 | 0 | 0.8556 | 1.0000 | 0.6000 | 1.0000 | [0.7778, 0.9333] |
| non-idioms | 18 | 0 | 0.5111 | 0.5000 | 0.0000 | 1.0000 | [0.3889, 0.6333] |

**Cross-dataset gap** (idioms − non-idioms): Δ = 0.3444  (95% CI [0.2000, 0.4889]) → idioms > non-idioms, **significant**.

### H_s^log

*log-space synergy = mean max{0, log p - log m} (nats), with m = max(q,r). ↑ bigger = MORE synergy; always finite*

| dataset | N finite | non-finite dropped | mean | median | min | max | 95% CI of mean |
|---|---:|---:|---:|---:|---:|---:|---|
| idioms | 18 | 0 | 9.7317 | 9.1084 | 3.5962 | 17.8077 | [7.9586, 11.5535] |
| non-idioms | 18 | 0 | 2.6526 | 1.9338 | 0.0000 | 6.8330 | [1.7222, 3.6750] |

**Cross-dataset gap** (idioms − non-idioms): Δ = 7.0791  (95% CI [5.0423, 9.1618]) → idioms > non-idioms, **significant**.

### H_s^log / H(p)

*log-space synergy ratio. ↑ bigger = MORE synergy*

| dataset | N finite | non-finite dropped | mean | median | min | max | 95% CI of mean |
|---|---:|---:|---:|---:|---:|---:|---|
| idioms | 18 | 0 | 0.1730 | 0.1629 | 0.0585 | 0.3194 | [0.1392, 0.2086] |
| non-idioms | 18 | 0 | 0.0413 | 0.0294 | 0.0000 | 0.1193 | [0.0263, 0.0579] |

**Cross-dataset gap** (idioms − non-idioms): Δ = 0.1317  (95% CI [0.0945, 0.1712]) → idioms > non-idioms, **significant**.
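This gap and its CI coincide with those reported for `H_u / H(p)`, which follows from the definitions: per context, -log min{p, m} = -log p + max{0, log p - log m}, so H_u = H(p) + H_s^log and H_s^log / H(p) = H_u / H(p) - 1. A minimal sketch that checks the identity on rows copied from the per-phrase tables of this report (the helper name `identity_residual` is hypothetical):

```python
# (phrase, H(p), H_u, H_s^log) in nats, copied from the per-phrase tables.
ROWS = [
    ("cut corners", 49.2264, 64.9477, 15.7212),
    ("get the sack", 61.4294, 65.0256, 3.5962),
    ("cut hair", 57.2746, 64.1077, 6.8330),
    ("strike a drum", 76.7524, 76.7524, 0.0000),
]


def identity_residual(h_p, h_u, h_s_log):
    """Return H_u - (H(p) + H_s^log); zero up to table rounding."""
    return h_u - (h_p + h_s_log)


residuals = {name: identity_residual(h_p, h_u, h_s_log)
             for name, h_p, h_u, h_s_log in ROWS}
print(residuals)
```

Residuals on the order of 1e-4 are expected because the table values are rounded to four decimals.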

### H_s^log signed

*signed log-space synergy = mean(log p - log m); can be negative (net anti-synergistic)*

| dataset | N finite | non-finite dropped | mean | median | min | max | 95% CI of mean |
|---|---:|---:|---:|---:|---:|---:|---|
| idioms | 18 | 0 | 8.6309 | 8.8300 | 0.2266 | 17.8077 | [6.3979, 10.9056] |
| non-idioms | 18 | 0 | -0.9686 | 0.4677 | -10.4199 | 6.8330 | [-3.0488, 1.0082] |

**Cross-dataset gap** (idioms − non-idioms): Δ = 9.5994  (95% CI [6.5973, 12.6773]) → idioms > non-idioms, **significant**.

### H_s^reg

*regularized H_s (eps-floored, finite, continuous). ↑ bigger = LESS synergy*

| dataset | N finite | non-finite dropped | mean | median | min | max | 95% CI of mean |
|---|---:|---:|---:|---:|---:|---:|---|
| idioms | 18 | 0 | 58.1951 | 60.8324 | 43.8895 | 69.0595 | [55.1164, 61.0493] |
| non-idioms | 18 | 0 | 69.1507 | 68.7596 | 50.0866 | 86.6331 | [64.8872, 73.3658] |

**Cross-dataset gap** (idioms − non-idioms): Δ = -10.9555  (95% CI [-16.1057, -5.8051]) → idioms < non-idioms, **significant**.

### H_s^reg / H(p)

*regularized synergy ratio (>= 1). ↑ bigger = LESS synergy*

| dataset | N finite | non-finite dropped | mean | median | min | max | 95% CI of mean |
|---|---:|---:|---:|---:|---:|---:|---|
| idioms | 18 | 0 | 1.0123 | 1.0016 | 1.0000 | 1.0438 | [1.0056, 1.0195] |
| non-idioms | 18 | 0 | 1.0360 | 1.0389 | 1.0002 | 1.0619 | [1.0282, 1.0436] |

**Cross-dataset gap** (idioms − non-idioms): Δ = -0.0237  (95% CI [-0.0340, -0.0131]) → idioms < non-idioms, **significant**.

### H_s (original)

*synergy entropy = -log max{0, p - max(q,r)} (nats); +inf if ANY context is non-synergistic (mostly +inf)*

| dataset | N finite | non-finite dropped | mean | median | min | max | 95% CI of mean |
|---|---:|---:|---:|---:|---:|---:|---|
| idioms | 10 | 8 | 57.3529 | 58.3977 | 49.2264 | 69.0595 | [53.6263, 61.1164] |
| non-idioms | 1 | 17 | 57.2872 | 57.2872 | 57.2872 | 57.2872 | [57.2872, 57.2872] |

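**Cross-dataset gap** (idioms − non-idioms): Δ = 0.0657  (95% CI [-3.6290, 3.8622]) → idioms > non-idioms, not significant. With only 1 finite non-idiom value, the interval reflects resampling of the idioms alone, so this comparison is not informative.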

### H_s / H(p)

*original synergy ratio (mostly +inf; use H_s^log or syn_frac instead)*

| dataset | N finite | non-finite dropped | mean | median | min | max | 95% CI of mean |
|---|---:|---:|---:|---:|---:|---:|---|
| idioms | 10 | 8 | 1.0004 | 1.0000 | 1.0000 | 1.0019 | [1.0000, 1.0008] |
| non-idioms | 1 | 17 | 1.0002 | 1.0002 | 1.0002 | 1.0002 | [1.0002, 1.0002] |

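**Cross-dataset gap** (idioms − non-idioms): Δ = 0.0002  (95% CI [-0.0002, 0.0006]) → idioms > non-idioms, not significant. With only 1 finite non-idiom value, the interval reflects resampling of the idioms alone, so this comparison is not informative.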

## Per-phrase detail

### Idioms (sorted by H_u/H, descending)

| phrase | H(p) | H_u | H_u/H | syn_frac | H_s^log | H_s^log/H | H_s^reg | H_s^reg/H | H_s (orig) | n_idiom | n_head | n_non |
|---|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|
| cut corners | 49.2264 | 64.9477 | 1.3194 | 1.00 | 15.7212 | 0.3194 | 49.2264 | 1.0000 | 49.2264 | 5 | 5 | 5 |
| break the mold | 56.0027 | 73.8104 | 1.3180 | 1.00 | 17.8077 | 0.3180 | 56.0027 | 1.0000 | 56.0027 | 5 | 5 | 5 |
| strike a chord | 51.8615 | 65.9115 | 1.2709 | 1.00 | 14.0500 | 0.2709 | 51.8615 | 1.0000 | 51.8615 | 5 | 5 | 5 |
| call the shots | 51.2159 | 64.9682 | 1.2685 | 1.00 | 13.7522 | 0.2685 | 51.2160 | 1.0000 | 51.2160 | 5 | 5 | 5 |
| turn tail | 69.0590 | 82.9853 | 1.2017 | 1.00 | 13.9263 | 0.2017 | 69.0595 | 1.0000 | 69.0595 | 5 | 5 | 5 |
| lose ground | 62.3357 | 74.6578 | 1.1977 | 0.80 | 12.3221 | 0.1977 | 63.2567 | 1.0148 | +inf | 5 | 5 | 5 |
| have a ball | 51.3008 | 60.9986 | 1.1890 | 1.00 | 9.6977 | 0.1890 | 51.3014 | 1.0000 | 51.3014 | 5 | 5 | 5 |
| clear the air | 60.7948 | 71.4339 | 1.1750 | 1.00 | 10.6391 | 0.1750 | 60.8720 | 1.0013 | 60.8720 | 5 | 5 | 5 |
| make waves | 51.7355 | 60.7582 | 1.1744 | 0.60 | 9.0227 | 0.1744 | 53.5839 | 1.0357 | +inf | 5 | 5 | 5 |
| rock the boat | 42.0472 | 48.4116 | 1.1514 | 0.60 | 6.3645 | 0.1514 | 43.8895 | 1.0438 | +inf | 5 | 5 | 5 |
| bite the dust | 61.9487 | 71.1428 | 1.1484 | 1.00 | 9.1941 | 0.1484 | 61.9545 | 1.0001 | 61.9545 | 5 | 5 | 5 |
| spill the beans | 61.2027 | 69.6687 | 1.1383 | 1.00 | 8.4659 | 0.1383 | 61.2422 | 1.0006 | 61.2422 | 5 | 5 | 5 |
| pull strings | 62.9543 | 69.8976 | 1.1103 | 0.80 | 6.9432 | 0.1103 | 63.8761 | 1.0146 | +inf | 5 | 5 | 5 |
| lead the field | 57.3648 | 63.5397 | 1.1076 | 0.60 | 6.1749 | 0.1076 | 59.2571 | 1.0330 | +inf | 5 | 5 | 5 |
| run the show | 62.2449 | 68.5246 | 1.1009 | 0.60 | 6.2797 | 0.1009 | 64.0937 | 1.0297 | +inf | 5 | 5 | 5 |
| raise hell | 60.6754 | 66.5900 | 1.0975 | 1.00 | 5.9147 | 0.0975 | 60.7928 | 1.0019 | 60.7928 | 5 | 5 | 5 |
| mean business | 61.7788 | 67.0771 | 1.0858 | 0.80 | 5.2983 | 0.0858 | 62.7322 | 1.0154 | +inf | 5 | 5 | 5 |
| get the sack | 61.4294 | 65.0256 | 1.0585 | 0.60 | 3.5962 | 0.0585 | 63.2942 | 1.0304 | +inf | 5 | 5 | 5 |

### Non-idioms (sorted by H_u/H, descending)

| phrase | H(p) | H_u | H_u/H | syn_frac | H_s^log | H_s^log/H | H_s^reg | H_s^reg/H | H_s (orig) | n_idiom | n_head | n_non |
|---|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|
| cut hair | 57.2746 | 64.1077 | 1.1193 | 1.00 | 6.8330 | 0.1193 | 57.2872 | 1.0002 | 57.2872 | 5 | 5 | 5 |
| build the boat | 48.2443 | 53.0844 | 1.1003 | 0.60 | 4.8401 | 0.1003 | 50.0866 | 1.0382 | +inf | 5 | 5 | 5 |
| call the police | 63.2014 | 68.7161 | 1.0873 | 0.80 | 5.5147 | 0.0873 | 64.1913 | 1.0157 | +inf | 5 | 5 | 5 |
| turn dials | 85.7114 | 92.5047 | 1.0793 | 0.80 | 6.7933 | 0.0793 | 86.6331 | 1.0108 | +inf | 5 | 5 | 5 |
| raise children | 67.6469 | 71.8304 | 1.0618 | 0.60 | 4.1834 | 0.0618 | 69.9445 | 1.0340 | +inf | 5 | 5 | 5 |
| tie knots | 69.6636 | 73.5416 | 1.0557 | 0.40 | 3.8780 | 0.0557 | 72.4272 | 1.0397 | +inf | 5 | 5 | 5 |
| clear the table | 65.5453 | 68.2514 | 1.0413 | 0.60 | 2.7061 | 0.0413 | 67.4032 | 1.0283 | +inf | 5 | 5 | 5 |
| make lunch | 59.1649 | 61.1496 | 1.0335 | 0.40 | 1.9846 | 0.0335 | 61.9344 | 1.0468 | +inf | 5 | 5 | 5 |
| break the window | 67.1408 | 69.3321 | 1.0326 | 0.40 | 2.1913 | 0.0326 | 69.9057 | 1.0412 | +inf | 5 | 5 | 5 |
| get a present | 60.7384 | 62.3241 | 1.0261 | 0.40 | 1.5857 | 0.0261 | 63.5720 | 1.0467 | +inf | 5 | 5 | 5 |
| eat the apple | 79.6405 | 81.5235 | 1.0236 | 0.60 | 1.8830 | 0.0236 | 81.5144 | 1.0235 | +inf | 5 | 5 | 5 |
| throw a ball | 56.1311 | 57.4091 | 1.0228 | 0.40 | 1.2780 | 0.0228 | 58.9110 | 1.0495 | +inf | 5 | 5 | 5 |
| see the show | 66.5185 | 67.8496 | 1.0200 | 0.40 | 1.3310 | 0.0200 | 69.3383 | 1.0424 | +inf | 5 | 5 | 5 |
| lose keys | 75.1508 | 76.4247 | 1.0170 | 0.80 | 1.2739 | 0.0170 | 76.6003 | 1.0193 | +inf | 5 | 5 | 5 |
| spill the water | 66.1718 | 67.1004 | 1.0140 | 0.80 | 0.9285 | 0.0140 | 68.1810 | 1.0304 | +inf | 5 | 5 | 5 |
| remember details | 62.6690 | 63.2106 | 1.0086 | 0.20 | 0.5416 | 0.0086 | 66.3669 | 1.0590 | +inf | 5 | 5 | 5 |
| lead the meeting | 74.4524 | 74.4524 | 1.0000 | 0.00 | 0.0000 | 0.0000 | 79.0576 | 1.0619 | +inf | 5 | 5 | 5 |
| strike a drum | 76.7524 | 76.7524 | 1.0000 | 0.00 | 0.0000 | 0.0000 | 81.3576 | 1.0600 | +inf | 5 | 5 | 5 |

