# H_u / H_s analysis — gemma2-9b · medial · geo

This report compares an idiom dataset with a parallel literal verb-phrase (non-idiom) dataset. All entropies are in nats and are uniform Monte Carlo (MC) averages over each phrase's observed contexts; scores are unnormalized per-token LM probabilities, reduced by geometric mean (or joint product). See `README.md` / `code/SCORING_MATH.md` for the derivation.
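
As a minimal sketch of the two reductions named above, assuming the scorer starts from per-token log-probabilities in nats (the function name `phrase_score` and its signature are hypothetical, not taken from the repository):

```python
import math

def phrase_score(token_logprobs, reduction="geometric_mean"):
    """Collapse per-token log-probabilities (nats) into one phrase score.

    Hypothetical helper for illustration only.
    """
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    total = sum(token_logprobs)
    if reduction == "geometric_mean":
        # exp(mean log p): the per-token geometric mean of the probabilities
        return math.exp(total / len(token_logprobs))
    if reduction == "joint":
        # exp(sum log p): the product of the token probabilities
        return math.exp(total)
    raise ValueError(f"unknown reduction: {reduction}")

# Toy two-token phrase with token probabilities 0.5 and 0.125.
toy = [math.log(0.5), math.log(0.125)]
geo = phrase_score(toy, "geometric_mean")
joint = phrase_score(toy, "joint")
```

With the geometric mean, phrases of different token lengths are scored on the same per-token scale, whereas the joint product shrinks with every extra token.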

> **How to read these magnitudes** (directions, why H_s is +inf, the new finite synergy metrics): see [`INTERPRETATION.md`](../INTERPRETATION.md). Quick key — ↑`H_u/H(p)`, ↑`syn_frac`, ↑`H_s^log` mean **more** synergy; ↑`H_s^reg` and the original ↑`H_s` mean **less** synergy.

## Configuration

| field | idioms run | non-idioms run |
|---|---|---|
| model | google/gemma-2-9b | google/gemma-2-9b |
| reduction | geometric_mean | geometric_mean |
| medial_only | True | True |
| dtype | bfloat16 | bfloat16 |
| num_idioms | 18 | 18 |
| dataset | /home/prada/PID_evaluation/data/dataset.tsv | /home/prada/PID_evaluation/data/nonidioms_dataset.tsv |

The bound `H_u + H_s ≥ 2H(p) + 2 log 2 ≥ H(p)` holds for 18/18 idioms and 18/18 non-idioms (it holds trivially whenever `H_s = +inf`).

`H_s = +inf` (at least one non-synergistic context, i.e. p ≤ max(q,r)) for 5/18 idioms and 15/18 non-idioms. A *finite* H_s means **every** context of that phrase is synergistic (p > max(q,r) everywhere).
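
The all-or-nothing behaviour of `H_s` can be sketched from the formulas quoted in this report. The helper below is an illustration under stated assumptions, not the repository's implementation: `phrase_entropies` is a hypothetical name, `p`, `q`, `r` are assumed to be per-context scores in (0, 1], and the example values are toy numbers rather than data from either run.

```python
import math

def phrase_entropies(p, q, r):
    """Average per-context terms uniformly over one phrase's contexts (nats).

    p, q, r: equal-length lists of scores, one entry per context.
    """
    m = [max(qi, ri) for qi, ri in zip(q, r)]
    n = len(p)
    h_p = sum(-math.log(pi) for pi in p) / n
    h_u = sum(-math.log(min(pi, mi)) for pi, mi in zip(p, m)) / n
    # -log max{0, p - m}: a single context with p <= m contributes +inf,
    # which makes the whole average +inf.
    h_s = sum(-math.log(pi - mi) if pi > mi else math.inf
              for pi, mi in zip(p, m)) / n
    syn_frac = sum(pi > mi for pi, mi in zip(p, m)) / n
    return {"H(p)": h_p, "H_u": h_u, "H_s": h_s, "syn_frac": syn_frac}

# Toy scores, not taken from either run.
all_syn = phrase_entropies(p=[0.02, 0.03], q=[0.01, 0.01], r=[0.005, 0.02])
one_miss = phrase_entropies(p=[0.02, 0.03], q=[0.01, 0.04], r=[0.005, 0.01])
print(all_syn["H_s"], one_miss["H_s"], one_miss["syn_frac"])
```

One way to see the bound: in a synergistic context, `m(p - m) ≤ p²/4`, so `-log m - log(p - m) ≥ -2 log p + 2 log 2`, and averaging over contexts preserves the inequality.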

## Per-metric summary (idioms vs non-idioms)

Means and 95% bootstrap CIs (20k resamples, percentile method, phrase-level). Non-finite values are dropped per metric before bootstrapping; the drop count is shown.
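
A minimal sketch of a phrase-level percentile bootstrap of the mean, assuming non-finite values are filtered first as described above. The function name, the seed, and the nearest-rank percentile rule are assumptions; the actual script may use a different percentile interpolation.

```python
import math
import random

def bootstrap_mean_ci(values, n_resamples=20_000, alpha=0.05, seed=0):
    """Return (mean, (lo, hi), n_dropped) for the finite entries of `values`."""
    finite = [v for v in values if math.isfinite(v)]
    n_dropped = len(values) - len(finite)
    n = len(finite)
    rng = random.Random(seed)
    # Resample phrases with replacement and record each resample's mean.
    means = sorted(sum(rng.choices(finite, k=n)) / n for _ in range(n_resamples))
    # Nearest-rank percentiles of the bootstrap distribution (an assumption).
    lo = means[int((alpha / 2) * (n_resamples - 1))]
    hi = means[math.ceil((1 - alpha / 2) * (n_resamples - 1))]
    return sum(finite) / n, (lo, hi), n_dropped

# Toy input with one +inf entry, which is dropped and counted.
mean, (lo, hi), dropped = bootstrap_mean_ci([1.0, 2.0, 3.0, math.inf], n_resamples=2_000)
print(mean, lo, hi, dropped)
```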

### H(p)

*base entropy, uniform MC average over contexts (nats). ↓ smaller = phrase more concentrated*

| dataset | N finite | non-finite dropped | mean | median | min | max | 95% CI of mean |
|---|---:|---:|---:|---:|---:|---:|---|
| idioms | 18 | 0 | 4.3880 | 4.3980 | 3.2997 | 6.1545 | [4.0457, 4.7421] |
| non-idioms | 18 | 0 | 5.1801 | 5.2305 | 3.5862 | 6.8543 | [4.7442, 5.6336] |

**Cross-dataset gap** (idioms − non-idioms): Δ = -0.7921  (95% CI [-1.3581, -0.2267]) → idioms < non-idioms, **significant**.
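
The gap lines in this report label a difference "significant" exactly when its 95% CI excludes 0. A sketch of that rule follows, assuming the two datasets are resampled independently at the phrase level (a paired scheme over the parallel phrases is also conceivable; the report does not say which is used). `bootstrap_gap_ci` is a hypothetical name. The demo reuses the `syn_frac` columns from the per-phrase tables later in this report.

```python
import math
import random

def bootstrap_gap_ci(a, b, n_resamples=20_000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for mean(a) - mean(b), resampling a and b independently."""
    rng = random.Random(seed)

    def resampled_mean(x):
        return sum(rng.choices(x, k=len(x))) / len(x)

    gaps = sorted(resampled_mean(a) - resampled_mean(b) for _ in range(n_resamples))
    lo = gaps[int((alpha / 2) * (n_resamples - 1))]
    hi = gaps[math.ceil((1 - alpha / 2) * (n_resamples - 1))]
    delta = sum(a) / len(a) - sum(b) / len(b)
    significant = lo > 0 or hi < 0  # CI excludes zero
    return delta, (lo, hi), significant

# syn_frac per phrase, copied from the per-phrase tables (order is irrelevant).
idioms = [1.0] * 13 + [0.8] * 4 + [0.6]
non_idioms = [1.0] * 3 + [0.8] * 7 + [0.6] * 4 + [0.4] * 4
delta, ci, significant = bootstrap_gap_ci(idioms, non_idioms)
print(round(delta, 4), ci, significant)
```

The point estimate reproduces the reported `syn_frac` gap of 0.2333; the interval endpoints depend on the seed and the percentile rule, so they will only approximately match the reported CI.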

### H_u

*unique / redundant entropy = -log min{p, max(q,r)} (nats); >= H(p)*

| dataset | N finite | non-finite dropped | mean | median | min | max | 95% CI of mean |
|---|---:|---:|---:|---:|---:|---:|---|
| idioms | 18 | 0 | 5.4043 | 5.4007 | 3.9589 | 7.0758 | [5.0499, 5.7689] |
| non-idioms | 18 | 0 | 5.6265 | 5.6746 | 4.1000 | 7.6541 | [5.2008, 6.0664] |

**Cross-dataset gap** (idioms − non-idioms): Δ = -0.2222  (95% CI [-0.7910, 0.3463]) → idioms < non-idioms, not significant.

### H_u / H(p)

*unique-information ratio (>= 1). ↑ bigger = MORE synergy. THE headline metric*

| dataset | N finite | non-finite dropped | mean | median | min | max | 95% CI of mean |
|---|---:|---:|---:|---:|---:|---:|---|
| idioms | 18 | 0 | 1.2395 | 1.2340 | 1.1101 | 1.3868 | [1.2031, 1.2763] |
| non-idioms | 18 | 0 | 1.0911 | 1.0981 | 1.0067 | 1.2103 | [1.0679, 1.1153] |

**Cross-dataset gap** (idioms − non-idioms): Δ = 0.1484  (95% CI [0.1057, 0.1925]) → idioms > non-idioms, **significant**.

### syn_frac

*synergy coverage in [0,1] = fraction of contexts with p > m, where m = max(q,r). ↑ bigger = MORE synergy (most intuitive)*

| dataset | N finite | non-finite dropped | mean | median | min | max | 95% CI of mean |
|---|---:|---:|---:|---:|---:|---:|---|
| idioms | 18 | 0 | 0.9333 | 1.0000 | 0.6000 | 1.0000 | [0.8778, 0.9778] |
| non-idioms | 18 | 0 | 0.7000 | 0.8000 | 0.4000 | 1.0000 | [0.6000, 0.7889] |

**Cross-dataset gap** (idioms − non-idioms): Δ = 0.2333  (95% CI [0.1222, 0.3444]) → idioms > non-idioms, **significant**.

### H_s^log

*log-space synergy = mean max{0, log p - log m} (nats), with m = max(q,r). ↑ bigger = MORE synergy; always finite*

| dataset | N finite | non-finite dropped | mean | median | min | max | 95% CI of mean |
|---|---:|---:|---:|---:|---:|---:|---|
| idioms | 18 | 0 | 1.0164 | 1.0671 | 0.5538 | 1.5045 | [0.8870, 1.1429] |
| non-idioms | 18 | 0 | 0.4464 | 0.4773 | 0.0451 | 0.9438 | [0.3406, 0.5550] |

**Cross-dataset gap** (idioms − non-idioms): Δ = 0.5699  (95% CI [0.4035, 0.7371]) → idioms > non-idioms, **significant**.

### H_s^log / H(p)

*log-space synergy ratio. ↑ bigger = MORE synergy*

| dataset | N finite | non-finite dropped | mean | median | min | max | 95% CI of mean |
|---|---:|---:|---:|---:|---:|---:|---|
| idioms | 18 | 0 | 0.2395 | 0.2340 | 0.1101 | 0.3868 | [0.2031, 0.2763] |
| non-idioms | 18 | 0 | 0.0911 | 0.0981 | 0.0067 | 0.2103 | [0.0679, 0.1153] |

**Cross-dataset gap** (idioms − non-idioms): Δ = 0.1484  (95% CI [0.1057, 0.1925]) → idioms > non-idioms, **significant**.
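
The rows above equal the `H_u / H(p)` rows minus exactly 1, and the gap and its CI are identical. This follows from the definitions: per context, `-log min{p, m} = -log p + max{0, log p - log m}` with `m = max(q,r)`, so averaging over contexts gives `H_u = H(p) + H_s^log`. A quick check against three rows copied from the per-phrase tables (the tolerance covers four-decimal rounding):

```python
# (phrase, H(p), H_u, H_s^log) copied from the per-phrase tables in this report.
rows = [
    ("cut corners", 3.8895, 5.3940, 1.5045),
    ("make waves", 4.3227, 5.4073, 1.0847),
    ("strike a drum", 6.7515, 6.7965, 0.0451),
]

TOL = 1.5e-4  # values are printed to four decimals

# Residual of the identity H_u - H(p) = H_s^log for each row.
residuals = {name: abs((h_u - h_p) - h_s_log) for name, h_p, h_u, h_s_log in rows}
identity_holds = all(res < TOL for res in residuals.values())
print(identity_holds)
```

In practice this means `H_u / H(p)` and `H_s^log / H(p)` carry the same information and should not be counted as two independent pieces of evidence.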

### H_s^log signed

*signed log-space synergy = mean(log p - log m); can be negative (net anti-synergistic)*

| dataset | N finite | non-finite dropped | mean | median | min | max | 95% CI of mean |
|---|---:|---:|---:|---:|---:|---:|---|
| idioms | 18 | 0 | 0.9825 | 0.9853 | 0.5267 | 1.5045 | [0.8476, 1.1196] |
| non-idioms | 18 | 0 | 0.3156 | 0.3220 | -0.2777 | 0.9438 | [0.1704, 0.4609] |

**Cross-dataset gap** (idioms − non-idioms): Δ = 0.6669  (95% CI [0.4680, 0.8660]) → idioms > non-idioms, **significant**.

### H_s^reg

*regularized H_s (eps-floored, finite, continuous). ↑ bigger = LESS synergy*

| dataset | N finite | non-finite dropped | mean | median | min | max | 95% CI of mean |
|---|---:|---:|---:|---:|---:|---:|---|
| idioms | 18 | 0 | 5.1990 | 5.1641 | 3.6468 | 6.7570 | [4.7640, 5.6319] |
| non-idioms | 18 | 0 | 7.3182 | 7.5207 | 4.9944 | 10.4854 | [6.6381, 8.0109] |

**Cross-dataset gap** (idioms − non-idioms): Δ = -2.1192  (95% CI [-2.9361, -1.3097]) → idioms < non-idioms, **significant**.
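
The exact regularization is not spelled out in this report. One reading of "eps-floored" that is finite and continuous is to floor the synergy gap at a small constant before taking the log, i.e. `-log max{eps, p - m}`. The sketch below makes that assumption explicit; `h_s_reg` and its `eps` argument are hypothetical, and no particular `eps` value is implied.

```python
import math

def h_s_reg(p, q, r, eps):
    """Assumed form of the regularized synergy entropy (nats).

    Averages -log max{eps, p - max(q, r)} over contexts; `eps` is a
    hypothetical floor, not a value taken from the runs.
    """
    terms = [-math.log(max(eps, pi - max(qi, ri))) for pi, qi, ri in zip(p, q, r)]
    return sum(terms) / len(terms)

# Toy scores, not taken from either run.
all_syn = h_s_reg([0.02, 0.03], [0.01, 0.01], [0.005, 0.02], eps=1e-6)
one_miss = h_s_reg([0.02], [0.04], [0.01], eps=1e-6)
print(all_syn, one_miss)
```

Under this reading, the floor never activates when every context is synergistic, which is consistent with the per-phrase tables: phrases with `syn_frac = 1.00` show identical `H_s^reg` and `H_s (orig)` values.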

### H_s^reg / H(p)

*regularized synergy ratio (>= 1). ↑ bigger = LESS synergy*

| dataset | N finite | non-finite dropped | mean | median | min | max | 95% CI of mean |
|---|---:|---:|---:|---:|---:|---:|---|
| idioms | 18 | 0 | 1.1875 | 1.1548 | 1.0652 | 1.4672 | [1.1360, 1.2457] |
| non-idioms | 18 | 0 | 1.4156 | 1.4279 | 1.1131 | 1.6472 | [1.3456, 1.4827] |

**Cross-dataset gap** (idioms − non-idioms): Δ = -0.2281  (95% CI [-0.3123, -0.1373]) → idioms < non-idioms, **significant**.

### H_s (original)

*synergy entropy = -log max{0, p - max(q,r)} (nats); +inf if ANY context is non-synergistic (+inf for 5/18 idioms and 15/18 non-idioms)*

| dataset | N finite | non-finite dropped | mean | median | min | max | 95% CI of mean |
|---|---:|---:|---:|---:|---:|---:|---|
| idioms | 13 | 5 | 4.8821 | 4.9278 | 3.6468 | 6.7570 | [4.4231, 5.3576] |
| non-idioms | 3 | 15 | 5.6531 | 5.2145 | 4.9944 | 6.7505 | [4.9944, 6.7505] |

**Cross-dataset gap** (idioms − non-idioms): Δ = -0.7711  (95% CI [-1.8246, 0.1164]) → idioms < non-idioms, not significant.
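
The non-idiom CI above coincides with the min and max of the three finite values. That is what a percentile bootstrap yields at N = 3: a resample draws the same value three times with probability 1/27 ≈ 3.7%, which exceeds the 2.5% tail on each side. The sketch below enumerates all 27 equally likely resamples instead of sampling; the inverted-CDF quantile rule is an assumption about the limiting behaviour, not the script's exact method.

```python
import itertools
import math

# The three finite non-idiom H_s values from the per-phrase table
# (cut hair, break the window, spill the water).
finite_h_s = [4.9944, 5.2145, 6.7505]

# Means of every equally likely resample of size 3 drawn with replacement.
means = sorted(sum(combo) / 3 for combo in itertools.product(finite_h_s, repeat=3))

def quantile(sorted_vals, prob):
    """Smallest value whose cumulative probability is at least `prob`."""
    return sorted_vals[max(0, math.ceil(prob * len(sorted_vals)) - 1)]

lo, hi = quantile(means, 0.025), quantile(means, 0.975)
print(lo, hi)
```

The same degeneracy applies to the non-idiom row of `H_s / H(p)`, so the "not significant" verdicts for the original `H_s` metrics mostly reflect the three-phrase sample rather than the size of the gap.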

### H_s / H(p)

*original synergy ratio (+inf wherever H_s is +inf: 5/18 idioms, 15/18 non-idioms; use H_s^log or syn_frac instead)*

| dataset | N finite | non-finite dropped | mean | median | min | max | 95% CI of mean |
|---|---:|---:|---:|---:|---:|---:|---|
| idioms | 13 | 5 | 1.1242 | 1.1086 | 1.0652 | 1.2095 | [1.1003, 1.1501] |
| non-idioms | 3 | 15 | 1.3018 | 1.2479 | 1.1131 | 1.5443 | [1.1131, 1.5443] |

**Cross-dataset gap** (idioms − non-idioms): Δ = -0.1776  (95% CI [-0.4136, 0.0058]) → idioms < non-idioms, not significant.

## Per-phrase detail

#### Idioms (sorted by H_u/H, descending)

| phrase | H(p) | H_u | H_u/H | syn_frac | H_s^log | H_s^log/H | H_s^reg | H_s^reg/H | H_s (orig) | n_idiom | n_head | n_non |
|---|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|
| cut corners | 3.8895 | 5.3940 | 1.3868 | 1.00 | 1.5045 | 0.3868 | 4.1783 | 1.0743 | 4.1783 | 5 | 5 | 5 |
| break the mold | 3.2997 | 4.5355 | 1.3745 | 1.00 | 1.2358 | 0.3745 | 3.6468 | 1.1052 | 3.6468 | 5 | 5 | 5 |
| call the shots | 3.4746 | 4.6821 | 1.3475 | 1.00 | 1.2075 | 0.3475 | 3.8521 | 1.1086 | 3.8521 | 5 | 5 | 5 |
| strike a chord | 4.3462 | 5.7806 | 1.3300 | 1.00 | 1.4344 | 0.3300 | 4.6297 | 1.0652 | 4.6297 | 5 | 5 | 5 |
| clear the air | 3.9718 | 5.1022 | 1.2846 | 1.00 | 1.1305 | 0.2846 | 4.3781 | 1.1023 | 4.3781 | 5 | 5 | 5 |
| have a ball | 3.3247 | 4.2296 | 1.2722 | 1.00 | 0.9049 | 0.2722 | 3.9323 | 1.1828 | 3.9323 | 5 | 5 | 5 |
| turn tail | 5.3585 | 6.7460 | 1.2589 | 1.00 | 1.3875 | 0.2589 | 5.7165 | 1.0668 | 5.7165 | 5 | 5 | 5 |
| make waves | 4.3227 | 5.4073 | 1.2509 | 0.60 | 1.0847 | 0.2509 | 6.3421 | 1.4672 | +inf | 5 | 5 | 5 |
| pull strings | 4.4498 | 5.4993 | 1.2358 | 1.00 | 1.0495 | 0.2358 | 5.0918 | 1.1443 | 5.0918 | 5 | 5 | 5 |
| lose ground | 4.9293 | 6.0735 | 1.2321 | 0.80 | 1.1443 | 0.2321 | 6.1426 | 1.2462 | +inf | 5 | 5 | 5 |
| bite the dust | 4.9042 | 6.0228 | 1.2281 | 1.00 | 1.1186 | 0.2281 | 5.4520 | 1.1117 | 5.4520 | 5 | 5 | 5 |
| rock the boat | 3.3110 | 3.9589 | 1.1957 | 0.80 | 0.6479 | 0.1957 | 4.7945 | 1.4480 | +inf | 5 | 5 | 5 |
| lead the field | 4.4935 | 5.3713 | 1.1953 | 1.00 | 0.8778 | 0.1953 | 5.2364 | 1.1653 | 5.2364 | 5 | 5 | 5 |
| run the show | 4.8508 | 5.6751 | 1.1699 | 0.80 | 0.8243 | 0.1699 | 6.2591 | 1.2903 | +inf | 5 | 5 | 5 |
| spill the beans | 4.0742 | 4.7515 | 1.1662 | 1.00 | 0.6773 | 0.1662 | 4.9278 | 1.2095 | 4.9278 | 5 | 5 | 5 |
| mean business | 6.1545 | 7.0758 | 1.1497 | 1.00 | 0.9212 | 0.1497 | 6.7570 | 1.0979 | 6.7570 | 5 | 5 | 5 |
| raise hell | 4.8011 | 5.3914 | 1.1229 | 1.00 | 0.5903 | 0.1229 | 5.6682 | 1.1806 | 5.6682 | 5 | 5 | 5 |
| get the sack | 5.0276 | 5.5813 | 1.1101 | 0.80 | 0.5538 | 0.1101 | 6.5769 | 1.3082 | +inf | 5 | 5 | 5 |

#### Non-idioms (sorted by H_u/H, descending)

| phrase | H(p) | H_u | H_u/H | syn_frac | H_s^log | H_s^log/H | H_s^reg | H_s^reg/H | H_s (orig) | n_idiom | n_head | n_non |
|---|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|
| cut hair | 4.4867 | 5.4305 | 1.2103 | 1.00 | 0.9438 | 0.2103 | 4.9944 | 1.1131 | 4.9944 | 5 | 5 | 5 |
| build the boat | 3.5862 | 4.1155 | 1.1476 | 0.60 | 0.5292 | 0.1476 | 5.7543 | 1.6046 | +inf | 5 | 5 | 5 |
| call the police | 4.3371 | 4.9723 | 1.1465 | 0.80 | 0.6352 | 0.1465 | 5.8118 | 1.3400 | +inf | 5 | 5 | 5 |
| clear the table | 4.6943 | 5.3459 | 1.1388 | 0.80 | 0.6516 | 0.1388 | 6.1254 | 1.3049 | +inf | 5 | 5 | 5 |
| break the window | 4.1786 | 4.6710 | 1.1178 | 1.00 | 0.4924 | 0.1178 | 5.2145 | 1.2479 | 5.2145 | 5 | 5 | 5 |
| turn dials | 6.8543 | 7.6541 | 1.1167 | 0.80 | 0.7998 | 0.1167 | 8.1487 | 1.1888 | +inf | 5 | 5 | 5 |
| raise children | 5.2920 | 5.8758 | 1.1103 | 0.60 | 0.5837 | 0.1103 | 7.5210 | 1.4212 | +inf | 5 | 5 | 5 |
| eat the apple | 5.7704 | 6.3833 | 1.1062 | 0.80 | 0.6130 | 0.1062 | 7.2683 | 1.2596 | +inf | 5 | 5 | 5 |
| throw a ball | 3.7271 | 4.1000 | 1.1000 | 0.80 | 0.3729 | 0.1000 | 5.4945 | 1.4742 | +inf | 5 | 5 | 5 |
| make lunch | 5.1690 | 5.6657 | 1.0961 | 0.40 | 0.4967 | 0.0961 | 8.0741 | 1.5620 | +inf | 5 | 5 | 5 |
| tie knots | 5.4310 | 5.8932 | 1.0851 | 0.60 | 0.4622 | 0.0851 | 8.0094 | 1.4747 | +inf | 5 | 5 | 5 |
| lose keys | 5.4353 | 5.7607 | 1.0599 | 0.80 | 0.3254 | 0.0599 | 7.5205 | 1.3836 | +inf | 5 | 5 | 5 |
| see the show | 5.3670 | 5.6836 | 1.0590 | 0.40 | 0.3166 | 0.0590 | 8.3761 | 1.5607 | +inf | 5 | 5 | 5 |
| lead the meeting | 5.9885 | 6.2523 | 1.0440 | 0.80 | 0.2638 | 0.0440 | 8.1792 | 1.3658 | +inf | 5 | 5 | 5 |
| get a present | 5.0294 | 5.2247 | 1.0388 | 0.40 | 0.1953 | 0.0388 | 8.2844 | 1.6472 | +inf | 5 | 5 | 5 |
| spill the water | 4.3712 | 4.4955 | 1.0284 | 1.00 | 0.1243 | 0.0284 | 6.7505 | 1.5443 | 6.7505 | 5 | 5 | 5 |
| remember details | 6.7719 | 6.9570 | 1.0273 | 0.60 | 0.1851 | 0.0273 | 9.7149 | 1.4346 | +inf | 5 | 5 | 5 |
| strike a drum | 6.7515 | 6.7965 | 1.0067 | 0.40 | 0.0451 | 0.0067 | 10.4854 | 1.5531 | +inf | 5 | 5 | 5 |

