### qwen3-8b-base · full · geo

idiom config: `{'model': 'Qwen/Qwen3-8B-Base', 'reduction': 'geometric_mean', 'medial_only': False, 'dtype': 'bfloat16', 'dataset': '/home/prada/PID_evaluation/data/dataset.tsv', 'num_idioms': 18, 'syn_reg_eps': 0.01}`

nonidiom config: `{'model': 'Qwen/Qwen3-8B-Base', 'reduction': 'geometric_mean', 'medial_only': False, 'dtype': 'bfloat16', 'dataset': '/home/prada/PID_evaluation/data/nonidioms_dataset.tsv', 'num_idioms': 18, 'syn_reg_eps': 0.01}`

`ratio_u_idiom`

| dataset | N phrases | mean | median | 95% CI |
|---|---|---|---|---|
| idioms | 18 | 1.1702 | 1.1785 | [1.1428, 1.1984] |
| non-idioms | 18 | 1.0561 | 1.0540 | [1.0446, 1.0682] |

cross-dataset ratio_u_idiom: idioms - nonidioms Δ=+0.1141 CI=[+0.0837,+0.1447] *

`ratio_s_idiom`

| dataset | N phrases | mean | median | 95% CI |
|---|---|---|---|---|
| idioms | 9 (9 non-finite dropped) | 1.1979 | 1.2073 | [1.1674, 1.2272] |
| non-idioms | 0 (18 non-finite dropped) | no finite values | | |

cross-dataset ratio_s_idiom: insufficient finite values
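For readers who want to reproduce the shape of these summaries, here is a minimal sketch of a per-metric summary that drops non-finite values before computing the mean, median, and a 95% interval. The log does not say how its intervals were computed, so the percentile bootstrap over phrases used here is an assumption. `summarize`, `n_boot`, and `seed` are hypothetical names for illustration, not part of the evaluation script.

```python
import math
import random
import statistics


def summarize(values, n_boot=2000, seed=0):
    """Summarize one metric across phrases (hypothetical helper).

    Non-finite entries (NaN, +/-inf) are dropped first. The 95% CI is a
    percentile bootstrap of the mean, which is an ASSUMPTION: the log
    does not state which interval method was used.
    """
    finite = [v for v in values if math.isfinite(v)]
    dropped = len(values) - len(finite)
    if not finite:
        # Mirrors the "no finite values" case in the log.
        return {"n": 0, "dropped": dropped, "mean": None, "median": None, "ci": None}

    rng = random.Random(seed)
    # Resample phrases with replacement and record the mean of each resample.
    boot_means = sorted(
        statistics.fmean(rng.choices(finite, k=len(finite))) for _ in range(n_boot)
    )
    lo = boot_means[int(0.025 * n_boot)]
    hi = boot_means[int(0.975 * n_boot) - 1]
    return {
        "n": len(finite),
        "dropped": dropped,
        "mean": statistics.fmean(finite),
        "median": statistics.median(finite),
        "ci": (lo, hi),
    }


if __name__ == "__main__":
    # Toy values only; these are not the phrase-level ratios behind the log.
    print(summarize([1.0, float("nan"), float("inf"), 3.0]))
    print(summarize([float("nan"), float("inf")]))
```

A cross-dataset comparison like the Δ line could be bootstrapped the same way, by resampling each dataset independently and taking the difference of means. That too is an assumption about the method, not something the log confirms.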