TypeError: Fraction.__new__() got an unexpected keyword argument '_normalize' #38

Closed

ArlanCooper opened this issue Jun 5, 2024 · 2 comments

@ArlanCooper

System Info

Ubuntu 22.04
CUDA 12.1
Python 3.12.3
PyTorch 2.3.0
transformers 4.40.0

Who can help?

No response

Information

  • The official example scripts
  • My own modified scripts

Reproduction

I built the fine-tuning dataset and fine-tuning script following the official code, then ran:

python finetune.py  data/  /data/share/rwq/glm-4-9b-chat  configs/lora_hb.yaml

Output:


│ /home/powerop/work/conda/envs/glm4/lib/python3.12/site-packages/transformers │
│ /trainer.py:3719 in evaluation_loop                                          │
│                                                                              │
│   3716 │   │   │   │   │   EvalPrediction(predictions=all_preds, label_ids=a │
│   3717 │   │   │   │   )                                                     │
│   3718 │   │   │   else:                                                     │
│ ❱ 3719 │   │   │   │   metrics = self.compute_metrics(EvalPrediction(predict │
│   3720 │   │   else:                                                         │
│   3721 │   │   │   metrics = {}                                              │
│   3722                                                                       │
│                                                                              │
│ /home/powerop/work/rwq/GLM-4/finetune_demo/finetune.py:333 in                │
│ compute_metrics                                                              │
│                                                                              │
│   330 │   │   for k, v in scores[0].items():                                 │
│   331 │   │   │   metrics_dct[k].append(round(v['f'] * 100, 4))              │
│   332 │   │   metrics_dct['bleu-4'].append(                                  │
│ ❱ 333 │   │   │   sentence_bleu([label_tokens], pred_tokens, smoothing_funct │
│   334 │   return {k: np.mean(v) for k, v in metrics_dct.items()}             │
│   335                                                                        │
│   336                                                                        │
│                                                                              │
│ /home/powerop/work/conda/envs/glm4/lib/python3.12/site-packages/nltk/transla │
│ te/bleu_score.py:107 in sentence_bleu                                        │
│                                                                              │
│   104 │   :return: The sentence-level BLEU score. Returns a list if multiple │
│   105 │   :rtype: float / list(float)                                        │
│   106 │   """                                                                │
│ ❱ 107 │   return corpus_bleu(                                                │
│   108 │   │   [references], [hypothesis], weights, smoothing_function, auto_ │
│   109 │   )                                                                  │
│   110                                                                        │
│                                                                              │
│ /home/powerop/work/conda/envs/glm4/lib/python3.12/site-packages/nltk/transla │
│ te/bleu_score.py:210 in corpus_bleu                                          │
│                                                                              │
│   207 │   │   # For each order of ngram, calculate the numerator and         │
│   208 │   │   # denominator for the corpus-level modified precision.         │
│   209 │   │   for i in range(1, max_weight_length + 1):                      │
│ ❱ 210 │   │   │   p_i = modified_precision(references, hypothesis, i)        │
│   211 │   │   │   p_numerators[i] += p_i.numerator                           │
│   212 │   │   │   p_denominators[i] += p_i.denominator                       │
│   213                                                                        │
│                                                                              │
│ /home/powerop/work/conda/envs/glm4/lib/python3.12/site-packages/nltk/transla │
│ te/bleu_score.py:368 in modified_precision                                   │
│                                                                              │
│   365 │   # Usually this happens when the ngram order is > len(reference).   │
│   366 │   denominator = max(1, sum(counts.values()))                         │
│   367 │                                                                      │
│ ❱ 368 │   return Fraction(numerator, denominator, _normalize=False)          │
│   369                                                                        │
│   370                                                                        │
│   371 def closest_ref_length(references, hyp_len):                           │
╰──────────────────────────────────────────────────────────────────────────────╯
TypeError: Fraction.__new__() got an unexpected keyword argument '_normalize'
 17%|█▋        | 500/3000 [06:29<32:27,  1.28it/s]
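
For context on the last frame of the traceback, here is a minimal sketch that reproduces the failure with nothing but the standard library (assuming Python 3.12; behaviour on 3.11 and earlier is noted in the comments):

from fractions import Fraction

# nltk's bleu_score.py calls Fraction(numerator, denominator, _normalize=False)
# (see line 368 in the traceback). The private _normalize keyword was accepted
# up to Python 3.11 but was removed in Python 3.12, so the call now fails
# before the BLEU score can be computed.
Fraction(2, 4, _normalize=False)
# Python 3.11 and earlier: returns an unreduced Fraction(2, 4)
# Python 3.12: TypeError: Fraction.__new__() got an unexpected keyword argument '_normalize'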



Expected behavior

No error should be raised and training should finish successfully.

@zRzRzRzRzRzRzR self-assigned this Jun 5, 2024
@panzy25 commented Jun 8, 2024

The official nltk repository on GitHub has already fixed support for Python 3.12.3; see this GitHub commit for details:
nltk/nltk@86fa083

Possible solutions:

  • Option 1: install the latest nltk directly from GitHub.
  • Option 2: downgrade Python from 3.12 to an earlier version, since Python 3.12 removed _normalize=False from the standard-library Fraction constructor.
  • Option 3: patch this module in your locally installed nltk, following the commit above (a sketch of the idea follows this list).
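
For Option 1, the usual form is: pip install git+https://github.com/nltk/nltk.git. For Option 3, the snippet below is only a sketch of the idea behind the upstream fix, not the exact patch from the linked commit; it assumes nltk/translate/bleu_score.py currently imports Fraction via "from fractions import Fraction" and replaces that import with a small subclass that accepts the removed _normalize keyword while preserving the unreduced numerator/denominator that corpus_bleu reads back when accumulating corpus-level counts.

import fractions


class Fraction(fractions.Fraction):
    """Sketch: accept the _normalize keyword that Python 3.12 removed."""

    def __new__(cls, numerator=0, denominator=None, _normalize=False):
        self = super().__new__(cls, numerator, denominator)
        # bleu_score.py always passes two ints and later reads p_i.numerator /
        # p_i.denominator expecting the *unreduced* counts (the old behaviour
        # of _normalize=False), so keep the raw values alongside the reduced
        # ones that Fraction uses internally for arithmetic.
        self._raw_numerator = numerator
        self._raw_denominator = denominator if denominator is not None else self._denominator
        return self

    @property
    def numerator(self):
        return self._raw_numerator

    @property
    def denominator(self):
        return self._raw_denominator

Keeping the unreduced counts only matters when corpus_bleu aggregates numerators and denominators across many sentences; for a single sentence_bleu call the reduced and unreduced ratios give the same score.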

@zRzRzRzRzRzRzR (Member)

The PR documenting this has been merged. Thanks for your support!
