v2: Handwritten character recognition is supported


2025.9

Handwriting input has been added. You can also try it directly at https://huggingface.co/spaces/raycosine/Detangutify (access may require a VPN in some regions).

Noto Serif Tangut is used as the base font. For each glyph, multiple samples are generated through image augmentation and stroke-width normalization to narrow the gap between handwritten and printed forms. When the frontend submits a handwritten or uploaded image, candidate glyphs are returned by comparing feature vectors.

Because a serif printed font still differs from handwriting, accuracy is mediocre. It is hard to say whether you need to write neatly or avoid joined-up strokes. Roughly speaking, glyph complexity does not correlate with recognition accuracy, but accuracy drops noticeably for glyphs containing certain radicals or components.
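As a rough illustration of the feature-vector matching step, here is a minimal sketch in plain Python. It is not the code behind the tool: the post does not say how features are extracted or which similarity measure is used, so cosine similarity, scoring each glyph by its best-matching augmented sample, and every name here (`cosine_similarity`, `top_candidates`, `TOY_INDEX`) are assumptions made for the example. The vectors are made-up three-dimensional toys and the keys are placeholder labels.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_candidates(
    query: Sequence[float],
    index: Dict[str, List[Sequence[float]]],
    k: int = 5,
) -> List[Tuple[str, float]]:
    """Rank glyphs by similarity to the query vector.

    `index` maps a glyph label to the feature vectors of its generated samples.
    Assumption: a glyph's score is the score of its best-matching sample.
    """
    scored = []
    for glyph, samples in index.items():
        if not samples:
            continue  # skip glyphs that have no samples
        best = max(cosine_similarity(query, s) for s in samples)
        scored.append((glyph, best))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy index: placeholder labels, invented feature vectors.
TOY_INDEX = {
    "U+17000": [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0]],
    "U+17001": [[0.0, 1.0, 0.0], [0.1, 0.9, 0.1]],
    "U+17002": [[0.0, 0.0, 1.0]],
}

result = top_candidates([0.8, 0.2, 0.0], TOY_INDEX, k=2)
for glyph, score in result:
    print(glyph, round(score, 3))
```

Keeping several sample vectors per glyph is what lets the augmented variants help: a handwritten query only has to land near one of them to surface that glyph as a candidate.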

Q: Why not train on handwritten samples? A: There is no data.

