Symptom
- Word segmentation in Kanji
- The standard Japanese language module
- The Japanese segmenter does not treat whitespace between Kanji words as a token boundary, so Kanji words separated by whitespace are returned as a single token.
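The symptom can be checked for in segmenter output with a small script. The following is a minimal sketch in Python, not the Text Analysis or LinguistX API: the function names, the sample string, and the `observed` token list are hypothetical stand-ins for whatever your integration returns. It compares the tokens one would expect if whitespace were honored as a boundary against a token list that shows the reported behavior.

```python
import re

# Any run of whitespace; for str patterns \s also matches the
# ideographic space U+3000 commonly found in Japanese text.
_WS = re.compile(r"\s+")


def expected_tokens(text):
    """Tokens one would expect if whitespace acted as a token boundary."""
    return [t for t in _WS.split(text) if t]


def tokens_spanning_whitespace(tokens):
    """Return the tokens that contain internal whitespace (the symptom)."""
    return [t for t in tokens if _WS.search(t.strip())]


# Hypothetical sample: two Kanji words separated by a space.
sample = "東京 大学"

# Hypothetical segmenter output illustrating the reported behavior:
# both words come back as one token.
observed = [sample]

print(expected_tokens(sample))
print(tokens_spanning_whitespace(observed))
```

Any token returned by `tokens_spanning_whitespace` is a candidate instance of the behavior described above. The check makes no assumption about how the segmenter itself is invoked.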
Environment
- Text Analysis 3.0
- LinguistX Platform 3.7 and 3.8
Product
BusinessObjects Text Analysis, LinguistX platform SDK 3.0; SAP BusinessObjects Text Analysis XI 3.0; SAP Text Analysis SDK (for OEMs) XI 3.0
Keywords
TA, LXP, SDK, Kanji, segment, token, whitespace, KBA, EIM-TA, Text Analysis, Problem
SAP Knowledge Base Article - Preview