Symptom
- Word segmentation in Kanji
- The standard Japanese language module
- The Japanese segmenter doesn't treat whitespace between Kanji words as a token boundary.
- Kanji words separated by whitespace are being treated as a single token by the Japanese segmenter.
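To make the symptom concrete, the following is a minimal sketch of the expected behavior and a way to spot the reported one. It does not use the Text Analysis or LinguistX API: `expected_tokens` and `tokens_crossing_whitespace` are hypothetical helper names, and the token lists passed in stand in for whatever a segmenter returned. The only assumption is that each whitespace-separated Kanji word should come back as its own token.

```python
def expected_tokens(text: str) -> list[str]:
    """Return the whitespace-delimited words of the input.

    str.split() with no arguments splits on any Unicode whitespace,
    including the ideographic space (U+3000) common in Japanese text.
    """
    return text.split()


def tokens_crossing_whitespace(text: str, tokens: list[str]) -> list[str]:
    """Return the segmenter tokens that span a whitespace boundary.

    A token is flagged when, with any embedded whitespace removed, it does
    not fit inside a single whitespace-delimited word of the input.
    `tokens` is assumed to be the output of some segmenter (hypothetical).
    """
    words = expected_tokens(text)
    flagged = []
    for token in tokens:
        compact = "".join(token.split())
        if compact and not any(compact in word for word in words):
            flagged.append(token)
    return flagged


if __name__ == "__main__":
    sample = "東京 大学"  # two Kanji words separated by a space
    print(expected_tokens(sample))
    # A segmenter showing the symptom would return one merged token:
    print(tokens_crossing_whitespace(sample, ["東京大学"]))
```

An empty result from `tokens_crossing_whitespace` means every token stayed within one whitespace-delimited word; any flagged token matches the symptom described above.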
Environment
- Text Analysis 3.0
- LinguistX Platform 3.7 and 3.8
Product
BusinessObjects Text Analysis, LinguistX platform SDK 3.0; SAP BusinessObjects Text Analysis XI 3.0; SAP Text Analysis SDK (for OEMs) XI 3.0
Keywords
TA, LXP, SDK, Kanji, segment, token, whitespace, KBA, EIM-TA, Text Analysis, Problem
About this page
This is a preview of an SAP Knowledge Base Article. The full version is available on SAP for Me (login required).