Bite-Sized Tips To Make Chinese Full-Text Search

by Greg (@gregdevogo), September 27th, 2020, 5 min read

Too Long; Didn't Read

Chinese belongs to the so-called CJK group of languages (Chinese, Japanese, and Korean). These are probably among the most complicated languages to implement full-text search for, because word meanings depend heavily on numerous character variations and their sequences, and written text is typically not split up into words. To find an exact match in a full-text search, we have to face the challenge of tokenization, whose main task is to break the text down into low-level units that the user can search for. The easiest way to segment Chinese text is to use N-grams.
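To make the N-gram idea concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the article's own implementation: the function names `char_ngrams`, `build_index`, and `search` are hypothetical, the index is a plain in-memory dictionary, and overlapping character bigrams (n = 2) are assumed as the token size.

```python
from typing import Dict, List, Set


def char_ngrams(text: str, n: int = 2) -> List[str]:
    """Split text into overlapping character n-grams, skipping whitespace."""
    chars = [c for c in text if not c.isspace()]
    if len(chars) < n:
        # Text shorter than n: keep it as a single token (assumption for this sketch).
        return ["".join(chars)] if chars else []
    return ["".join(chars[i:i + n]) for i in range(len(chars) - n + 1)]


def build_index(docs: Dict[int, str], n: int = 2) -> Dict[str, Set[int]]:
    """Build a toy inverted index: n-gram -> ids of documents containing it."""
    index: Dict[str, Set[int]] = {}
    for doc_id, text in docs.items():
        for gram in char_ngrams(text, n):
            index.setdefault(gram, set()).add(doc_id)
    return index


def search(index: Dict[str, Set[int]], query: str, n: int = 2) -> Set[int]:
    """Return ids of documents that contain every n-gram of the query."""
    grams = char_ngrams(query, n)
    if not grams:
        return set()
    result = set(index.get(grams[0], set()))
    for gram in grams[1:]:
        result &= index.get(gram, set())
    return result


# Hypothetical sample documents, used only for illustration.
docs = {1: "中文全文搜索", 2: "日本語の検索"}
index = build_index(docs)
print(char_ngrams("全文搜索"))
print(search(index, "全文"))
```

Because the same tokenizer is applied to both the documents and the query, a query matches only where its character sequence actually occurs, without needing a dictionary of Chinese words. The trade-off is a larger index and possible false positives for longer queries, since this sketch checks that all bigrams are present but not that they are adjacent.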
