With the growing convenience of online pen-input devices and the widespread use of automatic address filling, an online address recognition system has become necessary. A Korean address can be divided into two parts: the upper part of the address and the street address. Recognizing the street address is more difficult than recognizing the upper part because the street address has no fixed form, a database (DB) for it is hard to construct, and it is composed of multilingual words.
This work therefore proposes an online Korean street address recognition system. In the first phase, the words of the input street address are recognized by the proposed word candidate generation algorithm, which uses an improved over-segmentation-based word recognition method and generates several recognition candidates for each word instead of considering only a single result. In the second phase, a proper name DB and a keyword sequence network are used to select the best result from the word recognition candidates. The proper name DB is a collection of building names and corporation names. The keyword sequence network represents the sequential structure of the words in a Korean street address and is constructed from training data. In the last phase, the recognition results for the individual address words are combined and output as the recognition result for the input street address.
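The second and third phases can be illustrated with a minimal sketch. It is not the system described here, only one way the idea could look in code. Everything in it is an assumption made for illustration: the romanized example words, the recognizer scores, the contents of `PROPER_NAME_DB`, the hand-written `SEQUENCE_NETWORK` (the actual network is learned from training data), the `TRANSITION_BONUS` weight, and the exhaustive search over candidate combinations.

```python
from itertools import product

# Hypothetical proper name DB (building and corporation names).
PROPER_NAME_DB = {"Hanbit Tower", "Daehan Building"}

# Hypothetical keyword sequence network: which keyword type may follow which.
# In the described system this structure is built from training data.
SEQUENCE_NETWORK = {
    "START": {"ROAD"},
    "ROAD": {"NUMBER"},
    "NUMBER": {"PROPER", "END"},
    "PROPER": {"END"},
}

# Assumed weight given to each transition that the network allows.
TRANSITION_BONUS = 0.5


def keyword_type(word):
    """Map a word candidate to a coarse keyword type (illustrative rules)."""
    if word in PROPER_NAME_DB:
        return "PROPER"
    if word.isdigit():
        return "NUMBER"
    if word.endswith(("-ro", "-gil")):
        return "ROAD"
    return "OTHER"


def sequence_score(words):
    """Reward every adjacent pair of keyword types allowed by the network."""
    types = ["START"] + [keyword_type(w) for w in words] + ["END"]
    valid = sum(
        1 for a, b in zip(types, types[1:]) if b in SEQUENCE_NETWORK.get(a, set())
    )
    return TRANSITION_BONUS * valid


def select_best(candidates_per_word):
    """Pick one candidate per word, maximizing recognizer + sequence score."""
    best_words, best_score = None, float("-inf")
    for combo in product(*candidates_per_word):
        words = [w for w, _ in combo]
        score = sum(s for _, s in combo) + sequence_score(words)
        if score > best_score:
            best_words, best_score = words, score
    return best_words


# Made-up candidates: (word, recognizer score), best-scoring candidate first.
candidates = [
    [("Sejong-no", 0.60), ("Sejong-ro", 0.50)],
    [("2l", 0.55), ("21", 0.50)],
    [("Hanbit Towor", 0.60), ("Hanbit Tower", 0.55)],
]

best = select_best(candidates)
street_address = " ".join(best)  # last phase: combine the word results
```

In this toy input the top-scoring candidate of every word is wrong, yet the combination that agrees with the proper name DB and the sequence network is the one selected. The exhaustive search grows exponentially with the number of words, so a practical implementation would more likely use dynamic programming over the network.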
The experimental data comprise 360 addresses from 9 writers, containing 1,098 address words and 4,392 characters. The experimental results show that, when the input address is perfectly separated into words, the proposed system achieves a word recognition accuracy of 90.7%. This corresponds to a 50% reduction in errors compared with a method that uses only a general word recognizer.
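The relation between the reported accuracy and the error reduction can be made explicit with a short calculation. The sketch below assumes that the 50% figure is a relative reduction of the word error rate and that it is exact rather than rounded. The baseline accuracy it yields is therefore only an implied value, not a number reported in this text, and the function name is hypothetical.

```python
def implied_baseline_accuracy(accuracy, relative_error_reduction):
    """Baseline accuracy implied by a final accuracy and a relative error reduction."""
    error = 1.0 - accuracy  # error rate of the proposed system
    baseline_error = error / (1.0 - relative_error_reduction)
    return 1.0 - baseline_error


# Reported: 90.7% word accuracy, 50% error reduction (assumed relative and exact).
baseline = implied_baseline_accuracy(0.907, 0.5)
print(f"implied baseline word accuracy: {baseline:.1%}")
```

Under these assumptions, a 9.3% word error rate after a 50% reduction implies a baseline word error rate of about 18.6%.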