In this paper, we propose a framework for isolating
text regions from natural scene images. The
main algorithm has two functions: it generates text
region candidates, and it verifies the label of
the candidates (text or non-text). The text region
candidates are generated through a modified K-means
clustering algorithm, which references texture
features, edge information and color information.
The candidate labels are then verified globally by
a Markov Random Field model, to which a collinearity
weight is added since most text in a scene is
aligned. The proposed method achieves
reasonable accuracy for text extraction from moderately
difficult examples from the ICDAR 2003
database.
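
As an illustration of the candidate-generation step, the following is a minimal sketch of plain K-means clustering over per-block feature vectors. It is not the modified algorithm proposed here: the function name `kmeans`, the deterministic farthest-point seeding, and the three-component (texture, edge, color) feature values are all assumptions chosen for the example.

```python
from math import dist


def kmeans(points, k, n_iter=20):
    """Plain Lloyd's K-means over equal-length feature tuples.

    Returns (labels, centers). Seeding is deterministic farthest-point
    selection, an assumption made so the example is reproducible.
    """
    centers = [tuple(points[0])]
    while len(centers) < k:
        # Next seed: the point farthest from its nearest existing center.
        farthest = max(points, key=lambda p: min(dist(p, c) for c in centers))
        centers.append(tuple(farthest))

    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: nearest center by Euclidean distance.
        labels = [min(range(k), key=lambda j: dist(p, centers[j])) for p in points]
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centers


# Hypothetical (texture, edge, color) features for five image blocks.
blocks = [
    (0.90, 0.80, 0.10),  # textured, strong edges: text-like
    (0.80, 0.90, 0.20),
    (0.10, 0.10, 0.70),  # smooth, weak edges: background-like
    (0.20, 0.00, 0.80),
    (0.85, 0.70, 0.15),
]
labels, centers = kmeans(blocks, k=2)
print(labels)  # [0, 0, 1, 1, 0]
```

In this sketch the cluster indices carry no meaning by themselves; deciding which cluster holds text candidates, and verifying those labels with the Markov Random Field model, are separate steps not shown.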