The widespread use of smartphones equipped with GPS and orientation sensors opens up new possibilities for location-based annotation in outdoor environments. Indoor environments, however, require a completely different approach. In this study, we introduce IMAF, a novel indoor modeling and annotation framework that runs on a mobile phone. The framework produces a 3D room model in situ from five user selections, without prior knowledge of the actual geometric distances or any additional apparatus. Using the framework, non-experts can easily capture room dimensions and annotate locations and objects within the room, linking virtual information to the real space, which is represented by an approximated box. To register the 3D room model to the real space, a hybrid method combining visual tracking and device sensors achieves accurate orientation tracking while maintaining interactive frame rates for real-time applications on a mobile phone. Once the room model is registered to the real space, user-generated annotations can be attached and viewed in AR and VR modes. Finally, the framework supports object-based space-to-space registration, so that annotations can be viewed and created from views other than the one in which they were generated. We demonstrate the performance of the proposed framework in terms of model accuracy, modeling time, stability of visual tracking, and satisfaction with annotation. In the last section, we present two exemplar applications built on IMAF.