Demand for privacy and security in mobile networks is increasing. To meet this demand across diverse mobile applications, a security solution must be both efficient and versatile, and Elliptic Curve Cryptography (ECC) is well suited to this trend. This paper describes an efficient software implementation of elliptic curve (EC) cryptography. The library targets ARM9 cores, which are the most widely adopted cores embedded in mobile devices. To obtain an efficient, optimized software solution, we developed a cycle-accurate simulator equivalent to the RTL (Register Transfer Level) model, which enables us to optimize the cross-compiled executable image at the micro-architectural level. Such a technique has not been possible with commercially available processor simulators. Using the cycle-accurate simulator, we optimized the executable library image and achieved a performance improvement of about 15% over the conventionally cross-compiled library. Our implementation is about two and a half times as fast as other implementations reported in the recent literature. Based on statistics estimated from our software implementation of kP (the elliptic curve scalar multiplication of a point P by an integer k), we conclude that our software kP solution offers performance competitive with hardware implementations, which implies that software kP solutions will in the future replace existing hardware kP solutions in public-key authentication and signature applications.
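For readers unfamiliar with the kP operation, the following is a minimal sketch of scalar multiplication by the textbook double-and-add method. It is not the optimized ARM9 library described in this paper: the toy curve y^2 = x^3 + 2x + 2 over F_17, the base point (5, 1), and the function names `inv`, `add`, and `scalar_mult` are all assumptions chosen for illustration, and the curve is far too small for real use.

```python
# Toy curve y^2 = x^3 + A*x + B over F_P_MOD.
# Illustrative parameters only; they are NOT taken from the paper.
P_MOD, A, B = 17, 2, 2
G = (5, 1)      # a point on the toy curve
INF = None      # point at infinity (the group identity)

def inv(x):
    """Modular inverse in F_P_MOD via Fermat's little theorem (P_MOD is prime)."""
    return pow(x % P_MOD, P_MOD - 2, P_MOD)

def add(p, q):
    """Add two points on the toy curve using the affine group law."""
    if p is INF:
        return q
    if q is INF:
        return p
    x1, y1 = p
    x2, y2 = q
    # p + (-p) = identity; this also covers doubling a point with y = 0.
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return INF
    if p == q:
        s = (3 * x1 * x1 + A) * inv(2 * y1) % P_MOD   # tangent slope
    else:
        s = (y2 - y1) * inv(x2 - x1) % P_MOD          # chord slope
    x3 = (s * s - x1 - x2) % P_MOD
    y3 = (s * (x1 - x3) - y1) % P_MOD
    return (x3, y3)

def scalar_mult(k, p):
    """Compute kP by scanning the bits of k from least to most significant."""
    result = INF
    addend = p
    while k > 0:
        if k & 1:
            result = add(result, addend)
        addend = add(addend, addend)   # doubling step
        k >>= 1
    return result

print(scalar_mult(2, G))   # (6, 3)
```

The loop performs one doubling per bit of k and one addition per set bit, so the cost of kP is dominated by the underlying field arithmetic, which is the kind of code a micro-architectural optimization would target. Note that this sketch is not constant-time and should not be used with secret scalars.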