With the continuing explosion of multimedia information in today's society, searching for multimedia information by content of interest is becoming more demanding. Such searching requires an efficient metadata modeling scheme and an easy-to-use querying mechanism. In this paper, we propose a new metadata description scheme for image and video data. The scheme employs the well-defined formality of the Dublin Core description to describe the semantic features of visual information, and the powerful facilities of the MPEG-7 description to describe its visual and media features, all in XML/DTD form. It benefits both Dublin Core users who wish to describe multimedia in their textual documents and MPEG-7 users who want to describe semantic features more formally using the Dublin Core structure. The proposed scheme is used in the development of a metadata repository system, called MRS, for easy XML/DTD manipulation of complex multimedia document descriptions. MRS provides an easy-to-use user interface as a DTD editor, together with a dictionary function, to generate a uniform construct of multimedia document description according to the proposed scheme. This work was conducted as part of the development of the Visual Information Repository System (VIRS), which is currently being implemented with funding from a university-industry collaborative research program.