Abstract: Virtual worlds (often referred to as 3D3C, for 3D visualization and navigation plus the three C's of Community, Creation, and Commerce) integrate existing and emerging media technologies (e.g., instant messaging, video, 3D, VR, AI, chat, and voice) to support existing networked services and to enable the development of new ones. Businesses recognize the emergence of virtual worlds as platforms for networked services as an important enabler: such worlds offer the power to reshape the way companies interact with their environments (markets, customers, suppliers, creators, stakeholders, etc.) in a fashion comparable to the Internet, and to allow for the development of new, potentially breakthrough, business models, services, applications, and devices. Each virtual world, however, has its own culture and audience, with users turning to specific worlds for a variety of reasons. These differences among existing Metaverses allow users to have unique experiences. To bridge these differences across existing and emerging Metaverses, a standardized framework is required, namely MPEG-V Media Context and Control (ISO/IEC 23005), which will lower the barrier to entry to (multiple) virtual worlds for both providers of goods and services and users. The aim of this paper is to provide an overview of MPEG-V and its intended standardization areas. Additionally, a review of MPEG-V's most advanced part, Sensory Information, is given.

Keywords: Virtual World, Interoperability, MPEG-V, Sensory Information