Users want IoT environments to provide them with personalized support. These environments therefore need to learn user preferences, such as what temperature a room should be or which lights should be turned on. We propose a novel system that separates the task of learning a user's preferences from the task of realizing them within the environment. The system captures user preferences by recasting preference learning as an optimization problem and applying inverse reinforcement learning to it. We show that the system accurately extracts simple preferences from a small number of user demonstrations. These preferences are then realized by actuated devices running reinforcement learning-based agents, which provide an environment consistent with the learnt preferences, including in situations not covered by any user demonstration.