Best Practices and Recommendations for Crowdsourced QoE - Lessons learned from the Qualinet Task Force "Crowdsourcing"


Crowdsourcing is a popular approach that outsources tasks via the Internet to a large number of users. Commercial crowdsourcing platforms provide a global pool of users who are employed to perform short and simple online tasks. For quality assessment of multimedia services and applications, crowdsourcing opens new possibilities by moving the subjective test into the crowd, resulting in a larger diversity of test subjects, faster turnover of test campaigns, and reduced costs due to the low reimbursement of participants. Further, crowdsourcing makes it easy to address additional aspects such as real-life environments. Crowdsourced quality assessment, however, is not a straightforward implementation of existing subjective testing methodologies in an Internet-based environment. Additional challenges and differences to lab studies arise in conceptual, technical, and motivational areas [9, 25, 26]. For example, the test contents need to be transmitted to the user over the Internet; test users may have low-resolution screens that influence the user experience; and users may not understand the test or may not execute it carefully, resulting in unreliable data.
This white paper summarizes the recommendations and best practices for crowdsourced quality assessment of multimedia applications from the Qualinet Task Force on "Crowdsourcing". The European Network on Quality of Experience in Multimedia Systems and Services, Qualinet (COST Action IC 1003), established this task force in 2012. Since then it has grown to more than 30 members. The recommendation paper results from the experience gained in designing, implementing, and conducting crowdsourcing experiments as well as from the analysis of the crowdsourced user ratings and context data.
To understand the impact of the crowdsourcing environment on QoE assessment and to derive a methodology and setup for crowdsourced QoE assessment, data from traditional lab experiments were compared with results from crowdsourcing experiments. Within the crowdsourcing task force, several different application domains and scientific questions were considered, among others:
• video and image quality in general,
• QoE for HTTP streaming [31, 32] and HTTP adaptive streaming [19, 30],
• perception of selfie portrait images in a recruitment context [10],
• privacy in HDR images and video [39, 20, 36],
• compression of HDR images [37, 38],
• evaluation of 3D video [38],
• image recognizability and aesthetic appeal [12, 13],
• multidimensional modeling of web QoE [14],
• QoE factors of cloud storage services [21],
• enabling eye tracking experiments using web technologies [41].
From a crowdsourcing perspective, the following mechanisms and approaches, which are relevant for understanding crowdsourced quality assessment, were investigated.