Learning in autonomous and intelligent systems: Overview and biases from data sources

Authors

Jiménez Schlegl, P.
DOI:

https://doi.org/10.3989/arbor.2021.802005

Keywords:

Autonomous and Intelligent Systems, automatic learning methods, bias in data sources

Abstract


Autonomous and Intelligent Systems (A/IS, following the terminology of the IEEE Ethically Aligned Design report) can gather their knowledge by different means and from different sources. In principle, learning algorithms are neutral; rather, it is the data they are fed during the learning period that can introduce biases or a specific ethical orientation. Human control over the learning process is most direct in learning from demonstration, where data sources are restricted to the choices of the demonstrator (or teacher), but even in unsupervised versions of reinforcement learning, biases enter via the definition of the reward function. In this paper we provide an overview of the learning paradigms of artificial systems, supervised and unsupervised, with the most striking examples in each category and without excessive technical detail. Furthermore, we describe the types of data sources that are presently available and in use by the robotics community. We also focus on observable bias in image datasets and on bias originating from human annotation. We point to recent research on bias in social robot navigation and end with a brief reflection on environmental influences on future learning robots.
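The claim that biases enter reinforcement learning through the reward definition can be made concrete with a minimal, hypothetical sketch (the gridworld, the paths and the reward terms below are illustrative assumptions, not taken from the article): the same greedy decision rule produces opposite behaviours depending only on how the reward is written.

```python
# Hypothetical sketch (not from the article): one greedy chooser, two reward
# functions. The learning machinery is neutral; the designer's reward
# definition is what encodes a preference, i.e. a bias.

def reward_speed(path):
    """Rewards only efficiency: shorter paths score higher."""
    return -len(path)

def reward_social(path, personal_zones):
    """Same efficiency term, plus a penalty for entering people's space."""
    intrusions = sum(1 for cell in path if cell in personal_zones)
    return -len(path) - 10 * intrusions

# Two candidate paths to the same goal: a short cut through a person's
# personal zone versus a longer detour around it.
short_cut = [(0, 0), (1, 0), (2, 0)]
detour = [(0, 0), (0, 1), (1, 1), (2, 1), (2, 0)]
zones = {(1, 0)}  # grid cell occupied by a person's personal space

# A greedy agent simply picks whichever path its reward ranks higher.
choice_speed = max([short_cut, detour], key=reward_speed)
choice_social = max([short_cut, detour], key=lambda p: reward_social(p, zones))
# choice_speed is the short cut; choice_social is the detour: identical
# decision rule, opposite behaviour, purely from the reward definition.
```

The point of the sketch is that neither choice is "wrong" by the algorithm's own lights; the ethical orientation lives entirely in the reward terms the designer selected.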

References

Australian Centre for Field Robotics, ACFR. Marine Robotics Datasets. Available at http://marine.acfr.usyd.edu.au/datasets/

Berkeley Artificial Intelligence Research, BAIR. RoboNet: A Dataset for Large-Scale Multi-Robot Learning. Available at: https://bair.berkeley.edu/blog/2019/11/26/robo-net/

Buchanan, Bruce G. (1989). Can Machine Learning Offer Anything to Expert Systems? Machine Learning, 4: 251-254. https://doi.org/10.1007/BF00130712

Choi, Sunlok. The Awesome Robotics Datasets. Available at: https://github.com/sunglok/awesome-robotics-datasets

Crawford, Kate and Paglen, Trevor. Excavating AI: The Politics of Images in Machine Learning Training Sets. Available at: https://excavating.ai

Everingham, Mark; Van Gool, Luc; Williams, Christopher K.I.; Winn, John and Zisserman, Andrew (2010). The PASCAL Visual Object Classes (VOC) Challenge. International Journal of Computer Vision, 88 (2): 303-338. https://doi.org/10.1007/s11263-009-0275-4

Ford Center for Autonomous Vehicles (FCAV), University of Michigan. FCAV M-Air Pedestrian (FMP) Dataset of monocular RGB images and Planar LiDAR data for pedestrian detection. Available at: https://github.com/umautobots/FMP-dataset [updated November 10th 2019; cited January, 15th 2021]

Geiger, Andreas; Lenz, Philip; Stiller, Christoph and Urtasun, Raquel (2013). Vision meets Robotics: The KITTI Dataset. International Journal of Robotics Research. 32 (11): 1231-1237. https://doi.org/10.1177/0278364913491297

Hervé, Nicolas and Boujemaa, Nozha (2007). Image annotation: which approach for realistic databases? In: CIVR '07: Proceedings of the 6th ACM international conference on Image and video retrieval, Amsterdam, The Netherlands, July 2007. New York, NY, USA: Association for Computing Machinery, pp. 170-177. https://doi.org/10.1145/1282280.1282310

Huang, Yongqiang and Sun, Yu (2019). A dataset of daily interactive manipulation. The International Journal of Robotics Research, 38 (8): 879-886. https://doi.org/10.1177/0278364919849091

Hurtado, Juana Valeria; Londoño, Laura and Valada, Abhinav (2021). From Learning to Relearning: A Framework for Diminishing Bias in Social Robot Navigation. Frontiers in Robotics and AI, 8: 650325. https://doi.org/10.3389/frobt.2021.650325 PMid:33842558 PMCid:PMC8024571

ImageNet, Stanford Vision Lab, Stanford University, Princeton University. Available at: https://www.image-net.org/index.php

Inoue, Masashi (2004). On the need for annotation-based image retrieval. In: Proceedings of the ACM SIGIR Workshop on Information Retrieval in Context (IRiX), Sheffield, UK, July 29th 2004. Copenhagen, Denmark: Department of Information Studies, Royal School of Library and Information Science, pp. 44-46

Johnson-Roberson, Matthew; Barto, Charles; Mehta, Rounak; Sridhar, Sarath Nittur; Rosaen, Karl and Vasudevan, Ram (2017). Driving in the matrix: Can virtual worlds replace human-generated annotations for real world tasks? In: Proceedings of the IEEE International Conference on Robotics and Automation, 29 May-3 June 2017, Singapore. IEEE: pp. 746-753. https://doi.org/10.1109/ICRA.2017.7989092

Kümmerer, Matthias; Bylinskii, Zoya; Judd, Tilke; Borji, Ali; Itti, Laurent; Durand, Frédo; Oliva, Aude and Torralba, Antonio. MIT/Tübingen Saliency Benchmark. Available at: https://saliency.tuebingen.ai/ [updated 2018; cited January, 16th 2021]

Levine, Sergey. Google Brain Robotics Data. Available at: https://sites.google.com/site/brainrobotdata/home, [updated August 5th 2016; cited January, 15th 2021]

Lim, Hengtee (2020). 18 Best Datasets for Machine Learning Robotics. Available at: https://lionbridge.ai/datasets/17-best-robotics-datasets-for-machine-learning/

Locolab, University of Michigan. The Effect of Walking Incline and Speed on Human Leg Kinematics, Kinetics, and EMG. Available at: https://ieee-dataport.org/open-access/effect-walking-incline-and-speed-human-leg-kinematics-kinetics-and-emg

Lyons, Michael (2020). Excavating "Excavating AI": The Elephant in the Gallery. Submitted: https://arxiv.org/abs/2009.01215 https://doi.org/10.2139/ssrn.3901640

Mandlekar, Ajay; Booher, Jonathan; Spero, Max; Tung, Albert; Gupta, Anchit; Zhu, Yuke; Garg, Animesh; Savarese, Silvio and Fei-Fei, Li (2019). Scaling Robot Supervision to Hundreds of Hours with RoboTurk: Robotic Manipulation Dataset through Human Reasoning and Dexterity. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Macau, China, 3-8 November 2019. IEEE, pp. 1048-1055. https://doi.org/10.1109/IROS40897.2019.8968114

Martínez, David; Alenyà, Guillem and Torras, Carme (2017). Relational reinforcement learning with guided demonstrations. Artificial Intelligence, 247: 295-312. https://doi.org/10.1016/j.artint.2015.02.006

Martínez Mozos, Oscar; Nakashima, Kazuto; Jung, Hojung; Iwashita, Yumi, and Kurazume, Ryo (2019). Fukuoka datasets for place categorization. International Journal of Robotics Research, 38 (5): 507-517. https://doi.org/10.1177/0278364919835603

MultiDrone EU Project Dataset. Available at: https://multidrone.eu/multidrone-public-dataset/ [updated January 23rd 2020; cited January, 15th 2021]

Nehaniv, Chrystopher and Dautenhahn, Kerstin (2002). The Correspondence Problem. In: K. Dautenhahn and C. L. Nehaniv (eds.) Imitation in Animals and Artifacts. Cambridge, MA, USA: MIT Press, pp. 41-61.

Pandey, Gaurav; McBride, James R. and Eustice, Ryan M. (2011). Ford Campus vision and lidar data set. The International Journal of Robotics Research, 30 (13): 1543-1552. https://doi.org/10.1177/0278364911400640

Ramisa, Arnau; Yan, Fei; Moreno-Noguer, Francesc and Mikolajczyk, Krystian (2018). BreakingNews: Article annotation by image and text processing. IEEE Transactions on Pattern Analysis and Machine Intelligence, 40 (5): 1072-1085. https://doi.org/10.1109/TPAMI.2017.2721945 PMid:28682246

Ruiz-Sarmiento, Jose Raul; Galindo, Cipriano and Gonzalez-Jimenez, Javier (2017). Robot@Home, a robotic dataset for semantic mapping of home environments. The International Journal of Robotics Research, 36 (2): 131-141. https://doi.org/10.1177/0278364917695640

Torralba, Antonio and Efros, Alexei A. (2011). Unbiased look at dataset bias. In IEEE Conference on Computer Vision and Pattern Recognition, Colorado Springs, 20-25 June, 2011. IEEE, pp. 1521-1528. https://doi.org/10.1109/CVPR.2011.5995347

von Ahn, Luis; Liu, Ruoran and Blum, Manuel (2006). Peekaboom: a game for locating objects in images. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI '06). New York, (USA): Association for Computing Machinery, pp. 55-64. https://doi.org/10.1145/1124772.1124782

Wang, John and Olson, Edwin (2016). AprilTag 2: Efficient and robust fiducial detection. Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Daejeon, Korea, 9-14 October, 2016. IEEE, pp. 4193-4198. https://doi.org/10.1109/IROS.2016.7759617

Xiao, Jianxiong; Hays, James; Ehinger, Krista A.; Oliva, Aude and Torralba, Antonio (2010). SUN database: Large-scale scene recognition from abbey to zoo. In IEEE Conference on Computer Vision and Pattern Recognition, San Francisco, 13-18 June, 2010. IEEE, pp. 3485-3492. https://doi.org/10.1109/CVPR.2010.5539970

Published

2021-12-30

How to Cite

Jiménez Schlegl, P. (2021). Learning in autonomous and intelligent systems: Overview and biases from data sources. Arbor, 197(802), a627. https://doi.org/10.3989/arbor.2021.802005

Section

Articles

Funding data

H2020 European Research Council
Grant numbers 741930

Agencia Estatal de Investigación
Grant numbers IRI [MDM-2016-0656]; TIN2017-90086-R