Moving Object Tracking for SLAM-based Augmented Reality
Keywords:Augmented Reality, SLAM
Augmented Reality (AR) systems based on the Simultaneous Localization and Mapping (SLAM) problem have received much attention in the last few years. SLAM allows AR applications on unprepared environments, i.e., without markers. However, by eliminating the marker object, we lose the referential for virtual object projection and the main source of interaction between real and virtual elements. In the recent literature, we found works that integrate an object recognition system to the SLAM in a way the objects are incorporated into the map. In this work, we propose a novel optimization framework for an object-aware SLAM system capable of simultaneously estimating the camera and moving objects positioning in the map. In this way, we can combine the advantages of both marker- and SLAM-based methods. We implement our proposed framework over state-of-the-art SLAM software and demonstrate potential applications for AR like the total occlusion of the marker object.
Randall C Smith and Peter Cheeseman. On the representation and estimation of spatial uncertainty. The international journal of Robotics Research, 5(4):56–68, 1986.
Hugh F Durrant-Whyte. Uncertain geometry in robotics. IEEE Journal on Robotics and Automation, 4(1):23–31, 1988.
José A Castellanos, José M Martínez, José Neira, and Juan D Tardós. Experiments in multisensor mobile robot localization and map building. IFAC Proceedings Volumes, 31(3):369–374, 1998.
Stefan B Williams, Paul Newman, Gamini Dissanayake, and Hugh Durrant-Whyte. Autonomous underwater simultaneous localisation and map building. In Robotics and Automation, 2000. Proceedings. ICRA’00. IEEE International Conference on, volume 2, pages 1793–1798. IEEE, 2000.
Jose E Guivant and Eduardo Mario Nebot. Optimization of the simultaneous localization and map-building algorithm for real-time implementation. IEEE transactions on robotics and automation, 17(3):242–257, 2001.
Jong-Hyuk Kim and Salah Sukkarieh. Airborne simultaneous localisation and map building. In Robotics and Automation, 2003. Proceedings. ICRA’03. IEEE International Conference on, volume 1, pages 406–411. IEEE, 2003.
Georg Klein and David Murray. Parallel tracking and mapping for small ar workspaces. In Mixed and Augmented Reality, 2007. ISMAR 2007. 6th IEEE and ACM International Symposium on, pages 225–234. IEEE, 2007.
Wei Tan, Haomin Liu, Zilong Dong, Guofeng Zhang, and Hujun Bao. Robust monocular slam in dynamic environments. In Mixed and Augmented Reality (ISMAR), 2013 IEEE International Symposium on, pages 209–218. IEEE, 2013.
Kiyoung Kim, Vincent Lepetit, and Woontack Woo. Real-time interactive modeling and scalable multiple object tracking for ar. Computers & Graphics, 36(8):945 – 954, 2012.
Datta Ramadasan, Thierry Chateau, and Marc Chevaldonné. Dcslam: A dynamically constrained real-time slam. In Image Processing (ICIP), 2015 IEEE International Conference on, pages 1130–1134. IEEE, 2015.
Hauke Strasdat, José MM Montiel, and Andrew J Davison. Visual slam: why filter? Image and Vision Computing, 30(2):65–77, 2012.
Robert Oliver Castle and David W Murray. Keyframe-based recognition and localization during video-rate parallel tracking and mapping. Image and Vision Computing, 29(8):524–532, 2011.
David G Lowe. Distinctive image features from scale-invariant keypoints. International journal of computer vision, 60(2):91–110, 2004.
Edward Rosten and Tom Drummond. Machine learning for high-speed corner detection. In European conference on computer vision, pages 430–443. Springer, 2006.
Dorian Gálvez-López, Marta Salas, Juan D Tardós, and JMM Montiel. Real-time monocular object slam. Robotics and Autonomous Systems, 75:435–449, 2016.
Dorian Gálvez-López and Juan D Tardos. Bags of binary words for fast place recognition in image sequences. IEEE Transactions on Robotics, 28(5):1188–1197, 2012.
Henning Tjaden, Ulrich Schwanecke, Elmar Schömer, and Daniel Cremers. A region-based gauss-newton approach to real-time monocular multiple object tracking. IEEE transactions on pattern analysis and machine intelligence, 41(8):1797–1812, 2018.
Mahdi Rad and Vincent Lepetit. Bb8: A scalable, accurate, robust to partial occlusion method for predicting the 3d poses of challenging objects without using depth. In Proceedings of the IEEE International Conference on Computer Vision, pages 3828–3836, 2017.
Yi Li, Gu Wang, Xiangyang Ji, Yu Xiang, and Dieter Fox. Deepim: Deep iterative matching for 6d pose estimation. In Proceedings of the European Conference on Computer Vision (ECCV), pages 683–698, 2018.
Martin Rünz and Lourdes Agapito. Co-fusion: Real-time segmentation, tracking and fusion of multiple objects. In 2017 IEEE International Conference on Robotics and Automation (ICRA), pages 4471–4478. IEEE, 2017.
Martin Runz, Maud Buffier, and Lourdes Agapito. Maskfusion: Real-time recognition, tracking and reconstruction of multiple moving objects. In 2018 IEEE International Symposium on Mixed and Augmented Reality (ISMAR), pages 10–20. IEEE, 2018.
Michael Strecke and Jörg Stückler. Em-fusion: Dynamic object-level slam with probabilistic data association. arXiv preprint arXiv:1904.11781, 2019.
Takehiro Ozawa, Yoshikatsu Nakajima, and Hideo Saito. Simultaneous 3d tracking and reconstruction of multiple moving rigid objects. In 2019 12th Asia Pacific Workshop on Mixed and Augmented Reality (APMAR), pages 1–5. IEEE, 2019.
Mina Henein, Jun Zhang, Robert Mahony, and Viorela Ila. Dynamic slam: The need for speed. arXiv preprint arXiv:2002.08584, 2020.
Raul Mur-Artal and Juan D Tardós. Orb-slam2: An open-source slam system for monocular, stereo, and rgb-d cameras. IEEE Transactions on Robotics, 33(5):1255–1262, 2017.
Raul Mur-Artal, Jose Maria Martinez Montiel, and Juan D Tardos. Orb-slam: a versatile and accurate monocular slam system. IEEE Transactions on Robotics, 31(5):1147–1163, 2015.
Rainer Kümmerle, Giorgio Grisetti, Hauke Strasdat, Kurt Konolige, and Wolfram Burgard. g2o: A general framework for graph optimization. In Robotics and Automation (ICRA), 2011 IEEE International Conference on, pages 3607–3613. IEEE, 2011.
Ethan Rublee, Vincent Rabaud, Kurt Konolige, and Gary Bradski. Orb: An efficient alternative to sift or surf. In Computer Vision (ICCV), 2011 IEEE international conference on, pages 2564–2571. IEEE, 2011.
Michael Calonder, Vincent Lepetit, Christoph Strecha, and Pascal Fua. Brief: Binary robust independent elementary features. In European conference on computer vision, pages 778–792. Springer, 2010.
Martin A Fischler and Robert C Bolles. Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography. In Readings in computer vision, pages 726–740. Elsevier, 1987.
Vincent Lepetit, Francesc Moreno-Noguer, and Pascal Fua. Epnp: An accurate o(n) solution to the pnp problem. International Journal of Computer Vision, 81:155–166, 2008.
E. Schrödinger. Die gegenwärtige situation in der quantenmechanik. Naturwissenschaften, 23(48):807–812, Nov 1935.
Siltanen, Hakkarainen, and Honkamaa. Automatic marker field calibration. In Virtual Reality International Conference (VRIC2007), pages 261–267, 2007.