Efficient Position Estimation Based on GPU-Accelerated Content-based Image Retrieval
DOI:
https://doi.org/10.13052/1550-4646.1423
Keywords:
GPU, LSH, SIFT, Content-based Image Retrieval, Position Estimation
Abstract
We propose an efficient position estimation method based on GPU-accelerated content-based image retrieval (CBIR). The idea is to use first-person-vision videos annotated with geographical position information as the database. When a user sends a first-person image of their current view, the system estimates the user's position using CBIR. Because image features are generally high-dimensional vectors, and thousands of them are extracted even from a single image, the processing cost is high. GPUs (graphics processing units), although originally designed for graphics processing, have been used to accelerate a wide range of computations. We therefore use a GPU to accelerate CBIR with appropriate data structures and algorithms. Moreover, the proposed method exploits the spatial locality of pedestrians in position estimation applications to improve accuracy. We demonstrate the efficiency and accuracy of the proposed method through experiments on a video dataset.
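The abstract does not specify the data structures used, so the following is only a minimal CPU-side sketch of the general idea, not the authors' implementation. It assumes a single hash table of locality-sensitive hashing (LSH) based on Gaussian random projections, descriptors given as plain lists of floats (standing in for SIFT vectors), and a simple distance threshold as the spatial-locality filter. The names `PStableLSH`, `estimate_position`, `bucket_width`, and `max_jump` are hypothetical and chosen for illustration; GPU acceleration is omitted.

```python
import math
import random
from collections import Counter, defaultdict


class PStableLSH:
    """One LSH hash table for Euclidean distance (illustrative sketch only)."""

    def __init__(self, dim, num_projections=4, bucket_width=4.0, seed=0):
        rng = random.Random(seed)
        self.w = bucket_width
        # Random Gaussian projection vectors and uniform offsets in [0, w).
        self.a = [[rng.gauss(0.0, 1.0) for _ in range(dim)]
                  for _ in range(num_projections)]
        self.b = [rng.uniform(0.0, bucket_width)
                  for _ in range(num_projections)]
        self.table = defaultdict(list)

    def key(self, v):
        # h(v) = floor((a . v + b) / w) for each projection, concatenated.
        return tuple(
            math.floor((sum(ai * vi for ai, vi in zip(a, v)) + b) / self.w)
            for a, b in zip(self.a, self.b)
        )

    def add(self, v, frame_id):
        # Store the id of the database frame the descriptor came from.
        self.table[self.key(v)].append(frame_id)

    def query(self, v):
        # Return frame ids whose descriptors fall in the same bucket as v.
        return self.table.get(self.key(v), [])


def estimate_position(index, frame_positions, query_descriptors,
                      last_position=None, max_jump=None):
    """Vote over database frames; return the position of the top frame.

    If last_position and max_jump are given, frames farther than max_jump
    from last_position are ignored (an assumed form of spatial locality).
    """
    votes = Counter()
    for d in query_descriptors:
        for fid in index.query(d):
            if last_position is not None and max_jump is not None:
                if math.dist(frame_positions[fid], last_position) > max_jump:
                    continue
            votes[fid] += 1
    if not votes:
        return None
    best_frame, _ = votes.most_common(1)[0]
    return frame_positions[best_frame]
```

In this sketch, each database frame contributes its descriptors to the table together with its frame id, and `frame_positions` maps frame ids to coordinates. A practical system would typically use several hash tables and verify candidates by actual descriptor distance; both are left out here to keep the example short.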