Proposition of Robustness Indicators for Immersive Content Filtering

Youngmo Kim1, Seok-Yoon Kim1, Chayapol Kamyod2 and Byeongchan Park1,*

1Dept. of Computer Science and Engineering, Soongsil University, Seoul, Republic of Korea
2Computer and Communication Engineering for Capacity Building Research Center, School of Information Technology, Mae Fah Luang University, Chiang Rai, Thailand
E-mail: ymkim828@ssu.ac.kr; ksy@ssu.ac.kr; chayapol.kam@mfu.ac.th; pbc866@gmail.com
*Corresponding Author

Received 19 June 2023; Accepted 15 September 2023; Publication 24 October 2023

Abstract

With the full-fledged service of 5G mobile networks, large-capacity immersive content can be used at high speed anytime, anywhere. Such content, however, can be illegally distributed through web-hard services and torrents after DRM dismantling and various transformation attacks, so evaluation indicators that can objectively assess filtering performance are required for copyright protection. Since existing 2D filtering techniques cannot be applied to immersive content directly, in this paper we propose a set of robustness indicators for immersive content. The proposed indicators modify and extend the existing 2D video robustness indicators to account for the projection and reproduction methods that characterize immersive content. A performance evaluation experiment has been carried out on a sample filtering system, verifying that an excellent recognition rate of 95% or more is achieved in about 3 s of execution time.

Keywords: Multimedia computing, image processing, image recognition, image resolution, image sampling.

1 Introduction

As 5G services have recently become full-fledged, immersive content is being serviced through media platforms [1, 2]. VR and AR, as immersive technologies, are core technologies of the 4th industrial revolution and are regarded as innovation tools that can create new added value not only in the entertainment field but also in the economy and society as a whole [3]. According to the global market research firm PwC, the global market for immersive technology is expected to grow more than fivefold, from $280 billion in 2025 to $1.5 trillion in 2030. In addition, virtual convergence technology (XR), which combines VR and AR, is expected to create an economic ripple effect of US$476.4 billion in 2025 [4]. As immersive content spreads across various fields, it is prone to illegal distribution through web-hard services and torrents after DRM dismantling and various transformation attacks [5]. Accordingly, individual creators and distributors must consider copyright protection when uploading immersive content [6]. This requires a technology that is robust against deformation and transformation, so that copyright infringement can be verified and the rights management information for a query video can be checked against the registered feature data.

We propose a reinforcement of the feature-based filtering technique for immersive content by adapting and extending the robustness indicators of existing 2D media [7]. The proposed set of robustness indicators targets an immersive content filtering technology that is robust to geometric transformations such as rotation, scaling, and translation of videos, as well as to changes of video encoding method, and we apply it to a sample filtering system for performance evaluation.

The composition of this paper is as follows. Following this introduction, Section 2 describes related research: media filtering technology, the 2D robustness indicators for filtering performance evaluation provided by the Korea Copyright Commission, and the characteristics of immersive content. Section 3 proposes new robustness indicators for immersive content and explains their derivation. Section 4 describes the filtering performance evaluation method applied in this paper, Section 5 presents the experimental results, and Section 6 concludes.

2 Related Research

2.1 Feature-based Filtering Technology

Feature-based filtering technology extracts unique features of content such as video, audio, and fusion multimedia content, builds the extracted information into a database, and searches against this information [8]. In order to filter content, a feature for identifying content is required, and this feature must satisfy the following properties.

1. Robustness

Protected content must be robust to common deformations and remain correctly recognizable even after typical damage. The feature values should remain similar under changes such as compression and format conversion, as long as the perceptual characteristics of the original content are maintained.

2. Individuality

Each content item must have feature information that distinguishes it from all other content.

3. Search efficiency

The features must also support efficient search. To configure a content filtering system based on feature values that satisfy the above characteristics, a feature information database must first be created; only after this process can content be identified. A minimal pipeline sketch follows.
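The following minimal sketch illustrates this register-then-identify pipeline, assuming OpenCV's ORB detector as a stand-in feature extractor (the paper does not prescribe one) and an in-memory dictionary as the feature database; match_score is a hypothetical scoring helper.

```python
import cv2

def extract_features(frame):
    # Detect keypoints and compute binary descriptors for one video frame.
    orb = cv2.ORB_create(nfeatures=500)
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    _, descriptors = orb.detectAndCompute(gray, None)
    return descriptors

# Step 1: registration -- build the feature database from original content.
feature_db = {}  # content_id -> list of per-frame descriptors
# feature_db["content_001"] = [extract_features(f) for f in sampled_frames]

# Step 2: identification -- extract features from a query frame and search.
# query_desc = extract_features(query_frame)
# best_id = max(feature_db, key=lambda cid: match_score(query_desc, feature_db[cid]))
```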

2.2 Robustness Indicators for 2D Video

Although details may vary by country, the Korea Copyright Commission provides a performance evaluation service that assesses how well the feature-based filtering technologies of participating companies can block illegal content distribution through web-hard and P2P services [7]. The robustness indicators require that feature information similar to the original can still be extracted after the various transformations of digital video content shown in Table 1.

Table 1 Robustness indicators for 2D videos

| Evaluation Item | Parameter |
|---|---|
| Logo insert | 100% opacity |
| Caption insert | Font size 12; font size 16; font size 20 |
| Severe compression | DivX 512 kbps; DivX 700 kbps |
| Codec change | H.264/MPEG-4 AVC; Xvid; WMV |
| Aspect ratio change | 16:9 → 4:3 (black padding at top and bottom); 4:3 → 16:9 (black padding at left and right) |
| Frame-rate reduction | 20 fps |
| Rotation | 90°; 180°; 270° (-90°) |
| Flip | Horizontal; vertical |
| Color to monochrome conversion | I = 0.299*R + 0.587*G + 0.114*B |
| Brightness change | +18; +9; -9; -18 |
| Contrast change | Contrast (120%); contrast (80%) |
| Multi-transformation | Severe compression (DivX, 512 kbps) + codec change (H.264/MPEG-4 AVC, Xvid, or WMV) + caption insert (font size 12); severe compression (DivX, 700 kbps) + codec change (H.264/MPEG-4 AVC, Xvid, or WMV) + caption insert (font size 12) |
| Etc. | ZIP compression (original 3-minute content); ZIP compression (double compression); EGG compression; EXE compression; EXE + ZIP compression |

1. Logo/subtitle insertion

This refers to a state in which a subtitle or a logo has been inserted in the upper part or the caption area of the video content, transforming the original content.

2. Compression and Codec support

This means re-encoding the video content at a low resolution and a high compression rate.

3. Resolution change

This means a transformation that reduces the spatial size of video content.

4. Frame rate change

This means a transformation that reduces the number of frames per second used to display the video content.

5. Camera capture

This means the transformation of recording the video displayed on an LCD TV with a digital camera (camcorder).

6. Rotation

This means transforming the video content by rotating it.

7. Flip

This means transforming the video content by flipping it vertically or horizontally.

8. Color to monochrome conversion

This refers to a transformation that converts color video content into black and white.

9. Brightness change

This means changing the brightness value of each pixel constituting the video content.

10. Contrast change

This means changing the contrast of the video content, i.e., the difference in color and brightness that makes it possible to distinguish an object from the background.

2.3 Omnidirectional Media Format for Immersive Media

Through MPEG standardization, the production and use of VR content are defined in the OMAF (omnidirectional media format) [9–12]. The standard specifies how content is produced using VR technology with two or more cameras and consumed through an HMD (head-mounted display), as shown in Figure 1.


Figure 1 OMAF architecture of MPEG-immersive.

In the production process, the scene is captured with cameras, stitched into a spherical video, and mapped to a rectangular video through a projection process; users then consume the VR content through an HMD. In particular, a projection step is required when rendering the 3D spherical video. Among the several projection methods, the most widely used is ERP (equirectangular projection), as shown in Figure 2.


Figure 2 Video extraction and ERP.

Since the ERP format packs all directions (front, back, top, bottom, left, right) into a single video, distorted areas appear, especially at the top and bottom. In addition, because the videos extracted from the individual cameras overlap, the seams can look unnatural where non-overlapping parts are forcibly joined during stitching.
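The ERP mapping itself is simple: each pixel column corresponds to a longitude (yaw) and each row to a latitude (pitch), which is why rows near the poles, each covering only a tiny solid angle, appear heavily stretched. A minimal sketch of the mapping, assuming the common convention of longitude increasing from left to right and latitude running from +90° at the top row:

```python
import numpy as np

def erp_pixel_to_direction(u, v, width, height):
    # Map an ERP pixel (u, v) to a unit viewing direction on the sphere.
    lon = (u / width - 0.5) * 2.0 * np.pi   # yaw: -pi .. +pi
    lat = (0.5 - v / height) * np.pi        # pitch: +pi/2 (top) .. -pi/2 (bottom)
    x = np.cos(lat) * np.sin(lon)
    y = np.sin(lat)
    z = np.cos(lat) * np.cos(lon)
    return x, y, z
```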

3 Robustness Indicators for Immersive Content Filtering

3.1 Derivation of Immersive Content Robustness Indicators by Adapting 2D Video Robustness Indicators

We propose criteria for the robustness indicators for immersive content by adapting the robustness indicators of existing 2D media, summarized in Table 2; indicators Nos. 1 to 14 are carried over from the 2D set, either applied as-is or modified as noted in the Apply column.

Table 2 Robustness indicators for immersive content

| No | Evaluation Item | Apply | Parameter | Remark |
|---|---|---|---|---|
| 1 | Logo insert | Change | Top, bottom | Logo insertion pattern for immersive video |
| 2 | Caption insert | Available | | |
| 3 | Severe compression | Change | | |
| 4 | Codec change | Change | H.264 (MPEG-4 AVC), H.265 (HEVC) | 360-degree codecs |
| 5 | Aspect ratio change | Change | 16:9 ↔ 2:1 | |
| 6 | Resolution change | Change | SD (640*360), 720p HD (1280*720), 1080p HD (1920*1080), 4K UHD (3840*1920), 4K monoscopic (4096*2048) | 360-degree resolutions (based on YouTube and Vimeo uploads) |
| 7 | Frame-rate reduction | Available | 20 fps | |
| 8 | Rotation | Change | 180° rotation | 90°/270° rotations cannot be played |
| 9 | Flip | Available | | |
| 10 | Color to monochrome conversion | Available | | |
| 11 | Brightness change | Available | | |
| 12 | Contrast change | Available | | |
| 13 | Multi-transformation | Available | | |
| 14 | Etc. | Available | | |
| 15 | Projection conversion | New | Cube map → ERP, ERP → Cube map | Projection conversion |
| 16 | Monoscopic cropping | New | Location 0/90/180/270, viewing angle 120*90 | Partial video |
| 17 | Stereoscopic cropping | New | Location 0/90, viewing angle 120*90 | Partial video |
| 18 | Viewpoint move | New | Viewpoint move | |

1. Insert Logo

In immersive content, distortion occurs at the top and bottom when the video is stitched, so a logo is often inserted to hide the distorted parts. Logo insertion at the top and bottom of the video was therefore selected as a robustness item, as shown in Figure 3.

• Insert logo on the TOP

• Insert logo on the BOTTOM.


Figure 3 Logo insert example.

2. Insert Subtitles

Although text can be inserted in any direction in immersive content, subtitles outside the user's viewing angle are judged to be ineffective. However, since OMAF is standardizing subtitles that follow the user's field of view, subtitles are expected to be inserted within a range that does not interfere with it. Accordingly, subtitle insertion was selected as a robustness item, as shown in Figure 4.

• Insert font size 20 based on serif font

• Insert font size 16 based on serif font

• Insert font size 12 based on serif font.


Figure 4 Subtitle conversion.

3. Video compression

Re-encoding the video at a different compression rate is a basic attack pattern, so it was selected as a robustness item.

• Check decoder support for the MPEG-4 AVC, MPEG-2, RM, and WMV codecs

• DivX 700/512/256/128 kbps compression for each codec.

4. Codec conversion

H.264 and H.265 (HEVC) are the codecs commonly used for immersive content at present. Since frames can be deformed when converting between them, codec conversion was selected as a robustness item.


Figure 5 Aspect ratio conversion.

5. Aspect ratio change

Immersive content is mostly produced at aspect ratios of 16:9 and 2:1, as observed during the dataset collection process. Since aspect ratio conversion is a basic attack pattern, it was selected as a robustness item, as shown in Figure 5; a conversion sketch follows the list below.

• 16:9 → 2:1

• 2:1 → 16:9.
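A minimal sketch of how such a variant might be generated, assuming the whole ERP frame is simply resampled (rather than letterboxed) and using OpenCV:

```python
import cv2

def change_aspect_ratio(frame, target_ratio):
    # Resample the frame so that width / height == target_ratio.
    h, w = frame.shape[:2]
    return cv2.resize(frame, (w, int(round(w / target_ratio))))

# variant_169 = change_aspect_ratio(erp_frame, 16 / 9)  # 2:1 -> 16:9
# variant_21  = change_aspect_ratio(erp_frame, 2.0)     # 16:9 -> 2:1
```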

6. Resolution change

The basic attack pattern of converting the resolution of the video was selected as a robustness item, as shown in Figure 6.

• SD (640*360) resolution conversion

• 720p HD (1280*720) resolution conversion

• 1080p HD (1920*1080) resolution conversion

• 4K UHD (3840*1920) resolution conversion

• 4K monoscopic (4096*2048) resolution conversion


Figure 6 Resolution conversion.

7. Frame-rate reduction

To reduce motion sickness or a sense of unreality, immersive content is produced at a high frame rate of 30 to 60 frames per second. Frame-rate reduction was selected as a robustness item because a video can still be played even if the frame rate is reduced; a sketch follows the list below.

• Transform video over 60 fps to 20 fps

• Transform video over 60 fps to 30 fps

• Transform video over 60 fps to 40 fps.
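A minimal sketch of producing such a variant by dropping frames with OpenCV; the output codec (mp4v) is an assumption, as the paper does not specify re-encoding settings:

```python
import cv2

def reduce_frame_rate(src_path, dst_path, target_fps=20):
    # Keep roughly every (src_fps / target_fps)-th frame and re-time the video.
    cap = cv2.VideoCapture(src_path)
    src_fps = cap.get(cv2.CAP_PROP_FPS) or target_fps
    step = max(1, round(src_fps / target_fps))
    size = (int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)),
            int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT)))
    out = cv2.VideoWriter(dst_path, cv2.VideoWriter_fourcc(*"mp4v"),
                          target_fps, size)
    i = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if i % step == 0:
            out.write(frame)
        i += 1
    cap.release()
    out.release()
```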

8. Rotation

Immersive content is rendered onto a sphere when played. For this reason, 90° and 270° rotations cannot be played, but a 180° rotation is possible by changing the device settings, so the 180° rotation was selected as a robustness item, as shown in Figure 7.

• 180° rotation.


Figure 7 Rotation conversion.

9. Flip

Flipping provides only a mirroring effect, without changing the ratio or rotation of the video. As with the 180° rotation described above, normal playback is possible with a simple device setting change, so horizontal/vertical flipping was selected as a robustness item, as shown in Figure 8.

• Flip horizontally

• Flip vertically


Figure 8 Flip.

10. Color to monochrome conversion

Since converting a color video to black and white is a basic attack pattern, it was selected as a robustness item, as shown in Figure 9; a conversion sketch follows the figure.

• I = 0.299 * R + 0.587 * G + 0.114 * B.


Figure 9 Color to monochrome conversion.
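A direct implementation of the luminance formula above (the ITU-R BT.601 weights), assuming frames stored as H x W x 3 arrays in RGB channel order:

```python
import numpy as np

def to_monochrome(frame_rgb):
    # Apply I = 0.299*R + 0.587*G + 0.114*B per pixel.
    r, g, b = frame_rgb[..., 0], frame_rgb[..., 1], frame_rgb[..., 2]
    i = 0.299 * r + 0.587 * g + 0.114 * b
    return i.astype(np.uint8)
```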

11. Brightness conversion

Since converting the brightness of the video is a basic attack pattern, it was selected as a robustness item, as shown in Figure 10.

• Change the brightness value of each pixel by +36: strong

• Change the brightness value of each pixel by +18: medium

• Change the brightness value of each pixel by +9: weak

• Change the brightness value of each pixel by -9: weak

• Change the brightness value of each pixel by -18: medium

• Change the brightness value of each pixel by -36: strong.


Figure 10 Brightness conversion.

12. Contrast change

Since changing the contrast is also a basic attack pattern, it was selected as a robustness item, as shown in Figure 11; a combined brightness/contrast sketch follows the figure.

• Contrast 80%

• Contrast 120%.


Figure 11 Contrast conversion.
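A minimal sketch covering both the brightness (item 11) and contrast (item 12) attack variants, assuming 8-bit frames and clipping to the valid range:

```python
import numpy as np

def adjust(frame, contrast=1.0, brightness=0):
    # Apply I' = contrast * I + brightness, clipped to [0, 255].
    out = frame.astype(np.float32) * contrast + brightness
    return np.clip(out, 0, 255).astype(np.uint8)

# Brightness variants: adjust(f, brightness=36), ..., adjust(f, brightness=-36)
# Contrast variants:   adjust(f, contrast=1.2), adjust(f, contrast=0.8)
```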

13. Multi-transformation

Multiple transformations are possible in immersive content through compression change, codec change, and caption insertion. This was selected as a robustness item because the content can still be played normally even under such combined deformation.

• At least two transformations among compression change, codec change, and subtitle insertion.

14. Etc.

Archive compression of the content (ZIP, EGG, and EXE, as listed in Table 1) was selected as a robustness item.

3.2 New Robustness Indicators for Immersive Content

We propose a new set of robustness indicators that consider the monoscopic and stereoscopic projection methods and the reproduction methods characteristic of immersive content, as shown in Nos. 15 to 18 of Table 2.

1. Projection conversion

ERP and Cube Map (CMP), the representative projection methods of immersive content, were selected as robustness indicators according to the characteristics of immersive content, as shown in Figure 12; a sampling sketch follows the figure.

• Transformation from ERP to CMP

• Transformation from CMP to ERP.


Figure 12 Projection conversion.
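A minimal sketch of one direction of this conversion, sampling only the front (+z) face of a cube map from an ERP frame; the axis conventions and the restriction to a single face are simplifying assumptions:

```python
import cv2
import numpy as np

def erp_to_front_cube_face(erp, face_size=512):
    # Build ray directions for the front (+z) cube face.
    a = np.linspace(-1.0, 1.0, face_size, dtype=np.float32)
    x, y = np.meshgrid(a, -a)                    # image rows run top to bottom
    z = np.ones_like(x)
    lon = np.arctan2(x, z)                       # yaw of each ray
    lat = np.arctan2(y, np.sqrt(x * x + z * z))  # pitch of each ray
    # Convert (lon, lat) to ERP pixel coordinates and sample.
    h, w = erp.shape[:2]
    map_x = ((lon / (2 * np.pi) + 0.5) * (w - 1)).astype(np.float32)
    map_y = ((0.5 - lat / np.pi) * (h - 1)).astype(np.float32)
    return cv2.remap(erp, map_x, map_y, cv2.INTER_LINEAR)
```

The reverse direction (cube map to ERP) follows the same pattern with the mapping inverted.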

2. Monoscopic cropping

This is an indicator for filtering attacks that record a part of the immersive content and process it in 2D form. For example, when the viewpoint is fixed in one direction in immersive content such as an idol music video, this indicator is used to determine whether the recording derives from the original. The central part of the immersive content, which has relatively little distortion, was set to a viewing angle of 120*90, as shown in Figure 13; a rendering sketch follows the figure.


Figure 13 Monoscopic cropping.
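A minimal sketch of rendering such a 120*90 viewport from an ERP frame at each of the four yaw locations; the pinhole projection model and the output size are assumptions:

```python
import cv2
import numpy as np

def crop_viewport(erp, yaw_deg=0.0, fov_deg=(120.0, 90.0), size=(800, 600)):
    # Render a flat perspective view (monoscopic crop) from an ERP frame.
    w_out, h_out = size
    fx = (w_out / 2) / np.tan(np.radians(fov_deg[0]) / 2)
    fy = (h_out / 2) / np.tan(np.radians(fov_deg[1]) / 2)
    xs = (np.arange(w_out, dtype=np.float32) - w_out / 2) / fx
    ys = (np.arange(h_out, dtype=np.float32) - h_out / 2) / fy
    x, y = np.meshgrid(xs, -ys)                  # camera rays before rotation
    z = np.ones_like(x)
    yaw = np.radians(yaw_deg)                    # rotate rays about vertical axis
    xr = x * np.cos(yaw) + z * np.sin(yaw)
    zr = -x * np.sin(yaw) + z * np.cos(yaw)
    lon = np.arctan2(xr, zr)
    lat = np.arctan2(y, np.sqrt(xr * xr + zr * zr))
    h, w = erp.shape[:2]
    map_x = ((lon / (2 * np.pi) + 0.5) * (w - 1)).astype(np.float32)
    map_y = ((0.5 - lat / np.pi) * (h - 1)).astype(np.float32)
    return cv2.remap(erp, map_x, map_y, cv2.INTER_LINEAR)

# views = [crop_viewport(frame, yaw) for yaw in (0, 90, 180, 270)]
```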

3. Stereoscopic cropping

This is an indicator for filtering attacks that record a part of the immersive content and process it into stereoscopic content. As with monoscopic cropping, it was set as an indicator to determine originality when shooting with a viewpoint fixed in one direction in a video such as an idol music video, as shown in Figure 14.


Figure 14 Stereoscopic cropping.

4. Viewpoint move

This is an indicator for filtering attacks that change the reference viewpoint of a video. If the reference viewpoint is changed, the extracted frames differ from the original frames, so the coordinates of the extracted feature information also differ, increasing the probability of misrecognition. Since a video modified in this way still plays in the same format when viewed, viewpoint movement was set as a robustness indicator, as shown in Figure 15.


Figure 15 Viewpoint move.

4 Filtering Performance Evaluation Method for Immersive Content

For a performance evaluation of the proposed robustness items for immersive content filtering, feature information is extracted from the original video through a feature point extraction algorithm. After feature information is extracted from the query video, its similarity with the original video is compared to determine whether it is recognized as a similar video, from which the recognition rate is derived. The first step is to create and store a dataset of feature information for the original immersive content and for the immersive content to which modification attacks were applied, as shown in Figure 16 [13].


Figure 16 Original video feature extraction algorithm.

The feature information generated from the original frame and from the transformed frame is matched using K-nearest-neighbor (K-NN) matching. Matches that are inconsistent under K-NN matching are then filtered out through RANSAC, leaving only good feature information [14–16]; the process is shown in Figure 17, and a matching sketch follows the figure.


Figure 17 Good feature information extraction process.
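A minimal sketch of this matching stage with OpenCV, assuming binary (e.g., ORB) descriptors, Lowe's ratio test for the K-NN step, and a homography-based RANSAC filter; the 0.75 ratio and 5.0-pixel reprojection threshold are assumptions, not values from the paper:

```python
import cv2
import numpy as np

def good_feature_matches(kp_q, desc_q, kp_o, desc_o):
    # K-NN match query descriptors against original ones (k=2), keep matches
    # that pass the ratio test, then filter outliers with RANSAC.
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING)
    knn = matcher.knnMatch(desc_q, desc_o, k=2)
    candidates = [p[0] for p in knn
                  if len(p) == 2 and p[0].distance < 0.75 * p[1].distance]
    if len(candidates) < 4:                      # homography needs 4+ points
        return []
    src = np.float32([kp_q[m.queryIdx].pt for m in candidates]).reshape(-1, 1, 2)
    dst = np.float32([kp_o[m.trainIdx].pt for m in candidates]).reshape(-1, 1, 2)
    _, mask = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    if mask is None:
        return []
    return [m for m, keep in zip(candidates, mask.ravel()) if keep]
```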


Figure 18 Query video feature extraction and similarity comparison algorithm.

Secondly, the dataset creation and comparison search step for the query immersive content is shown in Figure 18 and explained as follows [17].

From the immersive content, 20 s are cut out of the first half, the introductory part of the video. If a video were discriminated with only one frame, the probability of misrecognition and non-recognition in the similarity comparison would increase, so five frames are randomly extracted over the total media length and queried. Query feature information is extracted from each frame and searched against the original feature information database; it is generated by creating transformed frames from the extracted query frames in the same manner as in the original feature extraction process. A similarity comparison with the original frame is then performed for each queried frame: the feature information of the original frame and that of the query frame are compared, and a suitable feature-point similarity ratio must be determined to decide whether the two are recognized as the same frame. Similarity ratios from 50% to 100% were compared over a total of 500 frames, and the results are shown in Figure 19.

As a result of the similarity ratio experiment, 400 or more of the 500 frames were recognized as similar at similarity ratios between 80% and 100%, so the similarity ratio for the robustness experiment was set to 80%, the minimum of that range.

In addition, when a frame is recognized as the same frame, a weight of 1 point is given, and the original video and the attacked query video are judged to be similar videos when the total weight is 4 or more.
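A minimal sketch of this decision rule; frame_similarity is a hypothetical helper that would return the feature-point similarity ratio between a query frame and its best-matching original frame:

```python
def is_similar_video(query_frames, original_features,
                     similarity_threshold=0.80, required_weight=4):
    # Each of the five query frames matched at >= 80% similarity adds a
    # weight of 1; the video is judged similar at a total weight of 4+.
    weight = sum(1 for frame in query_frames
                 if frame_similarity(frame, original_features)
                 >= similarity_threshold)
    return weight >= required_weight
```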


Figure 19 Frame recognition count according to similarity ratio.

5 Experimentation and Evaluation

5.1 Experimental Data Set and Experimental Robustness Items

To verify the robustness indicators for immersive content proposed in this paper, experiments have been conducted using the performance evaluation method described in Section 4. For this experiment, 100 immersive content items serviced on YouTube and Vimeo were downloaded and used as the original dataset, as shown in Figure 20.


Figure 20 Original dataset for experiment.

Items 1 and 2 of Table 2 were excluded to avoid adverse effects on the experimental results: because the insertion position can be omnidirectional in immersive content, the reference axis may be ambiguous. Items 3 to 10 are major transformation attacks applied to 2D videos and were included in the experiment because they can also be applied to immersive content. Items 11 and 12 were excluded because the feature extraction process discards the color values of a frame once black-and-white conversion is applied, so the same results were expected. Items 13 and 14, which apply two or more modifications to the content, were excluded because their robustness can be assessed only after the proposed technique itself has been verified. Item 15 was excluded because all projection types are converted to the ERP type during preprocessing. Items 16 and 17 were included as the most important robustness indicators because, among the transformations that can occur in immersive content, they cannot be recognized by existing 2D video feature-based filtering technology. For item 18, the position values change when feature information is extracted, but the feature information of the objects in the frame does not; the extracted feature information is therefore identical, and the item was excluded from the experiment.

From these 100 original items, we created 1000 deformation-attack datasets, as shown in Table 3, and used them as query datasets.

Table 3 Deformation attack dataset for the experiment

| No | Attack Type | Number |
|---|---|---|
| 1 | Severe compression | 100 |
| 2 | Codec change | 100 |
| 3 | Aspect ratio change | 100 |
| 4 | Resolution change | 100 |
| 5 | Frame-rate reduction | 100 |
| 6 | Rotation | 100 |
| 7 | Flip | 100 |
| 8 | Color to monochrome conversion | 100 |
| 9 | Monoscopic cropping | 100 |
| 10 | Stereoscopic cropping | 100 |
| | Total | 1000 |

5.2 Experimental Methods and Results

Experiments were conducted on robustness items for each step, and the evaluation method and result calculation method for the experiment are as follows.

• Evaluation method: test the recognition rate for 1000 datasets [100 originals * 10 deformation items], with a passing criterion of 90% or higher recognition.

• Result calculation method: [number of recognized/misrecognized/unrecognized items for each deformation item] / [total number of tests] * 100 (%).

For the robustness experiments, the 10 deformation items of the robustness indicators for immersive content filtering were applied to the 1000 datasets, and a recognition rate test comparing query videos against the original videos was performed to confirm whether 90% or more were recognized. There were two experiments: in experiment (A), the existing 2D robustness items were applied to the immersive content without change, and in experiment (B), the 10 deformation items of the proposed indicators were tested. The ten deformations were compression rate, codec, aspect ratio, resolution conversion, frame rate, rotation, flip, black-and-white conversion, user-view (monoscopic) cropping, and HMD-viewpoint (stereoscopic) cropping; the results are shown in Table 4.

Table 4 Results

| No | RR (%) (A) | SS (ms) (A) | RR (%) (B) | SS (ms) (B) |
|---|---|---|---|---|
| 1 | 30 | 5066 | 97 | 3086 |
| 2 | 84 | 6978 | 99 | 3978 |
| 3 | 85 | 5245 | 96 | 3245 |
| 4 | 31 | 5153 | 97 | 3153 |
| 5 | 50 | 5067 | 92 | 3067 |
| 6 | 62 | 5157 | 92 | 3067 |
| 7 | 35 | 5150 | 97 | 3150 |
| 8 | 60 | 5135 | 95 | 3135 |
| 9 | 23 | 4090 | 98 | 3090 |
| 10 | 35 | 4388 | 86 | 3388 |
| Avg. | 49.5 | 5142 | 95.4 | 3245 |

RR: recognition rate, SS: search speed, Avg.: average.

To verify the validity of the robustness indicators, a dataset was constructed by selecting and applying a total of 10 deformation items to 100 original immersive content items. The 2D robustness items yield an average recognition rate of 49.5% with about 5 s of execution time, which is not acceptable. In contrast, the immersive content robustness indicators proposed in this paper achieve an average recognition rate of about 95% within about 3 s of execution time, which is clearly superior to applying the 2D robustness indicators.

6 Conclusion

In this paper, a new set of robustness indicators tailored to immersive content has been proposed by adapting and modifying the existing robustness indicators for the filtering performance evaluation of 2D videos. In the derivation process, the existing 2D video robustness indicators were modified according to the characteristics of the immersive content. In addition, a new set of robustness indicators for the immersive content has been added by adding items for the projection method and playback method tailored to the unique characteristics of the immersive content.

With 5G in full service, immersive content that must be processed at high speed and in large volumes is widely distributed, and filtering technology enables copyright protection by fundamentally preventing its illegal distribution. The set of robustness indicators proposed in this paper is required to evaluate the performance of such filtering technology.

As future work, research is needed not only on robustness indicators but also on reliability and performance indicators for immersive content filtering technology.

Acknowledgment

This research project was supported by the Ministry of Science, ICT and Future Planning (MSIP) and the Institute of Information & Communications Technology Planning & Evaluation (IITP) in 2023 (2022-0-00699).

References

[1] J. Lee, ‘Changes in realistic media content distribution environment and production technology in the 5G era’, National IT Industry Promotion Agency Issue Report, No.22, 2019.

[2] V. Ziegler, T. Wild, M. Uusitalo, H. Flinck, V. Räisänen, K. Hätönen, ‘Stratification of 5G evolution and Beyond 5G’, In 2019 IEEE 2nd 5G World Forum (5GWF), pp. 329–334, Sep., 2019.

[3] J. H. Park, ‘5G Era, Content Industry Changes and Implications’, KIET Industrial Economy, 2019.

[4] PwC, ‘Seeing is believing’, PwC, 2019.

[5] VideoPlus, '5G Era, Single Media Copyright Issues and Trends', VideoPlus, 2019.

[6] J. Park, J. Kim, J. Seo, S. Kim, J. Lee, ‘DNN-Based Forensic Watermark Tracking System for Realistic Content Copyright Protection’, Electronics, Vol. 12, No. 3, 553, Jan., 2023.

[7] Y. Kim, W. Kim, J. Lee, S. Jho and D. Shin, ‘Performance Evaluation of Video Contents Filtering’, Telecommunication Technology Association Standard, 2013, TTAK. KO-12.0161/R1.

[8] Korea Copyright Commission, 'Performance Evaluation of Feature-based Filtering', https://www.copyright.or.kr/kcc/tmis/performance/filtering/init.do.

[9] S. Oh, 'MPEG Omnidirectional Media Format (OMAF) for 360 Media', Journal of Broadcast Engineering, Vol. 22, No. 5, pp. 600–607, 2017.

[10] ISO/IEC 23090-2, 'Information Technology – Coded Representation of Immersive Media – Part 2: Omnidirectional Media Format', ISO/IEC 23090-2:2019(E).

[11] J. Lee, 'Immersive Media Format Standardization Trend', Broadcast and Media Magazine, Vol. 24, No. 4, pp. 343–352, 2017.

[12] G. Lee, J. Jeong, H. Shin, J. Seo, ‘Standardization Trend of 3DoF+ Video for Immersive Media’, Electronics and Telecommunications Trends, Vol. 34, No. 6, pp. 156–163, 2019.

[13] B. Park, S. Jang, I. Yoo, J. Lee, S. Kim, Y. Kim, ‘A Feature Point Extraction and Identification Technique for Immersive Contents Using Deep Learning’, J. Inst. Korean. Electr. Electron. Eng., Vol. 24, No. 2, pp. 529–535, 2020.

[14] H. Lee, O. Choi, ‘An Efficient Parameter Update Method of 360-degree VR Image Model’, International Journal of Engineering Business Management, 2019.

[15] J. Jia, C. Tang, ‘Image Stitching using Structure Deformation’, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 30, No. 4, pp. 617–631, 2008.

[16] D. Barath, J. Matas, ‘Graph-cut RANSAC’, Proc. of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, USA, pp. 6733–6741, 2018.

[17] B. C. Park, S. Y. Jang, I. J. Yoo, J. C. Lee, S. Y. Kim and Y. M. Kim, “A Feature Point Extraction and Identification Technique for Immersive Contents Using Deep Learning,” J. Inst. Korean. Electr. Electron. Eng., Vol. 24, No. 2, pp. 529–535, 2020. DOI: 10.7471/ikeee.2020.24.2.529.

Biographies


Youngmo Kim received his Ph.D. degree in Computer Engineering from Daejeon University, Daejeon, Korea in 2011. He is currently an adjunct professor at Soongsil University. He is also working on several standardization and national projects.


Seok-Yoon Kim received his B.S. degree in electrical engineering from Seoul National University in 1980. He received his M.Sc. and Ph.D. degrees in ECE from the University of Texas at Austin in 1990 and 1993, respectively. He is currently with the School of Computing, Soongsil University.


Chayapol Kamyod received his Ph.D. in Wireless Communication from the Center of TeleInFrastruktur (CTIF) at Aalborg University (AAU), Denmark. He received his M. Eng. in Electrical Engineering from The City College of New York, New York, USA. In addition, he received his B.Eng. in Telecommunication Engineering and M.Sc. in Laser Technology and Photonics from Suranaree University of Technology, Nakhon Ratchasima, Thailand. He is currently a lecturer in the Computer Engineering program at the School of Information Technology, Mae Fah Luang University, Chiang Rai, Thailand. His research interests are the resilience and reliability of computer networks and systems, wireless sensor networks, embedded technology, and IoT applications.


Byeongchan Park received his B.Sc., M.Sc., and Ph.D. degrees in Computer Science and Engineering from Soongsil University, Korea in 2015, 2018, and 2023, respectively. He is currently with the Dept. of Computer Science and Engineering, Soongsil University. He is also working on several national R&D projects.
