Abstract
With the rapid proliferation of multimedia data on the internet, there has been a sharp rise in the creation and consumption of videos. Viewers can now skip the advertisement breaks in these videos using ad blockers and 'skip ad' buttons, bringing online marketing and publicity to a standstill. In this paper, we demonstrate a system that can effectively integrate a new advertisement into a video sequence. We use state-of-the-art techniques from deep learning and computational photogrammetry for the effective detection of existing adverts and the seamless integration of new adverts into video sequences. This is helpful for targeted advertisement, paving the path for next-gen publicity. A demonstration video related to this paper is available at: https://youtu.be/zaKpJZhBVL4.
A. Nautiyal, K. McCabe, M. Hossari and S. Dev contributed equally; authors are listed alphabetically.
1 Introduction
With the ubiquity of multimedia videos, advertisement and marketing agencies have shown massive interest in providing targeted advertisements to customers. Such targeted advertisements are useful from the perspectives of both marketing agents and end users: the advertisement agencies gain a powerful medium for marketing and publicity, and the users receive a personalized consumer experience. In this paper, we attempt to address this by designing an online advert creation system for next-gen publicity. We develop and implement an end-to-end system that automatically detects an existing billboard in a video and seamlessly replaces it with a new advert. This system will help online marketers and content developers to create video content for targeted audiences.
Figure 1 illustrates our system. It automatically detects the presence of a billboard in an image frame from the video sequence. After billboard detection, the system localizes the billboard's position in the image frame. The user is given an opportunity to manually adjust and refine the four detected corners of the billboard. Finally, a new advertisement is integrated into the image and tracked across all frames of the video sequence, generating a new composite video with the integrated advert.
Currently, there is no existing framework in the literature that aids marketing agents in seamlessly integrating a new advertisement into an original video sequence. However, a few companies, e.g. Mirriad [1], use patented advertisement placement techniques to integrate 3D objects into video sequences.
2 Technology
The backbone of our advert creation system is based on state-of-the-art techniques from deep learning and image processing. In this section, we briefly describe the underlying techniques used in the various components of the demo system. The different modules of our system are: advert recognition, localization, and integration.
2.1 Advert Recognition
The first module of our advert creation system recognizes billboards (footnote 1) – does an image frame from the video sequence contain a billboard? This helps the user automatically detect the presence of a billboard in an image frame of the video. We use a deep neural network (DNN) as a binary classifier, where the two classes represent the presence and absence of a billboard in a video frame. We use a VGG-based network [4] for billboard detection, with transfer learning from pre-trained ImageNet weights: we freeze the weights of all layers apart from the last 5, and append 3 fully connected layers with a softmax layer as the output. We train this deep network on our annotated dataset, containing both billboard and non-billboard images, and achieve good accuracy on billboard recognition.
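The paper does not specify the sizes of the appended layers; the following is a minimal numpy sketch of the classifier head described above (three fully connected layers ending in a softmax output), with hypothetical layer dimensions and random untrained weights, showing only the forward pass on a frozen VGG feature vector:

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))  # numerically stable
    return e / e.sum(axis=-1, keepdims=True)

# Hypothetical dimensions: 512-d frozen VGG feature -> 256 -> 64 -> 2 classes.
dims = [512, 256, 64, 2]
weights = [rng.standard_normal((m, n)) * 0.01 for m, n in zip(dims, dims[1:])]
biases = [np.zeros(n) for n in dims[1:]]

def classifier_head(features):
    """Forward pass of the 3 appended fully connected layers + softmax."""
    x = features
    for i, (W, b) in enumerate(zip(weights, biases)):
        x = x @ W + b
        if i < len(weights) - 1:      # ReLU on the hidden layers only
            x = relu(x)
    return softmax(x)                 # [p(billboard), p(no billboard)]

probs = classifier_head(rng.standard_normal((1, 512)))
```

In the full system these layers sit on top of the frozen convolutional backbone and are trained by backpropagation on the annotated billboard dataset; the sketch only makes the head's structure concrete.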
2.2 Advert Localization
The second module of our advert creation system localizes the position of the recognized billboard – where is the billboard located in the image frame? We use an encoder-decoder based deep neural network that localizes the billboard position in an image. We train this model for several thousand epochs on our billboard dataset, which comprises input images (cf. Fig. 2(a)) and corresponding binary ground-truth images (cf. Fig. 2(b)). The localized billboard is a probabilistic heatmap that denotes the probability of each image pixel belonging to the billboard class. We generate a binary image from the computed heatmap using thresholding, and detect the closed contours in the binary image. We then select the contour with the largest area as the localized billboard position, and compute the initial four corners by circumscribing a minimum-area bounding rectangle around the selected contour. The localized advert is shown in Fig. 2.
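The post-processing step above can be sketched in plain numpy. For brevity, this sketch substitutes a BFS connected-component search for contour detection and an axis-aligned bounding box for the minimum-area rotated rectangle, and runs on a small synthetic heatmap; it is an illustration of the pipeline, not the implementation used in the system:

```python
import numpy as np
from collections import deque

def billboard_corners(heatmap, thresh=0.5):
    """Threshold a probabilistic heatmap, keep the largest connected
    region, and return the four corners of its bounding rectangle."""
    binary = heatmap >= thresh
    seen = np.zeros_like(binary)
    h, w = binary.shape
    best = []
    for sy in range(h):
        for sx in range(w):
            if binary[sy, sx] and not seen[sy, sx]:
                # BFS flood fill of one connected component
                comp, q = [], deque([(sy, sx)])
                seen[sy, sx] = True
                while q:
                    y, x = q.popleft()
                    comp.append((y, x))
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < h and 0 <= nx < w \
                                and binary[ny, nx] and not seen[ny, nx]:
                            seen[ny, nx] = True
                            q.append((ny, nx))
                if len(comp) > len(best):
                    best = comp          # keep the largest-area region
    ys = [p[0] for p in best]
    xs = [p[1] for p in best]
    # corners in order: top-left, top-right, bottom-right, bottom-left
    return [(min(xs), min(ys)), (max(xs), min(ys)),
            (max(xs), max(ys)), (min(xs), max(ys))]

# Toy 6x6 heatmap: one strong 2x3 region plus a weak isolated pixel.
hm = np.zeros((6, 6))
hm[1:3, 1:4] = 0.9
hm[5, 5] = 0.4
```

The weak pixel falls below the threshold and is discarded, so only the strong region contributes corners.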
2.3 Advert Integration
The third and final module of our system is advert integration – how do we integrate a new advert into the video? In this stage, the localized billboard is replaced with a new advert in a seamless and temporally consistent manner. We use Poisson image editing [3] on the new advert to match the local illumination and color tone of the original video sequence. Furthermore, the relative motion of the billboard within the scene is tracked using the Kanade-Lucas-Tomasi (KLT) [2] tracking technique.
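The KLT tracker is built on the Lucas-Kanade least-squares alignment step. As an illustrative numpy sketch (a single-level, pure-translation variant on a synthetic patch, not the full pyramidal feature tracker used in the system), the inter-frame motion of a patch can be estimated as:

```python
import numpy as np

def lk_translation(prev, curr):
    """One Lucas-Kanade step: estimate the (dx, dy) translation between
    two grayscale patches by least squares on the image gradients."""
    gy, gx = np.gradient(prev.astype(float))        # spatial gradients
    gt = curr.astype(float) - prev.astype(float)    # temporal difference
    A = np.stack([gx.ravel(), gy.ravel()], axis=1)
    b = -gt.ravel()
    # Least-squares solution of the optical-flow constraint A d = b.
    d, *_ = np.linalg.lstsq(A, b, rcond=None)
    return d  # (dx, dy)

# Synthetic example: a smooth patch shifted right by one pixel.
x = np.linspace(0, np.pi, 32)
frame0 = np.outer(np.sin(x), np.sin(x))
frame1 = np.roll(frame0, 1, axis=1)   # shift content by +1 in x
dx, dy = lk_translation(frame0, frame1)
```

In practice, library implementations handle both components of this module: OpenCV, for instance, provides Poisson blending via `cv2.seamlessClone` and pyramidal KLT tracking via `cv2.calcOpticalFlowPyrLK`; the sketch above shows only the core least-squares step.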
3 Design and Interface
We have designed an online system to demonstrate the functionalities of the various modules (footnote 2). The web interface is built with Vue.js, a progressive JavaScript framework, and the back end is supported by Express, a Node.js web application framework. The deep neural networks for advert recognition and localization are implemented in pure Python, and the advert integration is implemented in C++. The web service supporting advert detection is implemented with Python Flask, and the integration of a new advert into the existing video on the web server is executed via a C++ binary.
Figure 3 illustrates a sample snapshot of our web-based tool. The web interface consists of three main sections: Home, Demo and Images. The Home page provides an overview of the system. The Demo page hosts the working prototype of our system: the user selects a sample video from the list and runs the billboard detection module, which localizes the billboard in sample image frames of the video and estimates its four corners. The user may refine the four corners manually if the detected corners are not completely accurate. The refined corners are subsequently used for tracking and for integrating a new advertisement into the video sequence. The third and final page, Images, lists all candidate adverts that can be integrated into the selected video sequence.
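Once the refined four corners are fixed, placing a flat advert image onto the billboard amounts to estimating the homography that maps the advert's corners onto them. A minimal Direct Linear Transform (DLT) sketch in numpy is shown below; the corner coordinates are hypothetical, and the actual per-frame warping in our system is performed by the C++ integration module:

```python
import numpy as np

def homography(src, dst):
    """DLT: 3x3 homography mapping four src points onto four dst points."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        # Two linear constraints per correspondence on the 9 entries of H.
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, Vt = np.linalg.svd(np.array(A, dtype=float))
    H = Vt[-1].reshape(3, 3)      # null-space vector of the constraint matrix
    return H / H[2, 2]            # normalize so H[2, 2] == 1

def apply_h(H, pt):
    """Apply a homography to a 2D point (homogeneous coordinates)."""
    p = H @ np.array([pt[0], pt[1], 1.0])
    return p[:2] / p[2]

# Advert image corners -> refined billboard corners (hypothetical values).
advert = [(0, 0), (640, 0), (640, 360), (0, 360)]
board = [(102, 85), (498, 60), (505, 310), (95, 330)]
H = homography(advert, board)
```

With exactly four correspondences the mapping is exact, so each advert corner lands precisely on its refined billboard corner.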
Finally, our system integrates the new advertisement into the detected billboard position, and generates a new composite video with the implanted advertisement.
4 Conclusion and Future Work
In this paper, we have presented an online advert creation system for personalized and targeted advertisement in multimedia videos. We use techniques from deep neural networks and image processing for the seamless integration of new adverts into existing videos. Our system is trained on datasets comprising outdoor scenes and views. Our future work involves further refining the performance of the system and generalizing it to other types of video sequences.
Notes
- 1.
In this paper, we use the terms billboard and advert interchangeably to indicate a candidate object for new advertisement integration in an image frame.
- 2.
A demonstration video of our advert creation system can be accessed via https://youtu.be/zaKpJZhBVL4.
References
1. Mirriad: Scalable, effective campaigns (2018). http://www.mirriad.com/. Accessed 7 May 2018
2. Lucas, B.D., Kanade, T.: An iterative image registration technique with an application to stereo vision. In: Proceedings of the 7th International Joint Conference on Artificial Intelligence (IJCAI) (1981)
3. Pérez, P., Gangnet, M., Blake, A.: Poisson image editing. ACM Trans. Graph. (TOG) 22(3), 313–318 (2003)
4. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. CoRR abs/1409.1556 (2014)
Acknowledgement
The ADAPT Centre for Digital Content Technology is funded under the SFI Research Centres Programme (Grant 13/RC/2106) and is co-funded under the European Regional Development Fund.
© 2019 Springer Nature Switzerland AG
Nautiyal, A. et al. (2019). An Advert Creation System for Next-Gen Publicity. In: Brefeld, U., et al. Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2018. Lecture Notes in Computer Science(), vol 11053. Springer, Cham. https://doi.org/10.1007/978-3-030-10997-4_47
DOI: https://doi.org/10.1007/978-3-030-10997-4_47
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-10996-7
Online ISBN: 978-3-030-10997-4