Fri Oct. 23 Workshops day
Room 9 - (Meeting Room 1024 @ Beijing Hotel)
|8:00-17:00 ||Registration |
Adversary Modeling in Multimedia Surveillance, Mohan Kankanhalli.
Multimedia surveillance systems are used for three purposes:
deterrence, real-time monitoring, and forensics. In all of these intended
uses, the notion of an adversary who is actively trying to defeat
the system has not been studied much.
We consider surveillance problems to be a set of system-adversary
interaction problems in which an adversary can be modeled as a rational
(selfish) agent trying to maximize his utility. We feel that
appropriate adversary modeling can provide deep insights into the
system performance and also clues for optimizing the system's performance
against the adversary. Further, the system designers should exploit the
fact that they can impose certain restrictions on the intruders and the
way they interact with the system. The system designers can analyze the
scenario to determine conditions under which the system outperforms the
adversaries, and then suitably re-engineer the environment under a
"scenario engineering" approach to help the system outperform the adversary.
We show such enhancements to two significantly different surveillance
scenarios using a game theoretic framework and present results of their
adaptation. While the precise enforcements for the zero-sum ATM lobby
monitoring scenario and the non-zero-sum traffic monitoring scenario
are different, they lead to some useful generic guidelines for surveillance
Mohan Kankanhalli is a Professor at the Department of Computer
Science at the National University of Singapore. He is also the
Vice-Dean for Academic Affairs and Graduate Studies at the
NUS School of Computing. He obtained his BTech (Electrical Engineering)
from the Indian Institute of Technology, Kharagpur and his MS/PhD (Computer
and Systems Engineering) from the Rensselaer Polytechnic Institute.
He is actively involved in the organization of many major conferences in
the area of Multimedia. He is on the editorial boards of several journals
including the ACM Transactions on Multimedia Computing, Communications,
and Applications, IEEE Transactions on Multimedia, Springer Multimedia
Systems Journal, Multimedia Tools and Applications, and the Pattern Recognition
His current research interests are in Multimedia Systems (content processing,
retrieval) and Multimedia Security (surveillance, digital rights management
More details are available at: http://www.comp.nus.edu.sg/~mohan
|10:00-10:30 ||Coffee Break |
Oral Session 1 - Detection and Mining - Session Chair: Marcel Worring|
- Graffiti-ID: Identifying Gang Graffiti Images, Anil Jain (Michigan State University, US); Jung-Eun Lee (Michigan State University, US); Rong Jin (Michigan State University, US)
Gang graffiti are often used by law enforcement agencies for understanding gang activities and uncovering the extent of a gang's territory. The current method for matching and retrieving graffiti is based on manual database search, which is time consuming and has limited performance. In this paper, we present a content-based image retrieval (CBIR) system for automatic matching and retrieval of gang graffiti images. Based on Scale Invariant Feature Transform (SIFT) features extracted from graffiti images, our system computes feature-based similarity between the query image and graffiti images in a database. Experiments on two different graffiti databases show encouraging results.
- Temporal Normalization of Videos Using Visual Speech, Usman Saeed (Institute Eurecom, FR); Jean-Luc Dugelay (Institut EURECOM, FR)
Pose and illumination variation have been considered the major causes of poor recognition results in automatic face recognition as compared to other biometrics. The advent of video-based face recognition a decade ago presented some new opportunities: algorithms were developed to take advantage of the abundance of data and the behavioral aspect of recognition. But this modality also introduced some new challenges, one of which is the variation introduced by speech. In this paper we present a novel method for handling this variation by using temporal normalization based on lip motion. Evaluation was carried out by comparing face recognition results from original non-normalized videos and normalized videos.
- Video Surveillance and Multimedia Forensics: an Application to Trajectory Analysis, Simone Calderara (University of Modena and Reggio Emilia, IT); Andrea Prati (Universita' di Modena e Reggio Emilia, IT); Rita Cucchiara (University of Modena and Reggio Emilia, IT)
This paper reports an application of trajectory analysis in which forensics and video surveillance techniques are jointly employed to provide a new tool for multimedia forensics. Advanced video surveillance techniques are used to extract from a multi-camera system the trajectories of moving people, which are then modelled by either their positions (projected on the ground plane) or their directions of movement. Both representations are well suited to querying large video repositories, by searching for similar trajectories in terms of either sequences of positions or trajectory shape (encoded as a sequence of angles, where positions do not matter). Preliminary examples of the possible use of this approach are shown.
|12:00-13:00 || Oral Session 2 - Tracking - Session Chair: Rita Cucchiara|
- Single View Geometry and Active Camera Networks Made Easy, Federico Pernici (University of Florence, IT); Alberto Del Bimbo (University of Florence, IT)
Active camera networks have an important role in surveillance systems. They have the ability to direct attention to interesting events that occur in the scene. To achieve such behavior, the cameras in the network use a process known as sensor slaving, where at least two cameras are in a master-slave configuration: the master camera monitors a wide area and tracks moving targets so as to provide positional information to the slave camera, and the slave camera foveates on the targets in high resolution. In this paper, we propose a simple method to solve two typical problems that are the basic building blocks for creating high-level functionality in active camera networks viewing a scene plane: the computation of the world-to-image homographies and the computation of the image-to-image homographies. The first is used for computing the image sensor likelihood for sequential target tracking (for example with the Extended Kalman Filter). The second is used for camera slaving. We show how planar mosaics and single view geometry can be used to compute the aforementioned homographies.
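To illustrate how the two kinds of homography in this abstract relate, here is a minimal Python sketch. It is not the authors' method (they derive the homographies from a planar mosaic and single view geometry); it only uses the standard identity that, for a shared scene plane, the image-to-image homography from camera 1 to camera 2 is H2 · H1^-1, where H1 and H2 are the world-to-image homographies. All function names and the example matrices are hypothetical.

```python
from typing import List, Tuple

Matrix = List[List[float]]  # 3x3 homography, row-major


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Product of two 3x3 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def inverse(m: Matrix) -> Matrix:
    """Inverse of a 3x3 matrix via the adjugate; raises if singular."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    if det == 0:
        raise ValueError("homography is singular")
    adj = [[e * i - f * h, c * h - b * i, b * f - c * e],
           [f * g - d * i, a * i - c * g, c * d - a * f],
           [d * h - e * g, b * g - a * h, a * e - b * d]]
    return [[v / det for v in row] for row in adj]


def apply(hmg: Matrix, point: Tuple[float, float]) -> Tuple[float, float]:
    """Map a 2D point through a homography (with perspective division)."""
    x, y = point
    w = hmg[2][0] * x + hmg[2][1] * y + hmg[2][2]
    return ((hmg[0][0] * x + hmg[0][1] * y + hmg[0][2]) / w,
            (hmg[1][0] * x + hmg[1][1] * y + hmg[1][2]) / w)


def image_to_image(h_world_to_img1: Matrix, h_world_to_img2: Matrix) -> Matrix:
    """Homography taking pixels of image 1 to pixels of image 2."""
    return matmul(h_world_to_img2, inverse(h_world_to_img1))


# Hypothetical example: camera 1 scales the plane by 2, camera 2 shifts it.
H1 = [[2.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 1.0]]
H2 = [[1.0, 0.0, 5.0], [0.0, 1.0, 3.0], [0.0, 0.0, 1.0]]
H12 = image_to_image(H1, H2)
```

In a master-slave setup, a mapping of this kind is what would let a target position seen by the master camera be expressed in the slave camera's image.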
- Multi-target Tracking in Time-lapse Video Forensics, Paul Koppen (University of Amsterdam, NL); Marcel Worring (University of Amsterdam, NL)
We introduce a multi-target tracking algorithm that operates on recorded video. Apart from being robust to visual challenges (like partial and full occlusion, variation in illumination and camera view), our algorithm is also robust to temporal challenges, i.e., unknown variation in frame rate. The complication with variation in frame rate is that it invalidates motion estimation. As such, tracking algorithms that are based on motion models will show decreased performance. On the other hand, high precision appearance based tracking suffers from a plethora of false detections. Our tracking algorithm, albeit relying on appearance based detection, deals robustly with the caveats of both approaches. The solution rests on the fact that we can make fully informed choices; not only based on preceding, but also based on following frames. It works as follows. We assume an object detection algorithm that is able to detect all target objects that are present in each frame. From this we build a graph structure. The detections form the graph's nodes. The edges are formed by connecting each detection in one frame to all detections in the following frame. Thus, each path through the graph shows some particular selection of successive object detections. Object tracking is then reformulated as a heuristic search for optimal paths, where optimal means to find all detections belonging to a single object and excluding any other detection. We show that this approach, without an explicit motion model, is robust to both the visual and temporal challenges.
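The graph formulation in this abstract can be sketched in a few lines of Python. This is a simplified illustration, not the authors' algorithm: it swaps their heuristic search for a plain dynamic-programming shortest path, follows a single target, and assumes each detection is a hypothetical appearance feature vector compared by squared Euclidean distance. It also assumes every frame contains at least one detection.

```python
from typing import List, Sequence, Tuple

Detection = Tuple[float, ...]  # hypothetical appearance feature vector


def appearance_cost(a: Detection, b: Detection) -> float:
    """Squared Euclidean distance between two appearance vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def best_path(frames: Sequence[Sequence[Detection]], start: int) -> List[int]:
    """Return one detection index per frame, beginning at frames[0][start],
    that minimises the summed appearance cost along the path."""
    # cost[j] = cheapest cost of any path ending at detection j of the current frame
    cost = [0.0 if j == start else float("inf") for j in range(len(frames[0]))]
    back: List[List[int]] = []  # back[t][j] = best predecessor of detection j in frame t+1
    for prev, cur in zip(frames, frames[1:]):
        new_cost, pointers = [], []
        for d in cur:
            # Edges connect every detection in one frame to every detection in the next.
            options = [cost[i] + appearance_cost(p, d) for i, p in enumerate(prev)]
            i_best = min(range(len(options)), key=options.__getitem__)
            new_cost.append(options[i_best])
            pointers.append(i_best)
        cost = new_cost
        back.append(pointers)
    # Trace the cheapest path back from the last frame to the first.
    j = min(range(len(cost)), key=cost.__getitem__)
    path = [j]
    for pointers in reversed(back):
        j = pointers[j]
        path.append(j)
    return path[::-1]


# Hypothetical one-dimensional appearance features for two objects over three frames.
frames = [[(0.0,), (10.0,)], [(10.1,), (0.2,)], [(0.1,), (9.9,)]]
track_a = best_path(frames, start=0)
track_b = best_path(frames, start=1)
```

Because the path cost depends only on appearance and on the whole sequence of frames, no motion model is involved, which is the property the abstract relies on when the frame rate is unknown.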
|13:00-14:00 || Lunch |
|14:00-15:30 ||Oral Session 3 - Multimedia forensics prototypes - Session Chair: Rita Cucchiara|
- Videntifier Forensic: A New Law Enforcement Service for Automatic Identification of Illegal Video Material, Herwig Lejsek (Reykjavík University, IS); Friðrik H Ásmundsson (Eff2 Technologies, IS); Kristleifur Daðason (Reykjavík University, IS); Baldur Jóhannesson (Eff2 Technologies, IS); Ævar Kvaran (Eff2 Technologies, IS); Björn Þ Jónsson (Reykjavík University, IS); Laurent Amsaleg (irisa-cnrs, FR)
Tracking down producers and distributors of offensive video material, in particular child pornography, has become an ever-growing focus of the world's law enforcement agencies. We describe Videntifier™ Forensic, a new service which radically improves the video identification process by providing law enforcement agencies with a robust, efficient and easy-to-use video identification system. Using this service, a single mouse-click is sufficient to automatically scan an entire storage device and classify all videos. We give an overview of the service and the underlying technology components. We then describe an acceptance test, performed by the Icelandic police forces, which demonstrates the robustness of the service.
- Image Spam Clustering - An Unsupervised Approach, Chengcui Zhang (University of Alabama at Birmingham, US); Wei-Bang Chen (University of Alabama at Birmingham, US)
In this paper, an unsupervised image clustering framework is proposed for revealing the common origins, i.e. the spam gangs, of unsolicited emails. We target email spam with image attachments because spam information is harder to extract from it, owing to the information hiding enabled by various image obfuscation techniques. To identify the spam gangs, we observe that spam images from the same source are usually composed of visually similar elements which are arranged and altered in many different ways in order to trick the spam filter. We therefore propose to infer which spam images originate from the same spam gang by investigating spam email similarity in terms of visual appearance and editing style, using a data mining technique based on unsupervised image clustering. This is achieved by first dividing a spam image into different areas/segments, including texts, foreground graphic illustrations, and background areas. The proposed framework then extracts characteristic visual features from the segmented areas, including text layout, visual features of foreground graphic illustrations and their spatial layout, and background texture features. In the clustering stage, a two-stage clustering method is proposed to group images with similar foreground illustrations first. Then those images which contain mostly texts are clustered based on the similarity of their text layouts and background textures. We test the proposed approach using different settings and combinations of features. The overall performance of the proposed clustering method is measured with V-measure.
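For readers unfamiliar with the evaluation metric named in this abstract, the following Python sketch computes V-measure in its standard form: the harmonic mean of homogeneity (each cluster contains a single class) and completeness (each class falls in a single cluster), both derived from conditional entropies. The abstract does not state which weighting the authors used, so equal weighting of the two terms is an assumption here, and the function names and sample labels are hypothetical.

```python
from collections import Counter
from math import log
from typing import Hashable, Sequence


def _entropy(labels: Sequence[Hashable]) -> float:
    """Shannon entropy H(labels) in nats."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def _conditional_entropy(target: Sequence[Hashable],
                         given: Sequence[Hashable]) -> float:
    """H(target | given), estimated from paired label sequences."""
    n = len(target)
    joint = Counter(zip(given, target))
    given_counts = Counter(given)
    return -sum((c / n) * log(c / given_counts[g]) for (g, _), c in joint.items())


def v_measure(classes: Sequence[Hashable], clusters: Sequence[Hashable]) -> float:
    """Harmonic mean of homogeneity and completeness (equal weighting assumed)."""
    h_classes, h_clusters = _entropy(classes), _entropy(clusters)
    homogeneity = (1.0 if h_classes == 0
                   else 1.0 - _conditional_entropy(classes, clusters) / h_classes)
    completeness = (1.0 if h_clusters == 0
                    else 1.0 - _conditional_entropy(clusters, classes) / h_clusters)
    if homogeneity + completeness == 0:
        return 0.0
    return 2 * homogeneity * completeness / (homogeneity + completeness)


# Hypothetical ground truth (spam gang per image) and two candidate clusterings.
gangs = ["a", "a", "b", "b"]
perfect = v_measure(gangs, [0, 0, 1, 1])      # clusters match gangs exactly
collapsed = v_measure(gangs, [0, 0, 0, 0])    # everything in one cluster
```

A clustering that reproduces the ground-truth grouping scores 1, while lumping all images into one cluster is complete but not homogeneous and scores 0.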
- Design and Deployment of a Digital Forensics Service Platform for Internet Videos, Wen Hui (University of Science and Technology Beijing, CN); Hao Yin (Tsinghua University, CN); Chuang Lin (Tsinghua University, CN)
The increasing amount of illegal video transmitted via the Internet has created a need for digital video forensic systems that deter and help prosecute digital crimes. In this paper, we propose IVForensic, a digital forensics service platform for Internet videos, with the goal of revealing illegal videos and preventing them from spreading over the Internet. Different from previous systems, it a) detects more illegal acts; b) processes large amounts of video data in real time; c) pays more attention to end-user experience. For capturing legal evidence, a special technique, video fingerprinting, is employed. We present a hybrid fingerprint and improve the related algorithms in important ways. To solve dynamic resource scheduling and load balancing problems, the platform is built upon Content Distribution Networks (CDNs). To our knowledge, IVForensic is the first large-scale digital forensics service platform for Internet videos. It has been deployed and used in practice for detecting thousands of videos in a real web environment. The results of performance evaluation using data obtained from these real-world deployments demonstrate the effectiveness of the platform.
|15:30-16:00 ||Coffee Break |
|16:00-17:30 || Oral Session 4 - Forgery and Splicing Detection - Session Chair: Marcel Worring|
- Digital Forgery Estimation into DCT Domain - A Critical Analysis, Sebastiano Battiato (University of Catania, IT); Giuseppe Messina (University of Catania, IT)
One of the key characteristics of digital images with a discrete representation is their pliability to manipulation. Recent trends in the field of unsupervised detection of digital forgery include several advanced strategies devoted to revealing anomalies by considering several aspects of multimedia content. One promising approach, among others, exploits the statistical distribution of DCT coefficients to reveal the irregularities due to the presence of a signal superimposed over the original one (e.g., copy and paste). As recently proved, the ratio between the quantization tables used to compress the signal before and after the malicious forgery alters the histograms of the DCT coefficients, especially for some basis functions that are close in terms of frequency content. In this work we analyze in more detail the performance of existing approaches, evaluating their effectiveness on different input datasets with respect to resolution, compression ratio, and different kinds of forgery (e.g., presence of duplicate regions or image composition). We also present possible post-processing techniques able to manipulate the forged image so as to reduce the performance of the current state-of-the-art solution. Finally, we conclude the paper by proposing future improvements devoted to increasing the robustness and reliability of forgery detection in the DCT domain.
- A New Approach for JPEG Resize and Image Splicing Detection, Qingzhong Liu (New Mexico Tech, US)
Today's ubiquitous digital media are easily tampered with, removing or inserting significant objects without leaving any obvious clues. JPEG images are the most popular images on the Internet and are easily doctored. For many purposes, it is necessary and very important to design reliable methods to detect forgery in JPEG images. In this paper, we propose a new approach to detect resized JPEG images and spliced images, which are widely used in image forgery. First, following an observation of a bivariate generalized Gaussian distribution (BGGD) in the DCT domain, the neighboring joint density features are extracted; then support vector machines (SVM) are applied to the features for detection of resized JPEG images and splicing forgery. To improve the evaluation of detection, we utilize the shape parameter of the generalized Gaussian distribution of DCT coefficients to measure the image complexity. The study shows that our approach is promising for detecting resized JPEG images and splicing forgery. Additionally, image complexity and resize scale factor are important to the detection of resized JPEG images. Detection in images of high complexity is not as good as in images of low complexity. Different resize scale factors are associated with different detection performance.
- Exposing Digital Video Forgery by Ghost Shadow Artifact, Jing Zhang (Tianjin University, CN)
In the digital multimedia era, it is increasingly important to ensure the integrity and authenticity of the vast volumes of video data. In this paper, a novel approach based on the ghost shadow artifact is proposed for detecting video forgery. The ghost shadow artifact is usually introduced when moving objects are removed by video inpainting. In our approach, the ghost shadow artifact is detected from inconsistencies between the moving foreground segmented from the video frames and the moving track obtained from the accumulative frame differences, thereby exposing the forgery. Experiments show that our approach achieves promising results in video forgery detection.