2017 Seminars – Centre for Robotics & Vision @ AUT

For time and location, please see the individual seminar entries.

14 December 2017 (12pm, Thursday, WT515C) Robert Yang (EEE, AUT) The Use of Video to Detect and Measure Pollen on Bees Entering a Hive.
The reported research measures the pollen being brought into the beehive in the bees’ pollen sacs. Processed 2D video is obtained at the entrance to the beehive; the goal is to count automatically the number of full pollen sacs which bees bring back. The technology used in this research relates to object detection using computer vision. Many papers in the field of object detection try to detect and track different objects, such as human beings, vehicles and animals. However, very little research has focused on the flight of bees, and none has involved identifying the pollen sacs. Difficulty arises from requiring high resolution video and high speed processing because bees can fly at high speed. The talk explains details of the theory of the methods used in this research. In addition, it reports on results of applying the methods in practice. Experimental results indicate that tracking of single bees is over 99% accurate. In addition, the pollen measurement sensitivity is over 60% on the video.
16 November 2017 (12pm, Thursday, WT515C) Kam Yuk (Leona) Chan (EEE, AUT) Personalised annotations for augmented video coaching.
Modern mobile and video technologies provide high frame rates and high-quality video. Such technologies can advance augmented coaching and qualitative movement diagnostics to help sport enthusiasts improve their golf swing. A computational challenge specific to golf swing video, where the background is not completely static, is that each pixel may carry both static and dynamic information belonging to either foreground or background. Experimental evidence indicates that foreground/background separation algorithms commonly used in surveillance would not perform well in golf-specific contexts. Initial experimental evaluation included benchmarking and combining commonly used surveillance algorithms (in Matlab) that could provide a silhouette of a golfer and a club. The evaluated solutions combined frame difference, erosion/dilation, blob detection and a Gaussian mixture model, applied to video captured at the AUT golf driving range. The resulting multi-layered solution for privacy preservation of golfing activity provides silhouette-like video and images that can be used for augmented coaching and visual annotation feedback while preserving players’ privacy. A video or generated report annotated with angles, golf club head trajectory and other elements of swing performance is an important coaching tool that facilitates golf learning for novice to intermediate players.
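The first two stages named in the abstract above, frame difference followed by erosion/dilation, can be sketched as follows. This is a minimal illustration in plain Python, not the speaker's Matlab implementation; the function names, the 3x3 structuring element and the threshold value are assumptions made for the example.

```python
def frame_difference(prev, curr, threshold):
    """Binary mask: 1 where the absolute grey-level change exceeds threshold."""
    return [[1 if abs(c - p) > threshold else 0 for p, c in zip(prow, crow)]
            for prow, crow in zip(prev, curr)]


def _morph(mask, op):
    """Apply a 3x3 min (erosion) or max (dilation) filter.

    Pixels outside the image are treated as 0 (background).
    """
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [
                mask[y + dy][x + dx]
                if 0 <= y + dy < h and 0 <= x + dx < w else 0
                for dy in (-1, 0, 1) for dx in (-1, 0, 1)
            ]
            out[y][x] = op(window)
    return out


def erode(mask):
    return _morph(mask, min)


def dilate(mask):
    return _morph(mask, max)


def foreground_mask(prev, curr, threshold=20):
    """Frame difference, then morphological opening (erode, then dilate).

    The opening removes isolated noise pixels while roughly preserving
    larger moving regions such as a golfer's silhouette.
    """
    return dilate(erode(frame_difference(prev, curr, threshold)))
```

Given two grey-level frames as lists of rows, `foreground_mask(prev, curr)` returns a binary mask of the same size. Blob detection and a Gaussian mixture background model would follow as later layers and are not shown here.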
9 November 2017 (12pm, Thursday, WT515C) Loulin Huang (EEE, AUT) Dynamic Control of UAVs – some case studies.
The talk will cover dynamic control of UAVs in two configurations – quadrotor (6 DOF) and twin-rotor (2 DOF for hovering control). Common approaches for dynamic modelling, state estimation, controller design and embedded system design for controller implementation will be presented. The role of dynamic modelling in controller design will also be examined in view of practical applications.
2 November 2017 (12pm, Thursday, WT515C) Syeda Fouzia (EEE, AUT) Computer Vision-Based Improved Process Efficiency in Warehouses.
Optimizing warehousing processes and ensuring work safety are among the key concerns of the logistics industry. Computer vision has recently been used in conjunction with other technologies to improve industrial logistics handling in warehouses, but its use is still limited. We suggest the use of added sensors, i.e. fixed infrastructure monocular cameras, which could help improve safety inside warehouses. They can enhance situational awareness in industrial environments such as warehouses and ensure safe working conditions for pedestrians working nearby. The recent use of deep learning algorithms, alongside computer vision, has brought a revolution in the IT industry. In this research, we aim to review computer vision and deep learning algorithms for accurate recognition of objects.
26 October 2017 (12pm, Thursday, WT515C) Zahra Moayed (EEE, AUT) Traffic Intersection Monitoring Using Fusion of GMM-based Deep Learning Classification and Geometric Warping.
In this work we present a vision-based road user monitoring system for traffic intersections using a combination of Gaussian Mixture Model (GMM)-based deep learning approaches and geometric warping for further behaviour analysis. We thus extract the real distance and the approximate real time speed of each road user. The results demonstrate that the proposed region proposal generator method for Fast R-CNN outperforms the other methods in terms of both accuracy and computation time.
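One common way to obtain real distances and speeds from image tracks is to warp image points onto the ground plane with a planar homography. The abstract does not say which warping model is used, so the sketch below is an assumption for illustration only; the matrix `H`, the function names and the sample values are hypothetical.

```python
import math


def warp_point(H, u, v):
    """Map pixel (u, v) to ground-plane coordinates using a 3x3 homography H."""
    x = H[0][0] * u + H[0][1] * v + H[0][2]
    y = H[1][0] * u + H[1][1] * v + H[1][2]
    w = H[2][0] * u + H[2][1] * v + H[2][2]
    return x / w, y / w


def ground_speed(H, p0, p1, dt):
    """Approximate speed (ground units per second) between two tracked pixel positions."""
    x0, y0 = warp_point(H, *p0)
    x1, y1 = warp_point(H, *p1)
    return math.hypot(x1 - x0, y1 - y0) / dt


# Hypothetical calibration: a pure scaling where one pixel covers 0.05 m.
H = [[0.05, 0.0, 0.0],
     [0.0, 0.05, 0.0],
     [0.0, 0.0, 1.0]]
speed = ground_speed(H, (100, 200), (160, 280), dt=0.5)  # metres per second
```

In practice `H` would be estimated from at least four image points with known ground positions, and the speed estimate is only as good as the tracked positions and the planarity of the road.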
19 October 2017 (12pm, Thursday, WT515C) Hsiang-Jen Chien (EEE, AUT) Multi-path Feature Tracking for Stereo Visual Odometry.
Visual odometry (VO) studies the recovery of a camera trajectory from an image sequence. Stereo vision-based VO techniques solve the egomotion estimation problem by means of disparity-derived 3D scene structure. Typically, one of the two images is only used for disparity computation. This talk presents a generic feature tracking framework extending the classical VO problem into a higher dimension, where the image data of both cameras are fully used.
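For readers unfamiliar with how disparity yields 3D scene structure, the standard relation for a rectified stereo pair, Z = f·B/d, can be sketched as follows. This is textbook pinhole geometry rather than the speaker's tracking framework; the parameter names and sample values are illustrative assumptions.

```python
def triangulate(u, v, disparity, focal_px, baseline_m, cx, cy):
    """Back-project pixel (u, v) with a given disparity to camera-frame (X, Y, Z) in metres.

    Assumes a rectified stereo pair with focal length focal_px (pixels),
    baseline baseline_m (metres) and principal point (cx, cy).
    """
    if disparity <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    z = focal_px * baseline_m / disparity
    x = (u - cx) * z / focal_px
    y = (v - cy) * z / focal_px
    return x, y, z


point = triangulate(u=390, v=240, disparity=35,
                    focal_px=700.0, baseline_m=0.5, cx=320.0, cy=240.0)
```

Because depth is inversely proportional to disparity, a fixed disparity error costs far more depth accuracy for distant points than for near ones.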
12 October 2017 (12pm, Thursday, WT515C) Yueqiu Ren (CS, AUT) Real-Time Recognition of Series 7 New Zealand Banknotes.
In this talk, an effective method for real-time banknote recognition using digital image processing will be introduced. The Series 7 New Zealand banknotes serve as a case study for intelligent, real-time recognition. A composite banknote feature combining color and texture is extracted, and a three-layer back-propagation neural network is trained for classification. The proposed method has demonstrated excellent recognition results in an indoor environment and is comparatively fast, which makes it suitable for real-time applications. This study fills a gap in real-time recognition of the newly released paper currency. In practice, the proposed approach can serve as a basis for the future development of a prototype that assists blind or visually impaired people in recognizing the denominations of New Zealand’s new banknotes.
05 October 2017 (12pm, Thursday, WT515C) Amita Dhiman (EEE, AUT) Road Pothole Detection Using 3D Plane Fitting in Image Disparity Space.
Potholes are the most demanding type of road distress, and they need to be identified automatically (e.g., in self-driving cars or modern trucks) to avoid fatalities. The proposed strategy for pothole detection is based on stereo vision. It performs road plane modelling directly in the image-disparity space, without back-projecting the disparity image into 3D space; the dominant plane is found by a RANSAC process. The method has been tested on the CCSAD data set.
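The idea of fitting the dominant plane directly in image-disparity space with RANSAC can be sketched as below. This is a minimal sketch under stated assumptions, not the speaker's implementation: the plane model d = a·u + b·v + c, the tolerance, the iteration count, and the rule that pothole candidates lie below the plane (lower disparity, hence farther away than the road) are all illustrative choices.

```python
import random


def plane_from_three(p1, p2, p3):
    """Solve d = a*u + b*v + c through three (u, v, d) points by Cramer's rule.

    Returns None when the three image positions are collinear.
    """
    (u1, v1, d1), (u2, v2, d2), (u3, v3, d3) = p1, p2, p3
    det = u1 * (v2 - v3) - v1 * (u2 - u3) + (u2 * v3 - u3 * v2)
    if abs(det) < 1e-9:
        return None
    a = (d1 * (v2 - v3) - v1 * (d2 - d3) + (d2 * v3 - d3 * v2)) / det
    b = (u1 * (d2 - d3) - d1 * (u2 - u3) + (u2 * d3 - u3 * d2)) / det
    c = (u1 * (v2 * d3 - v3 * d2) - v1 * (u2 * d3 - u3 * d2)
         + d1 * (u2 * v3 - u3 * v2)) / det
    return a, b, c


def ransac_plane(points, iterations=200, tol=1.0, seed=0):
    """Return the plane (a, b, c) with the most inliers, and those inliers."""
    rng = random.Random(seed)
    best_model, best_inliers = None, []
    for _ in range(iterations):
        model = plane_from_three(*rng.sample(points, 3))
        if model is None:
            continue
        a, b, c = model
        inliers = [p for p in points
                   if abs(a * p[0] + b * p[1] + c - p[2]) <= tol]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = model, inliers
    return best_model, best_inliers


def below_plane(points, model, tol=1.0):
    """Points whose disparity is more than tol below the fitted road plane."""
    a, b, c = model
    return [p for p in points if (a * p[0] + b * p[1] + c) - p[2] > tol]
```

Here `points` is a list of `(u, v, d)` tuples taken from a disparity image. A production system would typically refit the plane to all inliers by least squares and group the below-plane pixels into connected regions before reporting potholes.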
28 September 2017 (12pm, Thursday, WT515C) Noor Saleem (EEE, AUT) The V-disparity Representation for Challenging Datasets. — Robust obstacle segmentation and scene understanding are key tasks for cameras in autonomous cars. The talk revisits stereo-based methods for robust vertical road profile detection, and gives possible solutions to ensure a very low false-detection rate for obstacles in challenging data sets.
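As background to the title, a v-disparity map is a row-wise histogram of a disparity image: entry (v, d) counts how many pixels in image row v have disparity d. The sketch below builds such a histogram and reads off a crude road profile. Taking the most frequent disparity per row is a simplifying assumption made for this example; it is not the method discussed in the talk.

```python
def v_disparity(disparity_map, max_disparity):
    """Row-wise disparity histogram: hist[v][d] counts pixels in row v with disparity d.

    Invalid pixels (None) and disparities outside 0..max_disparity are skipped.
    """
    hist = [[0] * (max_disparity + 1) for _ in disparity_map]
    for v, row in enumerate(disparity_map):
        for d in row:
            if d is None:
                continue
            d_int = int(round(d))
            if 0 <= d_int <= max_disparity:
                hist[v][d_int] += 1
    return hist


def road_profile(hist):
    """Most frequent disparity in each row, as a rough vertical road profile."""
    return [max(range(len(row)), key=row.__getitem__) for row in hist]
```

In such a map a planar road appears as a slanted line, because road disparity grows towards the bottom of the image, while an upright obstacle appears as a vertical segment at a constant disparity.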

15 – 17 September 2017 CeRV Retreat. The list of talks:

Q. Gu, A Course for Deep Learning (15 September 2017, Friday)
J. Hilty, Measuring Leaf Surface Area using Stereo Vision (15 September 2017, Friday)
R. Klette, Video Demos (15 September 2017, Friday)
A. Griffin, Counting Sheep from Aerial Imagery (16 September 2017, Saturday)
Z. Moayed, GMM-Based Abnormal Event Detection using RCNN (16 September 2017, Saturday)
N. Saleem, Stixels Presentation (16 September 2017, Saturday)
A. Dhiman, Pothole Detection in Disparity Space (16 September 2017, Saturday)
H. Chien, Multi-objective Visual Odometry (16 September 2017, Saturday)
T. Paijon, Mobile Robot Localization for Precision Manufacturing (16 September 2017, Saturday)
F. Noor, Detection and Tracking in Warehouse of Project (16 September 2017, Saturday)
W. Yan, Visual Event Computing (16 September 2017, Saturday)

14 September 2017 (12pm, Thursday, WT515C) Celebration and discussion.

17 August 2017 (12pm, Thursday, WT515C) Jonas Hilty (EEE, AUT) Leaf Growth Measurement using Marker Tracking.— In this talk, I present first results from a study conducted last summer. An existing method based on leaf fixation and marker tracking was implemented in the field to measure in situ leaf growth of the mangrove Avicennia marina. I focus on the experimental set-up including the design of a weather proof frame, image acquisition using a Raspberry Pi single-board computer, and measurement of environmental variables. Further, I talk about my marker tracking software and general challenges I faced.

10 August 2017 (12pm, Thursday, WT515C) Dr Martin Stommel (EEE, AUT) Probabilistic Automata Model of a Soft Robot for the Planning of Manipulation Tasks. — Soft robots must be able to structure an automation problem into a sequence of actions that lead to a desired state, before they can fulfil a meaningful role in automation applications. This, however, can only be successful if the robot can predict the outcome of an action. The theory of rigid industrial robots is not applicable without major changes, because kinematic chains do not adequately describe the continuous deformation of the complex, often biologically inspired shapes of soft robots. Analytic solutions have not been found yet. Numerical solutions based on finite elements are slow, technically challenging, and only suitable for one specific robot. It is, however, possible to observe the outcome of an action, and use these observations to plan a sequence of actions that let the robot accomplish an automation task. This talk deals with a probabilistic automaton that computes the optimal sequence of actions to bring the robot into a desired state. It will be shown how the method can be used to model and solve a planning problem inspired by a real soft robot. The analog of an impulse response will be identified, although it is not closed form due to the nonparametric nature of the method.

3 August 2017 (12pm, Thursday, WT515C) Dr Minh Nguyen (Computer Science, AUT) Exploring the 3D World of the Internet with Commodity Virtual Reality Devices.
In this talk, we describe the technical basics and applications of graphically interactive, online Virtual Reality (VR) frameworks which automatically extract and display observable 3D datasets from Internet search engines, e.g. Google Image Search. Within a short waiting time, many 3D-related results are returned to the user as aligned left and right stereo photos which are ready to view using VR glasses. The system automatically filters the different types of available 3D data from redundant pictorial datasets on the Internet, identifying whether an image is a side-by-side stereo pair, an anaglyph, a stereogram, or just a “normal” 2D image. The system then generates stereo pairs from the detected 3D datasets to seamlessly display 3D visualisations on state-of-the-art VR devices such as the low-cost Google Cardboard, the Samsung Gear VR or Google Daydream. These devices are used to provide an immediate, controllable 3D display. In this article, an image-type classification technique is proposed to dynamically extract co-aligned stereo pairs and produce a rich 3D visualisation for VR viewers. The system is portable and simple to set up and operate, and systems of this kind are currently very rare on the Internet. Initial experimental results show our system to be relatively fast, accurate, and easy to implement. With such a system, Internet users all over the world could easily visualise millions of real-life stereo datasets publicly available on the Internet, which are believed to be useful for VR testing and learning purposes.

27 July 2017 (12pm, Thursday, WT515C) Qin Gu (EEE, AUT) Fast Vehicle-Oriented Traffic Event Recognition Using Deep Convolution Flow.
This talk gives a brief review of current work in the domain of traffic event detection and recognition. Contributing to vision-based intelligent supervising systems, a vehicle-oriented event detection and recognition method using deep learning has been developed. A feature descriptor of a short sequence of video frames has been applied to a fast region-based convolution neural network (Fast-RCNN) including loopy belief propagation. The features of this sequence of video frames after convolution operations construct a deep convolutional space. A group of conclusive and directed regions of interest (CD-ROI) are set in this space. By using ROI pooling, we directly identify each event in this space without redundant calculations and therefore reduce the computational time of event recognition.

20 July 2017 (12pm, Thursday, WT515C) Zahra Moayed (EEE, AUT) Vision-based safety analysis for traffic participants at road intersections.
The safety of pedestrians and vehicles at traffic intersections is a concern for transport practitioners due to the high number of reported accidents and fatalities. Computer vision, as an essential area of Intelligent Transport Systems (ITS), can take advantage of infrastructure-based recording such as surveillance cameras to assess events and analyse the participants’ safety. Previous studies show that safety factors have been investigated in isolation, and there is a demand for an automated safety analyser which also considers the interaction among all participants at intersections. Due to variation in traffic scenes in terms of weather conditions and time of day, further research is still needed to achieve robustness. This research focuses on an automated, robust safety analyser. It will provide a more cost-effective and reliable approach for informing participants about possible risks from the infrastructure side; hence it aligns with future technologies such as autonomous vehicles. To keep the work effective and customized, intersections in Auckland, New Zealand, are our ultimate goal for monitoring.

18 May 2017 (12pm, Thursday, WT515C) Arpita Dawda (EEE, AUT) Slope Measurement using Stereoscopic Vision.
The project aims to measure the slope of any surface using stereoscopic vision. The Nerian Technologies SP1 stereo vision system is used with two grayscale cameras to obtain stereoscopic vision. The project depends mainly on the output of the SP1 stereo vision system. In this talk, the features of the SP1 and of the cameras on which the accuracy of the output relies will be discussed. In addition, some image processing fundamentals used to calculate the slope of the surface will be explained. The project is intended for measuring river bank erosion, and it can also be used in autonomous vehicles.
11 May 2017 (12pm, Thursday, WT515C) Amita Dhiman (EEE, AUT) An introduction into techniques for visual 3D road surface inspection.
This talk introduces a new PhD project that is just starting. Road condition assessment is an essential and critical task of identifying and analyzing distinct types of distress such as potholes, cracks, or bumps in road surfaces. Significant efforts have been made to suggest techniques for the identification of road surface unevenness. The project starts with a review of the methods and techniques used so far in (especially vision-based) road unevenness detection. The talk also introduces initial thoughts on a potential way (based on stereo vision) of accurately defining and identifying potholes to aid heavily loaded trucks.
04 May 2017 (12pm, Thursday, WT515C) Mahmoud Al-Sarayreh (EEE, AUT) Computer Vision Technologies for Hyperspectral Imaging Systems Implemented in Food Applications.
Hyperspectral imaging systems address the limitations of spectroscopic methods and RGB colour images in terms of spectral and spatial information, integrating the main advantages of both (i.e. spectroscopy and colour imaging). Hyperspectral line scanning systems are poorly suited to real-time applications because of the time needed to collect the spectral information. Recently, new hyperspectral snapshot cameras (sensors) have been successfully introduced with the ability to generate up to 170 images/sec. This type of sensor opens the door to applying spectral imaging systems to object detection and recognition and to materials segmentation. This research project aims to investigate these new spectral sensors (i.e. snapshot sensors) for developing a real-time spectral-based classification system implemented in the meat processing industry. This talk will focus on developing and enhancing existing methods for dealing with spectral data collected at video rate in terms of: (i) the quality of the spectral information resulting from the snapshot camera, (ii) the best information extraction from the data collected at video rate, and (iii) model development for discriminating the regions of interest using the extracted information.
06 April 2017 (12pm, Thursday, WT515C) Syeda Fouzia (EEE, AUT) Exploring Neural Networks and Deep Learning–Neural networks and deep learning currently provide the best solutions to many problems in image classification, speech recognition, and natural language processing. There is no engineering involved, we just need huge data for training the algorithm. This talk continues our series of detailed discussions towards the full understanding of deep learning. We are exploring neural nets, which are a programming paradigm enabling machines to learn from training data, and deep learning as a powerful set of techniques, used for learning in neural networks.
30 March 2017 (12pm, Thursday, WT515C) Johnny Chien (EEE, AUT) A Walk-through the Latest Work on Visual SLAM and Visual Odometry.–In the past 5 years many exciting breakthroughs have been made following the 20-year long development in Visual SLAM and Visual Odometry. The talk revisits the vision solutions to the classical SLAM problem and gives an up-to-date overview on selected prominent work.
02 March 2017 (12pm, Thursday, WT515C) Zahra Moayed (EEE, AUT) Deep Learning: Its Trend in Research and Industry.
Deep learning is a GENERAL answer to machine learning problems. It is considered one of the main research areas that industries and researchers are focusing on for data analysis: basically, a single algorithm can solve complex problems. There is no engineering involved; we just need huge data for training the algorithm. In this talk, the general concepts of deep learning and its origins, existing and future trends in research and industry, and some key persons and groups are discussed. The end of the presentation explores the possible use of deep learning in current research topics at CeRV.
16 February 2017 (12pm, Thursday, WT515C) Noor Saleem (EEE, AUT) Semantic Segmentation Approach for Vision-based Driver Assistance.
Vision-Based Driver Assistance (VBDA) systems emerge as a contribution to traffic safety, being of significant research interest in modern and developed countries. Previous studies have found that semantic segmentation of traffic scenes suffers from both challenging road surface (free-space) detection and the absence of a risk perspective; hence, there is a demand for work on those topics towards future generations of intelligent automotive systems. Moreover, different technologies can support VBDA in detecting free space, using either monocular or multi-ocular vision. Due to variations in traffic scenes (e.g. weather, road conditions, road geometry, or traffic density), further research is still needed. This talk will provide possible solutions to semantic segmentation in traffic scenes. It is also relevant to New Zealand, in that it aims to generate road-environment models that help to understand traffic density, traffic safety characteristics, and maintenance-related issues.
02 February 2017 (12pm, Thursday, WT515C) Dr Xiaotian Wu (Jinan University Guangzhou China) Reversible Data Hiding in Encrypted Images.
Signal processing in the encrypted domain is a cutting-edge technique in multimedia security. In this talk, two reversible data hiding methods for encrypted images will be introduced to conceal user information in encrypted data. Both approaches obtain optimal performance when compared with state-of-the-art methods and results.
27 January 2017 (11am, Friday, WT712) Professor Zixiang Xiong (IEEE Fellow, Texas A&M University US and Monash University AU) Adaptive Boosting for Image Denoising: Beyond Low-Rank Representation and Sparse Coding.
In the past decade, much progress has been made in image denoising due to the use of low-rank representation and sparse coding. Meanwhile, state-of-the-art algorithms also rely on an iteration step to boost the denoising performance. However, the boosting step is fixed or non-adaptive. In this work, we perform rank-1 based fixed-point analysis; then, guided by our analysis, we develop the first adaptive boosting (AB) algorithm, whose convergence is guaranteed. Preliminary results on the same image dataset show that AB uniformly outperforms existing denoising algorithms on every image and at each noise level, with more gains at higher noise levels.
26 January 2017 (12pm, Thursday, WT515C) Syeda Fouzia (EEE, AUT) Detection and Tracking of Pedestrians and Forklift Trucks in Warehouses.
Each year, as reported by OSHA (US Department of Labor), there are nearly 100,000 forklift accident-related injuries in the United States. Of these, nearly 100 result in fatalities. The two most common causes of forklift-related injuries are forklift tip-over incidents and accidents in which pedestrians are struck by a forklift. According to WorkSafe New Zealand, warehouse-related injuries contribute 8% of all workplace fatalities. Ensuring warehouse safety is one of the crucial steps towards comfortable work environments for workers and improved efficiency. Not only does it help to avoid potentially alarming situations for pedestrians and forklift drivers, it also aids productivity improvement at facilities. The talk reports on the start of a project that aims to use computer vision techniques to detect and track forklift drivers and pedestrians in the warehouse environment and to understand their behavior. The system will cover multiple viewpoints, with the special aim of reducing the effects of occlusions, multiple-object scenarios, different lighting conditions, and the varying speed of mobile objects.