2016 Seminars


For time and location, please see the individual seminar entries below.

  • 10 November 2016 (2.10pm, Thursday, WT515C)
    Hsiang-Jen (Johnny) Chien, Noor Saleem (AUT EEE) Our Time at Wuhan University – Trinocular Vision for Autonomous Driving.

    — We report on the set-up of a trinocular recording system and its preliminary use in a test vehicle at Wuhan University. The talk covers an invited two-month stay there, the experiences collected, and the research collaboration established.

  • 03 November 2016 (12:10pm, Thursday, WT515C)
    Qin Gu (UESTC, Chengdu China) Multi-scaled object detection and recognition for intelligent vehicles and intelligent transportation systems.

    — This talk gives a brief review of current work in the domain of object detection and recognition. Contributing to vision-based driver assistance systems, a monocular vehicle detection and localisation method has been developed. Focusing on changing scales for object detection, a morphology-based method is given for license plate detection and localisation in challenging real-world environments. Finally, the talk reports on an information parsing method for dealing with images at different scales, which has been used for multi-scale vehicle logo recognition.

  • 02 November 2016 (12:05pm, Wednesday, WE230)
    Rajib Hasan (AUT EEE) Computer Vision for Sport Quantification – From Monocular to Multi-Sensor Systems.

    — Getting optimal performance from an athlete involves applying principles of biomechanics, which relate the underlying physics of forces and motion to the way people move. A number of techniques are available to quantify the forces and motion of people, such as force plates to measure ground reaction forces for a jump or for running. To quantify the detailed motion of athletes in training, commercially available motion-capture systems can be used, based on optical markers on the athlete's body and multiple calibrated fixed cameras around the capture volume. In some situations it is not practical to attach any kind of marker or transducer to the athletes or to the machinery being used, so a purely vision-based approach must use the natural appearance of the person or object. When a sporting event is taking place, there are also opportunities for computer vision to help the referee and other personnel keep track of incidents, and to provide full coverage and detailed analysis of the event for viewers. This research aims at using computer vision methods, especially designed for monocular recording, to measure sports activities such as high jump, long jump, or running (i.e. measuring height, distance, or speed). The particular focus of this research is on quantifying the height of a jump. To indicate the complexity of the project: a single camera needs to estimate height at a particular distance in 3D projective space. The height of a jump can be quantified for one foot or both feet, for the head (after eliminating effects of flying hair), by the centroid of the person, or by detecting the position of the ear (the most stable body part during a sports event) if it is visible.
    To achieve these objectives, a novel method has been proposed that is derived by building a statistical model, comprising results obtained from a professional camera, a smartphone camera, and the traditional method of quantifying the height of a jump. The data has been collected at the AUT Millennium campus. The benefit of this study is that it is marker-less and uses a single camera as the only sensor to quantify the height of a single-person jump.

  • 27 October 2016 (12.10pm, Thursday, WT515C)
    Haokun Geng (Computer Science, UoA) Visual Odometry in Dynamic Environment.

    — Egomotion estimation over stereo image sequences can be considered a key element in many computer vision applications nowadays. Its relevant theories and findings are used in a number of complex and advanced systems, such as driver assistance systems. The focus of this project is stereo-vision-based visual odometry for roadside reconstruction. In our study, we find that multi-sensor integration with GPS can be a solution to overcome the limitations of current VO approaches. We designed a 'multi-run' scenario to recover missing information in the 3D reconstruction point clouds. We present a novel approach, called 'geometric multi-layer optimisation', to improve the accuracy of motion estimation in a dynamic environment.

  • 20 October 2016 (12.10pm, Thursday, WT515C)
    Dr Ni Liu (Computer Science, UoA) Learnable high-order MGRF models for contrast-invariant texture recognition.

    — Spatially variant contrast/offset deviations, frequent in practice, preserve image appearance but hinder classification based on signal co-occurrence statistics. Contrast/offset-invariant descriptors of ordinal signal relations, such as local binary or ternary patterns (LBP/LTP), are popular means to overcome this drawback. This research extends conventional LBP/LTP-based classifiers towards learning, rather than prescribing, the most characteristic shapes, sizes, and numbers of these patterns for semi-supervised texture classification and retrieval. Our proposed learning framework models images as samples from a high-order ordinal Markov-Gibbs random field (MGRF). Approximate analytical estimates of the model parameters guide the selection of characteristic patterns of a given order, the higher-order patterns being learned on the basis of the already found lower-order ones. Comparative experiments on four texture databases confirmed that classifiers with the learned multiple LTPs of the 3rd to 8th order consistently outperform more conventional ones with prescribed fixed-shape LBP/LTPs or a few other filters.
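
    As background for the ordinal patterns mentioned above, here is a minimal sketch of the classic fixed-shape 8-neighbour LBP code in plain Python; the talk's learned multi-order patterns generalise this descriptor, and the example data are illustrative only:

```python
# Minimal fixed-shape 8-neighbour local binary pattern (LBP):
# each neighbour contributes one bit, set when its intensity is >= the
# centre pixel's, so the code depends only on ordinal relations and is
# invariant to contrast/offset changes that preserve them.

def lbp_code(image, r, c):
    """Return the 8-bit LBP code of pixel (r, c) of a 2D grey-level image."""
    centre = image[r][c]
    # Neighbours enumerated clockwise, starting at the top-left.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        if image[r + dr][c + dc] >= centre:
            code |= 1 << bit
    return code

img = [[10, 20, 30],
       [40, 25, 10],
       [ 5, 60, 90]]
print(lbp_code(img, 1, 1))  # → 180
```

Adding a constant offset to every pixel leaves the code unchanged, which is exactly the invariance property the classifiers above build on.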

  • 13 October 2016 (12.10pm, Thursday, WT515C)
    Zahra Moayed (AUT, EEE) Detection of Abnormal Behavior in Traffic Scenes: Infrastructure- or Vehicle-based Video Recording.

    — Future traffic systems will include various types of communication such as vehicle-to-vehicle (V2V), vehicle-to-infrastructure (V2I), or vehicle-to-roadside (V2R). Vehicles will increasingly be able to drive autonomously, and these communications will support road safety. Video surveillance at road intersections will be an important component for creating data to be communicated to vehicles. In particular, it is of interest not only to track the traffic participants (cars, trucks, people, bicycles, etc.) at a road intersection but also to understand the traffic flow with respect to potential future risks. Anomaly detection is an important element of understanding traffic flow.

  • 6 October 2016 (12.10pm, Thursday, WT515C)
    Dongwei Liu (University of Auckland, NZ) Stereo Computational Photography.

    — Methods of computational photography transform raw data from an image sensor into an image, improve image quality with dedicated hardware, or provide capabilities beyond film-based photography. With the current success of 3-dimensional display technology, binocular stereo cameras are becoming increasingly popular for data recording. Taking advantage of depth hints provided by stereo images, computational photography may go further, beyond just 3-dimensional visualisation of recorded data. This work presents the simulation of three depth-aware artistic effects using stereo vision, namely fog, bokeh, and star effects. We aim at achieving these artistic effects in comfortable and controllable ways.

  • 22 September 2016 (12.00pm, Thursday, WT515C)
    Professor Xingle Feng (Chang’an University, China) Asphalt Pavement Structure Detection System Based on Line Laser and 3D Image Data.

    — Skid resistance is a primary performance index of highway pavement. Texture depth of the surface is a key factor indicating the skid resistance of an asphalt pavement structure. The focus of this talk is on methods of texture depth detection for asphalt pavement structures, in terms of the laser triangulation method and image processing techniques. Specifically, a breakpoint interpolation algorithm is put forward to deal with discontinuous (breakpoint) contour lines. Moreover, a high-precision thinning algorithm is proposed as well. Furthermore, an automatic road detection vehicle embedding this technology has been developed and delivered to the highway construction department.

  • 16 September 2016 (11.00am, Friday, WT515C)
    Dr. Zouhair Mahboubi (Stanford University, US) Automated Air Traffic Control for Non-Towered Airports.

    — The majority of mid-air collisions involve general aviation aircraft, and these accidents tend to occur in the vicinity of airports. This work proposes a concept for an autonomous air traffic control system for non-towered airports. The system is envisioned to be advisory in nature and would rely on observations from a ground-based surveillance system to issue alerts over the common traffic advisory frequency. The system models the behavior of aircraft in the airport pattern as a hidden Markov model whose parameters are learned from real-world radar observations. To determine the optimal advisories that reduce the risk of collision, we formulate the problem as a partially observable semi-Markov decision process (POSMDP). To address the computational complexity of solving the problem, we investigate three different kinds of approximation methods.

  • 2-4 September 2016 (Whatipu) CeRV retreat.
  • 18 August 2016 (WS318: Boardroom of AUT Engineering)
    Dan Breen and Barbara Breen (AUT, Geospatial Science) Unmanned aerial surveys of sand cay habitats and wildlife on the far northern Great Barrier Reef.

    — Coral reef sand cays are unstable accumulations of the calcium carbonate skeletons of marine organisms. These provide important nesting and roosting habitats for endangered species such as the green turtle (Chelonia mydas). We describe the use of multi-rotor unmanned aircraft systems (UAS) to map topography and monitor wildlife on two sand cays on the northern Great Barrier Reef, Australia.

  • 04 August 2016 (WT 515C)
    Dr Barbara Bollard-Breen (AUT, Geospatial Science) The Mysterious Fairy Circles of Namibia and Australia.

    — The AUT UAV research group recently travelled to Namibia and Australia to map the mysterious fairy circles: bare circular patches that dot the landscape along the edge of the Namib Desert and in the Pilbara region of Western Australia. The circles, which support little flora, are an integral part of the distinctive landscape of both areas. While numerous scientists have researched these circles, no one has yet been able to determine their cause conclusively. Various theories of their origin have been suggested, including euphorbia poisoning, animal dust baths, meteor showers, termites, and underground gas vents. We are part of a large group based at AUT and the University of Pretoria researching these circles. Our team is mainly focused on gaining a better understanding of their spatial patterns using low-altitude remote sensing. This seminar will take you on a journey to these two countries, share some preliminary results, and open the floor for discussion on the role computer vision can play in helping to understand these mysterious landscape features.

  • 28 July 2016 (WT 515C)
    Dr Minh Nguyen (AUT, Computer Science) StereoTag: A Novel Stereogram-Marker-Based Approach for Augmented Reality.

    — Augmented Reality (AR) is an active and exciting topic aiming to create intuitive computer interfaces by blending reality and virtual reality. One challenge of AR is to align virtual data with the environment. Typically, one uses a marker-based approach, such as a thick-bordered black-and-white 2D marker, which allows one to recover the relative pose (location and orientation) of a camera in real time. However, bar-code markers do not contain any intuitive visual meaning, and they thus look uninteresting and uninformative. We propose a new type of marker, referred to as a StereoTag, which embeds a meaningful stereogram image hiding coded 3D information. From the experiments conducted, our StereoTag is found to be relatively robust under various conditions and could thus be widely used in future AR applications.

  • 21 July 2016 (WT 515C)
    A/Professor Wei Qi Yan (AUT, Computer Science): Intelligent Navigation.

    — In this talk, we will present an image-based navigation system developed at AUT. The idea was inspired by Google Street View, which features 360° imagery for navigation. In this project, we design and implement a tool to update the images and navigation information of a website in a timely manner. The system has the functionality to find a kitchen, toilet, or emergency exit within a building without GPS assistance. The system could be applied to scenarios such as real estate, underground shops, remotely operated underwater vehicles (ROVs), and collision avoidance for autonomous vehicles, all of which need intelligent navigation where GPS information is not available.

  • WINTER BREAK
  • 26 May 2016 (WT 515C)
    Marlon M. Reis (AgResearch, NZ) Spectroscopy in the Visible-Near InfraRed spectral range applied to food assurance.

    — The meat and dairy sectors are evolving and developing new approaches to production. These changes bring their own challenges to maintaining the integrity of the food produced. The challenges in maintaining the integrity of food produced in NZ are not limited to production but extend to the market, where New Zealand producers are exposed to considerable risks, both domestically and overseas. In this seminar, the application and development of spectroscopic techniques (Near InfraRed Spectroscopy and hyperspectral imaging) applied to food assurance are discussed as means to assess food integrity.

  • 26 May 2016 (WT 515C)
    Mr Mahmoud Al-Sarayreh (AUT, EEE): Hyperspectral Imaging Analysis.

    — Hyperspectral imaging captures a large number of bands over a range of wavelengths, typically hundreds of bands over a specific range of contiguous wavelengths, with a bandwidth of less than ten nm. In general, materials have both internal and external attributes. The internal structure of a material is a very important factor for detecting and classifying its chemical components. Moisture, fat, protein, and pH value are the most common chemical attributes analysed in hyperspectral images for meat processing, to classify the quality of the meat. Computer vision systems can give us the external attributes of objects, such as size, shape, colour, and surface texture. Regarding the internal attributes (chemical composition) of an object, a computer system with simple colour imaging is not capable of analysing them, as colour images contain too little spectral information to analyse the chemical distribution of objects. Hyperspectral imaging systems overcome this limitation by providing a stack of images at different wavelengths within a specific wavelength range.
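
    The "stack of images" view of a hyperspectral cube can be sketched as follows; the dimensions, wavelengths, and reflectance values are toy assumptions for illustration only:

```python
# A hyperspectral image as a cube: one 2D image per wavelength band.
# All dimensions and values here are toy illustrations.

def make_cube(rows, cols, wavelengths, fill=0.0):
    """cube[band][row][col]: a stack of images, one per band."""
    return [[[fill] * cols for _ in range(rows)] for _ in wavelengths]

wavelengths = [900 + 10 * b for b in range(5)]  # 5 bands, 10 nm apart
cube = make_cube(2, 2, wavelengths)
cube[3][0][1] = 0.42  # reflectance of pixel (0, 1) in the 930 nm band

def spectrum(cube, row, col):
    """Spectral signature of one pixel across all bands."""
    return [band_img[row][col] for band_img in cube]

print(spectrum(cube, 0, 1))  # → [0.0, 0.0, 0.0, 0.42, 0.0]
```

The per-pixel spectrum extracted this way is what chemical attributes such as moisture or fat content are estimated from.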

  • 19 May 2016 (WT 515C)
    Mohammad Rajib Hasan (AUT, EEE): Cyber War Around The World: The True Sequel of Hacking.

    — Cyber war is an action by a nation-state to penetrate another nation's computers or networks for the purpose of causing damage or disruption. In fact, this war is not limited to nation-states but is spreading among cyber groups in different countries. For example, in 2012, groups of Indian and Bangladeshi cyber teams declared cyber war on each other. In 2013, there were also several attacks by cyber groups from Indonesia targeting the official websites of "University Utara Malaysia (UUM)". Hacking is one of the main exploits in this cyber war. In the computer security context, a hacker is someone who seeks and exploits weaknesses in a computer system or computer network. Hackers may be motivated by a multitude of reasons, such as profit, protest, challenge, or enjoyment. Hackers believe that hacking is not a crime but the play of code. This study is based on some real cyber war and hacking events. It discusses incidents of cyber war, how a cyber war may be triggered in real life, the benefits (yes, there are benefits!), the losses, and how such a war is ended or contained.

  • 12 May 2016 (WT 515C)
    Jonas Hilty (AUT, EEE): Computer Vision in Plant Phenotyping.

    — Plant phenotyping is the measurement and description of a plant’s appearance. Recently, image analysis has been identified as the “new bottleneck” in plant phenotyping. This talk gives an overview of applications and challenges of computer vision in that field.

  • 5 May 2016 (WT 515C)
    Professor Reinhard Klette (AUT, EEE), Results and Impressions: A Two-week Trip to China.

    — The talk reports on visits to the Chinese Academy of Sciences, Beijing University, Shandong Academy of Sciences, Shandong University, Central China Normal University, Wuhan University, and NextCar, a newly founded company in Wuhan. A few impressions of interesting sites seen on this trip will also be reported.

  • MID-SEMESTER BREAK
  • 12 April 2016 (WA 214, 4:30 p.m. – 7:00 p.m.)
    Professor Reinhard Klette (AUT, EEE), Inaugural Professorial Address: A Fairly Random Journey Through The History Of Science – With A Personal Perspective.

    — Computer vision originated in the 1960s, with the goal that a computer should be able to describe what it sees when connected to a camera. Computer vision is the research area of Professor Klette. This inaugural lecture presents some selected people and their academic contributions, along a path defined by academic genealogy. Research interactions along that path illustrate by examples several important and often unexpected moments in the complex development of science. Astronomers, mathematicians, philosophers, logicians, and finally computer scientists dominated some sections of that path. Who is going to dominate the next section?
  • 7 April 2016 (WT 515C)
    Dr. Minh Nguyen (AUT, Computer Science), Investigation of Stereogram-Marker-Based Approach for Augmented Reality.

    — Augmented Reality (AR) is an active and exciting topic aiming to create intuitive computer interfaces by blending reality and virtual reality. One challenge of AR is to align virtual data with the environment. A marker-based approach is one of the successful solutions: thick-bordered black-and-white 2D markers, detectable with computer vision methods, are used to calculate the relative pose (location and orientation) of a camera in real time. However, barcode markers do not contain any intuitive visual meaning and thus look uninteresting and uninformative. We propose a new type of marker which embeds a realistic-looking stereogram image hiding coded 3D information. Thus, the marker not only has a naturally coloured look but also carries binary-coded data. From a number of initial experiments, our method is found to be relatively robust under various lighting conditions and could be utilised in future AR applications.

  • 31 March 2016 (WT 515C)
    A/Professor Wei Qi Yan (AUT, Computer Science), Analogy: an AI Approach.

    — Analogy is a cognitive process of transferring information or meaning from a particular subject (the analogue or source) to another (the target). The word analogy can also refer to the relation between the source and the target themselves, which is often, though not necessarily, a similarity. In this presentation, we start from examples of analogies in our daily life and digitise analogy for the purpose of visual information processing. We will demonstrate how to use analogies on curves, images, and videos. Event analogy for privacy preservation in intelligent surveillance will also be reviewed in this talk.

  • 24 March 2016 (WT 515C)
    Dr. Yuanyuan Zhang (Shandong Academy of Sciences, China): Deep learning and computer vision.

    — Deep learning is a branch of machine learning based on a set of algorithms that attempt to model high-level abstractions in data by using model architectures, with complex structures or otherwise, composed of multiple non-linear transformations. Deep learning algorithms have dramatically improved the state of the art in speech recognition, visual object recognition, object detection, and many other domains. In this talk, we will give a brief introduction to deep learning and also illustrate its differences from traditional machine learning approaches. There are several widely used deep learning models, but we will pay particular attention to Convolutional Neural Networks (CNNs), which have been applied with great success in the computer vision community.

  • 17 March 2016 (WT 515C)
    Antonius (Tony) Paijens (AUT, SECMS): Mobile robots: the next breakthrough in CAM.

    — Mobile robots have the potential to enforce a paradigm shift in Computer Aided Manufacturing (CAM). Instead of using bigger machines to produce bigger parts, a mobile robot performs the manufacturing operations while moving around the part, wherever processing is required. Parts can be made “on site” according to actual specifications, allowing for flexible, client-specific production. The current bottleneck in the development of mobile CAM-bots is the limited accuracy with which mobile robots can position themselves and, consequently, a manufacturing tool. The talk presents some state-of-the-art technologies for mobile robot positioning as well as new ideas for the positioning and operation of a CAM-bot.

  • 10 March 2016 (WT 515C)
    Professor Reinhard Klette (AUT, EEE): Smart vehicles for smart cities.

    — The talk reports on two imminent revolutions in automotive history which will define on-road traffic in the smart city of the near future. Both revolutions are already starting to happen, and those cities which prepare for their advent will make the best possible use of them. The first revolution is the replacement of the manually steered vehicle by increasingly autonomously driving vehicles on densely populated roads. Example goals here are moving towards zero-accident traffic and a more efficient use of existing road systems. Dense and autonomous traffic will reduce the need for additional roads and avoid traffic jams by ensuring a steady flow of traffic. The second revolution is the replacement of gasoline-powered vehicles by electric cars. Electric cars are completely emission-free if charged from renewable energy sources; they also drive silently at low speeds and have significantly cheaper running and servicing costs. Autonomous driving and electric vehicles will come, sooner or later, and both developments can benefit from smart-city infrastructure projects supporting their introduction. The talk reports on advances towards autonomous driving, and on a city-wide study about the introduction of electric cars in Perth, Western Australia. The talk is co-authored by Prof. Thomas Braeunl, The University of Western Australia.

  • 03 March 2016 (WT 515C)
    Yang Zhang (AUT, Computer Science): A virtual keyboard implementation using finger recognition.

    — In this talk, we will present a new type of virtual keyboard that allows users to type with their fingers on any plane at a fixed place. The virtual keyboard could be a keyboard printed on plain paper, a keyboard projected on a desk, or a laser keyboard on a wall or any other oblique plane. The BWMORPH algorithm is utilised to recognise the user’s fingertip. If the user’s fingertip remains on a key for a sufficiently long time, the program registers this key as an input. The experiments used a paper keyboard on a desk, including occlusions, to demonstrate that this type of virtual keyboard works effectively as expected.
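
    The dwell-based keystroke logic described above can be sketched as follows; the class name, threshold value, and interface are illustrative assumptions, not taken from the talk:

```python
# Sketch of dwell-time key selection: a key is emitted once the tracked
# fingertip has stayed on the same key for a minimum dwell time.

DWELL_TIME = 0.8  # seconds a fingertip must rest on a key (illustrative value)

class DwellTyper:
    def __init__(self, dwell_time=DWELL_TIME):
        self.dwell_time = dwell_time
        self.current_key = None   # key currently under the fingertip
        self.enter_time = None    # when the fingertip entered that key
        self.fired = False        # whether this dwell already produced input

    def update(self, key, timestamp):
        """Feed the key under the detected fingertip at a given timestamp.
        Returns the key when a dwell completes, otherwise None."""
        if key != self.current_key:
            self.current_key = key
            self.enter_time = timestamp
            self.fired = False
            return None
        if (key is not None and not self.fired
                and timestamp - self.enter_time >= self.dwell_time):
            self.fired = True     # emit at most once per dwell
            return key
        return None

typer = DwellTyper()
typer.update("A", 0.0)
typer.update("A", 0.5)
print(typer.update("A", 0.9))  # dwell reached → prints A
```

In the actual system the `key` argument would come from mapping the fingertip position, detected per frame, onto the known keyboard layout.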

  • 25 February 2016 (WT 515C)
    Hsiang-Jen (Johnny) Chien (AUT, EEE): Geometric camera-LiDAR calibration: offline and online ways to go.

    — LiDAR-enabled vision systems provide reliable and robust solutions to many challenging problems by means of sensory data fusion. To achieve a good fusion, the geometric parameters need to be accurately estimated beforehand. It is also important to keep validating the calibrated geometry for any change, to avoid hazards in critical applications such as autonomous driving. In this talk I will present my work on these issues during my six-month internship at Daimler AG.

  • 18 February 2016 (WT 515C)
    Rajib Hasan (AUT, EEE): Computer vision for measuring sports activities.

    — Nowadays, the use of technology to improve the understanding, performance, and presentation of sports is expanding. However, attaching sensors or other devices to sports equipment is generally not widely accepted. Because of this, for the average sports viewer or amateur player, the benefit of such technology may not be particularly apparent compared to more obvious applications, such as the analysis of the performance of a sporting event (e.g. a jump). Since computer vision offers a lot in this field, we propose research on sports measurement, from monocular to multi-sensor systems. The research aims to identify existing computer vision methods, to capture the motion of a jump with a monocular camera, to measure the height of the jump with computer vision methods, and then to evaluate and compare the accuracy of those measurements. The height of the jump can be measured by detecting the height of the feet, the centroid, or the head.

  • 11 February 2016 (WT 133A)
    Noor Saleem (AUT, EEE): An introduction to 6D semantic segmentation.

    — In order to provide applications for road-environment efficiency and safety, methods are required for producing semantic segmentations of traffic scenes. Semantic segmentation is a relatively new and very active topic in the computer vision community, involving the classification, analysis, and reconstruction of segments. A review of existing research in this domain shows that few studies address 6D vision for semantic segmentation. Therefore, in this seminar we introduce topics related to the system model of 6D vision, the necessity of 6D vision for road environments and safety, and prospective future applications of 6D vision.

  • 04 February 2016 (WS 318)
    Jia Wang (AUT, Computer Science): An event-driven traffic ticketing system.

    — In this talk, we will present a new schema to implement an event-driven traffic ticketing system. The system consists of four modules: (1) an event detection module, where events are detected; (2) a plate number recognition module, to recognise the extracted plate number; (3) a database management module, to execute information retrieval; and (4) a traffic ticket transmission module, to deliver tickets. Due to the importance of the plate number recognition module in the entire system, a novel plate number localisation algorithm, called the Secondary Positioning (SP) method, is proposed in this project. The approximate location of the plate in a photo is found by searching for red pixel values in HSV colour space, and the precise position of the plate is then localised by finding its vertical edges. A template matching algorithm, based on the correlation coefficient between templates and test images, was implemented to recognise each plate character.
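
    The correlation-coefficient template matching step can be sketched as follows, assuming binarised, equally sized character glyphs; the glyph data and function names are illustrative, not the project's actual SP implementation:

```python
# Sketch of character recognition by correlation coefficient: a candidate
# glyph is compared against each stored template, and the template with
# the highest Pearson correlation wins.

from math import sqrt

def correlation(a, b):
    """Pearson correlation coefficient of two equal-length pixel lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a flat glyph correlates with nothing
    return cov / sqrt(var_a * var_b)

def recognise(glyph, templates):
    """Return the label of the template best correlated with the glyph."""
    return max(templates, key=lambda label: correlation(glyph, templates[label]))

# Toy 3x3 binary glyphs, flattened row by row (illustrative data only).
templates = {
    "I": [0, 1, 0, 0, 1, 0, 0, 1, 0],
    "L": [1, 0, 0, 1, 0, 0, 1, 1, 1],
}
noisy_I = [0, 1, 0, 0, 1, 1, 0, 1, 0]  # an 'I' with one noisy pixel
print(recognise(noisy_I, templates))  # → I
```

Because the correlation coefficient subtracts the means and normalises by the variances, the match score is robust to uniform brightness and contrast changes between plate photo and template.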

  • 28 January 2016 (WS 318)
    A/Professor Laiyun Qing (UCAS & NSFC, China): Activity Auto-Completion (AAC): Predicting human activities from partial videos.

    — In this talk, we will give a brief overview of what we have done in activity analysis, especially activity prediction, i.e., given a partially observed video, predicting what the activity is and what will happen next. We propose an activity auto-completion (AAC) model for activity prediction by formulating activity prediction as a query auto-completion (QAC) problem, as in information retrieval. First, we extract discriminative patches from the frames of videos. A video is represented based on these patches and divided into a collection of segments, each of which is regarded as a character typed into the search box. A partially observed video is then considered as an activity prefix, consisting of one or more characters. Finally, the missing observation of an activity is predicted from the activity candidates provided by the auto-completion model. The candidates are matched against the activity prefix on-the-fly and ranked by a learning-to-rank algorithm. We also demonstrate that this yields results competitive with the state of the art in activity prediction on the UT-Interaction dataset.
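
    The query-auto-completion analogy can be illustrated with a toy sketch in which each activity is a string of segment "characters"; the candidate scoring below is a simple stand-in for the learning-to-rank algorithm of the talk, and all activity names and segment labels are invented for illustration:

```python
# Toy illustration of activity prediction as query auto-completion:
# activities are sequences of segment labels ("characters"); candidates
# matching the observed prefix are ranked by how much of the activity
# has already been observed (a stand-in for the learned ranking).

activities = {
    "handshake": "approach-extend-grasp-shake",
    "hug":       "approach-extend-embrace",
    "push":      "approach-extend-shove-retreat",
}

def complete(prefix, activities):
    """Rank activities whose segment string starts with the observed prefix."""
    candidates = [(name, segs) for name, segs in activities.items()
                  if segs.startswith(prefix)]
    # Shorter remaining suffix = more of the activity already observed.
    candidates.sort(key=lambda item: len(item[1]) - len(prefix))
    return [name for name, _ in candidates]

print(complete("approach-extend-", activities))  # → ['hug', 'handshake', 'push']
```

As more segments of the video are observed, the prefix grows and the candidate set shrinks, mirroring how a search box narrows its suggestions while the user types.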