Posts by Collection

portfolio

publications

OCR-VQA: Visual question answering by reading text in images

Published in International Conference on Document Analysis and Recognition, 2019

We released the first dataset for Visual Question Answering that requires understanding scene-text through Optical Character Recognition (OCR).

Recommended citation:

From Strings to Things: Knowledge-enabled VQA Model that can Read and Reason

Published in International Conference on Computer Vision, 2019

We released the first dataset for Visual Question Answering (VQA) that requires traversing an external knowledge graph as well as understanding scene text through Optical Character Recognition (OCR). We also proposed a Graph-RNN-based approach for VQA with external knowledge and demonstrated state-of-the-art results.

Recommended citation: http://openaccess.thecvf.com/content_ICCV_2019/papers/Singh_From_Strings_to_Things_Knowledge-Enabled_VQA_Model_That_Can_Read_ICCV_2019_paper.pdf

Operator-in-the-Loop Deep Sequential Multi-camera Feature Fusion for Person Re-identification

Published in IEEE Transactions on Information Forensics and Security, 2019

We developed a human-in-the-loop system for person re-identification that ranks the image retrieval results from an image recognition CNN and then merges them with the human operator's choices. Our method showed consistent improvement in performance across 6 cameras compared to relying on the model alone.

Recommended citation: https://ieeexplore.ieee.org/abstract/document/8922622

Response Time Analysis for Explainability of Visual Processing in CNNs

Published in Computer Vision and Pattern Recognition (CVPR) Workshop, 2020

We adapt response time evaluation from human psychophysics to deep learning, calculating Neural Response Times (NRT) by profiling dynamic DNNs. We verify that NRT corroborates known effects about the feature space learned by object recognition models when tested on out-of-distribution (OOD) viewpoints. We further demonstrate that NRT can be used to causally verify Scene Grammar effects in identifying semantic and syntactic inconsistencies across visual scenes.

Recommended citation: https://openaccess.thecvf.com/content_CVPRW_2020/papers/w26/Taylor_Response_Time_Analysis_for_Explainability_of_Visual_Processing_in_CNNs_CVPRW_2020_paper.pdf

Context-aware Scene Graph Generation with Seq2Seq Transformers

Published in International Conference on Computer Vision, 2021

We demonstrated state-of-the-art results in generating scene graphs from natural images through sequential subject->predicate->object generation, applying a transformer encoder to object bounding boxes obtained from a detection model.

Recommended citation: https://openaccess.thecvf.com/content/ICCV2021/papers/Lu_Context-Aware_Scene_Graph_Generation_With_Seq2Seq_Transformers_ICCV_2021_paper.pdf

talks

Knowledge-Enabled Visual Question Answering that utilizes scene text

Published:

I gave a talk on my research work on visual question answering where models can successfully utilize scene text information as well as external world knowledge from knowledge graphs. The talk was a summary of our research published at ICDAR and ICCV. Slides available:

Dynamic Inference Models

Published:

I gave a talk on dynamic inference models in deep learning, i.e., models that can perform a variable amount of computation depending on the input complexity. Slides available:

Opacity in artificial intelligence

Published:

For my graduate course on Artificial Intelligence and Society, I gave a presentation on the issue of opacity and the need for interpretability in artificial intelligence, as well as some directions in interpretability research. Slides available:

Abstract Visual Reasoning

Published:

I gave a talk on the problem of Abstract Visual Reasoning in computer vision. We discussed ideas on generalisation across visual concepts, compositionality, the role of inductive biases, symbolic knowledge, etc. Slides available:

teaching

Community Teaching Assistant: Machine Learning

Online course, Coursera.org, 2017

Worked as a community TA for the Machine Learning online course on Coursera.org. Answered students’ questions and provided support on the discussion forums.

Graduate Teaching Assistant: Engineering Analysis

Undergraduate course, University of Guelph, Department of Engineering, 2020

Worked as a teaching assistant for an undergraduate course on linear algebra. Prepared and delivered modules on least squares approximation, PCA, and Markov chains.

Graduate Teaching Assistant: Optimization

Undergraduate course, University of Guelph, Department of Engineering, 2020

Worked as a teaching assistant for an undergraduate course on optimization. Held weekly tutorial sections on linear and dynamic programming, Markov chains, etc.

Instructor: LearnAI

Undergraduate course, University of Toronto, 2020

Instructor for the LearnAI course, taught to ~100 undergraduates (freshmen up to seniors) and supported by a team of 6 TAs. Taught modules on introduction to machine learning, the scientific Python stack, and deep learning in a project-oriented course.

Graduate Teaching Assistant: Modelling Complex Systems

Undergraduate course, University of Guelph, Department of Engineering, 2021

Worked as the only teaching assistant for a programming-intensive undergraduate course on modelling complex systems. Held weekly tutorial sections on programming (using Python), Graphs (using NetworkX), Cellular Automata, and Agent-Based Models. Provided students with support on development tools including Git, reStructuredText markup, and JupyterLab.