Posts by Collection
portfolio
publications
Road Damage Detection And Classification In Smartphone Captured Images Using Mask R-CNN
Published in IEEE BigData Cup 2018 workshop, 2018
We fine-tuned a Mask R-CNN pretrained on the MS-COCO dataset to detect and classify road damage in real-world smartphone images of roads.
OCR-VQA: Visual question answering by reading text in images
Published in International Conference on Document Analysis and Recognition, 2019
We released the first dataset for Visual Question Answering that requires understanding scene text through Optical Character Recognition (OCR).
From Strings to Things: Knowledge-enabled VQA Model that can Read and Reason
Published in International Conference on Computer Vision, 2019
We released the first dataset for Visual Question Answering (VQA) that requires traversing an external knowledge graph as well as understanding scene text through Optical Character Recognition (OCR). We also proposed a Graph-RNN-based approach for VQA with external knowledge and demonstrated state-of-the-art results.
Recommended citation: http://openaccess.thecvf.com/content_ICCV_2019/papers/Singh_From_Strings_to_Things_Knowledge-Enabled_VQA_Model_That_Can_Read_ICCV_2019_paper.pdf
Operator-in-the-Loop Deep Sequential Multi-camera Feature Fusion for Person Re-identification
Published in IEEE Transactions on Information Forensics and Security, 2019
We developed a human-in-the-loop system that ranks image retrieval results from an image recognition CNN and then merges them with the human operator's choice for person re-identification. Our method showed consistent improvement in performance across six cameras compared to relying on the model alone.
Recommended citation: https://ieeexplore.ieee.org/abstract/document/8922622
Response Time Analysis for Explainability of Visual Processing in CNNs
Published in Computer Vision and Pattern Recognition (CVPR) Workshop, 2020
We adapt response time evaluation from human psychophysics to deep learning, calculating Neural Response Times (NRT) by profiling dynamic DNNs. We verify that NRT can corroborate known effects about the feature space composed by object recognition models when tested on out-of-distribution (OOD) viewpoints. We further demonstrate that NRT can be used to causally verify Scene Grammar effects in identifying semantic and syntactic inconsistencies across visual scenes.
Recommended citation: https://openaccess.thecvf.com/content_CVPRW_2020/papers/w26/Taylor_Response_Time_Analysis_for_Explainability_of_Visual_Processing_in_CNNs_CVPRW_2020_paper.pdf
Context-aware Scene Graph Generation with Seq2Seq Transformers
Published in International Conference on Computer Vision, 2021
We demonstrated state-of-the-art results in generating scene graphs from natural images through sequential subject->predicate->object generation with a transformer encoder applied to object bounding boxes obtained from a detection model.
Recommended citation: https://openaccess.thecvf.com/content/ICCV2021/papers/Lu_Context-Aware_Scene_Graph_Generation_With_Seq2Seq_Transformers_ICCV_2021_paper.pdf
Neural response time analysis: Explainable artificial intelligence using only a stopwatch
Published in Applied AI Letters, 2021
We extended our work on Neural Response Time analysis from CVPR-W 2020 with an additional experiment showing that NRTs are sensitive to intra-class variations, yet can be used to reliably inform between intra-class variations between objects.
Recommended citation: https://onlinelibrary.wiley.com/doi/pdf/10.1002/ail2.48
Inductive Biases For Higher-Order Visual Cognition
Published in Master's Thesis, 2022
We demonstrated state-of-the-art results in generating scene graphs from natural images through sequential subject->predicate->object generation with a transformer encoder applied to object bounding boxes obtained from a detection model.
Recommended citation: https://atrium.lib.uoguelph.ca/xmlui/bitstream/handle/10214/26739/Shekhar_Shashank_202201_MASc.pdf
talks
Knowledge-Enabled Visual Question Answering that utilizes scene text
I gave a talk on my research on visual question answering, where models can successfully utilize scene text information as well as external world knowledge from knowledge graphs. The talk was a summary of our research published at ICDAR and ICCV. Slides available:
Dynamic Inference Models
I gave a talk on dynamic inference models in deep learning, i.e., models that can perform a variable amount of computation depending on the input complexity. Slides available:
Probabilistic Programming Tutorial
I gave a tutorial on probabilistic programming using Pyro. Code is available at this GitHub repository: https://github.com/sshkhr/ppl_tutorial. Slides below:
Artificial Cognition
I gave a talk on Explainable Artificial Intelligence (XAI) using methods from Cognitive Science to the Data Science Club at my alma mater, Indian Institute of Technology (ISM) Dhanbad. This talk was based on work done by my advisor and collaborator Eric J. Taylor and me for our CVPR workshop paper: Response Time Analysis for Explainability of Visual Processing in CNNs. Slides available:
Opacity in artificial intelligence
For my graduate course on Artificial Intelligence and Society, I gave a presentation on the issue of opacity and the need for interpretability in artificial intelligence, as well as some directions in interpretability research. Slides available:
Abstract Visual Reasoning
I gave a talk on the problem of Abstract Visual Reasoning in computer vision. We discussed ideas on generalisation across visual concepts, compositionality, the role of inductive biases, symbolic knowledge, etc. Slides available:
What can neural networks reason about?
I gave two talks summarizing the ICLR 2020 paper "What Can Neural Networks Reason About?" This paper proposes a theoretical framework to understand the role of inductive biases in the ability of neural networks to reason about problems that have algorithmic solutions. Slides available:
Differentiable Neural Computers
I presented the 2016 DeepMind paper "Hybrid computing using a neural network with dynamic external memory". This paper proposes using a multi-head attention framework to train a neural network with an external memory to perform computing operations. Slides available:
teaching
Community Teaching Assistant: Machine Learning
Online course, Coursera.org, 2017
Worked as a community TA for the Machine Learning online course on Coursera.org. Answered students’ questions and provided support on the discussion forums.
Graduate Teaching Assistant: Engineering Analysis
Undergraduate course, University of Guelph, Department of Engineering, 2020
Worked as a teaching assistant for an undergraduate course on linear algebra. Prepared and delivered modules on least squares approximation, PCA, and Markov chains.
Graduate Teaching Assistant: Optimization
Undergraduate course, University of Guelph, Department of Engineering, 2020
Worked as a teaching assistant for an undergraduate course on optimization. Held weekly tutorial sections on linear and dynamic programming, Markov chains, etc.
Instructor: LearnAI
Undergraduate course, University of Toronto, 2020
Instructor for the LearnAI course, taught to ~100 undergraduates (freshmen to seniors) with the support of a team of 6 TAs. Taught modules on introduction to machine learning, the scientific Python stack, and deep learning in a project-oriented course.
Graduate Teaching Assistant: Modelling Complex Systems
Undergraduate course, University of Guelph, Department of Engineering, 2021
Worked as the only teaching assistant for a programming-intensive undergraduate course on modelling complex systems. Held weekly tutorial sections on programming (using Python), graphs (using NetworkX), cellular automata, and agent-based models. Provided students with support on development technologies including Git, reStructuredText markup, and JupyterLab.