  • Gradient sensitive kernel for image denoising, using Gaussian Process Regression
    (2016-06-10)
    Dey, Arka Ujjal
    We target the problem of image denoising using Gaussian Process Regression (GPR). Being a non-parametric regression technique, GPR has received much attention in the recent past, and here we further explore its versatility by applying it to a denoising problem. The focus is primarily on the design of a local gradient-sensitive kernel that captures pixel similarity in the context of image denoising. This novel kernel formulation is used to shape the smoothness of the joint GP prior. We apply the GPR denoising technique to small patches and then stitch these patches back together; this keeps the priors local and relevant and also helps in dealing with the computational complexity of GPR. We demonstrate that our GPR-based technique gives better PSNR values than existing popular denoising techniques.
    Scopus© Citations 1
  • Generating synthetic handwriting using n-gram letter glyphs
    (2016-12-18)
    Dey, Arka Ujjal
    We propose a framework for the synthesis of natural semi-cursive handwritten Latin script that can find application in text personalization or in the generation of synthetic data for recognition systems. Our method is based on the generation of synthetic n-gram letter glyphs and their subsequent concatenation. We propose a non-parametric, data-driven generation scheme that mimics the variation observed in handwritten glyph samples to synthesize natural-looking synthetic glyphs. These synthetic glyphs are then stitched together to form complete words using a spline-based concatenation scheme. Further, as a refinement, our method is able to generate pen-lifts, giving our results a natural semi-cursive look. Through subjective experiments and detailed analysis of the results, we demonstrate the effectiveness of our formulation in generating natural-looking synthetic script.
  • Beyond visual semantics: Exploring the role of scene text in image understanding
    (2021-09-01)
    Dey, Arka Ujjal; Ghosh, Suman K.; Valveny, Ernest
    Images with visual and scene text content are ubiquitous in everyday life. However, current image interpretation systems are mostly limited to using only the visual features, neglecting to leverage the scene text content. In this paper, we propose to jointly use scene text and visual channels for robust semantic interpretation of images. We not only extract and encode visual and scene text cues but also model their interplay to generate a contextual joint embedding with richer semantics. The contextual embedding thus generated is applied to retrieval and classification tasks on multimedia images with scene text content to demonstrate its effectiveness. In the retrieval framework, we augment the contextual semantic representation with scene text cues to mitigate vocabulary misses that may have occurred during the semantic embedding. To deal with irrelevant or erroneous scene text recognition, we also apply query-based attention to the text channel. We show that our multi-channel approach, involving contextual semantics and scene text, improves upon the absolute accuracy of the current state-of-the-art methods on the Advertisement Images Dataset by 8.9% in the relevant statement retrieval task and by 5% in the topic classification task.
    Scopus© Citations 10
  • Greedy Gaussian Process Regression Applied to Object Categorization and Regression
    (2018-12-18)
    Dey, Arka Ujjal; Hafez, A. H. Abdul
    In this work we propose an approximation of Gaussian Processes (GP) and apply it to classification and regression tasks. We primarily target the problem of visual object categorization using a greedy variant of Gaussian Processes. To deal with the prohibitive training and inference cost of GP, we devise a greedy approach to subset selection and inducing-input choice to approximate the kernel matrix, resulting in faster retrieval timings. A localized combination of kernel functions is designed and used in a framework of sparse approximations to Gaussian Processes for visual object categorization and generic regression tasks. Through exhaustive experimentation and empirical results, we demonstrate the effectiveness of the proposed approach when compared with other kernel-based methods.
    Scopus© Citations 1
  • EKTVQA: Generalized Use of External Knowledge to Empower Scene Text in Text-VQA
    (2022-01-01)
    Dey, Arka Ujjal; Valveny, Ernest
    The open-ended question answering task of Text-VQA often requires reading and reasoning about rarely seen or completely unseen scene text content of an image. We address this zero-shot nature of the task by proposing the generalized use of external knowledge to augment our understanding of the scene text. We design a framework to extract, validate, and reason with knowledge using a standard multimodal transformer for vision language understanding tasks. Through empirical evidence and qualitative results, we demonstrate how external knowledge can highlight instance-only cues and thus help deal with training data bias, improve answer entity type correctness, and detect multiword named entities. We generate results comparable to the state-of-the-art on three publicly available datasets under the constraints of similar upstream OCR systems and training data.
    Scopus© Citations 1
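
As a rough illustration of the patch-wise GPR denoising idea mentioned in the first abstract above, the following minimal sketch computes the GP posterior mean over a single small patch. It is not the authors' method: it assumes a standard squared-exponential kernel on pixel coordinates in place of the gradient-sensitive kernel, and the function names (`rbf_kernel`, `solve`, `gpr_denoise_patch`) and the parameters `noise_var` and `length_scale` are hypothetical choices made only for this example.

```python
import math


def rbf_kernel(p, q, length_scale=1.5):
    """Squared-exponential kernel on 2-D pixel coordinates (an assumed
    stand-in, not the gradient-sensitive kernel from the abstract)."""
    d2 = (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2
    return math.exp(-d2 / (2.0 * length_scale ** 2))


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def gpr_denoise_patch(patch, noise_var=0.01, length_scale=1.5):
    """Return the GP posterior mean at every pixel of a small 2-D patch.

    With kernel matrix K and noise variance s2, the denoised values are
    mean + K (K + s2 I)^{-1} (y - mean), where mean is the patch average.
    """
    h, w = len(patch), len(patch[0])
    coords = [(i, j) for i in range(h) for j in range(w)]
    n = len(coords)
    y = [patch[i][j] for i, j in coords]
    mean = sum(y) / n
    centered = [v - mean for v in y]

    K = [[rbf_kernel(p, q, length_scale) for q in coords] for p in coords]
    # Add the observation-noise variance on the diagonal.
    K_noisy = [row[:] for row in K]
    for a in range(n):
        K_noisy[a][a] += noise_var

    alpha = solve(K_noisy, centered)
    f = [mean + sum(K[a][b] * alpha[b] for b in range(n)) for a in range(n)]
    return [f[i * w:(i + 1) * w] for i in range(h)]
```

Keeping the patch small keeps the linear solve cheap, since its cost grows cubically with the number of pixels; a full image would be handled by denoising overlapping or tiled patches and stitching the results back together, as the abstract describes.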