  • Publication
    Effect of the Latent Structure on Clustering with GANs
    (2020-01-01)
    Jayendran, Aravind; Prathosh, P. A.
    Generative adversarial networks (GANs) have shown remarkable success in the generation of data from natural data manifolds such as images. In several scenarios, it is desirable that the generated data are well-clustered, especially when there is severe class imbalance. In this paper, we focus on the problem of clustering in the generated space of GANs and uncover its relationship with the characteristics of the latent space. We derive, from first principles, the necessary and sufficient conditions needed to achieve faithful clustering in the GAN framework: (i) presence of a multimodal latent space with adjustable priors, (ii) existence of a latent-space inversion mechanism, and (iii) imposition of the desired cluster priors on the latent space. We also identify the GAN models in the literature that partially satisfy these conditions and demonstrate the importance of all the required components through ablation studies on multiple real-world image datasets. Additionally, we describe a procedure to construct a multimodal latent space which facilitates learning of cluster priors with sparse supervision. Code for our implementation is available at https://github.com/NEMGAN/NEMGAN-P.
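    Condition (i) above amounts to sampling the generator's input from a multimodal prior whose mode weights can be matched to the desired cluster priors. The Python sketch below (names and dimensions are illustrative assumptions, not the NEMGAN-P code) shows one minimal way to build such an adjustable mixture-of-Gaussians latent prior.
      import numpy as np

      def sample_multimodal_latent(n, dim, priors, means, std=0.1, rng=None):
          """Draw n latent vectors from a mixture of Gaussians.

          priors : per-mode probabilities (the adjustable cluster priors)
          means  : (k, dim) array of mode centres, one per desired cluster
          """
          rng = rng or np.random.default_rng()
          modes = rng.choice(len(priors), size=n, p=priors)       # pick a mode per sample
          z = means[modes] + std * rng.standard_normal((n, dim))  # perturb around the mode centre
          return z, modes

      # Example: three clusters with imbalanced priors mirroring an imbalanced dataset.
      means = np.eye(3)  # well-separated mode centres in a 3-D latent space
      z, labels = sample_multimodal_latent(1000, 3, priors=[0.6, 0.3, 0.1], means=means)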
  • Publication
    Large Scale Time-Series Representation Learning via Simultaneous Low- and High-Frequency Feature Bootstrapping
    (2023-01-01)
    Gorade, Vandan; Singh, Azad
    Learning representations from unlabeled time series data is a challenging problem. Most existing self-supervised and unsupervised approaches in the time-series domain fall short in capturing low- and high-frequency features at the same time. As a result, the generalization ability of the learned representations remains limited. Furthermore, some of these methods employ large-scale models like transformers or rely on computationally expensive techniques such as contrastive learning. To tackle these problems, we propose a non-contrastive self-supervised learning (SSL) approach that efficiently captures low- and high-frequency features in a cost-effective manner. The proposed framework comprises a Siamese configuration of a deep neural network with two weight-sharing branches, which are followed by low- and high-frequency feature extraction modules. The two branches of the proposed network allow bootstrapping of the latent representation by taking two different augmented views of the raw time series data as input. The augmented views are created by applying random transformations sampled from a single set of augmentations. The low- and high-frequency feature extraction modules of the proposed network contain multilayer perceptron (MLP) and temporal convolutional network (TCN) heads, respectively, which capture the temporal dependencies of the raw input data at various scales owing to their varying receptive fields. To demonstrate the robustness of our model, we performed extensive experiments and ablation studies on five real-world time-series datasets. Our method achieves state-of-the-art performance on all the considered datasets.
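    A minimal PyTorch sketch of the two-branch idea follows; the architecture below (a single convolutional encoder, an MLP head for the low-frequency summary, and a dilated-convolution head standing in for the TCN) and the cosine bootstrap loss are assumptions for illustration, not the authors' implementation.
      import torch
      import torch.nn as nn
      import torch.nn.functional as F

      class DualHeadEncoder(nn.Module):
          def __init__(self, in_ch=1, hidden=64):
              super().__init__()
              self.encoder = nn.Conv1d(in_ch, hidden, kernel_size=7, padding=3)
              self.mlp_head = nn.Sequential(nn.Linear(hidden, hidden), nn.ReLU(), nn.Linear(hidden, hidden))
              self.tcn_head = nn.Conv1d(hidden, hidden, kernel_size=3, padding=2, dilation=2)

          def forward(self, x):                       # x: (batch, channels, time)
              h = F.relu(self.encoder(x))
              low = self.mlp_head(h.mean(dim=-1))     # global, low-frequency summary
              high = self.tcn_head(h).mean(dim=-1)    # local, high-frequency patterns
              return low, high

      def bootstrap_loss(p, t):
          # Non-contrastive objective: negative cosine similarity, target branch detached.
          return -F.cosine_similarity(p, t.detach(), dim=-1).mean()

      net = DualHeadEncoder()                          # the same weights process both views
      view1, view2 = torch.randn(8, 1, 128), torch.randn(8, 1, 128)  # two augmented views
      l1, h1 = net(view1)
      l2, h2 = net(view2)
      loss = bootstrap_loss(l1, l2) + bootstrap_loss(h1, h2)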
  • Publication
    An Intelligent CMOS Image Sensor System Using Edge Information for Image Classification
    (2023-01-01)
    Kisku, Wilfred; Khandelwal, Prateek
    CMOS image sensors have drawn a lot of attention due to their superior performance over the past decade. Recent technological advancements in near-sensor systems have made possible demanding applications such as visual surveillance, intrusion detection, and aerial monitoring in conflict zones. In the existing methodology, image sensors are used along with a DSP processor to perform image classification and recognition. Continuously reading and transferring the data of all pixels to a CNN requires heavy computation and high ADC power consumption. In this work, we propose a novel design for an intelligent CMOS image sensor wherein an analog Sobel edge detector is introduced before the ADC. With the edge detector, instead of digitizing the natural image obtained from the sensor, only the edges in the image are digitized and transferred for object classification and recognition tasks. This significantly reduces the power consumed by the ADC, as only edge-detected pixels are converted into the digital domain. Analysis shows that the thresholding operation reduces the pixel information in the edge-detected images by 67%, 79%, and 87% for threshold values of 0.2, 0.3, and 0.4, respectively. This work shows that CNN models can still be trained to acceptable accuracy on state-of-the-art architectures with these edge-detected images, reducing the operating power of intelligent edge devices.
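    A rough software analogue of the thresholded Sobel step is sketched below in Python/NumPy (an assumption made for illustration, not the analog circuit): only gradient magnitudes above the threshold would be digitized, so the fraction of suppressed pixels approximates the ADC activity saved.
      import numpy as np

      def sobel_magnitude(img):
          """Gradient magnitude of a 2-D image normalized to [0, 1]."""
          kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
          ky = kx.T
          pad = np.pad(img, 1, mode="edge")
          gx = np.zeros_like(img, dtype=float)
          gy = np.zeros_like(img, dtype=float)
          for i in range(img.shape[0]):
              for j in range(img.shape[1]):
                  patch = pad[i:i + 3, j:j + 3]
                  gx[i, j] = np.sum(patch * kx)     # horizontal gradient
                  gy[i, j] = np.sum(patch * ky)     # vertical gradient
          return np.hypot(gx, gy)

      img = np.random.rand(64, 64)                  # stand-in for a normalized sensor frame
      edges = sobel_magnitude(img)
      for th in (0.2, 0.3, 0.4):
          kept = edges > th
          print(f"threshold {th}: {100 * (1 - kept.mean()):.1f}% of pixels suppressed")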
  • Publication
    On-Array Compressive Acquisition in CMOS Image Sensors Using Accumulated Spatial Gradients
    (2021-02-01)
    Amogh, K. M.; Sarkar, Mukul
    A compressive acquisition technique for on-array image compression is proposed in this paper. It capitalizes on the representation ability of accumulated spatial gradients of the acquired scene. The local variations inferred from the strength of the accumulated gradients are used as cues to vary the number of samples read through the image sensor readout. Such sampling enables reconstruction with the desired quality using traditional interpolation techniques. The proposed method is first verified using MATLAB simulations, where, on average, a compression of 87% is achieved for a threshold of 40 intensity levels. The images are reconstructed using the nearest-neighbour interpolation (NNI) method, which results in a mean peak signal-to-noise ratio (PSNR) of 29.09 dB. The reconstructed images are further enhanced using a deep convolutional neural network, which improves the PSNR to 32.46 dB. The biggest advantage of the proposed technique is its low-complexity hardware design. As a proof of concept, a hardware implementation of the technique is performed using discrete components. Pixel intensity values of standard images are converted into analog voltages using a data acquisition system and mapped to the input voltage range of 1.5 V to 5.5 V. For a threshold of 3.8 V, a compression of 81% to 83% is observed for the considered images. The proposed technique is simple and effective, and is suitable for low-power complementary metal oxide semiconductor (CMOS) image sensors.
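    The Python/NumPy sketch below (with a block-wise sampling rule and numbers that are purely illustrative assumptions) mimics the idea: blocks whose accumulated gradient strength falls below a threshold are read with a single sample and refilled by nearest-neighbour repetition, and the PSNR of the reconstruction is measured against the original.
      import numpy as np

      def compress_reconstruct(img, block=8, threshold=40):
          """img: 2-D array with intensity levels 0..255."""
          out = img.astype(float).copy()
          read = np.ones_like(img, dtype=bool)
          gy, gx = np.gradient(img.astype(float))
          strength = np.abs(gx) + np.abs(gy)               # accumulated spatial gradient
          for i in range(0, img.shape[0], block):
              for j in range(0, img.shape[1], block):
                  if strength[i:i + block, j:j + block].mean() < threshold:
                      # Low-activity block: read one sample and replicate it (nearest neighbour).
                      out[i:i + block, j:j + block] = img[i, j]
                      read[i:i + block, j:j + block] = False
                      read[i, j] = True
          return out, read

      def psnr(ref, rec):
          mse = np.mean((ref.astype(float) - rec) ** 2)
          return 10 * np.log10(255.0 ** 2 / max(mse, 1e-12))

      x = np.linspace(0, 255, 128)
      img = (np.add.outer(x, x) / 2).astype(np.uint8)      # smooth ramp: mostly low activity
      rec, read = compress_reconstruct(img)
      print(f"compression: {100 * (1 - read.mean()):.1f}%, PSNR: {psnr(img, rec):.2f} dB")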
  • Publication
    Discovering and Overcoming Limitations of Noise-engineered Data-free Knowledge Distillation
    (2022-01-01)
    Raikwar, Piyush
    Distillation in neural networks using only samples randomly drawn from a Gaussian distribution is possibly the most straightforward solution one can think of for the complex problem of knowledge transfer from one network (teacher) to another (student). If done successfully, it can eliminate the requirement of the teacher's training data for knowledge distillation and avoid the privacy concerns that often arise in sensitive applications such as healthcare. There have been some recent attempts at Gaussian noise-based data-free knowledge distillation; however, none of them offers a consistent or reliable solution. We identify the shift in the distribution of hidden-layer activations as the key limiting factor, which occurs when Gaussian noise is fed to the teacher network instead of the accustomed training data. We propose a simple solution to mitigate this shift and show that for vision tasks, such as classification, it is possible to achieve a performance close to the teacher by using only samples randomly drawn from a Gaussian distribution. We validate our approach on the CIFAR10, CIFAR100, SVHN, and Food101 datasets. We further show that in situations of sparsely available original data for distillation, the proposed Gaussian noise-based knowledge distillation method can outperform distillation using the available data by a large margin. Our work lays the foundation for further research in the direction of noise-engineered knowledge distillation using random samples.
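    For reference, the basic noise-driven distillation loop looks roughly like the PyTorch sketch below; the tiny teacher/student networks, temperature, and optimizer settings are placeholder assumptions, and the paper's key contribution (correcting the hidden-layer activation shift) is not reproduced here.
      import torch
      import torch.nn as nn
      import torch.nn.functional as F

      teacher = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10)).eval()
      student = nn.Sequential(nn.Linear(32, 32), nn.ReLU(), nn.Linear(32, 10))
      opt = torch.optim.Adam(student.parameters(), lr=1e-3)
      T = 4.0                                          # distillation temperature

      for step in range(100):
          z = torch.randn(64, 32)                      # Gaussian inputs only, no real data
          with torch.no_grad():
              t_logits = teacher(z)
          s_logits = student(z)
          # Standard KD loss: KL divergence between softened teacher and student outputs.
          loss = F.kl_div(F.log_softmax(s_logits / T, dim=1),
                          F.softmax(t_logits / T, dim=1),
                          reduction="batchmean") * T * T
          opt.zero_grad()
          loss.backward()
          opt.step()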
  • Publication
    MBGRLp: Multiscale Bootstrap Graph Representation Learning on Pointcloud (Student Abstract)
    (2022-06-30)
    Gorade, Vandan; Singh, Azad
    Point clouds have gained a lot of attention with the availability of large amounts of point cloud data and increasing applications such as city planning and self-driving cars. However, current methods often rely on labeled information and costly processing, such as converting point clouds to voxels. We propose a self-supervised learning approach to tackle these problems, avoiding labelling and additional memory costs. Our proposed method achieves results comparable to supervised and unsupervised baselines on widely used benchmark datasets for self-supervised point cloud classification, such as ShapeNet and ModelNet10/40.
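    The bootstrap ingredient referenced in the title generally follows the BYOL recipe; the generic sketch below (plain linear encoders instead of graph layers, with assumed sizes) only illustrates that recipe: an online network predicts the output of a slowly updated target network, so neither labels nor negative pairs are needed.
      import torch
      import torch.nn as nn
      import torch.nn.functional as F

      online = nn.Linear(16, 8)
      target = nn.Linear(16, 8)
      target.load_state_dict(online.state_dict())        # target starts as a copy of online
      predictor = nn.Linear(8, 8)

      def bootstrap_step(view1, view2, tau=0.99):
          p = predictor(online(view1))
          with torch.no_grad():
              t = target(view2)                           # target branch gives the regression goal
          loss = -F.cosine_similarity(p, t, dim=-1).mean()
          with torch.no_grad():                           # exponential moving average of the target
              for po, pt in zip(online.parameters(), target.parameters()):
                  pt.mul_(tau).add_((1 - tau) * po)
          return loss

      loss = bootstrap_step(torch.randn(4, 16), torch.randn(4, 16))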
  • Publication
    A Power Efficient Image Sensor Readout with On-Chip δ-Interpolation Using Reconfigurable ADC
    (2020-07-01)
    Sarkar, Mukul
    In this paper, a low-power readout using a reconfigurable cyclic ADC for CMOS image sensors is proposed. It reduces the total number of pixels to be read by taking advantage of pixel correlation. The required number of ADC operations is reduced, resulting in power saving. In contrast to the existing pixel-correlation-based approaches, which focus only on the intensity differences, the proposed method also takes the polarity of the differences into account. This helps in preserving fine edges representing features such as texture. The discarded or unread pixels are interpolated on-chip while reconfiguring the ADC input range according to the interpolation step size. Furthermore, this reduces the number of ADC conversion cycles by 25% to 50% for interpolation steps of 16 and 64 LSBs, respectively. The ADC is designed and fabricated in UMC 180-nm CMOS technology, and the proposed method is verified for standard test images. The reconstructed images, incorporating ADC non-linearities, result in average Pratt's FoM values of 0.88, 0.86, and 0.81 for 60%, 70%, and 74% compression, respectively. The corresponding best values achieved by the existing approaches are 0.86, 0.80, and 0.77, respectively. The improvement in FoM is observed due to the consideration of polarity information. The proposed technique results in 33% to 50% power saving for 80% compression in a 512×512 image, using the reconfigurable ADC. Therefore, it is suitable for a power-efficient CMOS sensor design.
    Scopus© Citations 3
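    A behavioural model of the signed-difference (δ) idea along one readout row is sketched below in Python/NumPy; the every-other-pixel skipping pattern and step sizes are illustrative assumptions, not the fabricated readout. When the signed difference between two read neighbours stays within the interpolation step, the pixel between them is interpolated with the correct polarity instead of being fully converted.
      import numpy as np

      def delta_interpolate_row(row, step=16):
          out = row.astype(float).copy()
          skipped = np.zeros(len(row), dtype=bool)
          for i in range(1, len(row) - 1, 2):          # candidate pixels to skip
              delta = float(row[i + 1]) - float(row[i - 1])
              if abs(delta) <= step:
                  out[i] = row[i - 1] + delta / 2      # midpoint keeps the sign of the change
                  skipped[i] = True
          return out, skipped

      row = (np.random.rand(512) * 255).astype(np.uint8)
      rec, skipped = delta_interpolate_row(row, step=64)
      print(f"{100 * skipped.mean():.1f}% of pixels interpolated instead of converted")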
  • Publication
    A reconfigurable cyclic ADC for biomedical applications
    Bio-signals such as the electroencephalogram (EEG) contain low-activity regions, often called B-noise, and high-activity regions called action potentials. The high-activity regions are more important than their counterparts. In addition, the signals are considerably sparse in the low-activity regions. Thus, a full n-bit conversion of low-activity samples into the digital domain increases readout power and reduces the data acquisition rate of the analog-to-digital converter (ADC). To alleviate these problems, a reconfigurable cyclic ADC is presented in this paper. The input range and conversion cycles of the proposed ADC are varied according to the samples of the neural signal. The high-activity samples are resolved using the conventional n bits, whereas the low-activity regions are resolved using fewer bits. This saves readout power and also reduces the digital data content. The proposed ADC is designed and fabricated in UMC 180-nm CMOS technology. The ADC operates at a sampling rate of 200 kS/s and consumes 61.8 μW of power. The chip occupies an area of 0.031 mm². Using reconfiguration, a power saving of 28.6% is achieved compared to the conventional n-bit full conversion.
    Scopus© Citations 2
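    The sketch below (Python/NumPy; the bit depths, threshold, and synthetic signal are assumptions made for illustration) models the activity-dependent resolution in behaviour: samples in high-activity regions are quantized with the full n bits, low-activity samples with fewer bits, and the average bits per sample approximates the data and power saving.
      import numpy as np

      def adaptive_quantize(x, full_bits=10, low_bits=4, threshold=0.1):
          """x is normalized to [-1, 1]; threshold separates low from high activity."""
          bits = np.where(np.abs(x) > threshold, full_bits, low_bits)
          levels = 2.0 ** bits
          q = np.round((x + 1) / 2 * (levels - 1)) / (levels - 1) * 2 - 1
          return q, bits

      t = np.linspace(0, 1, 2000)
      signal = 0.02 * np.random.randn(2000)                            # low-activity background
      signal[800:900] += 0.6 * np.sin(2 * np.pi * 40 * t[800:900])     # a burst of high activity
      q, bits = adaptive_quantize(signal)
      print(f"average bits/sample: {bits.mean():.2f} (vs. 10 for full conversion)")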
  • Publication
    Target-Independent Domain Adaptation for WBC Classification Using Generative Latent Search
    (2020-12-01)
    Pandey, Prashant; Prathosh, A. P.; Kyatham, Vinay; Dastidar, Tathagato Rai
    Automating the classification of camera-obtained microscopic images of White Blood Cells (WBCs) and related cell subtypes has assumed importance since it aids the laborious manual process of review and diagnosis. Several State-Of-The-Art (SOTA) methods developed using Deep Convolutional Neural Networks suffer from the problem of domain shift: severe performance degradation when they are tested on data (target) obtained in a setting different from that of the training (source). The change in the target data might be caused by factors such as differences in camera/microscope types, lenses, lighting conditions, etc. This problem can potentially be solved using Unsupervised Domain Adaptation (UDA) techniques, although standard algorithms presuppose the existence of a sufficient amount of unlabelled target data, which is not always the case with medical images. In this paper, we propose a method for UDA that is devoid of the need for target data. Given a test image from the target data, we obtain its 'closest clone' from the source data, which is then used as a proxy in the classifier. We prove the existence of such a clone, given that an infinite number of data points can be sampled from the source distribution. We propose a method in which a latent-variable generative model based on variational inference is used to simultaneously sample and find the 'closest clone' from the source distribution through an optimization procedure in the latent space. We demonstrate the efficacy of the proposed method over several SOTA UDA methods for WBC classification on datasets captured using different imaging modalities under multiple settings.
    Scopus© Citations 17
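    Schematically, the latent search can be written as the PyTorch snippet below; the decoder, classifier, sizes, and optimizer settings are placeholders rather than the paper's variational model. A latent code is optimized so that the generator's output matches the target image, and the resulting 'closest clone' is then classified by the source-trained classifier.
      import torch
      import torch.nn as nn

      decoder = nn.Sequential(nn.Linear(32, 256), nn.ReLU(), nn.Linear(256, 784)).eval()
      classifier = nn.Sequential(nn.Linear(784, 10)).eval()

      def closest_clone(x_target, steps=200, lr=0.05):
          z = torch.zeros(1, 32, requires_grad=True)       # search variable lives in latent space
          opt = torch.optim.Adam([z], lr=lr)
          for _ in range(steps):
              recon = decoder(z)
              loss = ((recon - x_target) ** 2).mean()      # match the target image, not its label
              opt.zero_grad()
              loss.backward()
              opt.step()
          return decoder(z).detach()

      x_target = torch.rand(1, 784)                        # stand-in for a target-domain WBC image
      clone = closest_clone(x_target)
      pred = classifier(clone).argmax(dim=1)               # classify the source-domain proxy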
  • Publication
    On Edge FPN Reduction in CMOS Image Sensor Using CNN with Attention Mechanism
    (2023-01-01)
    Kodam, Sandeep; Kisku, Wilfred
    Obtaining a good-quality image from a CMOS Image Sensor (CIS) is always a challenge due to the noise present within the image sensor system. One of the dominant sources of noise in a CIS with column-parallel readout is Fixed Pattern Noise (FPN), which significantly degrades image quality. This work implements an architecture for the reduction of vertical FPN, called the Fixed Pattern Noise Reduction Network (FPNrNet), which uses a Convolutional Neural Network (CNN) with an attention mechanism. The denoising performance of the FPNrNet model is quite similar to that of standard denoising models; however, a significant reduction in model size is observed due to the reduced number of parameters. An average Peak Signal-to-Noise Ratio (PSNR) improvement of around 11.3 dB with respect to the noisy input image and an average Structural Similarity Index Measure (SSIM) of 0.99 are observed for the Pascal VOC 2012 dataset. Further, the model is quantized to different bit precisions using the QKeras library and synthesized using the High-Level Synthesis for Machine Learning (hls4ml) platform to make it hardware-friendly, so that inference can be performed on resource-constrained edge devices.
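    For intuition, the small Python/NumPy illustration below (the offset model and numbers are assumptions, not the paper's noise model) shows what vertical FPN does to a frame: each readout column receives a fixed offset, which lowers the PSNR and is precisely the artefact the denoising network is trained to remove.
      import numpy as np

      def add_column_fpn(img, sigma=5.0, rng=None):
          rng = rng or np.random.default_rng(0)
          offsets = rng.normal(0, sigma, size=img.shape[1])   # one fixed offset per column
          return np.clip(img + offsets[None, :], 0, 255)

      def psnr(ref, noisy):
          mse = np.mean((ref.astype(float) - noisy) ** 2)
          return 10 * np.log10(255.0 ** 2 / mse)

      clean = (np.random.rand(128, 128) * 255).astype(np.uint8)
      noisy = add_column_fpn(clean)
      print(f"PSNR of the FPN-corrupted frame: {psnr(clean, noisy):.2f} dB")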