Vishwa Karia

Seattle, Washington, United States

About

I am a Software Engineer at Meta, working in the AI Infrastructure organization…

Experience

  • Meta

    Bellevue, Washington, United States

  • -

    Seattle, Washington, United States

  • -

    Global

  • -

    Greater Seattle Area

  • -

  • -

    Greater Seattle Area

  • -

    Greater Los Angeles Area

  • -

    Greater New York City Area

  • -

    Greater Los Angeles Area

  • -

    Bangalore, India

  • -

    Mumbai Area, India

Education

Volunteer Experience

  • Social Initiative Volunteer

    Technovanza

    - 1 year 6 months

    Education

    Organized educational sessions at Mumbai Municipal schools and Non-Governmental Organizations to teach Basic Algebra and Science to underprivileged students

Publications

  • GenSample: A Genetic Algorithm for Oversampling in Imbalanced Datasets

    Imbalanced datasets are ubiquitous. Classification performance on imbalanced datasets is generally poor for the minority class as the classifier cannot learn decision boundaries well. However, in sensitive applications like fraud detection, medical diagnosis, and spam identification, it is extremely important to classify the minority instances correctly. In this paper, we present a novel technique based on genetic algorithms, GenSample, for oversampling the minority class in imbalanced datasets. GenSample decides the rate of oversampling a minority example by taking into account the difficulty in learning that example, along with the performance improvement achieved by oversampling it. This technique terminates the oversampling process when the performance of the classifier begins to deteriorate. Consequently, it produces synthetic data only as long as a performance boost is obtained. The algorithm was tested on 9 real-world imbalanced datasets of varying sizes and imbalance ratios. It achieved the highest F-Score on 8 out of 9 datasets, confirming its ability to better handle imbalanced data compared to other existing methodologies.
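
    To illustrate why the abstract reports F-Score rather than accuracy, the sketch below computes both for a classifier that always predicts the majority class. It assumes the F1 variant of the F-Score and uses made-up labels; the function name `f_score` is hypothetical, and this is not code from the paper.

```python
def f_score(y_true, y_pred, positive=1):
    """F1 score for the `positive` (minority) class; assumes the F1 variant."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No minority example was found, so precision and recall are both zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up imbalanced labels: 95 majority (0) and 5 minority (1) examples.
y_true = [0] * 95 + [1] * 5
always_majority = [0] * 100

accuracy = sum(t == p for t, p in zip(y_true, always_majority)) / len(y_true)
print(accuracy)                          # 0.95
print(f_score(y_true, always_majority))  # 0.0
```

    Accuracy looks high even though no minority instance is classified correctly, which is why a minority-class metric is the more informative one here.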

Courses

  • Artificial Intelligence

    -

  • Big Data Analytics

    -

  • Cloud Computing

    -

  • Compiler Construction

    -

  • Computer Networks

    -

  • Computer Organization and Architecture

    -

  • Cryptography and Network Security

    -

  • Current Topics in Computer Science: System Design/Architecture

    CS 259

  • Data Mining and Data Warehousing

    -

  • Data Structures and Algorithms

    -

  • Digital Image Processing

    -

  • Fundamentals of Database Systems

    -

  • Internet of Things

    -

  • Introduction to Data Management and Statistical Computing

    BIOSTAT203A-1

  • Introduction to Operating Systems

    -

  • Large Scale Data Mining

    ECE 219

  • Learning from Text

    -

  • Machine Learning

    -

  • Neural Networks and Deep Learning

    ECE239AS

  • Probabilistic Programming and Relational Learning

    CS 267A

  • Software Design Patterns

    -

  • Software Engineering

    -

  • Statistics and Optimization

    -

  • Web Information Management

    CS 246

Projects

  • A Neural-based System for Question Answering on SQuAD 2.0

    -

    - Used a Bidirectional Attention Flow (BiDAF) network to build a question answering system for the SQuAD 2.0 dataset
    - Explored GloVe, fastText and BERT embeddings with the BiDAF model to improve performance

  • Code2Cap: Automated Code Captioning

    -

    - Constructed a novel model that produces a natural language description of an input code snippet written in Python or Java
    - Used transformers for embedding the input data as well as predicting the output sequence of words
    - Evaluated the effect of using an AST generated from the code snippet as an input feature

  • Automatic Colorization of Grayscale Images

    -

    - Developed an Autoencoder-based CNN model and a Generative Adversarial Network (GAN) based model for automatic colorization of grayscale images
    - Performed prediction of pixel values for RGB and LAB channels to assess the effect of predicting 3 vs 2 values for each pixel

  • Mini Search Engine

    -

    - Built a small-scale search engine based on the Simple English Wikipedia dataset using a custom analyzer of Elasticsearch
    - Used a probabilistic language model-based approach for spell correction to improve search accuracy
    - Improved the ranking function by leveraging Latent Dirichlet Allocation (LDA) to extract latent topics from the corpus
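
    The spell-correction idea can be sketched as follows. This is a minimal illustration, not the project's code: it assumes a unigram language model built from word counts and candidates within one edit of the query word, and the names `edits1` and `correct` and the tiny corpus are hypothetical.

```python
from collections import Counter
import string


def edits1(word):
    """All strings one edit (delete, transpose, replace, insert) away from word."""
    letters = string.ascii_lowercase
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [a + b[1:] for a, b in splits if b]
    transposes = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1]
    replaces = [a + c + b[1:] for a, b in splits if b for c in letters]
    inserts = [a + c + b for a, b in splits for c in letters]
    return set(deletes + transposes + replaces + inserts)


def correct(word, counts):
    """Return the most probable known word under a unigram model."""
    if word in counts:
        return word  # already a known word
    candidates = [w for w in edits1(word) if w in counts]
    if not candidates:
        return word  # nothing better to offer
    total = sum(counts.values())
    return max(candidates, key=lambda w: counts[w] / total)


# Toy corpus standing in for the indexed documents (an assumption).
corpus = "the search engine ranks the pages and the engine returns pages".split()
counts = Counter(corpus)

print(correct("serch", counts))  # search
print(correct("teh", counts))    # the
```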

  • Prediction of Flight Delays

    -

    - Built supervised learning models (Logistic Regression, Random Forests and a Neural Network) to predict airline delays from historical flight data
    - Used Hadoop and Pig to partition the data and generate a feature matrix
    - Analyzed and compared the results produced by these models

  • Probabilistic Database System

    -

    - Built a Probabilistic Database System to compute the probability of fully-quantified queries in first-order logic
    - Implemented approximation schemes such as Monte Carlo sampling and Markov Chain Monte Carlo (MCMC) methods (Metropolis-Hastings and Gibbs Sampling) for evaluating #P-hard queries, and compared their results
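
    As a rough illustration of the Monte Carlo approach (not the project's implementation), the sketch below estimates the probability of the query "there exists x with R(x) and S(x)" by sampling possible worlds. It assumes a tuple-independent database, and the relations, probabilities and function names are made up. This toy query has a closed form, which makes it convenient for checking the estimate; the sampling loop itself does not rely on that.

```python
import random

# Hypothetical tuple-independent relations: tuple -> probability of being present.
R = {"a": 0.5, "b": 0.8}
S = {"a": 0.4, "b": 0.1}


def exact_probability(R, S):
    """Closed form for: exists x. R(x) and S(x), with independent tuples."""
    p_none = 1.0
    for x in R.keys() & S.keys():
        p_none *= 1.0 - R[x] * S[x]
    return 1.0 - p_none


def monte_carlo_probability(R, S, samples=20000, seed=0):
    """Estimate the query probability as the fraction of sampled worlds satisfying it."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        # Sample one possible world by including each tuple independently.
        world_R = {x for x, p in R.items() if rng.random() < p}
        world_S = {x for x, p in S.items() if rng.random() < p}
        if world_R & world_S:  # query holds in this world
            hits += 1
    return hits / samples


print(exact_probability(R, S))
print(monte_carlo_probability(R, S))
```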

  • Image Captioning using Object Detection

    -

    - Constructed annotation vectors using a deep CNN (based on the VGG deep neural network model) to recognize objects in an image and their locations
    - Built a model to feed the annotation vectors into a Recurrent Neural Network to generate a short descriptive sentence
    - Implemented the system with the Keras API on the TensorFlow framework

  • Classification of Ball Bearings

    -

    - Applied wavelet-based denoising and Logistic Regression in Python to distinguish healthy ball bearings from defective ones
    - Achieved an accuracy of 86%, which ranked among the top 10 in the class

  • Online Gift Shop Website

    -

    - Prepared the software project plan for an online store for buying and selling gift articles
    - Developed the front end of the website using HTML, CSS, JavaScript and Bootstrap, with MySQL for data storage

  • Maze Solving Bot

    -

    - Built a line-following bot that could solve a grid and find the shortest path to the destination using graph theory
    - Programmed the bot using AVR programming
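
    One common way to find a shortest path on a grid is breadth-first search over the graph of free cells. The sketch below assumes that approach, which the project description does not confirm; it is written in Python for illustration rather than for an AVR microcontroller, and the grid layout and the name `shortest_path` are hypothetical.

```python
from collections import deque


def shortest_path(grid, start, goal):
    """Breadth-first search on a grid; 0 = free cell, 1 = blocked cell.

    Returns the list of (row, col) cells from start to goal, or None if
    the goal cannot be reached.
    """
    rows, cols = len(grid), len(grid[0])
    parent = {start: None}  # also serves as the visited set
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            # Walk the parent links back to the start.
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] == 0 and (nr, nc) not in parent:
                parent[(nr, nc)] = cell
                queue.append((nr, nc))
    return None


grid = [
    [0, 0, 1],
    [0, 1, 1],
    [0, 0, 0],
]
print(shortest_path(grid, (0, 0), (2, 2)))
# [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2)]
```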

Honors & Awards

  • Rank Holder at the end of BTech in Computer Engineering

    -

    Secured 5th rank in a class of 75 students at the end of the BTech in Computer Engineering

Test Scores

  • GRE

    Score: 331/340

    Quants: 169/170
    Verbal: 162/170
    AWA: 5.0/6.0

  • TOEFL

    Score: 117/120

    Reading: 29/30
    Listening: 30/30
    Writing: 30/30
    Speaking: 28/30

Languages

  • English

    Full professional proficiency

  • Hindi

    Native or bilingual proficiency

  • Marathi

    Professional working proficiency

  • Gujarati

    Native or bilingual proficiency

  • Spanish

    Elementary proficiency
