Autoencoder Architecture

Autoencoders are fundamental to creating simpler representations of more complex data. They use the well-known encoder-decoder architecture, which lets the network capture the key features of the data. If you are new to autoencoders and would like to learn more, I would recommend reading this well-written article on autoencoders:

In this article we will implement an autoencoder using PyTorch and then apply it to an image from the MNIST dataset.
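To make the encoder-decoder idea concrete, here is a minimal, dependency-free sketch of the forward pass. It is not the article's PyTorch implementation: the class name `TinyAutoencoder`, the layer sizes (784 inputs, a 32-value code), the activations, and the random untrained weights are all assumptions chosen for illustration. The point is only that the encoder squeezes the input into a smaller code and the decoder expands that code back to the original size.

```python
import math
import random


def linear(inputs, weights, biases):
    """Apply a fully connected layer: one output per row of `weights`."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


def random_layer(n_in, n_out, rng):
    """Create small random weights and zero biases for an n_in -> n_out layer."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


class TinyAutoencoder:
    """Hypothetical, untrained autoencoder used only to show the data flow."""

    def __init__(self, input_size=784, code_size=32, seed=0):
        rng = random.Random(seed)
        self.enc_w, self.enc_b = random_layer(input_size, code_size, rng)
        self.dec_w, self.dec_b = random_layer(code_size, input_size, rng)

    def encode(self, x):
        # Compress the input into the smaller code; ReLU clips negatives to 0.
        return [max(0.0, v) for v in linear(x, self.enc_w, self.enc_b)]

    def decode(self, code):
        # Expand the code back to the input size; sigmoid keeps values in (0, 1).
        return [1.0 / (1.0 + math.exp(-v)) for v in linear(code, self.dec_w, self.dec_b)]

    def forward(self, x):
        return self.decode(self.encode(x))


model = TinyAutoencoder()
# Stand-in for a flattened 28x28 MNIST image (random values, not real data).
image = [random.random() for _ in range(784)]
code = model.encode(image)
reconstruction = model.forward(image)
print(len(image), len(code), len(reconstruction))  # 784 32 784
```

Because the weights are untrained, the reconstruction here is meaningless; training would adjust the weights so that the output resembles the input as closely as the 32-value code allows.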


For this project, you will need one built-in Python library:

import os


This paper presents a detailed summary and analysis of Shi and Malik’s paper on Normalized Cuts and Image Segmentation. Each section summarizes and analyzes the corresponding portion of the original paper. The original paper can be found here:

Wertheimer’s Perceptual Grouping Theory

In the introduction, Shi and Malik note that their research is based on Wertheimer’s Perceptual Grouping Theory. This theory states that regions should be grouped together by looking at the image at a higher level, in terms of objects and patterns.

Choosing the Right Subset for Partitions

They note that two key factors determine the right subset when partitioning:


With Generative Adversarial Networks becoming so prominent in the world of machine learning, alternatives have emerged in an effort to improve on these networks. One such alternative is the conditional generative adversarial network, or cGAN. In this paper, we aim to gain a better understanding of cGANs and their relation to traditional GANs.

Brief Overview of Generative Adversarial Networks

Generative Adversarial Networks, or GANs, are a method for training a generative model. A GAN has two primary parts: the discriminator and the generator. The two compete with each other in order to produce realistic synthetic data. …

2D convolutions are instrumental when creating convolutional neural networks and for general image processing filters such as blurring, sharpening, edge detection, and many more. They are based on the idea of sliding a kernel across an input image to create an output image. If you are new to convolutions, I would highly recommend this playlist on convolutional neural networks.

In this article we will implement a 2D convolution and then use it to apply an edge detection kernel to an image.
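As a rough illustration of sliding a kernel over an image, here is a minimal sketch in plain Python. It is not the article's implementation: the function name `convolve2d`, the choice of no padding and a stride of 1, and the specific 3x3 edge detection kernel are assumptions made for this example. Like most neural network libraries, it does not flip the kernel (strictly speaking this is cross-correlation), which makes no difference for the symmetric kernel used here.

```python
def convolve2d(image, kernel):
    """Slide `kernel` over `image` (lists of rows), no padding, stride 1."""
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(image) - k_rows + 1
    out_cols = len(image[0]) - k_cols + 1
    output = [[0] * out_cols for _ in range(out_rows)]
    for i in range(out_rows):
        for j in range(out_cols):
            # Multiply the kernel with the window under it and sum the products.
            output[i][j] = sum(
                image[i + di][j + dj] * kernel[di][dj]
                for di in range(k_rows)
                for dj in range(k_cols)
            )
    return output


# Assumed edge detection kernel: its weights sum to 0, so flat regions give 0.
edge_kernel = [
    [-1, -1, -1],
    [-1,  8, -1],
    [-1, -1, -1],
]

# Toy 5x5 image: dark (0) on the left, bright (1) on the right.
image = [[0, 0, 0, 1, 1] for _ in range(5)]

edges = convolve2d(image, edge_kernel)
print(edges)  # [[0, -3, 3], [0, -3, 3], [0, -3, 3]]
```

The output is zero where the window covers a flat region and nonzero where it straddles the boundary between dark and bright pixels, which is what makes the kernel an edge detector.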


For this implementation of a 2D convolution, we will need two libraries:


Generative Adversarial Networks have become a big deal in the machine learning world. In this paper we take a look at the mathematics behind how generative adversarial networks work. We start from the very basics, building an intuitive understanding and defining the different parts of a generative adversarial network, and work all the way up to understanding the loss function both mathematically and conceptually.
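For reference, the loss function usually meant in this context is the minimax objective from the original GAN formulation by Goodfellow et al. This preview does not show the article's own notation, so the symbols below are assumptions: G is the generator, D is the discriminator, p_data is the distribution of real data, and p_z is the noise distribution that the generator samples from.

```latex
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)}\left[\log D(x)\right]
  + \mathbb{E}_{z \sim p_z(z)}\left[\log\left(1 - D(G(z))\right)\right]
```

The discriminator tries to maximize this value by scoring real samples near 1 and generated samples near 0, while the generator tries to minimize it by producing samples G(z) that the discriminator scores near 1.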

Defining the Generator

A Generative Adversarial Network is made up of two parts: the discriminator and the generator. The generator is given by the function G(z) = x. The generator function is represented through G(z) where z, the parameter…

Generative Adversarial Network Structure


With data becoming increasingly important in the world of machine learning and data science, researchers have developed systems that generate data from scratch. These systems are known as Generative Adversarial Networks, or GANs. This paper gives a brief and intuitive introduction to, and analysis of, Generative Adversarial Networks and their applications.

Keywords: Generative Adversarial Network (GAN), Synthetic Data, Machine Learning

Introduction to Generative Adversarial Networks

Within the world of machine learning, we often interpolate based on large amounts of data. However, in many areas, data is limited. Take, for example, the COVID-19 crisis in 2020. Many teams built models in order to diagnose…

With the rise of computer science in the world around us, software development seems to be having more of an impact on the medical field and COVID-19 than we originally thought. Data scientists and software developers from around the world have built a variety of tools, ranging from cell-phone-based pre-diagnosis using breathing patterns to educational products that help students study their subjects through automatic question generation. …

Just recently, I was trying to create a UI for my facial recognition-based attendance tracker so that it would be easier for general users to use. Based on some prior knowledge, I guessed that there would be a really nice library for this, much as Java has JavaFX. However, after some research, the best I could find were Kivy and Tkinter. Both are very nice UI libraries in their own right, but they were not exactly what I was looking for.

I remember looking at some websites and commenting on how nice they were. And that's when I…

Source: Wikimedia Commons [teguhjatipras / CC0]

With computer vision technologies such as facial recognition coming to the forefront of many modern applications, the popularity of the technology continues to rise. Despite this popularity, many fail to understand the underlying mechanisms that make facial recognition work.

Step 1: Pre-processing Data Sets

The first major step in the facial recognition process is preprocessing the data. For a computer to recognize a face, it must have data on which to base its recognition. Because of this, the user must first input some sample data. There are two approaches to how much data to input. The user has…

For those who have not seen my GitHub, I am currently developing a facial recognition-based attendance tracker. However, I do not just want to build it; I want to make it unique and innovative so that it is suitable for consumer use. To do this, I need ideas that change the product in ways that make it more applicable and practical for consumers. In my last mentor visit with Mr. Trey Blankenship, a software developer at Raytheon, we brainstormed ideas that would improve the overall performance of the product while also making it…

Samrat Sahoo

Hi there! My name is Samrat Sahoo and I am a passionate high schooler who is very interested in computer vision. I write blogs to share what I’ve learned!
