In this blog, we have done some data exploration using matplotlib and seaborn. Here we have used three different classifier models to predict the wine quality: K-Nearest Neighbors Classifier, Support Vector Classifier, and Random Forest Classifier. We have also classified wine quality into three categories: good, average, and bad. Dataset Information: The two datasets are related … Continue reading Predicting Wine Quality using different classifiers
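As a rough illustration of grouping wine quality into three categories, here is a minimal Python sketch. The function name `quality_category` and the cut-off values `bad_max` and `good_min` are hypothetical; the excerpt does not show which thresholds the post actually uses.

```python
def quality_category(score, bad_max=4, good_min=7):
    """Map a numeric quality score to 'bad', 'average', or 'good'.

    The cut-offs are assumptions chosen for illustration only.
    """
    if score <= bad_max:
        return "bad"
    if score >= good_min:
        return "good"
    return "average"


# Made-up scores, only to show the mapping.
scores = [3, 5, 6, 7, 8]
labels = [quality_category(s) for s in scores]
print(labels)
```

Keeping the thresholds as parameters makes it easy to try a different split of the score range without touching the rest of the code.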
Time Series Analysis with Python
What is Time Series? A set of data collected over time is called a time series. Time series forecasting is machine learning modelling of time series data (years, days, hours, etc.) to predict future values. This helps if your data is serially correlated. Importance of time series: Study past … Continue reading Time Series Analysis with Python
Manage Git on the command line
Git: an open source, distributed version-control system. You can download Git from here: https://git-scm.com/downloads GitHub: a platform for hosting and collaborating on Git repositories. Basic Git commands Open command window. Change your working directory: cd <working directory path> Check your Git information: git config --global --list Create an empty Git repository or reinitialize an existing one: git init … Continue reading Manage Git on the command line
Principal Component Analysis
What is Data Dimensionality? In the real world, the number of columns is the number of dimensions of the data. However, some columns are similar, some are correlated, some are duplicates in some way, some are junk, some are useless, etc., so the actual number of dimensions can be unknown. It's a knotty problem. What is high dimensionality? Suppose … Continue reading Principal Component Analysis
Machine Learning – K-Nearest Neighbor (KNN) Algorithm
A classification problem has a discrete value as its output. For example, “likes pineapple on pizza” and “does not like pineapple on pizza” are discrete. ML Mind Map: What is a K-Nearest Neighbor Algorithm (KNN)? KNN is one of the simplest classification algorithms and it is one of the most used learning algorithms. KNN is … Continue reading Machine Learning – K-Nearest Neighbor (KNN) Algorithm
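To make the idea concrete, here is a minimal pure-Python sketch of a K-Nearest Neighbor classifier: it finds the k training points closest to a query and returns the majority label. The function name `knn_predict`, the choice of Euclidean distance, and the toy data are assumptions for illustration, not taken from the post.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    # Pair each training point's Euclidean distance to the query with its label.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Made-up 2-D points forming two well-separated groups.
points = [(1, 1), (1, 2), (2, 1), (6, 6), (7, 6), (6, 7)]
labels = ["likes", "likes", "likes", "dislikes", "dislikes", "dislikes"]

print(knn_predict(points, labels, (2, 2)))
print(knn_predict(points, labels, (6, 5)))
```

An odd k is a common choice for two-class problems because it avoids tied votes.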
Machine Learning – Logistic Regression Algorithm
What is Logistic Regression? Logistic regression is a technique used for solving classification problems. Classification is a technique for categorizing data into a desired number of distinct classes, where we can assign a label to each class. Classification is the problem of identifying to which of a set of categories a … Continue reading Machine Learning – Logistic Regression Algorithm
Machine Learning – Linear Regression Algorithm
What is Regression? Regression searches for relationships among variables. Regression is a method for studying the relationship between two or more quantitative variables. Regression analysis is a form of predictive modelling technique which investigates the relationship between a dependent and independent variable. For example, you can observe several employees of some company and try to understand how their … Continue reading Machine Learning – Linear Regression Algorithm
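As a small worked example of finding a relationship between one independent and one dependent variable, the sketch below fits a straight line by ordinary least squares in plain Python. The helper name `fit_line` and the numbers are made up for illustration and are not the post's own data.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: independent variable x and dependent variable y.
x_values = [1, 2, 3, 4, 5]
y_values = [32, 34, 36, 38, 40]

slope, intercept = fit_line(x_values, y_values)
print(slope, intercept)
```

The fitted slope says how much the dependent variable changes for a one-unit change in the independent variable.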
An introduction to Machine Learning
What is Machine Learning? Machine learning lets us find patterns in existing data, then create and use a model that recognizes those patterns in new data. A machine learning algorithm is like a trial and error process, but the special thing about it is that each consecutive trial is at least as good as the … Continue reading An introduction to Machine Learning
Exploring The Power of Data Frame in Pandas
We covered a lot of the basics of pandas in Python – Introduction to the Pandas Library; please read that article before exploring this one. A DataFrame is a two-dimensional (2-D) data structure defined in pandas which consists of rows and columns. Each column in a DataFrame is a Series object; rows consist of elements inside Series. … Continue reading Exploring The Power of Data Frame in Pandas
Python – Introduction to the Pandas Library
Pandas is an open source library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language. Pandas is one of the most popular Python libraries for data analysis. Key Features of Pandas: Fast and efficient DataFrame object with default and customized indexing. Tools for loading data into in-memory data objects … Continue reading Python – Introduction to the Pandas Library