Why is everyone talking about machine learning? What is it all about?
Today we will try to understand it better and look at ways to land a job in the field.
Excited? Let's get started!
If you have clicked on this post, chances are you already know a bit about machine learning (we will refer to it as ‘ML’ for this article). However, let's still discuss it a little before we can go on to the more important stuff.
So what exactly is Machine Learning?
In simple words, when a machine can teach itself how to do things, it is called machine learning.
Well, that gives us the gist of it. However, it still leaves us with a lot of unanswered questions.
Let's dig a little deeper.
Machine learning is a form of artificial intelligence (AI) that gives systems the ability to learn and improve from experience without being explicitly programmed. It is a powerful tool for making predictions from data, and it can be applied in almost any field where enough data is available.
Machine learning has made its way into many industries, from big firms to start-ups. Commute estimation in apps like Google Maps, fraud prevention in the banking sector, Snapchat and Instagram's facial filters, online advertising, self-driving cars, email spam filtering, automated trading and medical diagnosis are some of the many examples of machine learning that we see around us daily.
It is everywhere around us, changing our lives in various ways. From your morning online news update to your evening travel plans, on any device or app you are using, it is there. And it is getting better as I write this!
It is no wonder, then, that people want to break into the data industry. Some wish to become data scientists, while others just want to explore this new space. Regardless of their intent, everyone should have the right tools to help them enter the industry.
However, doing a couple of online courses in this field might still not be enough to give you the right understanding of the subject or land you a job. What you need are hands-on projects to understand how real-world problems are tackled. Having said that, not all ML projects are created equal. Some are better suited than others to complete beginners looking to break into the field for the first time.
In this article, I'll suggest some projects, ranging from beginner-friendly to advanced, that will help you get a deeper understanding.
We will talk about the following projects today:
- Kaggle Titanic Prediction - Beginner Level
- House Price Prediction - Beginner Level
- Customer Service Chatbot - Intermediate/Advanced Level
- Real-Time Spam Detection - Intermediate/Advanced Level
- YouTube Comment Sentiment Analysis - Intermediate/Advanced Level
Let's learn about each of these, one by one.
Kaggle Titanic Prediction
This is a very simple, beginner-friendly ML project one can take on. If you are new to ML, this would be the best fit for you.
You can get the dataset here.
This dataset describes passengers who travelled on the Titanic. It contains details like passenger age, ticket fare, cabin, and gender.
- Objective: Predict whether these passengers survived or not.
- Target Variable: Survival
- No. of Independent Variables: 11
It is a simple binary classification problem. The dataset is small and easy to work with, although a few columns (such as age and cabin) have missing values that you will need to handle first.
Since this is a classification problem, you can use algorithms like logistic regression, decision trees, and random forests to build the predictive model. You can also try gradient boosting models like an XGBoost classifier for this beginner-level machine learning project to get better results.
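To make the idea of a binary classifier concrete, here is a minimal sketch of logistic regression trained with plain gradient descent, using only the Python standard library. The rows, the two features (`is_female`, `is_first_class`) and the function names are all invented for illustration and are not taken from the Kaggle file. In a real project you would most likely load the CSV and use a library such as scikit-learn instead of writing the training loop by hand.

```python
import math

# Toy rows invented for illustration: (is_female, is_first_class, survived).
# The real Kaggle file has more columns and many more passengers.
ROWS = [
    (1, 1, 1), (1, 1, 1), (1, 0, 1), (1, 0, 1), (1, 0, 0),
    (0, 1, 1), (0, 1, 0), (0, 0, 0), (0, 0, 0), (0, 0, 0),
]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(rows, learning_rate=0.5, epochs=2000):
    """Fit two weights and a bias with batch gradient descent on the log loss."""
    weights = [0.0, 0.0]
    bias = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0, 0.0]
        grad_b = 0.0
        for x1, x2, y in rows:
            # Difference between predicted probability and the true label.
            error = sigmoid(weights[0] * x1 + weights[1] * x2 + bias) - y
            grad_w[0] += error * x1
            grad_w[1] += error * x2
            grad_b += error
        weights[0] -= learning_rate * grad_w[0] / n
        weights[1] -= learning_rate * grad_w[1] / n
        bias -= learning_rate * grad_b / n
    return weights, bias


def predict(model, is_female, is_first_class):
    """Return 1 (survived) when the predicted probability is at least 0.5."""
    weights, bias = model
    z = weights[0] * is_female + weights[1] * is_first_class + bias
    return int(sigmoid(z) >= 0.5)


model = train(ROWS)
accuracy = sum(predict(model, x1, x2) == y for x1, x2, y in ROWS) / len(ROWS)
print(f"Training accuracy on the toy rows: {accuracy:.2f}")
```

Accuracy on the rows the model was trained on is an optimistic number, so on the real dataset you would hold out part of the data for validation.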
Let's hop on to the next project!
House Price Prediction
This is great to start with if you are a beginner at machine learning. This project will use the house pricing dataset available on Kaggle.
Get the dataset here.
- Objective: Predict the property's sale price in dollars.
- Target Variable: SalePrice
- No. of Independent Variables: 79
It is a regression problem. You can use techniques like linear regression to build the model. You can also use a random forest regressor or gradient boosting to predict house prices.
Feature selection or dimensionality reduction techniques can be used to trim the feature set, since adding too many variables can hurt the performance of your model.
Use techniques like one-hot encoding or label encoding to turn categorical variables into numbers.
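The two encodings mentioned above can be sketched in a few lines of plain Python. The rows below are invented for illustration, and the column names `LotArea` and `Neighborhood` are used here as assumed examples of a numeric and a categorical column. In practice you would usually reach for the equivalent helpers in pandas or scikit-learn.

```python
def one_hot_encode(rows, column):
    """Replace `column` with one 0/1 indicator column per category."""
    categories = sorted({row[column] for row in rows})
    encoded = []
    for row in rows:
        new_row = {key: value for key, value in row.items() if key != column}
        for category in categories:
            new_row[f"{column}_{category}"] = int(row[column] == category)
        encoded.append(new_row)
    return encoded


def label_encode(rows, column):
    """Replace each category in `column` with an integer code."""
    categories = sorted({row[column] for row in rows})
    codes = {category: index for index, category in enumerate(categories)}
    return [{**row, column: codes[row[column]]} for row in rows]


# Rows invented for illustration; the real dataset has 79 explanatory variables.
houses = [
    {"LotArea": 8450, "Neighborhood": "CollgCr"},
    {"LotArea": 9600, "Neighborhood": "NAmes"},
    {"LotArea": 11250, "Neighborhood": "CollgCr"},
]

print(one_hot_encode(houses, "Neighborhood"))
print(label_encode(houses, "Neighborhood"))
```

One-hot encoding avoids implying an order between categories, which matters for linear models. Label encoding keeps the table narrow and is often good enough for tree-based models.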
Customer Service Chatbot
If you are interested in AI, ML, and chatbots, then this is the perfect project.
So what is a chatbot anyway?
A chatbot is a computer program that simulates human conversation through voice commands or text chats or both. Chatbot, short for chatterbot, is an artificial intelligence (AI) feature that can be embedded and used through any major messaging application.
There are mainly three kinds of chatbots you can build:
- Rule-Based Chatbots: These have a set of pre-defined rules (or questions and answers) and respond to users based on these rules. They are unable to answer questions that fall outside these rules.
- Independent Chatbots: These use ML to process and analyze a user’s request and provide responses accordingly.
- NLP Chatbots: These can understand patterns in words and distinguish between different word combinations. They can also decide what to say next, based on the data they are trained on. They are the most advanced of the three.
An NLP chatbot can be a great project to add to your CV. You will need a corpus of text to train on. You can also prepare a pre-defined dictionary of question and answer pairs you’d like to train your model on. Since this project falls under intermediate/advanced level ML projects, it is recommended to start with an easier project first if you are new to ML.
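As a starting point, a dictionary of question and answer pairs can already power a very simple bot. The sketch below is not a trained NLP model: it just picks the stored question that shares the most words with the user's message (Jaccard similarity) and falls back to a default reply when nothing matches well. The questions, answers, `threshold` value and function names are all made up for illustration.

```python
import re

# Hypothetical question/answer pairs for an imaginary support desk.
QA_PAIRS = {
    "what are your opening hours": "We are open 9am to 5pm, Monday to Friday.",
    "how do i reset my password": "Use the 'Forgot password' link on the login page.",
    "how can i track my order": "You can track your order from the 'My orders' page.",
}
FALLBACK = "Sorry, I did not understand that. Let me connect you to an agent."


def tokenize(text):
    """Lowercase the text and return its set of words."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def jaccard(a, b):
    """Share of words the two sets have in common (0.0 to 1.0)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def reply(message, qa_pairs=QA_PAIRS, threshold=0.3):
    """Answer with the best-matching stored question, or fall back."""
    words = tokenize(message)
    best_question = max(qa_pairs, key=lambda q: jaccard(words, tokenize(q)))
    if jaccard(words, tokenize(best_question)) < threshold:
        return FALLBACK
    return qa_pairs[best_question]


print(reply("How do I reset my password?"))
print(reply("Tell me a joke"))
```

Once this works, you can swap the word-overlap score for something learned from data, which is where the ML part of the project begins.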
Real-Time Spam Detection
This is a slightly advanced level project, recommended for readers who already have some experience.
Get the Kaggle SMS Spam Collection dataset here.
- Objective: Distinguish between legitimate (not spam, also called ham) and illegitimate (spam) messages.
- Target Variable: v1 (spam or not-spam)
- No. of Independent Variables: 3
To build this model for your project, pre-process the text messages in the dataset. Then, split these messages into words and convert them into numeric features so that they can be passed to your classification model for prediction.
This dataset contains approximately 5K messages that have been labelled as spam or not spam. You can train your model on the dataset provided above.
To build a real-time spam detection system, create a simple chat-room server in Python. Then, deploy the model on your chat-room server and make sure that all incoming traffic passes through it. Let a message through if the model marks it as legitimate. If the model marks it as spam, return an error message instead.
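The pipeline described above (tokenize, train a classifier, filter each incoming message) can be sketched as follows. I have picked a multinomial naive Bayes classifier with add-one smoothing here as one reasonable choice, since the text does not prescribe a specific model. The six training messages are invented, and `handle_incoming` is a hypothetical stand-in for the hook your chat-room server would call on every message.

```python
import math
import re
from collections import Counter

# Tiny invented training set; the real dataset has roughly 5K labelled messages.
TRAINING = [
    ("spam", "WINNER! Claim your free prize now"),
    ("spam", "Free entry: text WIN to claim cash prize"),
    ("spam", "Urgent! Your account won a free cash reward"),
    ("ham", "Are we still meeting for lunch today"),
    ("ham", "I will call you after the meeting"),
    ("ham", "Can you send me the notes from today"),
]


def tokenize(text):
    """Lowercase the message and split it into words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def train(examples):
    """Count words per label for a multinomial naive Bayes model."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    label_counts = Counter()
    for label, text in examples:
        label_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocabulary = set(word_counts["spam"]) | set(word_counts["ham"])
    return word_counts, label_counts, vocabulary


def classify(model, text):
    """Return 'spam' only when spam is strictly more likely than ham."""
    word_counts, label_counts, vocabulary = model
    total = sum(label_counts.values())
    scores = {}
    for label in ("spam", "ham"):
        score = math.log(label_counts[label] / total)
        denominator = sum(word_counts[label].values()) + len(vocabulary)
        for word in tokenize(text):
            if word in vocabulary:  # words never seen in training are skipped
                score += math.log((word_counts[label][word] + 1) / denominator)
        scores[label] = score
    return "spam" if scores["spam"] > scores["ham"] else "ham"


def handle_incoming(model, text):
    """Hypothetical server hook: pass ham through, reject spam."""
    if classify(model, text) == "spam":
        return "ERROR: message rejected as spam"
    return text


model = train(TRAINING)
print(handle_incoming(model, "Claim your free cash prize"))
print(handle_incoming(model, "Are we meeting today"))
```

In the full project, the same `handle_incoming` check would sit between the socket that receives a message and the code that broadcasts it to the room.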
This project is a great addition to your CV and can help you land your dream job!
YouTube Comment Sentiment Analysis
YouTube has had a huge impact on the Internet. Its launch was a game-changer for video viewing and sharing, and it still holds that position today. While there are other video-sharing apps, none has reached the heights that YouTube has! Since controversy and criticism have emerged time and again around many YouTube influencers, you can build a sentiment analysis model and create a dashboard to visualize sentiment around these celebrities over time.
This is an advanced level ML project.
- Objective: Analyze the overall sentiment around popular YouTubers.
Start by scraping the comments on videos by the YouTubers you want to analyze. Next, use a pre-trained sentiment analysis model to make predictions on each comment. Finally, visualize the model’s predictions on a dashboard.
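The step between the model and the dashboard is an aggregation: turn one score per comment into one number per time period. Here is a small sketch of that step. The comments and dates are invented, and `toy_score` is only a placeholder word-list scorer standing in for the pre-trained model you would actually use; any function that maps a comment to a number can be passed in as `score_fn`.

```python
from collections import defaultdict

# Placeholder word lists; a real project would call a pre-trained model instead.
POSITIVE = {"love", "great", "amazing"}
NEGATIVE = {"hate", "boring", "terrible"}


def toy_score(text):
    """Stand-in scorer: +1 positive, -1 negative, 0 neutral or mixed."""
    words = text.lower().split()
    positives = sum(word in POSITIVE for word in words)
    negatives = sum(word in NEGATIVE for word in words)
    if positives > negatives:
        return 1
    if negatives > positives:
        return -1
    return 0


def monthly_sentiment(comments, score_fn=toy_score):
    """Average the comment scores per month, keyed by 'YYYY-MM'."""
    buckets = defaultdict(list)
    for date, text in comments:
        buckets[date[:7]].append(score_fn(text))
    return {month: sum(scores) / len(scores)
            for month, scores in sorted(buckets.items())}


# Invented (ISO date, comment text) pairs standing in for scraped comments.
COMMENTS = [
    ("2023-01-05", "love this video"),
    ("2023-01-20", "great editing, amazing content"),
    ("2023-02-02", "this was boring"),
    ("2023-02-14", "terrible take but great music"),
]

print(monthly_sentiment(COMMENTS))
```

The resulting dictionary is exactly the kind of series a dashboard line chart expects: one point per month, per YouTuber.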
With this, we come to the end of our list. These projects give you hands-on experience with different models and datasets, and between them they cover every level of expertise from beginner to advanced.
Choose the one that fits you best and get started!
With these five projects, you will be better prepared and more confident to ace your next interview and impress recruiters and hiring managers at some of the top tech companies.
Your dream job awaits you!
Stay tuned for more!