So, what is Machine Learning?
The buzzwords of the 21st century are "Data Science" and "Machine Learning", with "web3", "NFTs", and "DAOs" as the upcoming ones. We'll get into those in my next article.
So yeah, what is Machine Learning?
Data is the new gold of this century. With an estimated 92% of the data having been generated in the last 5 years alone, the need to store and maintain this data has never been greater!
But wait, just maintaining this data will not do the job. Why not manipulate it as well? The tech giants saw this as an opportunity to target the perfect customers and expand their markets! But leaving the business side aside, the logic behind this is exciting!
To put it in layman's terms, the art of using this data to extract useful insights is called Data Science. You got it right: Data Science is the "science" of finding unseen patterns in data and, often, building predictive models to derive meaningful information and make business decisions!
Domains of Data Science
Now that we have a basic idea about Data Science, let's look into the domains of Data Science (pretty sure you have heard of them).
- Data Analytics: as the name suggests, this deals with turning raw data into something useful. It helps organizations present useful insights to stakeholders and point out anomalies or standout features through beautiful visualizations.
The stack usually includes:
- Excel -> Usually the dataset comes in CSV (comma-separated values) format.
- Python -> Libraries like pandas and NumPy (data manipulation) and Matplotlib (data visualization) are used.
- SQL -> a query language to communicate with relational databases.
- Tableau/Power BI -> Visualization software.
- Machine Learning: In simple terms, machine learning is the general term for when computers learn from data. It is a blend of computing and mathematical inference models (statistics & probability).
The stack:
- Excel -> Usually the dataset comes in CSV (comma-separated values) format.
- Python -> Libraries like pandas and NumPy (data manipulation) and Matplotlib (data visualization) are used.
- Specific Libraries (frameworks) -> Keras and PyTorch are the prominent ones; we also have libraries like TensorFlow (which ships Keras as its high-level API), fast.ai (built on top of PyTorch), and a lot more.
- Resource Management (environments) -> When it comes to deploying production-quality code, the most prominent tools in the industry are Docker & Kubernetes, along with cloud services like AWS EC2 for compute and AWS S3 for data storage.
- Others -> Libraries like scikit-learn and RAPIDS, Git for version control, NVIDIA CUDA for GPU-accelerated computing, and a lot more!
- Deep Learning: Deep Learning is a subset of Machine Learning that relies on neural networks with many layers. It describes algorithms that analyze data with a logical structure similar to how a human would reach a conclusion.
Deep Learning achieves this by loosely mimicking the human brain. A human brain uses electrical impulses and chemical signals, carried by neurons, to transmit information between different areas of the brain and between the brain and the rest of the nervous system.
Deep Learning does something similar with artificial neural networks, which look something like this:
Neural Network Simulator by TensorFlow (feel free to play around)
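To make the idea of an artificial neuron a bit more concrete, here is a minimal sketch in plain Python of a tiny network with one hidden layer. The weights, biases, and function names are made up for illustration (a real network learns its weights from data, usually through a framework like the ones listed above), so treat it as a toy rather than how any particular library does it.

```python
import math


def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def neuron(inputs, weights, bias):
    """One artificial neuron: weighted sum of the inputs plus a bias, then an activation."""
    total = sum(i * w for i, w in zip(inputs, weights)) + bias
    return sigmoid(total)


def forward(inputs, hidden_layer, output_neuron):
    """Pass the inputs through one hidden layer, then through a single output neuron."""
    hidden = [neuron(inputs, weights, bias) for weights, bias in hidden_layer]
    out_weights, out_bias = output_neuron
    return neuron(hidden, out_weights, out_bias)


# Hypothetical, hand-picked parameters: two hidden neurons, one output neuron.
hidden_layer = [([0.5, -0.6], 0.1), ([-0.3, 0.8], 0.0)]
output_neuron = ([1.2, -0.7], 0.05)

prediction = forward([1.0, 2.0], hidden_layer, output_neuron)
print(round(prediction, 3))  # a value between 0 and 1
```

Each "neuron" here only multiplies, adds, and squashes; the interesting behavior comes from stacking many of them and letting training adjust the weights.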
MACHINE LEARNING :
Now that we have a basic idea about all the other domains in Data Science, let's dive deep into Machine Learning. Let's take a scenario: it's the weekend and you decide to binge-watch, so you open your favorite streaming service, say Netflix, and voila!
It recommends the perfect movie/series for you to watch! But wait, how is this possible? It is the work of Netflix's Recommendation Engine! Machine Learning enthusiasts call this "Binging on the Algorithm"!
When you first create an account on Netflix and fill out basic details like age, name, region, etc., Netflix maps these "attributes" to its database and gives you a tailored experience! The more you watch, the better it gets!
Sounds cool, right? It is just one of the applications of Machine Learning. Computer scientists and researchers all over the world are using algorithms like this to tackle problems like hunger and cancer, and are working tirelessly to make this world a better place to live in!
Even in 2020, people came together to lend scientists and researchers their computing power, just like carpooling! Check out Folding@Home
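If you are curious what the recommendation idea from the Netflix example could look like in code, here is a toy sketch. It is not how Netflix actually does it. It assumes a very simple "find the viewer most similar to you and borrow their favorite" approach (a basic form of collaborative filtering), and the viewers, titles, and ratings are all invented.

```python
import math


def cosine_similarity(a, b):
    """Similarity of two rating vectors: near 1.0 means similar taste, 0.0 means nothing in common."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical data: ratings from 1 to 5, where 0 means "not watched yet".
TITLES = ["Space Saga", "Baking Duel", "Heist Crew", "Robot Wars"]
ratings = {
    "you": [5, 0, 4, 0],
    "asha": [5, 1, 5, 4],
    "bruno": [1, 5, 0, 2],
}


def recommend(user, ratings, titles):
    """Suggest the unwatched title that the most similar viewer rated highest."""
    mine = ratings[user]
    others = [name for name in ratings if name != user]
    nearest = max(others, key=lambda name: cosine_similarity(mine, ratings[name]))
    unseen = [i for i, rating in enumerate(mine) if rating == 0]
    if not unseen:
        return None  # nothing left to recommend
    best = max(unseen, key=lambda i: ratings[nearest][i])
    return titles[best]


print(recommend("you", ratings, TITLES))  # prints "Robot Wars"
```

Here "asha" has the taste closest to "you", so her top-rated title that you have not watched yet gets recommended. The more ratings there are, the better the match tends to be, which is the "the more you watch, the better it gets" effect in miniature.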
## How to get started in Machine Learning?
First, try to get the basics clear, namely the concepts of:
- Supervised Learning: Training the model with labeled data; in simple words, you are spoon-feeding your model the right answers.
- Unsupervised Learning: The data points in the dataset are neither labeled nor classified (just like me, the night before my exams), so the model has to find structure on its own.
- Reinforcement Learning: A training setup where the model learns by trial and error, guided by rewards and penalties.
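The difference between the first two ideas is easier to see side by side. Below is a small sketch in plain Python, with made-up data (hours of study per week) and hypothetical function names. The supervised part uses a 1-nearest-neighbour rule and the unsupervised part a two-cluster k-means on one dimension. These are just two simple representatives I picked, not the only way to do either, and reinforcement learning is left out to keep it short.

```python
# Hypothetical labeled data: (hours of study per week, exam result).
labeled = [(1.0, "fail"), (2.0, "fail"), (8.0, "pass"), (9.0, "pass")]


def predict_nearest(x, labeled_points):
    """Supervised: copy the label of the closest labeled example (1-nearest neighbour)."""
    _, label = min(labeled_points, key=lambda point: abs(point[0] - x))
    return label


def two_means(values, steps=10):
    """Unsupervised: split unlabeled numbers into two groups (k-means with k=2, one dimension)."""
    low, high = min(values), max(values)  # initial cluster centres
    if low == high:
        return sorted(values), []  # all values identical, nothing to split
    for _ in range(steps):
        # Assign each value to its nearest centre, then move each centre to its group's mean.
        group_low = [v for v in values if abs(v - low) <= abs(v - high)]
        group_high = [v for v in values if abs(v - low) > abs(v - high)]
        low = sum(group_low) / len(group_low)
        high = sum(group_high) / len(group_high)
    return sorted(group_low), sorted(group_high)


# Supervised: we gave the answers ("pass"/"fail") up front.
print(predict_nearest(7.5, labeled))  # prints "pass"

# Unsupervised: no answers given, the groups are discovered from the numbers alone.
unlabeled = [1.0, 2.0, 1.5, 8.0, 9.0, 8.5]
print(two_means(unlabeled))  # prints ([1.0, 1.5, 2.0], [8.0, 8.5, 9.0])
```

Notice that the unsupervised function never sees the words "pass" or "fail". It only finds that the numbers fall into two clumps, and it is up to you to interpret what those clumps mean.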
These are just the basics. I would recommend starting with Andrew Ng's course, which covers almost all the basics one needs to get into ML. Andrew Ng - Stanford University
If not, try covering these topics to get a basic understanding:
- Regression -> Cost functions, Gradient Descent
- Supervised Learning
- Unsupervised Learning
- Overfitting
- Neural Networks
- Classification
- SVM - Support Vector Machines
- & a lot more..
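Since regression, cost functions, and gradient descent sit at the top of that list, here is a minimal sketch of how they fit together, in plain Python. It fits a straight line y = w*x + b to a few made-up points by repeatedly nudging w and b in the direction that lowers the mean squared error. The data, learning rate, and step count are arbitrary choices for this example.

```python
def mse_cost(w, b, xs, ys):
    """Cost function: mean squared error of the line y = w*x + b on the data."""
    n = len(xs)
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / n


def gradient_descent(xs, ys, learning_rate=0.05, steps=2000):
    """Fit w and b by repeatedly stepping against the gradient of the cost."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        # Partial derivatives of the mean squared error with respect to w and b.
        grad_w = 2 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2 * sum(errors) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Made-up points that lie exactly on the line y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

w, b = gradient_descent(xs, ys)
print(round(w, 2), round(b, 2))  # prints "2.0 1.0"
```

If the learning rate is set too high the updates overshoot and the cost blows up, and if it is too low the fit takes forever, so it is worth playing with those two numbers to get a feel for it.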
If you are looking for a roadmap, check this out! Machine Learning Roadmap
Machine Learning is a huge domain, and one cannot become a Machine Learning Engineer overnight, so keep grinding and keep working on your skills!
Skills :
Compared with conventional software engineering, machine learning requires that extra bit of sauce. Projects like a Netflix-style recommendation engine and face mask detection are good but not enough! Publishing research papers and working on real-world projects would really help.
Hackathons are one way you can work on your skills! Hackathons are basically events where teams work together to solve a given problem statement. They provide you with wonderful opportunities to network with like-minded people, explore, and maybe even land your first job as an ML Engineer!
These are my go-to places to find Hackathons:
- Major League Hacking
- Devpost
- Devfolio (devfolio.co)
- Kaggle (kaggle.com)
- Machine Hack (machinehack.com)
- HackerEarth (hackerearth.com)
Blogs :
- Analytics Vidhya
- Analytics India Magazine
- KDnuggets
- Towards Data Science
- Twitter (the best resource)
YouTube :
- Daniel Bourke
- Krish Naik
- 365 Data Science
- DataCamp
- 3Blue1Brown (for math)
- Alex the Analyst
- Luke Barousse
Useful Links:
- The ML Engineer Notebook (github.com/Dipankar-Medhi/the-mlengineer-no..)
- Microsoft ML (github.com/Dipankar-Medhi/the-mlengineer-no..)
- TensorFlow documentation (tensorflow.org/overview)
Just remember one thing: "Torture the data, and it will confess to anything."
Connect with me here: Srinivas T B