You don’t need me to tell you it’s a great time to learn AI, do you?
When I started building and training models, I made a lot of mistakes. This post is a short summary of what you should do to master AI, and more importantly, what you shouldn’t.
Roadmap
0. Have a solid plan, think it through
I’ll be honest: this roadmap is hard to finish if you try to do it alone.
When you’re learning, ask questions on reddit communities (r/Python, r/learnpython, r/pythonlearning, r/pythontips, r/learnprogramming, r/pythonhelp, r/learnmachinelearning), X, LinkedIn, and talk to people who are on the same path as you. Sharing what you learn keeps you motivated, and the community can help you. Don’t skip this.
Most videos/courses you watch will have something you don’t understand. When that happens, pause, learn the missing piece, then come back.
Set expectations early. What exactly do you want to get out of your effort? What projects will you have (on GitHub) as you’re doing this? How long do you think it will take you? Months? One year?
Having realistic expectations is key to finishing the whole thing. It took me years to get familiar with these concepts, probably because I wasn’t very efficient. If you jump right in, you’ll probably burn out and not finish.
1. Python, version control, API: Basics of programming
You should know Python (any language is fine, but if you don’t know any, start with Python — it’s a good one).
Learn the basics: data types, functions, loops, error handling, clean code, Git and GitHub, and how to consume APIs.
Resources:
Beginner Python Programming All-in-one Tutorial Series (Caleb Curry)
Intermediate Python Programming (freeCodeCamp)
REST API Crash Course (Caleb Curry)
Bonus: Effective Python by Brett Slatkin (the edition covering Python 3.13).
I love this book, it’s the only book I recommend for writing cleaner code. For everything else, don’t waste your time with books - you can use free videos on YouTube.
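If you want a quick self-check before moving on, here’s a toy sketch (the names and numbers are made up) that touches most of the basics listed above: data types, functions, loops, and error handling.

```python
# A tiny warm-up: dictionaries, lists, a function, a loop, error handling.

def average_scores(scores: dict[str, list[int]]) -> dict[str, float]:
    """Return the average score per student, handling empty lists."""
    averages = {}
    for name, values in scores.items():
        try:
            averages[name] = sum(values) / len(values)
        except ZeroDivisionError:
            averages[name] = 0.0  # no scores recorded yet
    return averages

scores = {"ada": [90, 80, 100], "alan": []}
print(average_scores(scores))  # {'ada': 90.0, 'alan': 0.0}
```

If you can write something like this without looking anything up, you’re ready for the next step.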
2. Maths: learn when you need to
You need maths. Stats, probability, and linear algebra. There’s no way around it.
Good news is — you don’t need to learn everything at once. Just make a list of what you need, and learn them when you need them. Jumping straight into math courses is a bad idea. You can continue learning concepts while you’re learning other things — don’t be disappointed if you don’t understand calculus or regression from the start.
Resources:
3Blue1Brown (has visual explanations, watch what you need)
StatQuest (just watch what you need)
Professor Leonard (check his playlists on differential equations and calculus; for me, his full-length statistics playlist was the absolute best for learning statistics)
Steve Brunton (somebody recommended his probability and statistics playlists over Prof. Leonard’s statistics playlists, I think that’s a good idea; he covers more ground)
3. Learn to see and prep data
Your models are only as good as the data you feed them. Garbage in = Garbage out is very true in machine learning.
You’ll need to be able to load your datasets, know when to scale/normalize/impute, spot outliers and correlations, and preprocess the data for training.
The good thing is, if you learned Python in step 1 and stats in step 2, now you just need to learn the libraries: pandas, matplotlib, seaborn.
The best way to learn would be to make your own dataset, see what normalizing/ scaling/ imputation does to your own data.
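To see concretely what imputation and scaling do to your numbers, here’s a minimal pure-Python sketch (the values are made up; in practice you’d use pandas and scikit-learn for this):

```python
# Mean-impute a missing value, then min-max scale everything to [0, 1].
data = [4.0, None, 10.0, 6.0]  # None marks a missing value

observed = [x for x in data if x is not None]
mean = sum(observed) / len(observed)           # impute with the column mean
imputed = [mean if x is None else x for x in data]

lo, hi = min(imputed), max(imputed)
scaled = [(x - lo) / (hi - lo) for x in imputed]

print(scaled)  # min maps to 0.0, max maps to 1.0, the rest fall in between
```

Run the same idea on your own dataset and compare the distributions before and after; that’s the fastest way to build intuition for what these transforms actually change.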
Resources:
Data Analysis with Python (freeCodeCamp)
Lecture 5. Data Preprocessing by Joaquin Vanschoren (recommend the notebooks here)
4. Get familiar with core ML
Learn basic concepts in regression (linear, ridge/lasso), classification (logistic regression, decision trees, SVM), and clustering (k-means, DBSCAN). Look into XGBoost; it’s usually the best for tabular data!
Focus on knowing when to pick one model over the other, and learn to evaluate predictions.
What model should we use to classify data? What metric are we using for regression? Accuracy vs. precision/recall vs. F1-score? Why can’t we use accuracy for imbalanced datasets?
You should focus on learning scikit-learn. Once you get the hang of fitting your model, making predictions, and evaluating them, you’re good!
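To make the accuracy-vs-F1 question concrete, here’s a hand-computed toy example (made-up labels; in practice you’d use sklearn.metrics): a model that always predicts the majority class on an imbalanced dataset.

```python
# 95 negatives, 5 positives; a lazy model that always predicts "negative".
y_true = [0] * 95 + [1] * 5
y_pred = [0] * 100

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))

precision = tp / (tp + fp) if tp + fp else 0.0
recall = tp / (tp + fn) if tp + fn else 0.0
f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0

print(accuracy, f1)  # 0.95 vs 0.0 -- "95% accurate" yet catches zero positives
```

That’s why F1 (or precision/recall) is the metric of choice whenever one class is rare.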
Resources:
All Machine Learning Algorithms explained in 17 min (Infinite Codes)
Machine Learning for Everybody (freeCodeCamp)
5. Learn deep learning concepts
Learn the basic concepts (layers, activations, forward/backprop), 1D and 2D CNNs, and RNNs/LSTMs (for sequences).
Skip TensorFlow/Keras and learn PyTorch. Get familiar with writing device-agnostic code (it should run without issues on CPU or GPU).
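Before reaching for a framework, it’s worth hand-rolling a single neuron once to demystify the forward and backward pass. A minimal sketch with toy data and plain Python:

```python
# One neuron, one weight: fit y = 2x by gradient descent on MSE loss.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]
w = 0.0
lr = 0.05

for _ in range(100):
    # forward pass: predictions and mean-squared-error loss
    preds = [w * x for x in xs]
    loss = sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(xs)
    # backward pass: dLoss/dw, derived via the chain rule
    grad = sum(2 * (p - y) * x for p, y, x in zip(preds, ys, xs)) / len(xs)
    w -= lr * grad  # gradient descent step

print(round(w, 3))  # converges toward 2.0
```

PyTorch’s autograd does exactly this bookkeeping for you, across millions of weights; once you’ve seen it by hand, `loss.backward()` stops being magic.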
Resources:
Deep Learning Crash Course for Beginners (freeCodeCamp)
Learn PyTorch for deep learning in a day (Daniel Bourke)
6. Basics of NLP and transformers
Learn classical NLP (tf-idf, bag-of-words) before moving on to transformers. The ideas behind embeddings and attention are very interesting.
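Bag-of-words and tf-idf fit in a few lines of plain Python (the toy corpus is made up; NLTK and scikit-learn do this for you, with slightly different smoothing conventions):

```python
import math

docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
]
tokenized = [d.split() for d in docs]

# Bag-of-words: raw term counts per document over a shared vocabulary
vocab = sorted({w for doc in tokenized for w in doc})
bow = [[doc.count(w) for w in vocab] for doc in tokenized]

# tf-idf: term frequency * inverse document frequency
def tf_idf(word: str, doc: list[str]) -> float:
    tf = doc.count(word) / len(doc)
    df = sum(word in d for d in tokenized)
    idf = math.log(len(tokenized) / df)
    return tf * idf

# "cat" appears in only one document, so it gets a positive weight there;
# "the" appears everywhere, so its idf (and hence tf-idf) is zero.
print(tf_idf("cat", tokenized[0]), tf_idf("the", tokenized[0]))
```

The takeaway: tf-idf downweights words that appear everywhere, which is the same intuition (rare = informative) that embeddings later capture in a denser form.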
Look into pre-trained models (BERT, GPT) and the Hugging Face ecosystem (transformers, datasets).
Gemma 3 270M, for instance, is a lightweight model that came out in August. Try fine-tuning models like these on your own tasks; it’s a great learning experience, and fun to try out yourself.
Resources:
Natural Language Processing (NLP) Tutorial with Python & NLTK (freeCodeCamp)
HuggingFace + Langchain (Tech With Tim)
7. RAG, Vector Databases, Context engineering
LLMs can hallucinate; retrieval-augmented generation (RAG) grounds their answers in your own data. You’ll want to chunk your data, turn the chunks into embeddings, and store them in a vector database.
Get familiar with tools like Chroma, Weaviate, and Pinecone. This is probably the most rewarding part where you can get results fast.
Tip: Focus on evaluating retrieval quality and not just plugging in vector databases.
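The whole retrieval pipeline fits in a toy sketch: chunk, embed, store, retrieve by cosine similarity. Here the “embedding” is just word counts over a tiny vocabulary, and the texts are made up; a real system would use a learned embedding model and one of the vector databases above.

```python
import math

chunks = [
    "PyTorch supports GPU acceleration",
    "Pandas is great for tabular data",
    "Vector databases store embeddings",
]

# Toy "embedding": count occurrences of each vocabulary word in a text
vocab = sorted({w.lower() for c in chunks for w in c.split()})
def embed(text: str) -> list[float]:
    words = text.lower().split()
    return [float(words.count(v)) for v in vocab]

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

index = [(c, embed(c)) for c in chunks]  # our "vector database"

query = "where do I store embeddings"
q = embed(query)
best = max(index, key=lambda pair: cosine(q, pair[1]))
print(best[0])  # retrieves the chunk about vector databases
```

Evaluating retrieval quality then means checking, over many queries like this one, how often the right chunk comes back on top.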
Resources:
Learn RAG from scratch (freeCodeCamp)
8. MLOps
MLOps is probably the most important part: unless you deploy your models, nobody will ever see them.
Learn the basics of Docker, orchestration (you don’t really need Kubernetes when you’re starting, just know it’s there), and experiment tracking (TensorBoard, MLflow; Weights &amp; Biases is a good place to start).
Can you deploy the whole thing as a simple API? Focus on making your models easier to use.
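For a sense of what the Docker side looks like, here’s a hypothetical Dockerfile for serving a model behind a simple HTTP API. The filenames (`requirements.txt`, `serve.py` exposing an `app` object) are assumptions about your project layout, not a fixed convention:

```dockerfile
# Hypothetical image for serving a model as a simple API.
FROM python:3.12-slim
WORKDIR /app

# Install dependencies first so Docker can cache this layer
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the rest of the project (model weights, serving code)
COPY . .

EXPOSE 8000
CMD ["uvicorn", "serve:app", "--host", "0.0.0.0", "--port", "8000"]
```

Once something like this builds and runs locally with `docker build` and `docker run -p 8000:8000`, deploying it to any cloud provider is a small step.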
Resources:
MLOps Pipeline with Python, AWS, Docker (freeCodeCamp)
9. Projects
Here’s the reality - nobody cares what courses you’ve watched or what certificates you have. You’ll need good projects that show end-to-end development to get hired.
Where do you find good project ideas? Well, that depends on you. Think what you’re passionate about, and pick up something you know will be exciting for you.
Keep your favorite projects on GitHub, deploy them, turn them into apps, and start applying everywhere!
Mistakes To Avoid
Avoid watching tutorials/courses without a clear goal
I’ve watched countless videos, some good, some bad, from people building agents to people building automation systems that would look impressive on my resume. I’m not saying the videos are bad; I’m saying I wasted months doing this.
Set a clear goal. Have clear expectations. Set strict deadlines. What can you finish by this week? Give yourself time, be patient, see what works for you and what doesn’t.
Don’t skip the foundation
Don’t skip fundamentals. This is important. Don’t jump to building shiny apps without learning key concepts.
You’ll need to know Python, GitHub, and Colab (or your local setup). You can search for things, sure, but you should know the absolute basics - functions, data types (list, dictionary), OOP, error handling, and best practices.
Ask more questions
Learn things intuitively. Ask yourself if you understand things like - what exactly does loss mean, how exactly does a model learn, what if you added a second convolution block, what if you didn’t normalize inputs. What if you randomly dropped neurons? Would that do something? Why?
Building RAG-based chatbots, building custom models from scratch, JAX optimizations: all of it becomes approachable if you haven’t skipped the basics. It’s only a matter of time before you get there.
Avoid comparing
Don’t get jealous of people implementing B-splines or reproducing papers. You’ll get there. Focus on yourself. Make a list of things you don’t know and stick to it. Write the list on paper, and cross items off as you learn them.
Look at the inputs and outputs
I learned this too late.
You should know what a function takes as input, have a rough idea of what it does to that input, and know what output you’ll get. Does it return a np.ndarray or a torch.float32? Is it in-place? Are there better alternatives?
It’s not specific to AI, it’s good engineering practice.
Type hints and docstrings
Python has had type hints since 3.5 (PEP 484), and 3.9+ (PEP 585) supports built-in generics.
```python
# old (pre-3.9)
from typing import List, Dict

def process_data(data: Dict[str, List[int]]) -> int:
    pass

# new (3.9+)
def process_data(data: dict[str, list[int]]) -> int:
    pass
```
Use hints, built-in generics (PEP 585), and stub files (.pyi) even if you’re the only one looking at the code.
Use docstrings that actually make sense.
Make it work, then refactor
Don’t waste your time thinking about clean code if it doesn’t work in the first place. You can worry about abstraction and what not once it works.
Write more tests
Write tests religiously. Test-driven development (TDD) is something I wish I started doing early.
Tests make you think more carefully about the inputs, return types, edge-cases, and you can catch mistakes before they crash your program.
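A taste of what this looks like in practice: a tiny function plus assert-based checks that force you to think about edge cases (pytest would auto-discover functions named `test_*`; plain asserts shown here for brevity).

```python
def normalize(values: list[float]) -> list[float]:
    """Min-max scale values to [0, 1]; constant input maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)  # edge case: avoid division by zero
    return [(v - lo) / (hi - lo) for v in values]

# Happy path
assert normalize([0.0, 5.0, 10.0]) == [0.0, 0.5, 1.0]
# Edge cases the tests force you to think about
assert normalize([3.0]) == [0.0]
assert normalize([2.0, 2.0]) == [0.0, 0.0]
```

Notice how the single-element and constant-input cases only got handled because writing the tests surfaced them; that is the whole point.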
Not asking more questions
Ask more questions (on Reddit, Stack Overflow, LinkedIn, X, and everywhere else). Ask questions nobody is asking, and don’t worry about what that does to your profile. The more you ask, the more you learn, and the more chances other people have to actually help you.
Conclusion
I don’t recommend trying to finish everything in one week. Take breaks, break the roadmap into chunks, and finish at your own pace. It does get easier once you learn the basics, so stick with it and enjoy the process.
Credit where it’s due:
Reddit user (Sweaty_Chair_4600) for pointing out Prof. Leonard’s calc and differential equations playlists, and recommending Steve Brunton for probability and stats.