Building Your Second Brain AI Assistant Using Agents, LLMs and RAG

Learn how to access the collective wisdom of your own mind

Open-source course by Decoding ML in collaboration with
MongoDB, Comet, Opik, Unsloth and ZenML.

📖 About This Course

This course is part of Decoding ML's open-source series, teaching you how to build production-ready GenAI systems using LLMs, RAG, agents and LLMOps.

The Second Brain AI Assistant course contains 6 modules that will teach you how to build an advanced RAG and LLM system using LLMOps and ML systems best practices. You'll learn to build an end-to-end AI assistant that chats with your Second Brain - your personal knowledge base of notes, resources, and storage.

By the end of this course, you'll be able to architect and implement a production-ready agentic RAG and LLM system from scratch.

So What Is the Second Brain AI Assistant?

The Second Brain, a concept by Tiago Forte, is your personal knowledge base of notes, ideas, and resources. Our AI Assistant leverages this knowledge to answer questions, summarize documents, and provide insights.

Imagine asking your AI Assistant to recommend agent courses, list top PDF parsing tools, or summarize LLM optimization methods - all based on your research, without manually searching through notes.

While we use Notion for this course, the code is adaptable to other sources like Google Drive or Calendar. We'll provide our curated AI/ML resource list from Notion, covering GenAI, LLMs, RAG, MLOps, and more. No Notion account needed - but if you want to use yours, our flexible pipeline supports any Notion database.

What You'll Do:

Build an agentic RAG system powered by your Second Brain
Design production-ready LLM architectures
Apply LLMOps and software engineering best practices
Fine-tune and deploy LLMs
Use industry tools: OpenAI, Hugging Face, MongoDB, ZenML, Opik, Comet, Unsloth, and more

After completing this course, you'll have access to your own Second Brain AI assistant, as seen in the video below:

second_brain_ai_assistant_example.mp4

🎯 What You'll Learn

While building the Second Brain AI assistant, you'll master:

LLM system architecture (FTI) and MLOps best practices
Pipeline orchestration and tracking with ZenML
LLMOps and RAG evaluation using Opik
Large-scale web crawling and content normalization
Quality scoring with LLMs and heuristics
Dataset generation through distillation
Llama model fine-tuning with Unsloth and Comet
Serverless model deployment to Hugging Face
Advanced RAG with contextual or parent retrieval and vector search
Agent building using smolagents
Modern Python tooling (uv, ruff)
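
To make the "parent retrieval" item above more concrete, here is a minimal sketch of the general idea: search over small chunks, then return the full parent document each matching chunk came from. This is not the course's implementation. The `Chunk` class, the function names, the chunk size and the toy bag-of-words `embed` function are all assumptions made for illustration; a real system would use an embedding model and a vector index.

```python
import math
from collections import Counter
from dataclasses import dataclass


@dataclass
class Chunk:
    parent_id: str  # id of the full document this chunk was cut from
    text: str


def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding' (assumption: stands in for a real embedding model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def split_into_chunks(doc_id: str, text: str, size: int = 8) -> list[Chunk]:
    """Cut a document into small chunks of `size` words, remembering the parent id."""
    words = text.split()
    return [Chunk(doc_id, " ".join(words[i:i + size])) for i in range(0, len(words), size)]


def parent_retrieve(query: str, docs: dict[str, str], top_k: int = 1) -> list[str]:
    """Rank small chunks against the query, then return their full parent documents."""
    chunks = [c for doc_id, text in docs.items() for c in split_into_chunks(doc_id, text)]
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c.text)), reverse=True)
    parents: list[str] = []
    for chunk in ranked:
        if chunk.parent_id not in parents:
            parents.append(chunk.parent_id)  # deduplicate: one entry per parent document
        if len(parents) == top_k:
            break
    return [docs[p] for p in parents]
```

Small chunks tend to match a query more precisely, while returning the parent gives the LLM the surrounding context that a single chunk would lack.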

🥷 With these skills, you'll become a ninja in building advanced agentic RAG and LLM systems using LLMOps and ML systems best practices.

👥 Who Should Join?

Target Audience	Why Join?
ML/AI Engineers	Build production-ready agentic RAG & LLM systems
Data/Software Engineers & Data Scientists	Level-up to production AI systems

Note: Hands-on engineering, not theory.

🎓 Prerequisites

Category	Requirements
Skills	Python (Intermediate); Machine Learning, LLMs, RAG (Beginner)
Hardware	Modern laptop/PC (GPU optional - cloud alternatives provided)
Level	Intermediate (But with a little sweat and patience, anyone can do it)

💰 Cost Structure

The course is open-source and free! You'll only need $1-$5 for tools if you run the code:

Service	Maximum Cost
OpenAI's API	~$3
Hugging Face's Dedicated Endpoints (Optional)	~$2

The best part? We offer multiple paths - you can complete the entire course for just ~$1 by choosing cost-efficient options. Reading-only? Everything's free!

🥂 Open-source Course: Participation is Open and Free

Because this is an open-source course, you don't have to enroll. Everything is self-paced and free of charge, with all resources freely accessible at:

code: this GitHub repository
lessons: Decoding ML

📚 Course Outline

This open-source course consists of 6 comprehensive modules covering theory, system design, and hands-on implementation.

Our recommendation for getting the most out of this course:

Clone the repository.
Read the materials.
Set up the code and run it to replicate our results.
Go deeper into the code to understand the details of the implementation.

Module	Materials	Description	Running the code
1	Build your Second Brain AI assistant	Architect an AI assistant for your Second Brain.	No code
2	Data pipelines for AI assistants	Build a data ETL pipeline to process custom Notion data, crawl documents, compute a quality score using LLMs & heuristics, and ingest them into a NoSQL database.	apps/second-brain-offline
3	Generate high-quality fine-tuning datasets (WIP)	Generate a high-quality summarization instruct dataset using distillation.	apps/second-brain-offline
4	Fine-tune and deploy open-source LLMs (WIP)	Fine-tune an open-source LLM to specialize it in summarizing documents and deploy it as a real-time endpoint.	apps/second-brain-offline
5	RAG feature pipelines for building AI assistants (WIP)	Implement a RAG feature pipeline using advanced techniques such as contextual retrieval.	apps/second-brain-offline
6	Agents and LLMOps (WIP)	Implement the agentic inference pipeline together with an observation pipeline to monitor and evaluate the performance of the AI assistant.	apps/second-brain-online
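
Module 2 mentions computing a quality score using LLMs and heuristics. The sketch below shows one way such a combination could work: cheap heuristics settle the clear-cut cases, and only undecided documents are passed to an LLM judge. It is an illustration, not the course's code. The function names, the thresholds and the `llm_judge` callable (any function that maps text to a score between 0 and 1) are assumptions.

```python
from typing import Callable, Optional


def heuristic_quality_score(text: str) -> Optional[float]:
    """Return 0.0 or 1.0 for clear-cut cases, or None when the heuristics cannot decide.

    All thresholds here are hypothetical and chosen only for illustration.
    """
    words = text.split()
    if len(words) < 5:
        return 0.0  # too short to carry useful content
    link_ratio = sum(w.startswith(("http://", "https://")) for w in words) / len(words)
    if link_ratio > 0.5:
        return 0.0  # mostly a list of links
    if len(words) > 200 and link_ratio < 0.05:
        return 1.0  # long, mostly prose
    return None


def quality_score(text: str, llm_judge: Callable[[str], float]) -> float:
    """Use heuristics first; fall back to the (hypothetical) LLM judge when undecided."""
    score = heuristic_quality_score(text)
    if score is None:
        score = llm_judge(text)
    return min(1.0, max(0.0, score))  # clamp to [0, 1]
```

Running the heuristics first keeps LLM calls, and therefore cost, limited to the documents that actually need a judgment call.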

🏗️ Project Structure

While building the Second Brain AI assistant, we will build two separate Python applications:

.
├── apps/
|   ├── infrastructure/          # Docker infrastructure for the applications
|   ├── second-brain-offline/    # Offline ML pipelines
|   └── second-brain-online/     # Online inference pipeline = our AI assistant

👔 Dataset

We will use our personal, filtered list of AI and ML resources, which we keep in Notion. It covers GenAI, LLMs, RAG, MLOps, LLMOps and information retrieval, and contains ~100 pages and 500+ links that we will crawl and access from the Second Brain AI assistant.

For ease of use, we stored a snapshot of our Notion data in a public S3 bucket, which you can download for free without AWS credentials.

Download here

Thus, you don't need to use Notion or grant access to your own Notion workspace to complete this course. If you want to use your own data, this GitHub repository exposes a flexible pipeline that can load any Notion database.
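
Because the snapshot is publicly downloadable without AWS credentials, fetching it should need nothing more than a plain HTTP request. The sketch below assumes exactly that. The function name `download_snapshot` and the destination path are hypothetical, and the URL is deliberately left as a parameter: use the link from the download section above.

```python
import shutil
import urllib.request
from pathlib import Path


def download_snapshot(url: str, dest: Path) -> Path:
    """Stream the file at `url` to `dest`, creating parent folders as needed.

    `url` must be supplied by the caller; no real bucket address is assumed here.
    """
    dest.parent.mkdir(parents=True, exist_ok=True)
    with urllib.request.urlopen(url) as response, open(dest, "wb") as out:
        shutil.copyfileobj(response, out)  # copy in chunks, without loading it all in memory
    return dest
```

For example, `download_snapshot(snapshot_url, Path("data/notion_snapshot"))` would save the file under a local `data/` folder, where `snapshot_url` is the link you copied and the destination name is arbitrary.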

🚀 Getting Started

Find detailed setup instructions in each app's documentation:

Application	Documentation
Offline ML Pipelines (data pipelines, RAG, fine-tuning, etc.)	apps/second-brain-offline
Online Inference Pipeline (Second Brain AI assistant)	apps/second-brain-online

Pro tip: Read the accompanying articles first for a better understanding of the system you'll build.

💡 Questions and Troubleshooting

Have questions or running into issues? We're here to help!

Open a GitHub issue for:

Questions about the course material
Technical troubleshooting
Clarification on concepts

🥂 Contributing

Because this is an open-source course, we may not be able to fix every bug that arises.

If you find any bugs and know how to fix them, support future readers by contributing to this course with your bug fix.

You can always contribute by:

Forking the repository
Fixing the bug
Creating a pull request

We will deeply appreciate your support for the AI community and future readers 🤗

Core Contributors

Paul Iusztin
AI/ML Engineer

Ernesto Larios
AI Engineer

Anca Ioana Muscalagiu
SWE/ML Engineer

License

This project is licensed under the MIT License - see the LICENSE file for details.
