Week 1: Foundations of LLM Software Development

January 6—January 12, 2024

Learning Objectives

By the end of this week, students will be able to:

Understand the evolution and current state of LLM technology
Analyze the implications of non-determinism in AI systems
Implement basic LLM applications using Modal and OpenAI APIs
Design appropriate development workflows for LLM applications

Session 1: Introduction to Modern LLM Development

Tuesday, January 7, 2024 (1:00 AM—3:00 AM GMT+1)

Key Topics

1. Evolution of Large Language Models

Historical development of language models
Breakthrough of transformer architecture
Scaling laws and their implications
Current state-of-the-art models and capabilities

2. Understanding Non-determinism in LLMs

Sources of variability in LLM outputs (Lou & Sun, 2024)
Statistical approaches to output validation (Tu et al., 2024)
Temperature and sampling strategies (Watson et al., 2024)
Deterministic pipelines for production systems
Reproducibility frameworks and best practices

References for this section:

Lou, J., & Sun, Y. (2024). Anchoring Bias in Large Language Models: An Experimental Study. arXiv preprint arXiv:2412.06593.
Tu, L., Meng, R., & Joty, S. (2024). Investigating Factuality in Long-Form Text Generation. arXiv preprint arXiv:2411.15993.
Watson, J., Góes, F., & Volpe, M. (2024). Are Frontier Large Language Models Suitable for Q&A in Science Centres? arXiv preprint arXiv:2412.05200.

3. Modern LLM Landscape

Survey of available models and their capabilities
Comparison of different approaches (OpenAI, Google, Anthropic)
Trade-offs between different model sizes and architectures
Latest developments in multi-modal capabilities

Required Readings

Brown, T. B., Mann, B., Ryder, N., et al. (2020). Language Models are Few-Shot Learners. Advances in Neural Information Processing Systems, 33, 1877–1901.
Döll, M., Döhring, M., & Müller, A. (2024). Evaluating Gender Bias in Large Language Models. arXiv preprint arXiv:2411.09826.
Xu, L., Zhao, S., Lin, Q., et al. (2024). Evaluating Large Language Models on Spatial Tasks: A Multi-Task Benchmarking Study. arXiv preprint arXiv:2408.14438.

Supplementary Materials

OpenAI API Documentation (2024) - Function Calling and Tool Use
Modal Documentation (2024) - Serverless Deployment for AI Applications
Google PaLM 2 Technical Report (2024)

Session 2: LLM Application Development Lifecycle

Thursday, January 9, 2024 (1:00 AM—3:00 AM GMT+1)

Key Topics

1. LLM-Specific Software Development Lifecycle

Evolution of AI Development Practices (Gong et al., 2024)
- Traditional vs. LLM-based development cycles
- Continuous evaluation patterns
- Prompt version control strategies
Iterative Development with LLMs
- Rapid prototyping methodologies
- A/B testing frameworks
- Feedback incorporation patterns
Advanced Integration Patterns
- Microservices architecture for LLMs
- API abstraction layers
- Versioning strategies for prompts and models
Production Best Practices (Ventirozos et al., 2024)
- CI/CD for LLM applications
- Testing strategies for non-deterministic systems
- Documentation requirements

References for this section:

Gong, X., Li, M., & Zhang, Y. (2024). Effective and Evasive Fuzz Testing-Driven Jailbreaking Attacks against LLMs. arXiv preprint arXiv:2409.14866.
Ventirozos, F., Nteka, I., & Nandy, T. (2024). Shifting NER into High Gear: The Auto-AdvER Approach. arXiv preprint arXiv:2412.05655.
OpenAI. (2024). LLM Application Development Guide. OpenAI Documentation.

2. Development Environment Setup

Modal platform configuration
OpenAI API integration
Local development workflows
Version control considerations

3. Infrastructure and Deployment Basics

Modern GPU Architecture Requirements (NVIDIA, 2024)
- H100 Tensor Core optimizations
- Multi-GPU deployment patterns
- Memory hierarchy considerations
Scaling and Performance (Ventirozos et al., 2024)
- Distributed inference strategies
- Load balancing techniques
- Latency optimization
Security and Cost Management
- API security patterns (OpenAI, 2024)
- Resource utilization optimization
- Cost-effective scaling strategies
Monitoring and Observability
- Metrics collection frameworks
- Performance profiling tools
- Error tracking systems

References for this section:

NVIDIA. (2024). H100 Tensor Core GPU Architecture: Advancing the State of AI. NVIDIA Technical Documentation.
Ventirozos, F., Nteka, I., & Nandy, T. (2024). Shifting NER into High Gear: The Auto-AdvER Approach. arXiv preprint arXiv:2412.05655.
OpenAI. (2024). Production Best Practices: Security and Scaling. OpenAI Documentation.

Required Readings

Gong, X., Li, M., Zhang, Y., et al. (2024). Effective and Evasive Fuzz Testing-Driven Jailbreaking Attacks against LLMs. arXiv preprint arXiv:2409.14866.
Gandhi, K., Lynch, Z., Fränken, J.P., et al. (2024). Human-like Affective Cognition in Foundation Models. arXiv preprint arXiv:2409.11733.
Modal Platform Documentation (2024) - Production Deployment Guide

Supplementary Materials

NVIDIA Hopper Architecture Documentation (2024)
OpenAI Model Cards and System Cards (2024)
Google Best Practices for LLM Application Development (2024)

Hands-on Activities

Lab 1: Production-Grade LLM Application Development

Build a robust question-answering system using Modal and OpenAI's APIs that implements industry best practices (OpenAI, 2024; Modal, 2024):

Advanced API Integration
- Secure API key rotation system
- Intelligent rate limiting with backoff strategies
- Comprehensive error handling (Gandhi et al., 2024)
- Response validation framework
Streaming and Processing
- Efficient token streaming implementation
- Real-time response validation
- Output format enforcement
- Caching with TTL management
Production-Ready Features
- Automated logging pipeline
- Performance metrics collection
- Cost optimization system
- Health monitoring dashboard

Technical Stack Requirements:

Python 3.10+ with asyncio
Modal serverless deployment
OpenAI API (GPT-4 Turbo)
Redis for caching
Prometheus for metrics
Grafana for visualization

References:

OpenAI. (2024). Production System Design. OpenAI Documentation.
Modal. (2024). Enterprise Deployment Guide. Modal Documentation.
Gandhi, K., Lynch, Z., & Fränken, J.P. (2024). Human-like Affective Cognition in Foundation Models. arXiv preprint arXiv:2409.11733.

Lab 2: Development Environment Setup

Set up a complete development environment including:

Modal configuration
API key management
Local testing framework
Basic monitoring

Project Milestone 1

Due: January 12, 2024

Objective

Build and deploy a basic PDF query application that demonstrates understanding of:

LLM API integration
Proper error handling
Basic prompt engineering
Deployment using Modal

Requirements

Implementation must include:
- PDF text extraction
- Proper chunking strategy
- Effective prompt design
- Basic error handling
- Cost monitoring
Documentation must include:
- Architecture overview
- Setup instructions
- API documentation
- Cost analysis

Evaluation Criteria

Code quality and organization (25%)
Implementation of best practices (25%)
Documentation quality (25%)
Error handling and robustness (25%)

Additional Resources

Technical Documentation

OpenAI API Reference (2024)
Modal Platform Documentation (2024)
Google PaLM 2 API Documentation (2024)

Academic Papers

Kaplan, J., McCandlish, S., Henighan, T., & Brown, T. B. (2020). Scaling Laws for Neural Language Models. arXiv preprint arXiv:2001.08361.
Raffel, C., Shazeer, N., Roberts, A., et al. (2020). Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer. Journal of Machine Learning Research, 21(140), 1–67.
Xu, R., & Li, G. (2024). A Comparative Study of Offline Models and Online LLMs in Fake News Detection. arXiv preprint arXiv:2409.03067.

Industry Resources

OpenAI Engineering Blog (2024)
Google AI Blog - PaLM 2 Architecture (2024)
NVIDIA Developer Blog - GPU Architecture for LLMs (2024)

Discussion Topics

How do different model architectures affect development practices?
What are the implications of non-determinism for testing and validation?
How do you balance cost, performance, and reliability in LLM applications?
What are the key considerations when choosing between different LLM providers?

Assessment

Class Participation: 10%
Lab Assignments: 40%
Project Milestone: 50%

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

week1_content.md

week1_content.md

Week 1: Foundations of LLM Software Development

Learning Objectives

Session 1: Introduction to Modern LLM Development

Key Topics

1. Evolution of Large Language Models

2. Understanding Non-determinism in LLMs

3. Modern LLM Landscape

Required Readings

Supplementary Materials

Session 2: LLM Application Development Lifecycle

Key Topics

1. LLM-Specific Software Development Lifecycle

2. Development Environment Setup

3. Infrastructure and Deployment Basics

Required Readings

Supplementary Materials

Hands-on Activities

Lab 1: Production-Grade LLM Application Development

Lab 2: Development Environment Setup

Project Milestone 1

Objective

Requirements

Evaluation Criteria

Additional Resources

Technical Documentation

Academic Papers

Industry Resources

Discussion Topics

Assessment

Files

week1_content.md

Latest commit

History

week1_content.md

File metadata and controls

Week 1: Foundations of LLM Software Development

Learning Objectives

Session 1: Introduction to Modern LLM Development

Key Topics

1. Evolution of Large Language Models

2. Understanding Non-determinism in LLMs

3. Modern LLM Landscape

Required Readings

Supplementary Materials

Session 2: LLM Application Development Lifecycle

Key Topics

1. LLM-Specific Software Development Lifecycle

2. Development Environment Setup

3. Infrastructure and Deployment Basics

Required Readings

Supplementary Materials

Hands-on Activities

Lab 1: Production-Grade LLM Application Development

Lab 2: Development Environment Setup

Project Milestone 1

Objective

Requirements

Evaluation Criteria

Additional Resources

Technical Documentation

Academic Papers

Industry Resources

Discussion Topics

Assessment