Phi-3 is a family of state-of-the-art open AI models developed by Microsoft, optimized for efficiency, strong reasoning, and long-context processing.
## Model Variants

| Model | Parameters | Context Window | Ollama Command |
|---|---|---|---|
| Phi-3 Mini | 3.8B | 4K tokens | `ollama run phi3:mini` |
| Phi-3 Medium | 14B | 4K tokens | `ollama run phi3:medium` |
| Phi-3 Medium (128K) | 14B | 128K tokens | `ollama run phi3:medium-128k` (requires Ollama 0.1.39+) |
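Which variant fits depends mostly on how much context you need. The sketch below picks the smallest variant from the table whose window fits a prompt, using the common (but rough) ~4-characters-per-token heuristic; the `reserve_for_reply` budget and the heuristic itself are assumptions, not part of the model card.

```python
# Sketch: pick a Phi-3 variant whose context window fits the prompt.
# Assumes ~4 characters per token (a crude heuristic, not a real tokenizer).

def estimate_tokens(text: str) -> int:
    """Crude token estimate (~4 characters per token)."""
    return max(1, len(text) // 4)

def choose_model(prompt: str, reserve_for_reply: int = 512) -> str:
    """Return the smallest Phi-3 variant (per the table above) that fits."""
    needed = estimate_tokens(prompt) + reserve_for_reply
    if needed <= 4_096:
        return "phi3:mini"         # 3.8B, 4K-token window
    if needed <= 128_000:
        return "phi3:medium-128k"  # 14B, 128K-token window
    raise ValueError("Prompt exceeds the largest available context window")

print(choose_model("Summarize this paragraph."))  # short prompt -> phi3:mini
```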
## Key Features
- High Efficiency – Optimized for low-resource and latency-sensitive environments.
- Strong Reasoning – Excels in math, logic, coding, and general knowledge.
- Long Context Handling – Up to 128K tokens for deep context retention.
- Optimized for Real-World Applications – Well-suited for chatbots, coding, research, and general AI tasks.
## Technical Details
- Architecture: Dense decoder-only Transformer.
- Training Data: 3.3 trillion tokens of heavily filtered public web data and synthetic, "textbook-like" educational data.
- Post-Training: Supervised Fine-Tuning (SFT) and Direct Preference Optimization (DPO) for better instruction adherence and safety.
- Training Hardware: 512 H100-80G GPUs over 7 days.
- Training Cutoff: October 2023 (Offline dataset).
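As a back-of-the-envelope check, the training figures above (3.3T tokens on 512 H100-80G GPUs over 7 days) imply a per-GPU throughput on the order of ten thousand tokens per second. This is only illustrative arithmetic, not a figure reported by Microsoft:

```python
# Implied training throughput from the figures above.
TOKENS = 3.3e12            # total training tokens
GPUS = 512                 # H100-80G GPUs
SECONDS = 7 * 24 * 3600    # 7 days = 604,800 s

gpu_seconds = GPUS * SECONDS            # ~3.1e8 GPU-seconds
tokens_per_gpu_per_s = TOKENS / gpu_seconds

print(f"{tokens_per_gpu_per_s:,.0f} tokens/s per GPU")  # ~10,657
```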
## Performance Benchmarks

- Phi-3 Mini achieves state-of-the-art results among models under 13B parameters on reasoning and commonsense benchmarks.
- Phi-3 Medium (14B) outperforms Gemini 1.0 Pro.
## Responsible AI Considerations
- Primarily trained in English; performance may degrade in other languages.
- Potential for bias, misinformation, and hallucinations—requires human oversight.
- Not optimized for high-risk applications like legal, medical, or financial decisions.
## Deployment & Usage
- Run with Ollama:

  ```bash
  ollama run phi3:mini          # 3.8B model
  ollama run phi3:medium        # 14B model
  ollama run phi3:medium-128k   # 14B model with 128K context
  ```
- Works with: PyTorch, DeepSpeed, FlashAttention, ONNX, Azure AI Studio.
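Once a model is pulled, Ollama also serves a local HTTP API (by default on `http://localhost:11434`). The sketch below only builds a non-streaming request body for the `/api/generate` endpoint; actually sending it requires a running `ollama serve`, so the network call is left as a commented example.

```python
import json

# Sketch: build a non-streaming request for Ollama's local /api/generate
# endpoint. The payload is constructed but not sent, since sending it
# requires a running Ollama server.

def build_generate_request(model: str, prompt: str) -> bytes:
    payload = {
        "model": model,    # e.g. "phi3:mini"
        "prompt": prompt,
        "stream": False,   # return one JSON object instead of a stream
    }
    return json.dumps(payload).encode("utf-8")

body = build_generate_request("phi3:mini", "Why is the sky blue?")

# To actually send it (with `ollama serve` running):
#   import urllib.request
#   req = urllib.request.Request(
#       "http://localhost:11434/api/generate", data=body,
#       headers={"Content-Type": "application/json"})
#   print(json.loads(urllib.request.urlopen(req).read())["response"])
```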
## License & Availability
License: MIT (Open-source).
Resources: