DeepSeek-R1 – 1.5B Parameters
Introduction

DeepSeek-R1 represents a new generation of lightweight, high-performance reasoning models optimized for Edge AI applications. These models deliver strong reasoning, coding, and mathematical capabilities while remaining efficient enough for deployment in resource-constrained environments.
Why Small Models Matter for Edge AI

Edge AI demands models that strike a balance between computational efficiency and performance. Small-scale LLMs provide exactly that balance, keeping memory use and latency low enough for constrained hardware while still running fully offline.
DeepSeek-R1 Small Model Variants

The DeepSeek team has distilled knowledge from larger models into smaller, dense models. These lightweight variants leverage insights from extensive reasoning datasets, achieving strong benchmark results while remaining optimized for Edge AI use cases. Each can be pulled and run with a single command:
ollama run deepseek-r1:1.5b
ollama run deepseek-r1:7b
ollama run deepseek-r1:8b
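Beyond the CLI, any of these variants can be queried programmatically through Ollama's local REST API. The snippet below is a minimal sketch, assuming Ollama is running on its default port (11434) and the 1.5B model has already been pulled; the prompt is purely illustrative.

```python
import requests

# Ask the local Ollama server for a single (non-streamed) completion.
response = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "deepseek-r1:1.5b",
        "prompt": "Solve step by step: what is 17 * 23?",
        "stream": False,  # return one JSON object instead of a stream
    },
)
response.raise_for_status()

# The model's full output (including its reasoning trace) is in "response".
print(response.json()["response"])
```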
Applications in Edge AI

These small models are particularly well suited to on-device reasoning workloads such as coding assistance, math problem solving, and local question answering.
Licensing & Flexibility

DeepSeek-R1 small models are open-source under the MIT License, allowing commercial use, modification, and further fine-tuning. The Qwen-based variants originate from Qwen2.5 (Apache 2.0 License), and the Llama-derived variants follow Meta’s licensing terms.
Conclusion

DeepSeek-R1’s distilled small models present a breakthrough for Edge AI, delivering robust performance with minimal computational overhead. These models enable AI at the edge: secure, fast, and efficient.
For more information and downloads, visit https://www.ollama.com/library/deepseek-r1:1.5b
Mistral – 7B Parameters
Mistral is a compact and powerful 7B parameter model, optimized for instruction following and text completion while being lightweight enough for Edge AI deployments. With an Apache 2.0 license, Mistral provides unrestricted flexibility for customization and integration into real-world applications.
Mistral 0.3 enables function calling via Ollama’s raw mode, making it useful for real-world tasks such as the live weather lookup shown below.

Example Request:
[AVAILABLE_TOOLS] [{"type": "function", "function": {"name": "get_current_weather", "description": "Get the current weather", "parameters": {"type": "object", "properties": {"location": {"type": "string", "description": "The city and state, e.g. San Francisco, CA"}, "format": {"type": "string", "enum": ["celsius", "fahrenheit"], "description": "The temperature unit to use. Infer this from the user's location."}}, "required": ["location", "format"]}}}][/AVAILABLE_TOOLS][INST] What is the weather like today in San Francisco [/INST]
Example Response:
[TOOL_CALLS] [{"name": "get_current_weather", "arguments": {"location": "San Francisco, CA", "format": "celsius"}}]
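To reproduce this exchange programmatically, the request can be sent through Ollama's /api/generate endpoint with raw mode enabled, so the tool-definition markup is passed to the model verbatim rather than being wrapped in the default prompt template. A minimal sketch (the tool schema matches the example above; error handling omitted):

```python
import json
import requests

# Tool-definition prompt in Mistral's raw format (as shown above).
prompt = (
    '[AVAILABLE_TOOLS] [{"type": "function", "function": {"name": "get_current_weather", '
    '"description": "Get the current weather", "parameters": {"type": "object", "properties": '
    '{"location": {"type": "string", "description": "The city and state, e.g. San Francisco, CA"}, '
    '"format": {"type": "string", "enum": ["celsius", "fahrenheit"], "description": '
    '"The temperature unit to use. Infer this from the user\'s location."}}, '
    '"required": ["location", "format"]}}}][/AVAILABLE_TOOLS]'
    '[INST] What is the weather like today in San Francisco [/INST]'
)

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={"model": "mistral", "prompt": prompt, "raw": True, "stream": False},
)
output = resp.json()["response"]

# If the model decided to call a tool, the reply starts with [TOOL_CALLS].
if "[TOOL_CALLS]" in output:
    calls = json.loads(output.split("[TOOL_CALLS]", 1)[1].strip())
    print(calls[0]["name"], calls[0]["arguments"])
```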
Mistral can be deployed on edge devices via the command line or the local REST API:
ollama run mistral
curl -X POST http://localhost:11434/api/generate -d '{
"model": "mistral",
"prompt":"Summarize recent AI advancements"
}'
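Note that /api/generate streams its reply as a sequence of JSON objects by default; adding "stream": false to the request body returns a single JSON response instead.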
Mistral’s compact size and strong performance make it an ideal choice for Edge AI deployments, whether for industrial automation, IoT integration, or AI-powered assistants. Its ability to handle structured function calls enhances real-world usability, making it a key player in next-generation AI systems at the edge.
Phi-3 – 3.8B Parameters
Phi-3 is a family of lightweight open models from Microsoft. The variants available through Ollama:

| Model | Parameters | Context Window | Ollama Command |
|---|---|---|---|
| Phi-3 Mini | 3.8B | 4K tokens | ollama run phi3:mini |
| Phi-3 Medium | 14B | 4K tokens | ollama run phi3:medium |
| Phi-3 Medium (128K) | 14B | 128K tokens | ollama run phi3:medium-128k (requires Ollama 0.1.39+) |
ollama run phi3:mini          # 3.8B model
ollama run phi3:medium        # 14B model
ollama run phi3:medium-128k   # 14B model with 128K context
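The 128K-context variant is the natural fit for long-document work at the edge. A rough sketch, assuming the official ollama Python package (pip install ollama) and an illustrative local file report.txt to summarize:

```python
import ollama

# Load a long document; the 128K context window of phi3:medium-128k
# can hold far more text than the 4K variants.
with open("report.txt", encoding="utf-8") as f:
    document = f.read()

response = ollama.chat(
    model="phi3:medium-128k",
    messages=[
        {"role": "user", "content": f"Summarize the key points:\n\n{document}"},
    ],
)
print(response["message"]["content"])
```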
License: MIT (Open-source).
Llama 3.2 – 1B Parameters
Introduction

Meta’s Llama 3.2 models bring the power of large language models (LLMs) to smaller, more efficient architectures designed for Edge AI applications. With 1B and 3B parameter versions, these models enable robust multilingual dialogue, retrieval, and summarization capabilities while remaining computationally lightweight.
Why Llama 3.2 for Edge AI?

Deploying AI at the edge requires models that are efficient, responsive, and adaptable, and Llama 3.2’s small-scale variants deliver exactly that combination.
Llama 3.2 Small Model Variants

Both sizes are available through Ollama:

ollama run llama3.2:1b   # 1B model
ollama run llama3.2      # 3B model (default tag)
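When choosing between the two sizes for a given device, it can help to run the same prompt against both and compare latency and answer quality. A small sketch using the local REST API (model tags as above; the prompt is purely illustrative):

```python
import time
import requests

PROMPT = "Summarize: Edge AI runs models on-device instead of in the cloud."

for tag in ("llama3.2:1b", "llama3.2"):  # 1B and 3B (default tag)
    start = time.time()
    r = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": tag, "prompt": PROMPT, "stream": False},
    )
    elapsed = time.time() - start
    # Print each model's latency and the start of its answer.
    print(f"{tag} ({elapsed:.1f}s): {r.json()['response'][:120]}")
```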
Applications in Edge AI

Llama 3.2’s compact models are well suited to on-device multilingual dialogue, retrieval, and summarization workloads.
Licensing & Availability

Llama 3.2 is released under Meta’s Llama 3.2 Community License Agreement, with usage governed by Meta’s Acceptable Use Policy. The models are freely available for research and commercial applications, subject to compliance with the licensing terms.
Conclusion

Llama 3.2’s 1B and 3B parameter models provide an optimal balance of efficiency and performance for Edge AI applications. Their multilingual capabilities, instruction tuning, and lightweight architecture make them powerful tools for deploying AI beyond traditional cloud environments.
For more details and downloads, visit EdgeAI.org.
Qwen2.5 – 0.5B – 1.5B – 3B Parameters
The Qwen2.5 family spans 0.5B – 72B parameters, covering a broad range of applications, from lightweight edge AI to large-scale enterprise use.
7B Model (Default)

Optimized for instruction following, structured data, and long-text generation. Run it locally via:

ollama run qwen2.5
Other Model Sizes

The smaller, edge-oriented variants (0.5B, 1.5B, and 3B) can be pulled by tag:

ollama run qwen2.5:0.5b
ollama run qwen2.5:1.5b
ollama run qwen2.5:3b
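Since Qwen2.5 is tuned for structured data, one practical pattern at the edge is forcing well-formed JSON output via Ollama's format parameter. A minimal sketch, assuming a local Ollama server; the extraction schema in the prompt is purely illustrative:

```python
import json
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "qwen2.5",
        "prompt": 'Extract {"city": ..., "unit": ...} from: '
                  '"Weather for San Francisco in celsius, please."',
        "format": "json",   # constrain the model to emit valid JSON
        "stream": False,
    },
)

# The response text parses directly as JSON thanks to the format constraint.
data = json.loads(resp.json()["response"])
print(data)
```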
Gemma – 2B Parameters
Google’s Gemma is available through Ollama in a default 7B size and a lighter 2B variant suited to edge hardware:

ollama run gemma        # 7B model (default)
ollama run gemma:2b     # 2B model
License: Gemma Terms of Use (Modified February 21, 2024)
Architecture: Gemma
Parameters: 8.54B (7B model)
Quantization: Q4_0 (4-bit, 5.0GB)
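The last two figures are consistent with each other. As a back-of-the-envelope check, Q4_0 stores each block of 32 weights as a 16-bit scale plus 32 four-bit values, i.e. roughly 4.5 bits per weight; the small remaining gap to 5.0GB comes from tensors kept at higher precision:

```python
# Rough size estimate for an 8.54B-parameter model in Q4_0 quantization:
# each 32-weight block = 2-byte scale + 16 bytes of 4-bit values.
bits_per_weight = (2 * 8 + 32 * 4) / 32        # = 4.5 bits per weight
size_gb = 8.54e9 * bits_per_weight / 8 / 1e9   # bytes -> GB
print(f"{size_gb:.1f} GB")                     # ~4.8 GB, close to the listed 5.0GB
```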
TinyLlama – 1.1B Parameters
TinyLlama is a compact 1.1B parameter model built on the Llama 2 architecture, making it one of the smallest options for severely resource-constrained edge devices:

ollama run tinyllama
Gemma 2 – 2B Parameters
Google’s Gemma 2 model delivers a balance of power and efficiency, making it a strong candidate for EdgeAI deployments. With a focus on small-scale, high-performance models, the 2B and 9B parameter versions provide cutting-edge natural language processing (NLP) while maintaining a manageable computational footprint.
For EdgeAI applications, model size and efficiency are critical. The 2B and 9B variants of Gemma 2 are designed to operate in constrained environments without sacrificing performance, as the examples below show.
Gemma 2 can be run efficiently through Ollama, for example via LangChain’s community integration:
from langchain_community.llms import Ollama

# Point LangChain at the locally served 2B model (requires a running Ollama server)
llm = Ollama(model="gemma2:2b")
response = llm.invoke("Explain the benefits of on-device AI.")
For more demanding edge applications, the 9B variant provides a balance between efficiency and performance:
# The same interface scales to the larger model; LangChain LLMs expose invoke(), not complete()
llm = Ollama(model="gemma2:9b")
response = llm.invoke("How can AI optimize edge computing workflows?")
By leveraging Gemma 2’s compact yet capable architecture, EdgeAI solutions can achieve faster, more reliable, and scalable AI-driven automation. These models provide a foundation for deploying AI where it matters most—on the edge, closer to real-world interactions.