
DeepSeek V3 (671B)

DeepSeek's 671B MoE model with 37B active parameters. Matches GPT-4o on many benchmarks.

Tags: open-source, moe, reasoning, code, 128k-context
Architecture: deepseek
Parameters: 671B
Context Length: 131,072 tokens
License: DeepSeek License
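
To make the specs above concrete, here is a minimal usage sketch. It assumes the weights are published on the Hugging Face Hub under an id like deepseek-ai/DeepSeek-V3 and are loadable through transformers' AutoModelForCausalLM; the official checkpoint may instead require trust_remote_code or a dedicated inference stack such as vLLM or SGLang, so treat this as illustrative rather than the vendor's recommended path.

from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical Hub id; check the actual repository name and license terms first.
model_id = "deepseek-ai/DeepSeek-V3"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # let transformers pick the dtype stored in the checkpoint
    device_map="auto",    # shard the 671B weights across available GPUs
    trust_remote_code=True,
)

# The 131,072-token context applies to prompt plus generated tokens combined.
prompt = "Explain Mixture-of-Experts routing in two sentences."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))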

About DeepSeek V3

DeepSeek V3 uses a Mixture-of-Experts architecture with auxiliary-loss-free load balancing: each token activates 37B of the 671B total parameters. Trained on 14.8T tokens with a multi-token prediction objective, it achieves state-of-the-art results among open-source models on many benchmarks.
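
To illustrate what "auxiliary-loss-free load balancing" means in practice, the sketch below shows a toy top-k router in the same spirit: a per-expert bias is added to the routing scores only when selecting experts, and is nudged between steps so overloaded experts are chosen less often, instead of adding a balancing loss term. All sizes and names here are toy assumptions; the real DeepSeek V3 router and update rule differ in detail.

import torch

# Toy sizes for illustration only; not the real 671B / 37B-active configuration.
num_experts, top_k, d_model = 8, 2, 16

router = torch.nn.Linear(d_model, num_experts, bias=False)
expert_bias = torch.zeros(num_experts)   # adjusted instead of an auxiliary balancing loss

def route(x):
    """Pick top_k experts per token; the bias shifts selection but not the combine weights."""
    scores = torch.sigmoid(router(x))                      # token-to-expert affinities
    _, idx = (scores + expert_bias).topk(top_k, dim=-1)    # biased scores drive selection
    weights = torch.gather(scores, -1, idx)                # unbiased scores weight the outputs
    return idx, weights / weights.sum(-1, keepdim=True)

def rebalance(idx, step=1e-3):
    """Nudge the bias so experts overloaded in this batch are picked less often next time."""
    load = torch.bincount(idx.flatten(), minlength=num_experts).float()
    expert_bias.sub_(step * torch.sign(load - load.mean()))

idx, weights = route(torch.randn(4, d_model))   # route 4 toy tokens
rebalance(idx)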

Author: Community
Category: text generation
Downloads: 1,567,890
