When was DeepSeek released?
Content on WhatAnswers is provided "as is" for informational purposes. While we strive for accuracy, we make no guarantees. Content is AI-assisted and should not be used as professional advice.
Last updated: April 17, 2026
Key Facts
- DeepSeek released its first models in November 2023
- Developed by the Chinese AI startup DeepSeek, spun out of the hedge fund High-Flyer in 2023
- The first models launched were DeepSeek Coder and DeepSeek LLM
- DeepSeek LLM was trained on 2 trillion tokens of text data
- Achieved competitive performance on benchmarks like MMLU and GSM8K
Overview
DeepSeek is a series of large language models developed by DeepSeek, a Hangzhou-based artificial intelligence company founded in 2023 by Liang Wenfeng and spun out of the quantitative hedge fund High-Flyer. Its first open-weight models, DeepSeek Coder and DeepSeek LLM, were released in November 2023, marking a significant milestone in China's generative AI landscape.
The release positioned DeepSeek as a competitive player among open-source and proprietary language models globally. Designed for high performance in reasoning, coding, and multilingual tasks, DeepSeek quickly attracted attention from researchers and developers.
- Release Date: DeepSeek Coder was launched in November 2023, with DeepSeek LLM and its open model weights following later the same month.
- Developer: Created by DeepSeek, a startup founded in Hangzhou in 2023 and backed by the quantitative hedge fund High-Flyer.
- Model Scale: DeepSeek LLM was trained on 2 trillion tokens, enabling robust generalization across tasks.
- Architecture: Built on a decoder-only transformer structure similar to GPT, optimized for autoregressive text generation.
- Openness: Unlike many U.S.-based models, DeepSeek released open-weight models, allowing broad access for research and commercial use.
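The decoder-only, autoregressive design noted above means each token may only attend to itself and earlier tokens. The following is a minimal NumPy sketch of a single causal self-attention step to illustrate that constraint; it is a toy illustration, not DeepSeek's actual implementation, and all names and dimensions are made up.

```python
import numpy as np

def causal_self_attention(x, w_q, w_k, w_v):
    """Single-head causal self-attention over a sequence x of shape (T, d).

    A triangular mask hides future positions, which is what makes
    decoder-only generation autoregressive.
    """
    T, d = x.shape
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(d)                   # (T, T) attention logits
    mask = np.triu(np.ones((T, T), dtype=bool), 1)  # True above the diagonal
    scores[mask] = -np.inf                          # hide future positions
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ v                              # (T, d) outputs

rng = np.random.default_rng(0)
T, d = 4, 8
x = rng.normal(size=(T, d))
w = [rng.normal(size=(d, d)) for _ in range(3)]
out = causal_self_attention(x, *w)
print(out.shape)  # one output vector per position: (4, 8)
```

Because position 0 can attend only to itself, its output is exactly its own value vector; production models stack many such layers with multiple heads.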
How It Works
DeepSeek leverages transformer-based deep learning architectures trained on vast datasets to generate human-like text and perform complex reasoning tasks. Each component of the model is optimized for efficiency, accuracy, and scalability across diverse applications.
- Transformer Architecture: Uses a decoder-only transformer with multi-head attention, enabling efficient processing of sequential data.
- Training Data: Trained on 2 trillion tokens from diverse internet sources, books, and technical documents to ensure broad knowledge coverage.
- Alignment: Chat variants of DeepSeek LLM were aligned with supervised fine-tuning and direct preference optimization (DPO); later models such as DeepSeek-R1 applied large-scale reinforcement learning to improve reasoning.
- Parameter Count: The base model contains 7 billion parameters, while the larger variant scales to 67 billion.
- Multilingual Support: Trained primarily on English and Chinese data, with usable coverage of other languages.
- Inference Optimization: Open weights allow quantization and other optimizations, so the smaller models can run on consumer-grade GPUs.
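Quantization, the inference optimization mentioned above, stores weights as low-bit integers plus a scale factor. Here is a toy NumPy sketch of symmetric per-tensor 8-bit quantization; it illustrates the general idea, not any DeepSeek-specific scheme.

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor int8 quantization: w ≈ scale * q."""
    scale = np.abs(w).max() / 127.0                 # map max magnitude to int8 range
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover an approximation of the original float weights."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(42)
w = rng.normal(scale=0.02, size=(256, 256)).astype(np.float32)

q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)

# int8 storage is 4x smaller than float32, at a small accuracy cost:
# the worst-case rounding error is half the quantization step (scale / 2)
print(q.nbytes, w.nbytes, np.abs(w - w_hat).max())
```

Real deployments typically quantize per-channel or per-group and may go down to 4 bits, trading a little accuracy for a large cut in memory and bandwidth.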
Comparison at a Glance
Below is a performance comparison of DeepSeek LLM 67B with other leading language models across key benchmarks (figures are approximate and depend on the evaluation setup).
| Model | MMLU Score | GSM8K Score | Context Length | Open Weights |
|---|---|---|---|---|
| DeepSeek LLM 67B | 71.3% | 63.4% | 4,096 tokens | Yes |
| GPT-3.5 | 71.2% | 78.1% | 16,384 tokens | No |
| Llama-2-70B | 68.9% | 56.8% | 4,096 tokens | Yes |
| PaLM 2 | 75.1% | 72.8% | 8,192 tokens | No |
| Falcon-40B | 65.3% | 60.2% | 8,192 tokens | Yes |
DeepSeek LLM 67B performs competitively with leading models of its generation on reasoning and knowledge tasks. Its open-weight release distinguishes it from closed models like GPT-3.5 and PaLM 2, promoting wider adoption in academic and enterprise settings.
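Benchmark scores like those in the table are typically exact-match accuracies: the model's final answer is extracted from its generated text and compared with a reference. Real evaluation harnesses are more involved, but the core idea can be sketched as follows; all example answers and scores here are made up for illustration.

```python
import re

def final_number(text):
    """Pull the last integer or decimal out of a model's answer text."""
    matches = re.findall(r"-?\d+(?:\.\d+)?", text.replace(",", ""))
    return matches[-1] if matches else None

def exact_match_accuracy(predictions, references):
    """Fraction of predictions whose final number matches the reference."""
    hits = sum(final_number(p) == str(r) for p, r in zip(predictions, references))
    return hits / len(references)

# hypothetical GSM8K-style model outputs, for illustration only
preds = [
    "She has 3 + 4 = 7 apples. The answer is 7.",
    "Total cost: 12 * 5 = 60 dollars, so 60.",
    "The answer is 19.",
]
refs = [7, 60, 18]
print(exact_match_accuracy(preds, refs))  # two of the three toy answers match
```

Published numbers also vary with prompt format and few-shot count, which is one reason the same model can score differently across reports.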
Why It Matters
DeepSeek's emergence highlights the growing strength of non-U.S. AI innovation, especially in China’s rapidly evolving tech ecosystem. Its open release model encourages transparency, collaboration, and faster iteration in AI development.
- Global AI Competition: DeepSeek strengthens China's position in the global race for AI dominance, challenging U.S. leadership.
- Open Research: By releasing model weights, DeepSeek enables reproducible research and community-driven improvements.
- Enterprise Applications: Used in customer support automation and document analysis by Chinese tech firms.
- Educational Tools: Integrated into AI tutoring systems for personalized learning experiences.
- Code Generation: Supports autocompletion and debugging in multiple programming languages with high accuracy.
- Cost Efficiency: Runs on lower-cost hardware due to optimization, making AI more accessible to SMEs.
As AI continues to evolve, models like DeepSeek demonstrate that innovation is no longer confined to a single region, fostering a more diverse and resilient technological future.