Tekmono
DeepSeek Unveils AI Model with Lower Inference Costs

by Tekmono Editorial Team
30/09/2025
in News

Researchers at DeepSeek have released a new experimental model, V3.2-exp, designed to significantly lower inference costs in long-context operations. The company announced the model in a post on Hugging Face, with an accompanying academic paper on GitHub.

The model’s key feature is DeepSeek Sparse Attention, a system that uses a “lightning indexer” to prioritize specific excerpts from the context window. A “fine-granular token selection system” then picks specific tokens from within those excerpts, and only those tokens are loaded into the module’s limited attention window. This lets the Sparse Attention model operate over long stretches of context with a comparatively small server load.
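The two-stage selection described above can be illustrated with a small sketch. This is not DeepSeek's implementation: the function names, the parameters (`chunk_size`, `top_chunks`, `top_tokens`), and the choice to score each excerpt by the mean of its key vectors are all assumptions made for illustration. It only shows the general idea of ranking excerpts cheaply, picking tokens within the best ones, and running attention over that reduced set.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def top_indices(scores, k):
    """Indices of the k largest scores, returned in ascending index order."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


def sparse_attention(query, keys, values, chunk_size, top_chunks, top_tokens):
    """Hypothetical two-stage sparse attention for a single query vector."""
    # Stage 1 (stand-in for an indexer): score each fixed-size excerpt
    # cheaply via the mean of its key vectors, and keep the best excerpts.
    chunks = [list(range(start, min(start + chunk_size, len(keys))))
              for start in range(0, len(keys), chunk_size)]
    chunk_scores = []
    for chunk in chunks:
        mean_key = [sum(keys[i][d] for i in chunk) / len(chunk)
                    for d in range(len(query))]
        chunk_scores.append(dot(query, mean_key))
    kept_chunks = top_indices(chunk_scores, top_chunks)

    # Stage 2 (stand-in for token selection): within the kept excerpts,
    # keep only the individual tokens that score highest against the query.
    candidates = [i for c in kept_chunks for i in chunks[c]]
    candidate_scores = [dot(query, keys[i]) for i in candidates]
    selected = [candidates[j] for j in top_indices(candidate_scores, top_tokens)]

    # Stage 3: ordinary scaled softmax attention, but only over the
    # selected tokens rather than the whole context.
    scale = math.sqrt(len(query))
    logits = [dot(query, keys[i]) / scale for i in selected]
    peak = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(logit - peak) for logit in logits]
    total = sum(weights)
    output = [sum(w / total * values[i][d] for w, i in zip(weights, selected))
              for d in range(len(values[0]))]
    return output, selected
```

In this toy version the final attention step touches at most `top_tokens` positions regardless of context length, which is the property that makes sparse approaches attractive for long inputs. How the real indexer scores excerpts, and what it costs to run, is described in DeepSeek's paper rather than here.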

DeepSeek’s preliminary testing indicates that the price of a simple API call can be reduced by as much as half in long-context operations. Further testing is needed to validate these claims, but because the model is open-weight and freely available on Hugging Face, third parties can evaluate the results presented in the paper for themselves.

DeepSeek’s new model is one of several recent breakthroughs targeting inference costs: the server expenses of running an already-trained AI model, as distinct from the cost of training it. The researchers set out to make the fundamental transformer architecture more efficient and found significant room for improvement.

Based in China, DeepSeek has been an unconventional player in the AI sector, particularly for those who view AI research as a nationalist competition between the U.S. and China. The company garnered attention earlier this year with its R1 model, which was trained primarily using reinforcement learning at a lower cost than its American competitors’ models. R1 did not spark the revolution in AI training that some predicted, and the new model’s release is unlikely to generate the same level of excitement. Even so, DeepSeek’s “sparse attention” approach could still teach U.S. providers useful strategies for keeping inference costs low.

Tekmono is a Linkmedya brand. © 2015.
