The launch of DeepSeek R1 has stunned the AI world, delivering the performance of OpenAI's frontier o1 model at a fraction of the price. Benchmark performance is on par across a range of domains, yet the chatbot platform is completely free, and its API prices tokens at one-fiftieth of the OpenAI API's rate. The US markets reacted by shaving a trillion dollars off the stock valuations of leading technology companies; NVIDIA, the primary supplier of high-end AI chips, lost 600 billion dollars, or 18% of its value. Most astonishing of all is the purported training cost of DeepSeek R1: a paltry 10 million dollars, roughly a tenth of the cost of GPT-4. There is much speculation about the hardware available to the DeepSeek team: whether they were able to import thousands of high-end NVIDIA chips, at a cost of a billion dollars, despite the US export ban to China, and whether ChatGPT could have been used to train DeepSeek R1.
Furthermore, the DeepSeek R1 model is open source, meaning it can be downloaded and run on a local machine. As would be expected, however, the hardware requirements are significant.
The most critical factor is the size of the model, which determines how much VRAM (GPU memory) or RAM (CPU memory) you need. Here's a rough estimate:
| Model size | Approximate VRAM (GPU) | Approximate RAM (CPU) |
|---|---|---|
| 7B parameters | 8–16 GB | 16–32 GB |
| 13B parameters | 16–24 GB | 32–64 GB |
| 30B parameters | 24–48 GB | 64–128 GB |
| 65B+ parameters | 48+ GB | 128+ GB |
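These figures follow a simple rule of thumb: weight memory is roughly the parameter count multiplied by the bytes per parameter (about 2 for FP16/BF16, 1 for 8-bit, 0.5 for 4-bit), plus some headroom for activations and the KV cache. A minimal sketch of the arithmetic, where the 20% overhead factor is an illustrative assumption:

```python
def estimate_memory_gb(params_billion: float, bytes_per_param: float,
                       overhead: float = 1.2) -> float:
    """Rough memory needed to hold model weights plus working state.

    params_billion:  parameter count in billions (e.g. 7 for a 7B model)
    bytes_per_param: 2.0 for FP16/BF16, 1.0 for 8-bit, 0.5 for 4-bit
    overhead:        illustrative ~20% allowance for activations / KV cache
    """
    return params_billion * bytes_per_param * overhead

# A 7B model: ~17 GB in FP16, but only ~4 GB once quantized to 4-bit.
for label, bytes_pp in [("FP16", 2.0), ("8-bit", 1.0), ("4-bit", 0.5)]:
    print(f"7B @ {label}: ~{estimate_memory_gb(7, bytes_pp):.1f} GB")
```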
For efficient inference you need a GPU with sufficient VRAM, and NVIDIA GPUs are preferred because of their CUDA support and compatibility with deep learning frameworks like PyTorch and TensorFlow. If you don't have a GPU, you can run smaller models on a CPU, but it will be much slower and you'll need significantly more RAM. Disk space matters too: model weights can take up a lot of space, so use an SSD for faster loading times.
If your hardware is limited, you can use quantized versions of the model (e.g., 4-bit or 8-bit quantization), which reduces the VRAM/RAM requirements significantly; tools such as llama.cpp or Hugging Face's bitsandbytes can help with quantization.
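As an illustration of the bitsandbytes route, here is a minimal sketch of loading a 4-bit quantized model with the Hugging Face libraries mentioned below; the checkpoint name and generation settings are example choices, not recommendations:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Example checkpoint: one of the smaller distilled R1 models on Hugging Face.
model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-7B"

# 4-bit quantization via bitsandbytes: weights are stored in 4-bit,
# while compute is carried out in half precision.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",  # place layers on GPU/CPU automatically
)

inputs = tokenizer("Why did NVIDIA's share price fall?", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Quantizing to 4-bit brings a 7B model from roughly 17 GB down to around 4–5 GB of VRAM, which is what makes consumer GPUs viable for local inference.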
If you don't have the required hardware, consider using cloud GPU services. On the software side, you'll need the usual Python stack: `transformers`, `accelerate`, `bitsandbytes`, etc.
The AI world has changed as a result of DeepSeek R1. It was never going to be unipolar anyway. Competition is good: it will drive down costs and increase the demand for AI. OpenAI has already responded by releasing the o3-mini model, which has set some new benchmarks.
The open-source development of DeepSeek may have contributed to its training efficiency, and it makes it more likely that powerful AI will be cheaply available everywhere in the future, for all but perhaps the frontier models.
The US has instigated the 500-billion-dollar Stargate project, which will see the pooling of hardware, data, compute time and the finest developers. The goal is clear: to achieve artificial general intelligence (AGI) or artificial superintelligence (ASI), seizing a once-in-a-lifetime opportunity to beat all rivals to superintelligence, after which further gains in AI performance would come through the AI itself.
The costs of further developments in AI are huge because of the scaling laws, which demand exponential effort for linear gains, as the sketch below illustrates. They suggest that achieving superintelligence would require something like 100,000 times as much compute. But DeepSeek may have shifted this paradigm by training a model with far less compute. It is also possible that the cost reduction is attributable to lower labour costs, or to training a less-than-frontier model. Either way, the feat is remarkable and has changed the AI industry.
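To put a number on "exponential effort for linear gains", here is a toy illustration assuming a Chinchilla-style power law in which loss falls as compute raised to a small negative exponent; the exponent of 0.05 is an assumed, purely illustrative value:

```python
# Toy scaling-law illustration: loss(C) = a * C**(-alpha).
# alpha = 0.05 is an assumed, purely illustrative exponent.
ALPHA = 0.05

def compute_multiplier(loss_reduction: float, alpha: float = ALPHA) -> float:
    """Factor by which compute must grow to cut loss by `loss_reduction`.

    From loss2 / loss1 = (C2 / C1) ** (-alpha):
        C2 / C1 = (1 / (1 - loss_reduction)) ** (1 / alpha)
    """
    return (1.0 / (1.0 - loss_reduction)) ** (1.0 / alpha)

for cut in (0.10, 0.25, 0.50):
    print(f"cutting loss by {cut:.0%} needs ~{compute_multiplier(cut):,.0f}x compute")
```

Under that assumption, halving the loss takes about a million times the compute, which is the order of magnitude behind figures like the 100,000x quoted above.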
It is not clear yet whether AGI is possible; of course, all the leading technology companies believe that it is. Current AI usage, set against the huge costs, indicates that companies are playing the long game, prepared to incur financial losses on the way to developing what may soon be available. This loss-leading exercise may also assist the development of AI through the insight gained from public interactions.
We may not have a clear grasp of what constitutes human intelligence. Impressive though chatbots are, having benefited from a volume of training that surpasses what any individual human has received, it is not clear whether they are more than speedy librarians, serving up a curated regurgitation of human ingenuity, always selecting the modal probabilistic route. That may not be the route taken when generating new knowledge. And as we provide ever greater datasets with ever greater compute, where does this really take the AI? The AI industry would have us believe it is somewhere beyond human intelligence.
Only time will tell whether the 500-billion-dollar Stargate project is a once-only opportunity to reach the AI singularity and achieve unrivalled power. It may turn out to be the pinnacle of hubris, a sci-fi Tower of Babel. Perhaps further scaling of large language models doesn't get us much further than where we are right now, and any frontier model may very well be replicated a few months later at a fraction of the cost. Doubtless AI will flood the labour markets, as we are only beginning to understand its potential applications.