Getting Started with DeepSeek LLM: A Hands-On Guide
· Introduction
· Installing Dependencies
· Running DeepSeek LLM on Google Colab
· Loading DeepSeek LLM
· Running Inference with Basic Text Generation
· Experimenting with Different Prompts
· Summarization Example
· Code Generation Example
· Creative Writing Example
· Next Steps
Introduction
DeepSeek LLM has emerged as a powerful open-source language model, giving researchers, developers, and businesses the freedom to fine-tune and experiment with it directly. In this guide, we walk step by step through setting up DeepSeek LLM: installing dependencies, running basic inference, and exploring different prompts to understand its capabilities.
If you prefer hands-on learning over theory, jump straight into the code with Colab!
1. Installing Dependencies
To begin, ensure you have the following installed:
- Python 3.8+
- PyTorch (CUDA-enabled for GPU acceleration)
- Hugging Face Transformers library
- accelerate for optimized inference
- Google Colab or a local machine with an NVIDIA GPU
Run the following command to install the necessary libraries:
pip install torch transformers accelerate
2. Running DeepSeek LLM on Google Colab
If you don’t have a powerful local GPU, you can run DeepSeek LLM on Google Colab with a free T4 GPU:
- Go to Google Colab.
- Select Runtime > Change runtime type > GPU.
- Run the installation commands above to set up the environment.
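Once the runtime is switched over, it's worth a quick sanity check that PyTorch can actually see the GPU:
import torch

print(torch.cuda.is_available())       # should print True
print(torch.cuda.get_device_name(0))   # e.g. "Tesla T4" on the free tier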
3. Loading DeepSeek LLM
DeepSeek LLM is available on Hugging Face and can be easily loaded with the transformers library.
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_name = "deepseek-ai/deepseek-llm-7b-base"
tokenizer = AutoTokenizer.from_pretrained(model_name)
# float16 halves the memory footprint so the 7B model fits on a 16 GB T4
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16, device_map="cuda")
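The snippet above loads the base (pretrained) model. DeepSeek also publishes an instruction-tuned sibling, deepseek-ai/deepseek-llm-7b-chat, which is the better choice if you want conversational answers. A minimal sketch, assuming its tokenizer ships the chat template described on the model card:
# Optional: the chat-tuned variant for conversational use
chat_name = "deepseek-ai/deepseek-llm-7b-chat"
chat_tokenizer = AutoTokenizer.from_pretrained(chat_name)
chat_model = AutoModelForCausalLM.from_pretrained(chat_name, torch_dtype=torch.float16, device_map="cuda")

# apply_chat_template wraps the turns in the prompt format the model was tuned on
messages = [{"role": "user", "content": "What is DeepSeek LLM?"}]
input_ids = chat_tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to("cuda")
outputs = chat_model.generate(input_ids, max_new_tokens=100)
print(chat_tokenizer.decode(outputs[0][input_ids.shape[1]:], skip_special_tokens=True))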
4. Running Inference with Basic Text Generation
Once the model is loaded, you can generate text using a simple inference script:
input_text = "What are the key advancements in DeepSeek LLM?"
inputs = tokenizer(input_text, return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_length=150)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
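Two things to know about generate: max_length counts the prompt tokens as well as the new ones (use max_new_tokens to bound only the completion), and decoding is greedy unless you enable sampling. A quick sketch with sampling turned on (the values are illustrative starting points, not tuned recommendations):
outputs = model.generate(
    **inputs,
    max_new_tokens=150,  # cap the completion length, excluding the prompt
    do_sample=True,      # sample instead of greedy decoding
    temperature=0.7,     # lower = more focused, higher = more diverse
    top_p=0.9,           # nucleus sampling cutoff
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))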
5. Experimenting with Different Prompts
To explore the model’s capabilities, try different types of prompts:
Summarization Example
input_text = "Summarize the importance of open-source AI models."
inputs = tokenizer(input_text, return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_length=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
Code Generation Example
input_text = "Write a Python function to calculate Fibonacci numbers."
inputs = tokenizer(input_text, return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_length=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
Creative Writing Example
input_text = "Compose a short science fiction story about AI and the future."
inputs = tokenizer(input_text, return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_length=300)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
By experimenting with various inputs, you can better understand how DeepSeek LLM handles different tasks and where it excels.
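Since each example repeats the same tokenize-generate-decode boilerplate, it's convenient to wrap it in a small helper (the function name and defaults here are my own, not part of any API):
def generate_text(prompt, max_new_tokens=150):
    """Run a prompt through DeepSeek LLM and return the decoded output."""
    inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

print(generate_text("Summarize the importance of open-source AI models."))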
A companion Google Colab notebook with all of the code from this guide is available to play with.
Next Steps
Now that we have DeepSeek LLM set up and have explored its basic capabilities, the next step is fine-tuning it for specific tasks. Stay tuned for the next article, where we’ll explore Supervised Fine-Tuning (SFT) and Direct Preference Optimization (DPO) to align DeepSeek with domain-specific requirements.
Stay on the cutting edge of AI! 🌟 Follow me on Medium, connect on LinkedIn, and explore the latest trends in AI technologies and models. Dive into the world of AI with me and discover new horizons! 📚💻