🤔 What is a Large Language Model (LLM)?

A Large Language Model is an artificial intelligence system trained on vast amounts of text data to understand and generate human-like text. Think of it as a sophisticated pattern recognition system that has "read" billions of pages of text and learned the statistical relationships between words and concepts.

Key Concepts:

  • Neural Networks: LLMs are built on artificial neural networks loosely inspired by the human brain, with billions of tunable connection weights (parameters)
  • Training: Models learn by processing massive datasets, adjusting their internal parameters to predict the next word in a sequence
  • Inference: When you ask a question, the model uses its learned patterns to generate a relevant response word by word

🔒 Why Run AI Locally?

  • Privacy: Your data never leaves your computer - no cloud servers, no logging
  • Cost: No API fees or subscription costs after initial setup
  • Customization: Full control over model behavior and system prompts
  • Offline: Works without internet connection
  • Learning: Understand how AI systems actually work

📊 Model Sizes and Parameters

The "size" of a model refers to the number of parameters (weights) it contains:

| Model Size | Parameters | RAM Required | Example Models |
|------------|------------|--------------|----------------|
| Tiny | 1-2B | 4-8 GB | tinyllama:1.1b, gemma:2b |
| Small | 3-7B | 8-16 GB | phi3:mini, llama3.2:3b, mistral:7b |
| Medium | 13-34B | 24-48 GB | llama2:13b, codellama:34b |
| Large | 70B+ | 80+ GB | llama3.1:70b, mixtral:8x7b |

More parameters ≠ Always better: Smaller, well-trained models can outperform larger ones for specific tasks.
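
A back-of-the-envelope way to see where these RAM figures come from: the weights take roughly parameters × bits-per-weight ÷ 8 bytes, plus runtime overhead for activations and context. A sketch assuming 4-bit quantization and ~30% overhead (both assumptions; actual usage varies with quantization level and context length):

# Rough RAM estimate for a quantized model (rule of thumb, not a guarantee)
def estimate_ram_gb(params_billion, bits_per_weight=4, overhead=1.3):
    weights_gb = params_billion * bits_per_weight / 8  # 1B params at 8 bit = 1 GB
    return weights_gb * overhead

for size in (1.1, 3, 7, 70):
    print(f"{size}B parameters ≈ {estimate_ram_gb(size):.1f} GB")
# e.g. 7B ≈ 4.6 GB, in line with the ~4-5 GB downloads of 7B models above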

⚙️ How Does Text Generation Work?

  1. Tokenization: Your input text is split into "tokens" (roughly words or word parts)
  2. Embedding: Each token is converted into a numerical vector
  3. Processing: These vectors flow through multiple neural network layers
  4. Prediction: The model calculates probability scores for the next token
  5. Sampling: A token is selected based on these probabilities (with some randomness for creativity)
  6. Repeat: Steps 3-5 continue until the response is complete

Example:

Input: "The capital of France is"
Model thinks: "Paris" (95%), "Lyon" (2%), "Marseille" (1%)...
Output: "Paris"
🎯 Temperature and Sampling

Temperature controls the randomness of responses:

  • Temperature 0.0: Deterministic, always picks the most likely token (good for factual answers)
  • Temperature 0.7: Balanced creativity and coherence (default for most tasks)
  • Temperature 1.0+: More creative/random (good for storytelling, brainstorming)

Top-K & Top-P Sampling: Additional techniques to control output quality by limiting which tokens can be selected.

🏋️ CPU vs GPU Processing

Why GPUs are faster:

  • Parallel Processing: GPUs have thousands of cores that can process many calculations simultaneously
  • Matrix Operations: AI models require massive matrix multiplications, which GPUs excel at
  • VRAM: GPU memory is faster than system RAM for neural network operations

CPU Processing:

  • Uses system RAM
  • Processes calculations sequentially (or with limited parallelism)
  • Works perfectly fine but slower (seconds vs milliseconds per token)

Practical Impact:

  • CPU: 3-10 tokens/second (small models)
  • GPU (8GB): 20-50 tokens/second
  • GPU (16GB+): 50-100+ tokens/second
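
Actual throughput is easy to measure: each Ollama response reports how many tokens were generated (eval_count) and how long generation took (eval_duration, in nanoseconds). A quick benchmark sketch:

import ollama

# One non-streamed generation; timing fields come back with the response
result = ollama.generate(model="tinyllama:1.1b", prompt="Explain what a token is.")
tps = result["eval_count"] / result["eval_duration"] * 1e9
print(f"{tps:.1f} tokens/second on this machine")
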
🎓 Common Misconceptions

"AI understands like humans do"
✅ AI recognizes patterns in text but doesn't "understand" meaning in a human sense

"Bigger models are always better"
✅ Smaller specialized models often perform better for specific tasks

"AI needs GPU to work"
✅ CPU-only operation is perfectly viable, just slower

"AI is always accurate"
✅ Models can "hallucinate" (generate plausible-sounding but incorrect information)

Ready to get started? Check the next tabs for installation and setup instructions!

Before You Begin

This guide provides an introduction to running AI models locally on your computer. You'll learn how to work with Large Language Models (LLMs) using tools like Ollama, Jupyter Notebooks, and Python virtual environments.

Running AI models locally offers several advantages: full control over your data, enhanced privacy, no dependency on internet connectivity, and no recurring cloud service costs. This approach is particularly valuable for academic work, research projects, and learning the fundamentals of AI implementation.

| Component | Minimal | Basic | Enthusiast |
|-----------|---------|-------|------------|
| RAM | 4-8 GB | 16 GB | 32 GB+ |
| GPU VRAM | Optional / CPU only | 8 GB | 16 GB+ |
| Storage | 10 GB free | 50 GB SSD | 1 TB+ SSD |
| CPU | 2+ Cores | 4+ Cores | 8+ Cores |
| Suitable Models | tinyllama:1.1b, gemma:2b | phi3:mini, llama3.2:3b, mistral:7b, codellama:7b | llama3.1:70b, mixtral:8x7b, command-r:35b |

Ollama serves as the primary interface for managing and running AI models locally. It simplifies the process of installing, configuring, and executing various language models. This guide demonstrates how to configure a development environment using Visual Studio Code (VS Code) and Jupyter Notebooks for interactive AI experimentation.

The following sections cover:

  • Software installation and configuration
  • Development environment setup with proper isolation using virtual environments
  • Initial model deployment and execution

Each step includes detailed instructions and code examples. Prior programming experience is helpful but not required, as all necessary commands and configurations are provided.

A Quick Look at the Hardware

The hardware requirements table provides recommended specifications for different use cases. Performance of AI models is primarily determined by available Random Access Memory (RAM) and, optionally, Graphics Processing Unit (GPU) capabilities.

  • RAM: The system's primary memory allocation directly affects which model sizes can be loaded and executed. Larger models require proportionally more RAM.
  • GPU: NVIDIA graphics cards with CUDA support can significantly accelerate inference times. GPU acceleration is optional and provides performance benefits but is not required for basic operation.
  • CPU-Only Operation: GPU hardware is not mandatory. Ollama functions on CPU-only systems with standard configurations. Smaller models (e.g., tinyllama:1.1b, phi3:mini, gemma:2b) operate efficiently on systems with 4-8GB RAM. Response generation is slower compared to GPU-accelerated setups but remains practical for learning and development purposes.

These specifications serve as guidelines rather than strict requirements. Entry-level hardware configurations are sufficient for experimentation with compact models and learning fundamental concepts. CUDA installation is only necessary for leveraging NVIDIA GPU acceleration and can be omitted for CPU-based workflows.
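
To check whether an NVIDIA GPU and driver are actually visible to the system (and therefore usable by Ollama), the standard test is:

nvidia-smi

If the command is missing or reports an error, the CPU-only path described above applies.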

🚀 Quick Start with A1-Terminal

Want to skip manual setup? The A1-Terminal project includes automatic installation scripts that handle everything for you!

The installation scripts automatically install:

  • Python 3.11+ (if not already present)
  • Ollama service and API
  • All required Python packages (customtkinter, ollama, PyYAML, requests, pyperclip)
  • Test model (tinyllama:1.1b) to get started immediately
  • ⚠️ CUDA must be installed manually (only needed for NVIDIA GPU acceleration)

Perfect for: Beginners who want a working setup immediately, or anyone who prefers using a modern GUI instead of command-line tools.

📖 See the "A1-Terminal" tab above for complete installation instructions and features.

📦 Required Software Components

🐍 Python 3.8+

  • Download from python.org
  • During installation: ✅ Check "Add Python to PATH"
  • Verify installation: python --version
  • Includes pip (Python package manager)

🦙 Ollama

  • Download from ollama.com
  • Runs as a background service on localhost:11434 (see the reachability check below)
  • Verify: ollama --version
  • Start service: ollama serve (automatic on Windows/macOS)
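
Because the service listens on localhost:11434, reachability can also be checked from Python; a small sketch using the requests package (also part of A1-Terminal's requirements):

import requests

# /api/tags lists locally installed models; a 200 response means Ollama is up
r = requests.get("http://localhost:11434/api/tags", timeout=5)
print(r.status_code, r.json())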

💻 Visual Studio Code

  • Download from code.visualstudio.com
  • Required Extensions (Ctrl+Shift+X):
      • Python (ms-python.python) - Python IntelliSense & debugging
      • Jupyter (ms-toolsai.jupyter) - Interactive notebooks

CUDA Toolkit (Optional)

  • Download from NVIDIA CUDA Downloads
  • Required only for GPU acceleration with NVIDIA graphics cards
  • Significantly improves inference speed
  • Skip if using CPU-only or AMD GPUs

🔧 Development Environment Setup

Step 1: Install VS Code Extensions

  1. Open VS Code
  2. Press Ctrl+Shift+X (Extensions)
  3. Search and install: Python and Jupyter

Step 2: Create Virtual Environment

  1. Open your project folder in VS Code
  2. Press Ctrl+Shift+P (Command Palette)
  3. Type: Python: Create Environment
  4. Select Venv
  5. Choose your Python interpreter

Why virtual environments? Isolates project dependencies, prevents version conflicts, and keeps your system Python clean.

Step 3: Install Jupyter Kernel

Open the integrated terminal (Ctrl+`) and run:

pip install ipykernel

This enables Jupyter notebooks to use your virtual environment.

Step 4: Verify Ollama Service

Check if Ollama is running:

ollama list

If not running, start it:

ollama serve

Step 5: Download Your First Model

ollama pull tinyllama:1.1b
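
Once the download finishes, the model can be tried directly from the same terminal (one-shot prompt shown; omit the quoted text for an interactive chat):

ollama run tinyllama:1.1b "Why is the sky blue?"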

tinyllama:1.1b ✅
Size: 600 MB | RAM: 4 GB
Use: Quick tests, learning basics
Speed: Very fast, CPU-friendly

💡 Quick Start: Begin with tinyllama:1.1b (included) or phi3:mini for the best balance. The full, categorized model catalog (ultra-lightweight, balanced, code specialists, German-optimized, system & tools) is in the "A1-Terminal" tab; download models there or via CLI: ollama pull model-name

✅ Verification Checklist
  • ✅ Python installed and in PATH: python --version
  • ✅ Ollama service running: ollama list
  • ✅ At least one model downloaded: ollama list
  • ✅ VS Code extensions installed: Python + Jupyter
  • ✅ Virtual environment created and activated
  • ✅ Jupyter kernel installed: pip list | grep ipykernel
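
To confirm the whole chain at once (virtual environment → packages → Ollama service → model), a short notebook cell like this sketch can be run; it assumes the ollama Python package is installed in the active environment:

import ollama

# Fails here if the Ollama service isn't running (start it with: ollama serve)
print(ollama.list())

# Round trip through the model itself
reply = ollama.generate(model="tinyllama:1.1b", prompt="Say hello in five words.")
print(reply["response"])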

🎯 Next Steps: Your environment is ready! Start experimenting with Jupyter notebooks or launch A1-Terminal for a full-featured chat interface.

A1-Terminal: H-Term for AI Models

🖥️ A1-Terminal v1.0
Desktop GUI application for local AI models. Features automatic installation, session management, and complete offline privacy. Perfect for beginners and local development.
🌐 Open WebUI
Web-based interface for local AI models with Docker support, multi-user capabilities, RAG functionality, and enterprise-grade features. Ideal for advanced users and team environments.

The installation script handles everything automatically - perfect for beginners or blank systems!

Windows Installation

# 1. Clone repository
git clone https://github.com/Nr44suessauer/A1-Terminal.git
cd A1-Terminal

# 2. Run as Administrator (Right-click → "Run as Administrator")
.\scripts\install.bat

What the install.bat script does:

  • Python Installation - Checks for Python 3.11+, downloads and installs if missing
  • pip Update - Updates Python package manager to latest version
  • Python Packages - Installs all required packages from requirements.txt
  • Ollama Installation - Downloads (~500 MB) and installs Ollama service
  • Test Model - Downloads tinyllama:1.1b (~600 MB) for immediate use

After installation:

cd a1_terminal_modular
.\start.bat

Linux/macOS Installation

# 1. Clone repository
git clone https://github.com/Nr44suessauer/A1-Terminal.git
cd A1-Terminal

# 2. Make executable and run
chmod +x scripts/install.sh
./scripts/install.sh

What the install.sh script does:

  • Python Installation - Uses system package manager (apt/dnf/yum/brew)
  • pip Update - Updates pip to latest version
  • Python Packages - Installs all dependencies
  • Ollama - Downloads and configures Ollama + test model

After installation:

cd a1_terminal_modular
./start.sh

⏱️ Installation takes 5-10 minutes. Everything is automatic!


🔧 Manual Installation (Advanced)

For full control over each installation step:

Windows Manual Installation

# 1. Install Ollama manually
# Visit https://ollama.com/download

# 2. Clone repository
git clone https://github.com/Nr44suessauer/A1-Terminal.git
cd A1-Terminal/a1_terminal_modular

# 3. Install Python dependencies
pip install -r requirements.txt

# 4. Download a model (optional)
ollama pull tinyllama:1.1b

# 5. Start application
python main.py
# Or use start script:
# .\start.bat

Linux/macOS Manual Installation

# 1. Install Ollama manually
# Visit https://ollama.com/download
# Or use: curl -fsSL https://ollama.com/install.sh | sh

# 2. Clone repository
git clone https://github.com/Nr44suessauer/A1-Terminal.git
cd A1-Terminal/a1_terminal_modular

# 3. Install Python dependencies
pip install -r requirements.txt

# 4. Download a model (optional)
ollama pull tinyllama:1.1b

# 5. Start application
python3 main.py
# Or use start script:
# ./start.sh

After installation, tinyllama:1.1b is ready. Browse models by category:

Ultra-Lightweight Models (1-2B)

Perfect for quick tests, learning, and low-resource systems. Runs smoothly on any CPU with 4-6 GB RAM.

tinyllama:1.1b ✅
Size: 600 MB | RAM: 4 GB
Use Case: Quick tests, learning basics, simple conversations
Speed: Very fast, CPU-friendly, instant responses
Status: Already installed and ready to use!

Size: 1.4 GB | RAM: 4-6 GB
Use Case: General purpose, chat, Q&A
Quality: Google's efficient model, great size/performance ratio
Best For: Upgrading from tinyllama while staying lightweight

Size: 930 MB | RAM: 4 GB
Use Case: Multilingual tasks, fast responses
Quality: Modern architecture, efficient and capable
Best For: Users needing multilingual support in tiny size

Size: 2.2 GB | RAM: 6 GB
Use Case: Advanced reasoning, multilingual support
Best For: Microsoft's latest compact model, great performance

Size: 1 GB | RAM: 4 GB
Use Case: Efficient inference, mobile-friendly
Best For: Edge devices, lightweight applications

Balanced Models (3-7B)

Best quality-to-size ratio. Great for most tasks with 8-12 GB RAM. Industry standard performance.

Size: 2.3 GB | RAM: 8 GB
Use Case: General chat, reasoning tasks, analysis
Quality: Microsoft model, excellent value for size
Best For: Most users upgrading from lightweight models

Size: 2 GB | RAM: 8 GB
Use Case: Latest Meta model, versatile for all tasks
Quality: Top tier performance in 3B class
Best For: Users wanting cutting-edge capabilities

Size: 4.1 GB | RAM: 12 GB
Use Case: High-quality responses, complex tasks
Quality: Industry standard, proven in production
Best For: Professional use, detailed analysis

Size: 4.7 GB | RAM: 12 GB
Use Case: Advanced reasoning, multilingual support
Quality: State-of-the-art performance, highly capable
Best For: Complex reasoning and multi-language projects

Size: 4.7 GB | RAM: 12 GB
Use Case: General purpose, tool calling, function use
Best For: Meta's latest with extended context (128K tokens)

Size: 5.4 GB | RAM: 14 GB
Use Case: Research, creative writing, detailed analysis
Best For: Google's powerful open model with high quality output

Code Specialist Models

Optimized for programming tasks: code generation, debugging, completion, and refactoring.

Size: 1.6 GB | RAM: 6 GB
Languages: Python, JavaScript, Java, C++, and more
Use Case: Lightweight code helper, quick snippets
Best For: Learning to code, simple automation scripts

Size: 3.8 GB | RAM: 12 GB
Languages: Python, Java, C++, JS, and more
Use Case: Code generation, debugging, documentation
Best For: Professional development, Meta's reliable coding model

Size: 4.7 GB | RAM: 12 GB
Languages: 92+ programming languages
Use Case: Advanced code tasks, architecture design
Best For: Top coding model in 7B class, complex projects

Size: 3.8 GB | RAM: 10 GB
Languages: Specialized for popular languages
Use Case: Code completion, refactoring, optimization
Best For: Development workflows, IDE integration

Size: 1.7 GB | RAM: 6 GB
Languages: 600+ programming languages
Use Case: Code completion, faster lightweight coding
Best For: Lightweight code assistant, quick suggestions

Size: 4.6 GB | RAM: 12 GB
Languages: 116 programming languages
Use Case: Enterprise coding, bug fixing, code explanation
Best For: IBM's enterprise-grade coding model

German Language Optimized

Models with excellent German language support for professional and casual German text generation.

Size: 2 GB | RAM: 8 GB
Languages: Multilingual with strong German support
Use Case: German chat, Q&A, content creation
Best For: Best 3B model for German language tasks

Size: 4.1 GB | RAM: 12 GB
Languages: Excellent German language capabilities
Use Case: Professional German text, business communication
Best For: High-quality German output, formal writing

Size: 4.8 GB | RAM: 12 GB
Languages: 101 languages including German
Use Case: Multilingual projects, German + other languages
Best For: Specialized multilingual model with German focus

Size: 5.4 GB | RAM: 14 GB
Languages: Strong German language capabilities
Use Case: High-quality German content, formal writing
Best For: Professional German text with Google quality

System & Tools

Specialized models for system administration, CLI workflows, scripting, and technical problem-solving.

Size: 4.1 GB | RAM: 12 GB
Use Case: Command generation, CLI assistance, system tasks
Best For: Terminal workflows, Bash/PowerShell scripting

Size: 7 GB | RAM: 16 GB
Use Case: Complex system analysis, advanced troubleshooting
Best For: DevOps, infrastructure management, logs analysis

Size: 6.4 GB | RAM: 14 GB
Use Case: Function calling, structured outputs, tool use
Best For: API integration, automation scripts, agents

Size: 4.1 GB | RAM: 12 GB
Use Case: Technical support, IT troubleshooting
Best For: Fast technical assistance, problem diagnosis
💡 Quick Start: Begin with tinyllama:1.1b (included) or phi3:mini for best balance. Download models in A1-Terminal's "Models" tab or via CLI: ollama pull model-name

Project Structure
A1-Terminal/
├── scripts/
│   ├── install.bat          # Windows auto-installer
│   └── install.sh           # Linux/macOS auto-installer
├── start.bat                # Quick start (from root)
└── a1_terminal_modular/
    ├── main.py                  # Entry point
    ├── start.bat                # Windows start
    ├── requirements.txt         # Dependencies
    ├── a1_terminal_config.yaml  # Config
    ├── sessions/                # Saved chats
    └── src/
        ├── core/
        │   ├── a1_terminal.py     # Main app
        │   └── ollama_manager.py  # API client
        └── ui/
            ├── ultimate_ui.py     # Modern UI
            ├── chat_bubble.py     # Messages
            ├── session_card.py    # Session list
            ├── model_selector.py  # Model picker
            └── color_wheel.py     # Color picker

Configuration

Auto-created a1_terminal_config.yaml:

# Colors
user_bg_color: "#003300"
user_text_color: "#00FF00"
ai_bg_color: "#1E3A5F"
ai_text_color: "white"

# Fonts
user_font: "Courier New"
ai_font: "Consolas"

# UI
ui_window_width: 1400
ui_window_height: 900

# Options
show_system_messages: true
auto_scroll_chat: true
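
The file is plain YAML, so it can also be edited by hand or read programmatically; a minimal sketch using PyYAML (installed via requirements.txt):

import yaml

# Load the auto-created config; keys mirror the sample above
with open("a1_terminal_config.yaml", encoding="utf-8") as f:
    config = yaml.safe_load(f)
print(config["ui_window_width"], config["user_font"])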

Troubleshooting

Ollama Not Running:

ollama list      # Check status
ollama serve     # Start manually

App Won't Start:

pip install -r requirements.txt --upgrade
python --version  # Needs 3.8+

Model Download Failed:

  • Check internet connection
  • Try: ollama pull <model_name>
  • Check disk space
  • Verify Ollama is running (see the version check below)
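
A model-independent way to confirm the service itself is reachable is the version endpoint:

curl http://localhost:11434/api/version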

Documentation & Support

Ready to start? Run the installation script!

A1-Manual: Complete User Guide

🎯 Key Features Illustrated
📸 Visual Walkthrough

🖥️ Console Initial Prompt

[Screenshot: Console Initial Prompt]

The console provides real-time process monitoring through log output; the screenshot above shows the ideal workflow.

🎯 Console Features:
  • Process Overview: Real-time log outputs show system status and operations
  • Error Detection: Errors appear in the log output and stand out as deviations from the ideal workflow shown above
  • Model Status: Terminal displays whether the AI model is currently processing
  • System Monitoring: Complete visibility into application state and performance

🔄 Model Switching in Session

[Screenshot: Model Change in Session]

Switch between different AI models mid-conversation to compare responses and capabilities.

🎯 Key Features:
  • Multi-Session Support: Switch between multiple sessions with preserved content
  • Model Flexibility: Change models between or within sessions
  • Context Preservation: Automatic context retention when reopening sessions
  • Smart Management: Ollama handles conversation context automatically (see the sketch after this list)
  • Auto-Save: Sessions automatically saved after AI responses
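
Conceptually, switching models works because the full message history is sent with each request; replaying the same history against another model is all it takes. A rough sketch with the Python client (model names are examples from this guide):

import ollama

# Start a conversation on one model
history = [{"role": "user", "content": "Summarize the plot of Hamlet."}]
first = ollama.chat(model="tinyllama:1.1b", messages=history)

# Keep the context and continue on a different model
history.append({"role": "assistant", "content": first["message"]["content"]})
history.append({"role": "user", "content": "Now give it as three bullet points."})
second = ollama.chat(model="phi3:mini", messages=history)
print(second["message"]["content"])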

💬 Session Management

[Screenshot: Session Example - Astronaut]

🎯 Session Features:
  • Visual Session Identification: The chat window frame is colored in the session color
  • Customizable Colors & Names: Color and name are set using the gear icon next to the session
  • BIAS Settings: The session BIAS setting is located in the bottom left; if no BIAS is set, this field is empty
  • JSON Storage: Sessions are stored in JSON format in the sessions folder, which opens when clicking the session-folder button
  • Session Preservation: All your conversations are preserved and can be restored anytime

🎭 Professional BIAS System

[Screenshot: Professional BIAS Configuration]

Configure system prompts (BIAS) to define AI behavior and personality for each session.

🎯 BIAS System Features:
  • Professional Conversations: BIAS enables specialized technical discussions with AI - response quality depends on the model and hardware used
  • Basic Queries: Fundamental questions can be easily asked and answered
  • Critical Verification: Always remain skeptical and verify the received answers - AI responses should be fact-checked
  • Session-Specific: Each session can have its own BIAS configuration for targeted conversations (see the sketch below)
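
Under the hood, a BIAS corresponds to a system prompt sent along with the conversation. With the Ollama Python client this looks roughly like the following (the BIAS text is illustrative):

import ollama

# The "system" message plays the role of the session BIAS
messages = [
    {"role": "system", "content": "You are a concise embedded-systems expert."},
    {"role": "user", "content": "Explain what a watchdog timer does."},
]
reply = ollama.chat(model="tinyllama:1.1b", messages=messages)
print(reply["message"]["content"])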

⚙️ Configuration Screen

[Screenshot: Configuration Screen]

Customize your A1-Terminal experience with colors, fonts, and UI preferences in the configuration screen.

🎯 Configuration Options:
  • Console Appearance: Adjust colors, shape, and size of the console interface
  • Debug/System Outputs: Toggle debug and system message visibility on/off
  • Auto-Scroll: Enable/disable automatic scrolling to the latest message
  • Apply Changes: Clicking "Apply" triggers a restart script that closes and reopens the software automatically

📁 Session Folder Structure (JSON)

[Screenshot: Session JSON Storage]

🔒 Privacy & Data Protection:
  • Local Session Logs: Complete conversation histories for retrieval, sharing, and analysis
  • Valuable Personal Data: These datasets reveal thinking patterns and personality traits - highly valuable to platforms like OpenAI
  • Full Local Control: Your conversations remain on your machine, never sent to external servers
  • Data Sovereignty: You decide what happens with your data - share, analyze, or keep private
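
For illustration only - the actual schema is defined by A1-Terminal and the field names here are hypothetical - a saved session file is JSON along these lines:

{
  "_note": "illustrative example - A1-Terminal's real schema may differ",
  "name": "Astronaut",
  "color": "#1E3A5F",
  "bias": "You are a helpful assistant.",
  "messages": [
    {"role": "user", "content": "Hello!"},
    {"role": "assistant", "content": "Hi! How can I help?"}
  ]
}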

📥 Download and Install

[Screenshot: Download and Install]

The model download interface allows you to browse and install AI models directly from the application.

🎯 Download Features:
  • Model Browser: Browse available models categorized by size and performance
  • Progress Tracking: Real-time download progress with speed indicators
  • Space Requirements: Clear disk space and RAM requirements for each model
  • Model Folder Access: Direct access to the local model storage folder for file management
  • One-Click Install: Simple installation process directly from the interface

🎨 Color Wheel & Session Settings

[Screenshot: Color Wheel Interface]

Access this customization window by clicking the gear icon next to any session in the session list.

🎯 Session Customization Features:
  • Session Name: Set or change the session name in the text field at the top
  • Visual Color Picker: Intuitive color wheel interface for precise color selection
  • Real-Time Preview: See color changes instantly as you adjust the wheel
  • Session Theming: Customize individual session colors for easy identification
  • Persistent Settings: Both color and name choices are saved and restored when reopening sessions