LLMs explained: You need attention, a cat, and not to stare

A guide to Large Language Models

Large Language Models (LLMs) are a hot topic, making headlines and sparking a myriad of conversations. But what exactly are they, and how do they work? This guide takes you from the basics of AI to the intricate workings of LLMs, step by step.

<aside> 💡

I added a few questions to each section of the article to help you recall what you read. Active recall is an essential step in committing new knowledge to memory.

</aside>

To understand Large Language Models, let's begin with the fundamental concept of AI.

1. The foundations of AI

At its core, artificial intelligence refers to machines designed to perform tasks that typically require human intelligence. These tasks include learning from experience, recognizing patterns, understanding language, and making decisions.

Machine Learning: The heart of AI

Machine learning (ML) is a subset of AI that enables computers to learn from data without being explicitly programmed. Instead of following hard-coded rules, ML models identify patterns in data to make predictions or decisions.

Here is how they differ:

Hard-coded rules example: If the temperature is > 25°C, then turn on the air conditioning. This is a simple, predefined rule (if/then statements are typical) that doesn't learn or adapt.

Pattern recognition example: A spam filter learns to identify spam emails by analyzing patterns in subject lines, sender information, and content, improving its accuracy over time as it processes more emails.

The key difference is that machine learning models, like the spam filter, can identify patterns and make decisions without being explicitly programmed with rules such as the if/then one above.
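To make the contrast concrete, here is a minimal Python sketch of both approaches. It is an illustration only, not how real spam filters are built: the function names, the word-counting method, and the four training messages are all made up for this example.

```python
from collections import Counter

# Hard-coded rule: the programmer fixes the threshold, and it never changes.
def should_turn_on_ac(temperature_c: float) -> bool:
    return temperature_c > 25

# Learned behaviour: nothing about "free" or "prize" is written into the code.
# The filter only knows what it counted in the labeled examples.
def train_spam_filter(examples):
    """Count how often each word appears in spam and in non-spam ("ham") messages."""
    counts = {"spam": Counter(), "ham": Counter()}
    for text, label in examples:
        counts[label].update(text.lower().split())
    return counts

def is_spam(counts, text):
    """Flag a message when its words were seen more often in spam than in ham."""
    words = text.lower().split()
    spam_score = sum(counts["spam"][word] for word in words)
    ham_score = sum(counts["ham"][word] for word in words)
    return spam_score > ham_score

# Hypothetical toy data, invented for illustration only.
training_examples = [
    ("win a free prize now", "spam"),
    ("free money click now", "spam"),
    ("meeting moved to monday", "ham"),
    ("lunch on monday", "ham"),
]

model = train_spam_filter(training_examples)

print(should_turn_on_ac(30))                     # the rule decides
print(is_spam(model, "claim your free prize"))   # the learned counts decide
print(is_spam(model, "see you at the meeting"))
```

To change the air conditioning rule, someone has to edit the code. To change the spam filter, you only add more labeled examples and train again, which is the sense in which a model "learns from data".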

Quick test

Define Artificial Intelligence in simple terms.