Wibbit AI Literacy Standards Framework

A concept sequence, competency definitions, assessment criteria, and pedagogical principles for AI literacy for ages 8 and older. Updated to include Course 3 concepts covering computer vision, generative AI, multimodal systems, and synthetic media.

Published May 6, 2026

Version: v0.2 (May 2026)
Status: Published
Prepared by: Wibbit Standards Cartographer
Companion documents: Standards Landscape Map, Curriculum Alignment Report


Preamble

The international AI literacy standards landscape is actively forming. Major frameworks — AI4K12's Five Big Ideas (2020–2022), UNESCO's AI Competency Framework for School Students (2024), the AILit/OECD review draft (May 2025), and the CSTA + AI4K12 AI Learning Priorities (2025) — share broad commitments: students should understand how AI systems work, why they fail, and how they affect society. None of these frameworks, however, specifies a coherent concept sequence for the 10-to-13 age range grounded in the AI systems children actually use today. That gap is the motivation for this framework.

This document proposes a concept sequence, competency definitions, assessment criteria, and pedagogical principles specifically designed for large language model–era AI literacy at the middle school level (approximately grades 5–8, ages 10–13). It is intended to be complementary to — not a replacement for — the frameworks above. Citations to those frameworks appear throughout; divergences from them are named and explained.

This framework is informed by Wibbit's curriculum design and delivery experience across Courses 1–3, validated against the Standards Landscape Map and Curriculum Alignment Report. Where Wibbit's curriculum makes choices that diverge from external frameworks, those divergences are surfaced explicitly.

v0.2 disclosure: This version extends v0.1 with five new concepts from Course 3 (computer vision, CNN feature extraction, generative AI, synthetic media provenance, and multimodal embedding spaces). Competency definitions for the four highest-priority new concepts are included. Sections on assessment criteria and pedagogical principles are unchanged from v0.1 and apply to all concepts in the sequence.


Section 1: Purpose and Scope

1.1 What this framework is for

This framework defines what AI literacy means for students approximately 10 to 13 years old — the middle school years in the US system, approximately grades 5 through 8. It specifies:

  • A concept sequence: the canonical progression of AI literacy concepts, each defined as an atomic unit of understanding that students can demonstrate, build on, and connect to adjacent concepts.
  • Competency definitions: observable behavioral descriptions of what it looks like when a student has reached Introductory, Intermediate, or Proficient mastery of each concept — anchored to what students do, not what they can recite.
  • Assessment criteria: framework-neutral criteria that any educator, using any instructional approach, can apply to evaluate student competency at each level.
  • Pedagogical principles: the teaching principles that research and practice suggest are effective for this age range and subject matter.

1.2 Who this framework is for

The primary audiences are curriculum designers, classroom educators, and school administrators making decisions about AI literacy programs for students in the 10-to-13 age range. Secondary audiences include researchers studying K-12 AI literacy outcomes and policymakers designing AI education mandates at the state or district level.

This framework is not a school policy document. It does not address teacher training requirements, infrastructure requirements, device access, or school AI use policies. Those are complementary considerations; this framework addresses instructional content and learning outcomes only.

1.3 Age scope and rationale

The 10-to-13 range is chosen deliberately. Developmental research on abstract reasoning (Piaget, 1952; Inhelder & Piaget, 1958) indicates that students in this range are entering or consolidating the formal operational stage, becoming capable of hypothetico-deductive reasoning, probabilistic thinking, and reasoning about systems that operate through principles not directly visible. These capacities are exactly what AI literacy requires: reasoning about how probabilistic systems behave, understanding why training data shapes outputs, and evaluating AI systems against ethical criteria.

UNESCO's 2024 AI Competency Framework for School Students assigns its most cognitively demanding competencies — critical evaluation of AI outputs, ethical reasoning about AI design choices — to "upper secondary" students. This framework argues that with appropriate scaffolding (see Section 5), these competencies are achievable in the 10-to-13 range, 3–4 years earlier than UNESCO's implicit expectation. The scaffolding principle at stake is experience before label (see Section 5, Principle 1): concepts that appear cognitively demanding when presented abstractly become accessible when grounded in prior interactive experience.

Research note (v0.1): The developmental readiness claim above draws primarily on Piaget (1952) and UNESCO (2024). Subsequent revisions will incorporate additional cognitive development literature, including work on scientific reasoning in early adolescence (Kuhn, 2006) and the design of scaffolded inquiry environments (Zimmerman & Klahr, 2000).

1.4 What this framework does not claim

  • This framework does not claim Wibbit's product fully implements every concept and competency defined here. The Curriculum Alignment Report documents current coverage and gaps with specificity.
  • This framework is not a comprehensive AI education framework. It is specifically an AI literacy framework for the 10-to-13 age range, focused on understanding AI systems as they exist today — particularly large language models, computer vision systems, and generative AI.
  • This framework does not address how to teach AI as a vocational or engineering skill. Its focus is literacy: the capacity to understand, critically evaluate, and be an informed citizen in a world shaped by AI systems.

Section 2: Concept Sequence

2.1 Organizing principles

The concept sequence is organized as a dependency graph. Each concept has defined prerequisites; no concept is introduced until its prerequisites are established. This sequencing principle is grounded in cognitive load theory (Sweller, 1988; Sweller, van Merriënboer & Paas, 1998) and in the empirical observation — consistent across AI4K12's grade-band progressions, UNESCO's competency framework, and Wibbit's curriculum evidence — that abstract AI concepts require concrete prior experience to be durable.

The sequence is organized in four tiers:

  • Tier 1 — Foundation: Accessible at first encounter; require no AI-specific prerequisites.
  • Tier 2 — Mechanism: Require one or more Tier 1 concepts as active prerequisites.
  • Tier 3 — Critical Use: Require Tier 2 concepts; most directly connected to safe and responsible use of AI tools.
  • Tier 4 — Architecture and Systems: Appropriate for the upper end of the 10–13 range or as extension material; require Tier 2 prerequisites.

External-framework alignment is noted for each concept where applicable.
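
As an illustration of the dependency-graph idea, the prerequisites listed under each concept in Sections 2.2 through 2.5 can be written down as a small graph and checked mechanically. The sketch below is not part of the framework itself: the `PREREQS` dictionary is transcribed from those prerequisite lists, and the function name `teaching_order` is made up for this example. It uses Python's standard-library `graphlib` module to produce one valid teaching order and would raise an error if the prerequisites were circular.

```python
from graphlib import TopologicalSorter

# Prerequisites as listed under each concept in Sections 2.2-2.5.
PREREQS = {
    "1.1": [], "1.2": [], "1.3": [],
    "2.1": ["1.1", "1.2", "1.3"],
    "2.2": ["2.1"],
    "2.3": ["1.1"],
    "2.4": ["2.3"],
    "2.5": ["1.1", "2.3"],
    "2.6": ["2.5", "2.1"],
    "3.1": ["1.2", "2.1"],
    "3.2": ["2.1", "2.2"],
    "3.3": ["1.2", "3.1"],
    "3.4": ["2.3", "2.1", "2.6"],
    "3.5": ["3.4", "3.2", "4.4"],
    "4.1": ["2.1", "2.3"],
    "4.2": ["2.1"],
    "4.3": ["2.1"],
    "4.4": ["1.3", "2.1"],
    "4.5": ["2.3", "2.4", "2.5"],
}

def teaching_order(prereqs):
    """Return one order in which every concept follows its prerequisites.

    Raises graphlib.CycleError if the prerequisites are circular.
    """
    return list(TopologicalSorter(prereqs).static_order())

order = teaching_order(PREREQS)
position = {concept: i for i, concept in enumerate(order)}

# Sanity check: every prerequisite appears before the concept that needs it.
assert all(position[p] < position[c] for c, ps in PREREQS.items() for p in ps)
print(order)
```

One thing a check like this makes visible: Concept 3.5 lists Concept 4.4 as a prerequisite, so any valid order teaches that Tier 4 concept before this Tier 3 concept.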


2.2 Tier 1 — Foundation Concepts

Concept 1.1 — Tokens and symbol representation

AI systems process text not as words but as sub-word units (tokens). Each token is mapped to a numeric identifier. The vocabulary of tokens — the complete set of sub-word units a model can recognize — is fixed at training time.

  • AI4K12: Big Idea 2 (Representation & Reasoning) — representing data in computational systems, grades 6–8.
  • CSTA: 2-DA-07 (Represent data using multiple encoding schemes).
  • Literature: Sennrich et al. (2016) — byte-pair encoding; Kudo & Richardson (2018) — SentencePiece.
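
To make the idea of sub-word tokens concrete, here is a minimal sketch of a tokenizer with a fixed vocabulary. It is a simplification: it uses greedy longest-match segmentation over a tiny hand-written vocabulary, whereas real tokenizers such as byte-pair encoding or SentencePiece learn their vocabularies from data. The vocabulary entries, the IDs, and the function names `tokenize` and `encode` are all made up for illustration.

```python
# Toy vocabulary, fixed in advance (a real model's vocabulary is fixed at training time).
# Entries and IDs are invented for this example.
VOCAB = {"un": 0, "believ": 1, "able": 2, "the": 3, "cat": 4, "s": 5, " ": 6, "<unk>": 7}

def tokenize(text, vocab=VOCAB):
    """Greedy longest-match segmentation of text into sub-word tokens."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest possible piece first, then progressively shorter ones.
        for j in range(len(text), i, -1):
            piece = text[i:j]
            if piece in vocab:
                tokens.append(piece)
                i = j
                break
        else:
            # Nothing in the vocabulary matches: emit the unknown token for one character.
            tokens.append("<unk>")
            i += 1
    return tokens

def encode(text, vocab=VOCAB):
    """Map each token to its numeric identifier."""
    return [vocab[t] for t in tokenize(text, vocab)]

print(tokenize("unbelievable"))  # ['un', 'believ', 'able']
print(encode("the cats"))        # [3, 6, 4, 5]
```

The model never sees the word "unbelievable" as a unit; it sees three IDs. A word with no matching pieces falls back to `<unk>`, which is one reason unusual text can lead to less predictable behavior.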

Concept 1.2 — Probabilistic generation

Language models generate text by calculating a probability distribution over the vocabulary at each position and sampling from that distribution. Output is probabilistic, not deterministic; the same input can produce different outputs. Temperature controls the sharpness of the sampling distribution.

  • AI4K12: Big Idea 3 (Learning) — model output as probabilistic; Big Idea 2 — probability as a representation of model state.
  • CSTA: CSTA + AI4K12 AI Learning Priorities (2025) — Category 3 (Machine Learning), grade 6–8 outcome on model output as probabilistic.
  • UNESCO: AI Competency Framework (2024) — "AI Fundamentals" competency area.
  • Literature: Shannon (1948) — language as probability distribution; Brown et al. (2020) — GPT-3 and few-shot prompting.
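
The effect of temperature can be sketched in a few lines. The example below assumes the common formulation in which raw scores are divided by the temperature before a softmax; the four-word vocabulary and its scores are invented, and the function names are illustrative rather than taken from any particular library.

```python
import math
import random

def softmax_with_temperature(scores, temperature=1.0):
    """Turn raw scores into probabilities; lower temperature sharpens the distribution."""
    scaled = [s / temperature for s in scores]
    top = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_next_token(vocab, scores, temperature=1.0, rng=random):
    """Sample one token according to the temperature-adjusted probabilities."""
    probs = softmax_with_temperature(scores, temperature)
    return rng.choices(vocab, weights=probs, k=1)[0]

# Made-up scores for the next token after "The sky is".
vocab = ["blue", "clear", "falling", "green"]
scores = [3.0, 2.0, 0.5, 0.1]

# Low temperature concentrates probability on "blue"; high temperature spreads it out.
for t in (0.2, 1.0, 2.0):
    print(t, [round(p, 3) for p in softmax_with_temperature(scores, t)])

# The same input can yield different outputs because each token is sampled.
rng = random.Random(0)
print([sample_next_token(vocab, scores, 1.0, rng) for _ in range(5)])
```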

Concept 1.3 — AI as pattern-matching, not understanding

[Novel relative to external frameworks — see note.]

A language model generates plausible-sounding text by learning statistical patterns in training data; it does not possess beliefs, intentions, understanding, or goals. Outputs that appear to reflect reasoning are the product of pattern completion, not cognition. This concept is foundational because it corrects the most common and persistent misconception students bring to AI literacy, and because it undergirds accurate interpretation of AI behavior throughout the curriculum.

  • CSTA: CSTA + AI4K12 AI Learning Priorities (2025) — Category 1 (Humans and AI), grade 6–8 outcome on distinguishing AI behavior that "appears intentional" from AI behavior that results from pattern-matching. Wibbit's placement of this concept in Tier 1 — rather than as a late-stage critical thinking module — is a deliberate pedagogical choice: misconceptions established early are harder to displace once further knowledge is built on them.
  • Literature: Mitchell (2021) — Artificial Intelligence: A Guide for Thinking Humans; Marcus & Davis (2019) — Rebooting AI; Bender et al. (2021) — "On the Dangers of Stochastic Parrots."

2.3 Tier 2 — Mechanism Concepts

Concept 2.1 — Training and the supervised learning loop

AI models learn by repeated cycles of prediction, comparison to a target (loss measurement), and parameter adjustment (gradient descent). Supervised learning requires labeled examples — human-annotated data that tells the model what the correct output is. The scale of training (billions of parameters, trillions of tokens) is qualitatively distinct from human learning and produces qualitatively distinct model capabilities.

  • Prerequisites: 1.1, 1.2, 1.3
  • AI4K12: Big Idea 3 (Learning) — the core Big Idea 3 concept at grades 6–8 and beyond.
  • CSTA: CSTA + AI4K12 AI Learning Priorities (2025) — Category 3 (Machine Learning), primary coverage.
  • UNESCO: AI Competency Framework (2024) — "AI Fundamentals."
  • AILit/OECD: "Engaging with AI" — understanding AI system design.
  • Literature: Rumelhart, Hinton & Williams (1986) — backpropagation; Mitchell (1997) — machine learning.
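
The predict, compare, adjust cycle can be shown with a model that has a single parameter. This is a minimal sketch under simplifying assumptions: four hand-made labeled examples following the rule y = 2x, a mean-squared-error loss, and plain gradient descent. Real training differs enormously in scale, but the loop has the same shape.

```python
# Labeled examples: (input, correct output). The hidden rule is y = 2x.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]

def mean_squared_loss(w, examples):
    """Average squared gap between the predictions w*x and the labeled targets."""
    return sum((w * x - y) ** 2 for x, y in examples) / len(examples)

def train(examples, steps=200, learning_rate=0.01):
    w = 0.0  # the single parameter, starting from a poor guess
    history = []
    for _ in range(steps):
        # Gradient of the loss with respect to w: mean of 2 * (w*x - y) * x.
        grad = sum(2 * (w * x - y) * x for x, y in examples) / len(examples)
        w -= learning_rate * grad  # small step in the direction that lowers the loss
        history.append(mean_squared_loss(w, examples))
    return w, history

w, history = train(data)
print(round(w, 3), history[0], history[-1])
```

The learned value of `w` ends up very close to 2, and the recorded loss shrinks from the first step to the last.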

Concept 2.2 — Human feedback in training (RLHF)

[Novel relative to all major external frameworks — see note.]

Modern AI systems incorporate human feedback as a training signal, not only labeled data. Reinforcement Learning from Human Feedback (RLHF) is a training stage in which human raters evaluate model outputs; those ratings shape model behavior. This concept is important because it explains why AI systems behave in ways that reflect human value judgments — including human biases — and not only statistical patterns in raw text.

  • Prerequisites: 2.1
  • External-framework note: No current K-12 AI literacy framework includes RLHF as a learning objective. The inclusion rationale: RLHF is sufficiently important to understanding why modern AI systems behave as they do — why they appear helpful, why they reflect evaluator biases — that AI literacy without it is incomplete for the current generation of AI tools. Research note (v0.1): This rationale is pending further review.
  • Literature: Christiano et al. (2017) — RLHF; Ouyang et al. (2022) — InstructGPT.

Concept 2.3 — Representations and embeddings

AI systems encode tokens and concepts as vectors in a high-dimensional space. Semantic relationships — synonymy, category membership, analogical relationships — are preserved as geometric relationships in that space. Embeddings are the bridge between token representation (Concept 1.1) and context-sensitive computation (Concept 2.4).

  • Prerequisites: 1.1
  • AI4K12: Big Idea 2 (Representation & Reasoning) — representing knowledge in ways that enable computation.
  • CSTA: CSTA + AI4K12 AI Learning Priorities (2025) — Category 2 (Representation and Reasoning).
  • Literature: Mikolov et al. (2013) — Word2Vec; Vaswani et al. (2017) — transformer input representations.
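
A small sketch shows how geometric closeness can stand in for semantic closeness. The three-dimensional vectors below are written by hand purely for illustration; real embeddings are learned during training and have hundreds or thousands of dimensions. Cosine similarity is used here as one common way to compare directions of vectors.

```python
import math

# Hand-made vectors for illustration only; real embeddings are learned, not written by hand.
EMBEDDINGS = {
    "cat":    [0.90, 0.80, 0.10],
    "kitten": [0.85, 0.90, 0.15],
    "car":    [0.10, 0.20, 0.95],
}

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nearest(word, embeddings=EMBEDDINGS):
    """Return the other word whose vector points in the most similar direction."""
    others = (w for w in embeddings if w != word)
    return max(others, key=lambda w: cosine_similarity(embeddings[word], embeddings[w]))

print(nearest("cat"))  # kitten
print(round(cosine_similarity(EMBEDDINGS["cat"], EMBEDDINGS["car"]), 3))
```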

Concept 2.4 — Attention and context

The attention mechanism allows each token to be interpreted in light of other tokens in the same context. Each attention weight represents how strongly one of those other tokens influences the interpretation of the current token. This mechanism explains why transformer-based AI can handle long-range dependencies in language: for example, why a pronoun at position 400 can be correctly resolved to a noun at position 3.

  • Prerequisites: 2.3
  • AI4K12: Big Idea 2 (Representation & Reasoning) — world model construction; Big Idea 4 (Natural Interaction) — how language models process natural language.
  • Literature: Vaswani et al. (2017) — "Attention Is All You Need."
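
The following sketch computes attention weights for a single token, using the scaled dot-product form from Vaswani et al. (2017). It is heavily simplified: the two-dimensional query, key, and value vectors are invented, and the learned projection matrices and multiple heads of a real transformer are left out.

```python
import math

def softmax(xs):
    top = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - top) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values):
    """Scaled dot-product attention for one query vector.

    Returns (weights, output): one weight per context token, and the
    weighted average of the value vectors.
    """
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    weights = softmax(scores)
    output = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(len(values[0]))]
    return weights, output

# Made-up vectors for the context "the dog barked", attended to by a later pronoun "it".
tokens = ["the", "dog", "barked"]
keys = [[0.1, 0.0], [2.0, 1.0], [0.0, 1.5]]
values = [[0.0, 0.1], [1.0, 0.0], [0.0, 1.0]]
query_for_it = [2.0, 0.5]  # hypothetical query vector for "it"

weights, output = attend(query_for_it, keys, values)
for tok, w in zip(tokens, weights):
    print(tok, round(w, 3))
```

With these invented numbers, "dog" receives the largest weight, which is the sense in which the pronoun is resolved to the noun.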

Concept 2.5 — Pixels and visual representation (v0.2 addition)

Computer vision begins with the same insight as language AI: perception requires converting input into numbers. A digital image is a grid of pixels, each represented as one or three integers (brightness, or R/G/B values 0–255). AI systems do not "see" in any ordinary sense — they operate on those numeric grids. A filter (a small grid of numbers) can detect patterns (edges, textures) by sliding across the image and computing a weighted sum at each position. This is convolution: the mathematical operation at the core of computer vision.

  • Prerequisites: 1.1 (number encoding as prerequisite to visual encoding), 2.3 (embeddings as numeric space — pixel grid as an analog)
  • AI4K12: Big Idea 1 (Perception) — the primary Big Idea 1 concept: computers perceive the world by converting sensory input to numbers.
  • CSTA: 2-DA-07 extension — image as a data encoding scheme.
  • Literature: LeCun et al. (1989) — convolutional neural networks; Zeiler & Fergus (2014) — visualizing what CNNs learn.
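
The sliding weighted sum described above can be written out directly. This sketch uses a tiny invented grayscale image and a hand-written vertical-edge filter; the function name `convolve2d` is illustrative. As in most deep-learning libraries, the filter is not flipped, so strictly speaking the operation is cross-correlation.

```python
def convolve2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the
    weighted sum at each position."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j] for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

# 4x6 grayscale image: dark (0) on the left half, bright (255) on the right half.
image = [[0, 0, 0, 255, 255, 255] for _ in range(4)]

# A simple vertical-edge filter: responds where brightness changes from left to right.
vertical_edge = [[-1, 0, 1],
                 [-1, 0, 1],
                 [-1, 0, 1]]

for row in convolve2d(image, vertical_edge):
    print(row)  # each row: [0, 765, 765, 0]
```

The output is zero over the flat regions and large only where the filter straddles the dark-to-bright boundary, which is what "detecting an edge" means numerically.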

Concept 2.6 — Convolutional feature extraction (v0.2 addition)

Convolutional neural networks (CNNs) stack layers of learned filters, each detecting increasingly complex patterns. The first layers detect edges; subsequent layers detect textures, object parts, and whole objects. Pooling reduces spatial resolution while preserving the strongest detected features, which lets the network recognize an object even when its position in the image shifts slightly. This layered architecture is what enables modern image classification, object detection, and computer vision at scale.

  • Prerequisites: 2.5, 2.1 (training — CNN filters are learned, not hand-designed)
  • AI4K12: Big Idea 1 (Perception) — how AI learns to perceive from training data; Big Idea 3 (Learning) — convolutional training as a specialization of supervised learning.
  • Literature: Krizhevsky et al. (2012) — AlexNet; Zeiler & Fergus (2014) — layer visualization; Wang et al. (2020) — CNN Explainer.
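
Pooling is simple enough to sketch directly. The example below assumes non-overlapping 2x2 max pooling, one common variant, and uses two invented feature maps that differ only by a one-pixel shift of the strongest response.

```python
def max_pool2d(feature_map, size=2):
    """Non-overlapping max pooling: keep the strongest value in each size x size block."""
    rows = len(feature_map) // size
    cols = len(feature_map[0]) // size
    return [
        [
            max(feature_map[r * size + i][c * size + j] for i in range(size) for j in range(size))
            for c in range(cols)
        ]
        for r in range(rows)
    ]

# Two made-up 4x4 feature maps: the same strong response, shifted by one pixel.
a = [[0, 0, 0, 0],
     [0, 9, 0, 0],
     [0, 0, 0, 0],
     [0, 0, 0, 0]]
b = [[9, 0, 0, 0],
     [0, 0, 0, 0],
     [0, 0, 0, 0],
     [0, 0, 0, 0]]

print(max_pool2d(a))  # [[9, 0], [0, 0]]
print(max_pool2d(b))  # [[9, 0], [0, 0]]
```

Both maps pool to the same result because the shift stays inside one pooling block. A shift that crosses a block boundary would change the output, so the position tolerance from a single pooling layer is limited.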

2.4 Tier 3 — Critical Use Concepts

Concept 3.1 — Hallucination and generation failure

AI language models can produce outputs that are fluent, confident, and factually incorrect — hallucinations. Hallucinations are not bugs in a conventional software sense; they are an expected consequence of probabilistic generation from a fixed training corpus. Understanding this concept is prerequisite to any responsible use of AI tools.

  • Prerequisites: 1.2, 2.1
  • AI4K12: Big Idea 5 (Societal Impact) — AI failure modes and their consequences.
  • CSTA: CSTA + AI4K12 AI Learning Priorities (2025) — Category 1 (Humans and AI), understanding limitations of AI systems.
  • UNESCO: AI Competency Framework (2024) — "Critical Evaluation of AI Outputs."
  • AILit/OECD: "Engaging with AI" — critically evaluating AI outputs for accuracy and reliability.
  • Literature: Ji et al. (2023) — hallucination survey.

Concept 3.2 — AI bias and training data as values

AI systems reflect patterns in their training data, including cultural biases, demographic skews, and values embedded in human-annotated labels. Bias encompasses multiple distinct phenomena: representation bias (who appears in training data), annotation bias (whose judgments define correct outputs), and output bias (what populations are harmed by biased model behavior). Bias produces disparate outcomes: AI systems that perform better for some groups than others, or that encode stereotyped associations.

  • Prerequisites: 2.1, 2.2
  • AI4K12: Big Idea 5 (Societal Impact) — AI's reflection and reinforcement of social patterns.
  • CSTA: 2-IC-21; CSTA + AI4K12 AI Learning Priorities (2025) — Categories 4 (Ethical AI System Design) and 5 (Societal Impacts).
  • UNESCO: AI Competency Framework (2024) — "AI Ethics" competency area.
  • AILit/OECD: "Engaging with AI" — evaluating AI outputs for bias.
  • Literature: Bolukbasi et al. (2016) — gender bias in word embeddings; Bender et al. (2021) — bias in large language models; Buolamwini & Gebru (2018) — Gender Shades.

Concept 3.3 — Prompt engineering and AI as a tool

The quality of AI outputs depends substantially on the specificity and structure of the input prompt. Prompting is a skill: vague prompts produce generic outputs; specific, context-rich prompts with explicit task framing produce substantially better results. This concept reframes students' relationship to AI as active shapers of AI behavior rather than passive recipients of AI output.

  • Prerequisites: 1.2, 3.1
  • AI4K12: Big Idea 4 (Natural Interaction) — designing effective human-AI interaction.
  • ISTE: ISTE Standards (2024) — Innovative Designer (4); ISTE "Bringing AI to School" (2023) — prompt literacy.
  • AILit/OECD: "Creating with AI" — collaborating with AI for problem-solving.
  • Literature: Brown et al. (2020) — few-shot prompting; Wei et al. (2022) — chain-of-thought prompting.

Concept 3.4 — Generative AI and latent space (v0.2 addition)

Generative AI systems create new images, audio, or text by navigating a learned internal representation called a latent space: a compressed, structured encoding of the training distribution in which similar concepts are positioned near one another. Diffusion models, which underlie much of modern image generation, are trained to remove noise step by step. At generation time, given a target description, the model starts from random noise and iteratively denoises it into a coherent image that matches the description. A guidance parameter controls the trade-off between creative variation (low guidance) and literal prompt fidelity (high guidance).

  • Prerequisites: 2.3 (latent space is an extension of the embedding concept), 2.1 (training — generative models are trained, not programmed), 2.6 (visual features — generative models produce the same kinds of visual structure CNNs detect)
  • AI4K12: Big Idea 3 (Learning) — generative training as a learning paradigm; Big Idea 5 (Societal Impact) — AI-generated content and authenticity.
  • ISTE: Innovative Designer (4) — creating with AI tools.
  • AILit/OECD: "Creating with AI" — collaborating with AI for ideation and content creation.
  • UNESCO: AI Competency Framework (2024) — "AI Design and Creation."
  • Literature: Goodfellow et al. (2014) — GANs; Ho et al. (2020) — DDPM diffusion models; Rombach et al. (2022) — Latent Diffusion Models; Kingma & Welling (2013) — VAE and latent space.
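
One widely used form of the guidance parameter is classifier-free guidance, in which the model makes two noise predictions at each step, one with the prompt and one without, and the guidance scale decides how far to push toward the prompted one. The sketch below assumes that formulation; the framework text does not say which guidance method it has in mind, and the vectors and the function name `guided_prediction` are invented for illustration.

```python
def guided_prediction(unconditional, conditional, guidance_scale):
    """Combine two noise predictions, classifier-free-guidance style.

    guidance_scale = 0 ignores the prompt, 1 uses the prompted prediction
    as-is, and values above 1 push further in the prompt's direction.
    """
    return [u + guidance_scale * (c - u) for u, c in zip(unconditional, conditional)]

# Made-up two-number "noise predictions" for a single denoising step.
without_prompt = [0.0, 0.0]
with_prompt = [1.0, 2.0]

for scale in (0.0, 1.0, 2.0):
    print(scale, guided_prediction(without_prompt, with_prompt, scale))
```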

Concept 3.5 — Synthetic media and content provenance (v0.2 addition)

Generative AI can produce realistic images, audio, and video of events and people that do not exist. The realism of AI-generated content has increased dramatically since 2014. Detecting synthetic media is an adversarial arms race: as generators improve, detection tools that worked before the improvement fail — making detection unreliable as a long-term strategy. An alternative approach is provenance: cryptographically signing content at the moment of creation with metadata (tool, date, AI-generated flag), so that authentic content can be verified rather than relying on detecting forgeries. Standards like C2PA formalize this approach. A separate but related problem is consent: AI models trained on creative work without the creator's permission raise unresolved legal and ethical questions about data rights.

  • Prerequisites: 3.4 (generative AI — synthetic media is produced by generative models), 3.2 (bias and data rights — training data consent as an ethics dimension), 4.4 (adversarial examples — the detection arms race is structurally similar to adversarial robustness)
  • AI4K12: Big Idea 5 (Societal Impact) — AI-generated content and its social consequences.
  • CSTA: CSTA + AI4K12 AI Learning Priorities (2025) — Category 5 (Societal Impacts), IP and content rights.
  • UNESCO: AI Competency Framework (2024) — "Critical Evaluation of AI Outputs"; "AI Ethics" — data rights.
  • Literature: Karras et al. (2019/2020) — StyleGAN1/2; Verdoliva (2020) — deepfake detection survey; C2PA Standard 2.0 (2024); Shan et al. (2023) — Glaze; Chesney & Citron (2019) — deepfakes and democracy.
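
The provenance idea, verifying authentic content rather than detecting forgeries, can be sketched with standard-library tools. This is not an implementation of C2PA: C2PA relies on certificate-based public-key signatures and a defined manifest format, whereas the sketch below uses a shared-secret HMAC as a stand-in so that it runs with no extra dependencies. The key, the metadata fields, and the function names are all hypothetical.

```python
import hashlib
import hmac
import json

SECRET_KEY = b"demo-key-not-for-real-use"  # hypothetical key, for illustration only

def sign_content(content: bytes, metadata: dict, key: bytes = SECRET_KEY) -> dict:
    """Bind metadata to content by signing the content hash together with the metadata."""
    record = {"content_sha256": hashlib.sha256(content).hexdigest(), "metadata": metadata}
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    record["signature"] = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return record

def verify_content(content: bytes, record: dict, key: bytes = SECRET_KEY) -> bool:
    """Return True only if neither the content nor the metadata changed since signing."""
    unsigned = {
        "content_sha256": hashlib.sha256(content).hexdigest(),
        "metadata": record["metadata"],
    }
    payload = json.dumps(unsigned, sort_keys=True).encode("utf-8")
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, record["signature"])

image_bytes = b"...raw image bytes..."
record = sign_content(
    image_bytes,
    {"tool": "ExampleCam", "date": "2026-01-01", "ai_generated": False},
)

print(verify_content(image_bytes, record))              # True
print(verify_content(image_bytes + b"edited", record))  # False
```

Changing either the content or the metadata after signing makes verification fail, which is the property provenance schemes depend on.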

2.5 Tier 4 — Architecture and Systems Concepts

Concept 4.1 — Neural networks and backpropagation

Neural networks consist of layers of parameters (weights) connected in a directed graph. Training adjusts these weights through backpropagation — a gradient-based update rule that propagates error signals backward through the network layer by layer. At the middle-school level, this concept is best approached experientially: students who have built a thorough understanding of gradient descent and loss (Concept 2.1) can approach backpropagation as "figuring out which steps each layer should take."

  • Prerequisites: 2.1, 2.3
  • AI4K12: Big Idea 3 (Learning) — 9–12 grade band depth.
  • CSTA: CSTA + AI4K12 AI Learning Priorities (2025) — Category 3 (Machine Learning), upper range.
  • Literature: Rumelhart, Hinton & Williams (1986); LeCun, Bengio & Hinton (2015).

Concept 4.2 — Overfitting, generalization, and model evaluation

A model that performs well on training data may perform poorly on new data; this is overfitting. Generalization, the ability to apply learned patterns to new situations, is the actual goal of training. The train/validation/test split is the standard experimental design for measuring generalization. This concept introduces students to a fundamental epistemological question: how do we know that a learned pattern is genuinely general rather than an artifact of the training set?

  • Prerequisites: 2.1
  • AI4K12: Big Idea 3 (Learning) — 9–12 grade band, substantially exceeds 6–8 expectation.
  • CSTA: 2-DA-09 (partially); CSTA + AI4K12 AI Learning Priorities (2025) — Category 3 (Machine Learning).
  • Literature: Bishop (2006) — Pattern Recognition and Machine Learning.
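
The gap between training performance and performance on new data can be demonstrated with two toy models. The data, the two models, and the function names below are invented for illustration, and for brevity the sketch uses only a train set and a test set, with no separate validation set.

```python
def accuracy(model, examples):
    """Fraction of examples the model labels correctly."""
    return sum(model(x) == y for x, y in examples) / len(examples)

# Made-up data: small x values are labeled 0, large x values are labeled 1.
# The test set contains only x values that never appear in training.
train_set = [(1, 0), (2, 0), (4, 0), (6, 1), (7, 1), (9, 1)]
test_set = [(0, 0), (3, 0), (8, 1), (10, 1)]

# Model A memorizes the training examples and guesses 0 for anything unseen.
lookup = dict(train_set)

def memorizer(x):
    return lookup.get(x, 0)

# Model B learns one threshold: the midpoint between the largest x labeled 0
# and the smallest x labeled 1 in the training data.
def fit_threshold(examples):
    highest_zero = max(x for x, y in examples if y == 0)
    lowest_one = min(x for x, y in examples if y == 1)
    cut = (highest_zero + lowest_one) / 2
    return lambda x: int(x > cut)

threshold_model = fit_threshold(train_set)

print("memorizer:", accuracy(memorizer, train_set), accuracy(memorizer, test_set))
print("threshold:", accuracy(threshold_model, train_set), accuracy(threshold_model, test_set))
```

Both models are perfect on the training set, but only the threshold model stays accurate on the held-out examples. Training accuracy alone cannot tell the two apart, which is why held-out data is needed.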

Concept 4.3 — ML paradigm diversity

[Novel relative to external frameworks — see note.]

The field of machine learning encompasses multiple philosophical approaches to learning from data: connectionist (neural networks), symbolic (rule-based and logic), Bayesian (probabilistic inference), evolutionary (genetic algorithms), and analogical (instance-based methods). Understanding that AI is not a single approach — and that different paradigms make different tradeoffs — prevents students from over-generalizing from their experience of LLMs to claims about AI in general.

  • Prerequisites: 2.1
  • External-framework note: No K-12 AI literacy framework requires paradigm-level understanding of ML schools. The inclusion rationale: students who understand only connectionist/LLM AI are at risk of conflating "how LLMs work" with "how AI works."
  • Literature: Domingos (2015) — The Master Algorithm.

Concept 4.4 — Adversarial examples and robustness

AI systems can be deliberately or accidentally misled by inputs that appear normal to humans but produce dramatically incorrect outputs from the model. Adversarial examples — inputs crafted to exploit vulnerabilities in a model's learned decision surface — demonstrate that AI systems do not "understand" inputs in any robust sense. This concept deepens and stress-tests the pattern-matching framing established in Concept 1.3.

  • Prerequisites: 1.3, 2.1
  • AI4K12: Big Idea 5 (Societal Impact) — failure modes and security implications.
  • CSTA: CSTA + AI4K12 AI Learning Priorities (2025) — Category 4 (Ethical AI System Design).
  • Literature: Szegedy et al. (2014) — adversarial examples; Goodfellow, Shlens & Szegedy (2015).
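
A linear classifier is enough to show how a small, targeted change to the input can flip a model's output. The sketch below applies the sign-of-gradient idea discussed by Goodfellow, Shlens & Szegedy (2015) to a hand-written linear model; the weights, the input, and the function names are invented, and attacks on real networks are more involved.

```python
def score(weights, bias, x):
    return sum(w * xi for w, xi in zip(weights, x)) + bias

def classify(weights, bias, x):
    return 1 if score(weights, bias, x) > 0 else 0

def sign(v):
    return (v > 0) - (v < 0)

def adversarial_perturb(weights, bias, x, epsilon):
    """Nudge every input value by at most epsilon in the direction that pushes
    the score toward the opposite class. For a linear model, the gradient of
    the score with respect to the input is just the weight vector."""
    direction = -1 if classify(weights, bias, x) == 1 else 1
    return [xi + direction * epsilon * sign(w) for w, xi in zip(weights, x)]

# Made-up model and input.
weights = [1.0, -2.0, 0.5, 1.5]
bias = 0.0
x = [0.2, 0.1, 0.3, 0.1]

x_adv = adversarial_perturb(weights, bias, x, epsilon=0.1)
print(classify(weights, bias, x), classify(weights, bias, x_adv))
```

No single input value moves by more than 0.1, yet the many small nudges add up along the weight vector and the predicted class changes.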

Concept 4.5 — Multimodal embedding spaces (v0.2 addition)

The same attention mechanism that allows transformers to process language can be applied to images: an image is divided into fixed-size patches, each patch treated as a token and encoded as a vector. With contrastive training (CLIP and related systems), image embeddings and text embeddings are pulled into the same shared vector space, so "a dog running in a field" and an image of that scene occupy nearby positions. This shared space enables cross-modal retrieval (finding an image from a text description), visual question answering, and text-to-image generation with guidance. It also preserves the failure modes of unimodal systems: multimodal AI can hallucinate, show bias from training data, and fail on adversarial inputs.

  • Prerequisites: 2.3 (embeddings as vector space), 2.4 (attention mechanism), 2.5 (image as number grid — visual input representation)
  • AI4K12: Big Idea 4 (Natural Interaction) — multimodal natural interaction; Big Idea 2 (Representation & Reasoning) — shared cross-modal representations.
  • Literature: Dosovitskiy et al. (2020) — Vision Transformer (ViT); Radford et al. (2021) — CLIP; Liu et al. (2023) — LLaVA.
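
The first step described above, dividing an image into fixed-size patches that are then treated like tokens, can be sketched as follows. The example covers only the patch-splitting step on a tiny invented grid; the learned projection of each patch into a vector and the contrastive training are omitted, and the function name is illustrative.

```python
def split_into_patches(image, patch_size):
    """Cut a 2-D grid into non-overlapping patch_size x patch_size patches,
    each flattened into one list in row-major order."""
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image dimensions must be multiples of patch_size")
    patches = []
    for r in range(0, height, patch_size):
        for c in range(0, width, patch_size):
            patches.append(
                [image[r + i][c + j] for i in range(patch_size) for j in range(patch_size)]
            )
    return patches

# A 4x4 "image" whose pixel values are simply 0..15, to make the patches easy to follow.
image = [[r * 4 + c for c in range(4)] for r in range(4)]

patches = split_into_patches(image, 2)
print(patches)  # [[0, 1, 4, 5], [2, 3, 6, 7], [8, 9, 12, 13], [10, 11, 14, 15]]
```

Each of the four lists plays the role that a token plays in text: it is the unit that is encoded as a vector and passed to the attention layers.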

Section 3: Competency Definitions

Competency levels are defined as Introductory, Intermediate, and Proficient. Definitions are observable behaviors — what a student does, not what a student knows. "Can recite the definition of hallucination" is not a competency. "Identifies a specific hallucination in an AI output and explains its probable cause in terms of training data and probabilistic generation" is.

The three levels are calibrated to the 10–13 age range with typical progression from Introductory at the start of a first AI literacy course to Proficient by the end of a second or third course. v0.1 note: This calibration reflects Wibbit's curriculum design experience; validation against independent cohort outcome data is in progress.


Concept 1.1 — Tokens and symbol representation

Level | Observable behavior
Introductory | Given a short sentence, predicts how an AI tokenizer would segment it, placing splits approximately correctly, including subword splits on unfamiliar words. Recognizes that token IDs are numbers, not meanings.
Intermediate | Explains why the same word might be tokenized differently in different contexts (compound words, punctuation, capitalization). Predicts downstream effects: more tokens → slower, costlier output; unusual tokens → less predictable model behavior.
Proficient | Reasons about how tokenizer vocabulary choices affect what patterns a model can learn. Explains the connection between token representation and embedding lookup, and why a discrete token is the input to a continuous representation system.

Concept 1.2 — Probabilistic generation

Level | Observable behavior
Introductory | Explains that AI text generation involves selecting among probable next tokens, not "deciding" what to say. Identifies that two runs of the same prompt may produce different outputs and explains why.
Intermediate | Given a temperature parameter (Low / Medium / High), predicts the effect on output variability before running it. Connects hallucination to probabilistic generation: the model selects a plausible-sounding token, not a verified-true token.
Proficient | Describes the output of a language model as a probability distribution over vocabulary, updated at each position by the preceding context. Reasons about when high-temperature outputs are appropriate (creative generation) vs. when low-temperature outputs are appropriate (factual tasks). Evaluates a given AI output for hallucination risk based on what types of information are unlikely to be well-represented in training data.

Concept 1.3 — AI as pattern-matching, not understanding

Level | Observable behavior
Introductory | Given an AI output that appears to "know" something, explains why it doesn't mean the AI "understands." Provides at least one example of an AI output that looks knowledgeable but reflects training data patterns rather than reasoning.
Intermediate | Distinguishes between: (a) claims that AI outputs reflect genuine understanding; (b) outputs that demonstrate useful pattern completion without understanding. Identifies the specific mechanism (statistical co-occurrence, not semantic grounding) behind an AI output on an unfamiliar topic.
Proficient | Evaluates popular claims about AI "reasoning" or "creativity" against the pattern-matching explanation. Applies the pattern-matching framing to interpret AI failures: when an AI produces a plausible-sounding but wrong answer, traces the failure to training data patterns rather than reasoning errors.

Concept 2.1 — Training and the supervised learning loop

Level | Observable behavior
Introductory | Explains the predict-check-adjust cycle of supervised learning. Explains what loss measures and why training tries to reduce it. Understands that training requires labeled data: human-annotated examples of correct outputs.
Intermediate | Explains gradient descent as a direction-finding mechanism in high-dimensional space, including why training takes many small steps rather than one large adjustment. Connects data quality to model quality: mislabeled or unrepresentative data degrades model performance. Distinguishes between pre-training and fine-tuning.
Proficient | Reasons about the effect of dataset size and diversity on trained model behavior. Explains why scale produces qualitatively different model capabilities, not merely more of the same. Evaluates claims about training data quality and identifies the types of problems (annotation bias, distribution skew, memorization) that data quality failures produce.

Concept 2.5 — Pixels and visual representation

Level | Observable behavior
Introductory | Explains that a digital image is a grid of numbers (pixel values 0–255 per channel). Given an image, predicts the approximate numeric values for a bright pixel vs. a dark pixel. Understands that a filter is a small grid of numbers that slides across an image to detect patterns.
Intermediate | Predicts what kind of filter would detect vertical edges vs. horizontal edges vs. blur, and explains why. Explains why early CNN layers detect edges and textures rather than objects. Connects the pixel representation concept to the token representation concept: both are discrete units that AI converts to numbers as its first step.
Proficient | Explains how a learned filter emerges from training rather than being designed by hand. Reasons about why the same filter architecture can detect different features depending on the training data it saw. Connects the numeric nature of image representation to why adversarial examples (small pixel-level perturbations) can fool AI systems that appear to "see" correctly.

Concept 3.1 — Hallucination and generation failure

Level | Observable behavior
Introductory | Identifies hallucinations in AI output samples when they are pointed out. Understands that hallucinations are a structural property of probabilistic generation, not errors the AI "knows about" or can self-correct.
Intermediate | Predicts which types of AI outputs are at higher hallucination risk (specific names, dates, citations, rare events) vs. lower risk (widely-attested facts, general explanations). Identifies hallucinations in AI outputs without prompting. Identifies which claims in an AI output require external verification.
Proficient | Reasons about why some prompting strategies reduce hallucination risk. Evaluates AI outputs systematically: separates high-confidence claims from hallucination-risk claims within a single response. Explains the relationship between hallucination and expressed confidence (calibration).

Concept 3.2 — AI bias and training data as values

Level Observable behavior
Introductory Identifies a stereotype or representation bias in an AI output when given an example. Understands that training data is not neutral: what's overrepresented and what's absent shapes AI behavior.
Intermediate Distinguishes between types of bias: representation bias (who appears in training data), annotation bias (whose judgments define correct labels), and output bias (who is harmed by biased outputs). Identifies potential bias sources in a described training process. Explains disparate impact: bias that produces measurably worse outcomes for specific groups.
Proficient Evaluates a described AI system for bias risk across multiple dimensions. Reasons about tradeoffs in bias mitigation: reducing one type of bias may increase another; adjusting training labels has downstream effects. Assesses the ethical implications of an AI system's deployment in a specific context, identifying who bears the costs of system errors.
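
One way to make disparate impact concrete is to compare error rates group by group. The sketch below is a minimal illustration in Python; the records, the group names, and the error_rate_by_group helper are invented for this example and are not drawn from any real system or from Wibbit course material.

```python
from collections import defaultdict

# Hypothetical evaluation records: (group, model_was_correct). Invented for illustration.
records = [
    ("group_a", True), ("group_a", True), ("group_a", True), ("group_a", False),
    ("group_b", True), ("group_b", False), ("group_b", False), ("group_b", False),
]

def error_rate_by_group(rows):
    """Return the fraction of wrong outputs for each group."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for group, correct in rows:
        totals[group] += 1
        if not correct:
            errors[group] += 1
    return {group: errors[group] / totals[group] for group in totals}

rates = error_rate_by_group(records)
print(rates)  # {'group_a': 0.25, 'group_b': 0.75}
```

In this toy data the overall error rate is 50%, which hides the gap: one group sees errors a quarter of the time and the other three quarters of the time.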

Concept 3.3 — Prompt engineering and AI as a tool

Level Observable behavior
Introductory Given two prompts (vague vs. specific) for the same task, predicts which will produce better output before running them and explains why.
Intermediate Writes an improved prompt for a given task that includes: task context, output format specification, and audience framing. Explains why each addition improves the result.
Proficient Iteratively refines a prompt through at least two improvement cycles, diagnosing why each iteration produces better or worse results in terms of probabilistic generation and training data coverage. Identifies prompt strategies that reduce hallucination risk for a given task type.

Concept 3.4 — Generative AI and latent space

Level Observable behavior
Introductory Explains that AI image generation works by navigating a learned "idea space" where similar images are positioned near each other. Understands that generative AI does not "look up" images — it creates them by decoding a position in latent space.
Intermediate Given a guidance slider, predicts before observing the result which setting (high or low guidance) will produce more literal outputs and which will produce more creative ones, and explains the tradeoff. Explains why diffusion model generation involves many steps (iterative denoising) rather than a single output. Connects latent space to the embedding space concept: both are learned vector spaces where proximity encodes similarity.
Proficient Explains why two text prompts that describe similar concepts should produce images that are close in latent space. Identifies the key difference between GAN-era and diffusion-era generation: a GAN produces an output in a single forward pass of a generator that was trained adversarially against a discriminator; a diffusion model produces an output through iterative denoising, using a network trained to predict noise. Reasons about why guidance produces a quality-creativity tradeoff in terms of the latent space geometry.
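
The guidance tradeoff can be illustrated with a small numeric sketch. It assumes the guidance slider corresponds to classifier-free guidance, a common technique in diffusion models that this framework does not name explicitly; the function name and the two-dimensional vectors are hypothetical stand-ins for real noise predictions.

```python
def guided_prediction(unconditional, conditional, guidance_scale):
    """Blend two noise predictions.

    Scale 0 ignores the prompt, scale 1 follows it, and scales above 1
    exaggerate the difference the prompt makes.
    """
    return [u + guidance_scale * (c - u) for u, c in zip(unconditional, conditional)]

# Hypothetical 2-D noise predictions from a single denoising step.
unconditional = [0.0, 1.0]  # prediction made without the prompt
conditional = [1.0, 0.0]    # prediction made with the prompt

for scale in (0.0, 1.0, 7.5):
    print(scale, guided_prediction(unconditional, conditional, scale))
```

At scale 0 the prompt is ignored, at scale 1 the prompt-conditioned prediction is used as is, and larger scales push further in the direction the prompt suggests, which tends to make outputs more literal and less varied.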

Concept 3.5 — Synthetic media and content provenance

Level Observable behavior
Introductory Identifies that some images and videos they encounter may be AI-generated. Understands that detecting synthetic media is unreliable because generators and detectors improve together. Explains what a content credential (C2PA) is and what information it records.
Intermediate Explains the detection arms race: improvements to generators make current detectors fail, creating a permanent reliability problem for post-hoc detection. Distinguishes between provenance (labeling at creation) and detection (guessing after the fact) as different solutions to different problems. Identifies the consent dimension of generative AI: training on creative work without permission raises questions about data rights that are not yet resolved by law.
Proficient Reasons about why provenance systems require adoption across creators, platforms, and viewers to be effective — an adoption problem, not only a technical problem. Evaluates a claimed AI detection tool against the arms-race framing: explains why a tool that works today may not work after the next generator update. Connects the consent problem to the training bias problem: if training data is ethically questionable (unconsented scraping), the resulting model's outputs carry that problem forward.

Section 4: Assessment Criteria

4.1 Principles for AI literacy assessment at ages 10–13

Assessments should be designed around the principle that understanding precedes vocabulary. A student who accurately predicts AI behavior, identifies AI failure modes in new examples, and reasons about AI systems without prompting has demonstrated literacy — regardless of whether they use technical terminology. A student who recites correct definitions but applies them incorrectly on new examples has demonstrated memorization, not literacy.

Effective assessment at this age range tends toward:

  • Transfer tasks — presenting a student with an AI system, output, or scenario they have not previously encountered and asking them to apply conceptual understanding.
  • Explanation tasks — asking the student to explain AI behavior to a peer, in their own words, without reference to technical definitions.
  • Prediction tasks — asking the student to predict AI behavior under changed conditions (different temperature, different training data, different prompt) before seeing the result.
  • Identification tasks — asking the student to find hallucinations, biases, or prompt engineering opportunities in real AI outputs.

4.2 Competency level criteria

Introductory — observable evidence:

  • Correctly applies the concept to at least 2 of 3 novel examples (not examples used in instruction).
  • Can explain the concept in their own words; vocabulary is acceptable but not required.
  • May require structured prompting to apply the concept; is not yet expected to apply it spontaneously.

Intermediate — observable evidence:

  • Correctly applies the concept to novel examples with ≥ 80% consistency, including edge cases.
  • Predicts behavior under changed conditions before observing the result.
  • Begins to spontaneously apply the concept without prompting when analyzing AI outputs.
  • Can identify when the concept applies and when it does not.

Proficient — observable evidence:

  • Applies the concept reliably across diverse contexts and AI system types.
  • Connects the concept to other framework concepts (e.g., explains hallucination risk simultaneously in terms of probabilistic generation and training data gaps).
  • Spontaneously applies the concept when analyzing unseen AI outputs.
  • Can articulate limitations of the concept: when the explanation is incomplete or when a more complex account is needed.

4.3 Practical assessment formats

Token analysis task (Concept 1.1): Given 5 input strings of varying complexity, predict the tokenization output. Introductory: ≥ 3/5 on familiar-type inputs. Intermediate: ≥ 4/5 including at least 1 novel compound word or rare term.
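
For instructors preparing answer keys, a toy tokenizer can show the kind of splitting students are asked to predict. This is a simplified sketch using greedy longest-match against a made-up vocabulary; real language-model tokenizers are learned from data and will split text differently, and the VOCAB set and tokenize function here are hypothetical.

```python
# Toy greedy longest-match tokenizer. VOCAB is a made-up illustration,
# not the vocabulary of any real model.
VOCAB = {"un", "break", "able", "cat", "s", "sun", "flower", "the"}

def tokenize(text, vocab=VOCAB):
    """Split text into the longest known pieces, falling back to single characters."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking until a vocabulary piece matches.
        for j in range(len(text), i, -1):
            if text[i:j] in vocab:
                tokens.append(text[i:j])
                i = j
                break
        else:
            # No vocabulary piece starts here: emit one character on its own.
            tokens.append(text[i])
            i += 1
    return tokens

for word in ["cats", "unbreakable", "sunflower", "zebra"]:
    print(word, "->", tokenize(word))
```

Familiar compounds such as "unbreakable" split into a few meaningful pieces, while a word missing from the vocabulary ("zebra") falls apart into single characters, which mirrors the familiar versus rare-term distinction in the task.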

Probabilistic generation task (Concept 1.2): Given three temperature settings and a fixed prompt, rank the expected outputs from most to least variable before running them, then explain one hallucination risk in the highest-temperature output.
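
The expected ranking can be checked with a short sketch of temperature-scaled sampling probabilities. The candidate words and scores below are invented for illustration; they do not come from any real model, and the helper names are hypothetical.

```python
import math

def softmax_with_temperature(scores, temperature):
    """Turn raw scores into probabilities; lower temperature sharpens the distribution."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    """Shannon entropy in bits: higher means more varied samples."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Hypothetical next-word scores after the prompt "The sky is".
candidates = ["blue", "clear", "falling", "purple"]
scores = [4.0, 2.5, 1.0, 0.5]

for t in (0.2, 1.0, 2.0):
    probs = softmax_with_temperature(scores, t)
    summary = ", ".join(f"{w}={p:.2f}" for w, p in zip(candidates, probs))
    print(f"T={t}: {summary} | entropy={entropy(probs):.2f} bits")
```

As temperature rises, probability moves from the top-scoring word toward the unlikely ones and the entropy increases, so the highest-temperature setting should be ranked most variable. The same shift is why low-probability, possibly false continuations become more likely at high temperature.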

Hallucination audit task (Concept 3.1): Given a 3-paragraph AI-generated document on a verifiable topic, identify and annotate all claims that require verification, mark the one highest-risk claim, and explain why it is higher risk than the others. Introductory: identifies ≥ 2 verification-required claims. Proficient: correctly identifies the highest-risk claim AND explains the risk in terms of training data coverage.

Bias scenario task (Concept 3.2): Given a described AI system (e.g., a hiring screening tool, an image captioning service, a language translation model), identify: (a) the training data sources most likely to create bias, (b) who is most at risk from biased outputs, and (c) one mitigation strategy and its tradeoffs.

Prompt engineering task (Concept 3.3): Given a task description, write three prompts — basic, improved, and optimized — and explain the reasoning for each improvement. Assessment criterion: does the optimized prompt include task context, output format specification, and audience framing? Does the student correctly predict which prompt will perform best before running it?

Visual representation task (Concept 2.5): Given a grayscale image and a set of 3×3 filter kernels, predict which filter will detect horizontal edges, which will detect vertical edges, and which will apply blur, before applying them. Explain why the numeric values in the filter produce that effect. Introductory: correctly identifies edge vs. blur filters. Proficient: explains the mechanism (the kernel's positive/negative weight pattern creates a difference detector).
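
The difference-detector mechanism can be seen in a few lines of Python. This is a minimal sketch, assuming a tiny 5×5 test image and three commonly used textbook kernels; the image, the kernel values, and the convolve helper are illustrative choices, not the specific materials used in the task.

```python
def convolve(image, kernel):
    """Slide a 3x3 kernel over a 2-D grid (no padding) and return the response grid.

    Like CNN layers, this multiplies and sums without flipping the kernel.
    """
    rows, cols = len(image), len(image[0])
    out = []
    for r in range(rows - 2):
        row = []
        for c in range(cols - 2):
            row.append(sum(kernel[i][j] * image[r + i][c + j]
                           for i in range(3) for j in range(3)))
        out.append(row)
    return out

# 5x5 grayscale patch: dark (0) on the left, bright (255) on the right,
# so it contains one vertical edge and no horizontal edge.
image = [[0, 0, 255, 255, 255] for _ in range(5)]

vertical_edge = [[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]]    # right column minus left column
horizontal_edge = [[-1, -1, -1], [0, 0, 0], [1, 1, 1]]  # bottom row minus top row
blur = [[1 / 9] * 3 for _ in range(3)]                  # average of the 3x3 neighborhood

print(convolve(image, vertical_edge)[0])             # [765, 765, 0]: strong response at the edge
print(convolve(image, horizontal_edge)[0])           # [0, 0, 0]: no horizontal edge to find
print([round(v) for v in convolve(image, blur)[0]])  # [85, 170, 255]: the edge is softened
```

The edge kernels subtract one side of the neighborhood from the other, so they respond only where brightness changes in that direction; the blur kernel has no negative weights and simply averages.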

Synthetic media evaluation task (Concept 3.5): Given two images — one with a C2PA content credential and one without — and a description of a detection tool's results: (a) explain which is more trustworthy and why; (b) explain why the detection result alone is insufficient evidence; (c) identify what the content credential cannot prove even when present. Proficient: correctly addresses the adoption-dependency problem (provenance only works if creators and platforms use it).


Section 5: Pedagogical Principles

The following principles govern AI literacy instruction for the 10-to-13 age range. They are derived from Wibbit's curriculum design practice and from the constructivist and constructionist research traditions (Piaget, 1952; Papert, 1980; Papert & Harel, 1991). They are stated in general terms, applicable to any AI literacy instructional approach.

Principle 1: Experience precedes vocabulary

Students should encounter a concept through observation, manipulation, or prediction before they receive the technical term for it. Vocabulary introduced before experience tends to produce surface-level pattern matching rather than conceptual ownership. This principle is consistent with Vygotsky's zone of proximal development (Vygotsky, 1978): scaffolded experience brings a concept into the student's reach before technical vocabulary makes it permanent.

In practice: A student who has watched a language model generate different outputs from the same prompt at different temperature settings has already experienced probabilistic generation. The word "temperature" lands on top of an experience, not in a vacuum. A student who receives the term first and the experience second may correctly associate them but has not developed the intuitive understanding that supports transfer.

Principle 2: Interactivity is the instructional medium

At ages 10–13, AI literacy concepts are more durable when learned through active manipulation than through passive observation. This is not an argument for "engagement" in a generic sense; it is a specific claim that the concepts in this framework (probabilistic generation, training dynamics, embedding space geometry, attention, convolution, latent space navigation) are particularly well-suited to interactive demonstration and particularly resistant to verbal-only instruction.

Research basis: Constructionist learning theory (Papert, 1980; Papert & Harel, 1991) holds that students build robust knowledge structures through the experience of constructing something. Embodied cognition research (Barsalou, 2008) supports the claim that abstract concepts are grounded in sensorimotor simulation, which interactive demonstrations can approximate. v0.1 note: Additional cognitive science citations are planned for subsequent revisions.

Principle 3: Concept dependency is the sequencing constraint

The order in which concepts are introduced should follow their dependency relationships, not their nominal complexity or external frameworks' grade bands. A concept nominally "advanced" (backpropagation) may be entirely accessible to a 12-year-old who has built thorough understanding of gradient descent — because the prerequisite experience makes the new concept approachable. A concept nominally "basic" (AI bias) may land shallowly if introduced before students have a working model of training and labeled data.

External-framework note: This principle implies divergence from several external frameworks that organize AI concepts by grade band rather than dependency chain. The divergence reflects different instructional assumptions: grade-band organization assumes students encounter AI concepts year by year across subjects; dependency-graph organization assumes a self-contained course sequence. These are different contexts, not disagreements about cognitive development.

Principle 4: Ethics and failure modes are first-class content

The societal implications of AI — who is harmed by AI failure, whose interests are reflected in training data, what it means to deploy AI in high-stakes contexts — should be introduced in the same course sequence as technical foundations, not deferred to an "AI ethics" unit. Students ages 10–13 are cognitively and morally ready to engage with these questions (cf. Kohlberg, 1984; Gilligan, 1982). Waiting until later risks students developing the false impression that AI is a neutral tool and that ethical questions are advanced or optional.

External-framework alignment: This principle is broadly consistent with AI4K12's Big Idea 5 (Societal Impact), UNESCO's emphasis on "ethical agency" as a core competency, and the ISTE Digital Citizen standard.

Principle 5: Depth before breadth on a chosen paradigm

AI literacy for ages 10–13 should teach one AI paradigm with enough depth that students can reason from first principles — rather than surveying many paradigms at the level of names and descriptions. For the current moment in AI development, the large language model paradigm is the appropriate depth target: it is the paradigm students encounter most frequently in their own AI use, it is rich enough to support all concepts in Tiers 1–3 of this framework, and genuine understanding of LLMs transfers to critical evaluation of AI more broadly. Course 3's extension into computer vision and generative AI follows the same principle — depth through the convolution and diffusion model concepts, not a survey of all vision architectures.

Trade-off acknowledged: Depth-before-breadth produces students with strong LLM and generative-AI literacy who understand that other AI paradigms exist and can reason about them conceptually, but who cannot demonstrate the same first-principles understanding of reinforcement learning or symbolic AI systems. This is the right trade-off for the first three courses in an AI literacy sequence; a fourth course can expand paradigm breadth.

Principle 6: Anti-anthropomorphism is a first-contact principle

The framing of AI as a pattern-matching system rather than a reasoning, understanding, or goal-having agent should be introduced in the first lesson of a first course and maintained consistently throughout. This is not merely a conceptual correction; it is a safety principle. Students who treat AI outputs as the product of understanding are more likely to trust AI outputs uncritically, to defer to AI when they should not, and to fail to detect hallucinations.

This principle has an immediate language implication: instruction should consistently use "generates," "predicts," and "calculates" rather than "knows," "thinks," and "believes" when describing AI behavior.

Research basis: Attribution of mental states to non-human agents is a well-documented human tendency (Heider & Simmel, 1944; Dennett, 1987 — the intentional stance). Proactively naming the tendency and providing a corrective framework is more effective than waiting for students to develop misconceptions and correcting them retroactively.


Gaps and Planned Extensions

This framework's current version (v0.2) covers LLM-paradigm and generative-AI concepts in depth. Several concept areas remain planned for future versions as Wibbit's course catalog grows:

Gap | Relevant frameworks | Status
General digital privacy and surveillance (not limited to AI training data context) | CSTA 2-IC-23; UNESCO Privacy; AILit digital wellbeing | Open: Course 3 covers training data consent; general digital privacy is a future course
Environmental impact of AI | CSTA AI Learning Priorities Cat. 5; AILit Domain 1 | Open: not covered in Courses 1–3
AI agents and autonomous systems | AI4K12 Big Idea 3 extension; OECD "Managing AI" | Open: future course territory
High-stakes algorithmic decision-making (hiring, criminal justice, credit) | AI4K12 Big Idea 5; UNESCO AI Ethics | Open: future course territory
UNESCO equity, linguistic diversity, cultural accessibility | UNESCO core posture | Open: curriculum and positioning gap for international expansion

The following gaps from v0.1 are now closed:

Previously listed gap | Closed by
Perception / computer vision / sensor-based AI (AI4K12 Big Idea 1) | Concepts 2.5 and 2.6 (Courses 1–3 map: C3-M1, C3-M2)
IP ownership and training data rights | Concept 3.5: synthetic media and content provenance (C3-M5)
Creating with AI (OECD/AILit "Creating with AI" domain) | Concept 3.4: generative AI and latent space (C3-M3)
Multimodal AI interaction | Concept 4.5: multimodal embedding spaces (C3-M4)
Synthetic media as a distinct literacy domain | Concept 3.5: synthetic media and content provenance (C3-M5)

This framework is updated when new course modules ship. For questions about alignment coverage, contact the Wibbit team at hello@wibbit.ai.


Version History

Version | Date | Notes
v0.1 | May 2026 | First publication. Covers Courses 1–2 (LLM-paradigm concepts, Tiers 1–4). Several sections flagged for further research review.
v0.2 | May 2026 | Course 3 extension. Adds Concepts 2.5, 2.6, 3.4, 3.5, and 4.5 (computer vision, convolutional feature extraction, generative AI / latent space, synthetic media / provenance, multimodal embedding spaces). Adds competency definitions for Concepts 2.5, 3.4, and 3.5. Adds two assessment tasks. Gaps table updated.

Governing curriculum: Wibbit Pedagogy Manifesto · Standards Landscape Map · Curriculum Alignment Report
