AI the Tool maker and Tool user: New roles for agents in study and research

Dominik Lukeš (Dominik.Lukes@oerc.ox.ac.uk)

dominiks-handouts.pages.dev/hkcb

Introduction

Introduction

What was the last time you used ChatGPT?

Introduction

I haven't used ChatGPT regularly in months and not at all last week.

Screenshot illustrating daily Codex use

Introduction

I have been using Codex every day

codex profile card 1

Introduction

What this means

  • I am still using AI
  • I am using the same model power in a different context
  • ChatGPT is no longer my main interface to the large language model

Introduction

Codex

The Codex desktop app

Introduction

Linux only? VS Code plugin

Codex VS Code plugin

Chatbot vs agent

Agent

Model using tools in a loop to achieve complex tasks that require multiple steps without user intervention.

Chatbot

Chatbot answering questions based on prompts and documents (sometimes called agent e.g. Customer service agent)

Introduction

How it works: chatbot vs agent

Chatbot

  • Prompt
  • Documents
  • Web-search
  • Model responds

Agent

  • User types prompt
  • Model makes plan
  • Model calls tools (search, code, command)
  • Computer runs tools and sends results to model
  • Model changes plan
  • Model responds

Introduction

ChatGPT v Codex

ChatGPT

  • Web-based app
  • Uses basic tools
  • Works on uploaded files

Codex

  • Desktop app
  • Uses tools in local sandbox
  • Works on local files

Introduction

ChatGPT and Codex

ChatGPT

  • Powered by GPT-5
  • Login with SSO
  • Usage limits
  • Available to all

Codex

  • Powered by GPT-5
  • Login with SSO
  • Usage limits
  • Available after gate

Introduction

Why that matters

  • The work is more transparent
  • Outputs are easier to retrieve later
  • Files stay in the normal workflow
  • The model can act through tools on the computer

Introduction

A step up in utility

AI as oracle
AI as tool maker
AI as tool user

What Codex Is

What Codex Is

What exactly is Codex?

  • General computer productivity tool
  • Desktop app
  • Works on the local machine
  • Acts through tools on that machine
  • Powered by the GPT-5.4 family

What Codex Is

What tools does it use?

  • Run commands
  • Open and inspect files
  • Create files and folders
  • Run computer code

What Codex Is

Folders, projects, and sandboxes

  • Codex works in folders
  • A project is really a folder
  • A sandbox limits what it can touch
  • Permissions matter

What Codex Is

ChatGPT and Codex comparison

TopicChatGPTCodex
Main formCloud-based appDesktop app
How you control itBy chattingBy chatting
Model powerGPT-5.4 familyGPT-5.4 family
Where chat history livesIn the cloud appIn the Codex app, but important project state should still be saved into files
Where working files liveCloud or uploaded into the cloud appLocal machine
What tools it can useLimited by ChatGPT Code InterpreterAnything you can do on your computer with the keyboard

What Codex Is

Accounts and usage limits

  • Sign in with your ChatGPT Edu account
  • Codex has usage limits
    • Higher reasoning levels use allowance faster

What Codex Is

Alternatives to Codex

  • Claude Code
  • Cursor
  • Google Antigravity

Benefits of working locally

Benefits of working locally

Where the work lives

  • The work is saved to a folder on your machine
  • Outputs become part of your normal file structure
  • The chat is not the main archive

Benefits of working locally

Benefits of local work

  • Retrieval
  • Portability across tools
  • Transparency

Benefits of working locally

What gets sent to the model

TaskWhat may be enough
OrganiseFile name, size, dates, header, excerpt
SummarisePart of the file or the whole file
InterpretPart of the file or the whole file

Benefits of working locally

Trade-offs of local work

AspectAdvantageTrade-off
File handlingFiles stay in your normal folder structureYou need to manage that structure properly
RetrievalOutputs are easier to find laterYou need to save and name things well
ScaleEasier to work across many local filesYou need to know what folder you are giving it access to
PrivacySome tasks can stay mostly localSome tasks still require remote model processing
PersistenceThe work is tied to your project folderIt does not automatically follow you to every machine

Benefits of working locally

Some other nice features

Benefits of working locally

Context Transparency

Codex context transparency view

Five Things You Can Do with Codex

Five Things You Can Do with Codex

Five things you can do with Codex

  • Catalogue and organise data
  • Manage projects and create project outputs
  • Build and maintain a note system
  • Create small tools, websites, and dissemination outputs
  • Set up and control your computer

Five Things You Can Do with Codex

Catalogue and organise your data

Five Things You Can Do with Codex

Example: organising a downloads folder

  • Ask what is in the folder
  • Check names, dates, sizes, and duplicates
  • Ask for suggested organising schemes
  • Rename or move files after review
Organising a downloads folder with Codex

Five Things You Can Do with Codex

Manage projects and create project outputs

Five Things You Can Do with Codex

Research project outputs

  • Trackers
  • Timelines
  • Literature-review tracker
  • Dashboards
  • Fieldwork or admin timeline

Five Things You Can Do with Codex

Research project example

  • Read milestones and task lists from a project folder
  • Build a literature-review tracker
  • Turn notes into a dashboard or timeline
Research project tracker example

Five Things You Can Do with Codex

Build and maintain a note system

Five Things You Can Do with Codex

Notes example

  • Design a note structure in Markdown
  • Add new notes into the same system
  • Link notes to each other
  • Summarise material into reusable research notes
Markdown note system example

Five Things You Can Do with Codex

Create small tools, websites, and dissemination outputs

Five Things You Can Do with Codex

From script to website

  • Start from a prompt and a folder of files
  • Generate a script, tool, or web page
  • Keep the files locally so you can keep editing them
  • Publish later if you want to
From a script to a published website

Five Things You Can Do with Codex

Set up and control your computer

Five Things You Can Do with Codex

Setup example

  • Install a utility or CLI tool
  • Configure it on the machine
  • Keep a log of what changed
  • Repeat the same setup in another project folder
Machine setup example

Bringing it all together

How to Use Codex Effectively

How to Use Codex Effectively

Put instructions in files

  • AGENTS.md
  • Plan files
  • Notes and reports
  • Example folders

How to Use Codex EffectivelyWhat Markdown looks like

What Markdown looks like

Markdown elementExample
Heading# Project plan
Bullet list- interview notes
Linkinstructions
Markdown source example

How to Use Codex EffectivelyWhat to put in AGENTS.md

What to put in AGENTS.md

IncludeWhy it helps
What the folder is forKeeps the project purpose visible
Naming conventionsMakes new files consistent
Expected outputsTells Codex what to create
Things to avoidReduces accidental changes
Privacy or sharing notesMakes handling expectations explicit
AGENTS.md example

How to Use Codex Effectively

Plan mode and project memory

  • Use plan mode before larger tasks
  • Review the plan before execution
  • Save plans and results into files
  • Do not leave important project state only in the chat

How to Use Codex Effectively

Point Codex to files, URLs, and examples

  • Point Codex to AGENTS.md, plans, and notes
  • Give it URLs with instructions or source material
  • Ask it to search and save a report into the folder
  • Point it at another folder and ask it to follow that example

How to Use Codex Effectively

Skills, plugins, interface, and model choice

  • Bundled skills first
  • Learn the file tree and previews
  • Use lower model or reasoning settings when enough
  • Keep backup and sync in place

How to Use Codex Safely

How to Use Codex Safely

Local safety basics

  • Understand permissions
  • Sandbox
  • Review before action
  • Limited scope

How to Use Codex Safely

Careful about YOLO mode!

YOLO mode warning

How to Use Codex Safely

External skills and scripts

  • Built-in first
  • Inspect outside skill
  • Explain and recreate if needed

How to Use Codex Safely

Security beyond local control

  • Public/private settings
  • Deployment
  • Secrets
  • Software review

How to Use Codex Safely

If you want something to be private, say so explicitly and then verify that it is actually private.

What do you need to know?

What do you need to know?

What do you need to know about computers in the age of AI?

Question: Who uses git to manage versions of your work?

Leaky abstractions

All non-trivial abstractions, to some degree, are leaky.

What do you need to know?

Leaky abstractions

  • Abstractions hide underlying complexity and make it seem that things just work as one thing.
  • But no abstraction is perfect. The complexity leaks, and if you have some understanding of it, you can deal with it better.
  • Abstractions are all around us. Words, objects, machines, concepts, theories.

What do you need to know?

Is this possible?

file 2fb4de740d68a2395842e1c6b284909b

What do you need to know?

History of software abstractions

  • Machine code: 89 06 D1 E0 8B 07
  • Assembler: movl (%rdi), %eax sall $1, %eax movl %eax, (%rsi)
  • C: int b = 42; int ptr_b = &b; int a = (ptr_b) * 2;
  • Python: a = b * 2
  • LLM: Write code that multiplies all values a user enters by 2

What do you need to know?

But actually most of the code we write is

jsimport ffi from 'ffi-napi';
import ref from 'ref-napi';

// 1. Define explicit types using the ref library
const intPtr = ref.refType(ref.types.int);

// 2. Load the compiled native C library and declare the function signature
const lib = ffi.Library('./libmultiply', {
  'double_value': ['void', [intPtr, intPtr]] // returns void, takes two int pointers
});

// 3. Allocate physical memory buffers to act as pointers
const b_ptr = ref.alloc(ref.types.int, 42); // b = 42
const a_ptr = ref.alloc(ref.types.int, 0);  // a = 0 (allocated space for result)

// 4. Pass the raw memory pointers to the compiled C function
lib.double_value(b_ptr, a_ptr);

// 5. Dereference the pointer to retrieve the value from physical memory
const a = a_ptr.deref(); 

console.log(`Value at pointer a_ptr: ${a}`); // Outputs: 84

Abstractions leak at every level

Abstractions leak at every levelLeaky abstractions are not just a problem for computers

Leaky abstractions are not just a problem for computers

Abstractions leak at every level

The knowledge needed today

Knowledge you don't need

  • Complex algorithms
  • Syntax of computer languages
  • Variables, functions, etc.
  • How to use developer tools

Knowledge you do need

  • How your computer works
  • Principles of software architecture
  • Developer tools landscape
  • How AI agents work
  • What is possible with software

Abstractions leak at every level

Some useful things to know

  • Computer
    • File system details
    • Plain text / Markdown vs Rich Text
    • CLI tools
    • Pipes and deamons
  • Development
    • Libraries
    • Git and GitHub
    • Content vs display
    • What an API can do for you
    • Hosting (Cloudflare)
  • Software architecture
    • When you need a database
    • Refactoring
    • Modularity
    • Security concerns
  • Agents
    • Model v harness
    • Model tool use

Abstractions leak at every level

Example: List a folder.

USER
How many PDFs are in my Downloads folder?
HARNESS → MODEL
[system: tools available — list_dir(path), read_file(path)] [user: How many PDFs are in my Downloads folder?]
MODEL · reasoning
I need to inspect the directory first. I'll call list_dir on ~/Downloads, then count files matching .pdf.
MODEL → HARNESS
<tool_call name="list_dir" args={"path": "~/Downloads"}/>
TOOL RESULT
paper-draft.pdf notes-2026-04.md receipt.pdf holiday.jpg slides.pdf budget.xlsx …
MODEL → USER
There are 12 files in Downloads. Three are PDFs: • paper-draft.pdf • receipt.pdf • slides.pdf

Abstractions leak at every level

Example 2: Check Gmail — with authentication.

USER
Summarise unread email from this week.
HARNESS → MODEL
[tools: gmail_search, gmail_read, gmail_authenticate] [user message]
MODEL · reasoning
I'll need to query Gmail. First check auth — call gmail_search and see if it errors.
MODEL → HARNESS
<tool_call name="gmail_search" args={"q": "is:unread newer_than:7d"}/>
TOOL RESULT
error
not authenticated
MODEL → HARNESS
<tool_call name="gmail_authenticate" args={}/>
HARNESS
// opens browser to OAuth consent // user clicks "Allow" // harness stores refresh token securely // model never sees the token
TOOL RESULT
authenticated as dominik@oerc.ox.ac.uk
MODEL → HARNESS
<tool_call name="gmail_search" args={"q": "is:unread newer_than:7d"}/>
TOOL RESULT
8 messages
3 from collaborators, 2 newsletters, 2 admin, 1 student
MODEL → USER
Eight unread messages this week. Three are from collaborators (one waiting on your reply), two are newsletters (skim or skip), two are admin (one needs action by Friday), and one is from a student asking about supervision.

Abstractions leak at every level

Example 3: Research and write a report.

USER
Write me a 2-page brief on retrieval-augmented generation in legal research, with citations. Save it as brief.md.
MODEL · reasoning
Plan
(1) survey current literature, (2) identify the 4–5 main approaches, (3) check for legal-domain specifics, (4) outline, (5) draft, (6) revise. Start with a broad search.
MODEL → HARNESS
<tool_call name="web_search" args={"q": "RAG retrieval-augmented generation survey 2025"}/>
TOOL RESULT
[10 results — survey papers, blog posts, two arxiv preprints]
MODEL · reasoning
I want the survey paper and the legal-domain results. Open the top two.
MODEL → HARNESS
<tool_call name="web_fetch" args={"url": "arxiv.org/abs/2503.12345"}/> <tool_call name="web_fetch" args={"url": "harvardlawreview.org/…"}/>
TOOL RESULT
[two long documents]
MODEL · reasoning
Useful. Both flag retrieval quality and citation faithfulness as the bottleneck. I should write notes before continuing or I'll lose detail.
MODEL → HARNESS
<tool_call name="write_file" args={"path": "research/notes-rag.md", "content": "# RAG notes\n\n…"}/>
TOOL RESULT
wrote research/notes-rag.md
MODEL → HARNESS
<tool_call name="web_search" args={"q": "RAG legal research case law citation faithfulness"}/>
TOOL RESULT
[8 more results, three legal-tech specific]
MODEL → HARNESS
<tool_call name="web_fetch" args={"url": "thomson-reuters.com/…"}/> <tool_call name="web_fetch" args={"url": "casetext-blog.com/…"}/>
TOOL RESULT
[two more documents]
MODEL · reasoning
Enough material. Write the outline first, then the draft.
MODEL → HARNESS
<tool_call name="write_file" args={"path": "brief-outline.md", "content": "..."}/>
MODEL → HARNESS
<tool_call name="write_file" args={"path": "brief.md", "content": "# RAG in Legal Research\n\n…(2 pages)…"}/>
TOOL RESULT
wrote brief.md (1,734 words)
MODEL → USER
Brief saved to brief.md (1,734 words). Three sources, six citations, structured around retrieval quality / citation faithfulness / legal-domain specifics. Notes are in research/notes-rag.md if you want to dig further.

Abstractions leak at every level

Big lesson from the agent revolution

It is now worth learning more about computers at every level because you can now do something about it.

Getting started

Getting started

How to get started

  • Start small
  • Use a defined folder
  • Set AGENTS.md
  • Back up
  • Monitor permissions

Getting started

First step

Open Codex (or Antigravity) and ask for an organization of your Downloads folder

Getting started

Second step

Point Codex at your course materials and ask it to build a step by step explainer of the process.

Getting started

Third step

Ask Codex to rerun the latest analysis in a paper you read.

Thank you

Dominik Lukeš (Dominik.Lukes@oerc.ox.ac.uk)

dominiks-handouts.pages.dev/hkcb