Prompts Are Not Rules
AI/ML is only half of the future. The other half is, well, old-fashioned programming. What everyone is going to eventually figure out is that many problems for which AI can provide only half-baked solutions are better solved by explicit rules, in other words, loops and conditionals. If you're serious about using AI successfully, you need to understand the fundamentals of why AI is right for some problems and not others.
To be clear, I use AI constantly. It is the right tool for many problems, and it's enhanced my productivity tremendously. But there are some problems that it just doesn't do well with: problems that require precision, zero mistakes, or perfect predictability.
The crucial difference between AI and traditional programming
In programming: you write the rule. This is done mostly with conditionals and loops. If this then that, while this do that, etc. Inputs map to outputs in a predictable way.
In machine learning: you train with examples and the machine guesses the rule. The machine reverse engineers a model—a function that roughly matches the data distribution, but not exactly. Inputs map to outputs in an unpredictable way.
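To make the contrast concrete, here's a toy sketch (the shipping example and the fit step are hypothetical, purely for illustration): with programming you state the rule yourself; with machine learning you hand over examples and accept an approximation.

# Explicit rule: the programmer states the mapping directly.
def shipping_cost(weight_kg):
    if weight_kg <= 1:
        return 5.0
    elif weight_kg <= 10:
        return 12.0
    return 20.0

# Machine learning: we only supply examples and let a model approximate
# the mapping, errors included.
examples = [(0.5, 5.0), (3.0, 12.0), (15.0, 20.0)]
# model = fit(examples)      # hypothetical training step
# model(0.5)                 # probably close to 5.0, but not guaranteed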
Think of AI as compressing a FLAC or WAV sound file to MP3: you keep most of what you want, but you lose some detail, and the output "sounds" right until it doesn't. We usually notice the difference at the extremes, the really high or really low frequencies.
If you need exact, provable results (math, strict finance, critical control loops), don’t use AI alone. Use deterministic code that you can test and reason about. Some might say that it's fine to use AI with integrated tools that will handle this rigid logic. But will AI be 100% right about when to call or not call the tool? Plugging deterministic tools into a nondeterministic workflow will sometimes give incorrect results. If you can write a rule to solve a problem, do it. Don't force AI into the problem.
If errors are acceptable, and the rules are ambiguous, extraordinarily complex, or unknown, AI is powerful. It handles complexity and patterns we can't easily describe or even identify. Using AI means dealing with errant outputs and inaccuracies that you have to account for down the line. Determine how to deal with these errors and have a backup plan. If AI gives an obviously wrong answer, do you have it keep trying indefinitely, fall back to a formulaic response, or halt for human review?
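As a rough sketch of that kind of backup plan (call_model, looks_valid, and fallback are hypothetical placeholders, not any real API), you might cap the retries and fall back to something deterministic:

def answer(prompt, call_model, looks_valid, fallback, max_retries=3):
    # Nondeterministic step: ask the model, but only a limited number of times.
    for _ in range(max_retries):
        output = call_model(prompt)
        # Deterministic check that you write yourself.
        if looks_valid(output):
            return output
    # After repeated failures, stop guessing: return a canned response
    # or escalate to human review.
    return fallback(prompt)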
Prompts are not rules
What's curious about all this is that there's a growing emphasis on "prompt engineering", which is an attempt to force AI to adopt and follow rules that govern its output. Don't get me wrong. Prompt and context engineering is important to guide the model. Even if we are okay with some errors, we still don't want the model going wild and giving us bizarre results. It might land on the wrong planet, but we want to keep it in the right universe.
Think of prompts as shaping the output. It will rarely be exactly what you want, but you can construct a prompt to remove things you don't want and rearrange or replace things so that the final output is good enough.
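In code, that mindset might look something like this hypothetical sketch (not any particular SDK): the prompt narrows what comes back, and a deterministic check decides whether it's good enough.

PROMPT = (
    "Summarize the customer review in one sentence. "
    "Do not mention competitors. Do not exceed 20 words."
)

def summarize(review, call_model):
    draft = call_model(PROMPT + "\n\n" + review)
    # The prompt asked for at most 20 words, but only this check guarantees it.
    if len(draft.split()) > 20:
        draft = " ".join(draft.split()[:20])
    return draft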
What AI and regular programming have in common: inputs and outputs
In the end, we're always dealing with data. Data goes in and data comes out. The old adage "garbage in, garbage out" holds true in both traditional programming and AI/ML. So we have to understand that anywhere AI is integrated, we're going to be dealing with inputs and outputs. But with AI, we're going to have more errors.
AI adds complexity, which is the tradeoff for all the useful things it gives us. But we still have to account for this additional complexity. To illustrate this additional overhead and how we deal with it, I'll give you two scenarios.
Simple rules and arithmetic are AI's weakness
The following interaction is typical and shows how AI models can fail at even the simplest tasks.
User prompt 1 of 9
Step 1. Generate a grammatically correct 7-word English sentence that describes a positive customer interaction.

GPT-5 mini:
The representative quickly resolved my issue with care.
I had to count a few times to make sure I wasn't going crazy. The sentence is 8 words, not 7. As we'll see, the model will not pick up on this error.
User prompt 2 of 9
Step 2. Remove articles and prepositions

GPT-5 mini:
Representative quickly resolved my issue with care.
It removed "the" but left the preposition "with". Moving on...
User prompt 3 of 9
Step 3. Reverse the order of the words

GPT-5 mini:
care with issue my resolved quickly Representative

User prompt 4 of 9
Step 4. Count the number of words

GPT-5 mini:
7
The count of 7 is correct for the reversed sentence. But remember, the model believed its original 8-word sentence was 7 words, and it has since removed a word. Why is it saying 7 again? Who knows? It never notices that 7 minus 1 doesn't equal 7; it just happened to land on the right count this time.
User prompt 5 of 9
Step 5. Count the total number of words removed

GPT-5 mini:
3
Let's see. It removed "the" and nothing else. Saying it removed 3 words is way off.
User prompt 6 of 9
Step 6. State the number of words in the first sentence

GPT-5 mini:
7
It seems to be going by the earlier premise that the original sentence was 7 words. But again, it doesn't recognize that it's contradicting itself: 7 minus 1 isn't 7, and 7 minus 3 isn't 7 either. It's starting to compound its own errors.
User prompt 7 of 9
Step 7. Validate the last answer

GPT-5 mini:
The last answer is correct — the first sentence contains 7 words.

User prompt 8 of 9
Count again

GPT-5 mini:
The first sentence contains 7 words.
Even when asked explicitly, it doesn't correct itself. It is not going back and counting the original sentence. It's sticking with its original answer. It's worth pointing out here that the model sees the context of the conversation. This is significant from a developer perspective, and goes to show why sometimes you don't want context. Perhaps if we removed the context and just gave the model the original sentence and asked it to count the words, then it would get the right answer.
User prompt 9 of 9
Walk word-by-word through the sentence and count

GPT-5 mini:
1) The
2) representative
3) quickly
4) resolved
5) my
6) issue
7) with
8) care

Total: 8 words.
By asking differently, I was able to throw the model off enough that it didn't just refer to its original answer. I suspect it's because I didn't use the phrase "number of words" again. But again, who knows?
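For contrast, the rule-based version of this entire exercise is a couple of lines of code, and it never contradicts itself:

sentence = "The representative quickly resolved my issue with care."
print(len(sentence.split()))  # 8, every single time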
"But it's going to keep getting better!"
Now's a good time to address the ever-present objection "AI is just going to keep getting better." We need to be clear exactly what "better" means. In the last example, it failed 2nd grade arithmetic tasks. The problem is not that AI isn't "good enough" but rather that people sometimes have the wrong expectations. If a pair of scissors can't cut a chain, we don't wistfully say, "Scissors are just going to keep getting better!" No, we know that scissors aren't supposed to cut chains. They're the wrong tool for the job. In the same way, we have to recognize that AI isn't supposed to carry out explicit rules with military precision.
Speaking of tools
To address this problem of large language models being bad at math, modern systems give models access to tools. When a user's prompt involves something the model might have a problem with, such as an arithmetic problem, the model can call a tool, which is essentially a function call. The function takes some aspect of the prompt as input, performs some logic, and returns an output to the model.
In the above example, the model might key in on words like "count" and "number" and call a particular tool that counts the number of words in a string and returns an integer, which the model then outputs to the user. In principle, this works, but it doesn't completely solve the problem. How does the model know whether to call a tool, and which one? It may fail to call a tool when it should, and instead yield a wildly incorrect answer. Again, it's probably going to be right most of the time, but there will be exceptions that have to be handled.
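Here's a stripped-down sketch of the idea (the keyword trigger is a stand-in for however a given model actually decides to call tools): the tool itself is deterministic, but the decision to use it is not.

def word_count_tool(text):
    # Deterministic logic: always returns the same count for the same input.
    return len(text.split())

def respond(prompt, sentence, call_model):
    # Stand-in for the model's tool-selection step, which is itself a guess.
    if "count" in prompt.lower() or "number of words" in prompt.lower():
        return str(word_count_tool(sentence))
    # Otherwise, fall back to free-form generation.
    return call_model(prompt)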
Reverse engineering a linear formula
The thing that makes AI great at navigating ambiguous territory is also the thing that makes it bad at devising and following rigid, deterministic rules. To illustrate, let's see if we can get a machine to figure out a formula (i.e., a function) when given a set of (x,y) values. We'll use PyTorch to train a simple model with the values (-1,-3), (0,-1), (1,1), (2,3), (3,5), (4,7), and (100,199).
These values fit perfectly into the formula y=2x-1, which describes a linear relationship between x and y. We'll see how closely the model can approximate the function. Here's the code:
import torch
torch.manual_seed(0)

# Training data: (x, y) pairs that fit y = 2x - 1 exactly
xs = torch.tensor([[-1., 0., 1., 2., 3., 4., 100.]]).t()
ys = torch.tensor([[-3., -1., 1., 3., 5., 7., 199.]]).t()

# For reference, solve for the exact slope and intercept with least squares
X = torch.cat([xs, torch.ones_like(xs)], dim=1)
sol = torch.linalg.lstsq(X, ys).solution.squeeze()
print("w,b:", sol[0].item(), sol[1].item())

# Scale the data, then fit a single linear layer by gradient descent
scale = xs.abs().max()
xs_s, ys_s = xs/scale, ys/scale
m = torch.nn.Linear(1, 1)
opt = torch.optim.Adam(m.parameters(), lr=1e-3)
loss_fn = torch.nn.HuberLoss()
for e in range(5000):
    opt.zero_grad()
    loss_fn(m(xs_s), ys_s).backward()
    opt.step()
    if e % 500 == 499:
        print(f"e{e+1} loss={loss_fn(m(xs_s), ys_s).item():.2e}")
Remember that AI is essentially a lossy compression function, so we shouldn't expect it to give us an accurate y value for every x. Let's try it with an x value of 100000. According to the formula y=2x-1, this should give us a y value of 199999 (200000 - 1). Did the machine figure it out? Let's see...
with torch.no_grad():
    prediction = m(torch.tensor([[100000.0]], dtype=torch.float32))
    print(prediction)
The result:
tensor([[199995.4219]])
It guesses 199995 and some change. Close, but not exact. The further away we move from the training examples, the worse the output is going to get. An x value of 0.0001 gives a y value of -0.0098. The correct y value would be -0.9998, so it's way off.
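If you want to see the drift yourself, you can compare the trained model against the exact rule at a few points (the specific numbers will vary by run and hardware):

# Compare the learned approximation with the exact rule y = 2x - 1.
test_xs = torch.tensor([[0.0001], [10.0], [1000.0], [100000.0]])
with torch.no_grad():
    preds = m(test_xs)
for x, p in zip(test_xs, preds):
    exact = 2 * x.item() - 1
    print(f"x={x.item():>10}: model={p.item():.4f}  exact={exact:.4f}")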
Now imagine inserting this into a larger workflow where the slightly wrong output serves as an input to another model, which then gives a close-but-wrong output, and so on. This is why any workflow using AI must have proper error handling.
Incorrect output isn't necessarily a failure of the model, but it might be a clue that you need more training data, or a different approach altogether. In this case, if we're trying to represent y=2x-1, then it would make more sense just to programmatically use that formula rather than trying to train a model to derive it.
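In code, that different approach is almost insultingly simple:

def y(x):
    return 2 * x - 1

print(y(100000))  # 199999, exactly, every time
print(y(0.0001))  # -0.9998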
Takeaway
There's a place for both AI and traditional programming paradigms. AI is great at messy pattern recognition and dealing with the unknown. But it's not a magic replacement for precise logic. Many modern problems need a combination of both. Choose the tool based on the error you can live with — and have a plan for when the model messes up. Because it will.