Kiro CLI S3 403 Diagnosis: Let It Read, Keep the Fix Yours


A `403 Forbidden` on a bucket I owned cost me an afternoon once. The IAM policy was clean. The bucket policy looked fine. The object would not come down. The block turned out to be three layers away from where I was looking: a KMS key that my user could not call `Decrypt` on. S3 returned the same opaque 403 it returns for a missing bucket grant, a deny statement, or a network policy I had never thought to check. That ambiguity is the whole problem, and it is the problem Amazon's Kiro CLI is built to attack.

AWS published a walkthrough of troubleshooting S3 access denied errors with Kiro CLI, an AI-powered command-line tool that reads across IAM, bucket policy, KMS, and VPC endpoint policy in one pass and tells you which layer actually denied the request. It is genuinely useful, and I have started reaching for it during incidents.

There is a line the marketing around these tools tends to blur, and it is worth saying plainly. Kiro CLI is excellent at telling you *why* you are blocked. You should never let it tell you *how to unblock* without a human reading the diff. The read is the product. The fix is your liability.

The 403 is opaque on purpose, and that is what wastes your time

S3 returns `403 Forbidden` whether the denial comes from an explicit deny in a bucket policy, an absent allow anywhere in the chain, a missing `kms:Decrypt` grant on an encrypted object, or a VPC endpoint policy that drops the call before IAM is even consulted. Same status code, four different root causes, sometimes stacked. AWS chose generic codes so a denial does not leak which control blocked you. That is sound security. It is also the reason the debugging experience is miserable.

The manual cost of that ambiguity is the part operators underestimate. You end up tabbing between Permissions, Properties, and CloudTrail, correlating a denied API call against a nested JSON policy, and the failure modes do not announce themselves. An explicit deny and an implicit deny are operationally different. One actively overrides every allow you hold; the other is just the absence of a grant. Yet they surface identically.

Kiro CLI's contribution is to query those layers together and return a single root-cause summary instead of leaving you to assemble it. In a recorded session, listing a bucket's contents resolved in roughly 9 seconds and consumed 0.07 credits under its usage model. The speed is real. What I care about more is that it does not make you guess which layer to suspect first.

The three denials worth memorizing

The AWS walkthrough runs three scenarios, and they map cleanly onto the three layers people forget. I keep them in this order because it is the order I now check them in.

| Denial type | Where it lives | How it shows up | Why it fools you |
| --- | --- | --- | --- |
| Explicit deny | Bucket policy | A `Deny` on `s3:GetObject` for your ARN | Your IAM allow is correct, so you stop looking at IAM |
| Implicit deny (KMS) | KMS key policy | Object is SSE-KMS, you lack `kms:Decrypt` | Bucket ACL shows `FULL_CONTROL`, so storage looks fine |
| VPC endpoint policy | Network path | "no VPC endpoint policy allows the s3:GetObject action" | It runs before IAM evaluation, outside the policy chain you know |

The first scenario is the cleanest illustration of an AWS principle worth tattooing on the back of your hand: an explicit deny always wins. A bucket policy that denies `s3:GetObject` for `arn:aws:iam::123456789012:user/johndoe` overrides any allow that user holds in IAM. The user's permissions are sufficient and the download still fails, because deny precedence is absolute in AWS policy evaluation. If you do not internalize that, you will add redundant allow rules trying to out-vote a deny that cannot be out-voted, and you will widen your attack surface while the block stays exactly where it was.
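Deny precedence is easier to believe once you have watched it beat a pile of allows. What follows is a minimal sketch of the rule, not AWS's evaluator: it assumes statements are already flattened into plain dicts, treats `Principal` as a single ARN string, and ignores conditions, `NotAction`, SCPs, and permission boundaries. The bucket name `example-bucket` and the function name `evaluate` are made up for illustration.

```python
from fnmatch import fnmatchcase


def _as_list(value):
    """Policy fields may hold one string or a list of strings."""
    return [value] if isinstance(value, str) else list(value)


def _matches(patterns, value):
    # Simplification: case-sensitive glob match; real IAM action names
    # are matched case-insensitively.
    return any(fnmatchcase(value, p) for p in _as_list(patterns))


def evaluate(statements, principal, action, resource):
    """Return 'explicit_deny', 'allow', or 'implicit_deny' (simplified model)."""
    allowed = False
    for s in statements:
        applies = (
            _matches(s.get("Principal", "*"), principal)
            and _matches(s["Action"], action)
            and _matches(s["Resource"], resource)
        )
        if not applies:
            continue
        if s["Effect"] == "Deny":
            return "explicit_deny"  # one matching Deny ends the evaluation
        allowed = True
    return "allow" if allowed else "implicit_deny"


USER = "arn:aws:iam::123456789012:user/johndoe"
OBJ = "arn:aws:s3:::example-bucket/report.csv"  # hypothetical object

iam_allow = {"Effect": "Allow", "Action": "s3:*", "Resource": "*"}
bucket_deny = {
    "Effect": "Deny",
    "Principal": USER,
    "Action": "s3:GetObject",
    "Resource": "arn:aws:s3:::example-bucket/*",
}
extra_allow = {"Effect": "Allow", "Action": "s3:GetObject", "Resource": OBJ}

print(evaluate([], USER, "s3:GetObject", OBJ))                        # implicit_deny
print(evaluate([iam_allow], USER, "s3:GetObject", OBJ))               # allow
print(evaluate([iam_allow, bucket_deny, extra_allow],
               USER, "s3:GetObject", OBJ))                            # explicit_deny
```

Adding `extra_allow` changes nothing in the last call, which is the point: more allows never out-vote the deny.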

The KMS case is the one that got me. S3 evaluates object access separately from the decryption grant. You can own the bucket, own the object, hold `FULL_CONTROL` on both, and still eat a 403 because the object is encrypted with a KMS key and your principal lacks `kms:Decrypt`. The storage layer is innocent; the crypto layer is the wall. Manual inspection of bucket configuration will never find it, because the bucket configuration is correct.
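The gap can be stated as a set difference. This sketch assumes you already know the object's `ServerSideEncryption` value (as reported in its metadata) and have some list of actions your principal is granted; `required_actions`, `missing_actions`, and the `granted` set are hypothetical names, and the model deliberately ignores everything except the storage and crypto layers.

```python
# ServerSideEncryption values that mean the object is encrypted under a KMS key.
KMS_ALGORITHMS = {"aws:kms", "aws:kms:dsse"}


def required_actions(server_side_encryption):
    """Actions a download needs, given the object's encryption setting."""
    needed = {"s3:GetObject"}
    if server_side_encryption in KMS_ALGORITHMS:
        needed.add("kms:Decrypt")  # the crypto layer is a separate grant
    return needed


def missing_actions(server_side_encryption, granted):
    """Return the required actions the principal does not hold, sorted."""
    return sorted(required_actions(server_side_encryption) - set(granted))


granted = {"s3:GetObject", "s3:ListBucket"}  # storage side looks complete

print(missing_actions("AES256", granted))   # []
print(missing_actions("aws:kms", granted))  # ['kms:Decrypt']
```

The same `granted` set passes for an SSE-S3 object and fails for an SSE-KMS one, which is why staring at the bucket configuration never surfaces the problem.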

The VPC endpoint case is the hardest to spot precisely because it sits outside the evaluation path everyone has memorized. An EC2 role with sufficient S3 permissions still gets denied when traffic routes through a gateway endpoint whose policy does not allow the action. IAM is fine. The bucket is fine. The network authorization layer said no first.
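If you have the endpoint's policy document in hand as a JSON string, a few lines are enough to ask the question the error message is asking: does any `Allow` statement cover this action on this resource? This is a rough sketch under assumptions: it only looks at `Allow` statements and ignores `Principal`, `Condition`, and any `Deny`; the function name `endpoint_has_allow` and both sample policies are illustrative.

```python
import json
from fnmatch import fnmatchcase


def _as_list(value):
    """Policy fields may hold one string or a list of strings."""
    return [value] if isinstance(value, str) else list(value)


def endpoint_has_allow(policy_document, action, resource):
    """True if some Allow statement in the policy covers action and resource."""
    statements = json.loads(policy_document).get("Statement", [])
    if isinstance(statements, dict):  # a single statement may not be wrapped in a list
        statements = [statements]
    for s in statements:
        if s.get("Effect") != "Allow" or "Action" not in s or "Resource" not in s:
            continue
        action_ok = any(fnmatchcase(action, p) for p in _as_list(s["Action"]))
        resource_ok = any(fnmatchcase(resource, p) for p in _as_list(s["Resource"]))
        if action_ok and resource_ok:
            return True
    return False


OBJ = "arn:aws:s3:::example-bucket/report.csv"  # hypothetical object

FULL_ACCESS = json.dumps({"Statement": [
    {"Effect": "Allow", "Principal": "*", "Action": "*", "Resource": "*"},
]})
LIST_ONLY = json.dumps({"Statement": [
    {"Effect": "Allow", "Principal": "*", "Action": "s3:ListBucket",
     "Resource": "arn:aws:s3:::example-bucket"},
]})

print(endpoint_has_allow(FULL_ACCESS, "s3:GetObject", OBJ))  # True
print(endpoint_has_allow(LIST_ONLY, "s3:GetObject", OBJ))    # False
```

A role with full S3 permissions routed through the `LIST_ONLY` endpoint still fails the download, even though nothing in IAM or the bucket policy has changed.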

Why I let it diagnose and never let it remediate

Here is where I part ways with the convenience narrative. Kiro CLI can suggest policy changes and ask permission to apply them. The AWS post is explicit that its walkthrough demonstrates diagnostic capability and stops there; it does not show automated remediation, and that scoping is the most important sentence in the whole document.

Diagnostic accuracy does not transfer to remediation safety. Reading "you are blocked by an explicit deny" is a fact about the present state. Writing "remove the deny" is a change to a production policy that may exist for a reason the tool cannot see. A generated `Allow` that resolves your immediate 403 can expose encrypted objects to principals who should never have had them, which is a worse outcome than the outage you started with. When the change is a Service Control Policy, a single over-broad grant can ripple across an entire organization. The tool drafts the policy from natural language fast; it cannot reason about the blast radius of applying it.

The workflow I enforce on my own team is therefore built into the mechanics, so that nobody has to remember to be careful. When Kiro CLI prompts, you answer `y` (approve this one read) and never `t` (trust the session). Choosing `y` per command keeps the run a read-only audit; `t` hands it standing authority to act, and that is the exact moment a diagnostic session becomes an incident.

Run it with temporary credentials from AWS Identity Center or STS rather than long-term keys. If a diagnostic context is going to read across four policy layers, it should expire on its own. And every suggested fix is a draft for a human, applied by hand through the AWS CLI (`aws iam`, `aws s3api`, or `aws kms`, depending on which layer the policy lives in), after someone who understands the dependent services has read it.
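A cheap guard for the temporary-credentials rule is to look at the access key ID itself: AWS documents `AKIA` as the prefix for long-term access keys and `ASIA` for temporary keys issued by STS. The sketch below is only a pre-flight sanity check, not a substitute for checking where the credentials came from; `credential_kind` is a hypothetical helper and both key IDs are placeholder examples.

```python
def credential_kind(access_key_id):
    """Classify an access key ID by its documented four-letter prefix."""
    prefix = access_key_id[:4]
    if prefix == "ASIA":
        return "temporary"   # issued by AWS STS, expires on its own
    if prefix == "AKIA":
        return "long-term"   # IAM user access key, lives until rotated
    return "unknown"


print(credential_kind("ASIAIOSFODNN7EXAMPLE"))  # temporary
print(credential_kind("AKIAIOSFODNN7EXAMPLE"))  # long-term
```

In practice you would feed it whatever key ID your shell is about to use, for example the value of the `AWS_ACCESS_KEY_ID` environment variable, and refuse to start a diagnostic session on anything that is not `temporary`.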

The cost asymmetry justifies the friction. The diagnosis is cheap and fast. A bad remediation is measured in downtime hours and, worse, in silent exposure you do not notice until later. I will trade nine seconds of speed for an afternoon of not having caused a breach.

Working the next 403 by hand

When a bucket returns 403, the order in which you ask questions decides how fast you get to the answer, and that order follows from how AWS evaluates the request. It holds whether or not Kiro CLI is in the loop: the tool accelerates the read, but the reasoning underneath stays the same, and it is worth being able to do it without the tool.

Start by being sure of who you are. Run `aws sts get-caller-identity` before anything else, because stale session credentials will skew every conclusion downstream, and a surprising number of 403 hunts end the moment someone realizes they were authenticated as the wrong principal. Once the identity is confirmed, look at the bucket policy for an explicit deny before you touch IAM at all. Deny precedence is absolute, so a single `Deny` statement is both the cheapest thing to rule out and the most likely to be hiding under a correct-looking IAM allow.
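The identity check is worth scripting because the mistake is so easy to make. `aws sts get-caller-identity` prints a JSON object with `UserId`, `Account`, and `Arn`; the sketch below parses that output and tells you what kind of principal you actually are. The sample payloads, the `UserId` values, and the helper name `describe_caller` are illustrative assumptions.

```python
import json


def describe_caller(identity_json):
    """Summarize the JSON printed by `aws sts get-caller-identity`."""
    identity = json.loads(identity_json)
    # ARN layout: arn:partition:service:region:account-id:resource
    parts = identity["Arn"].split(":", 5)
    service, resource = parts[2], parts[5]
    kind, _, name = resource.partition("/")
    return {
        "account": identity["Account"],
        "service": service,  # 'iam' for users, 'sts' for assumed roles
        "kind": kind,        # e.g. 'user' or 'assumed-role'
        "name": name,
    }


AS_USER = json.dumps({
    "UserId": "AIDAEXAMPLEUSERID",
    "Account": "123456789012",
    "Arn": "arn:aws:iam::123456789012:user/johndoe",
})
AS_ROLE = json.dumps({
    "UserId": "AROAEXAMPLEROLEID:alex",
    "Account": "123456789012",
    "Arn": "arn:aws:sts::123456789012:assumed-role/Diagnostics/alex",
})

print(describe_caller(AS_USER)["kind"])  # user
print(describe_caller(AS_ROLE)["name"])  # Diagnostics/alex
```

If the `kind` or `name` is not the principal you thought you were debugging, stop there; every later conclusion would be about the wrong identity.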

If identity and bucket policy both check out and the object is encrypted, the question moves to the crypto layer. Verify `kms:Decrypt` on the key independently of any bucket grant, because `FULL_CONTROL` on storage tells you nothing about whether your principal can decrypt. And if IAM, bucket policy, and KMS all read clean and the call still fails, the remaining suspect is the network path: a VPC endpoint policy can reject the action before IAM is ever consulted, which is why it survives every check you ran inside the policy chain.
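The order described above can be written down as a small checklist walker. This is a sketch of the triage sequence only: it does not call AWS, and it assumes you record each layer's result yourself as `True` (reads clean), `False` (blocking), or leave it out (not checked yet). The layer names and `first_blocking_layer` are hypothetical.

```python
# Layers in the order they are worth checking, with a reminder for each.
CHECK_ORDER = [
    ("identity", "Confirm the caller with `aws sts get-caller-identity`"),
    ("bucket_policy", "Look for an explicit Deny in the bucket policy"),
    ("iam", "Confirm an Allow exists for the action in IAM"),
    ("kms", "Verify kms:Decrypt on the key if the object is SSE-KMS"),
    ("vpc_endpoint", "Check the VPC endpoint policy on the network path"),
]


def first_blocking_layer(findings):
    """Return (status, layer, hint) for the first layer that is not clean."""
    for layer, hint in CHECK_ORDER:
        status = findings.get(layer)
        if status is None:
            return ("unchecked", layer, hint)  # do this check next
        if status is False:
            return ("blocked", layer, hint)    # stop: this layer denies
    return ("clean", None, "All modeled layers read clean")


print(first_blocking_layer({"identity": True}))
# ('unchecked', 'bucket_policy', 'Look for an explicit Deny in the bucket policy')
print(first_blocking_layer(
    {"identity": True, "bucket_policy": True, "iam": True, "kms": False}
)[:2])
# ('blocked', 'kms')
```

The walker refuses to skip ahead: a `False` on `vpc_endpoint` is never reported while `bucket_policy` is still unchecked, which mirrors checking the cheapest and most likely layer first.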

The last criterion concerns the fix rather than the diagnosis. When you reach a remediation, draft it, read it, and apply it by hand. Approve reads with `y` and never `t`, and never let the tool write a policy you have not checked yourself. If you are on long-term IAM keys because federated identity is not available yet, rotate them every 90 days at the very outside. Even then, that interval is the worst acceptable case rather than a target; what you actually want are short-lived credentials that nobody has to remember to rotate.
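The 90-day ceiling is simple date arithmetic on a key's creation timestamp, which `aws iam list-access-keys` reports as `CreateDate`. A minimal sketch, assuming the timestamp arrives as an ISO 8601 string with a UTC offset or a trailing `Z`; the helper names and sample dates are made up.

```python
from datetime import datetime, timedelta, timezone

MAX_KEY_AGE = timedelta(days=90)  # worst acceptable case, not a target


def key_age(create_date, now=None):
    """Age of an access key, given its ISO 8601 creation timestamp."""
    now = now or datetime.now(timezone.utc)
    # Older Python versions do not parse a trailing 'Z', so normalize it.
    created = datetime.fromisoformat(create_date.replace("Z", "+00:00"))
    return now - created


def needs_rotation(create_date, now=None):
    """True once the key is older than the 90-day ceiling."""
    return key_age(create_date, now) > MAX_KEY_AGE


check_time = datetime(2024, 4, 1, tzinfo=timezone.utc)  # fixed for the example

print(key_age("2024-01-01T00:00:00Z", check_time).days)             # 91
print(needs_rotation("2024-01-01T00:00:00Z", check_time))           # True
print(needs_rotation("2024-03-01T00:00:00+00:00", check_time))      # False
```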

About

I am Alex Kumar, Senior Platform Engineer and Infrastructure Architect at Rabata.io. The storage we run is S3-compatible, and my week is mostly backup design, disaster-recovery drills, and the permission models that decide who can read what. Access-denied incidents are a recurring theme; I have worked enough of them at 3 AM, here and as a staff SRE on a SaaS platform serving more than a million daily users, that the pattern recognition is involuntary by now. The e-commerce DevOps work before that taught me the same lesson from the revenue side: a misapplied grant is its own outage.

The way I treat any tool that offers to fix infrastructure for me comes straight from how I run DR. I assume nothing works until a test proves it does, and I do not let automation apply a change I have not read line by line. That is why this piece lands where it does. Kiro CLI keeps a permanent slot in how I diagnose, and exactly zero authority to touch a live policy.

Conclusion

Kiro CLI solves a real and specific problem. The S3 `403` is deliberately opaque, and correlating four policy layers by hand during an incident is slow and error-prone. Letting an AI read the full permission chain and name the blocking layer in seconds is a legitimate operational win, and I use it. The same fluency that makes the diagnosis fast also makes the remediation dangerous, because a generated policy applies your uncertainty to production with full force.

The discipline that keeps it safe is simple to state: trust the read, audit the write. Approve each command with `y`, run on ephemeral credentials, and put a human between every suggested fix and the policy it would change. As these tools keep getting faster at reading our infrastructure, I expect human review on the write side to become the part of the job worth protecting.

Frequently Asked Questions

Why does S3 return the same 403 for every kind of denial?

AWS uses a generic `403 Forbidden` so the response does not leak which control blocked you, whether that is an explicit bucket deny, a missing allow, a missing `kms:Decrypt` grant, or a VPC endpoint policy. That is good security and poor debugging ergonomics, which is exactly why a tool that reads all four layers at once saves real time.

Should I let Kiro CLI apply the fixes it suggests?

No. Use it to diagnose, not to remediate. AWS documents the walkthrough as diagnostic only, and a generated allow can expose data or, as an SCP, disrupt an entire organization. Read every suggested change and apply it by hand through the AWS CLI after a human confirms it is safe for dependent services.

What is the difference between answering `y` and `t` at a Kiro CLI prompt?

Entering `y` approves one specific read command and keeps the session a read-only audit. Entering `t` grants persistent trust for the session, letting the tool act without re-asking. Always use `y` so you can validate every action before it happens; `t` is how a diagnostic run quietly turns into an unreviewed change.

Why do I get a 403 on an object when I hold `FULL_CONTROL`?

S3 evaluates object access separately from the decryption grant. If the object is encrypted with SSE-KMS and your principal lacks `kms:Decrypt` on the key, you get a 403 even with `FULL_CONTROL` on the bucket and object. Check the KMS key policy independently of the storage permissions; the storage layer is not the problem.

Why is a VPC endpoint policy denial so hard to spot?

It operates outside the IAM and bucket-policy evaluation path most people have memorized. A request can pass IAM and bucket checks and still be denied because the gateway endpoint policy rejects the action before IAM is consulted. If identity and resource policies both look correct and the call still fails, suspect the network path next.