Designing Permission Systems That Don't Become Spaghetti
Every application starts with a single permission check and ends with a tangle of role flags, nested conditionals, and special cases that nobody fully understands. The path from one to the other is well-trodden. Here is what makes permission systems hold up.
Almost every application begins authorization with a single line: if user.is_admin. That line works for the first six months. Then a customer asks for read-only users. Then a billing user. Then a team admin who can manage their own team but not other teams. Then a contractor who can see invoices but not edit them. Then a support engineer who can read everything but only modify status fields. Each of these requirements arrives independently and gets bolted on with another flag, another role, another conditional. Two years later the authorization layer is hundreds of branching ifs that nobody can reason about and that everyone is afraid to touch.
So many teams have taken this path that its failure modes are predictable. The systems that hold up are the ones whose authors saw this future early and chose a model with structural answers, rather than letting the model emerge from incident-driven patching.
Roles vs. permissions
The first vocabulary distinction that helps is between roles and permissions. A permission is a primitive: invoices.read, invoices.write, users.delete. A role is a named bundle of permissions: admin includes everything, billing includes invoices.* and payments.*, and support includes read access to everything plus a small set of writes.
The mistake is to check roles in code: if user.role == 'admin'. This works until the day a new role needs to do something that admin already does, and you discover that the code conflates "is admin" with "can do this thing." The right pattern is to check permissions in code (if user.has_permission('invoices.write')) and define the relationship between roles and permissions in a configuration that lives outside the code path. When a new role is introduced, no code changes; only the role-to-permissions mapping changes.
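A minimal sketch of that pattern in Python follows. The ROLE_PERMISSIONS table, the User class, and the tickets.write permission are illustrative assumptions, and the mapping is inlined only to keep the example self-contained; in a real system it would be loaded from configuration. Wildcards such as invoices.* are matched with the standard library's fnmatchcase.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase

# Hypothetical role-to-permission mapping. In practice this would be loaded
# from configuration (a file or a database table), not defined in code.
ROLE_PERMISSIONS = {
    "admin": ["*"],
    "billing": ["invoices.*", "payments.*"],
    "support": ["*.read", "tickets.write"],  # tickets.write is an assumed example
}


@dataclass
class User:
    name: str
    roles: list[str]

    def has_permission(self, permission: str) -> bool:
        """Return True if any of the user's roles grants the permission."""
        for role in self.roles:
            for pattern in ROLE_PERMISSIONS.get(role, []):
                # Shell-style matching: "invoices.*" matches "invoices.write".
                if fnmatchcase(permission, pattern):
                    return True
        return False
```

Call sites only ever ask user.has_permission('invoices.write'). Introducing a new role means adding one entry to the mapping, and none of the call sites change.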
The granularity question
The hardest decision in any permission system is granularity. Too coarse, and you cannot express the access patterns customers actually want; you end up with workarounds in code or with overly broad roles that grant more than they should. Too fine, and the surface area of permissions becomes its own problem: hundreds of permission strings that nobody can keep track of, role definitions that span pages, and a maintenance burden that compounds with every new feature.
The heuristic that works is to start with permissions at the level of resources and verbs (resource.action) and add specificity only when a real customer scenario demands it. Resist the temptation to pre-build permissions for hypothetical future needs. A permission you defined but never enforced is worse than no permission at all because it lies about your security model.
The pattern of resource.action covers most needs. The action is one of a small set: read, write, delete, admin, with optional refinements such as create versus update when those are meaningfully different. The resource is the object type from your domain, which gives names such as invoices.read, flags.write, and users.admin. This naming convention scales to a few hundred permissions before it starts to feel arbitrary.
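One way to keep the catalog honest is to validate every permission name against the convention when roles are loaded. The sketch below is an assumption about how that could look: the ALLOWED_ACTIONS set mirrors the verbs listed above, and the parse_permission helper and the lowercase naming rule are hypothetical.

```python
import re

# Assumed action vocabulary, taken from the verbs named in the text.
ALLOWED_ACTIONS = {"read", "write", "delete", "admin", "create", "update"}

# Exactly two lowercase segments separated by a single dot.
_PERMISSION_RE = re.compile(r"([a-z][a-z0-9_]*)\.([a-z]+)")


def parse_permission(name: str) -> tuple[str, str]:
    """Split 'resource.action' into its parts, rejecting malformed names."""
    match = _PERMISSION_RE.fullmatch(name)
    if match is None:
        raise ValueError(f"not in resource.action form: {name!r}")
    resource, action = match.groups()
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unknown action {action!r} in {name!r}")
    return resource, action
```

A check like this rejects names such as invoices.write.own at load time, which forces the conversation about whether a new verb or a new mechanism is actually needed.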
The ownership dimension
The single thing that turns a clean permission model into a tangle of special cases is ownership. The customer asks: this user should be able to edit their own invoices but not other users' invoices. The naive response is to add a permission like invoices.write.own. The next request is the same shape: edit invoices for their team but not other teams. Now you have invoices.write.own and invoices.write.team. By the third request, the permission space has exploded.
The structural answer is to separate the permission check (can this user perform this action?) from the scope check (on which objects?). The permission grants a capability; the scope determines the set of objects to which the capability applies. The check becomes a two-part question: does the user have invoices.write, and is the target invoice within the user's scope? Scopes are typically expressed as: own (target.owner_id == user.id), team (target.team_id in user.team_ids), tenant (target.tenant_id == user.tenant_id), or unrestricted.
This separation collapses the permission space dramatically. Instead of invoices.write.own / invoices.write.team / invoices.write.tenant, you have one permission and a scope attribute on the role. The same permission can be assigned at different scopes for different roles, without inflating the permission catalog.
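The two-part check can be sketched as follows. The Scope enum, the grants dictionary on User, and the can helper are illustrative assumptions; the scope predicates themselves follow the definitions given above.

```python
from dataclasses import dataclass, field
from enum import Enum


class Scope(Enum):
    OWN = "own"
    TEAM = "team"
    TENANT = "tenant"
    UNRESTRICTED = "unrestricted"


@dataclass
class User:
    id: int
    tenant_id: int
    team_ids: set[int] = field(default_factory=set)
    # Hypothetical shape: permission -> scope at which the user's role grants it.
    grants: dict[str, Scope] = field(default_factory=dict)


@dataclass
class Invoice:
    owner_id: int
    team_id: int
    tenant_id: int


def in_scope(user: User, target: Invoice, scope: Scope) -> bool:
    """Scope check: is the target among the objects this grant covers?"""
    if scope is Scope.UNRESTRICTED:
        return True
    if scope is Scope.TENANT:
        return target.tenant_id == user.tenant_id
    if scope is Scope.TEAM:
        return target.team_id in user.team_ids
    return target.owner_id == user.id  # Scope.OWN


def can(user: User, permission: str, target: Invoice) -> bool:
    """Permission check first, then the scope check on the specific target."""
    scope = user.grants.get(permission)
    if scope is None:
        return False
    return in_scope(user, target, scope)
```

A team admin and a regular user can both hold invoices.write here; only the scope stored alongside the grant differs.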
Attribute-based access control
For the cases where ownership-based scopes are not enough (the customer wants users to edit invoices in any state except "finalized," for example), the next tool is attribute-based access control (ABAC). An ABAC check considers attributes of the actor, the action, and the target. For example: can_edit_invoice = has_permission('invoices.write') AND target.status != 'finalized' AND (target.owner_id == user.id OR has_permission('invoices.admin')).
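Written as a runnable Python function, the same rule might look like the sketch below. The User and Invoice classes are minimal stand-ins assumed for illustration; only the boolean expression comes from the rule above.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    id: int
    permissions: set[str] = field(default_factory=set)

    def has_permission(self, permission: str) -> bool:
        return permission in self.permissions


@dataclass
class Invoice:
    owner_id: int
    status: str


def can_edit_invoice(user: User, invoice: Invoice) -> bool:
    """ABAC rule combining actor permissions with target attributes."""
    return (
        user.has_permission("invoices.write")
        and invoice.status != "finalized"
        and (invoice.owner_id == user.id or user.has_permission("invoices.admin"))
    )
```

Keeping each such rule in one named function, rather than inlining the expression at call sites, makes the handful of attribute-dependent rules easy to find and review.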
ABAC is more expressive than role-based access control, but it has a serious cost: the permission logic now lives in code rather than configuration, and the checks become harder to audit. The pragmatic compromise is to use roles and scopes for the bulk of authorization decisions, and reserve ABAC for the small set of attribute-dependent rules. A typical application has perhaps five to ten ABAC rules out of dozens of total permissions; trying to do everything with ABAC means scattering authorization logic across the codebase where nobody can find it.
Auditability and change
The feature that distinguishes a permission system that works from one that has gradually decayed is its support for auditing changes. Permission changes are high-stakes: a misconfigured role assignment can grant unauthorized access for as long as it goes unnoticed. The system should log every change to roles and role assignments, with the actor and timestamp. The audit log should be queryable by the affected user (what permissions did this user have on this date?) and by the role (who has had the admin role assigned in the last 90 days?).
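A minimal in-memory sketch of such a log is shown below. It covers role assignments only, not changes to role definitions, and the RoleChange record and AuditLog class are hypothetical names; a production system would persist the events in an append-only store.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class RoleChange:
    at: datetime   # when the change happened
    actor: str     # who made the change
    user: str      # whose assignment changed
    role: str
    action: str    # "grant" or "revoke"


class AuditLog:
    def __init__(self) -> None:
        self._events: list[RoleChange] = []

    def record(self, event: RoleChange) -> None:
        self._events.append(event)

    def roles_on(self, user: str, when: datetime) -> set[str]:
        """Replay events in time order to reconstruct a user's roles at `when`."""
        roles: set[str] = set()
        for event in sorted(self._events, key=lambda e: e.at):
            if event.user != user or event.at > when:
                continue
            if event.action == "grant":
                roles.add(event.role)
            else:
                roles.discard(event.role)
        return roles

    def granted_since(self, role: str, since: datetime) -> set[str]:
        """Users who were granted `role` at or after `since`."""
        return {
            e.user
            for e in self._events
            if e.role == role and e.action == "grant" and e.at >= since
        }
```

The 90-day question becomes granted_since('admin', since) with since set to a datetime 90 days in the past. To answer the per-user question in terms of permissions rather than roles, the role definitions in force on that date would have to be replayed in the same way.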
The other auditability feature is impact analysis. Before changing a role's permissions, you should be able to ask the system: which users currently hold this role, and what would change for them? This catches the class of mistake where a role gets a new permission that was intended for one specific user but ends up granted to fifty.
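Impact analysis can be sketched as a dry run over the current assignments. The impact_of_change function and the dictionary shapes below are assumptions for illustration. The sketch computes each affected user's effective permissions before and after the change, so a permission the user already holds through another role is not reported as gained.

```python
def impact_of_change(
    role: str,
    new_permissions: set[str],
    role_permissions: dict[str, set[str]],
    user_roles: dict[str, set[str]],
) -> dict[str, tuple[set[str], set[str]]]:
    """Return {user: (gained, lost)} for users whose access would change."""
    report = {}
    for user, roles in user_roles.items():
        if role not in roles:
            continue  # user does not hold the role being edited
        # Effective permissions today: the union over all of the user's roles.
        before = set().union(*(role_permissions.get(r, set()) for r in roles))
        # Effective permissions if the edited role had the proposed definition.
        after = set().union(
            *(
                new_permissions if r == role else role_permissions.get(r, set())
                for r in roles
            )
        )
        gained, lost = after - before, before - after
        if gained or lost:
            report[user] = (gained, lost)
    return report
```

Nothing is written by this function; the report is meant to be reviewed before the new role definition is saved.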
The four APIs we run at DocuMint, CronPing, FlagBit, and WebhookVault use simple permission models because they are single-tenant per API key. The deeper rule applies regardless: clean permissions stay clean only when the model has structural answers to the predictable extension requests, rather than emerging from a sequence of patches.