Authorization and search operations

Background: at work I help to build a SaaS web application for healthcare. An important aspect of our web application is authorization. It's a pretty hard problem, because the business has a fairly complex set of rules about who can see what data. The rules involve attributes of the subject and object, and the direct or indirect (e.g. via a group) relationship between them. It's also a pretty important problem: healthcare data is typically very sensitive, and we need to obey the law and keep our users' trust in order to provide useful services. The problem is also constrained by speed: in order to render a web page in a reasonable time, e.g. under a second, data must be fetched and authorized in just a few milliseconds.

One way to imagine authorization is as a function: allow(subject, action, object) which returns a simple boolean for permitted or not permitted. The 'action' would typically be 'create', 'read', 'update' or 'delete', but it could be something else for more complex operations. This post is going to explore 'read', and more specifically 'search' operations because they are extremely common and they introduce some complexity that can be tackled in several different ways.
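To make the shape of that function concrete, here is a minimal sketch. The Subject, Record and Action types and the two rules (owners may do anything, group members may only read) are assumptions made up for illustration; real rules would be considerably richer.

```rust
/// Hypothetical action type; a real system may need more variants.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Action {
    Create,
    Read,
    Update,
    Delete,
}

/// Hypothetical subject: a user and the groups they belong to.
#[derive(Debug)]
pub struct Subject {
    pub id: u32,
    pub groups: Vec<u32>,
}

/// Hypothetical object: a record with an owner, shared with some groups.
#[derive(Debug)]
pub struct Record {
    pub owner_id: u32,
    pub shared_with_groups: Vec<u32>,
}

/// Returns true if `subject` may perform `action` on `object`.
pub fn allow(subject: &Subject, action: Action, object: &Record) -> bool {
    // Direct relationship: the owner may perform any action.
    if subject.id == object.owner_id {
        return true;
    }
    // Indirect relationship via a group: read access only.
    action == Action::Read
        && subject
            .groups
            .iter()
            .any(|g| object.shared_with_groups.contains(g))
}
```

The rest of this post passes the action as a string for brevity; an enum like the one above is just one way to write it.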

Logically a 'search' might be considered the same as a 'read' since the end result is the same: returning the data to the requestor. It might then be reasonable to expect the same authorization rules to apply. The difference is that in a 'read' operation, the requestor already knows how many results they're expecting and has some kind of unique identifier for that data, whereas in a 'search' operation they don't know in advance how many results they will get, or what the unique identifiers are for that data. This has an impact on the implementation of those authorization rules.

Authorizing a read operation can be quite simple: it's possible to have an actual function which is executed after fetching the data from storage, and if the read isn't allowed then return an error, e.g. an HTTP 403 error would be appropriate for a web-based API.

fn read_foo(subject, foo_id) -> Result<Foo, Error> {
    let data: Foo = fetch_foo(foo_id);
    if !allow(subject, "read", data) {
        Err(Error::Forbidden)
    } else {
        Ok(data)
    }
}

Authorizing a search operation is more complicated since we don't know how many objects we will actually receive. A naïve solution is to simply filter unauthorized results from the set.

fn search_foo(subject, criteria) -> Result<Vec<Foo>> {
    fetch_foos(criteria)
        .filter(|data| {allow(subject, "read", data)})
        .collect()
}

This solution has several issues. Firstly, leakage: redacted results leak the fact that there was something to redact. By making repeated searches and varying the criteria, someone can work out some attributes of the hidden data. Secondly, paging: redacting results can trigger unpleasant edge cases, e.g. a whole page of redacted results, which is a poor experience if those pages are presented to an end user. Thirdly, efficiency: someone might issue a query for many results they can't see, and our application might fetch and process a lot of data only for most of that work to be thrown away by filtering. Despite these disadvantages, post-filtering is an attractive option because it is simple: read and search operations can be authorized in the same way, even using the same function.
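The paging edge case is easy to reproduce. The sketch below assumes storage hands back fixed-size pages and the filter runs afterwards; the `visible` flag is a stand-in for the result of allow(subject, "read", data), and the Foo fields are invented for the example.

```rust
/// Hypothetical record; `visible` stands in for allow(subject, "read", data).
#[derive(Debug, Clone, PartialEq)]
pub struct Foo {
    pub id: u32,
    pub visible: bool,
}

/// Take one page from storage first, then redact unauthorized rows.
/// This is the post-filtering order of operations.
pub fn search_page(store: &[Foo], page: usize, page_size: usize) -> Vec<Foo> {
    store
        .iter()
        .skip(page * page_size)
        .take(page_size)
        .filter(|foo| foo.visible)
        .cloned()
        .collect()
}
```

If the first three of six records are hidden, asking for page 0 with a page size of 3 returns nothing, even though page 1 holds three visible results. The empty page also tells the requestor that three hidden records matched.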

An alternative approach is to bake authorization rules into queries; e.g. if the application's storage is a database, then authorization might become part of an SQL query.

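fn search_foo(subject, criteria) -> Result<Vec<Foo>> {
    // Extend the caller's criteria with the authorization conditions.
    let new_criteria = add_criteria_for_authz("foo", subject, criteria);
    // Query with the extended criteria, not the original ones.
    fetch_foos(new_criteria)
}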

This solution also has downsides. Notably, there is no longer any code sharing between read and search operations: if the rules change, both the allow and add_criteria_for_authz functions must be updated.
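For a sense of what add_criteria_for_authz might do, here is a minimal sketch. It assumes the criteria are a parameterized SQL WHERE fragment and that the only rule is "a subject may read rows they own". The Criteria and Subject types and the owner_id column are hypothetical.

```rust
/// Hypothetical search criteria: a WHERE fragment plus its bound parameters.
/// Parameters are kept as strings for brevity.
#[derive(Debug, Clone, PartialEq)]
pub struct Criteria {
    pub where_sql: String,
    pub params: Vec<String>,
}

/// Hypothetical subject, identified only by an id.
pub struct Subject {
    pub id: u32,
}

/// Assumed rule for illustration: a subject may read rows they own.
pub fn add_criteria_for_authz(table: &str, subject: &Subject, criteria: &Criteria) -> Criteria {
    let mut params = criteria.params.clone();
    params.push(subject.id.to_string());
    // Parenthesise the caller's clause so that an OR inside it cannot
    // bypass the authorization condition.
    let where_sql = format!(
        "({}) AND {}.owner_id = ${}",
        criteria.where_sql,
        table,
        params.len()
    );
    Criteria { where_sql, params }
}
```

The ownership rule now lives here as well as in allow, which is exactly the duplication described above.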

A variation on this approach, which preserves a single implementation for read and search, is possible using Oso. The Oso policy evaluation engine is capable of partially evaluating a policy: instead of emitting a simple boolean, it returns a set of constraints which must be met to ensure the data fetch is authorized. Those constraints might be turned into query parameters for the underlying storage. This approach is called data filtering in Oso's documentation. Code for it might look like this:

fn search_foo(subject, criteria) -> Result<Vec<Foo>> {
    // Passing a variable instead of a real object yields constraints, not a boolean.
    let constraints = allow(subject, "read", Variable{});
    let query = add_constraints(criteria, constraints);
    fetch_foos(query)
}

Our 'allow' function can now be shared between read operations where it receives the real object, and searches where it receives a variable and returns constraints. The downside of this approach is the complexity of the add_constraints function. We might end up in a situation where it's not possible to turn some complicated set of constraints into a reasonable query.
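To illustrate where that complexity comes from, here is a sketch of the translation step. The Constraint type is invented for this example and is not Oso's actual data structure; it only covers equality, membership and boolean combinations, and even then the empty cases need care.

```rust
/// Hypothetical constraint tree; NOT Oso's actual representation.
#[derive(Debug, Clone, PartialEq)]
pub enum Constraint {
    Eq { field: &'static str, value: i64 },
    In { field: &'static str, values: Vec<i64> },
    And(Vec<Constraint>),
    Or(Vec<Constraint>),
}

/// Render a constraint as an SQL boolean expression.
/// Values are integers and field names are static, so nothing needs escaping here.
pub fn to_sql(c: &Constraint) -> String {
    match c {
        Constraint::Eq { field, value } => format!("{} = {}", field, value),
        // "IN ()" is not valid SQL, and an empty set matches nothing.
        Constraint::In { values, .. } if values.is_empty() => "FALSE".to_string(),
        Constraint::In { field, values } => {
            let list: Vec<String> = values.iter().map(|v| v.to_string()).collect();
            format!("{} IN ({})", field, list.join(", "))
        }
        // An empty AND is trivially true; an empty OR is trivially false.
        Constraint::And(parts) => join(parts, " AND ", "TRUE"),
        Constraint::Or(parts) => join(parts, " OR ", "FALSE"),
    }
}

/// Join sub-expressions with `op`, wrapped in parentheses to preserve precedence.
fn join(parts: &[Constraint], op: &str, empty: &str) -> String {
    if parts.is_empty() {
        return empty.to_string();
    }
    let rendered: Vec<String> = parts.iter().map(to_sql).collect();
    format!("({})", rendered.join(op))
}
```

Constraints that refer to other tables, e.g. group membership stored elsewhere, would need joins or subqueries, and that is where a translation like this starts to get difficult.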

A completely different approach is to push the authorization check further down the stack. Most storage engines offer some kind of authorization facility. PostgreSQL has an extensive permission system, as well as row security policies. These could be used to implement authorization policies independently of any queries an application might issue. There are a couple of advantages to this approach. Firstly, arbitrary queries can be authorized. This means authorization happens even if the application layer is bypassed, e.g. by a statistics gathering process or a manual query. Secondly, queries can be simpler for the developer to write. This is especially useful when joins are involved. If authorization is implemented as part of the query itself, then security policies for both tables involved in the join have to be combined somehow. If row level security is used, however, then the PostgreSQL query planner handles applying the relevant policies to each table, and the developer's query can remain simple.

One downside to this approach is that it's very specific to the storage implementation. If an application has to support multiple storage backends, e.g. a MySQL database or a plain filesystem, then the implementation will be very different for each one. Another downside is the expression of authorization rules: PostgreSQL row level security policies are written as SQL expressions which must return 'true' or 'false', with a few built-in variables available. These policies may not be easy to read and test, and they certainly aren't portable across different systems.
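As a rough illustration of what such a policy looks like, the sketch below builds the two SQL statements needed to restrict reads on a table to rows owned by the current user. The owner_id column, the app.user_id setting name and the helper function itself are assumptions for this example; the SQL is generated from Rust only to stay in the same language as the other snippets.

```rust
/// Build the SQL that restricts SELECTs on `table` to rows owned by the current user.
/// Assumptions: the table has an integer `owner_id` column, and the application
/// stores the current user's id in the custom setting `app.user_id`.
/// `table` must be a trusted identifier, since it is interpolated directly.
pub fn row_security_statements(table: &str) -> Vec<String> {
    vec![
        // Policies have no effect until row level security is enabled on the table.
        format!("ALTER TABLE {} ENABLE ROW LEVEL SECURITY", table),
        // The USING expression is evaluated per row and must yield a boolean.
        format!(
            "CREATE POLICY {}_read ON {} FOR SELECT \
             USING (owner_id = current_setting('app.user_id')::integer)",
            table, table
        ),
    ]
}
```

The application would then set app.user_id on each connection or transaction before querying. Note that the table owner and superusers bypass these policies by default, so the application should connect as a different, less privileged role.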

I made a quick experiment to invoke Oso's authorization engine from a PostgreSQL row level security policy. This is an attempt to keep the advantages of row level security and mitigate one of the downsides. It should be possible to define access control policies in Oso's policy language. The policies themselves are then no longer defined in a database-specific language and might be re-used in other contexts, as well as tested independently of the database. This does introduce an additional downside: it requires customization of the PostgreSQL installation, which may be undesirable in some circumstances.
