a tangled mess of cables in a physical database location

Want to hear something about query languages that will make you actually appreciate callback functions and promises in programming languages?

I have experienced what it’s like solving a problem that would be perfect for them and I didn’t like it.

In JavaScript, you can just write something that says “wait for this step to be done, then do this other thing”.

You know, your classic kind of ‘Are we there yet?’ that you absolutely have to wait to get resolved.

functionweGetThereWhenWeGetThere(there){ 
    returnnewPromise((resolve, reject) => {
        if (here !== there) {
            reject(newError('We are not there yet'))
        }
        else {
            resolve(there)
        }
    })
}

Oracle SQL doesn’t appear to me to have reached the concept of promises yet.

Database engines have their own type of PEMDAS, which basically dictates how different parts of a query are resolved. It’s typically the same order of operations every time across database engines, no matter what the query actually ends up being.

The part we’re going to be annoyed at today is that where is resolved before aggregate and analytical functions that would generate things like calculated fields. An analogous feature in Excel would be those sum columns for subtotals, that would have to wait to drop out criteria that didn’t meet the where filters. And yeah, this makes sense, you don’t want to potentially have to do a bunch of calculations on records that didn’t even match – that could be computationally expensive calculations on millions of records.

But because of this, the resulting query when business logic needs to happen looks unbeauteous and difficult to parse.

For example, let’s say there some task like finding the recent legal credential issued for an entity. These credentials may be issued, expired or revoked, or reissued to the same entity. Even properly normalized data might have an issue where because the entity has multiple historical credentials, that entity will show up in intermediate selection results multiple times and then decisions have to be made about which is the most recent credential. Those decisions have to then become part of the criteria for filtering results, which means that where has to wait for an aggregate function like max() on a date field.

I got around this the SQL way which is to create a subquery with the initial results set of all the credentials as well as a calculated field for what the most recent credential issuance date was. Then, I could access this in an enveloping query and filter for only the credentials issued to entities where the issuance date matches the most recent issuance date for all the credentials associated with that same entity.

with subquery as (
    select
    entity_id,
    entity_name,
    entity_address,
    entity_credential_id,
    entity_credential_status
    entity_credential_issue_date,
    max(entity_credential_issue_date) over (partition by entity_id) as most_recent_credential_issue_date

    join ...
    where ...
)

select * from subquery
    where entity_credential_issue_date = most_recent_credential_issue_date

That’s so ugly and unsatisfying!!! I am revolted and would rather just tag a where with ‘wait for this window function to be done actually’ instead of having to replicate most of the same exact select with slightly different filter criteria. Especially since the clauses in a real world situation can end up already being fairly complex and I don’t like having to scroll up and down to load additional constraints on the final data set into my understanding of the query.

Resources:

Related Posts

One thought on “Want to hear something about query languages that will make you actually appreciate callback functions and promises in programming languages?

Comments are closed.