Querying Metadata – it’s like a dinosaur dig
Lately I have been doing some work on a tool that connects to an outside service. This tool does not have any documentation, nor does there appear to be any live humans with institutional knowledge. How is a developer supposed to gain insight into a tool like that?
Most database engines allow for users with the privilege to do so to query the system tables for information about the relationship between different database tools.
The way I approached this issue was starting from the field level.
First, the backstory: The ask came in from a user, who described a particular change that needed to be made to a certain piece of data. Helpfully, this piece of data is relatively unique in the database. Between referencing the form itself that the user would interface with and an out-of-date data dictionary, I was able to determine what the field that contains this data is called. In our example, we’ll call this field ‘crypticallyNamedField’.
Now, it is possible to query a database’s system for information about items that use a field. I like this answer by AdaTheDev on StackOverflow for its brevity:
SELECT COLUMN_NAME AS'ColumnName'
,TABLE_NAME AS'TableName'FROM INFORMATION_SCHEMA.COLUMNS
WHERE COLUMN_NAME LIKE'%MyName%'ORDERBY TableName
,ColumnName;
To suit our purposes we only need to change one thing:
SELECT COLUMN_NAME AS'ColumnName'
,TABLE_NAME AS'TableName'FROM INFORMATION_SCHEMA.COLUMNS
WHERE COLUMN_NAME LIKE'%crypticallyNamedField%'ORDERBY TableName
,ColumnName;
This will gather us a list of list of not just tables, but views, that contain a field with the name that we’re looking for.
In my case, I only got two tables back, likelyTable, and tableNoOneHasEverHeardOf.
I can gain insight into which table I think is more likely the one that the tool is using by looking at what depends on this table. If there are any stored procedures or reports that this tool uses, then they should come up in a query for what references this table. Use a query like this one by bjnr.
SELECT
referencing_schema_name, referencing_entity_name,
referencing_class_desc, is_caller_dependent
FROM sys.dm_sql_referencing_entities ('<TableName>', 'OBJECT')
And note that again we only need to change one thing:
SELECT
referencing_schema_name, referencing_entity_name,
referencing_class_desc, is_caller_dependent
FROM sys.dm_sql_referencing_entities ('<likelyTable>', 'OBJECT')
I got a few reports, and here we come to some Life Professional Skills. As a junior developer, it’s unlikely that you’re working in a complex legacy environment purely by yourself. This is a great amount of legwork to do to work with the data services or database administration team to identify what else your tool references.
In this case, I was able to provide the list of reports, with likely candidates for what the tool uses highlighted, to a senior developer and a database administrator. We explained the situation to the database admin and said that we weren’t quite sure what we were looking for, but we knew where the data was coming from, and had a close idea for where the data was going. The database administrator was able to use our clues to identify an SSIS package that references stored procedures and exports data that matches what the user describes. This saves time over trying to learn advanced database administration as well as gain institutional knowledge from crumbs.
References:
- Find all tables containing column – AdaTheDev
- Find all entities that refer to a given entity – bjnr
- Photo
Thank you for reading. Please let me know if there are any clarifications I can make or further questions I can answer either down in the comments, on LinkedIn, or hit me up on Mastodon.