Skip to content
Advertisement

How do I find out which SQL queries utilize the tables in a certain database programmatically?

Let’s say that I have a table called queries, which holds hundreds of predefined SQL queries in JSON format.

Then let’s say that I have two different databases, the original being a 200+ table DB and the new one being a 50~ table DB. This new DB is a subset of the original, so every table it has exists in the original one too.

What I’m trying to achieve is to find out which queries in the queries table does not require access to the original DB tables. So a query must only do SELECTs to the tables that exist on the new DB.

This would take hours, maybe days for me to figure out manually so I want to do this programmatically but I don’t know where to start. Any ideas?

Advertisement

Answer

Assuming your table queries contains query column,

You can count how many times each table appears in all your queries from queries:

select *
from (
    select
        table_schema || '.' || table_name as table,
        (
            select count(*)
            from queries
            where query like '%' || table_schema || '.' || table_name || '%'
        ) as cnt
    from information_schema.tables
) tmp
where cnt > 0

Important Note: The above query counts only apparitions of <schema>.<table> form. There may also appear other forms, like: <schema> . <table> (with spaces), <table> (without schema), or even function names.

If function names are present, than you should search, for each function occurrence, what other tables and functions depends on it. So, it would be a recursive search.

However, there may also be cases in functions that a name of a table is commented out (so, the function does not actually use it).

This is not a perfect solution, but it may help you to discover a part of the dependencies you need.

User contributions licensed under: CC BY-SA
1 People found this is helpful
Advertisement