This series of posts attempt to look at the pros and cons of using Stored Procedures as data access method for Tableau reports. The points made here are valid as of Tableau 8.2.
Part 1 gives a general introduction and identifies areas where some of SProcs advantages become irrelevant when it comes to Tableau
Part 2 looks at metadata considerations
Part 3 looks at Execution Plan considerations
Part 4 looks at how Tableau uses stored procedures to answer different queries
Stored Procedures are an important part of typical database development. In a typical web application or desktop application, the application interacts with the database through of stored procedures. It is recommended to use stored procedures rather embedding direct DML or SELECT sql statements in the application code. For a database developer involved in Business Intelligence development, how much of a benefit do Stored Procedures offer in developing dashboards with a tool like Tableau (or any typical self-service BI tool) ?
In general, Stored Procedures offer the following advantages among others:
- Ability to return different kinds of data/information to the application
- Multiple Result Sets
- Output Parameters
- Error Messages
- Return Codes
- Additional Pre/Post processing (such as enforcing business rules, having complex logic, loops, calls to other procs etc)
- IF Else branching (points 2 and 3 could be mimicked in the app logic also, but the database also has the ability if needed)
- Portability to an extent
- Security – no direct access to data tables to end users
- Prepared execution plan
- Use a series of simpler SQL statements instead of one big monolithic sql statement
- Simpler statements == better and more accurate query plans
A typical non-BI application can make use of most of the above advantages. If I am coding a typical .NET application, I can have error handling code in my app; or I can take advantage of Multiple Result Sets and minimized the number of round trips to the database; or I can do some transformations on the input data; or give different results based on different inputs.
In Tableau (and to an extent in a typical BI application), unfortunately, most of these advantages become a moot point. Currently (This link here notes the various limitations on using Stored Procedures);
- Tableau, does not handle multiple result sets;
- Stored procedures with an output parameter are not even shown for selection in the Data Source
- Error messages and Return codes are not useful since we do not code the flow of the application
- While If Else branches can be present in, different branches cannot return different metadata output.
- Code portability can sometimes be subjective and in light of how typically the underlying tables or views are exposed directly to the user, I will not attempt to argue in favor or against of points 4, 5 and 6 above.
- I will try to address the remaining points about Additional Pre/Post processing, Prepared execution plans and breaking into simpler t-sql statements in Part 2 and Part 3.