Multi-statement Table Valued Function vs Inline Table Valued Function
Asked Answered
B

9

213

A few examples to show, just incase:

Inline Table Valued

CREATE FUNCTION MyNS.GetUnshippedOrders()
RETURNS TABLE
AS 
RETURN SELECT a.SaleId, a.CustomerID, b.Qty
    FROM Sales.Sales a INNER JOIN Sales.SaleDetail b
        ON a.SaleId = b.SaleId
        INNER JOIN Production.Product c ON b.ProductID = c.ProductID
    WHERE a.ShipDate IS NULL
GO

Multi Statement Table Valued

CREATE FUNCTION MyNS.GetLastShipped(@CustomerID INT)
RETURNS @CustomerOrder TABLE
(SaleOrderID    INT         NOT NULL,
CustomerID      INT         NOT NULL,
OrderDate       DATETIME    NOT NULL,
OrderQty        INT         NOT NULL)
AS
BEGIN
    DECLARE @MaxDate DATETIME

    SELECT @MaxDate = MAX(OrderDate)
    FROM Sales.SalesOrderHeader
    WHERE CustomerID = @CustomerID

    INSERT @CustomerOrder
    SELECT a.SalesOrderID, a.CustomerID, a.OrderDate, b.OrderQty
    FROM Sales.SalesOrderHeader a INNER JOIN Sales.SalesOrderHeader b
        ON a.SalesOrderID = b.SalesOrderID
        INNER JOIN Production.Product c ON b.ProductID = c.ProductID
    WHERE a.OrderDate = @MaxDate
        AND a.CustomerID = @CustomerID
    RETURN
END
GO

Is there an advantage to using one type (in-line or multi statement) over the other? Is there certain scenarios when one is better than the other or are the differences purely syntactical? I realise the two example queries are doing different things but is there a reason I would write them in that way?

Reading about them and the advantages/differences haven't really been explained.

Brooklet answered 31/3, 2010 at 15:41 Comment(2)
Also one of the huge benefit of the inline function is that you can select ROWID(TIMESTAMP) columns, while you cannot insert TIMESTAMP data to the return table in multistatement function!Tonl
Thanks for an excellent thread. I've learnt a lot. However, one thing to keep in mind is when ALTERing a function that was ITV to MSTV, the profiler thinks you are altering an ITV. No matter what you do to get the syntax right from an MSTV point of view, the recompile always fails, usually around the first statement after BEGIN. The only way around this was to DROP the old function and CREATE the new one as an MSTV.Sniff
A
159

In researching Matt's comment, I have revised my original statement. He is correct, there will be a difference in performance between an inline table valued function (ITVF) and a multi-statement table valued function (MSTVF) even if they both simply execute a SELECT statement. SQL Server will treat an ITVF somewhat like a VIEW in that it will calculate an execution plan using the latest statistics on the tables in question. A MSTVF is equivalent to stuffing the entire contents of your SELECT statement into a table variable and then joining to that. Thus, the compiler cannot use any table statistics on the tables in the MSTVF. So, all things being equal, (which they rarely are), the ITVF will perform better than the MSTVF. In my tests, the performance difference in completion time was negligible however from a statistics standpoint, it was noticeable.

In your case, the two functions are not functionally equivalent. The MSTV function does an extra query each time it is called and, most importantly, filters on the customer id. In a large query, the optimizer would not be able to take advantage of other types of joins as it would need to call the function for each customerId passed. However, if you re-wrote your MSTV function like so:

CREATE FUNCTION MyNS.GetLastShipped()
RETURNS @CustomerOrder TABLE
    (
    SaleOrderID    INT         NOT NULL,
    CustomerID      INT         NOT NULL,
    OrderDate       DATETIME    NOT NULL,
    OrderQty        INT         NOT NULL
    )
AS
BEGIN
    INSERT @CustomerOrder
    SELECT a.SalesOrderID, a.CustomerID, a.OrderDate, b.OrderQty
    FROM Sales.SalesOrderHeader a 
        INNER JOIN Sales.SalesOrderHeader b
            ON a.SalesOrderID = b.SalesOrderID
        INNER JOIN Production.Product c 
            ON b.ProductID = c.ProductID
    WHERE a.OrderDate = (
                        Select Max(SH1.OrderDate)
                        FROM Sales.SalesOrderHeader As SH1
                        WHERE SH1.CustomerID = A.CustomerId
                        )
    RETURN
END
GO

In a query, the optimizer would be able to call that function once and build a better execution plan but it still would not be better than an equivalent, non-parameterized ITVS or a VIEW.

ITVFs should be preferred over a MSTVFs when feasible because the datatypes, nullability and collation from the columns in the table whereas you declare those properties in a multi-statement table valued function and, importantly, you will get better execution plans from the ITVF. In my experience, I have not found many circumstances where an ITVF was a better option than a VIEW but mileage may vary.

Thanks to Matt.

Addition

Since I saw this come up recently, here is an excellent analysis done by Wayne Sheffield comparing the performance difference between Inline Table Valued functions and Multi-Statement functions.

His original blog post.

Copy on SQL Server Central

Antiphrasis answered 31/3, 2010 at 16:10 Comment(4)
This simply isn't true - Multi-statement functions are very very often a huge performance hit because they stop the query optimiser from using statistics. If I had $1 for every time I've seen multi-statement function use cause a very poor choice of execution plan (mostly because it usually estimates the returned row count as 1), I'd have enough to buy a small car.Kist
The very best explanation I've ever found is in the first answer, and the related post: #4109652 Don't miss teh related document, you can read it quickly, and it's extremely interesting.Inapprehensible
Will there be an update to this answer for SQL Server 2017?: youtube.com/watch?time_continue=2&v=szTmo6rTUjMBarb
"but it still would not be better than an equivalent, non-parameterized ITVS" - does this mean the inclusion of a parameter changes the way it runs?Gomphosis
P
30

Internally, SQL Server treats an inline table valued function much like it would a view and treats a multi-statement table valued function similar to how it would a stored procedure.

When an inline table-valued function is used as part of an outer query, the query processor expands the UDF definition and generates an execution plan that accesses the underlying objects, using the indexes on these objects.

For a multi-statement table valued function, an execution plan is created for the function itself and stored in the execution plan cache (once the function has been executed the first time). If multi-statement table valued functions are used as part of larger queries then the optimiser does not know what the function returns, and so makes some standard assumptions - in effect it assumes that the function will return a single row, and that the returns of the function will be accessed by using a table scan against a table with a single row.

Where multi-statement table valued functions can perform poorly is when they return a large number of rows and are joined against in outer queries. The performance issues are primarily down to the fact that the optimiser will produce a plan assuming that a single row is returned, which will not necessarily be the most appropriate plan.

As a general rule of thumb we have found that where possible inline table valued functions should be used in preference to multi-statement ones (when the UDF will be used as part of an outer query) due to these potential performance issues.

Pruter answered 16/4, 2010 at 19:20 Comment(3)
Although it may treat multi-statement table valued functions similar to a stored procedure, a functionally identical stored procedure is a lot faster than a table valued function for large datasets. I'm sticking with stored procs over multi-statement table valued functions.Clustered
Unless you need to join those results in another query.Sphenic
why not use both? A stored proc that returns the result of a multi-statement table-valued function. Best of both worlds.Jingo
V
13

There is another difference. An inline table-valued function can be inserted into, updated, and deleted from - just like a view. Similar restrictions apply - can't update functions using aggregates, can't update calculated columns, and so on.

Veloz answered 25/7, 2011 at 9:23 Comment(0)
S
3

Your examples, I think, answer the question very well. The first function can be done as a single select, and is a good reason to use the inline style. The second could probably be done as a single statement (using a sub-query to get the max date), but some coders may find it easier to read or more natural to do it in multiple statements as you have done. Some functions just plain can't get done in one statement, and so require the multi-statement version.

I suggest using the simplest (inline) whenever possible, and using multi-statements when necessary (obviously) or when personal preference/readability makes it wirth the extra typing.

Sophocles answered 31/3, 2010 at 15:47 Comment(3)
Thanks for the answer. So basically, the multi-statement is only really to be used when the function is more complicated than is feasible to do in a inline function, for the sake of readability? Are there any performance benefits at all to multi-statement?Brooklet
I don't know, but I wouldn't think so. It is probably better to let sql server figure out the optimizations that you might try to make manually (by using variables, temp tables, or whatever). Though you could certainly do some performance testing to prove/disprove this in specific cases.Sophocles
Many thanks again. I may look further into this when I have more time! :)Brooklet
L
1

Another case to use a multi line function would be to circumvent sql server from pushing down the where clause.

For example, I have a table with a table names and some table names are formatted like C05_2019 and C12_2018 and and all tables formatted that way have the same schema. I wanted to merge all that data into one table and parse out 05 and 12 to a CompNo column and 2018,2019 into a year column. However, there are other tables like ACA_StupidTable which I cannot extract CompNo and CompYr and would get a conversion error if I tried. So, my query was in two part, an inner query that returned only tables formatted like 'C_______' then the outer query did a sub-string and int conversion. ie Cast(Substring(2, 2) as int) as CompNo. All looks good except that sql server decided to put my Cast function before the results were filtered and so I get a mind scrambling conversion error. A multi statement table function may prevent that from happening, since it is basically a "new" table.

Legislate answered 5/4, 2019 at 15:29 Comment(0)
R
0

look at Comparing Inline and Multi-Statement Table-Valued Functions you can find good descriptions and performance benchmarks

Remonstrant answered 17/2, 2015 at 9:17 Comment(0)
L
0

I have not tested this, but a multi statement function caches the result set. There may be cases where there is too much going on for the optimizer to inline the function. For example suppose you have a function that returns a result from different databases depending on what you pass as a "Company Number". Normally, you could create a view with a union all then filter by company number but I found that sometimes sql server pulls back the entire union and is not smart enough to call the one select. A table function can have logic to choose the source.

Legislate answered 5/4, 2019 at 15:19 Comment(0)
H
0

Maybe in a very condensed way. ITVF ( inline TVF) : more if u are DB person, is kind of parameterized view, take a single SELECT st

MTVF ( Multi-statement TVF): Developer, creates and load a table variable.

Handicraft answered 18/2, 2020 at 10:1 Comment(0)
D
-2

if you are going to do a query you can join in your Inline Table Valued function like:

SELECT
    a.*,b.*
    FROM AAAA a
        INNER JOIN MyNS.GetUnshippedOrders() b ON a.z=b.z

it will incur little overhead and run fine.

if you try to use your the Multi Statement Table Valued in a similar query, you will have performance issues:

SELECT
    x.a,x.b,x.c,(SELECT OrderQty FROM MyNS.GetLastShipped(x.CustomerID)) AS Qty
    FROM xxxx   x

because you will execute the function 1 time for each row returned, as the result set gets large, it will run slower and slower.

Dmz answered 31/3, 2010 at 15:48 Comment(3)
Ah, so you would say that the inline is much better in terms of performance?Brooklet
No, they both return a table, which makes your second SQL invalid as you are trying to put a table in a column.Kremlin
@ck, I've updated the query yo commented on. the parameters of the function used in the second function lend it to be used as a sub query, which will result in worse performance.Dmz

© 2022 - 2024 — McMap. All rights reserved.