SqlDataReader and SQL Server 2016 FOR JSON splits json in chunks of 2k bytes

Asked 9/8, 2017 at 16:8 Answered 27/7, 2023 at 8:45

Solved c# sql-server t-sql sql-server-2016 azure-sql-database

Recently I played around with the new FOR JSON AUTO feature of Azure SQL Database.

When I select a lot of records, for example with this query:

Select
    Wiki.WikiId
    , Wiki.WikiText
    , Wiki.Title
    , Wiki.CreatedOn
    , Tags.TagId
    , Tags.TagText
    , Tags.CreatedOn
From
    Wiki
Left Join
    (WikiTag
Inner Join 
    Tag as Tags on WikiTag.TagId = Tags.TagId) on Wiki.WikiId = WikiTag.WikiId
For Json Auto

and then do a select with the C# SqlDataReader:

var connectionString = ""; // connection string
var sql = "";  // query from above
var chunks = new List<string>();

using (var connection = new SqlConnection(connectionString)) 
using (var command = connection.CreateCommand()) {
    command.CommandText = sql;
    connection.Open();

    using (var reader = command.ExecuteReader()) {
        while (reader.Read()) {
            chunks.Add(reader.GetString(0)); // Each row holds a chunk of ~2K bytes
        }
    }
}

var json = string.Concat(chunks);

I get a lot of chunks of data.

Why do we have this limitation? Why don't we get everything in one big chunk?

When I read an nvarchar(max) column, I get everything in one chunk.

Thanks for an explanation.

Gulledge answered 9/8, 2017 at 16:8 Comment(0)

From Format Query Results as JSON with FOR JSON:

Output of the FOR JSON clause

The result set contains a single column.

A small result set may contain a single row.

A large result set splits the long JSON string across multiple rows. By default, SQL Server Management Studio (SSMS) concatenates the results into a single row when the output setting is Results to Grid. The SSMS status bar displays the actual row count.

Other client applications may require code to recombine lengthy results into a single, valid JSON string by concatenating the contents of multiple rows. For an example of this code in a C# application, see Use FOR JSON output in a C# client app.
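The recombination step the documentation describes can be sketched roughly as follows. This is a minimal sketch, not the code from the linked article: `ForJsonReader` and `ReadAll` are hypothetical names, the helper is written against `IDataReader` so it accepts a `SqlDataReader`, and returning "[]" when no rows come back is an assumption about what the caller wants.

```csharp
using System.Data;
using System.Text;

public static class ForJsonReader
{
    // Concatenates every row of a single-column FOR JSON result set
    // into one string. Each row is expected to hold one chunk of the JSON.
    public static string ReadAll(IDataReader reader)
    {
        var json = new StringBuilder();

        while (reader.Read())
        {
            json.Append(reader.GetString(0));
        }

        // Assumption: an empty result set is reported as an empty JSON array.
        return json.Length == 0 ? "[]" : json.ToString();
    }
}
```

With the code from the question, this would be called as `var json = ForJsonReader.ReadAll(command.ExecuteReader());`. A `StringBuilder` avoids holding a separate list of chunks before concatenating.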

I would say it is strictly for performance reasons, similar to XML. For more, see "SELECT FOR XML AUTO and return datatypes" and "What does server side FOR XML return?":

In SQL Server 2000 the server side XML publishing - FOR XML (see http://msdn2.microsoft.com/en-us/library/ms178107(SQL.90).aspx) - was implemented in the layer of code between the query processor and the data transport layer. Without FOR XML a SELECT query is executed by the query processor and the resulting rowset is sent to the client side by the server side TDS code. When a SELECT statement contains FOR XML the query processor produces the result the same way as without FOR XML and then FOR XML code formats the rowset as XML. For maximum XML publishing performance FOR XML does streaming XML formatting of the resulting rowset and directly sends its output to the server side TDS code in small chunks without buffering whole XML in the server space. The chunk size is 2033 UCS-2 characters. Thus, XML larger than 2033 UCS-2 characters is sent to the client side in multiple rows each containing a chunk of the XML. SQL Server uses a predefined column name for this rowset with one column of type NTEXT - “XML_F52E2B61-18A1-11d1-B105-00805F49916B” - to indicate chunked XML rowset in UTF-16 encoding. This requires special handling of the XML chunk rowset by the APIs to expose it as a single XML instance on the client side. In ADO.Net, one needs to use ExecuteXmlReader, and in ADO/OLEDB one should use the ICommandStream interface.

Mainsheet answered 9/8, 2017 at 16:28 Comment(0)

As a workaround in the SQL code (i.e. if you don't want to change your querying code to put the chunks together), I found that wrapping the query in a CTE and then selecting from that gives me the results I expected:

--Note that I query from information_schema to just get a lot of data to replicate the problem.

--doing this query results in multiple rows (chunks) returned
SELECT * FROM information_schema.columns FOR JSON PATH, include_null_values

--doing this query results in a single row returned
;WITH SomeCTE(JsonDataColumn) AS
(
    SELECT * FROM information_schema.columns FOR JSON PATH, INCLUDE_NULL_VALUES
) 
SELECT JsonDataColumn FROM SomeCTE

The first query reproduces the problem for me (it returns multiple rows, each a chunk of the total data); the second query gives one row with all the data. SSMS wasn't good for reproducing the issue, so you have to try it out with other client code.
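If the query text is built in client code, the same CTE wrapper can be applied there. A minimal sketch, assuming the inner query is a single SELECT ... FOR JSON statement with no trailing semicolon and no WITH clause of its own; `JsonQuery`, `WrapInCte`, `JsonCte` and `JsonData` are hypothetical names chosen for illustration.

```csharp
using System;

public static class JsonQuery
{
    // Builds a statement of the same shape as the CTE workaround above:
    // the FOR JSON query becomes the body of a one-column CTE, and the
    // outer SELECT reads that column.
    public static string WrapInCte(string forJsonQuery)
    {
        if (string.IsNullOrWhiteSpace(forJsonQuery))
        {
            throw new ArgumentException("Query must not be empty.", nameof(forJsonQuery));
        }

        return ";WITH JsonCte(JsonData) AS (" + Environment.NewLine
             + forJsonQuery.Trim() + Environment.NewLine
             + ") SELECT JsonData FROM JsonCte";
    }
}
```

The result would be assigned to `command.CommandText`. If the behaviour matches what is reported above, a single `reader.Read()` followed by `reader.GetString(0)` should then return the whole document.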

Enlarge answered 1/9, 2020 at 18:48 Comment(0)

I don't know of a setting that prevents FOR JSON from splitting a long string, but it can be solved with a subselect. Just wrap your select in another one.

SELECT (SELECT Wiki.WikiId ....)

Lacilacie answered 27/7, 2023 at 8:45 Comment(0)
