What is the minimum valid JSON?
P

8

237

I've carefully read the JSON description at http://json.org/, but I'm still not sure of the answer to a simple question: what are the smallest strings that are valid JSON?

  • "string": is a lone string valid JSON?
  • 42: is a lone number valid JSON?
  • true: is a lone boolean value valid JSON?
  • {}: is the empty object valid JSON?
  • []: is the empty array valid JSON?
Phylis answered 24/8, 2013 at 14:8 Comment(8)
some JSON parsers expect an array or an object. They complain about just a number, or a string.Hibbard
As of now, those are validSapowith
Possible duplicate of Is this simple string considered valid JSON?Torques
short answer - {}Leeuwarden
I've just tried on jsonlint, and it now accepts all of these. It must have been a bug that it previously rejected the first 3.Borsch
Something else I've just discovered is that the Newtonsoft JSON library accepts an empty string and returns null, as if the JSON content was actually null. But an empty string is not actually a valid JSON text.Borsch
Just realised my last comment about jsonlint was inaccurate - at the time it was correct based on RFC 4627, but RFCs 7159 and 8259 have changed this, and jsonlint has been updated accordingly.Borsch
The most used "empty json" is {}Surfboarding
D
202

At the time of writing, JSON was solely described in RFC 4627. It describes (at the start of section 2) a JSON text as being a serialized object or array.

This means that only {} and [] are valid, complete JSON strings in parsers and stringifiers which adhere to that standard.

However, the introduction of ECMA-404 changes that, and the updated advice can be read here. I've also written a blog post on the issue.


To confuse the matter further however, the JSON object (e.g. JSON.parse() and JSON.stringify()) available in web browsers is standardised in ES5, and that clearly defines the acceptable JSON texts like so:

The JSON interchange format used in this specification is exactly that described by RFC 4627 with two exceptions:

  • The top level JSONText production of the ECMAScript JSON grammar may consist of any JSONValue rather than being restricted to being a JSONObject or a JSONArray as specified by RFC 4627.

  • snipped

This would mean that all JSON values (including strings, nulls and numbers) are accepted by the JSON object, even though the JSON object technically adheres to RFC 4627.

Note that you could therefore stringify a number in a conformant browser via JSON.stringify(5), and the result would be rejected by another parser that adheres to RFC 4627 but doesn't have the specific exception listed above. Ruby, for example, appears to accept only objects and arrays as the root. PHP, on the other hand, specifically adds the exception that "it will also encode and decode scalar types and NULL".
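The mismatch described above can be sketched in JavaScript. The helper name parseStrictRfc4627 is hypothetical and not part of any library; it only emulates a parser without the ES5 exception by rejecting any root value that is not an object or array.

```javascript
// Hypothetical helper: emulates a parser that follows RFC 4627 strictly
// by rejecting any top-level value that is not an object or an array.
function parseStrictRfc4627(text) {
  const value = JSON.parse(text); // ES5 rules: any JSON value is accepted
  if (value === null || typeof value !== "object") {
    throw new SyntaxError("RFC 4627: top-level value must be an object or array");
  }
  return value;
}

const scalar = JSON.stringify(5);  // the text "5", a lone JSON number
const viaEs5 = JSON.parse(scalar); // accepted by the ES5 JSON object

// The same text is refused by the strict emulation.
let strictRejected = false;
try {
  parseStrictRfc4627(scalar);
} catch (err) {
  strictRejected = err instanceof SyntaxError;
}

console.log(scalar, viaEs5, strictRejected);
```

A producer that needs to interoperate with such strict parsers can wrap the scalar in an array or object before stringifying it.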

Detonator answered 24/8, 2013 at 14:17 Comment(8)
Regarding "across browsers", there's a standard ECMAScript parser (see MDN page for JSON.parse) - but JS is only one of many many languages for which there are JSON parsers.Chesser
true, false, and null are as valid as {} or []Tittletattle
@Tittletattle Could you please clarify your comment? Are you saying true, false, or null alone is a valid JSON text? Could you please cite a source, as this contradicts most of the other answers/comments here?Blinkers
@LawrenceJohnston: Yes, that's my reading of the RFC. JSON text is a serializED object. The serialization of a string is "string", number is the number, null is null and a Boolean is true/false. The serialization of an object is {} and an array is []. I think that's a reasonable interpretation of either section 2 or 2.1, together I consider them conclusive -- apparently the browsers agree with me.Tittletattle
@jmoreno: Surely the quote from section 2, "A JSON text is a serialized object or array.", contradicts that? JSON Lint also does not think a non-array or object is valid. There is no debate over whether a string is a valid JSON literal; this is over whether a string by itself is valid.Detonator
@Detonator I've asked a follow-up question because it seems like ECMA-404 may have redefined things.Rebbecarebbecca
RFC 7159 obsoletes RFC 4627. It allows any JSON value, not just an object or array. This is similar to the ECMA-404 change.Culberson
@Culberson Indeed. It has since been superseded by RFC 8259.Borsch
H
54

There are at least four documents which can be considered JSON standards on the Internet. The RFCs referenced all describe the MIME type application/json. Here is what each has to say about top-level values, and whether anything other than an object or array is allowed at the top:

RFC-4627: No.

A JSON text is a sequence of tokens. The set of tokens includes six structural characters, strings, numbers, and three literal names.

A JSON text is a serialized object or array.

JSON-text = object / array

Note that RFC-4627 was marked "informational" as opposed to "proposed standard", and that it is obsoleted by RFC-7159, which in turn is obsoleted by RFC-8259.

RFC-8259: Yes.

A JSON text is a sequence of tokens. The set of tokens includes six structural characters, strings, numbers, and three literal names.

A JSON text is a serialized value. Note that certain previous specifications of JSON constrained a JSON text to be an object or an array. Implementations that generate only objects or arrays where a JSON text is called for will be interoperable in the sense that all implementations will accept these as conforming JSON texts.

JSON-text = ws value ws

RFC-8259 is dated December 2017 and is marked "INTERNET STANDARD".
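As a rough illustration of the JSON-text = ws value ws rule, the sketch below feeds several lone values, some padded with white space, to JSON.parse. It assumes a JavaScript engine; JSON.parse is specified by ECMA-262 rather than RFC 8259, but it accepts the same top-level grammar for these inputs.

```javascript
// Each entry is a complete JSON text of the form: ws value ws.
const texts = ['"string"', " 42 ", "\ttrue\n", "null", "{}", "[]"];
const parsed = texts.map((text) => JSON.parse(text));

// The empty string contains no value at all, so it is not a JSON text.
let emptyRejected = false;
try {
  JSON.parse("");
} catch (err) {
  emptyRejected = err instanceof SyntaxError;
}

console.log(parsed, emptyRejected);
```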

ECMA-262: Yes.

The JSON Syntactic Grammar defines a valid JSON text in terms of tokens defined by the JSON lexical grammar. The goal symbol of the grammar is JSONText.

Syntax

JSONText :
    JSONValue

JSONValue :
    JSONNullLiteral
    JSONBooleanLiteral
    JSONObject
    JSONArray
    JSONString
    JSONNumber

ECMA-404: Yes.

A JSON text is a sequence of tokens formed from Unicode code points that conforms to the JSON value grammar. The set of tokens includes six structural tokens, strings, numbers, and three literal name tokens.

Heliogravure answered 12/3, 2014 at 18:55 Comment(0)
C
10

According to the old definition in RFC 4627 (which was obsoleted in March 2014 by RFC 7159), those were all valid "JSON values", but only the last two would constitute a complete "JSON text":

A JSON text is a serialized object or array.

Depending on the parser used, the lone "JSON values" might be accepted anyway. For example (sticking to the "JSON value" vs "JSON text" terminology):

  • the JSON.parse() function now standardised in modern browsers accepts any "JSON value"
  • the PHP function json_decode was introduced in version 5.2.0 only accepting a whole "JSON text", but was amended to accept any "JSON value" in version 5.2.1
  • Python's json.loads accepts any "JSON value" according to examples on this manual page
  • the validator at http://jsonlint.com expects a full "JSON text"
  • the Ruby JSON module will only accept a full "JSON text" (at least according to the comments on this manual page)

The distinction is a bit like the distinction between an "XML document" and an "XML fragment", although technically <foo /> is a well-formed XML document (it would be better written as <?xml version="1.0" ?><foo />, but as pointed out in comments, the <?xml declaration is technically optional).

Chesser answered 24/8, 2013 at 14:18 Comment(4)
The XML comparison might be inappropriate, because an XML document is entirely valid without the optional XML declaration. See the XML recommendation at w3.org/TR/xml/#sec-well-formedBrinton
@Brinton Ah, yes, I'd forgotten that it's technically optional, although highly encouraged.Chesser
@Gunther: A nitpick: <foo /> is a well-formed XML document, but not a valid one. (But the same is true of <?xml version="1.0" ?><foo />.)Fortenberry
@Fortenberry Interestingly, the definition here implies XML can only be "valid" against a DTD, meaning that very few XML documents are, as DTDs are very rarely written and declared in practice (as compared to schema definition formats such as XSD or RelaxNG). I was checking, because if you could be valid against an external schema, without referencing it, then <foo /> may or may not be valid against a particular schema, but that's not what that standard states.Chesser
T
6

JSON stands for JavaScript Object Notation. Only {} and [] define a JavaScript object. The other examples are value literals. There are object types in JavaScript for working with those values, but the expression "string" is a source code representation of a literal value and not an object.

Keep in mind that JSON is not JavaScript. It is a notation that represents data, with a very simple and limited structure built from the characters { } [ ] , and :. You can only use literal values inside that structure.

It is perfectly valid for a server to respond with either an object description or a literal value. All JSON parsers should be able to handle just a literal value, but only one value. JSON can only represent a single top-level value at a time. So for a server to return more than one value it would have to structure it as an object or an array.
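The "only one value" point can be checked with a small sketch. The helper isSingleJsonValue is a hypothetical name introduced here for illustration; it simply reports whether JSON.parse accepts the text.

```javascript
// Hypothetical helper: true when the text parses as exactly one JSON value.
function isSingleJsonValue(text) {
  try {
    JSON.parse(text);
    return true;
  } catch (err) {
    return false;
  }
}

const results = {
  oneLiteral: isSingleJsonValue("42"),                      // a single value
  twoLiterals: isSingleJsonValue("42 43"),                  // two values side by side
  wrappedInArray: isSingleJsonValue("[42, 43]"),            // one array holding both
  wrappedInObject: isSingleJsonValue('{"a": 42, "b": 43}'), // one object holding both
};

console.log(results);
```

Only the text with two bare values next to each other is refused; wrapping them in an array or object turns them into a single top-level value.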

Themistocles answered 24/8, 2013 at 14:27 Comment(2)
I think approaching the answer from this direction muddies more than it clarifies: the origin of the name has no bearing on the details of the standard, and the types available in JavaScript may be an inspiration for the types in JSON, but there is no requirement that they match. The introduction on json.org makes this clear: "JSON is a text format that is completely language independent"Chesser
@Chesser I totally agree. I mixed Javascript types with JSON and that is not correct. I'll update my answer.Themistocles
S
5

The ECMA specification might be useful for reference:

http://www.ecma-international.org/ecma-262/5.1/

The parse function parses a JSON text (a JSON-formatted String) and produces an ECMAScript value. The JSON format is a restricted form of ECMAScript literal. JSON objects are realized as ECMAScript objects. JSON arrays are realized as ECMAScript arrays. JSON strings, numbers, booleans, and null are realized as ECMAScript Strings, Numbers, Booleans, and null. JSON uses a more limited set of white space characters than WhiteSpace and allows Unicode code points U+2028 and U+2029 to directly appear in JSONString literals without using an escape sequence. The process of parsing is similar to 11.1.4 and 11.1.5 as constrained by the JSON grammar.

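JSON.parse("string"); // throws SyntaxError: the bare word string is not a JSON value
JSON.parse("\"string\""); // "string" (a JSON string needs its own double quotes)
JSON.parse(43); // 43 (the number is first converted to the text "43")
JSON.parse("43"); // 43
JSON.parse(true); // true (converted to the text "true")
JSON.parse("true"); // true
JSON.parse(false); // false (converted to the text "false")
JSON.parse("false"); // false
JSON.parse("trueee"); // throws SyntaxError: unexpected characters after true
JSON.parse("{}"); // {}
JSON.parse("[]"); // []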
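The two syntax points in the quoted passage (the restricted set of white space characters, and unescaped U+2028 inside strings) can also be checked directly. This is a minimal sketch, assuming an engine whose JSON.parse follows the ES5.1 text quoted above; the variable names are illustrative only.

```javascript
// U+2028 (LINE SEPARATOR) may appear unescaped inside a JSON string.
const lineSeparator = JSON.parse('"\u2028"');

// Tab, line feed, carriage return, and space are JSON white space.
const padded = JSON.parse("\t\n\r 1");

// U+00A0 (NO-BREAK SPACE) is white space in ECMAScript source text,
// but it is not JSON white space, so parsing fails.
let nbspRejected = false;
try {
  JSON.parse("\u00A0 1");
} catch (err) {
  nbspRejected = err instanceof SyntaxError;
}

console.log(lineSeparator.length, padded, nbspRejected);
```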
Sherrill answered 24/8, 2013 at 14:19 Comment(3)
While a useful reference, that is the specification of a particular JSON parser (the one defined in the ECMAScript standard) not for the format itself. json.org explicitly states that JSON is "completely language independent", so there is no one correct parser.Chesser
JavaScript/ECMAScript is the inspiration for JSON, and a user of it, but not the "home" of it. JSON was derived from the object literal notation in (all earlier versions of) ECMAScript, but is not identical to it. The JSON.parse function was then added to later versions of the ECMAScript standard based on Crockford's grammar and the RFC.Chesser
You should do JSON.parse("\"string\"");Simonesimoneau
S
3

Yes, yes, yes, yes, and yes. All of them are valid JSON value literals.

However, the official RFC 4627 states:

A JSON text is a serialized object or array.

So a whole "file" should consist of an object or array as the outermost structure, which of course can be empty. Yet, many JSON parsers accept primitive values as well for input.

Steinway answered 24/8, 2013 at 14:18 Comment(0)
A
0

Starting from PHP 8.3, there's a new function called json_validate(), which checks whether a string contains valid JSON.

Usage is straightforward: pass the JSON string to json_validate(), and it returns true if the string is syntactically valid JSON and false otherwise.

More: https://www.php.net/manual/en/function.json-validate.php

Aruba answered 27/12, 2023 at 12:59 Comment(0)
A
-1

Just follow the railroad diagrams given on the json.org page. [] and {} are the minimum possible valid JSON objects. So the answer is [] and {}.

Adhamh answered 24/8, 2013 at 16:26 Comment(4)
It's not a FSM, it's a grammar. And it doesn't seem to indicate which production is the start rule. If the start rules were array and object you would be right, but it's reasonable to expect value to be the start.Burchell
Looks fairly straightforward to me though. Douglas Crockford calls them that and we always start from left and follow the tracks to the right. The smallest track gives the minimal valid JSON.Adhamh
It's not your interpretation of any particular grammar rule I'm objecting to, it's that you chose two rules and assume one can only start from those, not from others. If you look at the value rule instead of (or in addition to) the array and object rules, then standalone numbers and strings are a valid JSON document.Burchell
-1. Firstly, as @delnan points out, nothing in the diagrams at json.org suggests that a full JSON text must be an object or array; you've picked those two arbitrarily, not based upon anything on json.org. Secondly, nitpicking over terminology: [], while a valid JSON text under every spec that's ever had an opinion on the matter, is not a "valid JSON object", since it's not a JSON object. "Object" in JSON specifically refers to the {} notation; JSON arrays are not JSON objects.Edvard
