Why doesn't Node.js have a native DOM?
C

13

60

When I discovered that Node.js was built using the V8 JavaScript engine, I thought:

Great, web scraping will be easier as the page will be rendered like in the browser, with a "native" DOM supporting XPath and any AJAX calls on the page executed.

  1. Why doesn't it have a native DOM when it uses the same JavaScript engine as Chrome?
  2. Why doesn't it have a mode to run JavaScript in retrieved pages?
  3. What am I not understanding about JavaScript engines vs the engine in a web browser?

Many thanks!

Cicatrize answered 11/7, 2011 at 22:3 Comment(3)
Try phantomjs.org. - Dorkus
Because node isn't a browser. - Mortise
The fact that Node.js is not a browser is correct, but that's not the reason. The DOM is not a browser thing; it's an API for working with XML/HTML/SGML-like documents, that's all. The reason Node.js doesn't have it is simply that its primary scope was "backend" services, and DOM parsing evidently isn't an API its maintainers considered essential there. Also, it has zilch to do with JavaScript: the DOM API is specified with WebIDL and can, and arguably would best, be implemented as a native module for Node.js. - Hibernal
D
70

The DOM is the DOM, and the JavaScript implementation is simply a separate entity. The DOM represents a set of facilities that a web browser exposes to the JavaScript environment. There's no requirement however that any particular JavaScript runtime will have any facilities exposed via the global object.

Node.js is a stand-alone JavaScript environment, completely independent of any web browser. There's no intrinsic link between web browsers and JavaScript; the DOM is not part of the JavaScript language or its specification.

I use the old Rhino Java-based JavaScript implementation in my Java-based web server. That environment also has nothing at all to do with any DOM. It's my own application that's responsible for populating the global object with facilities to do what I need it to be able to do, and it's not a DOM.
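
The idea that the embedding application, not the engine, decides what the global object contains can be sketched with Node's built-in vm module. This is only an illustration: the `log` facility and the `messages` array are hypothetical names chosen for the example, not part of any real host.

```javascript
const vm = require('vm');

// The embedder decides what a script can see. Here the "host" exposes a
// single hypothetical facility, `log`, and nothing else.
const messages = [];
const sandbox = { log: (msg) => messages.push(String(msg)) };
vm.createContext(sandbox);

// The script can call the host facility, but `document` was never provided,
// so it simply does not exist in this environment.
vm.runInContext("log('document is ' + typeof document)", sandbox);

console.log(messages);
```

A browser does the same thing on a much larger scale: it populates the global object with `document`, `window`, and the rest of the DOM before running page scripts.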

Note that there are projects like jsdom if you want a virtual DOM in your Node project. Because of its very nature as a server-side platform, a DOM is a facility that Node can do without and still make perfect sense for a wide variety of server applications. That's not to say that a DOM might not be useful to some people, but it's just not in the same category of services as things like process control, I/O, networking, database interop, and so on.

There may be some "official" answer to the question "why?" out there, but it's basically just the business of those who maintain Node (the Node Foundation now). If some intrepid developer out there decides that Node should ship by default with a set of modules to support a virtual DOM, and successfully works and works and makes that happen, then Node will have a DOM.

Decathlon answered 11/7, 2011 at 22:4 Comment(12)
There's also no "window" object, fwiw. - Disposal
@Disposal right - it's the "global" object, which in browsers happens to be exposed via the symbol "window". - Decathlon
This doesn't address why "doesn't it have a mode to run JS in retrieved pages?". C# and Java can be run independently of a web browser, but they both have DOM functionality in their standard library. It is often used for this purpose. - Brame
Questions of the form, "why doesn't x do y?" are generally pretty hard to answer. - Decathlon
Coming from a PHP background, I'm shocked to learn that NodeJs has no native DOM support. Sure, a server environment doesn't need DOM as much as a browser, but that doesn't mean native DOM support isn't a valuable tool in any server environment. It can be used for (1) templating, (2) site scraping, (3) parsing data from third party webservices, etc. Especially the last feature seems essential to me for a modern HTTP server. - Harshman
@JohnSlegers I don't deny anything you wrote, but in the 36 years I've been writing software I've never wanted server-side DOM support, just as a data point. - Decathlon
@Decathlon : So you never had any use case that required you to parse HTML or XML server-side in 36 years? Lucky you, I guess ;-) - Harshman
@JohnSlegers yes I consider that lucky :) :) Actually I've had to deal with XML, but I wrote an XML parser generator for that because most general-purpose libraries at the time were abominably slow. - Decathlon
@Decathlon : In PHP, I've been using an older version of github.com/Masterminds/html5-php for parsing XML and HTML as the parsing layer of github.com/PHPPowertools/DOM-Query. The best option I've found for NodeJS so far is github.com/jindw/xmldom. Are you familiar with that library? - Harshman
Nope, I'm not a Node user beyond trivial CLI hacks. - Decathlon
It's fair to say this isn't restricted to Node.js and V8; it applies to JS engine implementations in general. The ECMAScript and DOM specifications are totally different things; a browser just happens to implement and expose both. "The DOM is not, however, typically provided by the JavaScript engine but instead by a browser." (V8 Intro) - Charged
I think what some of the answers might be missing is that while web browsers typically provide the DOM functionality, there's nothing that says that a DOM library has to be intrinsically tied to an actual web browser being displayed on a screen. And just because the ECMAScript specification doesn't say anything about the DOM doesn't mean that a DOM library couldn't be part of the standard library shipped with node. And a node DOM library isn't any more "virtual" than the one shipped with your web browser. - Romanaromanas
L
25

P.S.: When reading this question I also wondered whether V8 (which Node.js is built on) had a DOM.

Why when it uses the same JS engine as Chrome doesn't it have a native DOM?

But I searched Google and found Google's V8 page, which says the following:

JavaScript is most commonly used for client-side scripting in a browser, being used to manipulate Document Object Model (DOM) objects for example. The DOM is not, however, typically provided by the JavaScript engine but instead by a browser. The same is true of V8—Google Chrome provides the DOM. V8 does however provide all the data types, operators, objects and functions specified in the ECMA standard.

node.js uses V8 and not Google Chrome.
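
A rough way to see this layering from inside Node is to evaluate the same probe in a fresh context and in Node's normal global scope, using the built-in vm module. This is a sketch rather than a precise picture of V8's internals; the `probe`, `bareContext`, and `nodeScope` names are made up for the example.

```javascript
const vm = require('vm');

// Source evaluated in both places. It returns a JSON string so the result
// crosses the context boundary as a plain primitive.
const probe = `JSON.stringify({
  Array: typeof Array,
  JSON: typeof JSON,
  document: typeof document,
  process: typeof process,
})`;

// A fresh context: the ECMAScript built-ins, without Node's additions.
const bareContext = JSON.parse(vm.runInNewContext(probe));

// The normal Node.js global scope: the same engine plus Node's own globals.
const nodeScope = JSON.parse(vm.runInThisContext(probe));

console.log({ bareContext, nodeScope });
```

Both scopes report `Array` and `JSON`, neither reports `document`, and only the Node scope reports `process`: the language built-ins come from the engine, and everything else comes from whoever embeds it.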

Likewise, why doesn't it have a mode to run JS in retrieved pages?

I also don't think we need it that badly. Ryan Dahl created node.js single-handedly. Maybe he (or his team) will develop this now, but I was already amazed by the amount of code he produced. He wanted to make an easy, efficient non-blocking library, and I think he did a mighty good job of it.

But then again, another developer created a module which is pretty good and actively developed (today) at https://github.com/tmpvar/jsdom.

What am I not understanding about Javascript engines vs the engine in a web browser? :)

Those are different things as is hopefully clear from the quote above.

Lancewood answered 11/7, 2011 at 22:48 Comment(2)
In that case, which "engine in a web browser" is used by Google Chrome? - Conchiferous
@AndersonGreen, as of now, Blink. - Brame
S
11

The Document Object Model (DOM for short) is a programming interface for HTML and XML documents. It represents the page so that programs can change the document's structure, style, and content.


It helps to distinguish the client side (browser) from the server side (Node.js) by their main goals:

  • Client-side: accessing and displaying information from the web
  • Server-side: providing stable and reliable ways to deliver web information

Why is there no DOM in Node.js by default?

By default, Node.js has neither access to nor any knowledge of the actual DOM in your browser. Node.js just delivers the data that your browser uses to process and render the whole website, the DOM included. That is the intended way.

Why wouldn't you want to access the DOM in Node.js?

Accessing your browser's actual DOM from Node.js is simply outside the server's purpose. Your browser's role is to display the data coming from the server. It is certainly possible, however, and there are multiple solutions of varying depth to pre-render, manipulate, or change the DOM using AJAX calls. We'll see what future trends bring.

Why would you want to access the DOM in Node.js?

By default, you shouldn't access your browser's actual DOM (or data derived from it) from Node.js. The client side and the server side are separated in role, functionality, and responsibility, a division based on years of experience. There are, however, situations with solid reasons to do so:

  • Gathering usage data (A/B testing, UI/UX efficiency and feedback)
  • Headless testing (Development, automation, web-scraping)

How can you access the DOM in Node.js?

  • jsdom: pure-JavaScript implementation, good for testing your own DOM/browser-related project
  • cheerio: great solution if you like/often use jQuery
  • puppeteer: Google's own way to provide headless testing using Google Chrome
  • own solution (your possible future project link here)

These solutions do not give you access to your browser's own, actual DOM by default, but you can create a project that sends some form of data about your DOM to the server, then use, render, or manipulate that data as needed.
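
As a minimal sketch of the jsdom option from the list above, the snippet below parses an HTML string and queries it with the familiar DOM API. It assumes jsdom has been installed with `npm install jsdom`; `firstHeading` is a hypothetical helper written for this example, and the try/catch exists only so the snippet does not throw where the package is missing.

```javascript
// Assumes jsdom has been installed with `npm install jsdom`.
// The try/catch only keeps this snippet from throwing where it is missing.
let JSDOM = null;
try {
  ({ JSDOM } = require('jsdom'));
} catch (err) {
  JSDOM = null;
}

// Hypothetical helper: parse an HTML string and return the text of the first
// <h1>, or null when jsdom is unavailable or the page has no <h1>.
function firstHeading(html) {
  if (!JSDOM) return null;
  const { document } = new JSDOM(html).window;
  const h1 = document.querySelector('h1');
  return h1 ? h1.textContent : null;
}

const heading = firstHeading('<html><body><h1>Hello</h1></body></html>');
console.log(heading);
```

The document here is built from a string on the server; it has no connection to any DOM living in a user's browser.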

...and yes, web-scraping and web development in terms of tools and utilities became more sophisticated and certainly easier in several fields.

Sunbow answered 12/6, 2018 at 19:54 Comment(1)
I feel like this answer misses the point of having a DOM parser in the standard library of node. When you state something like, "these solutions do not provide a way to access your browser's own, actual DOM by default", it presumes that a user is looking for access to their browser's DOM, or that the "browser's own, actual DOM" is the "one true DOM". Any library that provides DOM parsing is an "actual DOM", whether it's tied to a browser instance or not. - Romanaromanas
B
8

node.js chose not to include it in their standard library. For any functionality, there is an inevitable tradeoff between comprehensiveness, scalability, and maintainability.

That doesn't mean it's not potentially useful. There is at least one JavaScript DOM implementation intended for NodeJS (among other CommonJS implementations).

Brame answered 15/3, 2012 at 2:15 Comment(0)
H
4

You seem to have a flawed assumption that V8 and the DOM are inextricably related; that's not the case. The DOM is actually handled by WebKit. V8 doesn't handle the DOM; it handles JavaScript calls to the DOM. Don't let this discourage you. Node.js has carved out a significant niche in the realtime server market, but don't let anybody tell you it's just for servers. Node makes it possible to build almost anything with JavaScript.

It is possible to do what you're talking about. For example, there is the very good jsdom library if you really need access to the DOM, as well as node-htmlparser. There are also some really good scraping libraries that take advantage of these, like apricot.

Hance answered 11/7, 2011 at 22:33 Comment(0)
M
3

2018 answer: mainly for historical reasons, but this may change in future.

Historically, very little DOM manipulation was done on the server. Additionally, as other answers allude to, the JS stdlib and the DOM are separate libraries: if you're using node for, say, Unix scripting, then HTMLElement, NodeList, etc. aren't really relevant to that.

However, server-side DOM manipulation is now a very common part of delivering web apps. Web servers need to understand the structure of pages and, if asked to render a resource as HTML, deliver HTML content that reflects the initial state of a web application. This means web apps load much faster than if the server simply delivers a stub page and has the browser then do the work of filling in the real content. Currently this is done with jsdom and similar, but in the same way node has Request and Response objects built in, having DOM functions maintained as part of the stdlib would help with this task.
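
The difference between a stub page and a response that reflects the initial state can be sketched as follows. This is plain string templating, not a DOM; a library such as jsdom would replace the string building with node manipulation. All names here (`initialState`, `escapeHtml`, `renderStub`, `renderInitial`, `/app.js`) are hypothetical.

```javascript
// Hypothetical application state the server already knows at request time.
const initialState = { user: 'Ada', items: ['<b>one</b>', 'two'] };

// Escape text so state values cannot inject markup.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Stub approach: the browser must fetch data and build the list itself.
function renderStub() {
  return '<ul id="items"></ul><script src="/app.js"></script>';
}

// Server-rendered approach: the first response already contains the content.
function renderInitial(state) {
  const rows = state.items
    .map((item) => `<li>${escapeHtml(item)}</li>`)
    .join('');
  return `<h1>Hello, ${escapeHtml(state.user)}</h1><ul id="items">${rows}</ul>`;
}

console.log(renderStub());
console.log(renderInitial(initialState));
```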

Multifold answered 10/1, 2018 at 12:39 Comment(2)
Someone probably just read the first line and downvoted. I think it's more accurate to say Nodejs doesn't implement DOM. Leading with "mainly for historical reasons" isn't really accurate: "The DOM is not, however, typically provided by the JavaScript engine but instead by a browser." (V8 intro) - Charged
@V8intro the answer already says 'JS stdlib and the DOM are separate libraries' right at the beginning. It's not "more accurate to say Nodejs doesn't implement DOM." - we already know node doesn't implement the DOM, the question is why. And this (along with MatthewFlaschen's) is the most accurate answer. - Multifold
L
0

Javascript != browser. Javascript as a language is not tied to browsers; node.js is simply an implementation of Javascript that is intended for servers, not browsers. Hence no DOM.

Longsome answered 11/7, 2011 at 22:5 Comment(1)
When I think of a DOM, I think of the DomDocument interface, the DomElement interface and other interfaces we've all become used to in C, PHP, Javascript and probably some other languages. I also think of a parser that allows text to be parsed into DOM. These are features I've grown used to using in a PHP environment and that I'm already missing now I'm doing my first experiments in NodeJs. What do I use it for, you ask? Well, I use it for web scraping, templating and - most importantly - for web services! - Harshman
T
0

Node is a runtime environment; it does not render a DOM like a browser.

Ticklish answered 12/6, 2018 at 19:34 Comment(0)
E
-1

If you read DOM as 'linked objects immediately accessible from my script', then the answer is 'it does, but it's very different from the set of objects available to a script in a web document'. The main reason is that node is 'evented I/O for V8', not 'HTML tree objects for V8'.

Extranuclear answered 12/7, 2011 at 2:27 Comment(1)
When I think of a DOM, I think of the DomDocument interface, the DomElement interface and other interfaces we've all become used to in C, PHP, Javascript and probably some other languages. I also think of a parser that allows text to be parsed into DOM. These are features I've grown used to using in a PHP environment and that I'm already missing now I'm doing my first experiments in NodeJs. What do I use it for, you ask? Well, I use it for web scraping, templating and - most importantly - for web services! - Harshman
G
-1

Because there isn't a DOM. DOM stands for Document Object Model. There is no document in Node, so there is no DOM to manipulate. That is definitely a browser thing.

You can use a library like cheerio though which gives you some simple DOM manipulation.

Node is server-level JavaScript. It's just the language applied to a basic system API, more like C++ or Java.

Gargan answered 12/6, 2018 at 19:34 Comment(0)
H
-2

Node.js is for serverside programming. There is no DOM to be rendered in the server.

Heterogenesis answered 11/7, 2011 at 22:5 Comment(5)
Sure, but does the V8 engine not handle the DOM in the browser? Or is it created by the browser and then 'attached' (in some way) to the V8 engine to allow Javascript to interact with it? I.e. why does Node.js need what-appear-to-be fragile libraries to query the DOM, when I can use V8 in Chrome to make the queries in a fast and reliable fashion? - Cicatrize
@Cicatrize no, the V8 engine does not "handle the DOM". - Decathlon
That's a terrible answer. Even php has a DOM. - Gomar
When I think of a DOM, I think of the DomDocument interface, the DomElement interface and other interfaces we've all become used to in C, PHP, Javascript and probably some other languages. I also think of a parser that allows text to be parsed into DOM. These are features I've grown used to using in a PHP environment and that I'm already missing now I'm doing my first experiments in NodeJs. What do I use it for, you ask? Well, I use it for web scraping, templating and - most importantly - for web services! - Harshman
DOM is rendered on the server to fast load AJAX apps all the time. - Multifold
P
-2

It seems people have answered 'why' but not 'how'. A quick answer: in a web browser, a document object is exposed (hence DOM, Document Object Model). On Windows this object is called the document object. You can refer to this page and look at the methods it exposes for handling HTML documents, such as createElement. I don't use node.js and haven't done COM programming in a while, but I'd imagine you could use the DOM in node.js by simply calling the COM object IHTMLDocument3. Of course, for other platforms like Mac OS X or Linux you would probably have to use something from their OS API. This should allow you to easily build a web page server-side using the DOM, or to scrape incoming web pages.

Pyrethrum answered 16/3, 2016 at 20:7 Comment(0)
N
-5

1) What does it mean for it to have a Document Object Model? There's no document to represent.

2) Most of the time you're not retrieving pages. You can, but most Node apps probably won't be.

3) Without a document and a browser, JavaScript is just another programming language. So you may ask why there isn't a DOM in C# or Java.

Narcosynthesis answered 11/7, 2011 at 22:9 Comment(4)
There is a DOM in both the .NET and Java standard libraries, and it's reasonable to ask why it's not part of the node.js standard library. - Brame
@MatthewFlaschen Those are for XML documents which I don't believe is what the OP is talking about. There is no omnipresent document that is always guaranteed to exist like there is in a browser environment. - Narcosynthesis
True, with just the standard library he would be limited to XHTML. That alone should be valuable. However, there are third-party libraries plugging into these standard libraries for HTML. .NET has HtmlAgilityPack and Java has the org.w3c.dom.html subpackage, which extends org.w3c.dom. No one said there is an "omnipresent document". He specifically asked about a particular scenario, "web scraping." - Brame
When I think of a DOM, I think of the DomDocument interface, the DomElement interface and other interfaces we've all become used to in C, PHP, Javascript and probably some other languages. I also think of a parser that allows text to be parsed into DOM. These are features I've grown used to using in a PHP environment and that I'm already missing now I'm doing my first experiments in NodeJs. What do I use it for, you ask? Well, I use it for web scraping, templating and - most importantly - for web services! - Harshman
