Introduction to DOM

By Srijan 3 June, 2018

The Document Object Model (DOM) is an application programming interface for HTML and XML documents. It defines the logical structure of documents and the way a document is accessed and manipulated.

The DOM is an object-oriented representation of the web page. Documents are represented in the DOM as nodes and objects, which makes it possible to change their structure, style, and content with a programming language such as JavaScript. All HTML tags, and even the text inside tags, are represented as objects so that they can be easily accessed and modified as required.
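To make the idea of "accessing and modifying objects" concrete, here is a minimal sketch. The helper name `updateHeading` and the colour value are assumptions made for illustration; `querySelector`, `textContent`, and `style` are standard DOM members. The function takes the document as a parameter so that it can be tried against any object with the same shape.

```javascript
// Hypothetical helper: change the content and style of the first <h1>.
// `doc` is expected to be a Document (in a browser, the global `document`).
function updateHeading(doc, newText) {
  const heading = doc.querySelector("h1"); // first matching element, or null
  if (heading === null) {
    return false; // no <h1> in this document, nothing to update
  }
  heading.textContent = newText;     // change the content
  heading.style.color = "steelblue"; // change the style
  return true;
}

// Only runs where a DOM exists (for example, in a browser console).
if (typeof document !== "undefined") {
  updateHeading(document, "Hello DOM");
}
```

Pasted into a browser console on a page that contains an h1 tag, this should replace the heading text and recolour it.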

From the beginning, the DOM was designed so that it can be used with any programming language. In this series, we will use JavaScript for all DOM operations.

DOM Tree

When the browser parses an HTML document, it converts the document into a DOM, which it then uses for all further operations. The DOM represents the HTML document as a tree of tags. For example, consider the following document:

<!DOCTYPE HTML>
<html>
<head>
<title>DOM | Hackinbits</title>
</head>
<body>
<h1>Welcome to hackinbits</h1>
<p>Learn programming and technology in bits.</p>
</body>
</html>

Internally, the browser structures this document as a tree.

You can edit the example document and see the resulting tree structure at this link: hixie.ch
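As a rough illustration of that tree, the sketch below models the example document with plain JavaScript objects and prints it with indentation. This is a simplification and not what the browser stores: it assumes the h1 and p tags are properly closed, it leaves out the doctype node and whitespace-only text nodes, and the object shape (`name`, `children`, `text`) and the `renderTree` function are hypothetical.

```javascript
// Simplified, hand-written model of the example document's tree.
// Hypothetical shape: { name, children } for elements, { text } for text nodes.
const tree = {
  name: "html",
  children: [
    { name: "head", children: [
      { name: "title", children: [{ text: "DOM | Hackinbits" }] },
    ] },
    { name: "body", children: [
      { name: "h1", children: [{ text: "Welcome to hackinbits" }] },
      { name: "p", children: [{ text: "Learn programming and technology in bits." }] },
    ] },
  ],
};

// Return one line per node, indented by two spaces per level of depth.
function renderTree(node, depth = 0) {
  const indent = "  ".repeat(depth);
  if (node.text !== undefined) {
    return [`${indent}#text: ${node.text}`];
  }
  const lines = [`${indent}${node.name.toUpperCase()}`];
  for (const child of node.children) {
    lines.push(...renderTree(child, depth + 1));
  }
  return lines;
}

console.log(renderTree(tree).join("\n"));
// HTML
//   HEAD
//     TITLE
//       #text: DOM | Hackinbits
//   BODY
//     H1
//       #text: Welcome to hackinbits
//     P
//       #text: Learn programming and technology in bits.
```

Each level of indentation corresponds to one parent-child link, so the text inside a tag appears as a child of that tag.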

Parsing of HTML Document by Browser


Let's briefly discuss how the browser parses an HTML document and generates the DOM. When the browser processes an HTML document, it performs the following steps:

  1. Conversion: The browser first converts the received raw data (bytes) into individual characters, based on the character encoding specified for the document (for example, UTF-8).
  2. Tokenizing: Next, the browser reads the strings of characters obtained in the first step and converts them into distinct tokens, as specified by the W3C standards; for example, "<html>" is a token.
  3. Lexing: The tokens produced in the second step are converted into "objects", which define their properties and rules.
  4. DOM construction: The objects created in this way are then linked into a tree data structure, which also captures the relationships between HTML tags as defined in the original document. For example, the html object is the parent of the body object, the body object is the parent of the paragraph object, and so on.

The DOM generated by the steps above is used by the browser for all further processing.
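The four steps can be sketched in a few lines of JavaScript. This is a toy model under strong assumptions, not the algorithm a real browser follows: it handles only plain start tags, end tags, and text (no doctype, attributes, comments, or error recovery), it assumes UTF-8 input, and the function names `decode`, `tokenize`, and `buildTree` are invented for illustration. `TextEncoder` and `TextDecoder` are standard globals in browsers and in Node.js.

```javascript
// Step 1, conversion: decode raw bytes into characters (UTF-8 assumed).
function decode(bytes) {
  return new TextDecoder("utf-8").decode(bytes);
}

// Step 2, tokenizing: split the characters into start-tag, end-tag,
// and text tokens. Toy rules only: tags without attributes.
function tokenize(html) {
  const tokens = [];
  const pattern = /<(\/?)([a-zA-Z][a-zA-Z0-9]*)>|([^<]+)/g;
  let match;
  while ((match = pattern.exec(html)) !== null) {
    const [, slash, name, text] = match;
    if (name !== undefined) {
      const type = slash === "/" ? "endTag" : "startTag";
      tokens.push({ type, name: name.toLowerCase() });
    } else if (text.trim() !== "") {
      tokens.push({ type: "text", value: text.trim() });
    }
  }
  return tokens;
}

// Steps 3 and 4: turn tokens into node objects and link them into a tree.
function buildTree(tokens) {
  const root = { name: "#document", children: [] };
  const stack = [root]; // open elements; the last entry is the current parent
  for (const token of tokens) {
    const parent = stack[stack.length - 1];
    if (token.type === "startTag") {
      const element = { name: token.name, children: [] };
      parent.children.push(element);
      stack.push(element); // later nodes become children of this element
    } else if (token.type === "endTag") {
      // Close the current element only if the end tag matches it.
      if (stack.length > 1 && parent.name === token.name) {
        stack.pop();
      }
    } else {
      parent.children.push({ name: "#text", value: token.value, children: [] });
    }
  }
  return root;
}

// Simulate data received from the network, then run the pipeline.
const bytes = new TextEncoder().encode(
  "<html><body><p>Learn programming and technology in bits.</p></body></html>"
);
const dom = buildTree(tokenize(decode(bytes)));
const htmlNode = dom.children[0];
const bodyNode = htmlNode.children[0];
console.log(bodyNode.children[0].name); // "p"
```

The stack of open elements is what captures the parent-child relationships: a start tag pushes a new parent, and the matching end tag pops it again.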

In the next article, we will look at the DOM tree in detail and at how we can use JavaScript to modify the structure, content, and style of an HTML document.
