Introduction
The Document Object Model (DOM) is an application programming interface (API) for HTML and XML documents. It defines the logical structure of documents and the way a document is accessed and manipulated.
The DOM is an object-oriented representation of the web page. A document is represented as nodes and objects, which makes it possible to change its structure, style, and content with a programming language such as JavaScript. Every HTML tag, and even the text inside a tag, is represented as an object, so it can be easily accessed and modified as needed.
The DOM was designed from the beginning to be usable from any programming language. In this series, we will use JavaScript for all DOM operations.
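As a small taste of what accessing and modifying these objects looks like in JavaScript, here is a minimal sketch. The helper name highlightHeading and the replacement text are made up for illustration, and the sketch assumes the page contains an h1 element; querySelector, createElement, textContent, style, and appendChild are standard DOM members.

```javascript
// Hypothetical helper (not from the article): changes content, style, and
// structure through whatever document object it is given.
function highlightHeading(doc) {
  const heading = doc.querySelector("h1");      // access an element node
  if (heading === null) return null;            // nothing to do without an <h1>

  heading.textContent = "Welcome to the DOM";   // change content
  heading.style.color = "teal";                 // change style

  const note = doc.createElement("p");          // create a new element node
  note.textContent = "This paragraph was added with JavaScript.";
  doc.body.appendChild(note);                   // change structure
  return note;
}

// In a browser, the global `document` is the DOM of the current page.
if (typeof document !== "undefined") {
  highlightHeading(document);
}
```

Passing the document in as a parameter, rather than reading the global directly, keeps the helper easy to try against any document object.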
DOM Tree
When the browser parses an HTML document, it converts it into a DOM, which is then used for all further operations. The DOM represents the HTML document as a tree of nodes. For example, consider the following document:
<!DOCTYPE HTML>
<html>
<head>
  <title>DOM | Hackinbits</title>
</head>
<body>
  <h1>Welcome to hackinbits</h1>
  <p>Learn programming and technology in bits.</p>
</body>
</html>
You can edit the example document and view its tree structure in the Live DOM Viewer at hixie.ch.
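To make the tree shape concrete, the sketch below models the example document with plain JavaScript objects and prints an indented outline. The helpers node, text, and outline are hypothetical names used only for illustration. This is a simplified stand-in, not the browser's real DOM, which also contains the doctype and whitespace text nodes.

```javascript
// Hypothetical helpers: plain-object stand-ins for DOM element and text nodes.
function node(name, children = []) {
  return { name, children };
}
function text(value) {
  return { name: "#text", value, children: [] };
}

// The example document, written out as a tree by hand.
const tree = node("html", [
  node("head", [node("title", [text("DOM | Hackinbits")])]),
  node("body", [
    node("h1", [text("Welcome to hackinbits")]),
    node("p", [text("Learn programming and technology in bits.")]),
  ]),
]);

// Walk the tree depth-first and return one indented line per node.
function outline(n, depth = 0) {
  const label = n.name === "#text" ? `#text: ${n.value}` : n.name.toUpperCase();
  const lines = ["  ".repeat(depth) + label];
  for (const child of n.children) {
    lines.push(...outline(child, depth + 1));
  }
  return lines;
}

console.log(outline(tree).join("\n"));
```

Each level of indentation in the output corresponds to one parent-child step in the tree: HTML contains HEAD and BODY, BODY contains H1 and P, and the text sits in leaf nodes.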
Parsing of HTML Document by Browser
Let's briefly discuss how the browser parses the HTML document and generates the DOM. When the browser processes the HTML document, it performs the following steps:
- Conversion: The browser first converts the raw bytes it receives into individual characters, based on the document's specified character encoding (e.g., UTF-8).
- Tokenizing: Next, the browser reads the stream of characters obtained from the first step and converts it into distinct tokens, as specified by the HTML standard; for example, "<html>" is a token.
- Lexing: The tokens produced in the second step are converted into "objects", which define their properties and rules.
- DOM construction: The objects created in this way are then linked into a tree data structure, which also captures the relationships between HTML tags as defined in the original document. For example, the html object is the parent of the body object, the body object is the parent of the paragraph object, and so on.
The DOM generated by the above steps is used by the browser for all further processing.
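The steps above can be sketched in a few lines of JavaScript. This is a deliberately simplified illustration, not how a real browser parser works: it ignores attributes, comments, and error recovery, and the function names decodeBytes, tokenize, and buildTree are hypothetical. TextDecoder and TextEncoder are standard APIs available in browsers and Node.js.

```javascript
// 1. Conversion: raw bytes -> characters, using the document's encoding.
function decodeBytes(bytes, encoding = "utf-8") {
  return new TextDecoder(encoding).decode(bytes);
}

// 2. Tokenizing: characters -> start-tag, end-tag, and text tokens.
function tokenize(html) {
  const pattern =
    /<!(?<decl>[^>]*)>|<\/(?<end>[a-zA-Z][a-zA-Z0-9]*)\s*>|<(?<start>[a-zA-Z][a-zA-Z0-9]*)[^>]*>|(?<text>[^<]+)/g;
  const tokens = [];
  for (const match of html.matchAll(pattern)) {
    const { end, start, text } = match.groups;
    if (end) tokens.push({ type: "endTag", name: end.toLowerCase() });
    else if (start) tokens.push({ type: "startTag", name: start.toLowerCase() });
    else if (text && text.trim()) tokens.push({ type: "text", value: text.trim() });
    // Declarations such as <!DOCTYPE ...> and whitespace-only text are skipped.
  }
  return tokens;
}

// 3 and 4. Lexing and DOM construction: tokens -> node objects linked in a tree.
function buildTree(tokens) {
  const root = { name: "#document", children: [], parent: null };
  const open = [root]; // stack of elements that are currently open
  for (const token of tokens) {
    const parent = open[open.length - 1];
    if (token.type === "startTag") {
      const element = { name: token.name, children: [], parent };
      parent.children.push(element);
      open.push(element);
    } else if (token.type === "endTag") {
      // Close the current element only if the end tag matches it.
      if (open.length > 1 && parent.name === token.name) open.pop();
    } else {
      parent.children.push({ name: "#text", value: token.value, children: [], parent });
    }
  }
  return root;
}

// Usage: start from bytes, as a browser would when receiving a response.
const bytes = new TextEncoder().encode(
  "<html><head><title>DOM | Hackinbits</title></head>" +
  "<body><h1>Welcome to hackinbits</h1>" +
  "<p>Learn programming and technology in bits.</p></body></html>"
);
const dom = buildTree(tokenize(decodeBytes(bytes)));
const html = dom.children[0];
const body = html.children[1];
console.log(body.parent.name);                            // html
console.log(body.children.map((c) => c.name).join(", ")); // h1, p
```

The stack of open elements is what turns a flat token list into parent-child relationships: each new node is attached to whichever element is currently on top of the stack.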
In the next article, we will look at the DOM tree in detail and see how to use JavaScript to modify the structure, content, and style of an HTML document.
Useful Resources
DOM Specification - whatwg.org