Web Scraping Using Node JS

Nov 4, 2020

In my previous blogs, I discussed web scraping using Python and R, as well as its legality. I recommend going through those posts first, as they will help you understand this one better.

In this post, I’ll describe how to web scrape using Node JS.

Finding Quality Data is like “Looking for a needle in a haystack”.

What is Node JS?

JavaScript is a popular programming language that runs in every modern web browser.

Node JS is a runtime environment that lets JavaScript run outside the browser, and it comes with a set of useful libraries.

In short, Node JS adds functionality and features to JavaScript in the form of libraries, which makes it more powerful.

Let’s begin our topic of Web Scraping using Node JS.

I’m using Visual Studio to run this task.

Node JS Code

Step 1- Creating the “package.json” file

To create the package.json file, run npm init and fill in the few details it prompts for.

Step 2- Install & Call the required libraries

Run the following commands to install the required libraries.

[Screenshot: installing the required libraries]

Once the libraries are installed properly, you will see confirmation messages displayed in the terminal.

Call the required libraries:

[Screenshot: calling the libraries]

Step 3- Select the Website & the Data to Scrape

I picked the website “https://www.bullion-rates.com/gold/INR/2007-1-history.htm” and want to scrape the gold rates along with their dates.

Step 4- Set the URL & Check the Response Code

The Node JS code to pass the URL and check the response code looks like this:

[Screenshot: sample code]
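
As a rough illustration of this step, here is a minimal sketch and not the exact code from this post. It assumes Node 18 or later, where fetch is available as a built-in, instead of any particular npm library. The names checkResponse and fetchImpl are made up for this example; fetchImpl only exists so the function can be tried out without a real network call.

```javascript
// Sketch: request a page and check its HTTP response code.
// Assumption: Node 18+, where fetch is a built-in global.
const url = 'https://www.bullion-rates.com/gold/INR/2007-1-history.htm';

// checkResponse and fetchImpl are hypothetical names used for illustration.
async function checkResponse(targetUrl, fetchImpl = fetch) {
  const response = await fetchImpl(targetUrl);

  // A status of 200 means the server returned the page successfully.
  if (response.status !== 200) {
    throw new Error(`Request failed with status ${response.status}`);
  }

  // Resolve with the raw HTML of the page as a string.
  return response.text();
}

// Usage (this line performs a real network request, so it is commented out):
// checkResponse(url).then((html) => console.log(html.length));
```

Throwing on any status other than 200 keeps the later parsing steps from running on an error page.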

Step 5- Inspect & Find the Proper HTML tags

It’s quite easy to find the proper HTML tags in which your data is present.

To see the HTML tags, right-click on the page and select the Inspect option.

Proper HTML Tags:-

As you may have noticed, there are 3 columns in our table. The header row is identified by the name “HeaderRow”, and all the column names sit in “th” (table header) tags.

For each table row (tr), our data resides in a row identified by the name “DataRow”.

Now I need to get everything that resides under “HeaderRow”, find all the “th” tags within it, and finally iterate through the “DataRow” rows to get all the data within them.
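
The plan above can be sketched on a small piece of sample HTML. This is only an illustration under stated assumptions: it treats “HeaderRow” and “DataRow” as class attributes on tr elements, the column names and values in sampleHtml are placeholders rather than real data from the site, and the helper names (rowsByClass, cellTexts, parseTable) are made up. It uses plain regular expressions to stay dependency-free; a proper HTML parsing library would be more robust on real pages.

```javascript
// Sketch: read the column names from the header row, then iterate the data rows.
// Assumption: "HeaderRow" and "DataRow" are class attributes on <tr> elements.
// The column names and values below are placeholders, not real gold rates.
const sampleHtml = `
<table>
  <tr class="HeaderRow"><th>Date</th><th>Rate</th><th>Note</th></tr>
  <tr class="DataRow"><td>2007-01-01</td><td>100.00</td><td>a</td></tr>
  <tr class="DataRow"><td>2007-01-02</td><td>101.50</td><td>b</td></tr>
</table>`;

// Return the inner HTML of every <tr> whose class list contains className.
function rowsByClass(html, className) {
  const rowPattern = /<tr\b([^>]*)>([\s\S]*?)<\/tr>/gi;
  const rows = [];
  for (const [, attrs, inner] of html.matchAll(rowPattern)) {
    const classMatch = /class\s*=\s*["']([^"']*)["']/i.exec(attrs);
    if (classMatch && classMatch[1].split(/\s+/).includes(className)) {
      rows.push(inner);
    }
  }
  return rows;
}

// Return the trimmed text of every <th> or <td> cell inside one row.
function cellTexts(rowHtml, tag) {
  const cellPattern = new RegExp(`<${tag}\\b[^>]*>([\\s\\S]*?)</${tag}>`, 'gi');
  return [...rowHtml.matchAll(cellPattern)].map((match) =>
    match[1].replace(/<[^>]+>/g, '').trim() // drop any nested tags
  );
}

// Combine both steps: header cells first, then one array of cells per data row.
function parseTable(html) {
  const [headerRow = ''] = rowsByClass(html, 'HeaderRow');
  const headers = cellTexts(headerRow, 'th');
  const rows = rowsByClass(html, 'DataRow').map((row) => cellTexts(row, 'td'));
  return { headers, rows };
}

const { headers, rows } = parseTable(sampleHtml);
console.log(headers);
console.log(rows);
```

Keeping the headers separate from the rows makes it easy to pair each value with its column name later, for example when writing the data out to a file.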

Step 6- Include the HTML tags in our Code

After including the HTML tags, our code will be:-