Web Scraping Using Node JS..!

In my previous blogs, I discussed web scraping using Python and R, as well as its legality. I recommend going through those blogs first; they will help you understand this one better.

In this post, I’ll show you how to scrape the web using Node JS.

Finding Quality Data is like “Looking for a needle in a haystack”.

What is Node JS?

JavaScript is a popular programming language and it runs in any web browser.

Node JS is a runtime environment that runs JavaScript outside the browser, and it comes with a set of useful built-in libraries.

In short, Node JS adds functionality and features to JavaScript in the form of libraries and makes it more powerful.
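
As a small illustration of those libraries, the sketch below uses two modules that ship with Node JS, path and os, so nothing needs to be installed. The file name passed to path.extname is only an example.

```javascript
// Built-in Node JS modules: available without running npm install.
const path = require('path');
const os = require('os');

// path.extname returns the extension part of a file name.
const extension = path.extname('package.json');
console.log('Extension:', extension);

// os.platform reports the operating system Node JS is running on,
// for example 'linux', 'darwin' or 'win32'.
console.log('Platform:', os.platform());
```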

Let’s begin our topic of Web Scraping using Node JS.

I’m using Visual Studio to run this task.

Node JS Code

Step 1- Creating the “package.json” file

To create the package.json file, I need to run npm init and answer its prompts (package name, version, description, entry point and so on).

Step 2- Install & Call the required libraries

Install each library from the terminal with npm install followed by the library name.

Once the libraries are installed properly, npm prints a summary of the packages it added.

Then load the required libraries at the top of your script with require.

Step 3- Select the Website & the Data to Scrape

I picked the website “https://www.bullion-rates.com/gold/INR/2007-1-history.htm” and want to scrape the gold rates along with their dates.

Step 4- Set the URL & Check the Response Code

The Node JS code passes in the URL and checks the response code; a response code of 200 means the page loaded successfully.
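
A minimal sketch of this step is shown here. It assumes Node's built-in https module rather than whichever request library the original code used, and fetchPage, isSuccess and main are hypothetical helper names introduced only for illustration.

```javascript
// Sketch only: uses Node's built-in https module, which may differ from
// the library used in the original code.
const https = require('https');

const url = 'https://www.bullion-rates.com/gold/INR/2007-1-history.htm';

// Hypothetical helper: sends a GET request and resolves with the
// response code and the HTML body once the whole page has arrived.
function fetchPage(pageUrl) {
  return new Promise((resolve, reject) => {
    https
      .get(pageUrl, (res) => {
        let body = '';
        res.setEncoding('utf8');
        res.on('data', (chunk) => {
          body += chunk;
        });
        res.on('end', () => resolve({ statusCode: res.statusCode, body }));
      })
      .on('error', reject);
  });
}

// Hypothetical helper: a response code of 200 means the request succeeded.
function isSuccess(statusCode) {
  return statusCode === 200;
}

// Fetches the page, prints the response code and reports the size of the HTML.
async function main() {
  const { statusCode, body } = await fetchPage(url);
  console.log('Response code:', statusCode);
  if (isSuccess(statusCode)) {
    console.log('Fetched', body.length, 'characters of HTML');
  }
}
```

To try it against the live site, add main().catch(console.error) at the end of the file. This sketch does not follow redirects, so a 301 or 302 response code would need extra handling.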

Step 5- Inspect & Find the Proper HTML tags

It’s quite easy to find the HTML tags that contain your data.

To see the HTML tags, right-click on the page and select the Inspect option.

Proper HTML Tags:-

If you look closely, there are 3 columns in our table. The table's header row is marked “HeaderRow”, and all the column names sit inside “th” (table header) tags.

Each table row (tr) that holds our data is marked “DataRow”.

So I need to find the “HeaderRow”, read all the “th” tags under it, and finally iterate through every “DataRow” to get the data inside it.

Step 6- Include the HTML tags in our Code

Next, add these HTML tags to the code so that it reads the column names first and then each row of data.
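
The sketch below shows one way this step could look. It makes several assumptions: that “HeaderRow” and “DataRow” are class names on tr elements, that the data cells are td tags, and that plain regular expressions are good enough for this simple table. It uses regular expressions only so that it runs without installing anything; an HTML parsing library such as cheerio is more robust on real pages. The sample HTML and all the helper names are made up for illustration.

```javascript
// Assumption: "HeaderRow" and "DataRow" are class names on <tr> elements.
// This sample markup is invented placeholder data, not copied from the site.
const sampleHtml = `
<table>
  <tr class="HeaderRow"><th>Date</th><th>Rate</th><th>Third column</th></tr>
  <tr class="DataRow"><td>date-1</td><td>rate-1</td><td>value-1</td></tr>
  <tr class="DataRow"><td>date-2</td><td>rate-2</td><td>value-2</td></tr>
</table>`;

// Remove nested tags and surrounding whitespace from a cell's inner HTML.
function cellText(innerHtml) {
  return innerHtml.replace(/<[^>]*>/g, '').trim();
}

// Return the inner HTML of every <tr> whose class attribute includes className.
function rowsByClass(html, className) {
  const rowPattern = /<tr\b([^>]*)>([\s\S]*?)<\/tr>/gi;
  const rows = [];
  for (const match of html.matchAll(rowPattern)) {
    const classAttr = /class\s*=\s*["']([^"']*)["']/i.exec(match[1]);
    if (classAttr && classAttr[1].split(/\s+/).includes(className)) {
      rows.push(match[2]);
    }
  }
  return rows;
}

// Return the text of every cell (th or td) inside one row.
function cellsOf(rowHtml, cellTag) {
  const cellPattern = new RegExp(
    `<${cellTag}\\b[^>]*>([\\s\\S]*?)</${cellTag}>`,
    'gi'
  );
  return [...rowHtml.matchAll(cellPattern)].map((m) => cellText(m[1]));
}

// Read the column names from the HeaderRow, then the cells of each DataRow.
function scrapeTable(html) {
  const headerRows = rowsByClass(html, 'HeaderRow');
  const headers = headerRows.length > 0 ? cellsOf(headerRows[0], 'th') : [];
  const data = rowsByClass(html, 'DataRow').map((row) => cellsOf(row, 'td'));
  return { headers, data };
}

const table = scrapeTable(sampleHtml);
```

Here table.headers holds the three placeholder column names and table.data holds one array of cell values per DataRow. On the real page, the HTML body returned by the request would be passed to scrapeTable in place of sampleHtml.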

Step 7- Cross-check the Data

Print the data to the console as logs and check that it matches the table on the website.

If you go to a more granular level of HTML tags and iterate over them accordingly, you will get more precise data.
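
As a rough sketch of this check, the code below prints a scraped table with console.log and console.table. The columnNames and rowValues arrays are hypothetical placeholder data standing in for the real scraped result, and toRecords is a helper name introduced only for this example.

```javascript
// Hypothetical scraped result: column names plus one array of cells per row.
// The values are placeholders, not real gold rates.
const columnNames = ['Date', 'Rate', 'Third column'];
const rowValues = [
  ['date-1', 'rate-1', 'value-1'],
  ['date-2', 'rate-2', 'value-2'],
];

// Turn each row array into an object keyed by column name.
function toRecords(names, rows) {
  return rows.map((row) =>
    Object.fromEntries(names.map((name, index) => [name, row[index]]))
  );
}

// Print the header and every row as plain log lines.
console.log(columnNames.join(' | '));
rowValues.forEach((row) => console.log(row.join(' | ')));

// console.table prints the same records as a formatted grid.
const records = toRecords(columnNames, rowValues);
console.table(records);
```

Comparing a few printed rows with the table shown in the browser is usually enough to confirm that the right tags were selected.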

Conclusion-

I have tried to explain web scraping using Node JS concisely. I hope this helps you understand it better.

Find the full code on GitHub.

If you have any questions about the code or web scraping in general, reach out to me on LinkedIn: linkedin.com/in/gyan-vardhan-347570163

We will meet again with something new.

Till then,

Happy Coding..!

Data Scientist at Private Organization