Scraping information from a website using httpClient()

Hey everyone, I am building an inventory for a customer, and they want accurate pricing added to a database. Is it possible to use httpClient() to scrape a price from a website?
For example, I want to get the price of the fuse on this page:

https://www.automationdirect.com/adc/shopping/catalog/circuit_protection_-z-_fuses_-z-_disconnects/fuses/current_limiting_class_j_fuses/jdl12

Can someone give an example of how this could be accomplished? Which response method do I need to use?

Maybe, but not likely. Most major vendor websites are JavaScript-driven dynamic sites, or have lots of dynamic substitutions within the pages that carry the information of interest. Such sites are very difficult to scrape, and that is before the actual parsing work, which is non-trivial in its own right.

So would it be best to just have a link on screen for them to follow in order to update the price manually?

You should contact Automation Direct to ask if they have a price-lookup web API.


Will do, thank you @pturmel

FWIW, the way that particular website is built, it naturally has a REST API backing it.
E.g., a GET request to https://www.automationdirect.com/rest/gridCompare/solr/JDL12 returns this JSON:

{
    "response":
    {
        "docs":
        [
            {
                "Agency_Approvals_ms":
                [
                    "UL Listed",
                    "CE",
                    "CSA"
                ],
                "Amperage_Rating_mf":
                [
                    12.0
                ],
                "Amperage_Rating_ms":
                [
                    "12A"
                ],
                "Application_ta":
                [
                    "Feeder/branch protection applications"
                ],
                "Body_Type_ms":
                [
                    "Ferrule"
                ],
                "Brand_Name_ms":
                [
                    "Edison"
                ],
                "Current_M_Limiting_ms":
                [
                    "Yes"
                ],
                "Curve_Characteristics_ms":
                [
                    "Time-delay"
                ],
                "Fuse_Class_ms":
                [
                    "Class J"
                ],
                "Interrupting_Rating_mf":
                [
                    200.0
                ],
                "Interrupting_Rating_ms":
                [
                    "200kA @ 600VAC"
                ],
                "Item_Type_ms":
                [
                    "Fuse"
                ],
                "Made_in_China_s": "No",
                "Made_in_USA_s": "No",
                "Quantity_per_Pack_ta":
                [
                    "10"
                ],
                "Series_ms":
                [
                    "JDL"
                ],
                "Size_mf":
                [
                    13.0
                ],
                "Size_ms":
                [
                    "13/16x2-1/4in"
                ],
                "Voltage_Rating_mf":
                [
                    600.0
                ],
                "Voltage_Rating_ms":
                [
                    "600 VAC"
                ],
                "_version_": 1777683988331102209,
                "alt_keywords": "",
                "download_flg": 0,
                "downloadurl": "",
                "facet_list":
                [
                    "Item_Type_ms:10",
                    "Brand_Name_ms:11",
                    "Amperage_Rating_ms:110",
                    "Voltage_Rating_ms:120",
                    "Series_ms:12",
                    "Size_ms:400",
                    "Fuse_Class_ms:30",
                    "Curve_Characteristics_ms:190",
                    "Body_Type_ms:390",
                    "Current_M_Limiting_ms:140",
                    "Interrupting_Rating_ms:125"
                ],
                "image_file_name": "jdl12.jpg",
                "item_attribute": "Fuse",
                "item_code": "JDL12",
                "item_filter": "Fuse",
                "item_type": "Fuse: 10/pk, Class J, 12A (PN# JDL12)",
                "leadtime_cd": 0,
                "license_type": 0,
                "ltl_flg": 0,
                "made_in_country": "MEXICO",
                "main_category_id": "1972172",
                "on_sale_flg": 0,
                "orderable_flg": 1,
                "p1_category": "circuit protection / fuses / disconnects",
                "p1_node_id": 548783,
                "parent_nodeid":
                [
                    1972172,
                    548783,
                    1972170
                ],
                "price": 248.0,
                "primary_desc": "Edison fuse, JDL series, Class J, current-limiting, time-delay, 12A, 600 VAC, ferrule. Package of 10. Feeder/branch protection applications.",
                "prod_status": 0,
                "prod_type": 1,
                "production_time": -1,
                "rating_enabled": 1,
                "seq_1972170_s": "000006/000027",
                "seq_1972172_s": "000027",
                "seq_548783_s": "000006/000006/000027",
                "shipping_weight": 0.98,
                "spec_url": "/static/specs/efusejdl.pdf",
                "stock_loc_zip": "30040",
                "tech_attributes":
                [
                    "Brand: Edison",
                    "Item: Fuse",
                    "Series: JDL",
                    "Fuse Class: Class J",
                    "Current-Limiting: Yes",
                    "Curve Characteristics: Time-delay",
                    "Amperage Rating: 12A",
                    "Voltage Rating: 600 VAC",
                    "Body Type: Ferrule",
                    "Size: 13/16x2-1/4in or 20.6x57.2mm",
                    "Quantity per Pack: 10",
                    "Application: Feeder/branch protection applications"
                ],
                "timestamp": 1695331158926,
                "treepath":
                [
                    "/catalog/circuit_protection_-z-_fuses_-z-_disconnects/fuses/current_limiting_class_j_fuses/jdl12"
                ],
                "unit_of_measure": "EA",
                "url_fullpath": "/catalog/circuit_protection_-z-_fuses_-z-_disconnects/fuses/current_limiting_class_j_fuses/jdl12",
                "url_fullpath_index": "circuit protection / fuses / disconnects > fuses > current limiting class j fuses > jdl12",
                "viewable_flg": 1,
                "warranty": "12 months",
                "warranty_cd": 0
            }
        ],
        "numFound": 1,
        "numFoundExact": true,
        "start": 0
    },
    "responseHeader":
    {
        "QTime": 0,
        "params":
        {
            "q": "item_code:JDL12",
            "wt": "json"
        },
        "status": 0
    }
}
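
Once you have that JSON, getting the price out is just a dictionary lookup. Here is a minimal sketch in plain Python, assuming the response keeps the shape shown above. The names `extract_price` and `SAMPLE_RESPONSE` are made up for illustration, and the sample is a trimmed copy of the response so the snippet runs without touching the network. In Ignition I would expect the decoded payload to come from the response's `json` property (something like `system.net.httpClient().get(url).json`), but I have not run that against this endpoint, so treat it as an assumption and check the manual.

```python
import json

# Trimmed copy of the response shown above; only the fields used here are kept.
SAMPLE_RESPONSE = '''
{
    "response": {
        "docs": [
            {
                "item_code": "JDL12",
                "price": 248.0,
                "unit_of_measure": "EA"
            }
        ],
        "numFound": 1
    }
}
'''


def extract_price(payload, item_code):
    """Return the price for item_code from a decoded response, or None if absent.

    Hypothetical helper: it assumes the response -> docs -> price layout
    seen in the sample and is not part of any vendor or Ignition API.
    """
    docs = payload.get("response", {}).get("docs", [])
    for doc in docs:
        # Compare case-insensitively, since the page URL uses "jdl12"
        # while the JSON uses "JDL12".
        if doc.get("item_code", "").upper() == item_code.upper():
            return doc.get("price")
    return None


# Assumption: in Ignition, `payload` would come from the HTTP response
# instead of from a hard-coded string.
payload = json.loads(SAMPLE_RESPONSE)
price = extract_price(payload, "JDL12")
print(price)
```

Returning None when the item is missing lets the caller decide whether to keep the old price in the database or flag the part for a manual check.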

However, blindly machine-scraping a publicly accessible API is a good way to get yourself blacklisted by any network operator who's paying attention, so the much better option, from a corporate and personal liability standpoint, is to work with the vendor to use an API officially. Web scraping is officially legal (in the US, as of this writing; I am not a lawyer and this is not legal advice), but it can still be considered, in short, a "dick move". It should not be undertaken lightly, and it can quickly balloon in complexity.
