The scraper should meet the following requirements:
- As in the Irma project, the final output should be an Excel file with a row for each product.
- It should include all products from [url removed, login to view] within the 'Groups' "Dagligvarer", "Specialbutikker" and "Vin".
- For each product, the following properties should be in separate columns:
  - Group (e.g. "Dagligvarer", where the current product count is 7359)
  - MainCategory (e.g. "Grønt")
  - SubCategory (e.g. "Frugt & Bær")
  - SubSubCategory (e.g. "Friske Bær")
  - Product name (e.g. "Blåbær")
  - Product subline (e.g. "125 g / Peru / Klasse 1")
  - Price (e.g. "35,00")
  - Price/weight (e.g. "280,00")
  - Weight unit (e.g. "Kg.")
  - Type ("Regular" or "Discount", see explanation below)
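As an illustration of the column layout, here is a minimal Python sketch of one output row. The names `ProductRow` and `parse_danish_decimal` are hypothetical, and the sketch assumes prices use a comma as the decimal separator and possibly a period as the thousands separator; the actual site may differ.

```python
from dataclasses import dataclass, asdict
from decimal import Decimal


def parse_danish_decimal(text: str) -> Decimal:
    """Convert a Danish-formatted number such as "1.280,00" to a Decimal."""
    # Assumption: "." is a thousands separator and "," the decimal separator.
    return Decimal(text.strip().replace(".", "").replace(",", "."))


@dataclass
class ProductRow:
    # One field per requested column (hypothetical field names).
    group: str
    main_category: str
    sub_category: str
    sub_sub_category: str
    name: str
    subline: str
    price: Decimal
    price_per_weight: Decimal
    weight_unit: str
    product_type: str = "Regular"  # "Regular" or "Discount"


# The example values from the list above.
row = ProductRow(
    group="Dagligvarer",
    main_category="Grønt",
    sub_category="Frugt & Bær",
    sub_sub_category="Friske Bær",
    name="Blåbær",
    subline="125 g / Peru / Klasse 1",
    price=parse_danish_decimal("35,00"),
    price_per_weight=parse_danish_decimal("280,00"),
    weight_unit="Kg.",
)

# asdict() yields the columns in declaration order, ready for a spreadsheet writer.
columns = list(asdict(row).keys())
```

Keeping prices as `Decimal` rather than `float` avoids rounding surprises if the values are later summed or compared in Excel.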
Further, if possible, the Type column should be set to either "Regular" or "Discount", depending on whether the product has a yellow 'Discount' tag on top of its picture. One way to identify these products would be to first map all products and then, separately, map only the products under the 'Discount' section (1259 products, which can be viewed by choosing "Se kun discount varer (1259 stk.)" on the link above).
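The two-pass approach could be sketched roughly as follows in Python. This assumes each product can be identified by a stable key that is the same in both passes (here a hypothetical `id` field, which might be a product URL or product number); the function name `tag_discounts` and the sample ids are placeholders, not taken from the site.

```python
def tag_discounts(all_products, discount_ids):
    """Return copies of all_products with a "Type" column added.

    all_products: list of dicts from the full scrape, each with an "id" key.
    discount_ids: ids collected from the separate 'Discount' section scrape.
    """
    discount_ids = set(discount_ids)  # set gives fast membership checks
    tagged = []
    for product in all_products:
        is_discount = product["id"] in discount_ids
        tagged.append({**product, "Type": "Discount" if is_discount else "Regular"})
    return tagged


# Placeholder data: ids and the second product are made up for illustration.
all_products = [
    {"id": "p-001", "Product name": "Blåbær"},
    {"id": "p-002", "Product name": "Example product"},
]
discount_ids = ["p-002"]

tagged = tag_discounts(all_products, discount_ids)
```

If the site exposes no stable identifier, a combination such as product name plus subline could serve as the key, though that risks collisions between similar products.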
To explain the output setup, I have attached a brief Excel file which lists two products.