Scraper issue with allrecipes.com #1481

Open
jachkoune opened this issue Jan 23, 2025 · 2 comments
@jachkoune

Recipe URL with the issue:

Which data is not being scraped correctly?
(e.g. ingredients, instructions, etc):
Ingredients

What should be shown instead?
The ingredients list is returned twice for allrecipes recipes. Below is a sample:
['1 pound medium shrimp - peeled and deveined', '2 (13.5 ounce) cans canned coconut milk', '2 cups water', '1 (1 inch) piece galangal, thinly sliced', '4 stalks lemon grass, bruised and chopped', '10 makrut lime leaves, torn in half', '1 pound shiitake mushrooms, sliced', '¼ cup lime juice', '3 tablespoons fish sauce', '¼ cup brown sugar', '1 teaspoon curry powder', '1 tablespoon green onion, thinly sliced', '1 teaspoon dried red pepper flakes', '1 pound medium shrimp - peeled and deveined', '2 (13.5 ounce) cans canned coconut milk', '2 cups water', '1 (1 inch) piece galangal, thinly sliced', '4 stalks lemon grass, bruised and chopped', '10 makrut lime leaves, torn in half', '1 pound shiitake mushrooms, sliced', '¼ cup lime juice', '3 tablespoons fish sauce', '¼ cup brown sugar', '1 teaspoon curry powder', '1 tablespoon green onion, thinly sliced', '1 teaspoon dried red pepper flakes']


Optional information that helps us understand our users better:

  • Which version of recipe-scrapers are you using?
  • How did you discover the package?
  • What's your use case for recipe-scrapers?
    Feel free to delete this section.
@jachkoune jachkoune added the bug label Jan 23, 2025
@jachkoune jachkoune changed the title Scraper issue with <website> Scraper issue with allrecipes.com Jan 23, 2025
@jknndy
Collaborator

jknndy commented Feb 5, 2025

Hi @jachkoune, I'm unable to replicate your issue. Could you check which version you're using and provide a full copy of your terminal output?

Output from my run
>>> from urllib.request import urlopen
>>> from recipe_scrapers import scrape_html
>>> url = "https://www.allrecipes.com/recipe/100814/authentic-thai-coconut-soup/"
>>> html = urlopen(url).read().decode("utf-8")
>>> scraper = scrape_html(html, org_url=url)
>>> print(scraper.title())
Authentic Thai Coconut Soup
>>> print(scraper.ingredients())
['1 pound medium shrimp - peeled and deveined', '2 (13.5 ounce) cans canned coconut milk', '2 cups water', '1 (1 inch) piece galangal, thinly sliced', '4 stalks lemon grass, bruised and chopped', '10 makrut lime leaves, torn in half', '1 pound shiitake mushrooms, sliced', '¼ cup lime juice', '3 tablespoons fish sauce', '¼ cup brown sugar', '1 teaspoon curry powder', '1 tablespoon green onion, thinly sliced', '1 teaspoon dried red pepper flakes']

@jachkoune
Author

The issue appears when using:
import requests
req = requests.get(url)
scraper = scrape_html(req.text, org_url=url, wild_mode=True)

I tested with the code below and it works well, but some websites don't accept connections made with urlopen:

response = urlopen(url)
html = response.read().decode("utf-8")
scraper = scrape_html(html, org_url=url, wild_mode=True)

Using the requests library, I get doubled ingredients:

{ 'author': 'MIREYZ', 'canonical_url': 'https://www.allrecipes.com/recipe/100814/authentic-thai-coconut-soup/', 'category': 'Lunch', 'cook_time': 25, 'cuisine': 'Thai', 'description': "This authentic Thai coconut soup is made with galangal and lime leaves combined with coconut milk and shrimp to create a delicious Thai soup. If you have all the ingredients on hand it's easy to put together and makes the perfect comfort food.", 'host': 'allrecipes.com', 'image': 'https://www.allrecipes.com/thmb/UZvpxy4fHsJGrNEKq-KwRKFKR3c=/1500x0/filters:no_upscale():max_bytes(150000):strip_icc()/550900-authentic-thai-coconut-soup-naurek-4x3-1-891ce479a12747a19088226188a04326.jpg', 'ingredient_groups': [ { 'ingredients': ['1 pound medium shrimp - peeled and deveined', '2 (13.5 ounce) cans canned coconut milk', '2 cups water', '1 (1 inch) piece galangal, thinly sliced', '4 stalks lemon grass, bruised and chopped', '10 makrut lime leaves, torn in half', '1 pound shiitake mushrooms, sliced', '¼ cup lime juice', '3 tablespoons fish sauce', '¼ cup brown sugar', '1 teaspoon curry powder', '1 tablespoon green onion, thinly sliced', '1 teaspoon dried red pepper flakes', '1 pound medium shrimp - peeled and deveined', '2 (13.5 ounce) cans canned coconut milk', '2 cups water', '1 (1 inch) piece galangal, thinly sliced', '4 stalks lemon grass, bruised and chopped', '10 makrut lime leaves, torn in half', '1 pound shiitake mushrooms, sliced', '¼ cup lime juice', '3 tablespoons fish sauce', '¼ cup brown sugar', '1 teaspoon curry powder', '1 tablespoon green onion, thinly sliced', '1 teaspoon dried red pepper flakes'], 'purpose': None }], 'ingredients': ['1 pound medium shrimp - peeled and deveined', '2 (13.5 ounce) cans canned coconut milk', '2 cups water', '1 (1 inch) piece galangal, thinly sliced', '4 stalks lemon grass, bruised and chopped', '10 makrut lime leaves, torn in half', '1 pound shiitake mushrooms, sliced', '¼ cup lime juice', '3 tablespoons fish sauce', '¼ cup brown sugar', '1 
teaspoon curry powder', '1 tablespoon green onion, thinly sliced', '1 teaspoon dried red pepper flakes', '1 pound medium shrimp - peeled and deveined', '2 (13.5 ounce) cans canned coconut milk', '2 cups water', '1 (1 inch) piece galangal, thinly sliced', '4 stalks lemon grass, bruised and chopped', '10 makrut lime leaves, torn in half', '1 pound shiitake mushrooms, sliced', '¼ cup lime juice', '3 tablespoons fish sauce', '¼ cup brown sugar', '1 teaspoon curry powder', '1 tablespoon green onion, thinly sliced', '1 teaspoon dried red pepper flakes'], 'instructions': 'Bring a pot of water to a boil. Boil shrimp until cooked, about 1 minute. Drain shrimp and set aside.\nPour coconut milk and 2 cups water into a large saucepan; bring to a simmer. Add galangal, lemongrass, and lime leaves; simmer until flavors are infused about 10 minutes.\nStrain coconut milk into a separate pan and discard spices. Simmer shiitake mushrooms in coconut milk for 5 minutes. Stir in lime juice, fish sauce, and brown sugar. Season to taste with curry powder.\nTo serve, reheat shrimp in soup and ladle into serving bowls. Garnish with green onion and red pepper flakes.', 'instructions_list': ['Bring a pot of water to a boil. Boil shrimp until cooked, about 1 minute. Drain shrimp and set aside.', 'Pour coconut milk and 2 cups water into a large saucepan; bring to a simmer. Add galangal, lemongrass, and lime leaves; simmer until flavors are infused about 10 minutes.', 'Strain coconut milk into a separate pan and discard spices. Simmer shiitake mushrooms in coconut milk for 5 minutes. Stir in lime juice, fish sauce, and brown sugar. Season to taste with curry powder.', 'To serve, reheat shrimp in soup and ladle into serving bowls. 
Garnish with green onion and red pepper flakes.'], 'language': 'en', 'nutrients': { 'calories': '314 kcal', 'carbohydrateContent': '17 g', 'cholesterolContent': '86 mg', 'fiberContent': '2 g', 'proteinContent': '15 g', 'saturatedFatContent': '18 g', 'sodiumContent': '523 mg', 'sugarContent': '8 g', 'fatContent': '22 g', 'unsaturatedFatContent': '0 g' }, 'prep_time': 15, 'ratings': 4.6, 'ratings_count': 94, 'site_name': 'Allrecipes', 'title': 'Authentic Thai Coconut Soup', 'total_time': 40, 'yields': '8 servings' }
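
Until the cause is identified, a caller could guard against this exact symptom on their side. The following is a minimal sketch, not part of recipe-scrapers: `collapse_doubled_list` is a hypothetical helper that only acts when a list is the same sequence repeated twice back to back, which is the pattern in the output above. It assumes the duplication always takes that form.

```python
from typing import List


def collapse_doubled_list(items: List[str]) -> List[str]:
    """Return the first half of `items` if the list is one sequence repeated twice.

    Any other list is returned unchanged, so an ingredient that legitimately
    appears more than once in a recipe is left alone.
    """
    half, remainder = divmod(len(items), 2)
    if half > 0 and remainder == 0 and items[:half] == items[half:]:
        return items[:half]
    return list(items)


# Shortened stand-in for the doubled output of scraper.ingredients()
doubled = ["2 cups water", "¼ cup lime juice", "2 cups water", "¼ cup lime juice"]
print(collapse_doubled_list(doubled))
```

This is a workaround for the symptom only. It would also collapse a genuine two-item list whose entries happen to be identical, so it is best applied only where the doubling has been observed.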

Using urlopen, I get:
{ 'author': 'MIREYZ', 'canonical_url': 'https://www.allrecipes.com/recipe/100814/authentic-thai-coconut-soup/', 'category': 'Lunch', 'cook_time': 25, 'cuisine': 'Thai', 'description': "This authentic Thai coconut soup is made with galangal and lime leaves combined with coconut milk and shrimp to create a delicious Thai soup. If you have all the ingredients on hand it's easy to put together and makes the perfect comfort food.", 'host': 'allrecipes.com', 'image': 'https://www.allrecipes.com/thmb/UZvpxy4fHsJGrNEKq-KwRKFKR3c=/1500x0/filters:no_upscale():max_bytes(150000):strip_icc()/550900-authentic-thai-coconut-soup-naurek-4x3-1-891ce479a12747a19088226188a04326.jpg', 'ingredient_groups': [ { 'ingredients': ['1 pound medium shrimp - peeled and deveined', '2 (13.5 ounce) cans canned coconut milk', '2 cups water', '1 (1 inch) piece galangal, thinly sliced', '4 stalks lemon grass, bruised and chopped', '10 makrut lime leaves, torn in half', '1 pound shiitake mushrooms, sliced', '¼ cup lime juice', '3 tablespoons fish sauce', '¼ cup brown sugar', '1 teaspoon curry powder', '1 tablespoon green onion, thinly sliced', '1 teaspoon dried red pepper flakes'], 'purpose': None }], 'ingredients': ['1 pound medium shrimp - peeled and deveined', '2 (13.5 ounce) cans canned coconut milk', '2 cups water', '1 (1 inch) piece galangal, thinly sliced', '4 stalks lemon grass, bruised and chopped', '10 makrut lime leaves, torn in half', '1 pound shiitake mushrooms, sliced', '¼ cup lime juice', '3 tablespoons fish sauce', '¼ cup brown sugar', '1 teaspoon curry powder', '1 tablespoon green onion, thinly sliced', '1 teaspoon dried red pepper flakes'], 'instructions': 'Bring a pot of water to a boil. Boil shrimp until cooked, about 1 minute. Drain shrimp and set aside.\nPour coconut milk and 2 cups water into a large saucepan; bring to a simmer. 
Add galangal, lemongrass, and lime leaves; simmer until flavors are infused about 10 minutes.\nStrain coconut milk into a separate pan and discard spices. Simmer shiitake mushrooms in coconut milk for 5 minutes. Stir in lime juice, fish sauce, and brown sugar. Season to taste with curry powder.\nTo serve, reheat shrimp in soup and ladle into serving bowls. Garnish with green onion and red pepper flakes.', 'instructions_list': ['Bring a pot of water to a boil. Boil shrimp until cooked, about 1 minute. Drain shrimp and set aside.', 'Pour coconut milk and 2 cups water into a large saucepan; bring to a simmer. Add galangal, lemongrass, and lime leaves; simmer until flavors are infused about 10 minutes.', 'Strain coconut milk into a separate pan and discard spices. Simmer shiitake mushrooms in coconut milk for 5 minutes. Stir in lime juice, fish sauce, and brown sugar. Season to taste with curry powder.', 'To serve, reheat shrimp in soup and ladle into serving bowls. Garnish with green onion and red pepper flakes.'], 'language': 'en', 'nutrients': { 'calories': '314 kcal', 'carbohydrateContent': '17 g', 'cholesterolContent': '86 mg', 'fiberContent': '2 g', 'proteinContent': '15 g', 'saturatedFatContent': '18 g', 'sodiumContent': '523 mg', 'sugarContent': '8 g', 'fatContent': '22 g', 'unsaturatedFatContent': '0 g' }, 'prep_time': 15, 'ratings': 4.6, 'ratings_count': 94, 'site_name': 'Allrecipes', 'title': 'Authentic Thai Coconut Soup', 'total_time': 40, 'yields': '8 servings' }
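
Since the two outputs differ only by HTTP client, one way to narrow this down might be to compare the HTML each client actually received. The sketch below uses only the standard library to count how many `recipeIngredient` lists appear in a page's JSON-LD blocks. It assumes the difference lies in the served HTML and in its JSON-LD data specifically, which this thread does not confirm; `JsonLdCollector` and `count_recipe_ingredient_lists` are hypothetical names introduced for illustration.

```python
import json
from html.parser import HTMLParser


class JsonLdCollector(HTMLParser):
    """Collect the text of every <script type="application/ld+json"> element."""

    def __init__(self):
        super().__init__()
        self._in_json_ld = False
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_json_ld = True
            self.blocks.append("")

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_json_ld = False

    def handle_data(self, data):
        # Script content may arrive in several chunks, so append to the open block.
        if self._in_json_ld:
            self.blocks[-1] += data


def count_recipe_ingredient_lists(html):
    """Count "recipeIngredient" keys across all parseable JSON-LD blocks."""
    collector = JsonLdCollector()
    collector.feed(html)
    count = 0

    def walk(node):
        nonlocal count
        if isinstance(node, dict):
            if "recipeIngredient" in node:
                count += 1
            for value in node.values():
                walk(value)
        elif isinstance(node, list):
            for value in node:
                walk(value)

    for block in collector.blocks:
        try:
            walk(json.loads(block))
        except json.JSONDecodeError:
            continue  # Skip malformed blocks rather than fail the whole check.
    return count


# Inline stand-in for a fetched page that carries the recipe data twice
sample = """<html><head>
<script type="application/ld+json">[{"@type": "Recipe", "recipeIngredient": ["2 cups water"]}]</script>
<script type="application/ld+json">{"@type": "Recipe", "recipeIngredient": ["2 cups water"]}</script>
</head></html>"""
print(count_recipe_ingredient_lists(sample))
```

Running the function on `req.text` from requests and on the decoded `urlopen` response for the same URL would show whether the two clients are served different structured data. If both counts match, the duplication more likely comes from somewhere else, such as microdata in the page body, which this sketch does not inspect.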
