niluminous/fooddata

🌟 Datasets for Food Domain and AI Applications 🌟

This repository provides a curated list of datasets for research in the food domain, covering recipes, ingredients, food ordering, nutritional information, flavor data, and user interactions. These datasets are valuable for applications in machine learning, AI, and computational gastronomy.


🔍 Datasets Overview

🍽️ Recipe1M+

  • Description:
    Recipe1M+ is a large-scale, multimodal dataset containing over 1 million structured cooking recipes and 13 million food images. It was designed for cross-modal research, particularly learning embeddings that integrate recipe text (ingredients and instructions) with food images.
    It expands upon Recipe1M by substantially increasing the number of images through web searches.

  • 📄 Paper: Link to Paper
    📚 Citation: Marín, J., Biswas, A., Ofli, F., et al. (2019). Recipe1M+: A Dataset for Learning Cross-Modal Embeddings for Cooking Recipes and Food Images. IEEE TPAMI.

  • 📂 Dataset: Link to Dataset

⭐ Key Features:

  • Image Data: ✔️
  • Nutritional Data: ✔️ (50,637 recipes with mapped details from the USDA database)
  • Flavor or Taste Data: ❌
  • Recipe Data: ✔️
    • Title
    • List of ingredients (with quantities and units parsed)
    • Instructions
    • Course labels

✨ RecipeNLG

  • Description:
    RecipeNLG is a dataset with 2.2 million recipes, designed for semi-structured text generation tasks like recipe generation. It leverages Named Entity Recognition (NER) to extract food entities.

  • 📄 Paper: Link to Paper
    📚 Citation: Bień, M., Gilski, M., Maciejewska, M., et al. (2020). RecipeNLG: A Cooking Recipes Dataset for Semi-Structured Text Generation.

  • 📂 Dataset: Link to Dataset

⭐ Key Features:

  • Image Data: ❌
  • Nutritional Data: ❌
  • Flavor or Taste Data: ❌
  • Recipe Data: ✔️
    • Title
    • List of ingredients (with quantities and units)
    • Step-by-step instructions

🌍 FooDI-ML

  • Description:
    A multilingual, visio-linguistic dataset containing 2.8 million images and 9.5 million text samples spanning 37 countries and 33 languages. Ideal for text-image retrieval and conditional image generation.

  • 📄 Paper: Link to Paper
    📚 Citation: Amat Olondriz, D., Palau Puigdevall, P., Salvador Palau, A.

  • 📂 Dataset: Link to Dataset

⭐ Key Features:

  • Image Data: ✔️ (2.8M images)
  • Textual Data: ✔️ (product names, descriptions)
  • Nutritional Data: ❌
  • Flavor or Taste Data: ❌

📖 Allrecipes.com

  • Description:
    A dataset containing over 1 million user-recipe interactions from 2000 to 2018, featuring 52,821 recipes across 27 categories.

  • 📄 Paper: Link to Paper
    📚 Citation: Gao, Xiaoyan, et al. "Hierarchical Attention Network for Visually-aware Food Recommendation."

  • 📂 Dataset: Link to Dataset

⭐ Key Features:

  • Image Data: ✔️
  • Nutritional Data: ❌
  • Flavor or Taste Data: ❌
  • Recipe Data: ✔️
    • Title
    • List of ingredients
    • Instructions
    • Recipe category

🍲 Food.com

  • Description:
    Contains 230,000 recipes and 1.1 million user-recipe interactions (reviews) spanning 18 years. Useful for personalized recipe recommendation.

  • 📄 Paper: Link to Paper
    📚 Citation: Bodhisattwa Prasad Majumder, et al. (2019).

  • 📂 Dataset: Link to Dataset

⭐ Key Features:

  • Image Data: ❌
  • Nutritional Data: ❌
  • Flavor or Taste Data: ❌
  • Recipe Data: ✔️
    • Title
    • Ingredients
    • Tags
    • Step-by-step instructions
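
For recommendation work, the ratio of interactions to recipes gives a rough sense of how sparse a dataset is. The sketch below computes that ratio from the figures quoted in the Allrecipes.com and Food.com descriptions above. The counts are rounded as stated in this list (Allrecipes.com is given as "over 1 million" interactions, so its value is a lower bound), and the names `COUNTS` and `interactions_per_recipe` are illustrative only, not part of either dataset.

```python
# Approximate counts as quoted in this list: (recipes, user-recipe interactions).
# "over 1 million" for Allrecipes.com is treated as exactly 1,000,000 (assumption).
COUNTS = {
    "Allrecipes.com": (52_821, 1_000_000),
    "Food.com": (230_000, 1_100_000),
}


def interactions_per_recipe(recipes: int, interactions: int) -> float:
    """Average number of interactions per recipe."""
    return interactions / recipes


for name, (recipes, interactions) in COUNTS.items():
    ratio = interactions_per_recipe(recipes, interactions)
    print(f"{name}: about {ratio:.1f} interactions per recipe")
```

On these rounded figures, Allrecipes.com averages roughly 19 interactions per recipe and Food.com roughly 5, so the Food.com interaction matrix is likely the sparser of the two per recipe.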

πŸ₯ FlavorDB2

  • Description:
    A database of 25,595 flavor molecules and 936 ingredients, providing molecular flavor profiles.

  • 📄 Paper: Kumar et al. (2022)
    📚 Citation: Goel M, Grover N, et al. (2024). FlavorDB2: An updated database of flavor molecules.

  • 📂 Dataset: Link to Dataset

⭐ Key Features:

  • Image Data: ❌
  • Nutritional Data: ❌
  • Flavor or Taste Data: ✔️
    • Molecular flavor profiles

πŸ• FoodOrdering Dataset

  • Description:
    A collection of synthetic and human-generated food order examples for building task-oriented systems.

  • 📄 Paper: Link to Paper
    📚 Citation: Rubino, M., Guenon des Mesnards, N., et al. (2022).

  • 📂 Dataset: Link to Dataset

⭐ Key Features:

  • Image Data: ❌
  • Nutritional Data: ❌
  • Flavor or Taste Data: ❌
  • Textual Data: ✔️ (includes food orders with semantic parsing annotations)

🍜 RecipeDB

  • Description:
    A structured database of 118,171 recipes integrating nutritional profiles and flavor molecules.

  • 📄 Paper: Sethi et al. (2020)
    📚 Citation: Devansh Batra, Nirav Diwan, et al. (2020).

  • 📂 Dataset: Link to Dataset

⭐ Key Features:

  • Image Data: ❌
  • Nutritional Data: ✔️
  • Flavor or Taste Data: ✔️
  • Recipe Data: ✔️

🥗 HUMMUS Dataset

  • Description:
    The HUMMUS dataset combines 507,335 recipes and 1.9 million user-recipe interactions, enriched with health-awareness metrics.

  • 📄 Paper: Link to Paper
    📚 Citation: Felix Bölz, et al. (2023).

  • 📂 Dataset: Link to Dataset

⭐ Key Features:

  • Image Data: ❌
  • Nutritional Data: ✔️
  • Flavor or Taste Data: ❌
  • Recipe Data: ✔️
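
To pick a dataset by capability, the Key Features lists above can be collected into a small lookup. The following is a minimal sketch, not an official index of this repository: the `DatasetFeatures` class and `with_features` helper are hypothetical names, and any feature that a section does not list (for example Recipe Data for FooDI-ML, FlavorDB2, and FoodOrdering) is assumed to be absent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetFeatures:
    name: str
    image: bool
    nutrition: bool
    flavor: bool
    recipe: bool


# Flags transcribed from the Key Features lists in this document.
# Features a section does not mention are treated as False (assumption).
DATASETS = [
    DatasetFeatures("Recipe1M+", image=True, nutrition=True, flavor=False, recipe=True),
    DatasetFeatures("RecipeNLG", image=False, nutrition=False, flavor=False, recipe=True),
    DatasetFeatures("FooDI-ML", image=True, nutrition=False, flavor=False, recipe=False),
    DatasetFeatures("Allrecipes.com", image=True, nutrition=False, flavor=False, recipe=True),
    DatasetFeatures("Food.com", image=False, nutrition=False, flavor=False, recipe=True),
    DatasetFeatures("FlavorDB2", image=False, nutrition=False, flavor=True, recipe=False),
    DatasetFeatures("FoodOrdering", image=False, nutrition=False, flavor=False, recipe=False),
    DatasetFeatures("RecipeDB", image=False, nutrition=True, flavor=True, recipe=True),
    DatasetFeatures("HUMMUS", image=False, nutrition=True, flavor=False, recipe=True),
]


def with_features(*required: str) -> list[str]:
    """Return names of datasets that have every requested feature flag set."""
    return [d.name for d in DATASETS if all(getattr(d, f) for f in required)]


print(with_features("nutrition", "recipe"))
print(with_features("flavor"))
```

For example, asking for both `"nutrition"` and `"recipe"` narrows the list to Recipe1M+, RecipeDB, and HUMMUS, while `"flavor"` alone selects FlavorDB2 and RecipeDB.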
