Dishes: a restaurant-oriented food dataset

Ruihan Xu, Shuqiang Jiang, Luis Herranz
Institute of Computing Technology, Chinese Academy of Sciences


Typical food recognition datasets include only food images and their category labels. In contrast, Dishes is a restaurant-oriented dataset suitable for studying both visual and context-based food recognition.

The dataset consists of dish images augmented with restaurant information, where a dish is a food category on a restaurant menu. This information includes the geographic location of the restaurant and its menu (i.e. the dish categories offered by that restaurant).

Some characteristics of the dataset:
  1. Hierarchical organization: restaurant - (menu, location) - images (see Fig. 1).
  2. All the images were collected in 6 cities (see Table I).
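
The hierarchy in item 1 can be pictured as a small record type. The following is only an illustrative sketch: the class name, field names, and the use of file paths for images are assumptions made for the example, not the layout of the released files.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Restaurant:
    """One restaurant record (hypothetical field names)."""
    name: str
    city: str
    # Geographic location as (latitude, longitude); the ordering is an assumption.
    location: Tuple[float, float]
    # Menu: dish category -> list of image file paths for that dish.
    menu: Dict[str, List[str]] = field(default_factory=dict)

    def dish_categories(self) -> List[str]:
        """Return the dish categories on this restaurant's menu, sorted."""
        return sorted(self.menu)

    def num_images(self) -> int:
        """Return the total number of images across all dish categories."""
        return sum(len(paths) for paths in self.menu.values())
```

Keeping the menu and the location on the restaurant, rather than on each image, mirrors the restaurant - (menu, location) - images organization described above.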

Figure 1: Example of data in the Beijing subset

Table I: Overall statistics (6 cities).

Data collection

Data was collected from an online restaurant review site. For each restaurant, we included the corresponding geographical coordinates and a menu with at least 3 dish categories. For each dish category, at least 15 images were included. Restaurants not meeting these requirements were discarded.
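
The selection criteria above can be sketched as a simple filter. This is not the authors' collection code: the dictionary keys (`location`, `menu`) are hypothetical, and the order of the checks is an assumption (dish categories with too few images are dropped first, then restaurants left with too few dish categories are discarded).

```python
from typing import Dict, List

MIN_DISHES = 3    # minimum dish categories per menu (from the text)
MIN_IMAGES = 15   # minimum images per dish category (from the text)


def filter_restaurants(restaurants: List[Dict]) -> List[Dict]:
    """Keep only restaurants that satisfy the stated requirements.

    Each restaurant is a dict with hypothetical keys:
      "location": (lat, lon) tuple, or None if coordinates are missing
      "menu":     dict mapping dish category -> list of image paths
    """
    kept = []
    for r in restaurants:
        # Requirement: geographical coordinates must be available.
        if r.get("location") is None:
            continue
        # Assumption: dish categories with too few images are dropped first.
        menu = {dish: imgs for dish, imgs in r["menu"].items()
                if len(imgs) >= MIN_IMAGES}
        # Requirement: at least MIN_DISHES dish categories on the menu.
        if len(menu) < MIN_DISHES:
            continue
        kept.append({**r, "menu": menu})
    return kept
```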

In the experiments we use 10 images to train classifiers, and the rest to test them.
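
A minimal sketch of such a split follows. It assumes that the 10 training images are taken per dish category and chosen at random with a fixed seed; the text specifies neither, and the function name and parameters are hypothetical.

```python
import random
from typing import List, Tuple


def split_images(image_paths: List[str], n_train: int = 10,
                 seed: int = 0) -> Tuple[List[str], List[str]]:
    """Split one dish category's images into (train, test) lists.

    Assumption: n_train images are sampled at random for training
    and all remaining images are used for testing.
    """
    if len(image_paths) <= n_train:
        raise ValueError("need more than n_train images to leave a test set")
    paths = sorted(image_paths)   # fixed starting order for reproducibility
    rng = random.Random(seed)     # local generator, no global state touched
    rng.shuffle(paths)
    return paths[:n_train], paths[n_train:]


# Example: a category with the minimum of 15 images.
images = [f"img_{i:02d}.jpg" for i in range(15)]
train, test = split_images(images)
```

Since every dish category has at least 15 images, a split of this kind always leaves at least 5 images per category for testing.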



Dish Images (.ZIP): Beijing, Shanghai, Tianjin, Nanjing, Hangzhou, Guangzhou

Geographical Coordinates (.CSV): Beijing, Shanghai, Tianjin, Nanjing, Hangzhou, Guangzhou


More details and examples (Beijing subset)

Contact: sqjiang [at]

Key Laboratory of Intelligent Information Processing
Institute of Computing Technology, CAS