


Journal of Destination Marketing & Management 21 (2021) 100632

Available online 28 June 2021 2212-571X/Published by Elsevier Ltd.

Where did you take those photos? Tourists’ preference clustering based on facial and background recognition

Ning Deng a,b,*, Jiayi Liu a,b

a School of Tourism Sciences, Beijing International Studies University, 10024, Beijing, China

b Research Center of Beijing Tourism Development, 10024, Beijing, China

ARTICLE INFO

Keywords: UGC photos; Instagram; Facial recognition; Tourist preference

ABSTRACT

The Internet and social media have become major channels for communicating destination image. Pictorial destination image, as shaped by user-generated photos posted on social media, is a key factor in potential tourists’ perceptions and decision making. Over 50% of photos contain tourists’ facial information. This paper proposes a novel approach to analyzing tourists’ travel patterns and their preferences based on facial and photo content recognition techniques. Photos containing tourists’ faces were filtered, and different tourist groups were classified by age and gender. Findings indicate that the grouped tourists expressed varying preferences in terms of photographed points of interest and backgrounds. These phenomena were explained based on the notion of tourists’ gaze and several psychological theories.

1. Introduction

In the era of big data, visual content shared by tourists has become the main channel through which audiences receive locations’ destination image. Social media platforms such as Instagram generate hundreds of millions of photos and videos daily. Tourists often select their favorite points of interest (POIs) or backgrounds to take photos and share them online; taking photos has thus become a popular and indispensable tourism activity (Albers & James, 1988). Photo content analysis has been widely adopted in tourism research (Balomenou & Garrod, 2019).

Over time, the manually coded analysis of only a few hundred photos has ballooned into analysis of thousands of photos or more (Balomenou & Garrod, 2019). Research based on big data has also shifted from focusing on tourists’ spatial and temporal behavior (Oender et al., 2016) to destination image communication through image-based content (Deng, Liu, Niu, & Ji, 2019; Deng & Li, 2018; Mak & Athena, 2017).

Destination marketing organizations (DMOs) have long played a dominant role in destination image management, with magazine, television, and brochure advertisements representing the main avenues through which destinations are publicized (Connell, 2005). Yet with the advent of the mobile Internet and Web 2.0, travel photos taken and shared by amateurs via social networks have become a popular means of attracting potential tourists, partly because photos published by everyday tourists are likely to be viewed by potential visitors. Additionally, the perceived credibility of such content is higher than that of content published through official sources (Zeng & Gerritsen, 2014). User-generated content (UGC) has thus become an important topic in tourism marketing research (Lu & Stepchenkova, 2015).

With respect to destination marketing, photos and videos are the most prevalent materials through which potential tourists’ attitudes and intentions can be shaped. The ongoing evolution of social networking has moved the Internet from the desktop Web to mobile devices. In today’s mobile era, smartphones are the primary terminal for information exchange and data generation. As smartphones have become more powerful and gained larger storage, manufacturers have begun to cater to personal entertainment experiences. Taking, filtering, and sharing photos are common user needs, and the process through which users capture photos has attracted growing scholarly attention. Before the invention of smartphones, tourists could only share developed photos with their families and friends. Today, however, travelers can share photos with thousands of users on social platforms. Instagram (2019) reportedly hosted more than 100 million photos and videos daily in 2019 and had over 1 billion active monthly users. Compared with professional photos, UGC data are more realistic, intelligible, and appealing and tend to make better impressions on potential tourists (Bakhshi et al., 2014a). Some travelers enjoy posting photos online out of narcissism (Bakhshi et al., 2014a), while many others share photos to inform others that they have visited a specific place (Bakhshi et al., 2014a).

* Corresponding author. School of Tourism Sciences, Beijing International Studies University, 10024, Beijing, China.

E-mail address: dengning@bisu.edu.cn (N. Deng).


https://doi.org/10.1016/j.jdmm.2021.100632

Received 11 December 2020; Received in revised form 16 June 2021; Accepted 17 June 2021


Research has shown that photos containing people’s faces are roughly 38% more popular than those with no people (Bakhshi et al., 2014a). Photos featuring people’s faces account for nearly 50% of tourists’ pictures. Recently, a popular travel behavior called “punching the card” has increased travelers’ likelihood of taking photos that include both scenery and people. Tourists tend to choose classic destination sites or representative, unique environments when taking photos (Bin & Qiaolin, 2015). The photo background a tourist chooses can convey their instincts, feelings, recognition, and admiration of the pictured destination (Pan et al., 2014). Compared with photos of scenery, portraits appear more useful in studies of the tourist experience and of tourist gaze theory: portraits describe the production and consumption of symbols that are meaningful to tourists and can reflect the most valuable, vivid parts of a destination (Pearce & Wang, 2019). Among the scarce research on photos featuring tourists themselves, most work has focused on the photographer’s behavior (Dinhopl & Gretzel, 2016).

The main research question guiding the study pertains to the relationships between tourists’ basic profiles and their travel behavior, namely their chosen POIs and photographic backgrounds. Most photo-based tourism research has involved content analysis, namely manual coding; however, this method is impractical when dealing with tens of thousands of UGC images. Convolutional neural networks, a machine learning technique, achieve high accuracy in image content recognition (T. Chen, D. Borth, T. Darrell, & S. F. Chang, 2014). In this study, we applied object and facial recognition techniques to analyze more than 10,000 UGC photos taken by inbound tourists visiting Beijing, China. Based on UGC photos from Instagram, we used big data and an image processing method to demonstrate that tourists with different demographic features have diverse preferences when selecting POIs or photographic backgrounds for selfies or tourism photos.

The results make three contributions to the literature: (1) analyzing photos of destination sites that include human faces and classifying them via machine learning; (2) exploring variations in tourists’ chosen POIs and photo background selections based on visitors’ demographics; and (3) providing actionable, content-based suggestions for DMOs targeting specific consumer groups.

2. Literature review

2.1. Tourism destination image

Tourism destination image is a common topic in tourism research and is related to destination marketing (Andreu et al., 2000; Baloglu & Mccleary, 1999), which can directly affect potential tourists’ travel decisions. Crompton (1979) defined destination image as “the sum of a person’s all beliefs, ideas and impressions of a destination”. Scholars have since introduced multiple classifications of tourism destination image, most often consisting of projected (supply side) and perceived (demand side) destination image (Yueqian et al., 2009). The projected image refers to a destination’s marketing image created by DMOs, whereas the perceived image represents tourists’ feelings about a destination (Pan & Li, 2011).

Web 2.0 has altered destinations’ communication channels and context. Researchers have proposed the idea of received image (Chen & Tsai, 2007) from an image communication perspective to distinguish it from perceived image; received image is more concerned with how communication avenues, the media, and content influence image projection. Other studies have revealed a semantic gap between a destination’s projected and received image (Chon, 1991; Crompton, 1979). More specifically, the perceiver’s image is derived from an intentional, conscious projected image and an unintentional projected image, which is objectively formed within one’s external environment (Kim & Lehto, 2013). This phenomenon has become even more apparent given the leading roles of the Internet and Web 2.0 in communication settings: on one hand, tourists are affected by DMOs’ projected images; on the other hand, travelers actively project their own perceived image to potential tourists via text, photos, videos, and other UGC.

In terms of perceived image, the most widely accepted classification includes cognitive image and affective image (Deng & Li, 2018). A destination’s cognitive image refers to an individual’s explicit impressions and knowledge of a destination (manifest), such as its weather, scenery, traffic, and food; the affective image represents the strength and tendency of a person’s emotions and feelings about a destination (e.g. relaxed, happy, sleepy). Studies suggest that perceived image directly shapes the overall destination image of tourists (or potential tourists) and indirectly affects their affective image of a destination (Ji, 2011). Scholars have also put forth the concept of conative image (Amaro et al., 2016; Anand et al., 1988) on the basis of perception and emotion, pointing out that this type of image is the product of a destination’s perceived and affective image and directly affects tourists’ decision-making process.

Regarding cognitive image, studies have tested a destination’s commercialization and divided it into an organic and induced image (Agapito et al., 2013). A destination’s organic image is unrelated to marketing materials and comes instead from non-marketing content, including movies, news, online word-of-mouth (WOM), books, magazines, and other sources. A destination’s induced image is deliberately crafted by a destination and often presented through promotional movies, travel brochures, and posters. Some scholars have further subdivided a commercialized image into eight levels according to the strength of commercialization, covering three major categories: “organic”, “autonomous”, and “induced” (Chen & Tsai, 2007). A destination’s autonomous image is based on relevant information released by a neutral third party, such as through films, books, reports, and other publications. This type of image is characterized by its wide range of influence, serves no commercial marketing purpose, and garners potential tourists’ trust. It has also elicited close attention in the destination image projection process (Agapito et al., 2013). Non-DMO destination photos, such as those evaluated in this paper, also represent this image form.

Ateljevic once proposed another concept called the complex image, which emerges after a trip. This image arises from the experiences that tourists had and the people and places they saw while visiting the destination (Ateljevic, 2000). Successful marketing (e.g. from advertisers) thus depends on the extent to which a projected image matches a tourist’s desired image (Huang et al., 2009b). Pictorial elements are introduced during this process. For example, Ryan (1991) investigated how pictorial elements in advertisements contribute to destination image formation. The visuals projected through websites can convey either multiple destination aspects or only favorable characteristics likely to be appreciated by tourists (Olson et al., 1986). In this study, destination images projected through Instagram represent destinations’ pictorial images.

2.2. Tourism research involving photos

Tourists’ photos are more than simply representations of places; these pictures are infused with personal meaning (Stylianou-Lambert, 2012). Photos have been intrinsically linked with tourism, as if one cannot travel without engaging in some form of photography. This ritualistic or routinized nature of tourist photography is delineated by Urry’s (2011) concept of the tourist gaze. Audience awareness also contributes to what and how travel experiences are shared in photos (Markwell, 1997). The photos tourists post online are closely linked to impressions as well. This process suggests that a group of tourists from similar demographic, travel, and cultural backgrounds will likely react similarly to accepted social norms (Lo & McKercher, 2015). Tourism scholars have applied content analysis to pictorial material for quite some time. Tourists’ photos can reflect their unique personal motivations (Chalfen, 1979) and emotions. Urry (2011) found that taking photos entails a process of accumulating personal and family memories.


Photos posted on social media with accompanying captions can capture tourists’ emotions and experiences (Conti & Lexhagen, 2020). It has been further argued that one’s motivations for taking photos and the content of these photos are fundamental aspects of visual culture and modern society (Robinson, 2009). Parker (2009) introduced the use of photographs in early management research. Tyson (2009) addressed the roles of photos from photographers’ perspectives, indicating that photographs highlight what photographers deem important. In short, the analysis of photo material is essential to examining the representations of a place and how tourists associate with that place.

Portrait photos are also a common focus of tourism scholarship. These photos attract more attention than pictures that do not feature people (Bakhshi et al., 2014b). Portraits often showcase individual attributes and body language (Pearce et al., 2015). Studies of touristic photos featuring people initially appeared in anthropology (Dinhopl & Gretzel, 2016), with researchers considering photographers’ poses or intentions and motivations for sharing selfies. Some studies revealed that tourists’ selfie-sharing was indicative of narcissistic tendencies (Christou et al., 2020). Meanwhile, marketing scholars have frequently studied UGC photos containing people. In the field of general consumption, Hartmann et al. (2020) discussed the effects of selfies on brand building relative to consumer goods. Dinhopl and Gretzel (2016) proposed that in a tourist gaze context, selfies include unique performance modes that lead to consumption behavior while becoming scenic spots in and of themselves. Photos have also been classified based on the pictured individual’s posture and style. Many factors, such as the pictured person’s nationality and gender, can affect how self-portraits are taken and may reflect cultural differences (Nicola et al., 2015).

Personal factors refer to psychological characteristics such as an individual’s values, motivations, personality, or lifestyle along with sociodemographic features (Sonnleitner, 2011). Tourists can be grouped by social factors (e.g. age, education, and marital status) as well as by gender, social class, family lifecycle, or place of residence (Sonnleitner, 2011). These personal factors influence individuals’ cognition, including their perceptions of the environment and resulting images. People from different backgrounds will naturally perceive places differently (Sonnleitner, 2011). As for the selection of photographic backgrounds in tourism destinations, some researchers have claimed that gender and age do not always significantly influence photographic content (Garrod, 2009). For example, the imagery that DMOs use to draw tourists becomes the object of the tourist gaze and thus the target of tourists’ photographs. Still, most relevant studies have sought to determine how photography that excludes a portrait can affect or convey a place’s image (Deng et al., 2018).

However, the above-mentioned analyses of individuals’ characteristics and behavior in photos were mostly qualitative. People’s attributes were therefore broadly summarized using descriptions, content analysis, and encoding without considering group attributes or broader tourist characteristics. The current study adopted deep learning technology for facial recognition, tourist behavior analysis, and destination image perception to evaluate destinations from the perspectives of marketing and tourist behavior.

2.3. Tourism big data research based on UGC photos

Traditional destination image assessment in tourism largely involves qualitative and quantitative analyses (Pearce & Wang, 2019), with questionnaire surveys being the most popular quantitative method. However, the subjectivity of questionnaire surveys is an inherent problem in quantitative studies; destination image assessment is no exception.

Research on destination image in tourism contexts based on photos has proceeded through several major stages, from the early use of experimental means, such as hiring volunteers to take a limited number of photos (i.e., visitor-employed photography; VEP) (Sun et al., 2014) to using UGC images to obtain travelers’ perceived destination image (James, 1988), conducting comparative analyses of DMO and UGC images (Urry, 2011), and exploring cultural differences (Mackay & Couldwell, 2004). Before 2010, destination images were most often extracted from tourists’ blogs (Ayeh et al., 2013; Gartner, 1993). Online textual reviews constituted the most important type of UGC for research purposes.

A growing number of researchers have striven to expand the scope and depth of data acquisition. Using UGC to extract destination images has become a key means of assessing destinations in recent years. Advances in mobile Internet have taken UGC beyond simple text in a richer, more visual direction. Visual media resources, represented by photos and videos, have gradually become indispensable data sources in tourism studies (Fatanti & Suyadnya, 2015; Pang et al., 2011). In particular, perceptions of destination image based on UGC images are currently the most important information transmitter in the travel process (Hunter & Cannon, 2016). As a direct product of a destination’s tourist gaze (Jani, 2011), photos directly convey visitors’ views of a destination and offer ideal data to discern travelers’ perceived destination image.

The amount of available public data has recently increased exponentially due to the implementation of open data initiatives worldwide in the public sector; examples include OpenStreetMap and social networks such as Flickr, which provide researchers with application programming interfaces (APIs) or open-source datasets (Pérez & Quintáns, 2019). The magnitude of image analysis has also shifted from hundreds to thousands of images or more, providing a data foundation for destination image extraction from a vast number of UGC images. UGC photos can now be automatically analyzed via machines (Arefieva et al., 2021) to extract cognitive and affective images (Deng & Li, 2018). Large-scale metadata from UGC photos also helps researchers to trace tourists’ footprints and temporal-spatial behavior (Zhang et al., 2019; Vu et al., 2015). Most digital photos are “geotagged” and ordered chronologically. Accordingly, big data analysis and corresponding visualization techniques enable researchers to reconstruct the research paradigm of tourist-oriented studies (Deng, Liu, Niu, & Ji, 2019).

2.4. Tourism research based on deep learning

2.4.1. The principle of deep learning

As a major branch of machine learning, deep learning has profoundly influenced the direction of machine learning and has played a main role in the development of artificial intelligence. Deep learning refers to a learning process consisting of hierarchical feature extractions that result in a multi-layer implicit neural network model. Specifically, deep learning uses massive data to train the model’s features to extract the most favorable parameters and then abstracts simple feature combinations into high-level features to achieve abstract expressions of data or actual objects (Fu et al., 2018).

As an exemplar of deep learning, convolutional neural networks (CNNs) (Browne & Ghidary, 2003) represent an established approach to pattern recognition and image processing. CNNs make full use of data’s local characteristics via local perception, shared weights, and pooling along temporal or spatial dimensions to optimize the network structure. These networks also ensure a certain degree of displacement and deformation invariance. CNNs are trained through supervised learning, which mainly includes two stages: forward and back propagation. During the training process, parameters can be continuously optimized to achieve optimal simulation. Several optimized and improved structural models have emerged from CNN explorations (e.g. the fully convolutional neural network [FCNN] and deep convolutional neural network) (Paliy, 2008).
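The three ingredients named above (local perception, shared weights, and pooling) can be illustrated with a minimal pure-Python sketch. It is not the network used in this study: the 5×5 toy images, the hand-set 2×2 edge kernel, and the function names are assumptions made for illustration, and no training (forward and back propagation) is shown.

```python
from typing import List

Matrix = List[List[float]]


def conv2d_valid(image: Matrix, kernel: Matrix) -> Matrix:
    """Slide one shared kernel over the image (stride 1, no padding).

    Each output value depends only on a small local patch (local perception),
    and the same kernel weights are reused at every position (shared weights).
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def max_pool(feature: Matrix, size: int = 2) -> Matrix:
    """Non-overlapping max pooling; rows/columns that do not fill a window are dropped."""
    return [
        [
            max(feature[i + di][j + dj] for di in range(size) for dj in range(size))
            for j in range(0, len(feature[0]) - size + 1, size)
        ]
        for i in range(0, len(feature) - size + 1, size)
    ]


# Hypothetical vertical-edge detector (weights set by hand, not learned).
edge_kernel = [[-1, 1], [-1, 1]]

# Two toy images whose dark-to-bright edge differs by one pixel in position.
image_a = [[0, 1, 1, 1, 1] for _ in range(5)]
image_b = [[0, 0, 1, 1, 1] for _ in range(5)]

pooled_a = max_pool(conv2d_valid(image_a, edge_kernel))
pooled_b = max_pool(conv2d_valid(image_b, edge_kernel))
```

In this toy case the two feature maps differ, but the pooled outputs are identical, which is the sense in which pooling provides a limited degree of displacement invariance. A larger shift that crosses a pooling window would change the pooled output.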

CNNs were first applied in image recognition. They have been found useful in facial recognition and license plate recognition, including via the LeNet model, GoogLeNet model, and FCN model, all of which are typical CNNs (Paliy, 2008). The emergence of deep learning has led to breakthroughs in facial recognition technology as well.

2.4.2. Deep learning adoption in tourism research

Deep learning, especially with CNNs, is mainly applied to two topics in tourism research: tourist flow prediction and image content mining. The basic idea of prediction is to adjust and calibrate the parameters of a prediction function through unsupervised learning of historical data to obtain more accurate results. In a meta-analysis of tourism prediction, Kim and Schwartz found that predictive accuracy was closely related to data characteristics. They also forecasted inbound international tourism demand from all tourist markets in Catalonia using an artificial neural network model (Kim & Schwartz, 2013).

Photo content analysis has mostly been adopted to discern destination image from massive UGC visual content, such as photos and videos. Deng, Liu, Dai, and Li (2019) first used a deep learning tool called DeepSentiBank to extract the destination image from a massive set of UGC photos. Content analysis of tourism pictures offers another effective way to explore tourism behavior (McKercher & Wong, 2004) and tourists’ cognitive perceptions of a destination (Wang & Hsu, 2010). Zhang et al. (2019) adopted deep learning to collapse 35,356 Flickr photos from Beijing visitors into 103 scenes. They then compared the behavior and perceptions of visitors from different countries via statistical analysis (Zhang et al., 2019). Amid the development of CNNs, image-based tourism research has come to focus more on latent content such as emotion and ambience. Discussions of the relationship between indoor ambience and consumer behavior have followed (Luo & Xu, 2019; Rahimi et al., 2017). The sharing economy and restaurants have already adopted related techniques to understand the aesthetics and appeal of online images (Li et al., 2018).

Currently, the CNN method is frequently adopted to recognize objects in tourism research, including an image’s element constitution. Other important aspects of image recognition, such as facial pattern and tourist profile extraction, can facilitate understanding of the relationships between tourist group patterns and travelers’ behavior. Photo-based big data research is typically conducted using off-the-shelf tools, such as Amazon Rekognition (Bermudez et al., 2013) and ImageAI (Zheng et al., 2020). These tools have demonstrated accuracy rates of more than 90%. However, when dealing with more abstract tourism destination concepts and information (e.g. landmarks, buildings, culture, and customs), more suitable tourism- and destination-related models must be trained (Sheng et al., 2020).

3. Methods

3.1. Target destination

As the center of Chinese politics, culture, science, technology, and international communication, Beijing, with its increasingly powerful international influence, plays a crucial role in China’s inbound tourism market. The city has been a hub of modern Chinese culture since the Yuan dynasty. It is home to many historical sites and cultural landscapes, including seven world heritage sites and the largest number of Chinese cultural heritage projects in the world. For the past eight centuries, nearly all major buildings in Beijing have held indelible historical significance; the city hosts many historic sites such as the Forbidden City, the Temple of Heaven, the Summer Palace, the Winter Palace, and Beihai Park. Hutongs and siheyuans, typical residential forms from old Beijing, have also become prominent cultural and historical symbols of this city. According to 2018 data from the Beijing Culture and Tourism Bureau, Beijing received 4.04 million inbound tourists, an increase of 79,000 visitors (2.0%) over 2017. The 2008 Beijing Olympic Games promoted the city’s image as an international metropolis to the world; the 2022 Beijing Winter Olympic Games will likely highlight its international features once again, as the city becomes the first ever to host both the Winter and Summer Olympic Games.

3.2. Data source

Facebook acquired Instagram, the world’s main image-sharing platform, in 2012. According to statistics from 2019, Instagram was producing more than 100 million photos and videos every day and had more than 1 billion active users per month. We also considered the Flickr platform for this tourism photo research. That platform offers an open API and allows for the analysis of digital footprints. However, photos on Flickr are more professional than those posted on Instagram. We focused on Instagram because its photos are generally taken by ordinary people. Instagram users are more active than those on Flickr, and Instagram’s image content better approximates actual tourists’ perspectives.

Users can add hashtags and text descriptions when uploading photos to Instagram. Users browsing these photos can then “like” or comment on them, as shown in Fig. 1. A single Instagram image might include several content-related hashtags, denoted by ’#’; the more tags a photo includes, the more likely it is to appear in users’ search results. We obtained images and metadata on Beijing using InstaLooter, an Instagram image download tool. InstaLooter can download all photos carrying a given hashtag and provides several kinds of metadata per image: (1) caption; (2) hashtag(s); (3) comments; (4) number of likes; and (5) time the photo was taken. When considering Beijing-related photos, we had to differentiate photos taken by tourists from those taken by local residents. Therefore, we downloaded photos with the hashtag ’#VisitBeijing’. The probability of extracting tourist-taken photos then increased greatly, as verified by photos’ content (Fig. 2). We downloaded all photos bearing this hashtag posted between January 1, 2015 and February 1, 2020. Ultimately, 14,886 photos comprised our basic dataset.
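The selection rule described above (a target hashtag plus a posting window) can be sketched as a filter over already-downloaded metadata. This is a minimal sketch, not the download step itself: the record layout and the field names `id`, `hashtags`, and `taken_at` are assumptions for illustration and do not reflect InstaLooter’s actual output format.

```python
from datetime import date, datetime

# Sampling window and hashtag taken from the text.
START = date(2015, 1, 1)
END = date(2020, 2, 1)
TARGET_TAG = "visitbeijing"


def is_candidate(post: dict) -> bool:
    """True if the post carries the target hashtag and falls inside the window."""
    # Normalize tags so '#VisitBeijing', '#visitbeijing', and 'visitbeijing' match.
    tags = {tag.lstrip("#").lower() for tag in post.get("hashtags", [])}
    taken = datetime.fromisoformat(post["taken_at"]).date()
    return TARGET_TAG in tags and START <= taken <= END


# Hypothetical metadata records (invented for illustration).
posts = [
    {"id": "a", "hashtags": ["#VisitBeijing", "#travel"], "taken_at": "2018-05-04T10:30:00"},
    {"id": "b", "hashtags": ["#beijing"], "taken_at": "2018-05-04T10:30:00"},
    {"id": "c", "hashtags": ["#visitbeijing"], "taken_at": "2020-03-01T08:00:00"},
    {"id": "d", "hashtags": ["visitbeijing"], "taken_at": "2020-02-01T23:00:00"},
]

selected_ids = [post["id"] for post in posts if is_candidate(post)]
```

Comparing calendar dates rather than timestamps keeps a photo posted late on the final day inside the window; whether the study treated the end date as inclusive is an assumption here.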

3.3. Characteristics of portrait photos

3.3.1. Attention to portrait photos on the internet

The popularity of images on Instagram can be conveyed by likes and comments. The number of likes reflects users’ interest in the posted content, and the number of comments shows the quality and level of discussion on the platform. To detect differences between portrait photos and those containing no people, we calculated the number of likes and comments for both groups of images. Portrait images had a total of 2,451,454 likes (M = 495) and 39,651 comments (M = 8); non-portrait images garnered 1,390,310 likes (M = 258) and 23,153 comments (M = 4). Portrait photos were therefore considerably more popular and aroused more viewer interest than non-portrait images.
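The totals and means reported above follow from a simple per-group aggregation, sketched below. The per-photo records are invented toy values chosen only so that the group means match the reported ones; the field names `likes` and `comments` are assumptions. The final two lines compute the ratios implied by the reported means.

```python
from statistics import mean


def engagement_summary(photos: list) -> dict:
    """Total and mean likes/comments for a list of photo records."""
    likes = [photo["likes"] for photo in photos]
    comments = [photo["comments"] for photo in photos]
    return {
        "total_likes": sum(likes),
        "mean_likes": mean(likes),
        "total_comments": sum(comments),
        "mean_comments": mean(comments),
    }


# Toy records (hypothetical, not the study's data).
toy_portraits = [{"likes": 500, "comments": 9}, {"likes": 490, "comments": 7}]
toy_non_portraits = [{"likes": 260, "comments": 4}, {"likes": 256, "comments": 4}]

portrait_stats = engagement_summary(toy_portraits)
non_portrait_stats = engagement_summary(toy_non_portraits)

# Ratios implied by the means reported in the text.
reported_like_ratio = 495 / 258      # roughly 1.9 times as many likes per photo
reported_comment_ratio = 8 / 4       # twice as many comments per photo
```

On the reported means, a portrait photo received on average about 1.9 times the likes and twice the comments of a non-portrait photo.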

Fig. 1. Typical Instagram photo display.


3.3.2. Proportion of photos with faces in them

We focused specifically on selfies and tourists’ photos containing people. Images with people were further split into the following categories: (1) a large number of people in the focus of a tourist’s camera, excluding the tourist themselves (Fig. 3[a]); (2) photos taken by tourists themselves using a selfie stick or by holding the camera (Fig. 3[b]); and (3) photos of tourists taken by others (Fig. 3[c]). We sought to analyze tourists’ choice of scenery, background, or place where a photo was taken from their own perspective. The first photo category did not meet our requirements; as such, we focused on the latter two photo types. We next used Amazon’s facial recognition tool, Rekognition (Nicola et al., 2015), to gather data on the number of people, photo angles, gender, age, skin color, and other attributes per image. Then, we determined whether each photo met the requirements of this study. The facial recognition technology and tools are discussed further in the following section. A total of 8704 portrait photos met the requirements of Groups 2 and 3 above based on image screening and manual verification; this photo subset accounted for 60% of all tourism photos in the original sample.

Further, to ensure photos’ quality and relevance, we first used the hashtag #VisitBeijing to download Beijing tourism–related photos. Non-travel-related images from residents were largely excluded. We also adopted Amazon Rekognition to identify the number of people in each image and thus to locate photos picturing the tourists themselves. Photos with vague or dusky image output, or those in which nobody was pictured, were filtered accordingly. Lastly, we completed a manual review to confirm that all retained photos met our basic standards for further analysis (see Table 1).

3.3.3. Composition of portrait photos

According to the results of Rekognition analysis, among portrait photos, 3289 showed one face (29%), 1135 showed two faces (13%), 527 showed three faces (6%), 336 showed four faces (4%), 228 showed five faces (3%), and 6640 showed more than five faces (46%). As listed in Table 2, the male/female ratio was close to 1:1. As such, potential interference from gender differences could be eliminated in subsequent analysis.

People of different genders and ages appeared in the background of the same scene in multi-faced photos. This composition could interfere with destination identification and image analysis, rendering it impossible to determine which person in the photo chose the pictured site. Therefore, based on the classification of the number of faces in photos, only images containing one face were analyzed.
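The face-based screening in Sections 3.3.2 and 3.3.3 can be sketched as follows. The sketch assumes the `boto3` client for Amazon Rekognition and its `detect_faces` operation with `Attributes=["ALL"]`, whose response lists one `FaceDetails` entry per detected face with `AgeRange`, `Gender`, `Quality`, and `Confidence` fields. The 90% confidence cut-off, the use of the age-range midpoint, and the helper names are assumptions for illustration, not settings reported by the study. The sample response is hand-written so the parsing logic can be read without AWS access.

```python
def detect_faces(image_bytes: bytes) -> dict:
    """Call Amazon Rekognition (requires boto3 and AWS credentials)."""
    import boto3  # imported lazily so the helpers below work offline

    client = boto3.client("rekognition")
    return client.detect_faces(Image={"Bytes": image_bytes}, Attributes=["ALL"])


def summarize_faces(response: dict, min_confidence: float = 90.0) -> list:
    """Reduce a DetectFaces-style response to gender, age midpoint, and sharpness."""
    faces = []
    for face in response.get("FaceDetails", []):
        if face.get("Confidence", 0.0) < min_confidence:
            continue  # skip weak detections (threshold is an assumption)
        age = face["AgeRange"]
        faces.append({
            "gender": face["Gender"]["Value"],
            "age_mid": (age["Low"] + age["High"]) / 2,
            "sharpness": face.get("Quality", {}).get("Sharpness"),
        })
    return faces


def is_single_face_portrait(response: dict) -> bool:
    """Keep only photos in which exactly one confident face is detected."""
    return len(summarize_faces(response)) == 1


# Hand-written response in the DetectFaces shape (values invented).
sample_response = {
    "FaceDetails": [
        {"Confidence": 99.9, "AgeRange": {"Low": 22, "High": 30},
         "Gender": {"Value": "Female", "Confidence": 98.0},
         "Quality": {"Brightness": 70.0, "Sharpness": 80.0}},
        {"Confidence": 60.0, "AgeRange": {"Low": 40, "High": 50},
         "Gender": {"Value": "Male", "Confidence": 75.0},
         "Quality": {"Brightness": 20.0, "Sharpness": 10.0}},
    ]
}

sample_faces = summarize_faces(sample_response)
sample_is_portrait = is_single_face_portrait(sample_response)
```

Automated screening of this kind would still need the manual review described above, since a low-confidence background face, as in the sample, can be a real second person.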

3.3.4. Age and gender composition of people in photos

To make the findings more precise, all photos containing more than one face were excluded. Among photos containing one face, the number of people in each age group varied. The same Instagram ID could also be used to post multiple photos. After removing repeat users, we recalculated the number of photos containing people of different ages as shown in Table 3.

Among the 3289 single-faced photos, some users had the same ID and multiple photos; 1455 IDs remained after again deleting repeat user IDs. Table 4 presents the results of our comparison of the average number of photos featuring men and women after deleting redundant IDs.
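The de-duplication and tallying steps can be sketched as below. The age bins, the rule of keeping the first photo seen per user ID, and the record fields `user_id`, `gender`, and `age` are all assumptions made for illustration; the study’s own bins and rule are those behind Tables 3 and 4.

```python
from collections import Counter

# Hypothetical age bins: lower bound inclusive, upper bound exclusive.
AGE_BINS = [(0, 18, "under 18"), (18, 30, "18-29"), (30, 45, "30-44"),
            (45, float("inf"), "45+")]


def age_group(age: float) -> str:
    """Map an estimated age to its bin label."""
    for low, high, label in AGE_BINS:
        if low <= age < high:
            return label
    raise ValueError(f"age out of range: {age}")


def unique_users(photos: list) -> list:
    """Keep the first photo seen for each user ID (an assumed rule)."""
    first_seen = {}
    for photo in photos:
        first_seen.setdefault(photo["user_id"], photo)
    return list(first_seen.values())


def mean_photos_per_user(photos: list) -> dict:
    """Average number of photos per distinct user, split by gender."""
    per_user = Counter(photo["user_id"] for photo in photos)
    gender_of = {photo["user_id"]: photo["gender"] for photo in photos}
    photo_totals, user_totals = Counter(), Counter()
    for user_id, n_photos in per_user.items():
        photo_totals[gender_of[user_id]] += n_photos
        user_totals[gender_of[user_id]] += 1
    return {gender: photo_totals[gender] / user_totals[gender] for gender in user_totals}


# Toy single-face photo records (invented).
photos = [
    {"user_id": "u1", "gender": "Female", "age": 26.0},
    {"user_id": "u1", "gender": "Female", "age": 27.0},
    {"user_id": "u1", "gender": "Female", "age": 26.0},
    {"user_id": "u2", "gender": "Male", "age": 40.5},
    {"user_id": "u3", "gender": "Female", "age": 17.5},
]

users = unique_users(photos)
age_counts = Counter(age_group(user["age"]) for user in users)
means_by_gender = mean_photos_per_user(photos)
```

Counting distinct IDs rather than photos prevents a few prolific accounts from dominating an age or gender group.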

As noted, the numbers of likes and comments showed that Instagram users preferred portrait photos to non-portrait ones. The fact that 60% of all tourism photos contained people also suggests that portrait photos accounted for a substantially larger proportion of tourists’ photos than non-portrait images. Tourists therefore tended to take photos including themselves during their trips, and such photos can be expected to shape potential tourists’ destination image formation. Thus, these photos were strongly correlated with tourism and enabled us to assess tourists’ intentions and behavior.

Fig. 2. Photos of people traveling to Beijing with ‘VisitBeijing’ hashtag.

Fig. 3. Three types of tourists’ portrait photos.

Table 1
User engagement with portrait and non-portrait photos.

Type                 Total number of likes  Average number of likes  Total number of comments  Average number of comments
Portrait photos      2,451,454              495                      39,651                    8
Non-portrait photos  1,390,310              258                      23,153                    4

3.4. Analytical framework

Fig. 4 illustrates the analytical framework of the study. During preprocessing, we selected only those photos containing one face and retained all of them for further analysis. Profile-related portrait information (e.g. the pictured person’s age and gender) was also obtained in this step. Then, the portrait photos were processed using DeepSentiBank to extract their background content in the form of adjective + noun pairs (ANPs). Photo location information (i.e., where each photo was taken) is usually recorded as a user tag in the metadata. We therefore extracted each POI to identify where tourists preferred to take photos. This analytical procedure resulted in three important pieces of information: hashtags from metadata, facial information, and photographed content revealed through machine-based object recognition. Hashtags contained descriptions of each photo, and POI-related information appeared frequently as well. Facial recognition provided information including the pictured tourist’s age and gender. We also used a deep learning algorithm to extract the objects in each photo. Then, POI information (based on photos’ hashtags and object-related information from photos’ content) was integrated with the facial recognition results to evaluate different tourist groups’ travel and photography patterns.
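To make the integration step concrete, the sketch below joins the three information sources on a photo ID. Everything here is illustrative: the `PhotoRecord` class, the dictionary layouts, and the `adjective_noun` string form assumed for ANPs are hypothetical and not taken from the study’s implementation.

```python
from dataclasses import dataclass, field


@dataclass
class PhotoRecord:
    photo_id: str
    hashtags: list
    age: int
    gender: str
    nouns: list = field(default_factory=list)


def merge_sources(metadata, faces, anps):
    """Join hashtags, facial attributes, and ANP nouns per photo.

    Photos missing from any of the three sources are skipped.
    """
    records = []
    for photo_id, tags in metadata.items():
        if photo_id in faces and photo_id in anps:
            face = faces[photo_id]
            # Assumed ANP form 'adjective_noun'; keep the noun part only.
            nouns = [pair.split("_")[-1] for pair in anps[photo_id]]
            records.append(
                PhotoRecord(photo_id, tags, face["age"], face["gender"], nouns)
            )
    return records


# Toy inputs, not data from the study.
metadata = {"p1": ["#visitBeijing", "#ForbiddenCity"], "p2": ["#visitBeijing"]}
faces = {"p1": {"age": 26, "gender": "Female"}}
anps = {"p1": ["ancient_palace", "sunny_sky"], "p2": ["busy_street"]}
records = merge_sources(metadata, faces, anps)
```

Only photo p1 survives the join because p2 has no facial information.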

3.5. Facial recognition method

In this study, we used automatic facial recognition technologies to identify tourists’ faces, genders, ages, and expressions. Amazon Rekognition, launched in 2016, is an image recognition service based on deep learning; it can detect objects, scenes, and faces. The service provides fast and accurate face searches, enabling identification of a person in a photo or video using a private repository of face-based images. Rekognition is based on mature, highly scalable deep learning technologies developed by Amazon and can analyze billions of photo images per day. An improved image audit model is now available that reduces the false positive rate by an average of 40%.
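As a rough illustration of how such output might be turned into the age and gender groups used later, the sketch below parses a response shaped like the result of Rekognition’s `DetectFaces` operation when all facial attributes are requested (`FaceDetails`, each with `AgeRange` and `Gender`). Taking the midpoint of the returned age range is an assumption; the study does not state how a single age was derived. The sample response is hand-written, and no call to the service is made.

```python
AGE_GROUPS = [
    (0, 18, "0-18"),
    (19, 29, "19-29"),
    (30, 39, "30-39"),
    (40, 49, "40-49"),
    (50, 59, "50-59"),
]


def age_group(age):
    """Map an age in years to the six groups used in the study."""
    for low, high, label in AGE_GROUPS:
        if low <= age <= high:
            return label
    return "60+"


def summarize_faces(response):
    """Extract gender and an age estimate for each detected face."""
    faces = []
    for detail in response.get("FaceDetails", []):
        age_range = detail["AgeRange"]
        # Assumption: use the midpoint of the estimated range as the age.
        midpoint = (age_range["Low"] + age_range["High"]) // 2
        faces.append(
            {
                "gender": detail["Gender"]["Value"],
                "age": midpoint,
                "age_group": age_group(midpoint),
            }
        )
    return faces


# Hand-written sample in the DetectFaces response shape.
sample_response = {
    "FaceDetails": [
        {
            "AgeRange": {"Low": 22, "High": 30},
            "Gender": {"Value": "Female", "Confidence": 99.2},
        }
    ]
}
faces = summarize_faces(sample_response)
```

The number of entries in `faces` also gives the face count used to separate single-faced from multi-faced photos.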

The main facial recognition technology extracts local image features and textures using local binary patterns (LBPs) (Turk & Pentland, 1991). After an image is processed by the LBP operator, a statistical histogram is obtained to form a feature vector (i.e., the LBP texture feature vector of the entire image) (Turk & Pentland, 1991). A support vector machine (SVM) or another machine learning algorithm can then be used for classification.
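The LBP step can be sketched in a few lines. This is a minimal version of the basic 3 x 3 operator on a grayscale image stored as nested lists, written for illustration only; the neighbor ordering is one common convention, and this is not the implementation used inside any commercial service.

```python
def lbp_code(img, r, c):
    """8-bit LBP code of pixel (r, c).

    Each of the 8 neighbors that is >= the center pixel sets one bit.
    """
    center = img[r][c]
    # Neighbors in clockwise order, starting at the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(img):
    """Normalized 256-bin histogram of LBP codes over interior pixels."""
    hist = [0] * 256
    rows, cols = len(img), len(img[0])
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            hist[lbp_code(img, r, c)] += 1
    total = sum(hist)
    return [h / total for h in hist] if total else hist


# Toy 3 x 3 grayscale patch.
image = [
    [10, 20, 30],
    [40, 50, 60],
    [70, 80, 90],
]
code = lbp_code(image, 1, 1)
features = lbp_histogram(image)
```

The resulting histogram is the feature vector that a classifier such as an SVM would take as input.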

Gender classification is a complex, large-scale two-class pattern recognition problem. The classifier takes input data and assigns it to the male or female class. The most common gender recognition methods are currently a gender recognition algorithm based on facial features, a gender recognition method based on the Fisher criterion (Belhumeur et al., 1997), and a face and gender classification algorithm based on AdaBoost + SVM (Jianwei & Yiquan, 2020).

Age recognition is more complicated than gender recognition and is generally divided into two stages. In the prediction stage, the skin texture of a face in a photo is extracted, and the age range is roughly evaluated to determine a specific age group. In the detailed evaluation stage, multiple SVM-based classifiers corresponding to multiple age groups are established, and the appropriate model is selected for matching. When photos are clear and people are plainly visible, as in a set of 100 photos of famous people or in Twitter datasets, the accuracy rate is 99%; when photos are not clear, as in the IMDB-WIKI dataset, the accuracy rate is 76.8%.

Table 2
Information in photos.

Dimension                   Number
Number of faces             1          2        3        4        5        5+
Number of photos            3289       1135     527      336      228      6640
Proportion of total number  29%        13%      6%       4%       3%       46%
M:F                         1319:1970  839:847  453:424  309:291  222:203  858:848
M:F (Total)                 4000:4380

Table 3
Different age groups of people in portrait photos.

People of different ages in portrait photos
Age               0–18  19–29  30–39  40–49  50–59  60+
Number of photos  439   1605   645    371    193    36

People of different ages in portrait photos after deleting redundant IDs
Age               0–18  19–29  30–39  40–49  50–59  60+
Number of photos  136   696    350    168    61     13

Table 4
Number of men and women in portrait photos.

Number of men and women in portrait photos
Gender            Male  Female
Number of photos  1319  1970

Number of men and women in portrait photos after deleting redundant IDs
Gender            Male  Female
Number of photos  757   943

Fig. 4. Analytical framework.

3.6. Image content recognition and analysis

To explore the destination image-related cognition of tourists from different cultural backgrounds, we used DeepSentiBank (Chen et al., 2014), an image emotion and content analysis tool based on deep learning, which can effectively extract content and emotional keywords from a photo. The tool was developed by Columbia University’s D. Borth team based on a CNN. To achieve high emotion-based accuracy, DeepSentiBank output appears as ANPs (Chen et al., 2014). The content analysis procedure is depicted in Fig. 5. In this example, DeepSentiBank parsed a photo of the Forbidden City in Beijing into a result set containing 2089 ANPs (the figure lists only the first five items). The results were next sorted according to the correlation between each ANP and the photo’s content: ANPs with higher values rank higher and are more closely related to the photo’s content. DeepSentiBank thus transformed the image information into text and provided the basis for UGC analysis of tourists’ destination image.

Studies have shown that DeepSentiBank, compared with previous SVM-based classification models, performs significantly better in both annotation and retrieval (Chen et al., 2014). Recent tourism-related photo research has also indicated that DeepSentiBank performs adequately in terms of object recognition and especially sentiment analysis (He et al., 2021). Only nouns were of interest in our experiment, as we solely considered photos’ cognitive images and backgrounds in this study.

4. Results

4.1. Group comparative analysis based on ANPs

4.1.1. Comparison of cognitive images of different gender groups

Cognitive features, including destinations’ architecture, natural landscape, culture, and traditions, can be regarded as photo content appropriate for assessing a tourism destination’s cognitive image. According to Echtner and Ritchie (2003), cognitive characteristics are composed of three dimensions: ‘functional – psychological’, ‘universal – unique’, and ‘holistic – individual’ (Echtner et al., 2003). Stepchenkova and Morrison (2006) divided destination image attributes into natural ecology, people, original places, wild animals, traditional clothing, art life, local facilities, and local life. Mak and Athena (2017) classified destination attributes as reflecting a destination’s natural environment, facilities, culture and art, special activities, food and beverage, animals and plants, human beings, transportation, information, accommodations, and tourist attractions.

Baoqing et al. (2018) divided destination cognitive image into nine dimensions: scenic spots, food, people and history, local atmosphere, environment, facilities and services, leisure and entertainment, education, and places. Deng and Li (2018) categorized this form of image based on people, nature, transportation facilities, activities, buildings, culture, and places. As the object of the study is portrait photos, we eliminated the ‘people’ dimension and divided Beijing’s cognitive image across seven dimensions: natural environment; food; facilities; leisure and entertainment; culture, art, and history; architecture; and urban life (Table 5).

By taking the above-mentioned categories and combining them with the unique characteristics of Beijing, we obtained eight dimensions: natural environment; food; facilities; leisure and entertainment; culture, art, and history; architecture; urban life; and ‘other’. In the analysis, we classified the top 100 high-frequency nouns in photos of tourists of different genders that related to destinations’ cognitive image into these dimensions; nouns related to people were deleted. After combining singular and plural nouns (e.g. ‘animal’ and ‘animals’, ‘tree’ and ‘trees’), each dimension’s proportion was calculated as its share of all dimensions in each photo. Findings are displayed in Table 6 and Table 7.
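The noun-processing steps above (dropping people-related nouns, folding plurals into singulars, and computing each dimension’s share per photo) can be sketched as follows. The lookup table `DIMENSION_OF`, the set `PEOPLE_NOUNS`, and the naive plural rule are all hypothetical stand-ins; the study’s actual noun-to-dimension coding is not given in the text.

```python
from collections import Counter

# Hypothetical noun-to-dimension lookup (illustrative entries only).
DIMENSION_OF = {
    "tree": "natural environment",
    "lake": "natural environment",
    "palace": "architecture",
    "street": "urban life",
    "sign": "facilities",
}
# Hypothetical list of people-related nouns to discard.
PEOPLE_NOUNS = {"girl", "boy", "man", "woman", "face"}


def singular(noun):
    """Naive plural folding: strip one trailing 's' ('trees' -> 'tree')."""
    if noun.endswith("s") and not noun.endswith("ss"):
        return noun[:-1]
    return noun


def dimension_shares(nouns):
    """Share of each cognitive-image dimension among one photo's nouns."""
    counts = Counter()
    for noun in nouns:
        base = singular(noun.lower())
        if base in PEOPLE_NOUNS:
            continue  # people-related nouns are excluded
        counts[DIMENSION_OF.get(base, "other")] += 1
    total = sum(counts.values())
    return {dim: n / total for dim, n in counts.items()} if total else {}


shares = dimension_shares(["trees", "tree", "palace", "girls", "lake"])
```

In this toy photo, three of the four retained nouns fall under ‘natural environment’ and one under ‘architecture’.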

Among the eight dimensions, ‘natural environment’, ‘urban life’ and ‘architecture’ each accounted for a large proportion. Among photos taken by men, ‘natural environment’ (35%) and ‘urban life’ (33%) ranked first. Among photos taken by women, the top dimension was ‘architecture’ (32%), and the second was ‘natural environment’. ‘Architecture’, the leading dimension in women’s photos, accounted for only 6% among men. ‘Food’, ‘leisure and entertainment’ and ‘culture, art, and history’ were rarely seen in photos.

Fig. 5. ANP analysis of Forbidden City photo content.

Table 5
Destination cognitive image dimensions.

Dimension: Content
Natural environment: Natural scenery; weather; plants (e.g. flowers, grass, trees, leaves); animals; streams, lakes, and other water-related scenery; mountains; and natural scenery at tourist attractions.
Food: Food and beverage, food names, etc.
Facilities: Public facilities, business facilities, other urban facilities, transportation facilities and other infrastructure, accommodation, scenic spots, restaurants, signs, publication materials, and other tourism facilities.
Leisure and entertainment: Recreational activities such as traveling, dancing, partying; festivals such as Christmas, Halloween; sports, rowing, watching performances, performing, etc.
Culture, art, and history: Beijing’s unique cultural symbols, traditional costumes (e.g. chi-pao), handicrafts, painting and museum art, cultural relics in museum collections, goods related to art and history, cultural symbols with historical significance, etc.
Architecture: Buildings and their surroundings, decorations, business buildings, urban buildings, historical buildings, relics, monuments, royal palaces, etc.
Urban life: Urban views, urban landscapes, daily life of local residents, urban environment, security issues, etc.

An independent t-test was conducted on the eight dimensions of photos taken by people of different genders, including 1042 men and 1611 women. No significant difference emerged in the mean values for ‘natural environment’, ‘food’ and ‘leisure and entertainment’ by gender; that is, men and women did not differ on these dimensions in their chosen photo backgrounds. However, significant differences were observed between men and women in terms of ‘facilities’ (0.000), ‘culture, art, and history’ (0.010), ‘urban life’ (0.000) and ‘architecture’ (0.000). Men preferred to take photos with backgrounds related to ‘natural environment’, ‘facilities’ and ‘urban life’, whereas women favored photos including an ‘architecture’ background.
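An independent two-sample comparison of one dimension’s per-photo shares between men and women can be sketched as below. The text does not say whether equal variances were assumed, so this sketch uses Welch’s unequal-variance statistic, and it approximates the two-tailed p-value with the normal distribution, which is only reasonable for large samples such as the 1042 and 1611 photos analyzed here. The input values are toy numbers, not the study’s data.

```python
import math
from statistics import mean, variance


def welch_t_test(a, b):
    """Welch's t statistic with a normal-approximation two-tailed p-value."""
    se_a = variance(a) / len(a)  # squared standard error of mean(a)
    se_b = variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(se_a + se_b)
    # Two-tailed p under a standard normal: 2 * (1 - Phi(|t|)).
    p = math.erfc(abs(t) / math.sqrt(2))
    return t, p


# Toy per-photo 'architecture' shares (hypothetical values).
men = [0.20, 0.30, 0.25, 0.35]
women = [0.60, 0.70, 0.65, 0.75]
t_stat, p_value = welch_t_test(men, women)
```

A negative `t_stat` with a small `p_value` would indicate that the dimension appears less in men’s photos than in women’s, as reported for ‘architecture’.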

4.1.2. Relationships between cognitive images and people of different ages

We used Amazon Rekognition to identify people’s age in 2653 photos and analyzed the significance of age across the eight dimensions. A linear regression analysis was performed with the eight dimensions as dependent variables. Results showed that ‘facilities’, ‘architecture’ and ‘urban life’ each had a significant linear relationship with age. The number of photos containing ‘facilities’, ‘architecture’ and ‘urban life’ gradually increased with tourists’ age. With age, especially in adolescence and young adulthood, people are more drawn to urban settings. Conversely, marketers of Norwegian fjords have found that older tourists visit the country mainly to enjoy natural scenic spots, beautiful scenery, and a comfortable environment, whereas younger tourists seek excitement (Chang et al., 2010). ‘Urban life’ included images of urban views, urban landscapes, the daily life of local residents, and the urban environment; ‘architecture’ included buildings and their surroundings, as shown in Table 5. Urban landscapes such as parks generally feature attractive scenery and a comfortable environment, which influenced our findings to some extent.
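The regression of a dimension’s share on age reduces to an ordinary least-squares line with a constant and a slope, the two unstandardized coefficients reported per dimension. The sketch below fits such a line on toy values; the ages and shares are hypothetical and chosen only so that the arithmetic is easy to follow.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = constant + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    constant = mean_y - slope * mean_x
    return constant, slope


# Toy data: share of 'architecture' nouns by age (hypothetical values).
ages = [20, 30, 40, 50]
architecture_share = [0.08, 0.10, 0.12, 0.14]
constant, slope = fit_line(ages, architecture_share)
```

A positive slope means that the dimension appears more often in photos of older tourists; significance testing of the slope is not covered in this sketch.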

4.1.3. Comprehensive analysis of people of different genders and ages

As noted, by taking the aforementioned categories and combining them with the unique characteristics of Beijing, we obtained eight dimensions: natural environment; food; facilities; leisure and entertainment; culture, art, and history; architecture; urban life; and ‘other’. We then conducted linear regression analysis on age groups and these eight dimensions while accounting for people’s gender.

The four dimensions of ‘natural environment’, ‘food’, ‘facilities’ and ‘urban life’ demonstrated no significant difference among men of different ages, although a significant difference appeared among women of different ages. As women aged, the coefficient of the four dimensions corresponding to the background of photos gradually increased. We also identified significant differences between men and women of different ages with respect to ‘architecture’. Specifically, architectural elements in photos’ backgrounds appeared more often as tourists’ ages increased. No significant differences manifested for image backgrounds involving ‘leisure and entertainment’, ‘culture, art, and history’ or ‘other’ between men and women of different ages.

As in Table 8, ‘architecture’ and ‘urban life’ showed significant linear relationships with age when examined by gender. In particular, the relationship for ‘architecture’ was significant among both men and women, whereas for ‘urban life’ it was significant among women but not among men; thus, women had a greater influence on the age difference in terms of ‘urban life’ (see Table 9).

We did not identify any significant differences among the three dimensions of ‘natural environment’, ‘food’ and ‘facilities’ in terms of tourists’ ages. However, significant differences emerged based on women’s ages, such that women’s photos included more elements of these three dimensions as age increased. No significant differences appeared among men.

4.2. Point of interest (POI) comparison of different tourist groups

4.2.1. POIs of tourist groups by age

Instagram users often add hashtags and descriptions to photos when they publish them, such as ‘#Beijing tourism’ and ‘#Forbidden City’. By extracting the hashtags and descriptions added to images on Instagram, we obtained the top 20 POIs for tourist groups of different ages. Results are shown in Table 10. According to the statistics, the Forbidden City, Summer Palace, Temple of Heaven, and Mutianyu Great Wall were tourists’ most frequently visited scenic spots in Beijing. Accordingly, these scenic spots were most representative of ‘punching the card’ (i.e., must-see places). Other scenic spots, such as Tian’anmen Square and the Lama Temple, which represent Beijing’s traditional culture, were also popular. The city’s traditional architecture and culture held strong appeal for visitors as well. The modern buildings and business districts represented by museums, the 798 art district, Sanlitun, and Wangfujing were other major attractions in Beijing. In addition, tourists aged 19–29 accounted for the largest proportion of visitors, whereas the number of tourists over age 50 was relatively small.
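Extracting POIs from hashtags and tallying them by age group can be sketched as follows. The alias table `POI_ALIASES` and the post fields `age_group` and `caption` are hypothetical; the study’s actual matching rules for hashtags and descriptions are not described in the text.

```python
import re
from collections import Counter

# Hypothetical mapping from lowercase hashtag text to a POI name.
POI_ALIASES = {
    "forbiddencity": "Forbidden City",
    "summerpalace": "Summer Palace",
    "templeofheaven": "Temple of Heaven",
    "mutianyu": "Mutianyu",
}


def pois_in_caption(caption):
    """Return the set of known POIs mentioned as hashtags in a caption."""
    tags = re.findall(r"#(\w+)", caption.lower())
    return {POI_ALIASES[t] for t in tags if t in POI_ALIASES}


def poi_counts_by_group(posts):
    """Count each POI at most once per post, separately for each age group."""
    counts = {}
    for post in posts:
        group_counter = counts.setdefault(post["age_group"], Counter())
        group_counter.update(pois_in_caption(post["caption"]))
    return counts


# Toy posts, not data from the study.
posts = [
    {"age_group": "19-29", "caption": "#VisitBeijing #ForbiddenCity #forbiddencity"},
    {"age_group": "19-29", "caption": "#SummerPalace #ForbiddenCity"},
    {"age_group": "30-39", "caption": "Great hike #Mutianyu"},
]
counts = poi_counts_by_group(posts)
top_19_29 = counts["19-29"].most_common(1)
```

Ranking the counters with `most_common(20)` would give a per-group top-20 list of the kind shown in Table 10.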

4.2.2. Comparison of POIs among male and female tourists

By extracting the hashtags and descriptions added to images on Instagram, we obtained the top 20 POIs of tourists by gender (Table 11). The Forbidden City, Summer Palace, Temple of Heaven, and Mutianyu Great Wall remained tourists’ top choices.

4.2.3. POI selections of tourist groups with different ages and genders

Tourists were divided into 12 groups according to gender and age, and the ranking of the top 10 POIs in Instagram images was calculated. Results are shown in Table 12. The Forbidden City ranked first among all groups; that is, the primary cognition of different tourist groups regarding Beijing’s scenic spots was the Forbidden City. Women of different ages varied greatly in their choice of the Summer Palace, with women between 19 and 29 years old being the most prominent and those over 30 years old showing a downward trend. Men’s ranking was stable with no significant changes. Unlike women, more men under 29 years old took photos in front of the Temple of Heaven and Mutianyu Great Wall than in front of the Summer Palace. Among the modern art buildings and business districts represented by ‘798’ and ‘Sanlitun’, the number of women aged 19–29 was significantly greater than that of men, indicating that women in this age group were more interested in art buildings and business districts.

Table 6
Destination cognitive image dimensions classified by gender.

No.  Type                       Male  Female
1    Natural environment        35%   27%
2    Food                       1%    0
3    Facilities                 23%   15%
4    Leisure and entertainment  1%    1%
5    Culture, art, and history  0.4%  0.3%
6    Architecture               6%    32%
7    Urban life                 33%   23%
8    Other                      1%    1%

Table 7
Significance analysis of destination cognitive images among men and women (N1 = 1042; N2 = 1611).

Dimension                  Male (mean/SD)  Female (mean/SD)  Significance
Natural environment        0.86/0.107      0.81/0.112        0.246
Food                       0.019/0.074     0.01/0.060        0.114
Facilities                 0.58/0.903      0.44/0.781        0.000***
Leisure and entertainment  0.02/0.031      0.02/0.039        0.127
Culture, art, and history  0.011/0.229     0.013/0.021       0.010**
Architecture               0.14/0.196      0.94/0.151        0.000***
Urban life                 0.81/0.867      0.69/0.882        0.001***
Other                      0.03/0.041      0.04/0.049        0.000***

Table 8
Significance analysis of destination cognitive images among people of different ages.

Dimension                  R      Constant/slope (unstandardized)  Significance (two-tailed)
Natural environment        0.001  0.076/0.000                      0.241
Food                       0.001  0.023/0.000                      0.078
Facilities                 0.003  0.038/0.000                      0.007**
Leisure and entertainment  0.001  0.027/0.000                      0.054
Culture, art, and history  0.000  0.011/0.000                      0.276
Architecture               0.024  0.039/0.002                      0.000***
Urban life                 0.008  0.053/0.001                      0.000***
Other                      0.001  0.038/0.000                      0.075

According to the classification of Beijing’s tourism resources, we categorized the top 30 POIs into six categories: history and culture, heritage sites, natural scenery, urban parks, business streets and districts, and modern buildings.

The people in 3290 photos were divided into six age groups: 0–18, 19–29, 30–39, 40–49, 50–59, and 60+. SPSS was used to analyze their gender, age, and POI clusters as listed in Table 13. There were 439 people aged 0–18, 1606 aged 19–29, 644 aged 30–39, 371 aged 40–49, 193 aged 50–59, and 36 over age 60. The dataset included 481 photos (14.6%) of historical and cultural places, 145 photos (4.4%) of relics, five photos (0.2%) of natural scenery, 202 photos (6.1%) of urban parks, 95 photos (2.9%) of business streets and districts, and 183 photos (5.6%) of modern buildings.

A binary linear regression analysis was conducted for different genders in six POI categories. Table 14 indicates significant differences between men and women in the three categories of ‘history and culture’, ‘heritage sites’, and ‘urban parks’. In the categories of ‘history and culture’ and ‘heritage sites’, men’s photos were 22% and 37% less frequent than women’s images, respectively; ‘urban parks’ appeared 39% more often in men’s photos than in women’s.

According to previous studies (Meng & Uysal, 2008), women pay more attention to most destinations than men, especially in terms of natural scenery and recreational activities such as attending festivals/museums, visiting historical sites, sightseeing, and shopping. By contrast, men value challenging outdoor activities, including canoeing, hiking, skiing, horseback riding, hunting, and fishing. Men also pay more attention to holiday facilities and relevant activities (e.g. golf and tennis).

The results of binary linear regression analysis are shown in Table 15. We identified significant differences between men and women aged 0–18 and 19–28 in the category of ‘heritage sites’, with men’s photos being 0.344 times and 0.581 times as common as those taken by women, respectively. As both genders demonstrated significant differences in ‘heritage sites’ for full age ranges, we found that tourists aged 0–28 had a greater influence on the whole group; that is, this group showed a large distinction by gender. A significant difference also manifested between men and women aged 30–39 in the category of ‘business streets and districts’, with men’s photos 4.200 times as common as those taken by women. However, we found no significant difference between genders in ‘business streets and districts’ across all age ranges, suggesting that other age groups had a greater impact on the entire group.
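Ratios such as ‘0.344 times’ or ‘4.200 times’ read like odds ratios from a model with a binary outcome. Assuming that interpretation (the text itself does not confirm it), the quantity can be illustrated from a simple 2 x 2 table of photo counts. The function and the counts below are hypothetical and are not derived from the study’s data.

```python
def odds_ratio(men_in, men_out, women_in, women_out):
    """Odds of a POI category appearing in men's photos relative to women's.

    men_in / women_in: photos showing the category.
    men_out / women_out: photos not showing it.
    """
    men_odds = men_in / men_out
    women_odds = women_in / women_out
    return men_odds / women_odds


# Toy counts: 20 of 100 men's photos and 10 of 100 women's photos
# show the category (hypothetical values).
ratio = odds_ratio(men_in=20, men_out=80, women_in=10, women_out=90)
```

A ratio above 1 means the category is relatively more common in men’s photos, and a ratio below 1 means it is relatively more common in women’s photos.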

Table 9
Significance analysis of destination cognitive images among men and women of different ages.
Constant/slope values are unstandardized coefficients.

Dimension                  Male constant/slope  Male significance  Female constant/slope  Female significance
Natural environment        0.093/0.000          0.440              0.064/0.001            0.033*
Food                       0.026/0.000          0.272              0.024/0.000            0.033*
Facilities                 0.063/0.000          0.544              0.024/0.001            0.000***
Leisure and entertainment  0.025/0.000          0.363              0.029/0.000            0.159
Culture, art, and history  0.008/0.000          0.071              0.012/0.000            0.574
Architecture               0.092/0.001          0.006**            0.013/0.003            0.000***
Urban life                 0.073/0.000          0.267              0.040/0.001            0.000***
Other                      0.030/0.000          0.949              0.041/0.000            0.190

Table 10
Top 20 POIs among tourists of different ages.

No.  POI               0–18  19–29  30–39  40–49  50–59  60+
1    Forbidden City    42    197    78     29     27     5
2    Summer Palace     14    89     45     25     18     6
3    Temple of Heaven  21    84     41     16     4      0
4    Mutianyu          14    69     29     3      3      0
5    Tiananmen         8     37     16     18     27     5
6    798               6     45     13     13     5      0
7    Museum            13    37     11     11     1      0
8    Gubei             9     32     14     8      1      0
9    Sanlitun          5     31     9      8      3      0
10   Jinshanling       9     23     12     1      1      0
11   Beihai            6     17     4      8      5      0
12   Lama Temple       4     14     8      4      7      0
13   Badaling          3     21     6      3      1      0
14   Jingshan          3     13     7      2      1      0
15   Wangfujing        13    8      1      2      0      0
16   Bird’s Nest       0     19     3      2      0      0
17   Jiankou           3     9      5      2      1      0
18   Simatai           3     8      7      2      0      0
19   Shichahai         5     9      2      1      0      0
20   CCTV              2     6      7      0      0      0

Table 11
Top 20 POIs by tourists’ gender.

POI               Male  Female
Forbidden City    218   160
Summer Palace     125   72
Temple of Heaven  82    84
Mutianyu          53    65
Tiananmen         36    75
798               54    28
Museum            37    36
Gubei             48    16
Sanlitun          34    22
Jinshanling       30    16
Beihai            30    10
Lama Temple       17    20
Badaling          22    12
Jingshan          22    4
Wangfujing        12    12
Bird’s Nest       8     16
Jiankou           12    8
Simatai           11    9
Shichahai         12    5
CCTV              5     10
