Persistent Homology in Tourism: Unlocking the Possibilities

Hensley Hernandez

Academic year: 2023

Recent examples of the use of principal component analysis include Brida et al. (2018), which is one of the few articles using a complete data approach. For an algebraic discussion of the methodology and its theoretical underpinnings, see Carlsson (2009), Edelsbrunner and Morozov (2013), and others. As we increase the level of filtration, we get to panel (b), where the circles on the right-hand side meet two points.

The filtration circles are for illustration only and do not represent specific filtration parameter values. In the bivariate case, this would be the radius of the circles drawn around the points.
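The growth of the filtration can be sketched for the simplest features, connected components (dimension 0). The following is a minimal illustration, assuming Euclidean distance and a rule in which two points become connected once their distance is no greater than the filtration value (that is, when circles of half that radius touch). The function name `h0_barcode` and the four sample points are hypothetical and are not taken from the paper or its software.

```python
import math
from itertools import combinations


def h0_barcode(points, max_filtration):
    """Return (birth, death) bars for connected components.

    Every point is born at filtration 0. A component dies at the
    distance at which it merges into another component. Components
    still alive at max_filtration get death = infinity.
    """
    parent = list(range(len(points)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Edges enter the filtration in order of increasing distance.
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(len(points)), 2)
    )

    bars = []
    for dist, i, j in edges:
        if dist > max_filtration:
            break
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj          # two components merge
            bars.append((0.0, dist))  # one of them dies here

    alive = len({find(i) for i in range(len(points))})
    bars.extend((0.0, math.inf) for _ in range(alive))
    return bars


# Illustrative data: two tight pairs separated by a wider gap.
points = [(0.0, 0.0), (0.1, 0.0), (1.0, 0.0), (1.1, 0.0)]
bars = h0_barcode(points, max_filtration=1.5)
```

Short bars correspond to points that merge almost immediately, while the long bar ending near 0.9 reflects the gap between the two pairs. Holes (dimension 1) require a fuller simplicial construction that this sketch does not attempt.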

Persistent Homology or Clustering?

If the filtration is set too high, all the gaps will be filled and only a very limited number of features will be identified. Between the extremes, the number of holes will vary as small holes are created and filled and larger holes are created. These results will provide a clear picture of how the sample represents the population.

This is an emerging field in PH research, but here we limit ourselves to the operation of the technique and simple applications as an avenue for its use. In marketing and destination management it is useful to retain all the information and ensure the maximum possible knowledge even of the smallest groups. Finally, while hierarchical clustering is intuitive when considered as observations that join together, or as a grand coalition of all observations that splits, the arbitrary nature of the cutting process raises questions.

Increasing the filtration in hierarchical clustering would reduce the problem of choosing the number of clusters, but would lose the detail that PH holes provide. Thus, PH stands as the best way to identify the smallest clusters and the true hierarchy of the data, and to do so without requiring any model assumptions. PH, as developed here, is different because it does not use any parameters beyond the filtration range, and hence the barcodes ensure that the researcher is aware of the consistency of the results.

For the purposes of this paper, deciles are used to segment the market, given that a large data set is important to get a strong sense of the biggest spenders.
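As a rough illustration of decile segmentation, the sketch below selects the bottom and top 10% of respondents by spend. The function name `decile_groups`, the floor rule for the decile size, and the spend figures are all assumptions made for the example; they are not the paper's data or procedure.

```python
def decile_groups(spend):
    """Return (bottom, top) lists of respondent indices by spend.

    Assumption: a decile holds floor(n / 10) respondents, with a
    minimum of one; ties are broken by original order.
    """
    order = sorted(range(len(spend)), key=lambda i: spend[i])
    k = max(1, len(spend) // 10)
    return order[:k], order[-k:]


# Illustrative spend values for 20 hypothetical respondents.
spend = [120, 950, 40, 300, 610, 75, 2200, 180, 430, 1500,
         90, 260, 700, 55, 1100, 340, 820, 150, 500, 3100]
bottom, top = decile_groups(spend)
```

With 20 respondents each decile holds two people, which shows why a large data set is needed before the top decile contains enough observations to analyse.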

Data

Implementation

The figures are based on the IPS sample of tourists and differ from the total number when other visit purposes are taken into account. The individual in the younger age group refers to traveling alone, while the party indicates that the respondent was part of an organized trip with others. Whichever software package is used, the dataset must be provided in a way that allows the construction of a point cloud.

In our case, we have a serial number to identify individual respondents, but we cannot pass it to the homology computation, as it is not a true variable associated with an individual; it must be removed before running the code. Having the results depend solely on the maximum level of filtration adds robustness compared to techniques that require more choices to be made by the researcher. In fact, the homology calculation runs from zero to the maximum, which only requires that the upper limit be meaningful for the barcodes.
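A minimal sketch of this preparation step follows, assuming the survey is held as a list of dictionaries with an identifier field called `serial`. The field names, the helper `build_point_cloud`, and the min-max rescaling of each variable to the range 0 to 1 are assumptions for illustration; the source does not state how, or whether, the variables were scaled.

```python
def build_point_cloud(records, id_field="serial"):
    """Drop the identifier and return (fields, cloud).

    Each remaining variable is rescaled to [0, 1] (an assumption)
    so that no single variable dominates the distances.
    """
    fields = [f for f in records[0] if f != id_field]
    lo = {f: min(r[f] for r in records) for f in fields}
    hi = {f: max(r[f] for r in records) for f in fields}

    cloud = []
    for r in records:
        cloud.append(tuple(
            (r[f] - lo[f]) / (hi[f] - lo[f]) if hi[f] > lo[f] else 0.0
            for f in fields
        ))
    return fields, cloud


# Hypothetical respondents; 'serial' identifies rows but is not a variable.
records = [
    {"serial": 1001, "age": 25, "nights": 3, "party": 1},
    {"serial": 1002, "age": 40, "nights": 7, "party": 2},
    {"serial": 1003, "age": 55, "nights": 14, "party": 1},
]
fields, cloud = build_point_cloud(records)
```

The serial numbers can be kept in a separate list in the same row order, so that cluster members can be traced back to respondents after the homology has been computed.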

In the Tausz et al. (2014) implementation, cluster membership is only available at the maximum filtration level, and this drives the choice of level for the reported groups; we can easily perform the homology again to identify clusters at an alternative level.
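Reading off cluster membership at a chosen level can be sketched as follows. This is not the Tausz et al. (2014) implementation; it is a plain union-find illustration in which the hypothetical function `clusters_at` links any two points whose Euclidean distance is no greater than the chosen level and returns the resulting groups of point indices.

```python
import math
from itertools import combinations


def clusters_at(points, level):
    """Group point indices into connected components at a given level."""
    parent = list(range(len(points)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(points)), 2):
        if math.dist(points[i], points[j]) <= level:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(len(points)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Illustrative points: rerunning at another level changes the grouping.
points = [(0.0, 0.0), (0.1, 0.0), (1.0, 0.0), (1.1, 0.0)]
at_half = clusters_at(points, 0.5)
at_one = clusters_at(points, 1.0)
```

Running the function at two levels mirrors the idea of repeating the homology with a different maximum: at 0.5 the points form two pairs, and at 1.0 they have merged into a single group.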

Results

Numbers report the mean value for each variable in a group, and the number in parentheses is the standard deviation for that variable and group. Notes: The horizontal axis shows the degree of filtration and ranges from 0 to 1.5, which is used in the homology that follows. Sizes range from as few as 5 to 25 members, with most higher-spending groups having more than 10.
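The group summaries described here, a mean with the standard deviation in parentheses, could be produced along the following lines. The helper `summarise`, the cluster labels, and the records are hypothetical, and the use of the sample standard deviation is an assumption, since the source does not say which form was reported.

```python
from statistics import mean, stdev


def summarise(clusters, records, fields):
    """Return {label: {field: (mean, std)}} for each cluster.

    Assumption: sample standard deviation; a single-member cluster
    is given a standard deviation of 0.0.
    """
    summary = {}
    for label, members in clusters.items():
        row = {}
        for f in fields:
            values = [records[i][f] for i in members]
            sd = stdev(values) if len(values) > 1 else 0.0
            row[f] = (mean(values), sd)
        summary[label] = row
    return summary


# Hypothetical records and cluster memberships (lists of row indices).
records = [
    {"age": 20, "nights": 2},
    {"age": 30, "nights": 4},
    {"age": 40, "nights": 6},
    {"age": 65, "nights": 10},
]
clusters = {"A": [0, 1, 2], "B": [3]}
summary = summarise(clusters, records, ["age", "nights"])

for label, row in summary.items():
    cells = ", ".join(f"{f}: {m:.1f} ({s:.1f})" for f, (m, s) in row.items())
    print(f"cluster {label}: {cells}")
```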

In this paper, 1% of the total sample will be about 16 members, and thus the total cluster membership is approximately 10%. For a large data set, we would expect the majority of respondents to behave similarly and that there are strong correlations between observations.

Bivariate and Trivariate Visualisations

Example Clusters

Cluster numbering is provided by the software and corresponds to the full set of clusters listed in the supplementary material. Consequently, much of what PH identifies contradicts the market segmentation regression results. Our 10% spending clusters, listed in Table 5, also include both genders and both departure mode options.

All are either solo travelers or have at most one companion; there is great consistency within these characteristics. There are differences in that group 4 has some smaller group sizes and consists of air travelers. In such cases, any conclusion will only apply to the specific data studied in the groups.

Once again there is variation in length of stay and age, with the clustering mainly focusing on group size. For age, there is little significant prediction of being in the top 10%, and therefore little can be said about the consistency of the clusters in this dimension. Men are significantly more likely to be in the higher-spending group, as are air travelers, so in this respect cluster 17 goes against what regression models would predict.

All other group sizes are less likely to be in the top 10% relative to solo travelers and therefore there is greater consistency with these two groups.

Summary

Not surprisingly, these nationalities are associated with higher expenditure by the logistic regressions, as are stays between 3 and 18 days. Our empirical work is inevitably limited by the data set; there are a number of other areas that could be of interest, such as income, which are not available in passenger surveys. High income leads to high spending, but again there is the potential for PH to identify high-income, frugal individuals who then fall into the bottom 10%; such a grouping would be an ideal target for promotions to encourage increased spending.

Therefore, we challenge the research community to develop more informative datasets by providing a tool that can unleash their full informative capabilities. The IPS data is tasked with understanding travel to and from the UK, meaning that many questions that can help understand motivations can be used. There is also an opportunity to extend the analysis to regional and accommodation variables which are available for a subset of respondents.

We have focused on vacationers for the example, but the work can easily be expanded to cover business travelers and those visiting friends and family. Such purposes are typically assumed to lead to higher and lower spending, respectively, but again we can expect PH to identify groups of low-spending business travelers or high-spending visitors to relatives. In our example, we considered the top 10% and bottom 10% of spending among inbound tourists to the UK.

Beyond IPS data there are applications for understanding the drivers of demand for specific visitor attractions, hotels, destinations, etc.

Theoretical Implications

It is the ability of topological data analysis techniques such as PH to break down the complexity of the data structure without relying on imposed relationships that contributes most to the development of theory in the field of tourism management. PH provides a laboratory in which data changes can be tried and evaluated through perturbations in the cloud. Rather than the identified clusters per se, the theoretical advance of our work lies in the area of marketing, where identifying individuals and reshaping the data set by changing characteristics for certain observations has great potential.

For the bottom 10%, we imagine we could successfully create a campaign that removes one of the clusters from the bottom decile of spend. Such a move would then ensure that homology would identify new individuals as the gap inevitably widens. At the same time, new observations would become part of the 10%, potentially further modifying the homology and creating new features.

As a loop, this process has the ability to raise the bottom 10% threshold while indicating marketing effectiveness. Furthermore, we consider the trajectory of the targeted attributes, as tracked by changes in the homology, as a means of evaluating marketing effectiveness. Such an analysis has not yet been attempted, and PH itself has yet to make an appearance in this context, but it is introduced here as a theoretical advance arising from our work.

Our literature review on PH also highlighted areas such as time series analysis and understanding social networks where topological data analysis could easily be used by active tourism research fields.

Managerial Implications

The seeming disregard for small details in the important clustering phase of target marketing dictates that results-born strategies are empirically limited in their directional capabilities. One of the key features of PH is that it can exploit the geometric information embedded in the dataset to produce better quality quantitative modelling. We propose that the topological connectivity of tourism behavior elements can provide a natural segmentation for tourists' consumption behavior.

Such segmentation can in turn improve the quality of the marketing activities of travel service providers, destinations and policy makers. These close matches within the data cloud were shown to provide a detailed picture despite the coarseness of the data. The clusters that emerge are clearly identified and provide exactly the concise summary that managers crave, enabling the marketing of the right content to the right people at the right time.

In addition, a more sophisticated representation of consumption groups can provide a more accurate assessment of the tourism market. In the current literature, most of the successful applications of PH have been limited to analysis and hardly bring benefits to the next stage of the business decision-making process. Using PH, managers can identify stable patterns in the data that are considered precursors to periods of high demand.

Persistent Homology (PH) is an example of topological data analysis, which identifies patterns in data that the statistical methods typically used in tourism would not. Despite the criticisms that have been leveled against all analyses of the IPS dataset, PH will continue to bring the benefits of understanding that we have shown here.
