Souâd Demigha
5. Design of the CBR Cloud Computing System
5.1 General architecture of the system
Figure 3 illustrates the general architecture of the CBR “Cloud Computing” system in radiology.
Figure 3: General architecture of the CBR “Cloud Computing” System in radiology
5.2 Design of the CBR Cloud Computing System
Before developing the CBR “Cloud Computing” system, we analyzed users’ requirements and designed the system by building a “case base” that stores the radiological knowledge. We organized patient data as clinical cases according to the Case-Based Reasoning (CBR) approach. CBR offers a natural translation of Cloud status information into a formal knowledge representation and an easy integration with the MAPE phases. CBR is the process of solving new problems based on the solutions of similar past problems, which are structured as cases.
Data are structured as cases. A case is a pair consisting of a problem and a solution. In medicine and radiology, the problem is the situation to be diagnosed and the solution is the diagnosis. It has been shown that students learn best when they are presented with examples of problem-solving knowledge and are then required to apply that knowledge to real situations. The “knowledge base” of examples and exercises captures realistic problem-solving situations and presents them to the students as virtual simulations (Demigha, 2015). CBR is well suited to the medical domain.
5.3 Methodology
CBR is a general problem-solving or decision-making framework, which revolves around the processes of case retrieval, reuse, retention, and maintenance.
The first step in the CBR process is similar to the knowledge creation phase in the KM process: “building the actual case base”. The “case base design” is characterized by a considerable knowledge engineering effort involving specialists and knowledge engineers. Once the design is completed, the CBR process is controlled by the processes of case retrieval, reuse, and retention, and by case base maintenance (Demigha, 2018).
The ability of case-based reasoning to reason from individual examples and its inertia-free learning make it a natural approach for big-data problems such as predicting from very large example sets.
Likewise, if CBR systems had the capability to handle very large data sets, such a capability could facilitate CBR research on very large data sources already identified as interesting to CBR.
5.3.1 Case organization and modeling
Figure 4 illustrates the hierarchical organization of breast cancer knowledge.
Figure 4: A shared feature network for breast cancer
The cloud model provides the ability to rapidly acquire, provision, and deploy new IT platforms, services, applications, and test environments. With cloud capabilities, months-long IT hardware procurement processes can be eliminated, reducing time spent on such tasks to a matter of hours or even minutes. The cloud model also helps ensure that university networks are available and secure, regardless of the circumstances.
“Like Data Bases search, retrieval of cases from a case library can be seen as a massive search problem - but with a twist” (Kolodner, 1993). No case in the case library can ever be expected to match a new situation exactly, so the search must result in the retrieval of a close partial match. Partial-match algorithms are quite expensive. Because of this expense, retrieval must be directed in some way so that matching is attempted only on those cases with some potential relevance to the new situation.
Because partial matching is so important, the algorithms used to search a database will not, in general, work for searching a case library. Database algorithms require the fields of a query to match items in the database exactly or to be instances of the types specified in the query.
There are several different ways of organizing cases, each with its own retrieval and update algorithms and its own advantages and disadvantages. Some structures are hierarchical; others are flatter. Some structures discriminate coarsely; others discriminate more finely. Some algorithms are inherently parallel; others are serial.
We distinguish six types of organizational structures:
Flat memory, serial search (optionally augmented by shallow indexing or partitioning);
Shared feature networks, breadth-first graph search;
Prioritized discrimination networks, depth-first graph search;
Redundant discrimination networks, breadth-first graph search;
Flat memory, parallel search;
Hierarchical memory, parallel search.
The organization of the cases in the case base has a direct effect on the complexity and response time of the case-based recommendation system. Because the market of cloud services is growing rapidly, which implies a fast growth of the case base, we need a case organization that supports efficient retrieval from large case bases (Soltani, 2016). The flat memory model (Bichindaritz, 2008; Tsatsoulis et al., 2000) is the simplest one, as it organizes all the cases at the same level. It is a good choice when the number of cases in the case base is relatively small, since during retrieval the CBR engine compares the problem case with each of the cases in the case base. This model provides maximum accuracy, easy maintenance and easy retention, which explains its wide use in many applications.
When a case library is large, as in radiology, cases need to be organized hierarchically so that only a subset of them has to be considered during retrieval. This subset, however, must be likely to contain the best-matching or most useful cases. For our system design we adopt a hierarchical organization of cases: shared-feature networks.
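The serial comparison performed by the flat memory model can be sketched as follows. This is a minimal illustration, not the system's implementation: the feature names, the case contents and the exact-match `overlap` measure are all assumptions introduced for the example.

```python
def overlap(problem, stored):
    """Fraction of the query's features that the stored case matches exactly."""
    if not problem:
        return 0.0
    hits = sum(1 for key, value in problem.items() if stored.get(key) == value)
    return hits / len(problem)


def retrieve_flat(problem, case_base):
    """Serial search: score every stored case and return the best (case, score)."""
    best = max(case_base, key=lambda case: overlap(problem, case["problem"]))
    return best, overlap(problem, best["problem"])


# Hypothetical case base; attribute names and values are illustrative only.
case_base = [
    {"problem": {"shape": "round", "margin": "circumscribed"}, "solution": "A"},
    {"problem": {"shape": "irregular", "margin": "spiculated"}, "solution": "B"},
    {"problem": {"shape": "oval", "margin": "circumscribed"}, "solution": "C"},
]
query = {"shape": "irregular", "margin": "spiculated", "density": "high"}

best, score = retrieve_flat(query, case_base)
print(best["solution"], round(score, 2))
```

Every stored case is visited once, so the retrieval cost grows linearly with the size of the case base, which is why this organization is attractive only for small libraries.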
5.3.2 Similarity functions
Matching is the process of comparing two cases to each other and determining their degree of match. Ranking is the process of ordering partially matching cases according to goodness of match or usefulness. When we match cases, we can produce a score that signifies the degree of match, or we can simply determine whether or not a case matches sufficiently. The main idea is: “if you can cluster together cases that are similar to one another and figure out which cluster best matches the new situation, then only items in that cluster need be considered in finding a best-matching case.” Hierarchies are formed when clusters are broken down into subclusters and so on (Kolodner, 1993).
Inductive clustering methods generally look for similarities over a series of instances and form categories based on those similarities. Shared-feature networks provide a means of clustering cases so that cases that share many features are clustered together. Each internal node of a shared-feature network holds the features shared by the cases below it. Items without those features live in or below that node’s siblings. Leaf nodes hold the cases themselves. In (Demigha, 2015b, 2015c), we presented concepts and techniques used to develop a Data Mining system, particularly in the medical field and imaging. The application of information mining techniques to the medical domain is very helpful in extracting medical knowledge for diagnosis, decision-making, screening, monitoring, therapy support and patient management.
To retrieve a case from a shared-feature network, a kind of breadth-first search is done. The input (new situation) is matched against the contents of each node at the highest level in the graph. The best-matching node is chosen. If it is a case, the case is returned. Otherwise, if it is an internal node, the same procedure is repeated among its descendants. This continues until a case is returned. Table 1 shows the algorithm. After the cases have been clustered in this way, the tree can be updated incrementally using the algorithm in Table 2 as new cases are added to the case library.
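The distinction between matching (producing a score or a yes/no answer) and ranking (ordering the partial matches) can be illustrated with a short sketch. The weighted exact-match score, the feature names and the weights below are assumptions chosen for the example; the paper does not specify a particular similarity function.

```python
def match_score(new, stored, weights):
    """Weighted share of features on which the two cases agree (0.0 to 1.0)."""
    total = sum(weights.values())
    agreed = sum(
        weight
        for feature, weight in weights.items()
        if feature in new and new.get(feature) == stored.get(feature)
    )
    return agreed / total if total else 0.0


def rank(new, cases, weights, threshold=0.0):
    """Return (score, case) pairs with score >= threshold, best match first."""
    scored = [(match_score(new, case, weights), case) for case in cases]
    kept = [pair for pair in scored if pair[0] >= threshold]
    return sorted(kept, key=lambda pair: pair[0], reverse=True)


# Hypothetical features and weights, for illustration only.
weights = {"shape": 2.0, "margin": 1.0, "density": 1.0}
new_case = {"shape": "irregular", "margin": "spiculated", "density": "high"}
stored_cases = [
    {"id": 1, "shape": "irregular", "margin": "circumscribed", "density": "high"},
    {"id": 2, "shape": "round", "margin": "spiculated", "density": "low"},
    {"id": 3, "shape": "irregular", "margin": "spiculated", "density": "high"},
]

ranked = rank(new_case, stored_cases, weights, threshold=0.5)
print([(score, case["id"]) for score, case in ranked])
```

The `threshold` argument plays the role of the yes/no decision: cases scoring below it are treated as not matching sufficiently and are dropped before ranking.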
Table 1: Retrieving a case from a shared-feature network
1. Let N = the top node.
2. Repeat until N is a case:
   Find the node under N that best matches the input, and let N be that node.
3. Return N.
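The retrieval procedure of Table 1 can be sketched in a few lines. The `Node` structure, the `node_match` measure (a count of shared feature values) and the small example tree are assumptions made for illustration; the sketch also assumes a well-formed network in which every internal node has at least one child.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    features: dict                                 # features held at this node
    children: list = field(default_factory=list)   # empty for leaf nodes
    case: Optional[str] = None                     # set only on leaf nodes


def node_match(query, node):
    """Count the node's features that the query shares (an assumed measure)."""
    return sum(1 for key, value in node.features.items() if query.get(key) == value)


def retrieve(root, query):
    """Descend from the top node, always following the best-matching child."""
    n = root
    while n.case is None:  # repeat until N is a case
        n = max(n.children, key=lambda child: node_match(query, child))
    return n.case


# Hypothetical network; feature names and case labels are illustrative only.
root = Node({"organ": "breast"}, [
    Node({"shape": "round"}, [
        Node({"margin": "circumscribed"}, case="case-1"),
        Node({"margin": "obscured"}, case="case-2"),
    ]),
    Node({"shape": "irregular"}, [
        Node({"margin": "spiculated"}, case="case-3"),
        Node({"margin": "indistinct"}, case="case-4"),
    ]),
])

query = {"organ": "breast", "shape": "irregular", "margin": "spiculated"}
print(retrieve(root, query))
```

Only the children of the chosen node are examined at each level, so the number of comparisons depends on the depth and branching of the network rather than on the total number of cases.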
Table 2: Clustering a shared-feature network
1. Choose a clustering method.
2. Create a top node for the tree. Call it N.
3. Let C be the set of cases needing organization.
4. Put any features shared by all the cases in C into N.
5. Partition C using the clustering method. Create a node for each partition, attaching each as a successor to node N.
6. For each partition:
   Create a node Ni.
   If it contains more than one case, then repeat step 4, with N = Ni, C = the cases in the partition.
   Else, put the features of the one case into node Ni.
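A possible reading of the clustering procedure in Table 2 is sketched below. Table 2 leaves the clustering method open, so the one used here (split on the non-shared feature with the fewest distinct values) is purely an assumption for the example, as are the dictionary-based node layout and the feature names. Each node stores only the features not already held by its ancestors.

```python
def shared(cases):
    """Feature/value pairs common to every case in the list."""
    first, rest = cases[0], cases[1:]
    return {k: v for k, v in first.items() if all(c.get(k) == v for c in rest)}


def partition(cases, common):
    """Assumed clustering method: split on the non-shared feature that has
    the fewest distinct values (ties broken alphabetically)."""
    keys = {k for c in cases for k in c} - set(common)
    if not keys:
        return [[c] for c in cases]  # identical cases: one leaf each
    key = min(keys, key=lambda k: (len({c.get(k) for c in cases}), k))
    groups = {}
    for c in cases:
        groups.setdefault(c.get(key), []).append(c)
    return list(groups.values())


def build(cases, inherited=None):
    """Recursively organise cases (dicts of features) into a shared-feature tree."""
    inherited = inherited or {}
    common = shared(cases)
    node = {
        "features": {k: v for k, v in common.items() if k not in inherited},
        "children": [],
        "case": None,
    }
    if len(cases) == 1:  # a single case: this node is a leaf
        node["case"] = cases[0]
        return node
    known = {**inherited, **common}
    for group in partition(cases, common):
        node["children"].append(build(group, known))
    return node


# Hypothetical cases, for illustration only.
cases = [
    {"organ": "breast", "shape": "round", "margin": "circumscribed"},
    {"organ": "breast", "shape": "round", "margin": "obscured"},
    {"organ": "breast", "shape": "irregular", "margin": "spiculated"},
]
tree = build(cases)
print(tree["features"], [child["features"] for child in tree["children"]])
```

With a different clustering method the shape of the tree changes, but the invariant stays the same: an internal node holds the features shared by all the cases below it.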
For breast cancer, data items may be grouped according to logical relationships or to the affinities or preferences of the senologist (an expert radiologist in breast cancer). Due to the complexity of medical data, it is preferable in certain projects or diagnoses to adapt existing algorithms or to optimize their use in order to obtain better results (Iavindrasana et al., 2009). The heterogeneity of medical data (volume and complexity, dependence on the physician’s interpretation, poor mathematical categorization and lack of a canonical form) motivates medical data miners to develop new approaches to data analysis (Iavindrasana et al., 2009).
To remedy these deficiencies, it is advisable to create standard vocabularies and interfaces between the different data sources to be integrated, and to design electronic patient records. In (Jesneck et al., 2006), the authors propose a “decision fusion” strategy for the classification of imaging data from multiple modalities and multiple sources, with various types of features (Tusch et al., 2008).
5.3.3 Knowledge acquisition
We distinguish five categories of data: clinical features, radiological features, histological features, image data features and digital image features.
5.3.4 Knowledge representation
In object-oriented retrieval, cases are represented as objects whose attributes may be of a simple type, such as integer or string, or of an object type. This forms a hierarchy of object structures within which cases belonging to the same classes of the hierarchy can be compared. A difficulty with this type of structure arises when the target case and the case in the case base are not objects of the same class. With this type of retrieval, not all the cases are compared to the target case, so it is faster than K-NN. The method is also tolerant of missing attributes: if values are missing for the target case, the higher part of the hierarchy is searched, resulting in more retrieved cases.
We have organized the data using CBR. We have structured the five categories of features as cases and modeled these cases using object modeling (Bergmann, 1998).
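The idea of comparing object-structured cases within a class hierarchy can be sketched as follows. The classes `Finding`, `Mass` and `Calcification`, their attributes, and the rule of comparing only the attributes defined at the most specific common class are all hypothetical choices made for this illustration; they are one possible way of handling cases of different classes and missing target values, not the system's actual model.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class Finding:                  # hypothetical root class
    side: Optional[str] = None


@dataclass
class Mass(Finding):            # hypothetical subclass
    shape: Optional[str] = None
    margin: Optional[str] = None


@dataclass
class Calcification(Finding):   # hypothetical subclass
    morphology: Optional[str] = None


def common_class(a, b):
    """Most specific class of `a` that `b` is also an instance of."""
    for cls in type(a).__mro__:  # always ends at `object`, so this returns
        if isinstance(b, cls):
            return cls


def similarity(target, stored):
    """Compare the attributes defined at the common class, skipping
    attributes whose value is missing (None) in the target case."""
    cls = common_class(target, stored)
    if cls is object:
        return 0.0
    names = [f.name for f in fields(cls) if getattr(target, f.name) is not None]
    if not names:
        return 0.0
    same = sum(getattr(target, n) == getattr(stored, n) for n in names)
    return same / len(names)


target = Mass(side="left", shape="irregular")  # margin is missing
same_class = Mass(side="right", shape="irregular", margin="spiculated")
other_class = Calcification(side="left", morphology="fine")

print(similarity(target, same_class), similarity(target, other_class))
```

When the two objects belong to different subclasses, only the attributes inherited from their common ancestor are compared, which mirrors the fallback to a higher part of the hierarchy described above.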
5.3.5 Illustration
Table 3 gives an example of a scenario proposed for a training session for junior radiologists. The scenario places the junior radiologist in a situation where they carry out a learning session, and it requires them to learn targeted knowledge and skills (Demigha, 2015a).
Table 3: A scenario example
The junior radiologist is provided with a case library of videos of experts telling their stories, strategies, and perspectives that might help them with their task.
When they achieve their goal, they ask a question of the case library, and an appropriate video is retrieved and shown.
A story proposes a topic to radiologists (juniors) they should learn more about or a skill they need to learn.
A story tells how that expert dealt with some difficult issue the student is addressing.