Scheduling Workflows with Minimizing Energy Consumption for Big Data Applications on Cloud Using Modified PCP Algorithm

Academic year: 2024

1Aishwarya, 2Deepika, 3Divya, 4Meenakshi, 5Kiranbala

1,2,3,4 Student, Department of Computer Science and Engineering, K. Ramakrishnan College of Engineering, Trichy.

5 Assistant Professor, Department of Computer Science and Engineering, K. Ramakrishnan College of Engineering, Trichy.

Abstract - Cloud computing plays an important role in Big Data applications by providing virtualized resources. Energy consumption is a major constraint that must be taken into account when scheduling workflows for Big Data applications on the cloud. In this project, we present energy-efficient scheduling using a Modified PCP (Partial Critical Path) algorithm to minimize energy consumption, and privacy-aware scheduling using the Elliptic Curve Cryptography (ECC) algorithm to maximize the privacy of the social environment.

Experimental evaluations validate the efficiency and effectiveness of our proposed method.

Key terms: Cloud computing, workflow scheduling, Modified PCP, Elliptic Curve Cryptography.

I. INTRODUCTION

Today, almost everyone is connected to the Internet and uses different cloud solutions to store, deliver, and process data. Cloud computing assembles services over large networks. End users access cloud resources asynchronously and, in many cases, from mobile devices over different types of networks. Ensuring interoperability for such systems, with dependability and resilience as the main aims, is one of the major challenges for heterogeneous distributed systems.

While cloud computing optimizes the use of resources, it does not (yet) provide an effective solution for processing complex applications described by workflows. Examples of such applications are those that host multimedia content and process a tsunami of content (often in real time) from heterogeneous sources such as surveillance cameras and medical imaging devices.

The current need is for an optimal and validated middleware framework that can support end-to-end life-cycle operations of different multimedia content-driven applications on more standard cloud infrastructures.

Cloud computing is a powerful technology that can provide numerous cloud services to customers everywhere through the Internet. Data privacy has recently received a lot of attention because of growing concern about privacy and the protection of data value, since individuals often suffer heavy blows from privacy leaks. How to protect data and improve the security level of private data has become a hot topic in cloud computing. Generally, placing such datasets in a private cloud is an effective approach; thus a hybrid cloud for big data storage should be taken into consideration for privacy-aware applications.

In this work, we present energy-efficient scheduling using a Modified PCP (Partial Critical Path) algorithm to minimize energy consumption. We also present privacy-aware scheduling using the Elliptic Curve Cryptography (ECC) algorithm to maximize the privacy of the social environment.

II. LITERATURE REVIEW

This section reviews the significance of workflow scheduling in the cloud and surveys existing approaches to energy efficiency and privacy-aware scheduling. The growth of the web has resulted in heavy usage of many applications, including service-oriented applications.

A Privacy Leakage Upper Bound Constraint-Based Approach for Cost-Effective Privacy Preserving of Intermediate Data Sets in Cloud - This work proposes a novel upper-bound privacy leakage constraint-based approach to identify which intermediate data sets need to be encrypted and which do not, so that privacy-preserving cost can be saved while the privacy requirements of data holders can still be satisfied.

Evaluation results demonstrate that the privacy-preserving cost of intermediate data sets can be significantly reduced with this approach compared with existing ones in which all data sets are encrypted.

Taxonomies of Workflow Scheduling Problem and Techniques in the Cloud - This work proposes taxonomies of the cloud workflow scheduling problem and techniques based on an analytical review. It identifies and explains the aspects and classifications unique to workflow scheduling in the cloud environment in three categories, namely, scheduling process, task, and resource. Lastly, several scheduling techniques are reviewed and classified onto the proposed taxonomies. The authors hope that the taxonomies serve as a stepping stone for those entering this research area and for further development of scheduling techniques.

A Resource Co-Allocation Method for Load-Balance Scheduling over Big Data Platforms - The cloud computing scheme provides a model of utility computing, which is often implemented in the form of big data centers. As HPC applications are usually resource intensive, exploiting economies of scale to harmonize resource allocation among large-scale HPC applications under performance constraints poses a significant challenge for load-balance scheduling policies in big data platforms. In view of this challenge, this paper proposes a Resource Co-Allocation method, named RCA, for load-balance scheduling. Technically, the method proceeds in four steps that are enacted in a hierarchical control framework.

Cost and Energy Aware Scheduling Algorithm for Scientific Workflows with Deadline Constraint in Clouds - This work presents a cost and energy aware scheduling (CEAS) algorithm for cloud schedulers. The CEAS algorithm consists of five sub-algorithms. First, a VM selection algorithm applies the concept of cost utility to map tasks to their optimal virtual machine (VM) types under the sub-makespan constraint. Then, two task merging methods are employed to reduce the cost and energy consumption of the workflow.

Further, a VM reuse policy is proposed to reuse idle VM instances that have already been leased.

Finally, a slack time reclamation scheme is used to save energy on leased VM instances. According to the time complexity analysis, the time complexity of each sub-algorithm is polynomial.

Multi-Objective Workflow Scheduling in Cloud System Based on Cooperative Multi-Swarm Optimization Algorithm - To improve the performance of multi-objective workflow scheduling in cloud systems, a multi-swarm multi-objective optimization algorithm (MSMOOA) is proposed to satisfy multiple conflicting objectives.

Inspired by the division of the same species into multiple swarms for different objectives and by information sharing among these swarms in nature, each physical machine in the data center is considered a swarm and employs improved multi-objective particle swarm optimization to find non-dominated solutions with one objective in MSMOOA. The particles in each swarm are divided into two classes that adopt different strategies to evolve cooperatively.

One class of particles can communicate with several swarms simultaneously to promote information sharing among swarms, while the other class can only exchange information with particles located in the same swarm. Furthermore, to avoid the influence of elastic available resources, a manager server is adopted in the cloud data center to collect the available resources for scheduling.
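The notion of a non-dominated solution used above can be illustrated with a short sketch. It assumes each candidate schedule is summarized by a tuple of objectives to be minimized, for example (time, cost); the function names and the sample values are illustrative and do not come from MSMOOA itself.

```python
def dominates(a, b):
    """True if `a` is no worse than `b` on every objective and better on one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    better = any(x < y for x, y in zip(a, b))
    return no_worse and better


def non_dominated(points):
    """Keep only the points that no other point dominates (the Pareto front)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (time, cost) pairs for five candidate schedules.
schedules = [(10, 5), (8, 7), (12, 4), (9, 9), (8, 6)]
front = non_dominated(schedules)
```

Here (8, 7) and (9, 9) are dropped because another schedule is at least as good on both objectives; the remaining three are trade-offs among which a scheduler still has to choose.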

III. METHODOLOGIES

We are witnessing strong development of various technologies, leading to complex applications that can process Big Data sets and execute different experiments or tasks on distributed systems. These applications cover important aspects of everyday life (health, education, astronomy, research engineering, etc.) and are described by a number of interdependent tasks called workflows. Scientific workflows represent the automation of a scientific process in which tasks are organized based on their control and data dependencies.

In this work, we present energy-efficient scheduling using a Modified PCP (Partial Critical Path) algorithm to minimize energy consumption. We also present privacy-aware scheduling using the Elliptic Curve Cryptography (ECC) algorithm to maximize the privacy of the social environment.

3.1 Workflow Management System

This module implements the Workflow Management System (WfMS), which collects service details from several Cloud Service Providers (CSPs) together with the user requirements. The utility Grid model consists of a workflow management system and several CSPs, each of which provides some services to users. Users submit their workflows to the WfMS to be executed. The WfMS acts as a broker between users and CSPs: it retrieves the required information, schedules workflow tasks on suitable services, makes advance reservations of services, and finally dispatches tasks to the CSPs to be executed.

Each CSP has to register itself and its services so that it can present and sell its services to users.
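The broker role described above can be sketched as a small registry. This is a minimal illustration, not the system's actual interface: the class names `Service` and `WorkflowBroker`, their fields, and the sample providers are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Service:
    """A service offered by one provider (all fields are illustrative)."""
    provider: str
    name: str
    exec_time: float   # time units needed to run one task
    cost: float        # price charged for running one task


@dataclass
class WorkflowBroker:
    """Hypothetical WfMS registry: providers register, the scheduler queries."""
    services: list = field(default_factory=list)

    def register(self, service: Service) -> None:
        # Each CSP registers its services before they can be offered to users.
        self.services.append(service)

    def candidates(self, max_time: float) -> list:
        # Services fast enough for a task, cheapest first.
        fit = [s for s in self.services if s.exec_time <= max_time]
        return sorted(fit, key=lambda s: (s.cost, s.exec_time))


broker = WorkflowBroker()
broker.register(Service("csp_a", "small", exec_time=8.0, cost=1.0))
broker.register(Service("csp_a", "large", exec_time=3.0, cost=4.0))
broker.register(Service("csp_b", "medium", exec_time=5.0, cost=2.0))

names = [s.name for s in broker.candidates(max_time=6.0)]
```

With a time budget of 6.0 the "small" service is too slow, so the broker offers "medium" and then "large" in order of increasing cost.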

3.2 Deadline distribution phase

This module distributes the overall deadline of the workflow across individual tasks. First, the algorithm tries to assign sub-deadlines to all tasks on the (overall) critical path (CP) of the workflow such that the path completes before the user's deadline and its execution cost is minimized. The critical path of a workflow is the longest execution path in that workflow.

Then, it finds the partial critical path to each assigned task on the critical path and executes the same procedure recursively.
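As a rough illustration of this phase, the sketch below finds the critical path of a small workflow and splits a deadline across its tasks. The proportional split is a simplifying assumption chosen for the example; the text does not state which distribution policy the modified algorithm uses, and the recursive handling of partial critical paths is omitted. All names and the sample workflow are hypothetical.

```python
def critical_path(durations, parents):
    """Return the longest path (by total duration) through a workflow DAG.

    durations: {task: execution time}; parents: {task: [parent tasks]}.
    """
    finish = {}        # earliest finish time of each task
    best_parent = {}   # parent lying on the longest path into each task

    def visit(task):
        if task in finish:
            return finish[task]
        start, chosen = 0.0, None
        for p in parents.get(task, []):
            if visit(p) > start:
                start, chosen = finish[p], p
        finish[task] = start + durations[task]
        best_parent[task] = chosen
        return finish[task]

    for t in durations:
        visit(t)
    # Walk back from the task that finishes last.
    node = max(finish, key=finish.get)
    path = []
    while node is not None:
        path.append(node)
        node = best_parent[node]
    return path[::-1]


def sub_deadlines(path, durations, deadline):
    """Split `deadline` along `path` in proportion to task durations."""
    total = sum(durations[t] for t in path)
    elapsed, result = 0.0, {}
    for t in path:
        elapsed += durations[t]
        result[t] = deadline * elapsed / total
    return result


durations = {"a": 2.0, "b": 4.0, "c": 1.0, "d": 2.0}
parents = {"b": ["a"], "c": ["a"], "d": ["b", "c"]}
path = critical_path(durations, parents)
deadlines = sub_deadlines(path, durations, deadline=16.0)
```

In this example the critical path is a, b, d with a total duration of 8, so a deadline of 16 gives each task twice its duration as slack-inclusive budget. Task c is not on the critical path and would be handled by the recursive step on partial critical paths.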

3.3 Planning phase

This module is the planner, which selects the cheapest service for each task such that the task finishes before its sub-deadline. In the planning phase, we try to select the best service for each task of the workflow to create an optimized schedule that ends before the deadline and has the minimum overall cost.

In the deadline distribution phase, each task was assigned a sub-deadline. If we schedule each task such that it finishes before its sub-deadline, then the whole workflow will finish before the user's deadline. Our algorithm is based on a greedy strategy that tries to build an optimized global solution by making optimized local decisions. At each stage it selects a ready task, i.e., a task all of whose parents have already been scheduled, and assigns it to the cheapest service that can execute it before its sub-deadline.
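The greedy planning step can be sketched as follows. The sketch assumes that every service can run every task with the same execution time and cost, that the workflow is a DAG, and that sub-deadlines are already known; the function name `plan` and the sample data are illustrative.

```python
def plan(tasks, parents, sub_deadline, services):
    """Greedy planning: cheapest service that meets each task's sub-deadline.

    tasks: task names in any order; parents: {task: [parent tasks]};
    sub_deadline: {task: latest allowed finish time};
    services: {service name: (exec_time, cost)}, identical for all tasks here.
    Returns {task: (service, start, finish)}; raises if no service fits.
    """
    schedule = {}
    pending = list(tasks)
    while pending:
        # A ready task is one whose parents have all been scheduled.
        task = next(t for t in pending
                    if all(p in schedule for p in parents.get(t, [])))
        # The task can start once its last parent has finished.
        start = max((schedule[p][2] for p in parents.get(task, [])),
                    default=0.0)
        feasible = [(cost, name, start + time)
                    for name, (time, cost) in services.items()
                    if start + time <= sub_deadline[task]]
        if not feasible:
            raise ValueError(f"no service meets the sub-deadline of {task}")
        cost, name, finish = min(feasible)   # cheapest feasible service
        schedule[task] = (name, start, finish)
        pending.remove(task)
    return schedule


services = {"slow": (6.0, 1.0), "fast": (2.0, 5.0)}
parents = {"b": ["a"], "c": ["b"]}
sub_deadline = {"a": 6.0, "b": 9.0, "c": 15.0}
schedule = plan(["a", "b", "c"], parents, sub_deadline, services)
```

Tasks a and c fit on the cheap "slow" service, while b must use "fast" because starting at time 6 on "slow" would end at 12, past its sub-deadline of 9. Each choice is local, which is what makes the strategy greedy.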

3.4 Rescheduling the workflow

Rescheduling is a common method for continuing workflow execution after a failure. Fortunately, the PCP algorithm can easily reschedule the workflow by calling the Planning procedure. If a CSP fails to execute a task on time, the PCP algorithm can call a modified version of the Planning procedure to reschedule the successors of the failed task on faster services. First, it tries to reschedule the immediate successors of the failed task on services that can finish them before the actual start time of their children. The procedure stops for the successors that are rescheduled successfully; the unsuccessful ones are scheduled on the fastest available services (similar to (7)), and the procedure continues with their immediate successors. A similar process can be applied to an entirely failed task, except that the failed task itself must be rescheduled too.
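The rescheduling procedure for a task that finishes late can be sketched as below. It reuses the simplified model in which every service has a fixed execution time and cost for all tasks; the function name `reschedule`, the `deadline` parameter used as the limit for tasks without children, and the sample schedule are assumptions made for the illustration.

```python
from collections import deque


def reschedule(late_task, actual_finish, schedule, parents, services, deadline):
    """Repair `schedule` in place after `late_task` ended at `actual_finish`.

    schedule: {task: (service, start, finish)}; parents: {task: [parents]};
    services: {name: (exec_time, cost)}. Returns the tasks that were moved.
    """
    children = {}
    for task, ps in parents.items():
        for p in ps:
            children.setdefault(p, []).append(task)

    service, start, _ = schedule[late_task]
    schedule[late_task] = (service, start, actual_finish)
    fastest = min(services, key=lambda name: services[name][0])

    moved, queue = [], deque(children.get(late_task, []))
    while queue:
        task = queue.popleft()
        start = max(schedule[p][2] for p in parents[task])
        # A successor must finish before its own children are due to start.
        limit = min((schedule[c][1] for c in children.get(task, [])),
                    default=deadline)
        fits = [(cost, name) for name, (time, cost) in services.items()
                if start + time <= limit]
        if fits:
            name = min(fits)[1]      # cheapest service that absorbs the delay
        else:
            name = fastest           # limit the delay and pass it downstream
            queue.extend(children.get(task, []))
        schedule[task] = (name, start, start + services[name][0])
        moved.append(task)
    return moved


services = {"slow": (6.0, 1.0), "fast": (2.0, 5.0)}
parents = {"b": ["a"], "c": ["b"]}
schedule = {"a": ("slow", 0.0, 6.0),
            "b": ("fast", 6.0, 8.0),
            "c": ("slow", 8.0, 14.0)}
moved = reschedule("a", 9.0, schedule, parents, services, deadline=15.0)
```

Task a ends at 9 instead of 6, so b cannot finish before c's planned start and is placed on the fastest service; the delay then reaches c, which absorbs it by switching to "fast" and finishing at 13, within the deadline of 15.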

3.5 Privacy aware Scheduling

This work aims to improve cloud computing within cloud organizations through encryption awareness based on Elliptic Curve Cryptography.

To secure data, most systems use a combination of techniques, including:

1. Encryption, which means they use a complex algorithm to encode information. To decode the encrypted files, a user needs an encryption key. While it’s possible to crack encrypted information, most hackers don’t have access to the amount of computer processing power they would need to decrypt information.

2. Authentication processes, which require creating a user name and password.

3. Authorization practices, in which the client lists the people who are authorized to access information stored on the cloud system.
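The text does not specify how ECC keys are generated or exchanged, so the following is only a toy illustration of the underlying mathematics: an elliptic-curve Diffie-Hellman key agreement on a deliberately tiny textbook curve, y^2 = x^3 + 2x + 2 over the integers modulo 17 with base point (5, 1) of order 19. The curve, the secret values, and the key derivation step are assumptions chosen for readability and offer no security; a real deployment would rely on a vetted cryptographic library and a standard curve.

```python
import hashlib

P = 17            # field prime (toy size, NOT secure)
A, B = 2, 2       # curve y^2 = x^3 + A*x + B over the integers mod P
G = (5, 1)        # base point, which has order 19 on this curve


def add(p1, p2):
    """Add two curve points; None represents the point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return None                               # p2 is the inverse of p1
    if p1 == p2:
        lam = (3 * x1 * x1 + A) * pow(2 * y1 % P, -1, P) % P   # tangent slope
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P        # chord slope
    x3 = (lam * lam - x1 - x2) % P
    return x3, (lam * (x1 - x3) - y1) % P


def multiply(k, point):
    """Double-and-add scalar multiplication k * point."""
    result = None
    while k:
        if k & 1:
            result = add(result, point)
        point = add(point, point)
        k >>= 1
    return result


# Each party keeps a secret scalar and publishes secret * G.
alice_secret, bob_secret = 3, 7
alice_public = multiply(alice_secret, G)
bob_public = multiply(bob_secret, G)

# Both sides arrive at the same shared point without revealing their secrets.
shared_alice = multiply(alice_secret, bob_public)
shared_bob = multiply(bob_secret, alice_public)

# Illustrative key derivation: hash the shared point into a symmetric key.
key = hashlib.sha256(f"{shared_alice[0]},{shared_alice[1]}".encode()).hexdigest()
```

The shared key could then encrypt task data before it is dispatched to a provider, which is one plausible way for encryption to enter the scheduling pipeline described in this section.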

IV. RESULTS

In this section, the overall setup of our experiment and the results obtained from it are described to validate the proposed combination of Modified PCP and ECC as the Hybrid Scheduling Algorithm (HSA). In our experiment, two well-known workflow applications, MMOPSO and the Multi-Objective Privacy-Aware workflow scheduling algorithm (MOPA), are chosen as test cases.

4.1 Scheduling time

4.2 Scheduling Cost

V. CONCLUSION

In this work, we investigate the current solutions for managing workflow applications in clouds and present energy-efficient scheduling using a Modified PCP (Partial Critical Path) algorithm to minimize energy consumption. We also present privacy-aware scheduling using the Elliptic Curve Cryptography (ECC) algorithm to maximize the privacy of the social environment. The proposed HSA can find better non-dominated solutions effectively, as demonstrated by experiments. The experimental results show that the HSA algorithm achieves better solutions than the other algorithms.

In future work, we plan to investigate privacy-aware efficient scheduling of intermediate data sets in the cloud by taking privacy preservation as a metric together with other metrics such as storage and computation. Optimized balanced scheduling strategies are expected to be developed toward overall highly efficient privacy-aware data set scheduling.

REFERENCES

[1] W.C. Dou, X.Y. Zhang, J.X. Liu, J.J. Chen, Hiresome-II: towards privacy-aware cross-cloud service composition for big data applications, IEEE Trans. Parallel Distrib. Syst. 26 (2) (2015) 455–466.

[2] X.Y. Zhang, C. Liu, S. Nepal, S. Pandey, J.J. Chen, A privacy leakage upper-bound constraint based approach for cost-effective privacy preserving of intermediate datasets in cloud, IEEE Trans. Parallel Distrib. Syst. 24 (6) (2013) 1192–1202.

[3] T. Shah, A. Yavari, K. Mitra, S. Saguna, P.P. Jayaraman, F. Rabhi, R. Ranjan, Remote healthcare cyber-physical-system: quality of service challenges and opportunities, IET Cyber-Phys. Syst. Theory Appl. (2016) 1–9.

[4] L.Z. Wang, R. Ranjan, Processing distributed internet of things data in clouds, IEEE Cloud Comput. 2 (1) (2015) 76–80.

[5] M. Masdari, S. ValiKardan, Z. Shahi, S.I. Azar, Towards workflow scheduling in cloud computing: a comprehensive analysis, J. Netw. Comput. Appl. 66 (2016) 64–82.

[6] S. Smanchat, K. Viriyapant, Taxonomies of workflow scheduling problem and techniques in the cloud, Future Gener. Comput. Syst. 52 (2015) 1–12.

[7] E. Alkhank, S. Lee, S. Khan, Cost-aware challenges for workflow scheduling approaches in cloud computing environments: taxonomy and opportunities, Future Gener. Comput. Syst. 50 (9) (2016) 3–21.

[8] W.C. Dou, X.L. Xu, X. Liu, T.Y. Laurence, Y.P. Wen, A resource co-allocation method for load-balance scheduling over big data platforms, Future Gener.

[9] Z.J. Li, J.D. Ge, H.Y. Hu, W. Song, H. Hu, B. Luo, Cost and energy aware scheduling algorithm for scientific workflows with deadline constraint in clouds, IEEE Trans. Serv. Comput. (2018) in press. Bookmark: http://doi.ieeecomputersociety.org/10.1109/TSC.2015.2466545.

[10] Z.M. Zhu, G.X. Zhang, M.Q. Li, X.H. Liu, Evolutionary multi-objective workflow scheduling in cloud, IEEE Trans. Parallel Distrib. Syst. 27 (5) (2016) 1344–1357.

[11] S. Yassa, R. Chelouah, H. Kadima, B. Granado, Multi-objective approach for energy-aware workflow scheduling in cloud computing environments, Sci. World J. (2013) 1–13.

[12] Z.J. Wu, X. Liu, Z.W. Ni, D. Yuan, Y. Yang, A market-oriented hierarchical scheduling strategy in cloud workflow systems, J. Supercomput. 63 (1) (2013) 256–293.

[13] J.J. Durillo, R. Prodan, Multi-objective workflow scheduling in Amazon EC2, Cluster Comput. 17 (2) (2014) 169–189.

[14] M. Rahman, R. Hassan, R. Ranjan, R. Buyya, Adaptive workflow scheduling for dynamic grid and cloud computing environment, Concurr. Comput.: Pract. Exper. 25 (13) (2013) 1816–1842.

[15] I.C. Lopez, J. Taheri, R. Ranjan, L. Wang, A.Y. Zomaya, A balanced scheduler with data reuse and replication for scientific workflows in cloud computing systems, Future Gener. Comput. Syst. 74 (2017) 168–178.

[16] J.J. Durillo, R. Prodan, J.G. Barbosa, Pareto tradeoff scheduling of workflows on federated commercial clouds, Simul. Model. Pract. Theory 58 (2015) 95–111.

[17] Z.J. Li, J.D. Ge, C.Y. Li, H.J. Yang, H.Y. Hu, B. Luo, V. Chang, Energy cost minimization with job security guarantee in internet data center, Future Gener. Comput. Syst. 73 (2017) 63–78.

[18] S. Song, K. Hwang, Y.K. Kwok, Risk-resilient heuristics and genetic algorithms for security-assured grid job scheduling, IEEE Trans. Comput. 55 (6) (2006) 703–719.

[19] T. Xie, X. Qin, Scheduling security-critical real-time applications on clusters, IEEE Trans. Comput. 55 (7) (2006) 864–879.

[20] R. Kashyap, D. Vidyarth, Security driven scheduling model for computational grid using NSGA-II, J. Grid Comput. 11 (2013) 721–734.

[21] X. Tang, K. Li, Z. Zeng, B. Veeravalli, A novel security-driven scheduling algorithm for precedence constrained tasks in heterogeneous distributed systems, Trans. Comput. 60 (7) (2011) 1017–1029.

[22] L.F. Zeng, B. Veeravalli, X.R. Li, SABA: a security-aware and budget-aware workflow scheduling strategy in clouds, J. Parallel Distrib. Comput. 75 (1) (2015) 141–151.

[23] Z.J. Li, J.D. Ge, H.J. Yang, L.G. Huang, H.Y. Hu, H. Hu, B. Luo, A security and cost aware scheduling algorithm for heterogeneous tasks of scientific workflow in clouds, Future Gener. Comput. Syst. 65 (2016) 140–152.

[24] M. Ali, S.U. Khan, A.V. Vasilakos, Security in cloud computing: opportunities and challenges, Inform. Sci. 305 (2015) 357–383.

[25] S.A. Hussain, M. Fatima, A. Saeed, I. Raza, R.K. Shahzad, Multilevel classification of security concerns in cloud computing, Appl. Comput. Inform. 13 (1) (2017) 57–65.


