Computer Engineering Department
Research Profile
Dr. Sadiq M. Sait
Computer Engineering Department
King Fahd University of Petroleum & Minerals
Computer Engineering Faculty
20 Professorial Rank faculty members
• 2 Full Professors
• 2 Associate Professors
• 16 Assistant Professors
6 lecturers
COE Research Areas
Data Communications & Computer Networks.
Computer Applications: Robotics, Interfacing, Data
acquisition, Machine learning, Data Mining.
Digital Design Automation & VLSI System Design &
Test.
Computer Architecture & Parallel Processing.
Computer Arithmetic & Cryptography.
COE Recent Research Projects: Data Communications & Computer Networks
Wireless Multi-hop Voice over IP over Wi-Fi using
Client-Server UDP.
Mobile Patient using sensor network.
Wireless Local Area Networks Integration for Mobile
Networks Operators.
E-Tourism Promoter – An Internet Assisted Location
Tracker and Map Reader for Tourists.
A Framework for Integration of Web-based Network
Management and Management by Delegation.
Radio Resource Management and QoS Control for
Wireless Integrated Services Networks.
Adaptive TCP Mechanisms for Wireless Networks.
Engineering Modern Iterative Heuristics to Solve Hard
Computer Network Design Problems.
Objectives
Setting up and implementing a wireless mobile ad hoc, infrastructure-less environment.
Imitating the cellular network topology (Virtual Base Stations).
Energy-aware protocols.
Maximizing the number of hops within the 200 ms constraint.
IP telephony (protocols, basic and advanced services).
Reconfiguring IEEE 802.11 wireless cards.
Implementing the H.323 protocol in a MANET.
Sending data and voice over UDP/IP.
Wireless Multi-hop Voice over IP
[Figure: wireless multi-hop voice path from source to destination through intermediate nodes. Labels: input voice, compression, decompression, output voice, H.323 service, energy-aware gateways, "≥ 200 ms".]
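The objective of maximizing the number of hops within the 200 ms constraint can be illustrated with a simple delay budget. The sketch below assumes the end-to-end delay splits into a fixed endpoint cost (compression, decompression, buffering) plus an identical cost per hop; the function name and the numeric values in the usage note are hypothetical, and only the 200 ms figure comes from the slides.

```python
def max_hops(budget_ms, endpoint_ms, per_hop_ms):
    """Largest hop count whose total one-way delay fits the budget.

    budget_ms   -- end-to-end delay budget (200 ms on the slides)
    endpoint_ms -- assumed fixed delay at source and destination
    per_hop_ms  -- assumed forwarding delay added by each wireless hop
    """
    if per_hop_ms <= 0:
        raise ValueError("per_hop_ms must be positive")
    remaining = budget_ms - endpoint_ms
    if remaining < 0:
        return 0  # the endpoints alone already exceed the budget
    return int(remaining // per_hop_ms)
```

For example, with an assumed 40 ms spent at the endpoints and 20 ms per hop, `max_hops(200, 40, 20)` returns 8. Real per-hop delays vary with contention and routing, so this is only a first-order estimate.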
Mobile Patient
Objectives
• Introduce mobile health as the future of medicine.
• Cost-effective solution.
• Facilitate the use of both medical sensors and wireless mobile networks in health applications.
• Build a sensor network to monitor patients effectively:
  • Provide doctors with easy access to the database
  • Provide immediate help to patients
Component features:
• Compatible with most existing microcontrollers
• Uses 3-wire serial interface (SPI)
• Low voltage/low power consumption
• SPI programmable

• Full TCP/IP v4/v6 support
• 10/100 Base-T Ethernet MAC
• Built-in OS
• Three serial ports
• Java programmable
• Built-in web server and FTP

[Figure label: MS5536]
Wireless Local Area Networks Integration for Mobile Networks Operators
Motivation
• Address capacity requirements in “hotspot” areas.
• Provide seamless service continuity.
Objective
• Integrate WLANs with 3G wireless data networks, leading to hybrid mobile data networks.
• Describe possible architectures and integration solutions relevant to existing and future Saudi Telecom Company (STC) wireless networks.
• Present a typical deployment scenario of a WLAN in an STC wireless network.
• Specify required network elements.
• Provide corresponding commercially available products.
WLAN Integration: Methodology & Planned Deliverables
Survey and classify existing solutions
Identify most suitable solution or introduce
new solution
Tailor the solution to suit the particular local
environment and network
Phases of Project
1. Survey Solutions
2. Analyze & Evaluate Solutions
3. Pick Best Solution (novel/existing)
4. Apply to Case Study
• Case study: Provide a typical deployment scenario in King Fahd Airport in Dammam
• Specify required network elements and the corresponding commercially available products
Example of WLAN Integration
[Figure: a WLAN integrated into a UMTS network. Labeled elements: UMTS IP backbone, 3G SGSN, 3G GGSN, RNC, HLR, four Node Bs, WLAN, IWU, AAA, Wireless ISP server.]
E-Tourism Promoter – An Internet Assisted Location Tracker and Map Reader for Tourists
Motivation
• Vehicle tracking and path identification
• E-directory (database) for public service areas and centers of interest (hotels, hospitals, police stations, etc.)
• Encouragement and promotion of tourism
Functionality
• Advanced Arabic/English-based graphical user interface
• Real-time display of information and manipulation of the map
• Communication with infrastructure networks (PSTN or GSM/GPRS) using email/SMS/voice
• Possible extensions: service/product advertisements, shortest path instructions, etc.
E-Tourism: Specification, Methodology and Planned Product
• Integrated database system to position
on the map and query service centers and areas of interest
• Minimum requirement is to identify
these facilities of interest and display their attributes (phone #, email, website, etc.)
• Communication using PSTN, mobile
GSM network, or GPRS/internet
• Additional features may include
identification of shortest path
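The shortest-path feature mentioned above could be prototyped with Dijkstra's algorithm over a road graph. This is a minimal sketch under stated assumptions: the slides do not say which algorithm the project uses, and the `ROADS` map, its place names, and its distances are hypothetical.

```python
import heapq

def shortest_path(graph, start, goal):
    """Dijkstra's algorithm on a dict-of-dicts graph with non-negative edge lengths.

    Returns (path, length), or (None, inf) when the goal is unreachable.
    """
    dist = {start: 0.0}
    prev = {}
    queue = [(0.0, start)]
    while queue:
        d, node = heapq.heappop(queue)
        if node == goal:
            break  # the first time the goal is popped its distance is final
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry
        for neighbor, length in graph.get(node, {}).items():
            nd = d + length
            if nd < dist.get(neighbor, float("inf")):
                dist[neighbor] = nd
                prev[neighbor] = node
                heapq.heappush(queue, (nd, neighbor))
    if goal not in dist:
        return None, float("inf")
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    path.reverse()
    return path, dist[goal]

# Hypothetical road map: distances in km between points of interest.
ROADS = {
    "hotel": {"junction": 2.0, "hospital": 7.0},
    "junction": {"hotel": 2.0, "hospital": 3.0, "police": 6.0},
    "hospital": {"hotel": 7.0, "junction": 3.0, "police": 1.5},
    "police": {"junction": 6.0, "hospital": 1.5},
}
```

With this map, `shortest_path(ROADS, "hotel", "police")` returns the route hotel, junction, hospital, police with a length of 6.5.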
Project Architecture
• Development: Java and .NET framework
• GPS communication system
• Map application
• Bluetooth serial port support (SSP)
• GSM/GPRS modem support
• An integrated software/hardware
A Framework for Integration of Web-based Network Management and Management by Delegation
Network Management is mainly based on a centralized
architecture. This causes the manager and its segment to
become a bottleneck. The most widely used protocol is SNMP, which lacks flexibility and efficiency.
Objective: Develop an
XML-based network management system using JPVM to
dynamically distribute the management load across multiple
XML/SNMP gateways.
XML provides a more flexible and standard representation and exchange of data. Load balancing techniques provide more efficient data processing. How can these techniques improve existing network management systems?
Approach: Integration of XML and Different Load Balancing Techniques
- A master gateway distributes the management load across multiple XML/SNMP gateways.
- A gateway translates XML to SNMP and SNMP to XML.
- XML-based network management approaches using Java Parallel Virtual Machine (JPVM):
  - Dynamic load balancing
  - Adaptive load balancing
  - Static weighted load balancing
  - Equal work non-weighted load balancing
Achievements:
Standard representation and exchange of data
Efficient distribution of tasks: adaptive and dynamic
Delegation of tasks to other gateways
Increased processing efficiency of management data
Decreased communication time between the manager and the agents
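Two of the load balancing schemes named above can be sketched in a few lines. This is an illustrative Python sketch, not the project's JPVM implementation; the function names, gateway names, and weights are hypothetical. The equal-work scheme deals agents out round-robin, while the static weighted scheme gives each gateway a share proportional to a fixed weight.

```python
def equal_split(agents, gateways):
    """Equal work, non-weighted: deal agents out to gateways round-robin."""
    plan = {g: [] for g in gateways}
    for i, agent in enumerate(agents):
        plan[gateways[i % len(gateways)]].append(agent)
    return plan

def weighted_split(agents, weights):
    """Static weighted: shares are proportional to fixed gateway weights.

    weights -- dict mapping gateway name to a positive capacity weight.
    """
    plan = {g: [] for g in weights}
    for agent in agents:
        # Pick the gateway whose load/weight ratio stays lowest after the assignment.
        target = min(weights, key=lambda g: (len(plan[g]) + 1) / weights[g])
        plan[target].append(agent)
    return plan
```

With six agents and assumed weights `{"gw1": 2, "gw2": 1}`, `weighted_split` assigns four agents to `gw1` and two to `gw2`. A dynamic or adaptive scheme would instead update the weights from measured load at run time.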
Proposed Work related to Intel’s R&D
The project is being extended through the investigation of the following issues:
• New load balancing approaches that adapt dynamically as a function of the network load
• Reliable and fault-tolerant network management
• Hierarchical network management
Adaptive, Distributed, and Reliable XML-based Network Management can prove useful in areas such as:
• Automated and remote provisioning techniques
• Remote and reliable operations such as secure reset and power cycling
• Remote and distributed network management applications such as monitoring, control, topology discovery, and performance evaluation
Computer Applications: Embedded Systems,
Machine Learning, Robotics
COE Recent Research Projects: Computer Applications
Design of a wireless safety system for smart
kitchen.
Predicting log properties from seismic data using
abductive networks.
Design of an Intelligent Telerobotic System.
Designing and building a mobile emergency warning
system for patients under health care.
Context aware energy management system.
Designing and Implementing a Safety & Health Check System for Home Environment
Motivation
• Prevent children's accidents.
• Keep a record of children's encounters: access to hazardous appliances.
• Provide immediate help for children when needed.
• Utilize advances in Web technology to keep an eye on children.
• Enhance the level of safety at home.
Objectives
• Introduce a safety system for the kitchen environment.
• Cost-effective solution.
• Inform the child's parents of the child's status in case of emergency.
• Build a sensor network to gather real-time information on events in the kitchen:
  • Detection of hazards and generation of alarms
  • Identification of children to disable access to hazardous appliances and tools
Proposed Solution
Contributions:
• Remote live alert of hazardous gas.
• Low-cost smart kitchen for the safety of children.
• Controlled access to hazardous kitchen tools.
• Web-enabled solution for monitoring children in real time.
Predicting Log Properties from Seismic Data using Abductive Networks
Modeling well log parameters
in terms of seismic data gives
a more complete picture of
rock properties over a
reservoir.
A large number of seismic
features exist. Which ones are
relevant?
Objective: Use abductive
networks to select an optimum
subset of seismic attributes
and model rock porosity.
Approach: Self-Organizing Abductive (Polynomial) Networks
- Easier to train: self-organization
- Algorithm selects: significant inputs, function elements, connectivity, coefficients
- Automatic stopping criteria with complexity control
- More transparent models: analytical input-output relationships
Two-input polynomial element:
$y = w_0 + w_1 x_1 + w_2 x_2 + w_3 x_1^2 + w_4 x_2^2 + w_5 x_1 x_2 + w_6 x_1^3 + w_7 x_2^3$
Achievements:
Several porosity models at various degrees of complexity.
Accuracy comparable with previous neural network models obtained using much larger datasets.
Significant reduction in the number of input features needed.
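The polynomial above, with terms up to third order in two inputs, can be evaluated directly once its eight coefficients are known. The sketch below only evaluates one such element; the function name is hypothetical, and the training step that selects inputs and fits the coefficients is not shown.

```python
def double_element(x1, x2, w):
    """Evaluate y = w0 + w1*x1 + w2*x2 + w3*x1^2 + w4*x2^2 + w5*x1*x2 + w6*x1^3 + w7*x2^3."""
    if len(w) != 8:
        raise ValueError("expected eight coefficients w0..w7")
    w0, w1, w2, w3, w4, w5, w6, w7 = w
    return (w0 + w1 * x1 + w2 * x2
            + w3 * x1 ** 2 + w4 * x2 ** 2 + w5 * x1 * x2
            + w6 * x1 ** 3 + w7 * x2 ** 3)
```

For instance, with all coefficients set to 1, `double_element(1, 2, [1] * 8)` evaluates to 1 + 1 + 2 + 1 + 4 + 2 + 1 + 8 = 20. In a network, the outputs of such elements feed the inputs of elements in later layers.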
Proposed Work related to Intel’s R&D
COE has wide experience in abductive network modeling for science, engineering, medical informatics, and the environment.
A project is being initiated for FPGA realization of abductive networks.
VLSI implementations of abductive models should prove useful in areas such as:
• Intelligent processing in communication networks
• Intelligent health care monitoring and control
• Environmental and weather monitoring and forecasting
• Inferential monitoring and control of industrial processes
• Predictive maintenance for machinery
TELEROBOTICS FOR OIL EXPLORATION AND DRILLING
PROPOSED RESEARCH
Supervisory and automatic inspection in oil exploration and drilling (OED).
Develop standards in computer and software architectures for telerobotics.
Handle real-world and communication uncertainties and respond to expected and unexpected events.
Develop a universal master workstation and application-oriented slave robots.
Develop computer-aided telerobotic tools to promote man-machine interfacing and quality of telerobotic work.
Use of inexpensive, light, easily maintainable telerobotic systems.
Standards for hardware and software architectures
New processor architecture (effective multithreading)
• Multithreading and multi-streaming
• Interfacing: sensing (mobility, video, force, etc.) and 3D visualization
• Communication: real-time wireless networking
Software architecture (uncertainties in task and communication)
• Reactivity: supervisory and linguistic control, supervised autonomy, shared control, cooperative and collaborative control
• Advanced real-time motion coordination and mobility
• Task planning: graphical modeling/simulation using VR and augmented reality
• Reliability and exception handling: agent-based reactive behavior using multi-sensor fusion
Applications
Scaled telerobotics: a manned station teleoperating (wired or wireless) a scaled slave robot for remote operations and routine maintenance in inaccessible areas such as tubes, pipes, equipment, wells, drilling holes, etc.
Hazardous: a manned station teleoperating a stationary or mobile robot (vehicle) for remote operations and inspection in harsh environments such as high temperature, high pressure, poisonous (gas or other), high pollution, underground, underwater, etc.
Tight safety: a manned station teleoperating a slave robot carrying out security patrol tasks, disposal of dangerous material, rescue, fire fighting and clearance, oil platform inspection and repair, and operation in emergency cases such as surveillance and reconnaissance.
Warning and Monitoring Medical System: Design and Implementation
Motivation
• The remote system ensures high-quality service to patients.
• Keep a record of patient encounters.
• Improve information efficiency and manageability of knowledge sources.
• Overcome nurse negligence.
• Provide better health care and medication support.
Objectives
• Introduce mobile health as the future of medicine.
• Cost-effective solution.
• Facilitate improved productivity using mobile health solutions.
• Build a sensor network to monitor patients effectively:
  • Provide doctors with easy access to the database
  • Provide immediate help to patients
Proposed Solution
[Figure: blood pressure sensor circuit.]
Digital Design Automation & VLSI System Design & Test
COE Recent Research Projects: Design Automation & VLSI System Design & Test
Iterative Heuristics for Timing & Low Power VLSI
Standard Cell Placement.
Parallelization of Iterative Heuristics for Low Power
VLSI Standard Cell Placement.
Efficient Test Relaxation Based Static Test Compaction
Techniques for Combinational and Sequential Circuits.
Efficient Test Data Compression Techniques for
Testing Systems-on-Chip.
Segmented Addressable Scan Architecture for
Effective Test Data Compression.
Development of Digital Circuit Techniques for Clock
Recovery and Data Re-Timing for High Speed NRZ Source-Synchronous Serial Data Communications.
Fast context switching configurable architectures
supporting dynamic reconfiguration for computation intensive applications.
Development of Integrated Micro-electronic Heavy
Metal Sensors for Environmental Applications.
Multi-objective Finite State Machine Encoding using
Non-Deterministic Evolutionary Algorithms targeting area, low power and testability.
Design and Implementation of Scalable Interconnect
Efficient LDPC Error Correcting Codes.
Parallelizing Non-Deterministic Iterative Heuristics to Solve VLSI CAD Problems
CAD Problems such as Floorplanning, Placement,
Routing, Scheduling, etc., require an enormous amount of computation time.
Iterative Heuristics such as Genetic Algorithms, Tabu
Search, Simulated Evolution, and others have been
found effective in solving several NP-hard optimization problems.
Objective: To use a cluster of PCs to solve
multi-objective VLSI CAD problems in order to improve quality and reduce run-time.
Approach: To employ a Cluster of PCs to Distribute Computationally Intensive Tasks
Clusters of low end PCs are easy to build.
Tools such as MPI and PVM are available for message
passing.
Tools such as gprof, Intel’s VTUNE Performance
Analyzer, etc., are used for generating profiles for
serial codes and determining the part of the code that has the bottlenecks.
Iterative algorithms are non-deterministic, and dividing
work load, i.e. partitioning the search space, is a challenge.
The parallelizing model (i.e., Partitioning,
Communication, Agglomeration and Mapping) is very well-defined for numerical problems, which are mostly deterministic. This is not the case for Iterative
heuristics, which are non-deterministic.
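One simple way around the partitioning difficulty is a multi-start scheme: each worker runs an independent randomized search from its own seed, and the best result is kept. The sketch below is illustrative only. It uses a plain stochastic swap descent in place of the heuristics named in this deck, Python threads in place of cluster nodes and MPI, and a hypothetical eight-cell row placement instance.

```python
import random
from concurrent.futures import ThreadPoolExecutor

# Hypothetical toy instance: 8 cells placed in a row, connected by two-pin nets.
NUM_CELLS = 8
NETS = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 7), (0, 7)]

def wirelength(order):
    """Total net length when cell order[i] occupies slot i."""
    slot = {cell: i for i, cell in enumerate(order)}
    return sum(abs(slot[a] - slot[b]) for a, b in NETS)

def local_search(seed, iterations=2000):
    """Randomized swap descent starting from a seed-dependent placement."""
    rng = random.Random(seed)
    order = list(range(NUM_CELLS))
    rng.shuffle(order)
    cost = wirelength(order)
    for _ in range(iterations):
        i, j = rng.sample(range(NUM_CELLS), 2)
        order[i], order[j] = order[j], order[i]
        new_cost = wirelength(order)
        if new_cost <= cost:
            cost = new_cost  # keep the improving (or equal) swap
        else:
            order[i], order[j] = order[j], order[i]  # undo the swap
    return cost, order

def parallel_multistart(seeds):
    """Run one independent search per seed and return the best (cost, order)."""
    with ThreadPoolExecutor(max_workers=len(seeds)) as pool:
        results = list(pool.map(local_search, seeds))
    return min(results)
```

Calling `parallel_multistart([1, 2, 3, 4])` returns the cheapest placement found by the four searches. Because the searches share nothing, this model needs no communication until the final reduction; schemes that exchange solutions or split the search space require more care.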
Tools used in our Current Cluster
MPICH Library provides a flexible implementation of MPI
for easier message-passing interface development on multiple network architectures.
Intel® Trace Collector 5.0 applies event-based tracing in
cluster applications with a low-overhead library. Offers performance data, recording of statistics, multi-threaded traces, and automatic instrumentation of binaries on IA-32.
Intel® Trace Analyzer 4.0 provides visual analysis of
application activities gathered by the Intel Trace Collector.
TotalView (MPICH) is also used for observing
communication between processors.
Condor is also used (for scheduling jobs on the cluster).
Relationship to Intel’s R&D
COE Department has faculty experienced in VLSI
Design.
Two books in the area of iterative algorithms and VLSI
Design have been authored by the department faculty.
The Technology Center being proposed in RI will have
state-of-the-art tools and equipment.
Faculty and students currently interested in HPC and
parallelization of heuristics can work together to address industrial and real-world problems.
Efficient Test Compaction & Compression Techniques for Comb. & Seq. Circuits
SOC Testing Challenges
• Reduce amount of test data.
• Reduce time a defective chip spends on a tester.
Test Compaction & Compression
• Reduce the size of a test set as much as possible.
Test vector reordering for combinational circuits.
• Steepen the curve of fault coverage vs. number of test vectors.
Efficient Test Compaction & Compression
Techniques for Comb. & Seq. Circuits
Efficient Test Relaxation for Combinational & Sequential circuits
• Enabling technology for test compaction & compression
• Test power reduction
Developed efficient test compaction techniques based on test relaxation.
Test Vector Decomposition
• Maximizes test compaction by vector clustering techniques
• Maximizes test width-based compression techniques.
Example of test relaxation (X = unspecified bit):
Fully specified test set: 110011001 011000110 000110011 101111100 000010001
Relaxed test set: 1X0XX100X X11XX0X10 0XXXX0XX1 XXXX111XX X0X01XXXX
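Why relaxation enables compaction can be seen in a minimal sketch: two relaxed vectors can be merged when no bit position holds opposite specified values. The greedy merging pass below is an illustrative assumption, not the department's published technique, and the function names are hypothetical.

```python
def compatible(a, b):
    """Two test cubes are compatible if no position holds opposite specified bits."""
    return all(x == y or "X" in (x, y) for x, y in zip(a, b))

def merge(a, b):
    """Merge two compatible cubes, keeping the specified bit at each position."""
    return "".join(y if x == "X" else x for x, y in zip(a, b))

def greedy_compact(cubes):
    """Fold each cube into the first compatible cube already kept.

    Greedy and order dependent: an illustration only, not an optimal compactor.
    """
    kept = []
    for cube in cubes:
        for i, k in enumerate(kept):
            if compatible(k, cube):
                kept[i] = merge(k, cube)
                break
        else:
            kept.append(cube)
    return kept

# Relaxed vectors taken from the slide's example.
relaxed = ["1X0XX100X", "X11XX0X10", "0XXXX0XX1", "XXXX111XX", "X0X01XXXX"]
compacted = greedy_compact(relaxed)
print(compacted)
```

On the slide's five relaxed vectors this pass merges the first and fifth cubes, leaving four vectors. The fully specified originals are pairwise distinct, so none of them could be merged before relaxation.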
Segmented Addressable Scan: Scan Test Challenges
Test data volume challenge
• Limited IOs & unlimited increase in transistors
• Exponential increase in test data volume
Tester pin count challenge
• Tester cost is almost linear in number of pins
Test time challenge
• Critical path
• Hard to parallelize test loading massively
Test power challenge
• High activity leading to high power consumption.
[Figure: Segmented Addressable Scan architecture. Labels: Tester Channel or Input Decompressor, Segment(s) Address, SAS Decoder, Clock Tree, Segment 1, Segment 2, ..., Segment M, Output Compressor.]
Segmented Addressable Scan
Aggressive parallelization of scan chains
Reconfigurable partial compatibilities
Special SAS decoder
Overhead: few gates per scan chain
Data volume: 10x ~ 20x compression with small designs for both SAF and TDF
Bigger designs have higher compression
Pin count: 2 log2(S) + 1 pins, can be reduced to only 2
Test time: reduced through aggressive parallelization
Power consumption: selective
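The pin-count expression can be evaluated for the segment counts reported in the delay-test results. This is a small sketch that assumes S denotes the number of segments and is a power of two; the slides do not define S explicitly, and the function name is hypothetical.

```python
def sas_pin_count(num_segments):
    """Evaluate the slide's expression 2*log2(S) + 1.

    Assumption: S is the number of scan segments and is a power of two.
    """
    if num_segments < 1 or num_segments & (num_segments - 1):
        raise ValueError("S is assumed to be a power of two")
    log2_s = num_segments.bit_length() - 1   # exact log2 for powers of two
    return 2 * log2_s + 1

pins = {s: sas_pin_count(s) for s in (32, 64, 128, 256)}
print(pins)
```

Under that reading, each doubling of the segment count adds two pins.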
Test Data Volume & Test Time (Delay test)
Total data volume: 98 Mb

Segments    SAS data volume    Compression ratio
32          7.7 Mb             12x
64          5.3 Mb             17x
128         4.5 Mb             22x
256         3.6 Mb             27x
Computer Architecture & Parallel Processing
COE Recent Research Projects: Computer Architecture & Parallel Processing
Load Balancing for Parallel Visualization of Blood Head
Vessel Angiography on Cluster of PCs.
Shared Channels in Interconnection Networks.
Study of modified Multistage Interconnection Networks for Networks-on-Chips.
Design of a Simulator for a Class of Dynamic
Execution Processors.
Beyond Instruction-Level Parallelism in Processor
Architecture.
Design and Performance Evaluation of a Distributed Crossbar Scheduler.
Software Pipelining for Reconfigurable Instruction Set
Processors.
Scalable Cache Memory Design for Large-Scale SMT Architectures
Scalable front end
• Multiple i-caches
• Scalable i-cache capacity
• Scalable i-cache bandwidth
One-level scalable and shareable data cache
• Split into multiple block-interleaved banks
• Each bank is single-ported and shared by all threads
• Parallel access to different banks through interconnect
• Complexity grows with number of ports and banks
[Figure: block diagram of the SMT organization. Each module contains PCs, I-cache, Decode & Rename, Int Q, FP Q, Registers & Bypass, and functional units (ALU, LS, FPU); an interconnect links the modules to D-cache banks and memory modules.]
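Block interleaving across single-ported banks can be sketched as follows. The block size, bank count, and function names are hypothetical values chosen for illustration; the slides do not give the actual parameters.

```python
BLOCK_BYTES = 64   # hypothetical cache block size
NUM_BANKS = 8      # hypothetical number of D-cache banks

def bank_of(addr):
    """Block-interleaved mapping: consecutive blocks go to consecutive banks."""
    return (addr // BLOCK_BYTES) % NUM_BANKS

def bank_conflicts(addresses):
    """Count same-cycle accesses that must wait because their single-ported
    bank is already serving another access."""
    busy = set()
    waits = 0
    for addr in addresses:
        bank = bank_of(addr)
        if bank in busy:
            waits += 1
        else:
            busy.add(bank)
    return waits

print(bank_of(0), bank_of(64), bank_of(512))
print(bank_conflicts([0, 64, 128]), bank_conflicts([0, 512, 64]))
```

Accesses to different banks proceed in parallel through the interconnect, while two accesses mapping to the same bank in one cycle are serialized.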
Modular and Scalable SMT (not a pure SMT)
Most hardware resources have limited thread sharing
• I-caches, Decode logic, Queues, Registers, FUs
Simulation and Performance
SPEC 2000 Simulation
• 8 simultaneous threads
Simulation Parameters
Issue/retirement width: 32 instructions / cycle
Scheduling Queue: 128 entries
Load-Store Queue: 64 entries
Other Resources:
• 24 simple ALUs
• 8 fully pipelined FPUs
• 4 cycle-latency for FP add and FP multiply
[Figure: Instructions Per Cycle (IPC), axis 0 to 24, for SPEC 2000 benchmarks 188.ammp, 183.equake, 177.mesa, 176.gcc, 197.parser, 255.vortex, 175.vpr, and 181.mcf. Series: Ideal Mem, Ideal Cache, Latency 3, Latency 5, Latency 7, Latency 9. Data labels shown on the chart: 23.07, 20.46, 20.19, 19.10, 18.39, 19.71.]
Conclusions
• Large-scale SMT can tolerate latencies
• Parallel D-cache banks improve capacity and bandwidth, but
increase hit latency
Related Publications
• Mudawar M. and Wani J., One-Level Cache Memory Design for Scalable SMT Architectures, in Proceedings of the 17th ISCA International Conference on Parallel and Distributed Computing Systems, September 15-17 2004, San Francisco, California.
• Mudawar M., Scalable Cache Memory Design for Large-Scale SMT
Architectures, ACM International Conference Proceedings Series, Vol 68; also in Proceedings of the 3rd Workshop on Memory Performance Issues: in conjunction with 31st IEEE/ACM International Symposium on Computer Architecture, June 20-23 2004, Munich, Germany.
Proposed work related to Intel’s R & D
Wide experience in processor simulation and evaluation of micro-architectures.
A project is being initiated for the automatic generation of simulators from the formal description of the instruction set architecture.
We are currently investigating
• A formal language for the concise description of an instruction set architecture.
• Automatic generation of a simulator from a formal description.
• Generation of an assembler from a formal description.
We are considering using this tool in
• Proposing new instruction set architectures for research and development.
• Education in related Computer Architecture courses.
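The idea of deriving a simulator from an instruction set description can be illustrated with a toy, table-driven sketch. The three-instruction ISA, its description format, and all names below are hypothetical; the project's actual formal language is not described in these slides.

```python
import operator

# Hypothetical ISA description: mnemonic -> (operand format, semantics).
# "rrr" = three registers, "rri" = two registers and an immediate.
ISA = {
    "ADD": ("rrr", operator.add),
    "SUB": ("rrr", operator.sub),
    "ADDI": ("rri", operator.add),
}

def make_simulator(isa, num_regs=8):
    """Build a simulator function purely from the ISA description table."""
    def run(program):
        regs = [0] * num_regs
        for line in program:
            mnemonic, *ops = line.replace(",", " ").split()
            fmt, semantics = isa[mnemonic]
            rd, rs = int(ops[0][1:]), int(ops[1][1:])
            # Third operand is a register or an immediate, per the format.
            src2 = regs[int(ops[2][1:])] if fmt[2] == "r" else int(ops[2])
            regs[rd] = semantics(regs[rs], src2)
        return regs
    return run

simulate = make_simulator(ISA)
final_regs = simulate([
    "ADDI r1, r0, 5",
    "ADDI r2, r0, 7",
    "ADD r3, r1, r2",
    "SUB r4, r3, r1",
])
print(final_regs)
```

Adding an instruction in this sketch means adding one table entry. The same table could in principle drive an assembler, which is the kind of reuse the investigation above targets.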
Study of Modified Multistage Interconnection
Networks for Networks-On-Chips
Past Networks-On-Chips (NoCs) Solutions:
• Reproduce what has been learned in the area of inter-chip networks.
• Focus on the router architecture alone to achieve certain goals in latency.
• Asynchronous design of NoCs, mainly GALS.
• Circuit switching techniques introduced to provide a certain guarantee for the latency.
• Did not fully take advantage of the fact that the network is on-chip, where the main gain is the absence of pin limitations.
• Router architectures directly derived from inter-chip architectures where the routers were implemented on a single chip. This implies a substantial overhead.
• Added complexity to achieve guaranteed latency is overkill in the on-chip context.
Analysis:
• Low throughput, which means latency cannot be guaranteed above the maximum throughput level.
• Cannot prevent contention from happening. Contention makes router architectures more complex because they need to integrate buffering and prioritization logic.
• Routers that implement both packet and circuit switching make the architecture even more complex.
Modified Multistage
Idea
• Contention free
• Router architecture bufferless because no contention = no need to buffer
Which network is almost contention free?
• Crossbar with Virtual Output Queues
• Crossbar non-scalable
What topology resembles a crossbar?
• Banyans or Multistage Interconnection Networks.
• Unidirectional: Wire Routing issue.
• Bidirectional multistage or folded multistage networks: Good
• Bidirectional multistage networks come in two forms:
  • The MIN, the so-called “fat-tree” network
  • The butterfly
• MIN better than butterfly (previous work)
How to modify the MIN so that it becomes contention free?
• Routing in a MIN:
  • Going up: adaptive
  • Going down: deterministic, which means a high probability of contention.
• Solution: add output down links at each stage in proportion to the probability of contention at that stage
Preliminary Results: latency is minimal; throughput is > 90%
[Figure: MIN topologies with eight clients (C) and three stages of four routers (R); one diagram is labelled "Regular MIN".]
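The up/down routing described for the MIN can be sketched for a binary fat tree with eight clients, matching the size in the figure. The client numbering and the rule that the router at stage j selects its down port from bit j-1 of the destination are assumptions made for illustration, not details taken from the slides.

```python
def fat_tree_route(src, dst):
    """Return (stages climbed, down-port choices) in a binary fat tree.

    Assumed numbering: clients are 0..N-1 and the router at stage j picks
    its down port from bit j-1 of the destination address.
    """
    turn_stage = (src ^ dst).bit_length()   # lowest stage with a common ancestor
    down_ports = [(dst >> (j - 1)) & 1 for j in range(turn_stage, 0, -1)]
    return turn_stage, down_ports

def share_down_link(dst_a, dst_b, stage):
    """Two packets descending through the same stage-`stage` router compete
    for one down link when their destinations agree on bit stage-1."""
    return ((dst_a ^ dst_b) >> (stage - 1)) & 1 == 0

print(fat_tree_route(0, 7))
print(fat_tree_route(5, 4))
print(share_down_link(4, 5, 3), share_down_link(0, 4, 3))
```

The upward path is free to choose any up link, but the downward ports are fixed by the destination bits. Two packets that meet in one router and need the same down port therefore contend, which is the case the extra down links are meant to absorb.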
Proposed work related to Intel’s R & D
Extensive experience in interconnection network evaluation.
Extensive experience in ASIC/SoC design.
SystemC-based simulation/performance evaluation environment under development.
Future investigations
• Customized automatic generation of topologies and routers that implement the bufferless approach.
• Investigate the automatic generation of bandwidth-asymmetric networks for non-equal requirements on the side of the IP-core clients.
• Mathematical analysis to determine maximum latency levels in the context of bufferless router architectures.
Computer Arithmetic & Cryptography
COE Recent Research Projects: Computer Arithmetic & Cryptography
High-Performance Arithmetic for Cryptographic
Applications.
Design of efficient integrated circuits for the inverse
computation in different finite fields.
Design of Elliptic Curve Cryptography Architectures
using parallel multipliers.
Secure reliable storage system.
Design, Analysis, and FPGA prototyping of
High-Performance Arithmetic for Cryptographic Applications.
High-Performance Arithmetic Circuitry for Cryptography
New Modulo Multiplication Algorithm & Circuitry
Patent Application Pending.
New High-Radix Multiplier Divider Algorithm and
Hardware performing the operation (A*B/N) with
hardware complexity close to that of Division operation
Patent in Preparation.
Efficient Parallel Implementations of Elliptic Curve
Cryptosystems.
Aladdin Modulo Multiplier
Public-key encryption/decryption algorithms depend largely on the computation of modulo multiplication (AB mod N).
The current dominant method is Montgomery's.
Circuitry has been modeled and verified using VHDL.
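For reference, Montgomery's method, named above as the current dominant approach, can be sketched in software as follows. This is the textbook Montgomery reduction, not the Aladdin multiplier, whose algorithm is not described in these slides; the function names are hypothetical and the code assumes Python 3.8 or later for the modular inverse via `pow`.

```python
def montgomery_params(n, r_bits):
    """Return n' with n * n' = -1 (mod R), where R = 2**r_bits.

    The modulus n must be odd and smaller than R.
    """
    if n % 2 == 0 or n >= (1 << r_bits):
        raise ValueError("n must be odd and smaller than R")
    r = 1 << r_bits
    return (-pow(n, -1, r)) % r

def redc(t, n, r_bits, n_prime):
    """Montgomery reduction: t * R^-1 mod n, valid for 0 <= t < n * R."""
    mask = (1 << r_bits) - 1
    m = ((t & mask) * n_prime) & mask
    u = (t + m * n) >> r_bits        # exact division by R
    return u - n if u >= n else u

def mont_mul(a, b, n, r_bits):
    """Compute a * b mod n through the Montgomery domain."""
    n_prime = montgomery_params(n, r_bits)
    a_bar = (a << r_bits) % n        # a * R mod n
    b_bar = (b << r_bits) % n        # b * R mod n
    prod_bar = redc(a_bar * b_bar, n, r_bits, n_prime)   # a * b * R mod n
    return redc(prod_bar, n, r_bits, n_prime)            # back to a * b mod n

print(mont_mul(45, 76, 97, 8), (45 * 76) % 97)
```

The attraction for hardware is that the reduction uses only masking, shifting, and multiplication, with no trial division by N.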
High-Radix Multiplier Divider
Theory Fully Developed
For n-bit operands, with k-bit radix system (k > 8), computing (A*B/N) requires (n / k) steps instead of n.
Research is continuing to study full exploitation of such processor for faster performance of:
• Pure multiplication
• Pure division
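The n / k step count can be illustrated with plain digit-serial multiplication that consumes k bits of one operand per step. This sketch shows only the step-count relation; it is ordinary multiplication, not the (A*B/N) multiplier-divider algorithm, and the function name is hypothetical.

```python
def digit_serial_multiply(a, b, n_bits, k):
    """Multiply a by b, scanning b in radix-2**k digits, most significant first.

    Returns (product, number of steps). One digit of b is consumed per step.
    """
    steps = -(-n_bits // k)          # ceil(n_bits / k)
    mask = (1 << k) - 1
    acc = 0
    for i in reversed(range(steps)):
        digit = (b >> (i * k)) & mask
        acc = (acc << k) + a * digit  # shift by one digit, add partial product
    return acc, steps

a, b = 0xDEADBEEF, 0xCAFEBABE
product_r16, steps_r16 = digit_serial_multiply(a, b, 32, 16)
product_r1, steps_r1 = digit_serial_multiply(a, b, 32, 1)
print(steps_r16, steps_r1, product_r16 == a * b)
```

For 32-bit operands, a 16-bit radix needs 2 steps where bit-serial processing needs 32, at the cost of a wider partial-product datapath per step.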
ECC Efficient Implementations
Efficient High-Speed computation of the Scalar Multiplication operation (ECC) through exploiting parallelism.
Improved Resistance against Side Channel Attacks (SCA):
• Parallelism
• Randomized computation order starting Least-2-Most or Most-2-Least
• Randomized Number of Processors
• Randomized coordinate system (affine vs projective)
Hardware will be implemented on an FPGA platform.
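The randomized computation order can be sketched on a toy curve: scalar multiplication scanned from the most significant bit and from the least significant bit gives the same point, so the order can be chosen at random per operation. The curve, field size, and function names are hypothetical, and this plain double-and-add code is an illustration only, not a side-channel-hardened or production implementation.

```python
import random

P = 97            # small prime field, illustration only
A, B = 2, 3       # hypothetical curve y^2 = x^3 + A*x + B over F_P
INF = None        # point at infinity

def ec_add(p1, p2):
    """Affine point addition, including doubling and the point at infinity."""
    if p1 is INF:
        return p2
    if p2 is INF:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return INF
    if p1 == p2:
        lam = (3 * x1 * x1 + A) * pow((2 * y1) % P, -1, P) % P
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (lam * lam - x1 - x2) % P
    return (x3, (lam * (x1 - x3) - y1) % P)

def mul_msb_first(k, pt):
    """Most-to-least double-and-add."""
    acc = INF
    for bit in bin(k)[2:]:
        acc = ec_add(acc, acc)
        if bit == "1":
            acc = ec_add(acc, pt)
    return acc

def mul_lsb_first(k, pt):
    """Least-to-most double-and-add."""
    acc, addend = INF, pt
    while k:
        if k & 1:
            acc = ec_add(acc, addend)
        addend = ec_add(addend, addend)
        k >>= 1
    return acc

def mul_random_order(k, pt, rng=random):
    """Pick the scan direction at random for each scalar multiplication."""
    return (mul_msb_first if rng.random() < 0.5 else mul_lsb_first)(k, pt)

def find_point():
    """Brute-force search for some affine point on the toy curve."""
    for x in range(P):
        rhs = (x ** 3 + A * x + B) % P
        for y in range(1, P):
            if y * y % P == rhs:
                return (x, y)
    raise ValueError("no point found")

G = find_point()
print(mul_msb_first(13, G) == mul_lsb_first(13, G) == mul_random_order(13, G))
```

Both directions compute the same result while performing their additions and doublings in a different sequence, which is the property the randomized order relies on.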
Available Experience….
COE has wide experience in digital circuits and VLSI design.
Work is verified on an FPGA platform but can readily be ported onto dedicated VLSI processors.
Developed circuits & algorithms can be readily used by Intel.
Our Needs:
• Professional CAD tools for VLSI design, verification, and synthesis, e.g. Mentor Graphics tools, Cadence, etc.
Computer Engineering Faculty
Research Profile
Dr. Sadiq Sait, Professor
Research Interests
• Digital Design Automation, VLSI System Design, High Level Synthesis, and Iterative Algorithms.
Recent Projects
• Parallelization of Iterative Heuristics for Low Power VLSI Standard Cell Placement, KFUPM, 2003-2005.
• Iterative Heuristics for Timing & Low Power VLSI Standard Cell Placement, KFUPM, 2001-2003.
Recent Publications
• Sadiq M. Sait and Junaid A. Khan, "Simulated Evolution for Timing & Low-Power VLSI Standard Cell Placement", Engineering Applications of Artificial Intelligence (EAAI), Vol. 16, Sep. 2003, pp. 407-423.
• Sadiq M. Sait and H. Youssef. Iterative Computer Algorithms with Applications in Engineering: Solving Combinatorial Optimization Problems. December 1999, IEEE Computer Society Press, California.
• Sadiq M. Sait and H. Youssef. VLSI Physical Design Automation: Theory and Practice, McGraw-Hill Book Co., Europe, December 1994. Also co-published by IEEE Press, USA, January 1995 (hard bound edition).
Research Interests
•
Digital Design Automation, VLSI System Design, High Level Synthesis, and Iterative Algorithms. Recent Projects
•
Parallelization of Iterative Heuristics for Low Power VLSI Standard Cell Placement', KFUPM, 2003-2005.•
Iterative Heuristics for Timing & Low Power VLSI Standard Cell Placement, KFUPM, 2001-2003. Recent Publications
•
Sadiq M. Sait and Junaid A. Khan, "Simulated Evolution for Timing & Low-Power VLSI Standard Cell Placement", Eng