The Human Body and Data Center Automation | @CloudExpo #AI #ML #DataCenter
Nov. 12, 2017 05:00 PM
The Human Body and Data Center Automation - Part 2
Disclaimer: I am an IT guy, and my knowledge of the human body is limited to my daughter's high school biology textbook and information obtained from search engines. So, excuse me if any of the information below is not represented accurately!
The human body is the most complex machine ever created. With a complex network of interconnected organs, trillions of cells and the most advanced processor, the human body is the most automated system on this planet. In this series, we compare the working of the human body to that of a data center. We draw parallels between human body automation and data center automation and explain the different levels of automation we need to drive in our data centers. The series is divided into four parts, each covering one of the body's main functions and drawing parallels on automation. This is the second article in the human body series.
The nervous system
The nervous system is a complex collection of nerves and specialized cells known as neurons that transmit signals between different parts of the body. It is through the nervous system that we communicate with the outside world and, at the same time, that many mechanisms inside our body are controlled. The nervous system takes in information through our senses, processes the information and triggers reactions, such as making your muscles move or causing you to feel pain. The closest thing to the nervous system in our data center is the network. Much like the network connects everything together in the data center, the nervous system is essentially the body's network.
The nervous system has two components: the central nervous system and the peripheral nervous system. The central nervous system is made up of the brain and spinal cord. The peripheral nervous system consists of sensory neurons, ganglia (clusters of neurons) and nerves that connect to one another and to the central nervous system. Imagine this as the core network (central) and the data center network (peripheral). However, what is so fascinating about our nervous system is the way it works. Let's take a deep dive inside our body and learn how we can make our networks more efficient.
Image Source: Livescience.com
Functionally, the peripheral nervous system has two main subdivisions: the somatic, or voluntary, component; and the autonomic, or involuntary, component. The autonomic nervous system regulates body processes that work without conscious effort. It is constantly active, regulating things such as blood pressure, breathing, heartbeat and metabolic processes. It does this by receiving signals from the brain and passing them on to the body. It can also send signals in the other direction - from the body to the brain - providing your brain with information about how full your bladder is or how quickly your heart is beating.
Now can you think which system in our data center comes close to the autonomic nervous system? It's our monitoring system. The function of the monitoring system in a data center is to monitor the health of the various components (hardware and software) and alert us when thresholds are breached or an error has occurred. In most modern data centers today there is some tool which does this job. Alerts and error logs are collected at all layers, and once an error has occurred or a particular KPI crosses a certain threshold, an event is generated and humans are notified to take action. However, where the human body beats any modern monitoring system is in its ability to take autonomic actions based on the situation. Imagine you are running on a treadmill. As your heart rate goes up, the brain is not just sending you alerts that your heart rate is going up; it is also taking appropriate actions to ensure the body continues to function. The first action is to break down glycogen, a stored form of glucose, to give you an extra dose of energy.
The second action is to draw more blood towards the muscles which are under stress and away from functions that are not needed right now, like digestion (unless you are eating while exercising). Since your muscles need more oxygen, the body signals your lungs to take in more oxygen, and hence your breathing rate goes up. As the body burns more glucose and heats up, your brain signals your sweat glands to release moisture to keep the body cool and hence maintain its internal temperature. All these actions happen to keep you healthy without you telling your body what to do. Only if a threshold is crossed and your body is not able to fix the problem itself will it signal you to take action, like resting or slowing down. This is exactly how our monitoring system should work. However, what happens in most enterprises is a sorry state of affairs.
Let's consider a very common issue most enterprises face - a performance issue. Suppose the mission-critical business application running on your server is experiencing performance issues and the users are complaining. In a typical organization, an application user does a first-level analysis and, based on that analysis, opens an incident ticket with the command center. Everyone from the systems engineer, storage engineer and network engineer to specialized performance engineers is paged to figure out what's happening. Hours are spent detecting where in the fabric there is contention leading to the performance issue. Once the issue is detected, another few hours are spent finalizing the action plan, and finally the fix is put in place. Sounds familiar?
Now imagine we learn from the human body and design our data center in such a way that the system automatically detects that something is going wrong in the fabric and finds out where in the fabric the issue is. Once the issue is detected, it identifies the appropriate fix and implements it. If the system detects that the performance issue is caused by an underlying CPU constraint on one of the VMs, it should either scale up CPU capacity on the VM or automatically scale the application horizontally by adding another VM or container. If the issue is detected at the network level, the system should be in a position to move the entire VLAN to another healthy leaf switch. If the issue happens at the DC level, the system should automatically fail over all the impacted applications to another DC. While some modern cloud-native applications work in a similar fashion, the same level of maturity is not seen in traditional applications.
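The treadmill behaviour described above (act automatically first, and involve the person only if the body cannot fix itself) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class `AutonomicCheck`, the two limits and the sample actions are hypothetical names invented for this sketch, and each action simply returns the new reading it would produce, standing in for a real remediation followed by a re-measurement.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class AutonomicCheck:
    """Hypothetical check that tries automatic actions before paging a human."""
    name: str
    soft_limit: float  # above this, start acting automatically
    hard_limit: float  # still above this after acting: escalate to a human
    actions: List[Callable[[float], float]] = field(default_factory=list)

    def evaluate(self, reading: float) -> str:
        if reading <= self.soft_limit:
            return "ok"
        # Try each automatic action in order; an action returns the new
        # reading (a stand-in for "remediate, then measure again").
        for action in self.actions:
            reading = action(reading)
            if reading <= self.soft_limit:
                return "self-healed"
        # Automatic actions were not enough.
        return "escalate" if reading > self.hard_limit else "degraded"


# Illustrative actions only; the effects are made-up numbers.
def shed_background_jobs(cpu_percent: float) -> float:
    return cpu_percent - 10.0


def add_vcpu(cpu_percent: float) -> float:
    return cpu_percent * 0.7


cpu_check = AutonomicCheck(
    "cpu_percent", soft_limit=75.0, hard_limit=95.0,
    actions=[shed_background_jobs, add_vcpu],
)
no_action_check = AutonomicCheck("cpu_percent", soft_limit=75.0, hard_limit=95.0)

print(cpu_check.evaluate(60.0))        # below the soft limit
print(cpu_check.evaluate(82.0))        # fixed by the first action
print(no_action_check.evaluate(99.0))  # nothing to try, so a human is paged
```

The point of the two limits is the same as in the body: most breaches are handled quietly, and a human is notified only when the automatic actions have been exhausted.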
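As a rough sketch of the decision described above, the mapping from "where the issue was detected" to "which fix to apply" could look like the following. The layer names and the action strings are hypothetical labels chosen for illustration; they do not correspond to any particular product's API, and a real system would call an orchestration tool instead of returning a string.

```python
from enum import Enum


class Layer(Enum):
    """Where in the fabric the issue was detected (illustrative)."""
    VM = "vm"
    NETWORK = "network"
    DATACENTER = "datacenter"


def plan_remediation(layer: Layer, can_scale_up: bool = True) -> str:
    """Return a label for the fix matching the layer at fault."""
    if layer is Layer.VM:
        # CPU constraint: scale up if the VM has headroom, otherwise scale
        # out by adding another VM or container.
        return "scale_up_cpu" if can_scale_up else "scale_out_add_instance"
    if layer is Layer.NETWORK:
        return "move_vlan_to_healthy_leaf"
    if layer is Layer.DATACENTER:
        return "fail_over_to_secondary_dc"
    raise ValueError(f"unknown layer: {layer!r}")


print(plan_remediation(Layer.VM))
print(plan_remediation(Layer.VM, can_scale_up=False))
print(plan_remediation(Layer.NETWORK))
print(plan_remediation(Layer.DATACENTER))
```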
The somatic system consists of nerves that connect the brain and spinal cord with muscles and sensory receptors in the skin. This voluntary nervous system controls all the things that we are aware of and can consciously influence, such as moving our arms, legs and other parts of the body. The nerves (like the network cables in our DC) start at the brain and spinal cord and branch out to every part of our body. Neurons (intelligent code) send signals to other cells through thin fibers called axons, which cause chemicals known as neurotransmitters to be released at junctions called synapses. The neurotransmitter carries the command across the synapse to the next cell, and the entire communication takes a fraction of a second. Such is the efficiency of signalling in the human body that sensing, deciding and reacting all happen with no manual step in between, something even our fastest networks do not do on their own.
Let's take an example. Imagine someone tapping you lightly on your shoulder; your immediate reaction is to turn around and see who is doing that. The sensory neurons (cells) in your shoulder transmit the signals to your brain via the nerves at such a fast pace that you react immediately. Now imagine someone tapping your shoulder and your body taking a few seconds to a minute to react to the signal :) The way our body reacts to the various senses (touch, smell, taste, etc.), and the fact that we don't have to manage every action, indicates how advanced our body's automation system is. The body's sensor systems are like the sensors in our data center. The role of sensors is to collect data and send it for further processing. While we have a lot of maturity in collecting data, what we lack is the speed at which we can analyze the data to take appropriate action. This is where our brain checkmates even the fastest of all computers, including IBM Watson. Our brain is a Big Data system (e.g., Hadoop), the intelligence of IBM Watson and the fastest supercomputer in the world all combined into one. Let's look at our brain.
Image Source: diseasespictures.com
Brain - Our brain is the intelligence of our body. It controls all actions in our body. It acts as both CPU and memory for our body, and without a brain you are almost like a walking zombie who has no control over his or her actions. Inside the data center, the CPU and memory inside our servers, combined with the software which runs on top of them, act like the brain. However, we are still far from matching the computing capacity of our brain and, more importantly, its learning and intuitive capabilities.
As of November 2017, the fastest supercomputer in the world is China's Sunway TaihuLight, with a maximum processing speed of 93 petaFLOPS. One petaFLOPS is a quadrillion (one thousand trillion) floating point operations per second. Still, this does not come close to the estimated processing speed of the human brain. It is postulated that the human brain operates at 1 exaFLOPS, which is equivalent to a billion billion calculations per second. While the hardware, that is, the physical structure of the brain, can be compared with the chipset we have in our computers, it is the software which makes the difference.
Our brain controls the nervous system, the muscular system and other vital parts of our body. It also has tremendous learning capability. When a child is born, the brain is almost empty but is in learning mode. It quickly learns how to interact with the outside world and starts to read data from the sense organs. This is how we react to taste, touch and sound. As we grow, we learn how to talk, write and communicate. We learn how to walk, run and jump! In the software world, we call this AI - artificial intelligence. Humans have been trying for decades to develop brain-like self-learning capabilities, and while we now have self-driving cars pushing the envelope, we are still far away from truly matching our brain's power.
Now, the brain cannot work in isolation. It gets all the data it needs from our sensory organs, such as the eyes, nose and tongue. We interact with the world with the help of our sensory organs. The eyes give us visual data, the nose gives us smell-related data, while the tongue allows us to taste. Do you know that our tongue alone has thousands of taste buds which allow us to distinguish various tastes from sour to sweet, along with receptors that sense hot and cold! Imagine the state of all restaurants if we did not have these sensors. Our nose, on the other hand, has sensors which not only detect various smells but also make it a self-defense organ.
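To put the two figures above side by side, the arithmetic can be written out as a short Python snippet. The 1 exaFLOPS number is the postulated estimate quoted above, not a measurement, and the variable names are ours.

```python
PETA = 10**15  # one quadrillion
EXA = 10**18   # one billion billion

taihulight_flops = 93 * PETA      # Sunway TaihuLight, as quoted above
brain_flops_estimate = 1 * EXA    # postulated figure for the human brain

# How many TaihuLight-class machines would match the estimate?
ratio = brain_flops_estimate / taihulight_flops
print(f"{ratio:.2f}")  # prints 10.75
```

In other words, if the estimate holds, it would take roughly ten to eleven such supercomputers to match one brain on raw operations per second alone.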
In our data centers, we need similar capabilities. Inside the compute layer, we have sensors which tell us that a filesystem is getting full, and inside the router we can tell if packet drops are happening. We call these alerts, and on any given day millions of alerts are generated by all the systems running in an enterprise. The difference here is: what do we do with the alerts? If every alert depends on a human to intervene manually, it is as good as your tongue telling you that the coffee you are drinking is very hot and then waiting for you to decide whether you should stop drinking. In reality, your tongue sends an alert to your brain, and your brain processes the information, decides it is too dangerous and immediately takes action to control your movements. You may still drink the coffee and cause harm, but the body takes immediate action to prevent the harm.
Similarly, with the alerts coming from our systems, we need to develop systems which can take immediate action (self-heal) and not wait for human intervention all the time. If a filesystem is full, take immediate action to detect and fix what is causing it to be full. If the security intrusion detection system has detected malicious emails, block the emails immediately. If a network port is dropping packets, isolate the port and move traffic to an alternate port. The more autonomic actions we can take, the better we will be at managing our data center.
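One simple way to picture the self-heal examples above is a registry that maps each alert type to an automatic handler and falls back to a human only when no handler exists. The sketch below is a minimal illustration: the alert types, the alert fields (`mount`, `sender`, `port`) and the handler names are assumptions made up for this example, and the handlers only return a description of the action instead of performing it.

```python
from typing import Callable, Dict

# Registry of alert type -> automatic handler (all names are illustrative).
HANDLERS: Dict[str, Callable[[dict], str]] = {}


def self_heal(alert_type: str):
    """Decorator that registers a handler for one alert type."""
    def register(handler: Callable[[dict], str]) -> Callable[[dict], str]:
        HANDLERS[alert_type] = handler
        return handler
    return register


@self_heal("filesystem_full")
def free_space(alert: dict) -> str:
    return f"cleaning up {alert['mount']}"


@self_heal("malicious_email")
def block_sender(alert: dict) -> str:
    return f"blocking mail from {alert['sender']}"


@self_heal("port_packet_drops")
def isolate_port(alert: dict) -> str:
    return f"isolating {alert['port']} and moving traffic to an alternate port"


def handle(alert: dict) -> str:
    """Act automatically when we know how; otherwise notify a human."""
    handler = HANDLERS.get(alert["type"])
    if handler is None:
        return "notify human"
    return handler(alert)


print(handle({"type": "filesystem_full", "mount": "/var"}))
print(handle({"type": "fan_failure"}))  # no handler registered
```

Every handler added to the registry is one less alert that waits for a person, which is the "more autonomic actions" idea in code form.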
To summarize, here's what we need in our data center: L-C-C-A
- Lightning-fast network of intelligent sensors across the data center stack
- Central monitoring system which can monitor alerts and error logs at all layers, from the application down to the server
- Correlation engine which can correlate various alerts and error logs and pinpoint where in the data center the issue is
- Artificial intelligence (AI) capable run-book automation engine which can trigger autonomous action (self-heal) based on the issue identified and implement the fix
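The correlation step in the list above is the least familiar one, so here is a small sketch of one possible approach, assuming we have a dependency map of the data center. An alerting component is treated as a probable root cause only if nothing it depends on is also alerting. The component names and the `DEPENDS_ON` map are invented for illustration; real correlation engines also use time windows, topology discovery and much more.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical dependency map: component -> components it runs on.
DEPENDS_ON: Dict[str, List[str]] = {
    "app": ["vm-1"],
    "vm-1": ["host-a"],
    "host-a": ["leaf-switch-1"],
    "leaf-switch-1": [],
}


def upstream(component: str, depends_on: Dict[str, List[str]]) -> Set[str]:
    """Everything the component depends on, directly or indirectly."""
    seen: Set[str] = set()
    stack = list(depends_on.get(component, []))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(depends_on.get(node, []))
    return seen


def probable_root_causes(alerting: Iterable[str],
                         depends_on: Dict[str, List[str]]) -> List[str]:
    """Alerting components with no alerting dependency beneath them."""
    alerting_set = set(alerting)
    return sorted(
        component for component in alerting_set
        if not (upstream(component, depends_on) & alerting_set)
    )


# Three alerts arrive, but only the switch has nothing alerting beneath it.
print(probable_root_causes(["app", "vm-1", "leaf-switch-1"], DEPENDS_ON))
```

With the root cause narrowed down to one component, the run-book automation engine has a single place to apply the fix instead of three separate tickets.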
In our next article on the human body and data center automation, we will focus on the circulatory system, which is responsible for the flow of blood, oxygen and nutrients in our body, and we will see what our data center can learn from it. Until next time.