5G Wireless Meets AI

The technology landscape has been utterly transformed in the past decade by one technology above all others – mobile wireless data services. It has enabled the global smartphone revolution, with millions of apps and new business models based on ubiquitous high-bandwidth data access, especially over 3G, 4G and WiFi. It has also transformed computing infrastructure, by driving continuous real-time applications served up from massive aggregated data on social connections, transportation, commerce, and crowd-sourced entertainment.

So what’s next?  One direction is obvious and important – still better wireless data services, especially ubiquitous cellular or 5G wireless.  It is clear that the global appetite for data – higher bandwidth, more reliable, lower latency, more universal access – is huge.  If the industry can find ways to deliver improved data services at reasonable cost, the demand will likely drive an enormous variety of new applications.  And that trend alone is an interesting fundamental technology story.
The next decade will also witness another revolution – the widespread development and deployment of cognitive computing.  Neural-inspired computing methods already play a key role in advanced driver assistance systems, speech recognition and face recognition, and are likely to sweep through many robotics, social media, finance, commerce and health-care applications.  Triggered by the availability of new data-sets and bigger, better-trained neural networks, cognitive computing will likely show rapid improvements in effectiveness in complex recognition and decision-making scenarios.
But how will these two technologies – 5G and cognitive computing – interact?  How will this new basic computing model shift the demands on wireless networks?  Let’s explore a bit.
Let’s start by looking at some of the proposed attributes of 5G:
  • Total capacity scaling via small cells
  • Much larger number of  users and much higher bandwidth per user
  • Reduced latency – down to a few tens of milliseconds
  • Native machine-to-machine connectivity without the capacity, bandwidth and latency constraints of the basestation as intermediary
  • More seamless integration across fiber and wireless for better end-to-end throughput
These functions will not be easy to achieve – sustained algorithmic and implementation innovation in massive MIMO, new modulation and frequency/time-division multiplexing, exploitation of millimeter-wave frequency bands, and collaborative multi-cell protocols are all likely to be needed.  Even changes to the device architectures themselves, to make them better suited for low-latency machine-to-machine connectivity, will be mandatory.
Some of the attributes of 5G are driven by a simple extrapolation from today’s mobile data use models.  For example, it is reasonable to expect that media content streaming, especially video streaming, will be a major downlink use-case, driven by more global viewing and by higher-resolution video formats.  However, the number of video consumers cannot grow indefinitely – the human population is increasing at only about 1.2% per year.  So how far can downlink traffic grow if it is driven primarily by conventional video consumption?  Certainly by an order of magnitude, but probably not by 2 or 3 orders of magnitude.
  On the other hand, we are seeing a rapid increase in the number of data sensors, especially cameras, in the world – cameras in mobile phones, security cameras, flying drones and toys, smart home appliances, cars, industrial control nodes and other real-time devices.  Image sensors produce so much more data than other common sensor types (e.g. microphones for audio and accelerometers for motion) that virtually 100% of all newly produced raw data is image/video data.  Cameras are increasing at a much faster rate than human eyeballs – at more than 20% per year.  In fact, I estimate that the number of cameras in the world exceeded the number of humans starting in 2016.  By 2020, we could see more than 20B image sensors active in the world, each theoretically capturing 250MB per second or more of data (1080p60 4:2:2 video capture).
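The 250MB-per-second figure follows directly from the video format; here is a quick sketch of the arithmetic, using only the numbers quoted above:

```python
# Back-of-envelope check of the per-camera raw capture rate:
# 1080p60 with 4:2:2 chroma subsampling averages 16 bits (2 bytes) per pixel.
width, height = 1920, 1080
frames_per_second = 60
bytes_per_pixel = 2  # 4:2:2 sampling

raw_rate = width * height * frames_per_second * bytes_per_pixel
print(f"per-camera raw rate: {raw_rate / 1e6:.0f} MB/s")  # ~249 MB/s

cameras = 20e9  # the 20B image sensors projected for 2020
print(f"worldwide raw capture: {cameras * raw_rate:.2e} bytes/s")
```

At roughly 5×10^18 bytes per second of raw capture worldwide, it is immediately clear that essentially none of this data can move anywhere without aggressive compression and local filtering.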
This raises two key questions.
 First, how do we move that much video content?  Even with good (but still lossy) video compression (assuming a 6Mbps encoding rate), that number of cameras running continuously implies more than 10^17 bits per second of required uplink capacity.  That translates into a requirement for the equivalent of hundreds of millions of current 4G base stations, just to handle the video uplink.  This implies that 5G wireless needs to look at least as hard at uplink capacity as at downlink capacity.  Of course, not all those cameras will be working continuously, but it is easy to imagine that the combination of self-driving cars, ubiquitous cameras in public areas for safety, and more immersive social media could increase the total volume by several orders of magnitude.
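The uplink sizing above can be sketched in a few lines. The camera count and encoding rate come from the text; the aggregate uplink per 4G base station is my own illustrative assumption, chosen only to show the order of magnitude:

```python
# Rough uplink sizing. The 300 Mbps aggregate uplink per 4G base
# station is an assumed, illustrative figure, not a measured one.
cameras = 20e9
encoded_rate = 6e6  # 6 Mbps compressed stream per camera

total_uplink = cameras * encoded_rate  # bits per second
print(f"total uplink demand: {total_uplink:.1e} b/s")  # 1.2e+17 b/s

uplink_per_base_station = 300e6  # assumption: aggregate 4G uplink, b/s
stations = total_uplink / uplink_per_base_station
print(f"equivalent 4G base stations: {stations:.1e}")  # ~4e+08
```

Even if the per-station figure is off by several times in either direction, the conclusion survives: continuous video from every camera would demand hundreds of millions of today’s base stations for the uplink alone.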
Second, who is going to look at the output of all those cameras?  It can’t be people – there simply aren’t enough eyeballs – so we need some other audience – “virtual eyeballs” – to monitor the flow and extract the images, events or sequences of particular interest, to automatically make decisions using that image or video flow, or to distill it down to just the rare and relevant events for human evaluation.
In many cases, the latency demands on decision making are so intense – in automotive systems, for example – that only real-time machine vision will be fast enough to respond.   This implies that computer vision will be a key driving force in distributed real-time systems.  In some important scenarios, the vision intelligence will be implemented in the infrastructure, either in cloud servers or in a new category of cloud-edge computing nodes.  This suggests a heavy emphasis on capacity scaling and latency.  In many other scenarios, the interpretation of the video must be local to the device to make it fast and robust enough for mission-critical applications.  In these cases, the recognition tasks run locally, and only the more concise and abstract stream of recognized objects or events needs to be communicated over the wireless network.
So let’s recap some observations about the implications of the simultaneous evolution of 5G wireless and cognitive computing.
  • We will see a “tug-of-war” between device-side and cloud-side cognitive computing, based on bandwidth and latency demands, concerns about robustness in the face of network outages, and the risks of exposure of raw data.
    1. Device: low latency, low bandwidth consumption, lowest energy
    2. Cloud: Training, fast model updates, data aggregation, flexibility
  • Current wireless networks are not fast enough for cloud-based real-time vision, but 5G capacity gains, especially combined with good cloud-edge computing, may come close enough.
  • The overwhelming number of image sensors makes computer vision’s “virtual eyeballs” necessary.  This, in turn, implies intense demands on both uplink capacity and local intelligence.  
  • Machine-to-machine interactions will not happen on low-level raw sensor data – cars will not exchange pixel-level information to avoid accidents.  The machines will need to exchange abstract data in order to provide high robustness and low latency, even with 5G networks.
  • Wireless network operations will be highly complex, with big opportunities for adaptive behavior to improve service, trim costs and reduce energy consumption.  Sophisticated pattern recognition and response using cognitive computing may ultimately play a significant role in real-time network management.

So it’s time to buckle in and get ready for an exciting ride – a decade of dramatic innovation in systems, especially systems at the confluence of 5G wireless and deep learning.  For more insights, please see my presentation from the IEEE 5G Summit, September 29, 2016: rowen-5g-meets-deep-learning-v2