An Overview of Federated Learning

A look at its history, potential, progress, and challenges

A few weeks ago, I attended the industry workshop Translating AI Technology into Healthcare Solutions organized by AI Singapore (pictured below). Among the many interesting topics discussed was the decentralized collaborative machine learning approach known as federated learning. It piqued my interest and I decided to read more about it.

Workshop panelists, from left to right: Dr Stefan Winkler (moderator), Deputy Director, AI Singapore; Trent McConaghy, Founder, Ocean Protocol; Dr Ronald Ling, CEO, Connected Health; Lance Little, Managing Director for Asia Pacific, Roche Diagnostics; Dan Vahdat, Co-founder & CEO, Medopad; Dr Joydeep Sarkar, Chief Analytics Officer, Holmusk; Dr Yu Han, Assistant Professor, Nanyang Technological University; Dr Khung Keong Yeo, Senior Consultant of the Dept of Cardiology, National Heart Centre Singapore


The term federated learning was coined by Google in a paper first published in 2016. Since then, it has been an area of active research, as evidenced by the papers published on arXiv. At the recent TensorFlow Dev Summit, Google unveiled TensorFlow Federated (TFF), making federated learning more accessible to users of its popular deep learning framework. Meanwhile, for PyTorch users, the open-source community OpenMined has made the PySyft library available since the end of last year with a similar goal.

What is Federated Learning?

Federated learning is a response to the question: can a model be trained without moving the training data to a central location and storing it there?

It is a logical next step in the ongoing integration of machine learning into our daily lives, prompted by existing constraints and also other concurrent developments.

Islands of Data

Much of the world’s data today sits not in central data centers but on isolated islands where it is collected and owned. Much potential could be tapped if the data could be worked on where it sits.

Data Privacy

Data privacy has come under increasing attention from regulators in several jurisdictions in recent years. Since data availability is essential to any machine learning model, creative ways must be devised to work within these restrictions and enable model training to happen without the data ever leaving the place where it is collected and stored.

Computing on the Edge

As will be explained later, federated learning often requires computing on the edge. For edge devices (primarily handsets) that collect and store data, recent advances in custom hardware (for example, Apple’s Neural Engine) have made deep learning on them feasible. This has been the case since the introduction of the Samsung Galaxy S9 and Apple iPhone X series of handsets. As the number of these so-called “AI-ready” handsets in the market grows, so does the potential for federated learning.

What Federated Learning Promises

Federated learning also has the potential to act as an impetus for future changes in the industry.

Cloud Computing

Cloud computing is the dominant computing paradigm for machine learning today in a space occupied by the tech giants Google, Amazon, and Microsoft. Without the need to maintain a central data center, new providers will find it easier to offer even more AI services. The fact is, Google has foreseen this democratizing trend and has staked a leading role in the development of federated learning.

Sharing Economy

Google used federated learning to develop its next-word predictor on Gboard. This ability to train a model without compromising users’ privacy should encourage the emergence of other services that rely on data collected through handsets and other IoT devices in a form of sharing economy.

B2B Collaboration

Since data never leaves its original premises, federated learning opens up the possibility for different data owners at the organizational level to collaborate and share their data. In a recent paper, the researchers (Qiang Yang et al.) envision the different configurations in which this can happen.

Horizontal Federated Learning

They coined the terms Horizontal Federated Learning and Vertical Federated Learning.

Take the case of two regional banks. Although they have non-overlapping clientele, their data will have similar feature spaces since they have very similar business models. They might come together to collaborate in an example of horizontal federated learning.

Vertical Federated Learning

In vertical federated learning, two companies providing different services (e.g. banking and e-commerce) but having a large intersection of clientele might find room to collaborate on the different feature spaces they own, leading to better outcomes for both.

In both cases, the data owners are able to collaborate without having to sacrifice their respective clientele’s privacy.
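The difference between the two configurations comes down to how the parties' datasets overlap: by features (horizontal) or by customers (vertical). The following is a minimal sketch of that distinction only. The records, the feature names, and the `partition_kind` helper are all hypothetical and are not taken from the paper.

```python
# Hypothetical toy records: customer id -> feature dictionary.
bank_a = {"c1": {"income": 52, "balance": 10}, "c2": {"income": 61, "balance": 4}}
bank_b = {"c3": {"income": 48, "balance": 7}, "c4": {"income": 75, "balance": 22}}
shop = {"c1": {"orders": 12, "returns": 1}, "c2": {"orders": 3, "returns": 0}}


def feature_names(dataset):
    """Collect every feature name that appears in a party's records."""
    names = set()
    for row in dataset.values():
        names.update(row.keys())
    return names


def partition_kind(left, right):
    """Classify the overlap between two parties' datasets (illustrative rules only)."""
    shared_ids = set(left) & set(right)
    left_features, right_features = feature_names(left), feature_names(right)
    if left_features == right_features and not shared_ids:
        return "horizontal"  # same features, different customers
    if shared_ids and not (left_features & right_features):
        return "vertical"  # same customers, different features
    return "mixed"


print(partition_kind(bank_a, bank_b))  # two regional banks
print(partition_kind(bank_a, shop))    # a bank and an e-commerce company
```

Real datasets rarely separate this cleanly, which is why the sketch falls back to "mixed" for everything else.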

Apart from finance, another industry vertical that could benefit is the healthcare sector (as mentioned in the introduction). Hospitals and other healthcare providers stand to gain if they are able to share patient data for model training in a privacy-preserving manner.

How Federated Learning Works

At the heart of federated learning is the federated averaging algorithm introduced by Google in their original paper.

The Federated Averaging Algorithm

A typical round of learning consists of the following sequence.

  • A random subset of the members of the federation (known as clients) is selected to receive the global model synchronously from the server.
  • Each selected client computes an updated model using its local data.
  • The model updates are sent from the selected clients to the server.
  • The server aggregates these models (typically by averaging) to construct an improved global model.
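The four steps above can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not Google's implementation: the model is a one-dimensional linear fit y ≈ w·x + b, the clients and their data are made up, and the function names, learning rate, and client fraction are all hypothetical choices. The server averages the returned weights, weighting each client by the size of its local dataset.

```python
import random


def local_update(weights, data, lr=0.05, epochs=5):
    """Client side: a few epochs of SGD on local (x, y) pairs for y ≈ w*x + b."""
    w, b = weights
    for _ in range(epochs):
        for x, y in data:
            err = (w * x + b) - y
            w -= lr * err * x
            b -= lr * err
    return [w, b]


def federated_averaging(clients, rounds=100, fraction=0.5, seed=0):
    """Server side: repeat select -> local update -> weighted average."""
    rng = random.Random(seed)
    global_weights = [0.0, 0.0]
    for _ in range(rounds):
        # Step 1: pick a random subset of clients for this round.
        k = max(1, int(fraction * len(clients)))
        selected = rng.sample(clients, k)
        # Steps 2 and 3: each client trains locally and returns its model.
        results = [(local_update(global_weights, data), len(data)) for data in selected]
        # Step 4: average the models, weighted by local dataset size.
        total = sum(n for _, n in results)
        global_weights = [
            sum(weights[i] * n for weights, n in results) / total
            for i in range(len(global_weights))
        ]
    return global_weights


# Hypothetical clients: each holds a different slice of the line y = 2x + 1.
clients = [
    [(x, 2 * x + 1) for x in (-2.0, -1.0, 0.0)],
    [(x, 2 * x + 1) for x in (-1.0, 0.0, 1.0)],
    [(x, 2 * x + 1) for x in (0.0, 1.0, 2.0)],
    [(x, 2 * x + 1) for x in (1.0, 2.0, 3.0)],
]
w, b = federated_averaging(clients)
print(f"learned w={w:.3f}, b={b:.3f}")
```

Only model weights cross the client boundary in this sketch; the `(x, y)` pairs never leave the list that represents each client.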

Of course, the subset selection step was necessitated by the context in which Google originally applied federated learning: on data collected through millions of handsets in its Android ecosystem.

A variant of this learning sequence which appears in later literature involves sending gradient updates to the server instead of the actual model weights themselves. The common idea is that none of the original data is ever transmitted between parties, only model-related updates. It is also clear now how edge computing plays a role here.
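To see how the two variants relate, consider the special case where each client takes exactly one gradient step. Averaging the updated weights then gives the same result as having the server apply the averaged gradient. The sketch below checks this numerically; the model, data, and helper names are hypothetical, and the equivalence does not generally hold once clients run several local steps.

```python
import math


def mse_gradient(weights, data):
    """Gradient of the mean of 0.5 * (w*x + b - y)^2 over a client's local data."""
    w, b = weights
    n = len(data)
    grad_w = sum((w * x + b - y) * x for x, y in data) / n
    grad_b = sum((w * x + b - y) for x, y in data) / n
    return [grad_w, grad_b]


def weighted_mean(vectors, sizes):
    """Average vectors, weighting each one by its client's dataset size."""
    total = sum(sizes)
    return [
        sum(vec[i] * n for vec, n in zip(vectors, sizes)) / total
        for i in range(len(vectors[0]))
    ]


global_weights = [0.5, -0.2]
lr = 0.1
client_data = [[(0.0, 1.0), (1.0, 3.0)], [(2.0, 5.0), (3.0, 7.0), (4.0, 9.0)]]
sizes = [len(data) for data in client_data]
grads = [mse_gradient(global_weights, data) for data in client_data]

# Variant A: clients take one local step and send their updated weights.
local_weights = [[w - lr * g for w, g in zip(global_weights, grad)] for grad in grads]
from_weights = weighted_mean(local_weights, sizes)

# Variant B: clients send gradients and the server applies their average.
avg_grad = weighted_mean(grads, sizes)
from_grads = [w - lr * g for w, g in zip(global_weights, avg_grad)]

print(all(math.isclose(a, b, abs_tol=1e-12) for a, b in zip(from_weights, from_grads)))
```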

Challenges of Federated Learning

Moving federated learning from concept to deployment is not without challenges. Researchers, including those working independently of federated learning advocates, have contributed to a better understanding of the issues to be considered. While work has been done on the efficiency and accuracy of federated learning, the more important challenges, in my opinion, are the security-related ones mentioned below.

Inference Attacks. The motivation for federated learning is the preservation of the privacy of the data owned by the clients. Even when the actual data is not exposed, the repeated model weight updates can be exploited to reveal properties that are not global to the data but specific to individual contributors. This inference can be performed on the server side as well as by other clients. An oft-quoted countermeasure is the use of differential privacy to mitigate this risk.
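One common way to apply differential privacy in this setting is to bound each client's influence by clipping its update and then add Gaussian noise to the aggregate. The sketch below shows only that mechanism. The function names and the `max_norm` and `noise_multiplier` parameters are hypothetical, and the code does not compute the resulting privacy budget, which a real deployment would have to track.

```python
import math
import random


def clip_update(update, max_norm):
    """Scale an update down so that its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(u * u for u in update))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [u * scale for u in update]


def private_average(updates, max_norm=1.0, noise_multiplier=1.0, seed=None):
    """Average clipped updates and add Gaussian noise to each coordinate."""
    rng = random.Random(seed)
    clipped = [clip_update(u, max_norm) for u in updates]
    n = len(clipped)
    # Noise is calibrated to the clipping bound; dividing by n moves it from
    # the sum of updates to their average.
    sigma = noise_multiplier * max_norm / n
    return [
        sum(c[i] for c in clipped) / n + rng.gauss(0.0, sigma)
        for i in range(len(clipped[0]))
    ]


# Hypothetical client updates; the first one is large and gets clipped.
updates = [[3.0, 4.0], [0.3, -0.4], [-0.1, 0.2]]
print(private_average(updates, max_norm=1.0, noise_multiplier=0.5, seed=42))
```

The trade-off is visible in the two parameters: a smaller `max_norm` or a larger `noise_multiplier` hides individual contributions better but also degrades the averaged model.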

Model Poisoning. Some researchers have investigated the possibility of misbehaving clients introducing backdoor functionality or mounting Sybil attacks to poison the global model.

Targeted poisoning attack in stochastic gradient descent. The dotted red vectors are Sybil contributions that drive the model towards a poisoner objective. The solid green vectors are contributed by honest clients that drive towards the true objective.

To effectively counter such attacks, additional overhead for Sybil detection must be considered.
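As a rough illustration of what such detection might involve, the sketch below flags clients whose updates point in almost exactly the same direction, on the assumption that colluding Sybils push towards a shared objective while honest clients vary more. It is a simplified heuristic and not a reproduction of any published defence; the threshold, the function names, and the example updates are all hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two update vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def flag_suspected_sybils(updates, threshold=0.99):
    """Return indices of clients whose update is nearly parallel to another's."""
    flagged = set()
    # Every pair of clients is compared, so the cost grows quadratically
    # with the number of participants in a round.
    for i in range(len(updates)):
        for j in range(i + 1, len(updates)):
            if cosine(updates[i], updates[j]) > threshold:
                flagged.update((i, j))
    return sorted(flagged)


# Hypothetical round: three honest updates followed by two colluding ones.
updates = [[1.0, 0.2], [0.7, 0.9], [0.3, 1.0], [-1.0, 0.5], [-0.98, 0.51]]
print(flag_suspected_sybils(updates))
```

The pairwise comparison is where the overhead comes from, and a heuristic this simple could also misflag honest clients that happen to hold very similar data.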

Looking Ahead

Federated learning promises to make more data available to improve lives. Much research is being undertaken and many challenges remain. Perhaps the following paragraph from Google’s latest paper best sums up the current status.

In this paper, we report on a system design for such algorithms in the domain of mobile phones (Android). This work is still in an early stage, and we do not have all problems solved, nor are we able to give a comprehensive discussion of all required components. Rather, we attempt to sketch the major components of the system, describe the challenges, and identify the open issues, in the hope that this will be useful to spark further systems research.

The original article first appeared here.