Hourglass: Enabling Efficient Split Federated Learning with Data Parallelism

dc.contributor.author: He, Q.
dc.contributor.author: Wang, K.
dc.contributor.author: Dong, Z.
dc.contributor.author: Yuan, L.
dc.contributor.author: Chen, F.
dc.contributor.author: Jin, H.
dc.contributor.author: Yang, Y.
dc.contributor.conference: Twentieth European Conference on Computer Systems (EuroSys) (30 Mar 2025 - 3 Apr 2025 : Rotterdam, Netherlands)
dc.date.issued: 2025
dc.description.abstract: Federated learning (FL) has emerged as a promising solution for training machine learning (ML) models while preserving privacy. A key challenge is the computational burden that training large models places on clients. To tackle this challenge, researchers have incorporated split learning into federated learning so that an ML model can be partitioned into two parts, one trained on clients and the other on a cloud or edge server. In current split FL systems, each client's server-side model partition is trained on an individual GPU on the fed server before model aggregation. This demands massive GPU resources and does not scale in real-world scenarios. This paper presents Hourglass, a new split FL system that trains clients' server-side model partitions on multiple GPUs with data parallelism. Unlike existing systems, which maintain one model partition per client and pass each client's intermediate features through its corresponding partition, Hourglass maintains model partitions shared by clients and routes their intermediate features through GPUs in groups based on the differences between those features. In this way, Hourglass avoids the overhead of swapping model partitions in and out of GPU memory and improves knowledge sharing between clients. Extensive experiments on four widely used public datasets demonstrate that, compared with state-of-the-art systems, Hourglass accelerates model convergence by up to 35.2x and improves model accuracy by up to 9.28%.
dc.description.statementofresponsibility: Qiang He, Kaibin Wang, Zeqian Dong, Liang Yuan, Feifei Chen, Hai Jin, Yun Yang
dc.identifier.citation: Proceedings of the 20th European Conference on Computer Systems (EuroSys 2025), 2025, pp.1317-1333
dc.identifier.doi: 10.1145/3689031.3717467
dc.identifier.isbn: 979-8-4007-1196-1
dc.identifier.uri: https://hdl.handle.net/2440/145988
dc.language.iso: en
dc.publisher: Association for Computing Machinery (ACM)
dc.publisher.place: New York, NY, USA
dc.relation.grant: http://purl.org/au-research/grants/arc/DP200102491
dc.rights: © 2025 Copyright held by the owner/author(s). This work is licensed under a Creative Commons Attribution 4.0 International License.
dc.source.uri: https://dl.acm.org/doi/proceedings/10.1145/3689031
dc.subject: Split federated learning; data parallelism; machine learning system
dc.title: Hourglass: Enabling Efficient Split Federated Learning with Data Parallelism
dc.type: Conference paper
pubs.publication-status: Published
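
The abstract's core idea — one shared server-side partition processing many clients' intermediate features in a batch, rather than one partition per client — can be illustrated with a minimal forward-pass sketch. This is an illustrative assumption-laden toy, not the paper's implementation: the layer shapes, client count, and NumPy linear layers are all hypothetical, and the real system's grouping of features by their differences and its multi-GPU scheduling are omitted.

```python
# Toy sketch of split FL with a SHARED server-side partition:
# clients run the first layers locally; the server concatenates all
# clients' intermediate ("smashed") features and pushes them through
# one shared partition, instead of swapping per-client partitions.
import numpy as np

rng = np.random.default_rng(0)

def client_forward(w_client, x):
    # Client-side partition: one linear layer + ReLU (hypothetical split point).
    return np.maximum(x @ w_client, 0.0)

def server_forward(w_server, smashed_batch):
    # Shared server-side partition applied to the whole grouped batch at once.
    return smashed_batch @ w_server

# Hypothetical shapes: 4 clients, 8 samples each, dims 16 -> 32 -> 10.
w_client = rng.normal(size=(16, 32))
w_server = rng.normal(size=(32, 10))   # ONE copy, shared by all clients

smashed = [client_forward(w_client, rng.normal(size=(8, 16)))
           for _ in range(4)]
batch = np.concatenate(smashed, axis=0)  # group clients' features together
logits = server_forward(w_server, batch)
print(logits.shape)  # (32, 10): 4 clients x 8 samples, single shared partition
```

Because every client's features pass through the same partition weights, no per-client partition has to be swapped in and out of GPU memory, which is the overhead the abstract says Hourglass avoids.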

Files

Original bundle
Name: hdl_145988.pdf
Size: 16.39 MB
Format: Adobe Portable Document Format
Description: Published version
