Hourglass: Enabling Efficient Split Federated Learning with Data Parallelism
dc.contributor.author | He, Q. | |
dc.contributor.author | Wang, K. | |
dc.contributor.author | Dong, Z. | |
dc.contributor.author | Yuan, L. | |
dc.contributor.author | Chen, F. | |
dc.contributor.author | Jin, H. | |
dc.contributor.author | Yang, Y. | |
dc.contributor.conference | Twentieth European Conference on Computer Systems (EuroSys) (30 Mar 2025 - 3 Apr 2025 : Rotterdam, Netherlands) | |
dc.date.issued | 2025 | |
dc.description.abstract | Federated learning (FL) has emerged as a promising solution for training machine learning (ML) models with privacy preservation. One of the key challenges is the computational burden placed on clients by training large models. To tackle this challenge, researchers have tried to incorporate split learning into federated learning so that an ML model can be partitioned into two parts, one trained on clients and the other on a cloud server or an edge server. In current split FL systems, each client's server-side model partition is trained on an individual GPU on the fed server before model aggregation. This demands massive GPU resources and does not scale in real-world scenarios. This paper presents Hourglass, a new split FL system that trains clients' server-side model partitions on multiple GPUs with data parallelism. Unlike existing systems that maintain one model partition for each client and pass each client's intermediate features through its corresponding model partition, Hourglass maintains model partitions shared by clients and passes their intermediate features through GPUs in groups based on their differences. In this way, Hourglass avoids the overhead incurred by swapping model partitions in and out of GPUs and improves knowledge sharing between clients. Extensive experiments are conducted on four widely used public datasets to evaluate the performance of Hourglass. The results demonstrate that, compared with state-of-the-art systems, Hourglass accelerates model convergence by up to 35.2x and improves model accuracy by up to 9.28%. | |
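The abstract's core idea — one shared server-side partition processing many clients' intermediate features as a batch, instead of one partition per client swapped in and out of GPU memory — can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the scalar "weights" standing in for model partitions, and the toy data below are all hypothetical simplifications.

```python
# Hedged sketch of the split-FL idea from the abstract. Each "model
# partition" is reduced to a single scalar weight; real systems run deep
# network partitions on GPUs.

def client_forward(w_client, x):
    # Client-side partition: computes an intermediate feature locally.
    return w_client * x

def server_forward_shared(w_server, features):
    # One shared server-side partition processes all clients' intermediate
    # features in a single batched pass (data parallelism), avoiding the
    # cost of swapping a per-client partition in and out of GPU memory.
    return [w_server * f for f in features]

# Three clients, each holding a client-side weight and a local sample.
clients = [(0.5, 2.0), (0.5, 4.0), (0.5, 6.0)]
features = [client_forward(w, x) for w, x in clients]
outputs = server_forward_shared(3.0, features)
print(outputs)  # → [3.0, 6.0, 9.0]
```

In a per-client design, the server would load a separate `w_server` for each client before its forward pass; sharing the partition lets all features flow through together, which is the overhead the abstract says Hourglass avoids.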
dc.description.statementofresponsibility | Qiang He, Kaibin Wang, Zeqian Dong, Liang Yuan, Feifei Chen, Hai Jin, Yun Yang | |
dc.identifier.citation | Proceedings of the 20th European Conference on Computer Systems (EuroSys 2025), 2025, pp.1317-1333 | |
dc.identifier.doi | 10.1145/3689031.3717467 | |
dc.identifier.isbn | 979-8-4007-1196-1 | |
dc.identifier.uri | https://hdl.handle.net/2440/145988 | |
dc.language.iso | en | |
dc.publisher | Association for Computing Machinery (ACM) | |
dc.publisher.place | New York, NY, USA | |
dc.relation.grant | http://purl.org/au-research/grants/arc/DP200102491 | |
dc.rights | © 2025 Copyright held by the owner/author(s). This work is licensed under a Creative Commons Attribution 4.0 International License. | |
dc.source.uri | https://dl.acm.org/doi/proceedings/10.1145/3689031 | |
dc.subject | Split federated learning; data parallelism; machine learning system | |
dc.title | Hourglass: Enabling Efficient Split Federated Learning with Data Parallelism | |
dc.type | Conference paper | |
pubs.publication-status | Published |
Files
Original bundle
- Name: hdl_145988.pdf
- Size: 16.39 MB
- Format: Adobe Portable Document Format
- Description: Published version