Computer Science publications

Recent Submissions

  • Item
    Quality in blended learning environments – Significant differences in how students approach learning collaborations
    (Elsevier BV, 2016) Ellis, R.A.; Pardo, A.; Han, F.
    Evaluating the quality of student experiences of learning in a blended environment requires the careful consideration of many aspects that can contribute to learning outcomes. In this study, university students in first year engineering were required to collaborate and inquire in a blended course design over a semester-long course. This study investigates their approaches to inquiry and online learning technologies as they collaborated both in class and online. The results identify sub-groups within the population sample (n > 200) which reported qualitatively different experiences of how they approached inquiry and used the online learning technologies. The results also measure aspects of their collaborations which help to explain why some students were more successful than others. The outcomes of the study have important implications for teaching and course design and the effective evaluation of blended experiences of university student learning.
  • Item
    Thin film extensional flow of a transversely isotropic viscous fluid
    (American Physical Society (APS), 2023) Hopwood, M.J.; Harding, B.; Green, J.E.F.; Dyson, R.J.
    Many biological materials such as cervical mucus and collagen gel possess a fibrous microstructure. This microstructure affects the emergent mechanical properties of the material and hence the functional behavior of the system. We consider the canonical problem of stretching a thin sheet of transversely isotropic viscous fluid as a simplified version of the spinnbarkeit test for cervical mucus. We propose a solution to the model constructed by Green and Friedman by manipulating the model to a form amenable to arbitrary Lagrangian-Eulerian (ALE) techniques. The system of equations, reduced by exploiting the slender nature of the sheet, is solved numerically, and we discover that the bulk properties of the sheet are controlled by an effective viscosity dependent on the evolving angle of the fibers. In addition, we confirm a previous conjecture by demonstrating that the center line of the sheet need not be flat, and perform a short timescale analysis to capture the full behavior of the center line.
  • Item
    Analytics for learning design: A layered framework and tools
    (Wiley, 2019) Hernández-Leo, D.; Martinez-Maldonado, R.; Pardo, A.; Muñoz-Cristóbal, J.A.; Rodríguez-Triana, M.J.
    The field of learning design studies how to support teachers in devising suitable activities for their students to learn. The field of learning analytics explores how data about students’ interactions can be used to increase the understanding of learning experiences. Despite its clear synergy, there is only limited and fragmented work exploring the active role that data analytics can play in supporting design for learning. This paper builds on previous research to propose a framework (analytics layers for learning design) that articulates three layers of data analytics—learning analytics, design analytics and community analytics—to support informed decision-making in learning design. Additionally, a set of tools and experiences are described to illustrate how the different data analytics perspectives proposed by the framework can support learning design processes.
  • Item
    Toward a Distributed Trust Management System for Misbehavior Detection in the Internet of Vehicles
    (Association for Computing Machinery (ACM), 2023) Mahmood, A.; Sheng, Q.Z.; Zhang, W.E.; Wang, Y.; Sagar, S.
    Recent considerable state-of-the-art advancements within the automotive sector, coupled with an evolution of the promising paradigms of vehicle-to-everything communication and the Internet of Vehicles (IoV), have facilitated vehicles to generate and, accordingly, disseminate an enormous amount of safety-critical and non-safety infotainment data in a bid to guarantee a highly safe, convenient, and congestion-aware road transport. These dynamic networks require intelligent security measures to ensure that the malicious messages, along with the vehicles that disseminate them, are identified and subsequently eliminated in a timely manner so that they are not in a position to harm other vehicles. Failing to do so could jeopardize the entire network, leading to fatalities and injuries amongst road users. Several researchers, over the years, have envisaged conventional cryptographic-based solutions employing certificates and the public key infrastructure for enhancing the security of vehicular networks. Nevertheless, cryptographic-based solutions are not optimum for an IoV network, primarily since the cryptographic schemes could be susceptible to compromised trust authorities and insider attacks that are highly deceptive in nature and cannot be noticed immediately and are, therefore, capable of causing catastrophic damage. Accordingly, in this article, a distributed trust management system has been proposed that ascertains the trust of all the reputation segments within an IoV network. The envisaged system takes into consideration the salient characteristics of familiarity (assessed via a subjective logic approach), similarity, and timeliness to ascertain the weights of all the reputation segments. Furthermore, an intelligent trust threshold mechanism has been developed for the identification and eviction of the misbehaving vehicles. The experimental results suggest the advantages of our proposed IoV-based trust management system in terms of optimizing the misbehavior detection and its resilience to various sorts of attacks.
  • Item
    When Does Collaboration Lead to Deeper Learning? Renewed Definitions of Collaboration for Engineering Students
    (Institute of Electrical and Electronics Engineers (IEEE), 2019) Ellis, R.A.; Han, F.; Pardo, A.
    Collaboration is an increasingly important and difficult skill for graduate engineers to develop. While universities provide some measures of collaboration ability of students on graduation, there is still some dissatisfaction with the level of preparedness of students for collaborative activity in the workplace. This paper presents a case study of a first year engineering cohort of more than 350 students to discuss the value of improving both the measures and definitions of collaborative ability on graduation of engineering students in a blended learning context. Research methods from student approaches to learning research and social network analysis are adopted to provide experiential and mathematical evidence of successful collaboration. The results provide a characterization of groups of students with respect to their approach to collaboration and the features most common in productively collaborative students. The discussion has implications for teaching, course design, and how universities define and measure collaborative ability of students.
  • Item
    3-D printed smart orthotic insoles: Monitoring a person's gait step by step
    (Institute of Electrical and Electronics Engineers (IEEE), 2020) Hao, Z.; Cook, K.; Canning, J.; Chen, H.T.; Martelli, C.
    This article reports a 3-D printing intelligent insole gait monitoring system based on an embedded fiber Bragg grating (FBG). The smart insole combines 3-D printing technology and FBG sensors providing high sensitivity and endpoint low cost. Results using pressure points measured by four FBGs are sufficient to differentiate foot loads and gait types.
  • Item
    Wind turbine power output prediction using a new hybrid neuro-evolutionary method
    (Elsevier, 2021) Neshat, M.; Nezhad, M.M.; Abbasnejad, E.; Mirjalili, S.; Groppi, D.; Heydari, A.; Tjernberg, L.B.; Astiaso Garcia, D.; Alexander, B.; Shi, Q.; Wagner, M.
    Abstract not available
  • Item (Open Access)
    Generalized framework for image and video object segmentation using affinity learning and message passing GNNs
    (Elsevier BV, 2023) Muthu, S.; Tennakoon, R.; Rathnayake, T.; Hoseinnezhad, R.; Suter, D.; Bab-Hadiashar, A.
    Despite the significant amount of work reported in the computer vision literature, segmenting images or videos based on multiple cues such as objectness, texture and motion is still a challenge. This is particularly true when the number of objects to be segmented is not known or there are objects that are not classified in the training data (unknown objects). A possible remedy to this problem is to utilize graph-based clustering techniques such as Correlation Clustering. It is known that using long range affinities (Lifted Multicut) makes correlation clustering more accurate than using only adjacent affinities (Multicut). However, the former is computationally expensive and hard to use. In this paper, we introduce a new framework to perform image/motion segmentation using an affinity learning module and a Message Passing Graph Neural Network (MPGNN). The affinity learning module uses a permutation invariant affinity representation to overcome the multi-object problem. The paper shows, both theoretically and empirically, that the proposed MPGNN aggregates higher order information and thereby converts the Lifted Multicut Problem (LMP) to a Multicut Problem (MP), which is easier and faster to solve. Importantly, the proposed method can be generalized to deal with different clustering problems with the same MPGNN architecture. For instance, our method produces competitive results for single image segmentation (on the BSDS dataset) as well as unsupervised video object segmentation (on the DAVIS17 dataset), by only changing the feature extraction part. In addition, using an ablation study on the proposed MPGNN architecture, we show that the way we update the parameterized affinities directly contributes to the accuracy of the results.
  • Item (Open Access)
    Machine-Learning Assessed Abdominal Aortic Calcification is Associated with Long-Term Fall and Fracture Risk in Community-Dwelling Older Australian Women
    (Wiley, 2023) Dalla Via, J.; Gebre, A.K.; Smith, C.; Gilani, Z.; Suter, D.; Sharif, N.; Szulc, P.; Schousboe, J.T.; Kiel, D.P.; Zhu, K.; Leslie, W.D.; Prince, R.L.; Lewis, J.R.; Sim, M.
    Abdominal aortic calcification (AAC), a recognized measure of advanced vascular disease, is associated with higher cardiovascular risk and poorer long-term prognosis. AAC can be assessed on dual-energy X-ray absorptiometry (DXA)-derived lateral spine images used for vertebral fracture assessment at the time of bone density screening using a validated 24-point scoring method (AAC-24). Previous studies have identified robust associations between AAC-24 score, incident falls, and fractures. However, a major limitation of manual AAC assessment is that it requires a trained expert. Hence, we have developed an automated machine-learning algorithm for assessing AAC-24 scores (ML-AAC24). In this prospective study, we evaluated the association between ML-AAC24 and long-term incident falls and fractures in 1023 community-dwelling older women (mean age, 75 ± 3 years) from the Perth Longitudinal Study of Ageing Women. Over 10 years of follow-up, 253 (24.7%) women experienced a clinical fracture identified via self-report every 4-6 months and verified by X-ray, and 169 (16.5%) women had a fracture hospitalization identified from linked hospital discharge data. Over 14.5 years, 393 (38.4%) women experienced an injurious fall requiring hospitalization identified from linked hospital discharge data. After adjusting for baseline fracture risk, women with moderate to extensive AAC (ML-AAC24 ≥ 2) had a greater risk of clinical fractures (hazard ratio [HR] 1.42; 95% confidence interval [CI], 1.10-1.85) and fall-related hospitalization (HR 1.35; 95% CI, 1.09-1.66), compared to those with low AAC (ML-AAC24 ≤ 1). Similar to manually assessed AAC-24, ML-AAC24 was not associated with fracture hospitalizations. The relative hazard estimates obtained using machine learning were similar to those using manually assessed AAC-24 scores. In conclusion, this novel automated method for assessing AAC, which can be easily and seamlessly captured at the time of bone density testing, has robust associations with long-term incident clinical fractures and injurious falls. However, the performance of the ML-AAC24 algorithm needs to be verified in independent cohorts. © 2023 The Authors. Journal of Bone and Mineral Research published by Wiley Periodicals LLC on behalf of American Society for Bone and Mineral Research (ASBMR).
  • Item
    Unsupervised Learning for Robust Fitting: A Reinforcement Learning Approach
    (IEEE, 2021) Truong, G.; Le, H.; Suter, D.; Zhang, E.; Gilani, S.Z.; IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (19 Jun 2021 - 25 Jun 2021 : virtual online)
    Robust model fitting is a core algorithm in a large number of computer vision applications. Solving this problem efficiently for datasets highly contaminated with outliers is, however, still challenging due to the underlying computational complexity. Recent literature has focused on learning-based algorithms. However, most approaches are supervised (which require a large amount of labelled training data). In this paper, we introduce a novel unsupervised learning framework that learns to directly solve robust model fitting. Unlike other methods, our work is agnostic to the underlying input features, and can be easily generalized to a wide variety of LP-type problems with quasiconvex residuals. We empirically show that our method outperforms existing unsupervised learning approaches, and achieves competitive results compared to traditional methods on several important computer vision problems.
  • Item
    Show, Attend and Detect: Towards Fine-Grained Assessment of Abdominal Aortic Calcification on Vertebral Fracture Assessment Scans
    (Springer, 2022) Gilani, S.Z.; Sharif, N.; Suter, D.; Schousboe, J.T.; Reid, S.; Leslie, W.D.; Lewis, J.R.; 25th International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI) (18 Sep 2022 - 22 Sep 2022 : Singapore); Wang, L.; Dou, Q.; Fletcher, P.T.; Speidel, S.; Li, S.
    More than 55,000 people world-wide die from Cardiovascular Disease (CVD) each day. Calcification of the abdominal aorta is an established marker of asymptomatic CVD. It can be observed on scans taken for vertebral fracture assessment from Dual Energy X-ray Absorptiometry machines. Assessment of Abdominal Aortic Calcification (AAC) and timely intervention may help to reinforce public health messages around CVD risk factors and improve disease management, reducing the global health burden related to CVDs. Our research addresses this problem by proposing a novel and reliable framework for automated “fine-grained” assessment of AAC. Inspired by the vision-to-language models, our method performs sequential scoring of calcified lesions along the length of the abdominal aorta on DXA scans, mimicking the human scoring process.
  • Item
    Energy-efficient Edge Server Management for Edge Computing: A Game-theoretical Approach
    (Association for Computing Machinery, 2023) Cui, G.; He, Q.; Xia, X.; Chen, F.; Yang, Y.; 51st International Conference on Parallel Processing (ICPP) (29 Aug 2022 - 1 Sep 2022 : Bordeaux, France)
    Similar to cloud servers which are well-known energy consumers, edge servers running 24/7 jointly consume a tremendous amount of energy and thus require energy-saving management. However, the unique characteristics of edge computing make it a new and challenging problem to manage edge servers in an energy-efficient manner. First, an individual edge server is usually used to serve a specific region. The temporal distribution of end-users in the area impacts the edge server’s energy utilization. Second, multiple base stations may cover an end-user simultaneously and the end-user can be served by the physical machines attached to any of the base stations. Serving the end-users in an area with minimum physical machines can minimize the edge servers’ overall energy consumption. Third, physical machines facilitating an edge server can be powered off individually when not needed to minimize the edge server’s energy consumption. We formulate this Energy-efficient Edge Server Management (EESM) problem and analyze its problem hardness. Next, a game-theoretical approach, i.e., EESM-G, is proposed to address EESM problems efficiently. The superior performance of EESM-G is tested on a public real-world dataset.
  • Item
    The Descriptive Features and Quantitative Aspects of Students' Observed Online Learning: How Are They Related to Self-Reported Perceptions and Learning Outcomes?
    (IEEE, 2022) Han, F.; Ellis, R.A.; Pardo, A.
    This article uses digital traces to help identify students’ online learning strategies by making a clear distinction between the descriptive features (the proportional distribution of students’ different online learning actions) and quantitative aspects (the total number of the online learning sessions), a distinction that has not been properly addressed in extant research. It also examines the extent to which the descriptive features and quantitative aspects of students’ observed online learning behaviors are related to students’ self-reported perceptions of the blended learning environment and the academic learning outcomes. A cohort of 317 Australian undergraduates enrolled in a compulsory engineering course participated in the study. A hierarchical cluster analysis, based on the different proportions of the types of online learning activities in which students were involved, identified two qualitatively different online learning strategies: content and practice oriented. The content-oriented learners not only had significantly more online learning sessions but also performed significantly better on both the formative and summative assessments, than their practice-oriented counterparts. Moreover, a higher proportion of students reporting more negative perceptions were observed to adopt practice-oriented strategies, whereas a higher proportion of students reporting better perceptions were observed to adopt content-oriented strategies. The study results serve as triangulated evidence for the previous self-reported research on the relations between students’ perceptions and strategies. The results of the study also offer a number of ideas for teaching and curriculum design in blended courses in order to improve the quality of students’ blended learning experiences.
  • Item
    Students' self-report and observed learning orientations in blended university course design: How are they related to each other and to academic performance?
    (Wiley, 2020) Han, F.; Pardo, A.; Ellis, R.A.
    This study examines the extent to which the learning orientations identified by student self-reports and the observation of their online learning events were related to each other and to their academic performance. The participants were 322 first-year engineering undergraduates, who were enrolled in a blended course. Using students' self-report on a questionnaire about their approaches to learning and perceptions of the blended learning environment, ‘understanding’ and ‘reproducing’ learning orientations were identified. Using observations of student activity online, a Hidden Markov Model (HMM) and agglomerative sequence clustering detected four qualitatively different patterns of online learning orientations. Cross-tabulations showed significant and logical associations amongst the learning orientations derived by the self-report and observational methods. Significant differences were also consistently found in the students' academic performance across the mid-term and final assessments based on their learning orientations detected by both self-report and observational methods, results which have important implications for learning research.
  • Item (Open Access)
    Machine learning for abdominal aortic calcification assessment from bone density machine-derived lateral spine images
    (Elsevier BV, 2023) Sharif, N.; Gilani, S.Z.; Suter, D.; Reid, S.; Szulc, P.; Kimelman, D.; Monchka, B.A.; Jozani, M.J.; Hodgson, J.M.; Sim, M.; Zhu, K.; Harvey, N.C.; Kiel, D.P.; Prince, R.L.; Schousboe, J.T.; Leslie, W.D.; Lewis, J.R.
    Background: Lateral spine images for vertebral fracture assessment can be easily obtained on modern bone density machines. Abdominal aortic calcification (AAC) can be scored on these images by trained imaging specialists to assess cardiovascular disease risk. However, this process is laborious and requires careful training. Methods: Training and testing of model performance of the convolutional neural network (CNN) algorithm for automated AAC-24 scoring utilised 5012 lateral spine images (2 manufacturers, 4 models of bone density machines), with trained imaging specialist AAC scores. Validation occurred in a registry-based cohort study of 8565 older men and women with images captured as part of routine clinical practice for fracture risk assessment. Cox proportional hazards models were used to estimate the association between machine-learning AAC (ML-AAC-24) scores and future incident Major Adverse Cardiovascular Events (MACE), which included death, hospitalised acute myocardial infarction or ischemic cerebrovascular disease ascertained from linked healthcare data. Findings: The average intraclass correlation coefficient between imaging specialist and ML-AAC-24 scores for 5012 images was 0.84 (95% CI 0.83, 0.84) with classification accuracy of 80% for established AAC groups. During a mean follow-up of 4 years in the registry-based cohort, MACE outcomes were reported in 1177 people (13.7%). With increasing ML-AAC-24 scores there was an increasing proportion of people with MACE (low 7.9%, moderate 14.5%, high 21.2%), as well as individual MACE components (all p-trend <0.001). After multivariable adjustment, moderate and high ML-AAC-24 groups remained significantly associated with MACE (HR 1.54, 95% CI 1.31–1.80 & HR 2.06, 95% CI 1.75–2.42, respectively), compared to those with low ML-AAC-24. Interpretation: The ML-AAC-24 scores had substantial levels of agreement with trained imaging specialists, and were associated with a substantial gradient of risk for cardiovascular events in a real-world setting. This approach could be readily implemented into these clinical settings to improve identification of people at high CVD risk.
  • Item
    ITSA: An Information-Theoretic Approach to Automatic Shortcut Avoidance and Domain Generalization in Stereo Matching Networks
    (IEEE, 2022) Chuah, W.Q.; Tennakoon, R.; Hoseinnezhad, R.; Bab-Hadiashar, A.; Suter, D.; IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (18 Jun 2022 - 24 Jun 2022 : New Orleans, Louisiana)
    State-of-the-art stereo matching networks trained only on synthetic data often fail to generalize to more challenging real data domains. In this paper, we attempt to unfold an important factor that hinders the networks from generalizing across domains: through the lens of shortcut learning. We demonstrate that the learning of feature representations in stereo matching networks is heavily influenced by synthetic data artefacts (shortcut attributes). To mitigate this issue, we propose an Information-Theoretic Shortcut Avoidance (ITSA) approach to automatically restrict shortcut-related information from being encoded into the feature representations. As a result, our proposed method learns robust and shortcut-invariant features by minimizing the sensitivity of latent features to input variations. To avoid the prohibitive computational cost of direct input sensitivity optimization, we propose an effective yet feasible algorithm to achieve robustness. We show that using this method, state-of-the-art stereo matching networks that are trained purely on synthetic data can effectively generalize to challenging and previously unseen real data scenarios. Importantly, the proposed method enhances the robustness of the synthetic trained networks to the point that they outperform their fine-tuned counterparts (on real data) for challenging out-of-domain stereo datasets.
  • Item
    A Systematic Review of Empirical Studies on Learning Analytics Dashboards: A Self-Regulated Learning Perspective
    (IEEE, 2020) Matcha, W.; Uzir, N.A.; Gasevic, D.; Pardo, A.
    This paper presents a systematic literature review of learning analytics dashboards (LADs) research that reports empirical findings to assess the impact on learning and teaching. Several previous literature reviews identified self-regulated learning as a primary focus of LADs. However, there has been much less understanding how learning analytics are grounded in the literature on self-regulated learning and how self-regulated learning is supported. To address this limitation, this review analyzed the existing empirical studies on LADs based on the well-known model of self-regulated learning proposed by Winne and Hadwin. The results show that existing LADs are rarely grounded in learning theory, cannot be suggested to support metacognition, do not offer any information about effective learning tactics and strategies, and have significant limitations in how their evaluation is conducted and reported. Based on the findings of the study and through the synthesis of the literature, the paper proposes that future research and development should not make any a priori design decisions about representation of data and analytic results in learning analytics systems such as LADs. To formalize this proposal, the paper defines the model for user-centered learning analytics systems (MULAS). The MULAS consists of the four dimensions that are cyclically and recursively interconnected including: theory, design, feedback, and evaluation.
  • Item
    Unsupervised Learning for Maximum Consensus Robust Fitting: A Reinforcement Learning Approach
    (Institute of Electrical and Electronics Engineers (IEEE), 2023) Truong, G.; Le, H.; Zhang, E.; Suter, D.; Gilani, S.Z.
    Robust model fitting is a core algorithm in several computer vision applications. Despite being studied for decades, solving this problem efficiently for datasets that are heavily contaminated by outliers is still challenging due to the underlying computational complexity. A recent focus has been on learning-based algorithms. However, most of these approaches are supervised (which require a large amount of labelled training data). In this paper, we introduce a novel unsupervised learning framework that learns to directly (without labelled data) solve robust model fitting. Moreover, unlike other learning-based methods, our work is agnostic to the underlying input features, and can be easily generalized to a wide variety of LP-type problems with quasi-convex residuals. We empirically show that our method outperforms existing (un)supervised learning approaches, and also achieves competitive results compared to traditional (non-learning-based) methods. Our approach is designed to try to maximise consensus (MaxCon), similar to the popular RANSAC. The basis of our approach is to adopt a Reinforcement Learning framework. This requires designing appropriate reward functions and state encodings. We provide a family of reward functions, tunable by choice of a parameter. We also investigate the application of different basic and enhanced Q-learning components.
  • Item
    Keeping the Questions Conversational: Using Structured Representations to Resolve Dependency in Conversational Question Answering
    (IEEE, 2023) Zaib, M.; Sheng, Q.Z.; Zhang, W.E.; Mahmood, A.; International Joint Conference on Neural Networks (IJCNN) (18 Jun 2023 - 23 Jun 2023 : Gold Coast, Australia)
    Having an intelligent dialogue agent that can engage in conversational question answering (ConvQA) is no longer limited to sci-fi movies and has, in fact, turned into a reality. These intelligent agents are required to understand and correctly interpret the sequential turns provided as the context of the given question. However, these sequential questions are sometimes left implicit and thus require the resolution of some natural language phenomena such as anaphora and ellipsis. The task of question rewriting has the potential to address the challenges of resolving dependencies amongst the contextual turns by transforming them into intent-explicit questions. Nonetheless, the solution of rewriting the implicit questions comes with some potential challenges, such as resulting in verbose questions and taking the conversational aspect out of the scenario by generating self-contained questions. In this paper, we propose a novel framework, CONVSR (CONVQA using Structured Representations), for capturing and generating intermediate representations as conversational cues to enhance the capability of the QA model to better interpret the incomplete questions. We also deliberate how the strengths of this task could be leveraged in a bid to design more engaging and more eloquent conversational agents. We test our model on the QuAC and CANARD datasets and illustrate by experimental results that our proposed framework achieves a better F1 score than the standard question rewriting model.
  • Item
    Hybrid Data Augmentation for Citation Function Classification
    (IEEE, 2023) Zhang, Y.; Wang, Y.; Sheng, Q.Z.; Mahmood, A.; Zhang, W.E.; Zhao, R.; International Joint Conference on Neural Networks (IJCNN) (18 Jun 2023 - 23 Jun 2023 : Gold Coast, Australia)
    The citation function generally signifies the purpose or reason underlying a citation within a scholarly paper or a research article. Automatic citation function classification is, therefore, a task in computational linguistics and information science that can facilitate further applications in reference research, citation recommendation, and evaluation of research activities. By taking into account the state of the art, we identify two major constraints pertinent to the data of the citation function classification task, i.e., data imbalance and data sparsity. On the one hand, the natural distribution of different types of citations in the scientific literature is uneven, leading to data imbalance in the real scenario. On the other hand, the citation function data is generally labeled by an expert, which takes huge human effort and results in a limited data scale. To this end, in this paper, we propose HybridDA, a two-stage model based on GPT-2 data augmentation and data retrieval to synthesize more high-quality annotated citation function data in a bid to solve both data imbalance and data sparsity problems. We conduct experiments on an imbalance setting and a low resource setting with our proposed approach. The experimental results on both of these settings demonstrate that our proposed model can achieve competitive performance in contrast to the other baseline models.