Our ability to generate, store, and analyze data is growing exponentially: we create over 400 million terabytes of data each day, and approximately 30% of that comes from the healthcare industry. With this flood of information, hospitals and healthcare providers are increasingly leaning on data and artificial intelligence (AI) to enhance patient care and streamline their processes. These advancements offer incredible potential to improve healthcare outcomes, but they also bring a host of responsibilities, especially regarding the quality, security, and reliability of clinical data.
What is data stewardship and why does it matter?
We have all heard the saying, “garbage in, garbage out,” and it applies to every AI model. Data stewardship is the careful management, protection, and utilization of data throughout its lifecycle. In healthcare, data stewardship ensures that patient data are accurate, reliable, appropriate, and securely stored with limited access. In research and clinical care, high-quality data are paramount. Predictive algorithms and machine learning models are not sentient: the quality and applicability of the input data used to train them dictate the accuracy and reliability of the output. For example, a postoperative complication model generated from a dataset of healthy, homogeneous patients would likely perform poorly in a more diverse population of geriatric, frail patients. This is an example of input data that are not fit for the purpose for which they were chosen. In some cases, the data are not only inappropriate but also inaccurate or incomplete. In the case of IBM Watson, the limited source dataset did not fully capture the complexity and variability of real cancer cases, and the resulting model generalized poorly, could not be scaled, and had low clinical utility.
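As a rough illustration of a fit-for-purpose check, the sketch below compares the age distribution of a training cohort with that of the population where a model would be used, using a standardized mean difference (SMD). The ages are invented for illustration, the function name is hypothetical, and the 0.1 cutoff is a commonly used rule of thumb rather than a fixed standard. A real assessment would look at many more characteristics than age.

```python
from math import sqrt
from statistics import mean, stdev


def standardized_mean_difference(train, target):
    """Difference in means divided by the pooled standard deviation."""
    pooled_sd = sqrt((stdev(train) ** 2 + stdev(target) ** 2) / 2)
    return (mean(target) - mean(train)) / pooled_sd


# Hypothetical ages, invented for illustration only.
training_ages = [34, 41, 38, 45, 39, 42, 36, 40]  # healthy, homogeneous cohort
target_ages = [78, 84, 81, 76, 88, 79, 83, 80]    # geriatric, frail population

smd = standardized_mean_difference(training_ages, target_ages)

# Assumed threshold: |SMD| above 0.1 is often treated as a meaningful imbalance.
needs_review = abs(smd) > 0.1
print(f"SMD for age: {smd:.2f}; review before reuse: {needs_review}")
```

A large SMD does not prove that a model will fail in the new population, but it is a signal that the training data may not be fit for this purpose and that the model should be validated locally before use.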
Why should surgeons and surgical researchers care about data quality and governance?
As surgeons, our role in data generation and governance is much larger than we think. We generate massive amounts of unstructured and structured clinical data, from detailed clinical notes to real-time data from robotic systems and patient wearables. These rich resources are invaluable for observational studies and real-world evidence generation. However, if the data aren’t accurate or aren’t documented in a standardized way, their value diminishes. We need to take an active role in ensuring that our clinical documentation and data capture at the point of care are intentional, thorough, and consistent, because these practices build a strong foundation for future AI applications in surgery and healthcare.
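To make "thorough and consistent" concrete, here is a minimal sketch of an automated check that flags records with missing fields or non-standard values. The field names (`patient_id`, `procedure_code`, `asa_class`) and the sample records are hypothetical, and the only standard assumed is that ASA physical status is recorded as a Roman numeral from I to VI. An institution's own data dictionary would define the real rules.

```python
# Hypothetical required fields; a real list would come from a data dictionary.
REQUIRED_FIELDS = ("patient_id", "procedure_code", "asa_class")
# Assumed convention: ASA physical status recorded as Roman numerals I to VI.
ALLOWED_ASA = {"I", "II", "III", "IV", "V", "VI"}


def find_issues(record):
    """Return a list of data-quality problems found in one record."""
    issues = []
    for field in REQUIRED_FIELDS:
        # Treat absent, None, and blank values as missing.
        if not str(record.get(field) or "").strip():
            issues.append(f"missing {field}")
    asa = record.get("asa_class")
    if asa and asa not in ALLOWED_ASA:
        issues.append(f"non-standard asa_class: {asa!r}")
    return issues


# Invented example records, for illustration only.
records = [
    {"patient_id": "A1", "procedure_code": "P-001", "asa_class": "II"},
    {"patient_id": "A2", "procedure_code": "", "asa_class": "2"},
    {"patient_id": "A3", "procedure_code": "P-002"},
]

report = {r["patient_id"]: find_issues(r) for r in records}
for patient_id, issues in report.items():
    print(patient_id, issues or "ok")
```

Checks like this are most useful close to the point of care, where the person who entered the data can still correct them.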
How do we become better stewards of data and what can we do to ensure a safe data culture for our patients and clinicians?
Institutions are beginning to prioritize data governance by forming dedicated committees and involving diverse stakeholders, including IT, data scientists, and clinicians. For surgical data, it’s essential that surgeons have a seat at the table. Surgeons need to help define the types of data that are collected and ensure that data standards are maintained across the board. Important questions include: Who owns the data? How are data standardized for interoperability and reuse? How do we sustain high-quality data practices over time? By actively participating in these discussions, surgeons can ensure that surgical data are both meaningful and applicable across different research and clinical contexts.
As exciting as AI may be, we must remember that models don’t create themselves. They are built on data, and the quality of those data determines a model’s effectiveness. For AI to revolutionize surgical care, we need strong data foundations that truly reflect the realities of clinical practice. We, as surgeons, are in a unique position to influence this transformation. By being diligent stewards of our data, we can help pave the way for AI-driven innovations that improve patient care and clinical workflows.