Trusted AI needs trusted data
In the buzz around AI, let’s not ignore the role of data in developing AI we can trust, says one Notre Dame computational scientist.
Two years ago, Notre Dame launched the Trusted AI project with collaborators at Indiana University, Purdue University, and the Naval Surface Warfare Center Crane Division (NSWC Crane).
In the time since the project launched, AI has undergone an important change. Not only has it developed technologically, but it has also become a matter of public interest. For example, since its launch less than a year ago, OpenAI’s chatbot ChatGPT has gained over 100 million users, and six in ten U.S. adults say they know about the technology.
According to Charles Vardeman, a computational scientist and research assistant professor at the University of Notre Dame’s Center for Research Computing, the buzz surrounding generative artificial intelligence (AI) can sometimes draw attention away from a crucial element that determines whether we can trust AI or not: data.
“In AI development, much of the attention is given to fine-tuning model architectures, experimenting with different layers, activation functions, and optimization algorithms,” Vardeman explains. “While this approach has driven many innovations, it may sometimes overlook the central role of data in AI success.”
Vardeman illustrates the problem with a cartoon called “Flawed Data” by XKCD.
For Vardeman, the message is simple: Trusted AI needs trusted data, and “irrespective of their complexity, models are only as good as the data they learn from.”
Vardeman leads the Frameworks Project, a part of the larger Trusted AI project that provides research, methodologies, and tools to enable trusted and ethical AI. He says he and his collaborators find inspiration in the data-centric AI movement, championed by tech leaders such as Andrew Ng. The movement emphasizes the importance of tools and practices that systematically improve the data used to build an AI system.
“By prioritizing data quality, encompassing aspects such as cleanliness, relevance, diversity, and contextual richness, AI systems can achieve better performance with potentially simpler architectures,” Vardeman says. He points out that the shift toward a data-centric approach is “not just a technical reorientation; it’s a recalibration of AI development’s very essence, placing data at the heart of the innovation process.”
According to Vardeman, a data-centric approach aligns well with the military’s requirements.
“In complex environments where variability and uncertainty are common, the depth and quality of data can make a significant difference,” he explains. “A well-annotated dataset that captures the intricacies of real-world scenarios allows AI models to learn more effectively and generalize better to unseen situations.”
The Frameworks Project commits to data-centric goals that align with the Department of Defense’s data goals and strategies. It focuses on making data:
- Visible: Ensuring that data is discoverable by those who need it fosters transparency and helps build a foundation of trust in AI systems.
- Accessible: Making data readily available to authorized users enhances the efficiency and effectiveness of AI, providing the right information at the right time.
- Understandable: Clear documentation and metadata contribute to the explainability of AI, a core dimension of trust.
- Linked: Connecting related data sets facilitates more coherent AI analysis, ensuring robustness and reliability in decision-making.
- Trustworthy: Maintaining the integrity and quality of data ensures that AI systems are dependable, echoing the Trustworthy dimension in our trust framework.
- Interoperable: Enabling data to be used across different systems and platforms fosters collaboration and integration, key aspects of our community engagement efforts.
- Secure: Implementing strong data security measures safeguards privacy and aligns with the ethical considerations central to trusted AI.
Vardeman says that by recognizing data as a strategic asset and prioritizing these seven goals, the Frameworks Project is “nurturing an environment where trusted AI can flourish.”
As the Frameworks Project enters its third year, Vardeman says, it will build on the momentum of the data-centric AI movement and focus on further refining data quality and context.
For the Trusted AI (TAI) Frameworks Project, a data-centric philosophy serves as an essential cornerstone, recognizing that robust data management practices can lead to more reliable, interpretable, and trusted AI systems. This perspective does not diminish the value of innovative modeling but places it in the context of a balanced and well-considered AI development strategy, where data and models work in harmony.
Vardeman says, “The data-driven journey of the Trusted AI Frameworks Project illuminates the vital role of data in enhancing the trustworthiness and success of AI deployment within the Navy and Marine Corps. With a vision rooted in collaboration, ethical excellence, and technological innovation, the path forward promises to be an exciting and transformative adventure.”
“Together,” he says, “academia, industry, and the military will continue to forge a data-centric future that not only serves the needs of the present but anticipates the challenges and opportunities of tomorrow.”
Contact
Brett Beasley / Writer and Editorial Program Manager
Notre Dame Research / University of Notre Dame
bbeasle1@nd.edu / +1 574-631-8183
research.nd.edu / @UNDResearch
About the Center for Research Computing
The Center for Research Computing (CRC) at the University of Notre Dame is an innovative and multidisciplinary research environment that supports collaboration to facilitate multidisciplinary discoveries through advanced computation, software engineering, artificial intelligence, and other digital research tools. The Center enhances the University’s innovative applications of cyberinfrastructure, provides support for interdisciplinary research and education, and conducts computational research. Learn more at crc.nd.edu.
About Notre Dame Research
The University of Notre Dame is a private research and teaching university inspired by its Catholic mission. Located in South Bend, Indiana, its researchers are advancing human understanding through research, scholarship, education, and creative endeavor in order to be a repository for knowledge and a powerful means for doing good in the world. For more information, please see research.nd.edu or @UNDResearch.
Originally published by crc.nd.edu on September 19, 2023.