Navigating Contractual Limits in Data Sets

by Zachary Barlow

November 18, 2025

Data collection and analytics have grown exponentially since the advent of the internet. As AI tools become more effective, quality datasets are driving value like never before. However, many non-public datasets are bound by contractual provisions that prevent them from being analyzed by AI systems. A recent Debevoise & Plimpton memo discusses the challenge of contractual use limitations on datasets. It provides the following example of how this problem can unfold in practice:

“To illustrate the point, suppose an insurance company wants to use AI to re-price its auto insurance in a particular city, neighborhood by neighborhood, using a vast quantity of data that may be relevant for accident or theft claims in each location. They have collected or purchased data relating to weather conditions, road construction, crime statistics, past insurance claims, telematics, vandalism frequency by make and model of car, drone footage with analytics, etc. Each dataset may be subject to multiple contracts, and therefore multiple possible restrictions, which may differ by provider, by time period, and by location.”

The memo notes that many of these provisions were written in 2023 and 2024. Originally intended to keep private data from spilling into AI training data, they impose a blanket restriction on using AI with the covered dataset. These provisions didn’t anticipate closed AI systems that mitigate such security risks. Organizations deploying AI need to review the contracts governing their datasets and understand what limits apply before feeding that data into their AI systems.