Excessive Data Exposure: What It Is, How We Can Help


Excessive data exposure is No. 3 on the OWASP API Security Top 10 list (2019 edition), after broken object level authorization (BOLA) and broken user authentication. OWASP says of this vulnerability, “Looking forward to generic implementations, developers tend to expose all object properties without considering their individual sensitivity, relying on clients to perform the data filtering before displaying it to the user.”

How Do Excessive Data Exposure Exploits Work? 

Attackers can probe for excessive data exposure in several ways. They can analyze legitimate response traffic for sensitive fields that the client receives but never displays, or, more commonly, they can look for human patterns (development team practices) that hint at how to attack an API.
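The same traffic analysis can be turned around and used defensively. The sketch below walks a JSON response body and reports any key whose name suggests sensitive data. The hint list, the function name find_sensitive_keys, and the sample payload are all assumptions made for illustration; a real audit would use the data classifications of the system under test.

```python
import json

# Hypothetical key-name fragments that suggest sensitive data.
SENSITIVE_HINTS = ("password", "ssn", "email", "token", "address")


def find_sensitive_keys(payload, path=""):
    """Return the paths of keys whose names contain a sensitive hint."""
    hits = []
    if isinstance(payload, dict):
        for key, value in payload.items():
            child = f"{path}.{key}" if path else key
            if any(hint in key.lower() for hint in SENSITIVE_HINTS):
                hits.append(child)
            # Keep descending: nested objects are where leaks tend to hide.
            hits.extend(find_sensitive_keys(value, child))
    elif isinstance(payload, list):
        for index, item in enumerate(payload):
            hits.extend(find_sensitive_keys(item, f"{path}[{index}]"))
    return hits


# Made-up response body standing in for captured traffic.
captured = json.loads(
    '{"id": 7, "text": "Nice post", '
    '"author": {"name": "sam", "email": "sam@example.com", "password_hash": "x"}}'
)
print(find_sensitive_keys(captured))
```

Matching on key names is only a heuristic. It will miss sensitive values stored under innocuous names, so it complements a review of the serialization code rather than replacing one.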

OWASP gives this example: 

The mobile team uses the /api/articles/{articleId}/comments/{commentId} endpoint in the articles view to render comments metadata. Sniffing the mobile application traffic, an attacker finds out that other sensitive data related to comment’s author is also returned. The endpoint implementation uses a generic toJSON() method on the User model, which contains PII, to serialize the object. 
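The pattern in OWASP's example can be sketched in a few lines. The code below is a Python analogue, not the implementation OWASP describes: the User and Comment classes, their fields, and both function names are hypothetical. It contrasts a generic serializer, which copies every property of the nested author, with one that names only the fields the comments view needs.

```python
from dataclasses import dataclass, asdict


@dataclass
class User:
    id: int
    display_name: str
    email: str          # PII
    home_address: str   # PII


@dataclass
class Comment:
    id: int
    text: str
    author: User


def comment_to_json_generic(comment):
    # Risky: asdict() recurses into the nested User and copies every field,
    # so the PII travels to the client even if the UI never shows it.
    return asdict(comment)


def comment_to_json_explicit(comment):
    # Safer: list each field the comments view actually needs.
    return {
        "id": comment.id,
        "text": comment.text,
        "author": {"display_name": comment.author.display_name},
    }


user = User(1, "sam", "sam@example.com", "1 Main St")
comment = Comment(10, "Nice post", user)
print(comment_to_json_generic(comment))   # includes email and home_address
print(comment_to_json_explicit(comment))  # author carries display_name only
```

The mobile client renders the same view from either payload, which is why the leak goes unnoticed until someone inspects the traffic.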

How to Prevent Excessive Data Exposure 

It’s common practice when building APIs for developers to simply serialize all the data related to a particular API resource, irrespective of that data’s sensitivity. This may seem like a commonsense, time-saving design pattern, but it can result in an information leak, where sensitive data is exposed to unauthorized clients or bad actors. A more defensive practice is to clearly classify the data in a system and to define a separate data model for public interfaces such as APIs.
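One way to combine classification with a separate public model is sketched below, assuming Python dataclasses. The classification labels, the Account and PublicAccount classes, and the helper names are all hypothetical. The internal model tags each field with a classification, the API returns a dedicated public model, and a small check confirms that the public model only contains fields classified as public.

```python
from dataclasses import dataclass, field, fields, asdict

# Hypothetical classification labels.
PUBLIC = "public"
INTERNAL = "internal"
SENSITIVE = "sensitive"


def classified(level):
    # Attach the classification to the field as dataclass metadata.
    return field(metadata={"classification": level})


@dataclass
class Account:
    """Internal model: every field carries a classification."""
    id: int = classified(PUBLIC)
    display_name: str = classified(PUBLIC)
    email: str = classified(SENSITIVE)
    last_login_ip: str = classified(INTERNAL)


@dataclass(frozen=True)
class PublicAccount:
    """Separate model for API responses."""
    id: int
    display_name: str

    @classmethod
    def from_account(cls, account):
        return cls(id=account.id, display_name=account.display_name)


def public_field_names(model):
    # Fields with no classification are treated as non-public (fail closed).
    return {
        f.name for f in fields(model)
        if f.metadata.get("classification") == PUBLIC
    }


def exposes_only_public_fields(public_model, internal_model):
    exposed = {f.name for f in fields(public_model)}
    return exposed <= public_field_names(internal_model)


account = Account(1, "sam", "sam@example.com", "203.0.113.5")
print(asdict(PublicAccount.from_account(account)))
print(exposes_only_public_fields(PublicAccount, Account))
```

Running the check in a test suite means that adding a sensitive field to the public model fails the build instead of reaching production.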

Bottom line for developers: Be conservative about what data you return in API responses. It might seem like a great idea to “future-proof” an API by making it usable for applications the application owner did not originally envision. But a future fraught with data breaches isn’t on anyone’s bucket list. Instead, be conservative in resource representations and include only the data necessary for well-understood use cases. This conservative approach dramatically decreases implementation effort, and also presents…
