Declared Vs Detected, And All The Fraud In Between

I’ve written before that marketers love “more.” In fact, many of the largest marketers buy digital ads as if they were shopping at Costco, i.e. more quantity at lower unit cost. But what works for toilet paper doesn’t work in digital marketing. That’s because the more digital ads you buy, the more exposed you are to ad fraud. There simply aren’t enough humans on earth, let alone ones that spend time online and on mobile devices and apps, to generate the enormous quantities of ads bought and sold through programmatic channels. And you know humans can only watch one streaming show on their connected TV (CTV) at a time, right? There’s no “parallel” consumption that can explain the quantities and growth rate of CTV impressions, even if you take into account more humans streaming more during the pandemic lockdowns. A large part of the CTV ad quantity increase is from bots pretending to be Roku streaming sticks to absorb your lovely ad budgets.

I’ve also written about how bad guys are super efficient in their fraud work. If they can cut costs and do less work while still making more money, they will. For example, the bots that they use are tuned to do just enough to make money, and not waste time doing any unnecessary work. If they just need to load an ad to make money, that’s all they do before moving on. This works for CPM (cost per thousand) campaigns, where they just need to generate ad impressions. For CPC (cost per click) campaigns, they also have to click the ad to make the money, so they click and move on. Along the same lines of efficiency, if the bot doesn’t have to load the entire webpage, it can save time and bandwidth by loading just the ad itself, without the webpage. We call this form of fraud “naked ad calls” (humans didn’t go to a webpage, and didn’t see those ads). Let’s take it a step further. What if the bots didn’t even need to load the ad itself? If they could just trick the reporting to make it appear that ads ran, they would still get paid and not have to waste any time, money, or bandwidth loading any ads. The marketers just paid for some numbers in an Excel spreadsheet. You don’t think this happens? Then you need to read my other 104 articles on Forbes.

See: Tricking Google Analytics With Phantom Traffic, Geolocations, Referrers [video 1] [video 2]

Aside from the classic example of tricking Google Analytics into thinking there was traffic when no traffic actually occurred on the website, there are many other ways fraudsters can simply create what marketers want, or want to see, out of thin air, and get paid for it. Often this is just a matter of declaring certain things. “Declared” means they can just say what it is, as opposed to “detected” which means some form of analytics independently detected the data.

Geolocation is a good example to illustrate this point. In programmatic bid requests, fraudsters can simply “declare” the lat-long location of the user by writing it into the bid request. They can insert random locations, because bids with any location data make more money than bids without it. They can also insert specific locations to pretend to be users in those places, so they can steal from geotargeted campaigns like political advertising targeted at specific states. All of these data points are declared. Even on mobile devices that have real GPS sensors, mobile apps write declared data into the bid requests instead of asking the GPS for real locations, for speed and for fraud. The GPS sensor data would be the “detected” data, but the apps skip it in order to speed up the bidding process. Further, on most browsers real GPS location data is not used, because asking for it triggers a user prompt and that gets in the way of the bid request. So desktop and mobile web impressions virtually never use real detected data, and pass declared locations instead.
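To make the declared-versus-detected comparison concrete for location, here is a minimal Python sketch. It assumes you have some independently detected location for the same impression (how you obtain it is up to your own measurement). The function names and the 100 km threshold are hypothetical, chosen only for illustration.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat-long points."""
    earth_radius_km = 6371.0  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))

def geo_check(declared, detected, max_km=100.0):
    """Compare a declared (lat, lon) against a detected (lat, lon).

    Returns "unverified" when there is no detected value to compare,
    "mismatch" when the two are farther apart than max_km (an assumed,
    illustrative threshold), and "match" otherwise.
    """
    if detected is None:
        return "unverified"
    distance = haversine_km(declared[0], declared[1], detected[0], detected[1])
    return "mismatch" if distance > max_km else "match"

# Declared: New York. Detected: Los Angeles. That deserves a closer look.
new_york = (40.7128, -74.0060)
los_angeles = (34.0522, -118.2437)
print(geo_check(new_york, los_angeles))
print(geo_check(new_york, None))
```

Note that "unverified" is its own outcome here. A bid request with only a declared location has not been confirmed; it just has not been contradicted yet.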

A key lesson to learn here is the difference between declared data and detected data, and the vast opportunity for fraud created by the pervasive use of declared data in programmatic ad tech. Too few understand this distinction (“declared” vs “detected”), let alone are on the lookout for it. Beyond the locations of users, fraudsters can simply declare the domain/website that an ad was going to appear on. This is called domain spoofing. A fake site, or no site at all, can simply declare itself to be a mainstream site as a disguise. This is done all the time because these sites would not get any bids under their own names, but they will get bids when they declare a mainstream domain in the bid request (even though they put in their own sellerID, so they can get paid later). This technique was so successful (and is ongoing) that it has even tricked two so-called fraud detection companies into thinking there were bots on mainstream sites, when there were no bots and no ads at all, just faked bid requests with declared domains (sports websites) and non-existent webpages (404 errors).
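The same check applies to domains. The sketch below assumes you have the domain declared in the bid request and a page URL detected after the ad loaded, for example by your own in-ad measurement. Both function names are hypothetical, and the comparison is deliberately crude: it only strips a leading "www." and treats other subdomains as different hosts.

```python
from urllib.parse import urlparse

def normalize_host(value):
    """Lowercase host from a URL or bare domain, without a leading 'www.'."""
    # urlparse only recognizes a host when the string contains '//'.
    parsed = urlparse(value if "//" in value else "//" + value)
    host = (parsed.hostname or "").lower()
    return host[4:] if host.startswith("www.") else host

def domain_mismatch(declared_domain, detected_page_url):
    """True when the declared domain is not the host the ad actually loaded on.

    An empty detected URL (no page at all) also counts as a mismatch.
    """
    return normalize_host(declared_domain) != normalize_host(detected_page_url)

# Declared a mainstream domain, but the ad was detected on a different host.
print(domain_mismatch("example.com", "https://www.example.com/scores"))  # False
print(domain_mismatch("example.com", "https://fake-site.example.net/"))  # True
print(domain_mismatch("example.com", ""))                                # True
```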

What else is declared? All HTTP headers can be declared. That means if a fraudster had a reason to declare something fake to make money, they would. And they do. The simplest example is declaring the HTTP_USER_AGENT, which is the name and identifier of the browser. Bots can simply declare themselves to be a normal browser like Chrome, Safari, or Firefox. They can even declare the version number and other details precisely to get by fraud detection that just checks these HTTP headers. That’s why you need to detect the user agent after an ad loads and compare the detected value to the value declared in the HTTP header. If those don’t match, that’s a red flag. As mentioned before, all bid requests are declared. That means every single variable in it could be falsified. Should you trust any of it, without subsequent verification? That’s why all pre-bid blocking solutions are of limited use — because they rely on declared data, and must decide whether to let it through in milliseconds. They usually let most things through. Don’t even get me started on ads.txt, the “miracle drug” from the IAB that they thought would solve ad fraud. Fraudsters have fully “pwned” them by using fake ads.txt files, renting legit ads.txt from co-conspirators, or declaring their sites to be “direct” (when they weren’t) to absorb ad dollars from marketers buying “direct only.” Fraudsters just “declared it to defraud it.”
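Here is a rough Python sketch of the user agent comparison described above. It assumes you have two strings for the same impression: the user agent declared in the HTTP header, and one detected after the ad loads (for instance, reported back by a script running in the ad). The patterns and function names are illustrative assumptions, not a complete browser parser.

```python
import re

# Order matters: Chrome's user agent also contains "Safari/", and Edge's
# contains "Chrome/", so the more specific patterns are checked first.
_FAMILIES = [
    ("HeadlessChrome", re.compile(r"HeadlessChrome/")),
    ("Edge", re.compile(r"\bEdg(?:e|A|iOS)?/")),
    ("Chrome", re.compile(r"\b(?:Chrome|CriOS)/")),
    ("Firefox", re.compile(r"\b(?:Firefox|FxiOS)/")),
    ("Safari", re.compile(r"\bSafari/")),
]

def browser_family(user_agent):
    """Crude browser family guess from a user agent string."""
    for name, pattern in _FAMILIES:
        if pattern.search(user_agent or ""):
            return name
    return "Other"

def ua_red_flag(declared_ua, detected_ua):
    """True when the declared and detected browser families differ."""
    return browser_family(declared_ua) != browser_family(detected_ua)

declared = ("Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
            "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36")
detected = ("Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
            "(KHTML, like Gecko) HeadlessChrome/120.0.0.0 Safari/537.36")
print(ua_red_flag(declared, detected))  # True: declared Chrome, detected headless
```

A match here proves little, since a careful bot can fake both values. A mismatch is the useful signal: it is the red flag that tells you to look further.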

Even IP addresses are not “clear-cut fraud or not” indicators. A data center IP address may not be bot fraud, if a real human user is using a VPN and thus appears to be coming from a data center IP address. A residential IP address may not be valid, if malware or fraudulent apps were generating ad impressions from one or more devices within that household. This is why the block lists of IP addresses that industry trade associations tout as “member benefits” are completely and utterly useless. First off, comparing the IP addresses of every single bid request against a blocklist of 100 million IP addresses is a tremendous waste of computing resources. Second, it’s not clear if an IP address is good or bad, per the above. And third, bad guys are no longer using ANY of those 100 million IP addresses anyway. The moment those IP addresses got blocked, and they could no longer make money, the fraudsters swapped them out for new ones. Thinking that the use of any IP address block lists will reduce fraud is why marketers continue to be ripped off by bad guys; and their own trade association has been helping the bad guys, not you the marketer.

I’ll close this article by saying that bad guys will create whatever marketers want to see. When marketers wanted more clicks, thinking that meant more engagement, the bad guys simply had their bots click more. When marketers wanted to buy “100% viewable” ads, criminals just used code to alter the viewability measurement to deliver 100% viewable 100% of the time. If marketers wanted “brand safe,” the fake websites would eliminate any words on the page so their sites wouldn’t get blocked. (Real news sites reporting on the coronavirus would get blocked by BS (“brand safety”) technology, while fake news sites would not be blocked.) If marketers wanted to see their remarketing programs work really well, the remarketing companies would just falsify the reporting to claim credit for sales that would have happened anyway, thus making it appear that remarketing is working many times better than any other form of advertising. This causes the marketer to allocate even more dollars to it, making the fraudster more money and happier.

Remember, the fraudsters just “declare it to defraud it.” So make sure you constantly look for what is declared versus what you can double-confirm (NOT double-verify) through real detection. When “detected” doesn’t match “declared,” look into it further.

