Tag: internet privacy

Predicting the Future

That didn’t take longhttps://www.engadget.com/three-samsung-employees-reportedly-leaked-sensitive-data-to-chatgpt-190221114.html

Leaking data is obviously a big problem if the user base is “anyone with an internet connection”, but potentially not great even for an internal implementation of an AI chatbot.

Content management platforms, in the early days, had a big problem with search because the indexing engine had super-user rights – so searching for “acquisition” would give you links that you couldn’t read. Even if the titles didn’t tell you anything (does “Project OPUS” or “Project Golden Falcon” have any meaning to you?), the dates & authors told you something (hey, there’s a bunch of new docs the C-levels are creating about acquisitions this past few weeks … sure that doesn’t mean anything!). Eventually any halfway decent content management platform understood permissions and at least attempts to filter results based on what you have permission to view.

AI is different, unfortunately in a way that makes implementing that type of security more difficult. Other than individualizing the trained AIs for each user (so info you feed in is only going to be reflected in your future results) or not training based on user input (only use stuff that’s openly readable already) … it would be rather challenging to filter an implementation so it knows stuff it’s been told but doesn’t convey that information to unauthorized individuals.

How AI is like Social Media

I’ve said about social media type platforms – if you are making an informed decision to trade privacy for convenience / information / entertainment / comradery … if you feel that you are getting a good “deal” in that trade? Then social media type platforms are awesome. If you only think you are getting something in the deal – unaware of what the platform owners are getting from you – then I find that problematic.

The same is true for the public AI models – some of what they are getting from you is just language training & beta testing. If I had a dozen people type questions, I have not gotten a robustly representative sample of how ‘people’ talk. Getting a few million people to provide samples of their linguistic quirks is absolutely an important part of producing universally useful natural language processing. The models are also asking you to flag anything odd / wrong / unsettling. Again, this is the public providing numbers to testing that has already been done.

But … if an AI is being trained by inputs, information you provide is being incorporated into the underlying patterns. Which isn’t to say they are saving the exact text I provided in a transaction. Even so, the information provided can, in a more general sense, become part of the AI’s knowledge base. And, if that knowledge base is used to serve results to the general public? Then it is possible for the AI to “leak” confidential information that I fed into transactions.

Internet Privacy (Or Lack Thereof)

Well, the House passed Senate Joint Resolution 34 — which essentially tells the FCC that it cannot have the policy it enacted last year that prohibits ISPs from selling an account’s browsing history. What exactly does that mean? Well, they won’t literally sell your browsing history — anyone bored enough to peruse mine … I’d happily sell my browser history for the right price. But that’s not what is going to happen. For one thing, they’re asking for lawsuits — you visit a specific drug’s web site, or a few cancer treatment centres and your usage is indicative of specific medical conditions. An insurance company or employer buys your history and uses it to fire you or increase rates, and your ISP has created actual damages.

What will likely happen is the ISPs become more effective sellers of online advertising. They offer a slightly different service than current advertising brokers. The current brokers use cookies embedded on customer’s sites to track your browsing activity. If you clear your cookies, some of their tracking history is lost as well. If you use multiple computers (or even multiple browsers on one computer), they do not have a complete picture of your browsing because cookies are not shared between browsers or computers. If you browse in private mode (or block cookies, or use a third-party product to reduce personalized advertising), these advertisers may not be able to glean much about you at all. The ISP does not have any of these problems — no matter what computer or browser I use at home, the ISP will see the traffic. Since their traffic history is maintained on their side … nothing I can do to clear the history. Browse in private mode or block cookies and you’re still making a request that transits the ISP’s network.

The ISPs have disadvantages, though, as well. When you are using encrypted protocols (HTTPS, SSH, etc) … the ISP can see the destination IP and a bunch of encrypted gibberish. Now *something* about you can be determined by the destination IP (hit 151.101.129.164 a lot and I know you read the NYTimes online). Analysis of the encrypted content can be used to guess the content — that’s a bit of research that I don’t believe is currently being used for advertising, but there are researchers who catalog patterns of bitrate negotiation on YouTube videos and use it as a fingerprint to guess what video is being watched using only the encrypted traffic. Apart from some guessing, though, the ISP does not know exactly what is being done over encrypted communication channels (even the URL being requested – so while they may know I read the NYTimes, they don’t know if I read the political headlines, recipes, or concert listings out on LI). Cookie-based advertisers can, however, track traffic to encrypted (HTTPS) web sites. This is because site operators embed the cookie in their site … so where an ISP cannot read the data you transmit with an HTTPS site, the server in question *can* (otherwise it wouldn’t know what site you requested).

So while an ISP won’t sell someone a database of the URLs you’ve accessed last week, they will use that information to form advertising buckets and sell a specific number of ads being served to “people who browse yarn stores” or “people who read Hollywood gossip” or “right-leaning political activists”. Because they have limitations as well, ISP ad brokerages are unlikely to replace the cookie based individualized advertising. I suspect current advertising customers will spread their advertising dollars out between the two — get someone who can target you based on browsing over HTTPS and someone who can target you even if you block cookies.

What about using VPN or TOR to anonymize your traffic? Well, that helps — in either case, your ISP no longer can determine the specific web sites you view. *But* they can still categorize you as a technically saavy and security conscious individual and throw you into the “tech stuff” and “computer security stuff” advertising buckets.

You can opt out of the cookie-based individualized advertising — Network Advertising Initiative or Digital Advertising Alliance — an industry move that I assume was meant to quell customer anger and avoid government regulations (i.e. enough people get angry enough and are not provided some type of redress, they’ll lobby their state/federal government to DO SOMETHING about it). The ISPs will likely create a similar set of policies and a process to opt-out. Which means the being passed to the president for signature essentially changed the ISP’s ability to use my individual browsing history from an opt-in (maybe as a condition of a lower price rate) to an opt-out (where I have to know to do it, go through the trouble of finding how to do it, and possibly even keep renewing my opt-out). Not as bad as a lot of reporting sounds, but also not a terribly constituant-friendly move.

A couple of links to the current targeted marketing opt-outs for companies which whom I do business so bothered to waste a few hours trying to determine how to opt-out:

https://pc2.mypreferences.com/Charter/TargetedDigitalMarketingAds

https://www.t-mobile.com/company/privacy-resources/your-privacy-choices/ad-options.html

 

Selling Internet Usage

Thanks, Senate Republicans, for eliminating the regulation prohibiting ISPs from commoditizing individual’s internet usage. The headlines I’ve seen for the past day (“your internet browsing history can now be sold”) fail to take into account how well targeted advertising works … and the real world answer seems to be ‘it depends on how targeted’. Target, many years ago, ran some experimental ad campaigns based on data mining predictions. A combination of purchases indicate that a woman is pregnant — and reality about dealing with a baby mean that whatever you start doing is apt to be the thing you do for a few years. Buy diapers at Target pre-birth, and Target has your diaper business for a few years. So identifying pregnant individuals and getting their initial business is a huge boon.

Except people don’t like when you know something deeply personal about them — especially something they have not yet told their friends. Or found out about their kids — a man became quite irate when a “congrats, here are a bunch of coupons for baby things for you” flyer arrived for his daughter. His underage and certainly not pregnant daughter. The man expressed his anger to a manager at the local Target store. And had to ring the manager back later, after his daughter broke the news to her family. A company knew his daughter was pregnant before she braved telling her parents. I’m sure some adults found the flyers equally upsetting – how do they know?? I only told my husband and our parents … are they bugging our phones!?! Target ended up creating customized general flyers that happened to include a few coupons for baby stuff … along side chain saws, motor oil, and grills. Effective, but didn’t freak out potential customers.

My point being – companies don’t want to alienate potential customers. I search for a particular yarn online, and a few days later a flyer shows up for that yarn … I now have creepy negative feelings about the company sending the flyer.