Tag: artificial intelligence

Predicting the Future

That didn’t take long: https://www.engadget.com/three-samsung-employees-reportedly-leaked-sensitive-data-to-chatgpt-190221114.html

Leaking data is obviously a big problem if the user base is “anyone with an internet connection”, but potentially not great even for an internal implementation of an AI chatbot.

Content management platforms, in the early days, had a big problem with search because the indexing engine had super-user rights – so searching for “acquisition” would give you links to documents you couldn’t read. Even if the titles didn’t tell you anything (does “Project OPUS” or “Project Golden Falcon” mean anything to you?), the dates & authors told you something (hey, there’s a bunch of new docs the C-levels have been creating about acquisitions these past few weeks … I’m sure that doesn’t mean anything!). Eventually any halfway decent content management platform understood permissions and at least attempted to filter results based on what you have permission to view.
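Conceptually, that fix is simple: match with super-user rights, then trim the hit list against the user’s permissions before anything comes back. Here’s a minimal sketch in Python – the document structure, group names, and ACL field are all invented for illustration:

```python
# Minimal sketch of permission-trimmed search. The Document structure,
# group names, and ACL field are invented for illustration.
from dataclasses import dataclass, field

@dataclass
class Document:
    title: str
    body: str
    allowed_groups: set = field(default_factory=set)  # ACL: groups that may read

def search(index: list[Document], query: str, user_groups: set) -> list[Document]:
    # Match with super-user rights, then trim the hit list against the
    # user's permissions BEFORE returning anything -- so even the title,
    # date, and author of a restricted document are never exposed.
    hits = [d for d in index if query.lower() in d.body.lower()]
    return [d for d in hits if d.allowed_groups & user_groups]

index = [
    Document("Project OPUS", "acquisition target analysis", {"executives"}),
    Document("Cafeteria menu", "no acquisition here", {"everyone"}),
]
print([d.title for d in search(index, "acquisition", {"everyone"})])
# ['Cafeteria menu'] -- the restricted hit never appears in the results
```

The key design point is that filtering happens before results are surfaced, so the metadata leak described above (suggestive titles, dates, authors) never happens.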

AI is different, unfortunately in a way that makes implementing that type of security more difficult. Short of individualizing the trained models for each user (so information you feed in is only reflected in your future results) or declining to train on user input at all (using only material that’s openly readable already) … it would be rather challenging to filter an implementation so that it knows what it’s been told but doesn’t convey that information to unauthorized individuals.

How AI is like Social Media

I’ve said this about social media type platforms – if you are making an informed decision to trade privacy for convenience / information / entertainment / camaraderie … if you feel that you are getting a good “deal” in that trade? Then social media type platforms are awesome. If you only think you are getting something in the deal – unaware of what the platform owners are getting from you – then I find that problematic.

The same is true for the public AI models – some of what they are getting from you is just language training & beta testing. If I had a dozen people type questions, I haven’t gotten a robustly representative sample of how ‘people’ talk. Getting a few million people to provide samples of their linguistic quirks is absolutely an important part of producing universally useful natural language processing. The models also ask you to flag anything odd / wrong / unsettling. Again, that’s the public adding scale to testing that has already been done.

But … if an AI is being trained on inputs, information you provide is being incorporated into the underlying patterns. Which isn’t to say they are saving the exact text I provided in a transaction. Even so, the information provided can, in a more general sense, become part of the AI’s knowledge base. And if that knowledge base is used to serve results to the general public? Then it is possible for the AI to “leak” confidential information that I fed into transactions.

Using AI For Lead Qualification & Cost Reduction

Microsoft posted an overview of how they use AI (and a bot) to score leads in their sales process — Microsoft IT Showcase/Microsoft increases sales by using AI for lead qualification. My personal ‘most interesting thing’ is that they’re using scikit-learn in Python for some of the analysis — I’m using similar components in a custom-written spam filter at home. Their idea of running the text through a spell checker first is interesting — I want to try running my e-mails through Aspell and see if there are statistically significant changes in classifications.
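Since the article name-drops scikit-learn, here’s a rough sketch of that style of text classifier – to be clear, this is my own toy version, not Microsoft’s pipeline. The training data is made up, and spell_correct() is just a placeholder for an Aspell-style cleanup pass:

```python
# A minimal scikit-learn text classifier of the sort described -- not
# Microsoft's actual pipeline. Training data is made up; spell_correct()
# is a hypothetical hook for an Aspell-style spell-check pass.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

def spell_correct(text: str) -> str:
    # Placeholder: run the text through a spell checker (e.g. Aspell)
    # before vectorizing, so "v1agra" and "viagra" land on one feature.
    return text

train_texts = ["free money click now", "meeting moved to 3pm",
               "win a prize today", "quarterly report attached"]
train_labels = ["spam", "ham", "spam", "ham"]

model = make_pipeline(TfidfVectorizer(), MultinomialNB())
model.fit([spell_correct(t) for t in train_texts], train_labels)

print(model.predict([spell_correct("click now to win free money")]))
# expected: ['spam']
```

The appeal of the spell-check preprocessing is exactly what the comment says: misspellings otherwise fragment one concept across many rare features, which dilutes the classifier’s signal.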

They’ve previously detailed reducing energy usage through machine learning — that’s something I’d love to see more companies doing. Energy can be a significant operating cost, and reducing energy use has a positive environmental impact.

On Snowden and SharePoint

I’ve seen a number of articles focus on how the NSA failed to properly secure data within SharePoint, thus allowing Snowden to take off with a huge amount of sensitive data. What I haven’t seen anyone discuss is some type of AI that would analyze the SharePoint audit records against organizational information and against what others in the same position access. Maybe the access would have gotten flagged to management and someone would have said “Oh, he’s doing this data migration to the Hawaiian cluster, so I guess it’s reasonable he’d be accessing the data”. Maybe. Or they would have dug deeper and seen that something malicious was happening. Or, hell, maybe just talking to the guy about his suspicious access would have scared him enough that he’d have stopped. Who knows. But asking humans to read through the audit logs on a SharePoint server (the remediation suggestion that I’ve seen) is ‘find this needle in a stack of needles’ silly. Algorithms, and especially learning algorithms, are much better suited for that type of analysis.
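For the flavor of the thing, here’s a rough sketch of what that analysis could look like – the audit-log features and numbers are entirely invented, and a real system would fold in the organizational data (role, project assignments, peer groups) mentioned above:

```python
# Rough sketch: flag audit-log outliers instead of asking a human to
# read the logs. Log schema, features, and numbers are all invented.
import numpy as np
from sklearn.ensemble import IsolationForest

# Hypothetical per-user features derived from audit records:
# [documents accessed this month, distinct sites touched]
users = ["alice", "bob", "carol", "dave", "snowden-ish"]
features = np.array([
    [120,  3],
    [ 95,  2],
    [110,  3],
    [130,  4],
    [9000, 60],   # wildly outside the peer-group pattern
])

clf = IsolationForest(contamination=0.2, random_state=0).fit(features)
flags = clf.predict(features)   # -1 = anomaly, 1 = normal

for user, flag in zip(users, flags):
    if flag == -1:
        print(f"flag for review: {user}")
# expected: flag for review: snowden-ish
```

The point isn’t that this toy would have caught Snowden – it’s that a learning algorithm can reduce millions of audit entries to a short list of accounts worth a human conversation, which is a tractable task in a way that “read the logs” is not.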