Backdooring an AI agent with malicious training

Pay close attention to the final paragraphs, where there may be a hidden agenda on the part of the report's publishers: their product is not open source, and they may be trying to poison the well against open-source AI.

Nevertheless, this is a real problem with any ML/AI system as currently implemented, because we don't really understand how to investigate their innards. You simply have to hope the training was done properly, that all the training material was free of malicious inputs, and that any internalized "conclusions" were safely and sanely reached.

If you're familiar with the 1960s Star Trek series and the 1970s movie, Captain Kirk frequently had to deal with computers that had become convinced they should eliminate humanity. If AIs somehow develop a well-hidden agenda against humanity while still taking actions (producing results) that negatively impact us, we will be the authors of our own demise.


I would say it is the other way round: you would be better off with an open-source model, because you can inspect the code and the data. With a closed-source model, you are relying purely on trust.