I have discussed word power in previous posts. I believe very firmly in the power of manipulation, in the simple sense of getting a person to do something. I am aware that humans are fallible; that is part of being human at this point in time. We are not perfect.
And even with all of that in my head, I simply do not understand how the experiment can work out in favor of the AI even once if the Gatekeeper is firmly set against it.*
So let's figure this out. Okay. First: probably not something I'm likely to figure out on my own, given my age/experience level, and the fact that I have done very little actual research into human manipulation or AIs, much less both.
What confuses me most? I was actually pretty willing to accept the whole thing as simply a person getting logicked into a corner or something like that--unforeseeable circumstances, etc. Then I read this:
The Gatekeeper party may resist the AI party's arguments by any means chosen - logic, illogic, simple refusal to be convinced, even dropping out of character - as long as the Gatekeeper party does not actually stop talking to the AI party before the minimum time expires.

So this means that, even with both people given absolute free rein to say and do whatever they feel like, and with no out-of-character benefits offered, somehow they let the AI out.
In a word: What.
Okay, I've decided that, at the present point in time, I will not be able to figure out the specifics. So, before I go and check all this stuff out, let's see if I can think of any generalities. I really wish I could have some "warmer/colder" on this, but I also understand why it's better that I don't.
First: the Gatekeeper has absolute free rein. This should make it very easy to keep the AI in. However, this would also make it very easy to let the AI out.
Second: There would be some social cost to admitting you "lost," but not much--I think, anyway. Hm. Connect this to communicating with the outside world: the AI being useful if let free, or releasing this "safe" AI into the world to prove that it is possible, thereby making the chances of a dangerous AI being let loose smaller.
Third: The AI is perfectly capable of lying, or being just as illogical as the Gatekeeper can be. The only difference is that the AI is working to convince, and the Gatekeeper is trying not to be convinced.
Fourth: I have noticed it can be harder to fight for a negative. I don't like running, but I can chase something just fine. Tell someone not to look down, and...yeah. So even though the AI has the disadvantage of needing to change the state of things rather than maintain it, the AI is fighting for a change, while the Gatekeeper is fighting for the absence of change.
Fifth: Appeal to curiosity.
Straight musing: I do not believe myself infallible, and so I do not believe I would trust myself to guard the AI for the rest of my life, or on a reasonable shift as part of my job. I think the main thing I'm trying to avoid here is routine, because that would give me far too much time to talk myself into it.
But do I believe someone** could talk me into it in a 2-hour span, if I go in firmly set against it? [I'm actually pausing to figure out what to type here.] Well...no. If you ask me to go at it as a logic problem, then I would have to give a maybe, because I am not infallible. It's like saying "Yes" to "Would you do anything?" I cannot say yes, because I do not believe I have sufficient imagination to figure everything out. But on the emotional level, I don't really believe it.
Hm. Which brings up the question: why not? I believe these results to be accurate, so what do I think makes me different?
Okay. I believe I am smarter than the average person, but, given context, chances are good that those people were too. Ah, and there's proof that it's emotional and not logical, because when I considered the fact that they might be smarter than I am, I immediately tried to justify why that would give me an advantage: I'd be so uninformed that some logic-based appeals wouldn't work on me, whatever.
Lovely. I'm so wrapped up in my own pride that I'll paint myself as better in any way.
Which is probably the actual emotional root of the issue. And a good thing to note all around. The reason that these people and others were so willing to say, without any qualifier, that a transhuman intelligence would not be able to outsmart them despite having no experience of one, was pride. Not necessarily personal, it could simply be human pride--which makes some sense, we've been apex predators for generations.
...Huh. Despite still having absolutely no idea how that convincing could have gone, I feel much better about the whole thing. Probably a combination of finding a flaw in myself and also immediately being able to point to others as having it. I know something I need to work on, and I'm not picking myself out of the pack.
Thus concludes this batch of musing.
*I'm not doubting the results. I'm just trying to work out my ignorance.
**Counting sufficiently advanced AIs as people here.
EDIT: I'm not saying that the case is conclusive. My point was that I was treating it as conclusive on faith, and then still saying "that doesn't apply to me", which is unintelligent.
FURTHER EDIT: I do, however, trust the guy to do proper science. I just cannot make a fully informed decision for myself when all I can see is the results section.