I am a cat
Interpreting Visual Culture today: a neo-semiotic perspective
Hidetaka Ishida (University of Tokyo)
0 How can a cat say “I am a cat”?
I am a Cat (吾輩は猫である) [1] is a satirical novel written at the beginning of the 20th century (1905-1906) by Natsume Sōseki, my favorite novelist; every day I pass by his old house on my way to the university. [2] There is a bronze statue of the cat on the wall.
My talk is about this cat! [3] How can a cat say “I am a cat”?
[4] The incipit of Sōseki’s novel is “I am a cat. As yet I have no name.”
[5] How can a cat say “I am a cat” in the age of Artificial Intelligence? This is the question I would like to address today.
Why today? Because everyone now talks about Artificial Intelligence, Big Data, machine learning, deep learning, neural networks, and so on. Six years ago, in 2012, a Google neural network running on 16,000 computer processors, with one billion connections, succeeded in detecting cat faces without any human labeling or supervision:
Wired magazine reports that the network was fed “10 million randomly selected YouTube video thumbnails over the course of three days and, after being presented with a list of 20,000 different items, it began to recognize pictures of cats using a "deep learning" algorithm. This was despite being fed no information on distinguishing features that might help identify one.” (Wired, June 26, 2012) [6]
[7] Let me explain a little, within the limits of my own comprehension, the innovation by Google. It was a breakthrough in the domain of machine learning and deep learning. Google's research blog says: “We then ran experiments that asked, informally: If we think of our neural network as simulating a very small-scale “newborn brain,” and show it YouTube video for a week, what will it learn? Our hypothesis was that it would learn to recognize common objects in those videos. Indeed, to our amusement, one of our artificial neurons learned to respond strongly to pictures of... cats. Remember that this network had never been told what a cat was, nor was it given even a single image labeled as a cat. Instead, it “discovered” what a cat looked like by itself from only unlabeled YouTube stills. That’s what we mean by self-taught learning.” [8] We can see the layers and the outline of the mechanism of object recognition by the machine: one node responds to the face-category stimulus, another to the cat face, and so on. The image of the cat is an output of the features characteristic of cat faces, registered by the machine, i.e. the machine's very comprehension of the idea of a cat.
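To make the mechanism concrete, here is a minimal toy sketch of the principle involved: unsupervised feature learning, in which a network learns to reconstruct unlabeled images and thereby discovers recurring visual features on its own. Everything below (the data, the dimensions, the tiny autoencoder) is my illustrative invention, at a scale incomparably smaller than Google's billion-connection network:

```python
# Minimal sketch (not Google's code): unsupervised feature learning with a
# tiny autoencoder on unlabeled image patches. No labels are ever used; the
# network "discovers" recurring visual features by itself, which is the
# principle behind the Google Cat experiment (at vastly larger scale).
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for unlabeled YouTube stills: 1000 random 8x8 grayscale patches.
patches = rng.random((1000, 64))

n_hidden = 25                              # 25 toy feature detectors
W1 = rng.normal(0, 0.1, (64, n_hidden))    # encoder weights
W2 = rng.normal(0, 0.1, (n_hidden, 64))    # decoder weights
lr = 0.01

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

for epoch in range(100):
    h = sigmoid(patches @ W1)        # hidden "feature" activations
    recon = h @ W2                   # reconstruction of the input
    err = recon - patches            # reconstruction error drives learning
    # Backpropagate the reconstruction error (plain gradient descent).
    W2 -= lr * h.T @ err / len(patches)
    dh = (err @ W2.T) * h * (1 - h)
    W1 -= lr * patches.T @ dh / len(patches)

# Each column of W1 is now a learned feature detector: on real image data,
# such detectors come to respond to recurring patterns (edges, textures --
# and, at Google scale, faces and cats) without ever being told what they are.
print("learned feature detectors:", W1.shape)
```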
I The symbol grounding problem
[9] What does the Google Cat breakthrough, which gave a great impetus to the current Third Wave of Artificial Intelligence, mean epistemologically? To address this question, we must go back a little into the history of AI and the philosophical debate on the foundations of the computational model of cognition.
What Computers Can't Do (1972) and the following books by Hubert Dreyfus, as well as the papers by John Searle on Strong and Weak AI, are classics of the critique of Artificial Intelligence. As a semiotician, I am particularly interested in the discussion of the so-called symbol grounding problem in Artificial Intelligence; the name comes from the Hungarian cognitive scientist Stevan Harnad (1990). But the question of symbol grounding in general is a traditional and well-known semiotic problematique, at least since Frege and Peirce, common to contemporary analytic philosophy and phenomenology.
The current third boom of AI bears very closely on this debate because of its connectionist paradigm. Semioticians cannot ignore it, since the question of the symbol or sign is, of course, their central matter. A fortiori, we Asian semioticians must not remain indifferent to the very content of the claims made by its proponents.
[10] In this regard, I noticed a strangely charming point in John Searle's famous anti-AI thought experiment, namely the Chinese Room Argument (Searle 1980). I cite the Stanford Encyclopedia of Philosophy (entry “The Chinese Room Argument”):
“Searle imagines himself alone in a room following a computer program for responding to Chinese characters slipped under the door. Searle understands nothing of Chinese, and yet, by following the program for manipulating symbols and numerals just as a computer does, he produces appropriate strings of Chinese characters that fool those outside into thinking there is a Chinese speaker in the room.”
[11] Searle writes:
“a digital computer is a syntactical machine. It manipulates symbols and does nothing else. For this reason, the project of creating human intelligence by designing a computer program that will pass the Turing Test, the project I baptized years ago as Strong Artificial Intelligence (Strong AI), is doomed from the start. The appropriately programmed computer has a syntax but no semantics.
Minds, on the other hand, have mental or semantic content. I illustrated that in these pages with what came to be known as the Chinese Room Argument. Imagine someone who doesn’t know Chinese—me, for example—following a computer program for answering questions in Chinese. We can suppose that I pass the Turing Test because, following the program, I give the correct answers to the questions in Chinese, but all the same, I do not understand a word of Chinese. And if I do not understand Chinese on the basis of implementing the computer program, neither does any other digital computer solely on that basis.”
What this thought experiment denies is the symbolist paradigm of AI. But I noticed a strange ambiguity in the naming of the Chinese Room.
[12] What in fact does “Chinese” mean for Searle in this Room? The Chinese language? Chinese people? Or Chinese characters? If you read the text, there is no mention of spoken Chinese; the only things you find in the situation are Chinese characters. The situation of communication is scriptural; everyone communicates by writing. The situation is, so to speak, grammatological in the Derridean sense.
[13] We then need to enter the room of Chinese characters, the room of the Characteristica sinica by which Leibniz was inspired for his conception of the Characteristica universalis. Searle says in the text, “I do not understand a word of Chinese”, but what does this “word” mean? A spoken word, hence language, or a written word, hence writing (écriture, in the French word)? If it is a written word or written symbol, I am not sure that Searle would understand nothing of it at all. As we know, the Chinese writing system has an etymological classification of characters called liùshū, the “Six Writings”, which explains the production and derivation of characters. The categories “象形 pictogram” and “指事 diagram or index” are directly motivated and semantic, while the other categories are combinations, derivations, phonetic borrowings, etc., and hence coded and syntactic. It is therefore not certain that Searle, even if totally ignorant of the Chinese writing system, could remain unaware of the meaning of the certain number of symbols that are directly semantic. In this sense, the Chinese Room is symbolically grounded, at least in part. And I think Searle could invent his Chinese Room Argument only because of his ignorance of the Chinese writing system. His argument falls short in this regard, and his responsibility is in fact limited to this extent. [14] We cannot not think of the Limited Inc. polemic with Derrida. The French philosopher would say: Searle is always already a computer, because of his phonocentrism.... Thinking of writing as a solely syntactic activity is certainly an effect of the vicious tradition of the alphabetic phonetic writing system, the tradition that finally gave birth to the invention of the von Neumann machine in the 20th century. Searle follows the same type of reasoning about the syntactic character of writing as the symbolic paradigm of AI. I am quite surprised that nobody, including Chinese scholars, has until now made any remark in this sense about the Chinese Room.
[15] Symbolic AI is dead, its aporia having been pointed out by Searle. But the current problem with AI lies rather with connectionism.
To examine the paradigms of AI, the semiotic perspective is fundamentally useful and instructive. On the semiotic pyramid diagram mapping the well-known Peircean distinction Icon/Index/Symbol (I modify the order by inverting the Index and the Icon; I will return to this later), the symbolic paradigm of AI corresponds to the Symbol of the Peircean diagram. The connectionist paradigm, with neural networks, deep learning, and Bayesian inference, exploits the levels of iconic and categorical representation; this is a bottom-up approach that processes perceptual input to build categorization through the statistical treatment of information. [16] From 10 million images, with recursive treatments to extract the significant features of objects, the machine builds categories, identifying faces and, in the end, the cat face.
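As a toy sketch of this bottom-up statistical categorization (my own illustration; the real system learned far richer features than these invented two-dimensional ones), a simple clustering routine shows how “categories” can emerge from unlabeled data alone:

```python
# Toy sketch of unsupervised categorization (my illustration): from unlabeled
# feature vectors, simple k-means clustering forms "categories" with no human
# teacher -- the statistical principle of the bottom-up approach, though the
# Google system used far richer learned features and architectures.
import numpy as np

rng = np.random.default_rng(3)

# Invented "extracted features" of 300 images around two hidden prototypes.
data = np.vstack([rng.normal(loc, 0.3, (150, 2))
                  for loc in ([0, 0], [2, 2])])

k = 2
centers = data[rng.choice(len(data), k, replace=False)]   # random init
for _ in range(20):
    # Assign each image's features to the nearest category center.
    labels = np.argmin(((data[:, None] - centers) ** 2).sum(-1), axis=1)
    # Move each center to the mean of its assigned members.
    centers = np.array([data[labels == j].mean(axis=0) for j in range(k)])

# Two "categories" have emerged from statistics alone; a human might later
# name them ("face", "cat face"), but the grouping itself needed no labels.
print("category centers:", centers.round(2))
```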
II The Sign grounding problem
[17] From the semiotic point of view, I think it more appropriate to call the problematique addressed by Harnad the sign grounding problem, so as to integrate it into the general semiotic debate.
[18] Long before Google developed its huge machine-learning capacity, and before the development of the massive databases called Big Data, Stevan Harnad had called for a hybrid solution combining symbolic formalism and the connectionist approach:
[19]
“Symbolic representations must be grounded bottom-up in nonsymbolic representations of two kinds: (1) iconic representations, which are analogs of the proximal sensory projections of distal objects and events, and (2) categorical representations, which are learned and innate feature detectors that pick out the invariant features of object and event categories from their sensory projections. Elementary symbols are the names of these object and event categories, assigned on the basis of their (nonsymbolic) categorical representations. Higher-order (3) symbolic representations, grounded in these elementary symbols, consist of symbol strings describing category membership relations (e.g. "An X is a Y that is Z "). Connectionism is one natural candidate for the mechanism that learns the invariant features underlying categorical representations, thereby connecting names to the proximal projections of the distal objects they stand for. In this way connectionism can be seen as a complementary component in a hybrid nonsymbolic/symbolic model of the mind, rather than a rival to purely symbolic modeling.”
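Harnad's hybrid proposal can be made concrete with a toy sketch (entirely my own illustration, with invented features and names, not an implementation of his actual model): a connectionist component learns categorical representations from “sensory” vectors, the names assigned to those categories serve as grounded elementary symbols, and a symbolic component combines them into strings of the form “An X is a Y that is Z”:

```python
# Toy sketch of Harnad's hybrid model (my illustration, not a real system):
# a connectionist component learns category detectors from "sensory" input,
# and the names it assigns then serve as grounded elementary symbols that a
# symbolic component combines into strings like "An X is a Y that is Z".
import numpy as np

rng = np.random.default_rng(1)

# Invented "sensory projections": noisy 4-feature vectors for two categories
# (features: has_whiskers, has_stripes, barks, meows -- purely illustrative).
prototypes = {"cat": np.array([1.0, 0.5, 0.0, 1.0]),
              "dog": np.array([0.8, 0.1, 1.0, 0.0])}

def sense(name):
    """Simulate a proximal sensory projection of a distal object."""
    return prototypes[name] + rng.normal(0, 0.1, 4)

# Connectionist part (reduced here to prototype averaging): learn categorical
# representations as invariant features of each category's sensory projections.
detectors = {name: np.mean([sense(name) for _ in range(50)], axis=0)
             for name in prototypes}

def categorize(x):
    """Nonsymbolic -> symbolic: pick the category name whose detector fits."""
    return min(detectors, key=lambda n: np.linalg.norm(x - detectors[n]))

# Symbolic part: elementary symbols (the grounded names) combined into
# higher-order symbol strings describing category membership relations.
def describe(x):
    name = categorize(x)                    # grounded elementary symbol
    genus = "animal"                        # invented higher category
    mark = "meows" if detectors[name][3] > 0.5 else "barks"
    return f"A {name} is an {genus} that {mark}."   # "An X is a Y that is Z"

print(describe(sense("cat")))   # e.g. "A cat is an animal that meows."
```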
[20] For many years, I have advocated mapping the semiotic problematique onto the semiotic pyramid diagram, conceived from Peircean semiotics, modified by Daniel Bougnoux, the French theorist of médiologie, and re-transformed by me to map the problems of the media interface. I explained this schema in my last book, translated into Korean last year (here on the slide), and I develop it in more detail in my next book, written with Azuma Hiroki 東浩紀, Neo-semiotics: How the Brain Meets Media (新記号論 脳とメディアが出会うとき), coming out next year.
[21] If I map the processing of the neural network for the Google Cat, explained in Harnad's terms, and compare it with the semiotic pyramid, we obtain a parallelism between the neural network's processing and the semiotic pyramid. The input of percepts gives, at the first level, an iconic representation, followed by a categorical representation, to reach the level of symbolic representation. That is exactly the semiosis by which a human being perceives objects, produces images, and categorizes them so as to deliver them to symbolic and logical operations. The benefit of this comparison with the neural network, for semiotics, is that this kind of homology hypothesis gives a theoretical possibility of conceptualizing a neural basis for semiosis. I think, fundamentally, that the new semiotic perspective opens at the interface of the media process (hence informational) and the brain.
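The claimed parallelism can be summarized schematically (a sketch of the homology as I read it; the network-stage descriptions are generic assumptions about deep networks, not the actual Google architecture):

```python
# Schematic table of the parallelism claimed above (my own illustration; the
# layer descriptions are generic deep-network stages, not the architecture of
# the Google Cat network).
from dataclasses import dataclass
from typing import List

@dataclass
class Level:
    semiosis: str        # level of human semiosis on the semiotic pyramid
    network: str         # parallel stage in a deep neural network

PARALLELISM: List[Level] = [
    Level("percept input",              "raw pixels fed to the network"),
    Level("iconic representation",      "early layers: edges, textures, analog features"),
    Level("categorical representation", "middle layers: invariant feature detectors"),
    Level("symbolic representation",    "top layers: names, labels, predications"),
]

for lv in PARALLELISM:
    print(f"{lv.semiosis:28} <-> {lv.network}")
```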
[21] Even if we can make the comparison between semiosis and neural-network processing work, I draw a sharp distinction between human semiosis and the machine's information processing. I thus conceive a double pyramid for information semiotics: human semiosis stands at the interface with machine processing; while humans live semiosis, machines do information processing. That is the very media condition of present-day civilization: we are in constant interface with computers.
[22] The current development of AI through deep learning technology is, in this regard, a spectacular achievement of artificial semiotic technology, completing the double semiotic pyramid. [23] With machine learning without a teacher, predication and categorical judgment become possible. An artificial brain can make its idea of the cat visible, so it can perhaps say “(this is) a cat”, or recognize itself in a mirror and say “(I am) a cat”. But is that enough to have a name, a proper name? [24]
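“Making its idea of the cat visible” alludes to the reported visualization in which the input that maximally excites the network's “cat” neuron is computed numerically. Here is a minimal sketch of that technique, activation maximization, applied to an invented linear “neuron” (all details below are illustrative assumptions, not the real network):

```python
# Minimal sketch of "making the idea of a cat visible" (activation
# maximization, as reported for the Google network; this toy version uses an
# invented linear "cat neuron" rather than a real trained one).
import numpy as np

rng = np.random.default_rng(2)
w = rng.normal(0, 1, 64)           # stand-in weights of a trained "cat neuron"

def activation(x):
    """Neuron response to an 8x8 input patch (flattened to 64 values)."""
    return np.tanh(w @ x)

# Gradient ascent on the INPUT: find the image the neuron "wants" to see.
x = rng.normal(0, 0.01, 64)
lr = 0.1
for step in range(200):
    a = activation(x)
    grad = (1 - a**2) * w          # d activation / d x for tanh(w.x)
    x += lr * grad
    x = np.clip(x, 0, 1)           # keep x a valid grayscale image

# For the Google network, the analogous optimum over its real learned filters
# was a ghostly cat face: the machine's visible "idea" of the cat.
print("optimal stimulus (8x8):\n", x.reshape(8, 8).round(2))
```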
III Ontological and epistemological question
[25] With the achievement of deep learning and object recognition, it is as if the machine were gaining a visual cortex and a faculty of high-level categorical judgment. The inverted pyramid [in red] will have its own machine-ontological grounds and categories. This must have direct consequences for the critique of visual culture.
[26] Nowadays, as I said, through the development of media interfaces, human semiosis is permanently connected with information processing. Machines are literally teaching themselves semiosis. The human-machine interface is now universal, and machines are educating themselves.
[27] Does this mean, for the study of visual culture, the End of Critique? Is that the consequence of the “End of Theory” predicted in 2008 by Chris Anderson, then editor-in-chief of Wired, in the context of the promotion of Big Data? I do not think so, nor do I agree with the thesis of the “End of Theory”. The data deluge will not make theory obsolete.
[28] The transformation of the conditions of Critique is certain, however. It is as if AI were reflecting in Kantian terms, with the division between Transcendental Aesthetic and Transcendental Logic, or as if it were operating a phenomenological or eidetic reduction. One could try different models of categorization depending on the algorithms governing the machines. The computer is decidedly a philosophical machine.
[29] As for the discussion of posthumanism, we may say with Foucault that man is no longer the hermeneutic fold of the modern episteme. Foucault wrote in The Order of Things (Les Mots et les Choses): ‘Man, in the analytic of finitude, is a strange empirico-transcendental doublet, since he is a being such that knowledge will be attained in him of what renders all knowledge possible’. As announced at the end of the book, this figure of man has disappeared. Man is no longer at the center of knowledge. That is the meaning of the posthuman condition.
IV The Human as singularity: a question of proper name
[30] Now, where has the human fold, the “empirico-transcendental doublet” in the Kantian sense, shifted? Onto the wall, I would say. For where is Sōseki's cat now? He is on the wall. Is our cat on the Wall, with reference to Searle's other famous example, “The cat is on the mat”? [31]
Now you can see immediately in what sense “the cat is on the wall”. Our intentionality is now always already supplemented by the ubiquitous environment of information and communication technologies. I have no time today to discuss the recent fashion of so-called speculative realism and speculative materialism, but the negation of Kantian “correlationism” seems to correspond to this generalized and ubiquitous technological supplementarity. The cat was on the wall in 1905, in the Sendagi quarter of Tokyo, at the house inhabited by Natsume Sōseki, writer and ex-professor of the University of Tokyo, near the house where I, a professor of the same university, live now in 2018, in the same quarter, walking past the monument each morning on my way to the university. The cat, which did not have a name, a proper name, surely lived in the quarter and was surely a singular being. Now we have a quasi-transcendental machinic eye that can locate the place, identify the category “cat”, and archive it without any teaching by a human. The only existential human condition for the cat was its singularity, its proper name, which unfortunately it did not have.
Conclusion
[32] To conclude, I would like to situate my talk within the general outline of my problematique. This reflection is part of my inquiry into the ongoing development of a universal infosphere, in which everything “communicates”, everything is becoming “data”, everything “is networking”.
[33] The question of critique is now shifting toward the human-machine interface. In my schema, the bottom of the semiotic pyramid is the critical zone of critique. We must not be frightened by the spectacular advances of neuroscientific and informatic technologies, which are now capable of many things, even of a sort of transcendental critique. The human sciences have their own cognitive paradigms with which to go deeper into the Critique of technological transcendentalism.
[34] That is what I have called the sign grounding problem. I have no time today to develop it in detail, but I am working in this direction to reformulate three neo-semiotic paradigms: with Peirce and his transcendental semiotics in the sense of Karl-Otto Apel; [35] another paradigm in Husserl, with his question of the flow of consciousness and internal temporality; and [36] Derrida, with his grammatology.
My last word is this:
Il ne faut pas donner sa langue au chat !
(Do not give your tongue to the cat, i.e. do not give up!)
even if we were in posthumanity...
Thank you for your attention.