7S-07
Universal Rules for Fooling Deep Text Classification
○Di Li, Danilo Vargas, Kouichi Sakurai (Kyushu University)
Recently, deep-learning-based natural language processing techniques have been used extensively for tasks such as spam mail filtering and censorship evaluation in social networks. However, only a few works have evaluated the vulnerabilities of such deep neural networks. Here we investigate the existence of universal rules that could modify any text to fool a deep neural network, and we propose an optimization algorithm to develop such rules. Our preliminary results show that more than 2000 of 10000 modified samples are misclassified. The proposed attack thus shows that 28% of the samples can be fooled with such a general rule, indicating that the principle holds and that further improvement of the rule and the evolutionary algorithm may reveal further security issues.
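The abstract does not specify the form of the rules, the classifier, or the details of the evolutionary algorithm, so the following is only a minimal sketch of the general idea under loudly stated assumptions. It assumes a rule is a hypothetical (position, token) pair that inserts the same token at the same word index in every text, uses a toy keyword counter as a stand-in for a deep neural network, and searches for a rule with a simple elitist evolutionary loop. All names (`classify`, `apply_rule`, `fooling_rate`, `evolve_rule`) and all data are illustrative, not taken from the work itself.

```python
import random

# Hypothetical stand-in for a deep text classifier: labels a text as spam (1)
# when spam words outnumber ham words, otherwise as ham (0).
SPAM_WORDS = {"free", "winner", "prize"}
HAM_WORDS = {"meeting", "report", "thanks"}


def classify(text):
    words = text.lower().split()
    score = sum(w in SPAM_WORDS for w in words) - sum(w in HAM_WORDS for w in words)
    return 1 if score > 0 else 0


def apply_rule(rule, text):
    """Assumed rule format: (position, token). Insert the same token at the
    same word index in every text, clamped to the text length."""
    position, token = rule
    words = text.split()
    index = min(position, len(words))
    return " ".join(words[:index] + [token] + words[index:])


def fooling_rate(rule, texts):
    """Fraction of texts whose predicted label changes once the rule is applied."""
    flipped = sum(classify(apply_rule(rule, t)) != classify(t) for t in texts)
    return flipped / len(texts)


def mutate(rule, vocabulary, rng):
    """Randomly shift the insertion position by one or swap the token."""
    position, token = rule
    if rng.random() < 0.5:
        position = max(0, position + rng.choice([-1, 1]))
    else:
        token = rng.choice(vocabulary)
    return (position, token)


def evolve_rule(texts, vocabulary, generations=30, population_size=8, seed=0):
    """Elitist evolutionary search for one rule that fools as many texts as possible."""
    rng = random.Random(seed)
    population = [
        (rng.randrange(5), rng.choice(vocabulary)) for _ in range(population_size)
    ]
    for _ in range(generations):
        children = [mutate(rule, vocabulary, rng) for rule in population]
        # Keep the best rules among parents and children.
        ranked = sorted(
            population + children,
            key=lambda rule: fooling_rate(rule, texts),
            reverse=True,
        )
        population = ranked[:population_size]
    return population[0]


# Illustrative data only; not the dataset used in the work.
texts = [
    "free prize inside",
    "you are a winner",
    "claim your free gift",
    "winner of the prize draw",
]
vocabulary = ["meeting", "report", "thanks", "hello", "today"]

best = evolve_rule(texts, vocabulary)
print(best, fooling_rate(best, texts))
```

In this toy setting the fitness of a rule is simply the fraction of samples whose label flips, which mirrors how the abstract reports the number of misclassified samples out of all modified samples. A real attack would replace `classify` with queries to the target network and would likely use a richer rule space.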