<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.3 20210610//EN" "JATS-journalpublishing1-3.dtd">
<article article-type="research-article" dtd-version="1.3" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xml:lang="ru"><front><journal-meta><journal-id journal-id-type="publisher-id">tuzsut</journal-id><journal-title-group><journal-title xml:lang="ru">Труды учебных заведений связи</journal-title><trans-title-group xml:lang="en"><trans-title>Proceedings of Telecommunication Universities</trans-title></trans-title-group></journal-title-group><issn pub-type="ppub">1813-324X</issn><issn pub-type="epub">2712-8830</issn><publisher><publisher-name>СПбГУТ</publisher-name></publisher></journal-meta><article-meta><article-id pub-id-type="doi">10.31854/1813-324X-2019-5-3-58-65</article-id><article-id custom-type="elpub" pub-id-type="custom">tuzsut-87</article-id><article-categories><subj-group subj-group-type="heading"><subject>Research Article</subject></subj-group><subj-group subj-group-type="section-heading" xml:lang="ru"><subject>ИНФОРМАТИКА, ВЫЧИСЛИТЕЛЬНАЯ ТЕХНИКА И УПРАВЛЕНИЕ</subject></subj-group><subj-group subj-group-type="section-heading" xml:lang="en"><subject>INFORMATICS, COMPUTER ENGINEERING AND MANAGEMENT</subject></subj-group></article-categories><title-group><article-title>Методика многоаспектной оценки и категоризации вредоносных информационных объектов в сети Интернет</article-title><trans-title-group xml:lang="en"><trans-title>The Technique of Multi-aspect Evaluation and Categorization of Malicious Information Objects on the Internet</trans-title></trans-title-group></title-group><contrib-group><contrib contrib-type="author" corresp="yes"><name-alternatives><name name-style="eastern" xml:lang="ru"><surname>Браницкий</surname><given-names>А. А.</given-names></name><name name-style="western" xml:lang="en"><surname>Branitskiy</surname><given-names>A. 
..</given-names></name></name-alternatives><email xlink:type="simple">alexander.branitskiy@gmail.com</email><xref ref-type="aff" rid="aff-1"/></contrib><contrib contrib-type="author" corresp="yes"><name-alternatives><name name-style="eastern" xml:lang="ru"><surname>Саенко</surname><given-names>И. Б.</given-names></name><name name-style="western" xml:lang="en"><surname>Saenko</surname><given-names>I. ..</given-names></name></name-alternatives><email xlink:type="simple">noemail@neicon.ru</email><xref ref-type="aff" rid="aff-1"/></contrib></contrib-group><aff-alternatives id="aff-1"><aff xml:lang="ru">Санкт-Петербургский институт информатики и автоматизации Российской академии наук<country>Россия</country></aff><aff xml:lang="en">Saint-Petersburg Institute for Informatics and Automation of the Russian Academy of Sciences<country>Russian Federation</country></aff></aff-alternatives><pub-date pub-type="collection"><year>2019</year></pub-date><pub-date pub-type="epub"><day>13</day><month>04</month><year>2021</year></pub-date><volume>5</volume><issue>3</issue><fpage>58</fpage><lpage>65</lpage><permissions><copyright-statement>Copyright © Браницкий А.А., Саенко И.Б., 2021</copyright-statement><copyright-year>2021</copyright-year><copyright-holder xml:lang="ru">Браницкий А.А., Саенко И.Б.</copyright-holder><copyright-holder xml:lang="en">Branitskiy A..., Saenko I...</copyright-holder><license license-type="creative-commons-attribution" xlink:href="https://creativecommons.org/licenses/by/4.0/" xlink:type="simple"><license-p>This work is licensed under a Creative Commons Attribution 4.0 License.</license-p></license></permissions><self-uri xlink:href="https://tuzs.sut.ru/jour/article/view/87">https://tuzs.sut.ru/jour/article/view/87</self-uri><abstract><p>В условиях быстрого развития информационных технологий возникает задача, связанная с обнаружением источников вредоносной информации в сети Интернет. 
Для ее решения могут применяться методы машинного обучения как один из наиболее популярных и мощных инструментов, предназначенных для выявления зависимостей между входными (наблюдаемыми) данными и выходными (желаемыми) результатами. В данной статье представлена методика, направленная на многоуровневую обработку входных данных о вредоносных информационных объектах в сети Интернет и обеспечивающая их многоаспектную оценку и категоризацию с использованием методов машинного обучения. Цель исследования заключается в повышении эффективности процесса обнаружения вредоносной информации в сети Интернет на примере задачи классификации веб-страниц.</p></abstract><trans-abstract xml:lang="en"><p>The rapid development of information technologies raises the problem of detecting sources of malicious information on the Internet. Machine learning methods, among the most popular and powerful tools for identifying dependencies between input (observed) data and output (desired) results, can be applied to solve it. This article presents a technique for multi-level processing of input data about malicious information objects on the Internet that provides their multi-aspect evaluation and categorization using machine learning methods. 
The aim of the study is to improve the efficiency of detecting malicious information on the Internet, using the task of web page classification as an example.</p></trans-abstract><kwd-group xml:lang="ru"><kwd>информационные объекты</kwd><kwd>вредоносная информация</kwd><kwd>классификаторы</kwd><kwd>веб-страницы</kwd><kwd>многоуровневая схема комбинирования</kwd></kwd-group><kwd-group xml:lang="en"><kwd>information objects</kwd><kwd>malicious information</kwd><kwd>classifiers</kwd><kwd>Web-pages</kwd><kwd>multi-level combination scheme</kwd></kwd-group></article-meta></front><back><ref-list><title>References</title><ref id="cit1"><label>1</label><citation-alternatives><mixed-citation xml:lang="ru">Hayes P.J., Andersen P.M., Nirenburg I.B., Schmandt L.M. TCS: a shell for content-based text categorization // Proceedings of the Sixth Conference on Artificial Intelligence Applications (Santa Barbara, USA, 5-9 May 1990). Piscataway, NJ: IEEE, 1990. Vol. 1. PP. 320-326. DOI:10.1109/CAIA.1990.89206</mixed-citation><mixed-citation xml:lang="en">Hayes P.J., Andersen P.M., Nirenburg I.B., Schmandt L.M. TCS: a shell for content-based text categorization // Proceedings of the Sixth Conference on Artificial Intelligence Applications (Santa Barbara, USA, 5-9 May 1990). Piscataway, NJ: IEEE, 1990. Vol. 1. PP. 320-326. DOI:10.1109/CAIA.1990.89206</mixed-citation></citation-alternatives></ref><ref id="cit2"><label>2</label><citation-alternatives><mixed-citation xml:lang="ru">Apté C., Damerau F., Weiss S.M. Automated learning of decision rules for text categorization // ACM Transactions on Information Systems (TOIS). 1994. Vol. 12. Iss. 3. PP. 233-251. DOI:10.1145/183422.183423</mixed-citation><mixed-citation xml:lang="en">Apté C., Damerau F., Weiss S.M. Automated learning of decision rules for text categorization // ACM Transactions on Information Systems (TOIS). 1994. Vol. 12. Iss. 3. PP. 233-251. 
DOI:10.1145/183422.183423</mixed-citation></citation-alternatives></ref><ref id="cit3"><label>3</label><citation-alternatives><mixed-citation xml:lang="ru">Salton G., Buckley C. Term-weighting approaches in automatic text retrieval // Information Processing &amp; Management. 1988. Vol. 24. Iss. 5. PP. 513-523. DOI:10.1016/0306-4573(88)90021-0</mixed-citation><mixed-citation xml:lang="en">Salton G., Buckley C. Term-weighting approaches in automatic text retrieval // Information Processing &amp; Management. 1988. Vol. 24. Iss. 5. PP. 513-523. DOI:10.1016/0306-4573(88)90021-0</mixed-citation></citation-alternatives></ref><ref id="cit4"><label>4</label><citation-alternatives><mixed-citation xml:lang="ru">Fattah M.A. A Novel Statistical Feature Selection Approach for Text Categorization // Journal of Information Processing Systems. 2017. Vol. 13. Iss. 5. PP. 1397-1409.</mixed-citation><mixed-citation xml:lang="en">Fattah M.A. A Novel Statistical Feature Selection Approach for Text Categorization // Journal of Information Processing Systems. 2017. Vol. 13. Iss. 5. PP. 1397-1409.</mixed-citation></citation-alternatives></ref><ref id="cit5"><label>5</label><citation-alternatives><mixed-citation xml:lang="ru">Lewis D.D., Ringuette M. A Comparison of Two Learning Algorithms for Text Categorization // In: Third Annual Symposium on Document Analysis and Information Retrieval. 1994. PP. 81-93.</mixed-citation><mixed-citation xml:lang="en">Lewis D.D., Ringuette M. A Comparison of Two Learning Algorithms for Text Categorization // In: Third Annual Symposium on Document Analysis and Information Retrieval. 1994. PP. 81-93.</mixed-citation></citation-alternatives></ref><ref id="cit6"><label>6</label><citation-alternatives><mixed-citation xml:lang="ru">Joachims T. Text categorization with Support Vector Machines: learning with many relevant features // Proceedings of the 10th European Conference on Machine Learning (ECML, Chemnitz, Germany, 21-23 April 1998). 
Lecture Notes in Computer Science (Lecture Notes in Artificial Intelligence). Berlin, Heidelberg: Springer, 1998. Vol. 1398. PP. 137-142. DOI:10.1007/BFb0026683</mixed-citation><mixed-citation xml:lang="en">Joachims T. Text categorization with Support Vector Machines: learning with many relevant features // Proceedings of the 10th European Conference on Machine Learning (ECML, Chemnitz, Germany, 21-23 April 1998). Lecture Notes in Computer Science (Lecture Notes in Artificial Intelligence). Berlin, Heidelberg: Springer, 1998. Vol. 1398. PP. 137-142. DOI:10.1007/BFb0026683</mixed-citation></citation-alternatives></ref><ref id="cit7"><label>7</label><citation-alternatives><mixed-citation xml:lang="ru">Johnson R., Zhang T. Effective Use of Word Order for Text Categorization with Convolutional Neural Networks // Proceedings of the Annual Conference of the North American Chapter of the Association for Computational Linguistics "Human Language Technologies" (Denver, USA, 31 May - 5 June 2015). Stroudsburg: Association for Computational Linguistics, 2015. PP. 103-112. DOI:10.3115/v1/N15-1011</mixed-citation><mixed-citation xml:lang="en">Johnson R., Zhang T. Effective Use of Word Order for Text Categorization with Convolutional Neural Networks // Proceedings of the Annual Conference of the North American Chapter of the Association for Computational Linguistics "Human Language Technologies" (Denver, USA, 31 May - 5 June 2015). Stroudsburg: Association for Computational Linguistics, 2015. PP. 103-112. DOI:10.3115/v1/N15-1011</mixed-citation></citation-alternatives></ref><ref id="cit8"><label>8</label><citation-alternatives><mixed-citation xml:lang="ru">Ghareb A.S., Bakar A.A., Hamdan A.R. Hybrid feature selection based on enhanced genetic algorithm for text categorization // Expert Systems with Applications. 2016. Vol. 49. Iss. C. PP. 31-47. DOI:10.1016/j.eswa.2015.12.004</mixed-citation><mixed-citation xml:lang="en">Ghareb A.S., Bakar A.A., Hamdan A.R. 
Hybrid feature selection based on enhanced genetic algorithm for text categorization // Expert Systems with Applications. 2016. Vol. 49. Iss. C. PP. 31-47. DOI:10.1016/j.eswa.2015.12.004</mixed-citation></citation-alternatives></ref><ref id="cit9"><label>9</label><citation-alternatives><mixed-citation xml:lang="ru">Lorena A.C., De Carvalho A.C., Gama J.M.P. A review on the combination of binary classifiers in multiclass problems // Artificial Intelligence Review. 2008. Vol. 30. Iss. 1-4. DOI:10.1007/s10462-009-9114-9</mixed-citation><mixed-citation xml:lang="en">Lorena A.C., De Carvalho A.C., Gama J.M.P. A review on the combination of binary classifiers in multiclass problems // Artificial Intelligence Review. 2008. Vol. 30. Iss. 1-4. DOI:10.1007/s10462-009-9114-9</mixed-citation></citation-alternatives></ref><ref id="cit10"><label>10</label><citation-alternatives><mixed-citation xml:lang="ru">Kotenko I., Chechulin A., Shorov A., Komashinsky D. Analysis and Evaluation of Web Pages Classification Techniques for Inappropriate Content Blocking // Proceedings of the 14th Industrial Conference on Data Mining "Advances in Data Mining. Applications and Theoretical Aspects" (ICDM, St. Petersburg, Russia, 16-20 July 2014). Lecture Notes in Computer Science. Cham: Springer, 2014. Vol. 8557. PP. 39-54. DOI:10.1007/978-3-319-08976-8_4</mixed-citation><mixed-citation xml:lang="en">Kotenko I., Chechulin A., Shorov A., Komashinsky D. Analysis and Evaluation of Web Pages Classification Techniques for Inappropriate Content Blocking // Proceedings of the 14th Industrial Conference on Data Mining "Advances in Data Mining. Applications and Theoretical Aspects" (ICDM, St. Petersburg, Russia, 16-20 July 2014). Lecture Notes in Computer Science. Cham: Springer, 2014. Vol. 8557. PP. 39-54. DOI:10.1007/978-3-319-08976-8_4</mixed-citation></citation-alternatives></ref><ref id="cit11"><label>11</label><citation-alternatives><mixed-citation xml:lang="ru">Mikolov T., Chen K., Corrado G., Dean J. 
Efficient Estimation of Word Representations in Vector Space. 2013. URL: https://arxiv.org/pdf/1301.3781 (дата обращения 10.04.2019)</mixed-citation><mixed-citation xml:lang="en">Mikolov T., Chen K., Corrado G., Dean J. Efficient Estimation of Word Representations in Vector Space. 2013. URL: https://arxiv.org/pdf/1301.3781 (accessed 10.04.2019)</mixed-citation></citation-alternatives></ref></ref-list><fn-group><fn fn-type="conflict"><p>The authors declare that there are no conflicts of interest present.</p></fn></fn-group></back></article>
