<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.3 20210610//EN" "JATS-journalpublishing1-3.dtd">
<article article-type="research-article" dtd-version="1.3" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xml:lang="ru"><front><journal-meta><journal-id journal-id-type="publisher-id">tuzsut</journal-id><journal-title-group><journal-title xml:lang="ru">Труды учебных заведений связи</journal-title><trans-title-group xml:lang="en"><trans-title>Proceedings of Telecommunication Universities</trans-title></trans-title-group></journal-title-group><issn pub-type="ppub">1813-324X</issn><issn pub-type="epub">2712-8830</issn><publisher><publisher-name>СПбГУТ</publisher-name></publisher></journal-meta><article-meta><article-id pub-id-type="doi">10.31854/1813-324X-2020-6-4-80-90</article-id><article-id custom-type="elpub" pub-id-type="custom">tuzsut-143</article-id><article-categories><subj-group subj-group-type="heading"><subject>Research Article</subject></subj-group><subj-group subj-group-type="section-heading" xml:lang="ru"><subject>ИНФОРМАТИКА, ВЫЧИСЛИТЕЛЬНАЯ ТЕХНИКА И УПРАВЛЕНИЕ</subject></subj-group><subj-group subj-group-type="section-heading" xml:lang="en"><subject>INFORMATICS, COMPUTER ENGINEERING AND MANAGEMENT</subject></subj-group></article-categories><title-group><article-title>Предобработка текстов электронных писем в задаче обнаружения спама</article-title><trans-title-group xml:lang="en"><trans-title>Preprocessing of the Emails in the Spam Detection Task</trans-title></trans-title-group></title-group><contrib-group><contrib contrib-type="author" corresp="yes"><name-alternatives><name name-style="eastern" xml:lang="ru"><surname>Корелов</surname><given-names>С. В.</given-names></name><name name-style="western" xml:lang="en"><surname>Korelov</surname><given-names>S. 
..</given-names></name></name-alternatives><email xlink:type="simple">korelovsv@cert.gov.ru</email><xref ref-type="aff" rid="aff-1"/></contrib><contrib contrib-type="author" corresp="yes"><name-alternatives><name name-style="eastern" xml:lang="ru"><surname>Петров</surname><given-names>А. М.</given-names></name><name name-style="western" xml:lang="en"><surname>Petrov</surname><given-names>A. ..</given-names></name></name-alternatives><email xlink:type="simple">noemail@neicon.ru</email><xref ref-type="aff" rid="aff-1"/></contrib><contrib contrib-type="author" corresp="yes"><name-alternatives><name name-style="eastern" xml:lang="ru"><surname>Ротков</surname><given-names>Л. Ю.</given-names></name><name name-style="western" xml:lang="en"><surname>Rotkov</surname><given-names>L. ..</given-names></name></name-alternatives><email xlink:type="simple">noemail@neicon.ru</email><xref ref-type="aff" rid="aff-2"/></contrib><contrib contrib-type="author" corresp="yes"><name-alternatives><name name-style="eastern" xml:lang="ru"><surname>Горбунов</surname><given-names>А. А.</given-names></name><name name-style="western" xml:lang="en"><surname>Gorbunov</surname><given-names>A. ..</given-names></name></name-alternatives><email xlink:type="simple">noemail@neicon.ru</email><xref ref-type="aff" rid="aff-2"/></contrib></contrib-group><aff-alternatives id="aff-1"><aff xml:lang="ru">Национальный координационный центр по компьютерным инцидентам<country>Россия</country></aff><aff xml:lang="en">National Computer Incident Response &amp; Coordination Center<country>Russian Federation</country></aff></aff-alternatives><aff-alternatives id="aff-2"><aff xml:lang="ru">Национальный исследовательский Нижегородский государственный университет им. Н.И. 
Лобачевского<country>Россия</country></aff><aff xml:lang="en">National Research Lobachevsky State University of Nizhny Novgorod<country>Russian Federation</country></aff></aff-alternatives><pub-date pub-type="collection"><year>2020</year></pub-date><pub-date pub-type="epub"><day>13</day><month>04</month><year>2021</year></pub-date><volume>6</volume><issue>4</issue><fpage>80</fpage><lpage>90</lpage><permissions><copyright-statement>Copyright &#x00A9; Корелов С.В., Петров А.М., Ротков Л.Ю., Горбунов А.А., 2021</copyright-statement><copyright-year>2021</copyright-year><copyright-holder xml:lang="ru">Корелов С.В., Петров А.М., Ротков Л.Ю., Горбунов А.А.</copyright-holder><copyright-holder xml:lang="en">Korelov S..., Petrov A..., Rotkov L..., Gorbunov A...</copyright-holder><license license-type="creative-commons-attribution" xlink:href="https://creativecommons.org/licenses/by/4.0/" xlink:type="simple"><license-p>This work is licensed under a Creative Commons Attribution 4.0 License.</license-p></license></permissions><self-uri xlink:href="https://tuzs.sut.ru/jour/article/view/143">https://tuzs.sut.ru/jour/article/view/143</self-uri><abstract><p>Функционирование практически любой организации в той или иной степени зависит от того, насколько надежно защищены ее информационные ресурсы от различных угроз безопасности информации, одной из которых является спам. При этом было совершено множество попыток раз и навсегда решить проблему его обнаружения. В данной предметной области постоянно ведутся исследования. По их результатам предлагаются и реализуются на практике различные подходы. 
Ранее авторами предложена модель электронных писем, учитывающая содержание электронных писем, которое зачастую меняется в зависимости от выполняемых пользователями задач и меняющихся их информационных потребностей. В настоящей статье обсуждается вопрос предобработки текстов электронных писем в задаче обнаружения спама с использованием модели электронных писем, полученной на основе генетического подхода к формированию математических моделей текстов, зарекомендовавшего себя для решения различных задач.</p></abstract><trans-abstract xml:lang="en"><p>The functioning of almost any organization depends, to one degree or another, on how reliably its information resources are protected from information security threats, one of which is spam. Many attempts have been made to solve the problem of spam detection once and for all. Research in this area is ongoing, and its results yield various approaches that are proposed and implemented in practice. The authors previously proposed an e-mail model that takes into account the content of e-mails, which often changes depending on the tasks users perform and on their changing information needs. 
This article discusses the preprocessing of e-mail texts for spam detection using an e-mail model derived from a genetic approach to constructing mathematical models of texts, an approach that has proven itself in solving various problems.</p></trans-abstract><kwd-group xml:lang="ru"><kwd>информационная безопасность</kwd><kwd>спам</kwd><kwd>обнаружение</kwd><kwd>модель электронного письма</kwd><kwd>генетический подход</kwd><kwd>генетическая модель</kwd><kwd>электронная почта</kwd><kwd>электронные почтовые сообщения</kwd><kwd>электронные письма</kwd><kwd>предобработка текста</kwd></kwd-group><kwd-group xml:lang="en"><kwd>information security</kwd><kwd>spam</kwd><kwd>detection</kwd><kwd>electronic letter model</kwd><kwd>genetic approach</kwd><kwd>genetic model</kwd><kwd>email</kwd><kwd>e-mail messages</kwd><kwd>electronic letters</kwd><kwd>text preprocessing</kwd></kwd-group></article-meta></front><back><ref-list><title>References</title><ref id="cit1"><label>1</label><citation-alternatives><mixed-citation xml:lang="ru">Email Statistics Report, 2016-2020 // The Radicati Group. URL: https://www.radicati.com/?p=13546 (дата обращения 25.11.2020)</mixed-citation><mixed-citation xml:lang="en">Email Statistics Report, 2016-2020 // The Radicati Group. URL: https://www.radicati.com/?p=13546 (дата обращения 25.11.2020)</mixed-citation></citation-alternatives></ref><ref id="cit2"><label>2</label><citation-alternatives><mixed-citation xml:lang="ru">Вергелис М., Щербакова Т., Сидорина Т. Спам и фишинг в 2018 году // Securelist. URL: https://securelist.ru/spam-and-phishing-in-2018/93453 (дата обращения 17.09.2019)</mixed-citation><mixed-citation xml:lang="en">Вергелис М., Щербакова Т., Сидорина Т. Спам и фишинг в 2018 году // Securelist. 
URL: https://securelist.ru/spam-and-phishing-in-2018/93453 (дата обращения 17.09.2019)</mixed-citation></citation-alternatives></ref><ref id="cit3"><label>3</label><citation-alternatives><mixed-citation xml:lang="ru">Вергелис М., Щербакова Т., Сидорина Т., Куликова Т. Спам и фишинг в 2019 году // Securelist. URL: https://securelist.ru/spam-report-2019/95727 (дата обращения 29.10.2020)</mixed-citation><mixed-citation xml:lang="en">Вергелис М., Щербакова Т., Сидорина Т., Куликова Т. Спам и фишинг в 2019 году // Securelist. URL: https://securelist.ru/spam-report-2019/95727 (дата обращения 29.10.2020)</mixed-citation></citation-alternatives></ref><ref id="cit4"><label>4</label><citation-alternatives><mixed-citation xml:lang="ru">Barushka, A., Hajek, P. Spam Filtering Using Integrated Distribution-Based Balancing Approach and Regularized Deep Neural Networks // Applied Intelligence. 2018. Vol. 48. PP. 3538-3556. DOI:10.1007/s10489-018-1161-y</mixed-citation><mixed-citation xml:lang="en">Barushka, A., Hajek, P. Spam Filtering Using Integrated Distribution-Based Balancing Approach and Regularized Deep Neural Networks // Applied Intelligence. 2018. Vol. 48. PP. 3538-3556. DOI:10.1007/s10489-018-1161-y</mixed-citation></citation-alternatives></ref><ref id="cit5"><label>5</label><citation-alternatives><mixed-citation xml:lang="ru">Bhattacharya P., Singh A. E-mail Spam Filtering using Genetic Algorithm based on Probabilistic Weights and Words Count // International Journal of Integrated Engineering. 2020. Vol. 12. No. 1. PP. 40-49. DOI:10.30880/ijie.2020.12.01.004</mixed-citation><mixed-citation xml:lang="en">Bhattacharya P., Singh A. E-mail Spam Filtering using Genetic Algorithm based on Probabilistic Weights and Words Count // International Journal of Integrated Engineering. 2020. Vol. 12. No. 1. PP. 40-49. 
DOI:10.30880/ijie.2020.12.01.004</mixed-citation></citation-alternatives></ref><ref id="cit6"><label>6</label><citation-alternatives><mixed-citation xml:lang="ru">Bibi A., Latif R., Khalid S., Ahmed W., Shabir R.A., Ansari M., et al. Spam Mail Scanning Using Machine Learning Algorithm // Journal of Computers. 2020. Vol. 15. No. 2. PP. 73-84. DOI:10.17706/jcp.15.2.73-84</mixed-citation><mixed-citation xml:lang="en">Bibi A., Latif R., Khalid S., Ahmed W., Shabir R.A., Ansari M., et al. Spam Mail Scanning Using Machine Learning Algorithm // Journal of Computers. 2020. Vol. 15. No. 2. PP. 73-84. DOI:10.17706/jcp.15.2.73-84</mixed-citation></citation-alternatives></ref><ref id="cit7"><label>7</label><citation-alternatives><mixed-citation xml:lang="ru">Abdulhamid Sh.M., Shuaib M., Osho O., Ismaila I., Alhassan J.K. Comparative Analysis of Classification Algorithms for Email Spam Detection // International Journal of Computer Network and Information Security (IJCNIS). 2018. Vol. 10. No. 1. PP. 60-67. DOI:10.5815/ijcnis.2018.01.07</mixed-citation><mixed-citation xml:lang="en">Abdulhamid Sh.M., Shuaib M., Osho O., Ismaila I., Alhassan J.K. Comparative Analysis of Classification Algorithms for Email Spam Detection // International Journal of Computer Network and Information Security (IJCNIS). 2018. Vol. 10. No. 1. PP. 60-67. DOI:10.5815/ijcnis.2018.01.07</mixed-citation></citation-alternatives></ref><ref id="cit8"><label>8</label><citation-alternatives><mixed-citation xml:lang="ru">Radhakrishnan A., Vaidhehi V. Email Classification Using Machine Learning Algorithms // International Journal of Engineering and Technology (IJET). 2017. Vol. 9. No. 2. PP. 335-340. DOI:10.21817/ijet/2017/v9i1/170902310</mixed-citation><mixed-citation xml:lang="en">Radhakrishnan A., Vaidhehi V. Email Classification Using Machine Learning Algorithms // International Journal of Engineering and Technology (IJET). 2017. Vol. 9. No. 2. PP. 335-340. 
DOI:10.21817/ijet/2017/v9i1/170902310</mixed-citation></citation-alternatives></ref><ref id="cit9"><label>9</label><citation-alternatives><mixed-citation xml:lang="ru">Rusland N., Wahid N., Kasim Sh., Hafit H. Analysis of Naïve Bayes Algorithm for Email Spam Filtering across Multiple Datasets // Proceedings of International Research and Innovation Summit (IRIS2017, Melaka, Malaysia, 6-7 May 2017). IOP Conference Series: Materials Science and Engineering. Bristol: IOP Publishing, 2017. Vol. 226. DOI:10.1088/1757-899X/226/1/012091</mixed-citation><mixed-citation xml:lang="en">Rusland N., Wahid N., Kasim Sh., Hafit H. Analysis of Naïve Bayes Algorithm for Email Spam Filtering across Multiple Datasets // Proceedings of International Research and Innovation Summit (IRIS2017, Melaka, Malaysia, 6-7 May 2017). IOP Conference Series: Materials Science and Engineering. Bristol: IOP Publishing, 2017. Vol. 226. DOI:10.1088/1757-899X/226/1/012091</mixed-citation></citation-alternatives></ref><ref id="cit10"><label>10</label><citation-alternatives><mixed-citation xml:lang="ru">Verma T., Gill N.S. Email Spams via Text Mining using Machine Learning Techniques // International Journal of Innovative Technology and Exploring Engineering (IJITEE). 2020. Vol. 9. No. 4. PP. 2535-2539. DOI:10.35940/ijitee.D1915.029420</mixed-citation><mixed-citation xml:lang="en">Verma T., Gill N.S. Email Spams via Text Mining using Machine Learning Techniques // International Journal of Innovative Technology and Exploring Engineering (IJITEE). 2020. Vol. 9. No. 4. PP. 2535-2539. DOI:10.35940/ijitee.D1915.029420</mixed-citation></citation-alternatives></ref><ref id="cit11"><label>11</label><citation-alternatives><mixed-citation xml:lang="ru">Корелов С.В., Петров А.М., Ротков Л.Ю., Горбунов А.А. Модель электронных писем в задаче обнаружения спама // Вестник Поволжского государственного технологического университета. Серия: Радиотехнические и инфокоммуникационные системы. 2020. № 2(46). С. 44-54. 
DOI:10.25686/2306-2819.2020.2.44</mixed-citation><mixed-citation xml:lang="en">Корелов С.В., Петров А.М., Ротков Л.Ю., Горбунов А.А. Модель электронных писем в задаче обнаружения спама // Вестник Поволжского государственного технологического университета. Серия: Радиотехнические и инфокоммуникационные системы. 2020. № 2(46). С. 44-54. DOI:10.25686/2306-2819.2020.2.44</mixed-citation></citation-alternatives></ref><ref id="cit12"><label>12</label><citation-alternatives><mixed-citation xml:lang="ru">Androutsopoulos I., Paliouras G., Michelakis E. Learning to Filter Unsolicited Commercial E-Mail // NCSR «Demokritos». Tech. Report number: 2004/2. 2004.</mixed-citation><mixed-citation xml:lang="en">Androutsopoulos I., Paliouras G., Michelakis E. Learning to Filter Unsolicited Commercial E-Mail // NCSR «Demokritos». Tech. Report number: 2004/2. 2004.</mixed-citation></citation-alternatives></ref><ref id="cit13"><label>13</label><citation-alternatives><mixed-citation xml:lang="ru">Sharaff A., Nagwani N., Dhadse A. Comparative Study of Classification Algorithms for Spam Email Detection // Shetty N., Prasad N., Nalini N. (eds) Emerging Research in Computing, Information, Communication and Applications. New Delhi: Springer, 2016. PP. 237-244. DOI:10.1007/978-81-322-2553-9_23</mixed-citation><mixed-citation xml:lang="en">Sharaff A., Nagwani N., Dhadse A. Comparative Study of Classification Algorithms for Spam Email Detection // Shetty N., Prasad N., Nalini N. (eds) Emerging Research in Computing, Information, Communication and Applications. New Delhi: Springer, 2016. PP. 237-244. DOI:10.1007/978-81-322-2553-9_23</mixed-citation></citation-alternatives></ref><ref id="cit14"><label>14</label><citation-alternatives><mixed-citation xml:lang="ru">Androutsopoulos I., Koutsias J., Chandrinos K., Spyropoulos C. 
An Experimental Comparison of Naive Bayesian and Keyword-Based Anti-Spam Filtering with Personal E-Mail Messages // Proceedings of the 23rd annual international ACM SIGIR conference on Research and development in information retrieval (SIGIR’00, Athens, Greece, 24-28 July 2000). New York: Association for Computing Machinery, 2000. PP. 160-167. DOI:10.1145/345508.345569</mixed-citation><mixed-citation xml:lang="en">Androutsopoulos I., Koutsias J., Chandrinos K., Spyropoulos C. An Experimental Comparison of Naive Bayesian and Keyword-Based Anti-Spam Filtering with Personal E-Mail Messages // Proceedings of the 23rd annual international ACM SIGIR conference on Research and development in information retrieval (SIGIR’00, Athens, Greece, 24-28 July 2000). New York: Association for Computing Machinery, 2000. PP. 160-167. DOI:10.1145/345508.345569</mixed-citation></citation-alternatives></ref><ref id="cit15"><label>15</label><citation-alternatives><mixed-citation xml:lang="ru">Metsis V., Androutsopoulos I., Paliouras G. Spam Filtering with Naive Bayes - Which Naive Bayes? // Proceedings of the 3rd Conference on Email and Anti-Spam (CEAS 2006, Mountain View, USA, 27-28 July 2006). 2006. PP. 28-69.</mixed-citation><mixed-citation xml:lang="en">Metsis V., Androutsopoulos I., Paliouras G. Spam Filtering with Naive Bayes - Which Naive Bayes? // Proceedings of the 3rd Conference on Email and Anti-Spam (CEAS 2006, Mountain View, USA, 27-28 July 2006). 2006. PP. 28-69.</mixed-citation></citation-alternatives></ref><ref id="cit16"><label>16</label><citation-alternatives><mixed-citation xml:lang="ru">Visani Ch., Jadeja N., Modi M. A Study on Different Machine Learning Techniques for Spam Review Detection // Proceedings of the International Conference on Energy, Communication, Data Analytics and Soft Computing (ICECDS, Chennai, India, 1-2 August 2017). IEEE, 2017. PP. 676-679. 
DOI:10.1109/ICECDS.2017.8389522</mixed-citation><mixed-citation xml:lang="en">Visani Ch., Jadeja N., Modi M. A Study on Different Machine Learning Techniques for Spam Review Detection // Proceedings of the International Conference on Energy, Communication, Data Analytics and Soft Computing (ICECDS, Chennai, India, 1-2 August 2017). IEEE, 2017. PP. 676-679. DOI:10.1109/ICECDS.2017.8389522</mixed-citation></citation-alternatives></ref><ref id="cit17"><label>17</label><citation-alternatives><mixed-citation xml:lang="ru">Carreras X., Marquez L. Boosting Trees for Anti-Spam Email Filtering // Proceedings of the 4th International Conference on Recent Advances in Natural Language Processing (RANLP, 5-7 September 2001). 2001. PP. 58-64.</mixed-citation><mixed-citation xml:lang="en">Carreras X., Marquez L. Boosting Trees for Anti-Spam Email Filtering // Proceedings of the 4th International Conference on Recent Advances in Natural Language Processing (RANLP, 5-7 September 2001). 2001. PP. 58-64.</mixed-citation></citation-alternatives></ref><ref id="cit18"><label>18</label><citation-alternatives><mixed-citation xml:lang="ru">Sheu JJ., Chen YK., Chu KT., Tang JH., Yang WP. An Intelligent Three-Phase Spam Filtering Method Based on Decision Tree Data Mining // Security and Communication Networks. 2016. Vol. 9. No. 17. PP. 4013-4026. DOI:10.1002/sec.1584</mixed-citation><mixed-citation xml:lang="en">Sheu JJ., Chen YK., Chu KT., Tang JH., Yang WP. An Intelligent Three-Phase Spam Filtering Method Based on Decision Tree Data Mining // Security and Communication Networks. 2016. Vol. 9. No. 17. PP. 4013-4026. DOI:10.1002/sec.1584</mixed-citation></citation-alternatives></ref><ref id="cit19"><label>19</label><citation-alternatives><mixed-citation xml:lang="ru">Drucker H., Wu D., Vapnik V. Support Vector Machine for Spam Categorization // IEEE Transactions on Neural Networks. 1999. Vol. 10. No. 5. PP. 1048-1054. 
DOI:10.1109/72.788645</mixed-citation><mixed-citation xml:lang="en">Drucker H., Wu D., Vapnik V. Support Vector Machine for Spam Categorization // IEEE Transactions on Neural Networks. 1999. Vol. 10. No. 5. PP. 1048-1054. DOI:10.1109/72.788645</mixed-citation></citation-alternatives></ref><ref id="cit20"><label>20</label><citation-alternatives><mixed-citation xml:lang="ru">Jiang S., Pang G., Wu M., Kuang L. An Improved k-Nearest-Neighbor Algorithm for Text Categorization // Expert System with Applications. 2012. Vol. 39. No. 1. PP. 1503-1509. DOI:10.1016/j.eswa.2011.08.040</mixed-citation><mixed-citation xml:lang="en">Jiang S., Pang G., Wu M., Kuang L. An Improved k-Nearest-Neighbor Algorithm for Text Categorization // Expert System with Applications. 2012. Vol. 39. No. 1. PP. 1503-1509. DOI:10.1016/j.eswa.2011.08.040</mixed-citation></citation-alternatives></ref><ref id="cit21"><label>21</label><citation-alternatives><mixed-citation xml:lang="ru">Yue X., Abraham A., Chi ZX., Hao YY., Mo H. Artificial Immune System Inspired Behavior-Based Anti-Spam Filter // Soft Computing. 2007. Vol. 11. PP. 729-740. DOI:10.1007/s00500-006-0116-0</mixed-citation><mixed-citation xml:lang="en">Yue X., Abraham A., Chi ZX., Hao YY., Mo H. Artificial Immune System Inspired Behavior-Based Anti-Spam Filter // Soft Computing. 2007. Vol. 11. PP. 729-740. DOI:10.1007/s00500-006-0116-0</mixed-citation></citation-alternatives></ref><ref id="cit22"><label>22</label><citation-alternatives><mixed-citation xml:lang="ru">Малыхина М.П., Частикова В.А., Биктимиров А.А. Методика обнаружения спама на основе искусственных иммунных систем // Вестник Астраханского государственного технического университета. Серия: Управление, вычислительная техника и информатика. 2018. № 3. С. 38-48. DOI:10.24143/2072-9502-2018-3-38-48</mixed-citation><mixed-citation xml:lang="en">Малыхина М.П., Частикова В.А., Биктимиров А.А. 
Методика обнаружения спама на основе искусственных иммунных систем // Вестник Астраханского государственного технического университета. Серия: Управление, вычислительная техника и информатика. 2018. № 3. С. 38-48. DOI:10.24143/2072-9502-2018-3-38-48</mixed-citation></citation-alternatives></ref><ref id="cit23"><label>23</label><citation-alternatives><mixed-citation xml:lang="ru">Clark J., Koprinska I., Poon J. A Neural Network Based Approach to Automated Email Classification // Proceedings of the IEEE/WIC International Conference on Web Intelligence (WI 2003, Halifax, Canada, 13-17 October 2003). IEEE, 2003. PP. 702-705. DOI:10.1109/WI.2003.1241300</mixed-citation><mixed-citation xml:lang="en">Clark J., Koprinska I., Poon J. A Neural Network Based Approach to Automated Email Classification // Proceedings of the IEEE/WIC International Conference on Web Intelligence (WI 2003, Halifax, Canada, 13-17 October 2003). IEEE, 2003. PP. 702-705. DOI:10.1109/WI.2003.1241300</mixed-citation></citation-alternatives></ref><ref id="cit24"><label>24</label><citation-alternatives><mixed-citation xml:lang="ru">Катасёв А.С., Катасёва Д.В., Кирпичников А.П. Нейросетевая технология классификации электронных почтовых сообщений // Вестник технологического университета. 2015. Т. 18. № 5. С. 180-183.</mixed-citation><mixed-citation xml:lang="en">Катасёв А.С., Катасёва Д.В., Кирпичников А.П. Нейросетевая технология классификации электронных почтовых сообщений // Вестник технологического университета. 2015. Т. 18. № 5. С. 180-183.</mixed-citation></citation-alternatives></ref><ref id="cit25"><label>25</label><citation-alternatives><mixed-citation xml:lang="ru">Катасёв А.С., Катасёва Д.В., Кирпичников А.П., Семёнов Я.Е. Спам-фильтрация электронных почтовых сообщений на основе нейросетевой и нейронечеткой моделей // Вестник технологического университета. 2015. Т. 18. № 15. С. 217-221.</mixed-citation><mixed-citation xml:lang="en">Катасёв А.С., Катасёва Д.В., Кирпичников А.П., Семёнов Я.Е. 
Спам-фильтрация электронных почтовых сообщений на основе нейросетевой и нейронечеткой моделей // Вестник технологического университета. 2015. Т. 18. № 15. С. 217-221.</mixed-citation></citation-alternatives></ref><ref id="cit26"><label>26</label><citation-alternatives><mixed-citation xml:lang="ru">Катасёв А.С., Катасёва Д.В. Разработка нейросетевой системы классификации электронных почтовых сообщений // Вестник Казанского государственного энергетического университета. 2015. № 1(25). С. 68-78.</mixed-citation><mixed-citation xml:lang="en">Катасёв А.С., Катасёва Д.В. Разработка нейросетевой системы классификации электронных почтовых сообщений // Вестник Казанского государственного энергетического университета. 2015. № 1(25). С. 68-78.</mixed-citation></citation-alternatives></ref><ref id="cit27"><label>27</label><citation-alternatives><mixed-citation xml:lang="ru">Ларионова А.В., Хорев П.Б. Метод фильтрации спама на основе искусственной нейронной сети // Науковедение. 2016. Т. 8. № 3. URL: http://naukovedenie.ru/PDF/04TVN316.pdf (дата обращения 26.11.2020)</mixed-citation><mixed-citation xml:lang="en">Ларионова А.В., Хорев П.Б. Метод фильтрации спама на основе искусственной нейронной сети // Науковедение. 2016. Т. 8. № 3. URL: http://naukovedenie.ru/PDF/04TVN316.pdf (дата обращения 26.11.2020)</mixed-citation></citation-alternatives></ref><ref id="cit28"><label>28</label><citation-alternatives><mixed-citation xml:lang="ru">Ларионова А.В., Хорев П.Б. Оценка эффективности метода фильтрации спама на основе искусственной нейронной сети // Науковедение. 2016. Т. 8. № 2. DOI:10.15862/134TVN216</mixed-citation><mixed-citation xml:lang="en">Ларионова А.В., Хорев П.Б. Оценка эффективности метода фильтрации спама на основе искусственной нейронной сети // Науковедение. 2016. Т. 8. № 2. 
DOI:10.15862/134TVN216</mixed-citation></citation-alternatives></ref><ref id="cit29"><label>29</label><citation-alternatives><mixed-citation xml:lang="ru">Hussain N., Turab Mirza H., Rasool G., Hussain I., Kaleem M. Spam Review Detection Techniques: A Systematic Literature Review // Applied Sciences. 2019. Vol. 9. No. 5. PP. 1-26. DOI:10.3390/app9050987</mixed-citation><mixed-citation xml:lang="en">Hussain N., Turab Mirza H., Rasool G., Hussain I., Kaleem M. Spam Review Detection Techniques: A Systematic Literature Review // Applied Sciences. 2019. Vol. 9. No. 5. PP. 1-26. DOI:10.3390/app9050987</mixed-citation></citation-alternatives></ref><ref id="cit30"><label>30</label><citation-alternatives><mixed-citation xml:lang="ru">Корелов С.В., Петров А.М., Ротков Л.Ю., Горбунов А.А. К вопросу об определении численного значения параметра в модели электронных писем // Труды XXIV научной конференции по радиофизике, посвященной 75-летию радиофизического факультета (Нижний Новгород, Российская Федерация, 13-31 мая 2020). Нижний Новгород: ННГУ, 2020. С. 471-474. URL: http://www.rf.unn.ru/wp-content/uploads/sites/21/2020/10/rf-conf-2020-book-1.pdf (дата обращения 26.11.2020)</mixed-citation><mixed-citation xml:lang="en">Корелов С.В., Петров А.М., Ротков Л.Ю., Горбунов А.А. К вопросу об определении численного значения параметра в модели электронных писем // Труды XXIV научной конференции по радиофизике, посвященной 75-летию радиофизического факультета (Нижний Новгород, Российская Федерация, 13-31 мая 2020). Нижний Новгород: ННГУ, 2020. С. 471-474. URL: http://www.rf.unn.ru/wp-content/uploads/sites/21/2020/10/rf-conf-2020-book-1.pdf (дата обращения 26.11.2020)</mixed-citation></citation-alternatives></ref><ref id="cit31"><label>31</label><citation-alternatives><mixed-citation xml:lang="ru">Климов Д.В. Предобработка текстовых сообщений для метрического классификатора // Символ науки. 2017. № 12. C. 25-32.</mixed-citation><mixed-citation xml:lang="en">Климов Д.В. 
Предобработка текстовых сообщений для метрического классификатора // Символ науки. 2017. № 12. C. 25-32.</mixed-citation></citation-alternatives></ref><ref id="cit32"><label>32</label><citation-alternatives><mixed-citation xml:lang="ru">Haddi E., Liu X., Shi Y. The Role of Text Pre-processing in Sentiment Analysis // Procedia Computer Science. 2013. Vol. 17. PP. 26-32. DOI:10.1016/j.procs.2013.05.005</mixed-citation><mixed-citation xml:lang="en">Haddi E., Liu X., Shi Y. The Role of Text Pre-processing in Sentiment Analysis // Procedia Computer Science. 2013. Vol. 17. PP. 26-32. DOI:10.1016/j.procs.2013.05.005</mixed-citation></citation-alternatives></ref><ref id="cit33"><label>33</label><citation-alternatives><mixed-citation xml:lang="ru">Devaraj S., Krishnakumar A. Effective Search Engine Spam Classification // International Journal of Recent Technology and Engineering (IJRTE). 2019. Vol. 8. No. 2S8. PP. 1541-1545. DOI:10.35940/ijrte.B1100.0882S819</mixed-citation><mixed-citation xml:lang="en">Devaraj S., Krishnakumar A. Effective Search Engine Spam Classification // International Journal of Recent Technology and Engineering (IJRTE). 2019. Vol. 8. No. 2S8. PP. 1541-1545. DOI:10.35940/ijrte.B1100.0882S819</mixed-citation></citation-alternatives></ref><ref id="cit34"><label>34</label><citation-alternatives><mixed-citation xml:lang="ru">HaCohen-Kerner Y., Miller D., Yigal Y. The influence of preprocessing on text classification using a bag-of-words representation // PLoS ONE. 2020. Vol. 15(5): e0232525. DOI:10.1371/journal.pone.0232525</mixed-citation><mixed-citation xml:lang="en">HaCohen-Kerner Y., Miller D., Yigal Y. The influence of preprocessing on text classification using a bag-of-words representation // PLoS ONE. 2020. Vol. 15(5): e0232525. DOI:10.1371/journal.pone.0232525</mixed-citation></citation-alternatives></ref><ref id="cit35"><label>35</label><citation-alternatives><mixed-citation xml:lang="ru">Vijayarani S., Ilamathi J., Nithya M. 
Preprocessing Techniques for Text Mining - An Overview // International Journal of Computer Science &amp; Communication Networks. 2015. Vol. 5. No. 1. PP. 7-16.</mixed-citation><mixed-citation xml:lang="en">Vijayarani S., Ilamathi J., Nithya M. Preprocessing Techniques for Text Mining - An Overview // International Journal of Computer Science &amp; Communication Networks. 2015. Vol. 5. No. 1. PP. 7-16.</mixed-citation></citation-alternatives></ref><ref id="cit36"><label>36</label><citation-alternatives><mixed-citation xml:lang="ru">Weng J. NLP Text Preprocessing: A Practical Guide and Template. URL: https://towardsdatascience.com/nlp-text-preprocessing-a-practical-guide-and-template-d80874676e79 (дата обращения 14.07.2020)</mixed-citation><mixed-citation xml:lang="en">Weng J. NLP Text Preprocessing: A Practical Guide and Template. URL: https://towardsdatascience.com/nlp-text-preprocessing-a-practical-guide-and-template-d80874676e79 (дата обращения 14.07.2020)</mixed-citation></citation-alternatives></ref><ref id="cit37"><label>37</label><citation-alternatives><mixed-citation xml:lang="ru">Uysal A., Gunal S. The Impact of Preprocessing on Text Classification // Information Processing &amp; Management. 2014. Vol. 50. No. 1. PP. 104-112. DOI:10.1016/j.ipm.2013.08.006</mixed-citation><mixed-citation xml:lang="en">Uysal A., Gunal S. The Impact of Preprocessing on Text Classification // Information Processing &amp; Management. 2014. Vol. 50. No. 1. PP. 104-112. DOI:10.1016/j.ipm.2013.08.006</mixed-citation></citation-alternatives></ref><ref id="cit38"><label>38</label><citation-alternatives><mixed-citation xml:lang="ru">Enron-Spam datasets. URL: http://www2.aueb.gr/users/ion/data/enron-spam (дата обращения 26.11.2020)</mixed-citation><mixed-citation xml:lang="en">Enron-Spam datasets. 
URL: http://www2.aueb.gr/users/ion/data/enron-spam (дата обращения 26.11.2020)</mixed-citation></citation-alternatives></ref><ref id="cit39"><label>39</label><citation-alternatives><mixed-citation xml:lang="ru">Sebastiani F. Machine Learning in Automated Text Categorization // ACM Computing Surveys. 2002. Vol. 34. No. 1. PP. 1-47. DOI:10.1145/505282.505283</mixed-citation><mixed-citation xml:lang="en">Sebastiani F. Machine Learning in Automated Text Categorization // ACM Computing Surveys. 2002. Vol. 34. No. 1. PP. 1-47. DOI:10.1145/505282.505283</mixed-citation></citation-alternatives></ref><ref id="cit40"><label>40</label><citation-alternatives><mixed-citation xml:lang="ru">Sebastiani F. Text Categorization // Zanasi A. (ed.). Text Mining and its Applications. Southampton: WIT Press, 2005. PP. 109-129.</mixed-citation><mixed-citation xml:lang="en">Sebastiani F. Text Categorization // Zanasi A. (ed.). Text Mining and its Applications. Southampton: WIT Press, 2005. PP. 109-129.</mixed-citation></citation-alternatives></ref><ref id="cit41"><label>41</label><citation-alternatives><mixed-citation xml:lang="ru">Aas K., Eikvil L. Text Categorisation: A Survey // Norwegian Computing Center. Tech. Report number: 941, 1999.</mixed-citation><mixed-citation xml:lang="en">Aas K., Eikvil L. Text Categorisation: A Survey // Norwegian Computing Center. Tech. Report number: 941, 1999.</mixed-citation></citation-alternatives></ref><ref id="cit42"><label>42</label><citation-alternatives><mixed-citation xml:lang="ru">Manning C., Raghavan P., Schütze H. Introduction to Information Retrieval. Cambridge: Cambridge University Press, 2008. DOI:10.1017/CBO9780511809071</mixed-citation><mixed-citation xml:lang="en">Manning C., Raghavan P., Schütze H. Introduction to Information Retrieval. Cambridge: Cambridge University Press, 2008. 
DOI:10.1017/CBO9780511809071</mixed-citation></citation-alternatives></ref><ref id="cit43"><label>43</label><citation-alternatives><mixed-citation xml:lang="ru">Sokolova M., Lapalme G. A Systematic Analysis of Performance Measures for Classification Tasks // Information Processing &amp; Management. 2009. Vol. 45. Iss. 4. PP. 427-437. DOI:10.1016/j.ipm.2009.03.002</mixed-citation><mixed-citation xml:lang="en">Sokolova M., Lapalme G. A Systematic Analysis of Performance Measures for Classification Tasks // Information Processing &amp; Management. 2009. Vol. 45. Iss. 4. PP. 427-437. DOI:10.1016/j.ipm.2009.03.002</mixed-citation></citation-alternatives></ref><ref id="cit44"><label>44</label><citation-alternatives><mixed-citation xml:lang="ru">Мироненко А.Н. Алгоритм контентной фильтрации спама на базе совмещения метода опорных векторов и нейронных сетей. Автореферат дис. … канд. техн. наук. Санкт-Петербург, 2012. 18 с.</mixed-citation><mixed-citation xml:lang="en">Мироненко А.Н. Алгоритм контентной фильтрации спама на базе совмещения метода опорных векторов и нейронных сетей. Автореферат дис. … канд. техн. наук. Санкт-Петербург, 2012. 18 с.</mixed-citation></citation-alternatives></ref><ref id="cit45"><label>45</label><citation-alternatives><mixed-citation xml:lang="ru">Чернопрудова Е.Н. Защита почтовых сервисов от несанкционированных рассылок на основе контентной фильтрации электронных сообщений. Автореферат дис. … канд. техн. наук. Уфа, 2013. 16 с.</mixed-citation><mixed-citation xml:lang="en">Чернопрудова Е.Н. Защита почтовых сервисов от несанкционированных рассылок на основе контентной фильтрации электронных сообщений. Автореферат дис. … канд. техн. наук. Уфа, 2013. 16 с.</mixed-citation></citation-alternatives></ref></ref-list><fn-group><fn fn-type="conflict"><p>The authors declare that there are no conflicts of interest present.</p></fn></fn-group></back></article>
