Proceedings of Telecommunication Universities

Reverse Engineering of Software Using the Smart Brute Force Method: Prototype and Experiment

https://doi.org/10.31854/1813-324X-2025-11-6-88-100

EDN: TQFFYA

Abstract

Introduction: one approach to finding vulnerabilities in programs is to convert executable machine code into human-readable source code, which is better suited to analysis by an information security expert. The authors previously developed a method for "smart" enumeration of source code variants that identifies a variant compiling to the given machine code. A logical continuation of that research is to implement a software prototype in order to test the method's operability and to determine some of its characteristics experimentally.

Purpose: to implement a software prototype of the smart exhaustive search over source code variants (according to the previously described method) and to experimentally evaluate its operability and the limits of its applicability.

Methods: software engineering, experimentation, approximation of values.

Results: a software prototype was created that selects a source code instance matching a given machine code, building on the authors' previous studies. The prototype was used to recover the source code of a mathematical expression from its machine code using part of the formal syntax of a programming language, defined as a graph of syntactic rules. A series of experiments evaluated the characteristics of the prototype by determining the following dependencies: the total number of source code variants as a function of the syntactic heterogeneity of the syntax and of the maximum traversal depth of its graph representation, and the search time for a specific source code as a function of the set traversal depth. The experiments confirmed the basic functionality of the prototype and indicated its potential, which also supports the feasibility of reverse engineering in the direction opposite to the traditional one: starting from source code rather than from machine code.
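The search summarized in the Results can be illustrated with a minimal sketch. It is not the authors' prototype: the toy grammar, the function names (expand, machine_code, brute_force_search) and the use of Python bytecode as a stand-in for machine code are all assumptions made here so that the example runs without an external compiler.

```python
from itertools import product

# Hypothetical toy grammar (not the one used in the paper): each nonterminal
# maps to its alternatives; symbols absent from the dict are terminals.
GRAMMAR = {
    "expr": [["term"], ["term", "+", "expr"]],
    "term": [["atom"], ["atom", "*", "term"]],
    "atom": [["a"], ["b"]],
}


def expand(symbol, depth):
    """Yield every source string derivable from `symbol` when at most
    `depth` nested rule applications are allowed along any path."""
    if symbol not in GRAMMAR:      # terminal: emitted as-is
        yield symbol
        return
    if depth == 0:                 # depth budget exhausted on a nonterminal
        return
    for alternative in GRAMMAR[symbol]:
        parts = [list(expand(s, depth - 1)) for s in alternative]
        for combo in product(*parts):
            yield " ".join(combo)


def machine_code(source):
    """Assumed stand-in for a real compiler: Python bytecode plus the
    name and constant tables it refers to."""
    code = compile(source, "<candidate>", "eval")
    return code.co_code, code.co_names, code.co_consts


def brute_force_search(target, max_depth):
    """Return the first candidate whose compiled form equals `target`,
    trying shallower traversal depths first; None if nothing matches."""
    for depth in range(1, max_depth + 1):
        for candidate in expand("expr", depth):
            if machine_code(candidate) == target:
                return candidate
    return None


if __name__ == "__main__":
    target = machine_code("a*b+a")
    print(brute_force_search(target, max_depth=4))
    for depth in range(1, 6):
        print(depth, sum(1 for _ in expand("expr", depth)))
```

With this toy grammar the number of variants grows from 2 at depth 3 to 18 at depth 4 and 266 at depth 5, which gives a feel for why the dependence on traversal depth is worth measuring. Swapping machine_code for a call to an actual compiler would be the step that makes such a search independent of language and architecture, at the cost of a much slower comparison.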

Practical significance: the current version of the prototype can be used in practice to decompile small fragments of machine code, independently of the specific programming language and processor architecture, since only a compilation tool is required.

Discussion: qualitative optimization of the "smart" exhaustive search by means of artificial intelligence, specifically genetic algorithms, could significantly improve the search.

About the Authors

K. E. Izrailov
Saint-Petersburg University of State Fire Service of EMERCOM of Russia
Russian Federation


M. V. Buinevich
MIREA – Russian Technological University
Russian Federation



For citations:


Izrailov K.E., Buinevich M.V. Reverse Engineering of Software Using the Smart Brute Force Method: Prototype and Experiment. Proceedings of Telecommunication Universities. 2025;11(6):88-100. (In Russ.) https://doi.org/10.31854/1813-324X-2025-11-6-88-100. EDN: TQFFYA

Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.


ISSN 1813-324X (Print)
ISSN 2712-8830 (Online)