Portfolio item number 2
Short description of portfolio item number 2 
Published in Proceedings of the 64th All-Russian Scientific Conference of MIPT, 2021
The paper is devoted to static verification of applications optimized by BOLT. The verification made it possible to detect errors on SPEC CPU 2017 while new optimizations were being written.
Recommended citation: Lisitsyn S., Shurygin A. (2021). "Verification of static binary optimizing translation for RISC architecture." Proceedings of the 64th All-Russian Scientific Conference of MIPT, November 29 – December 03, 2021. Radio Engineering and Computer Technologies.
Download Paper
Published in Proceedings of the 65th All-Russian Scientific Conference of MIPT in honor of the 115th anniversary of L.D. Landau, 2023
The paper is devoted to the development of an algorithm that generates a profile from a binary execution trace for a RISC architecture. The output format of Callgrind was taken as the basis, so that the execution profile can be visualized with KCachegrind.
Recommended citation: Shurygin A., Petushkov I. (2023). "Detailed profile generation and visualization for RISC architectures based on program execution traces." Proceedings of the 65th All-Russian Scientific Conference of MIPT in honor of the 115th anniversary of L.D. Landau, April 3–8, 2023. Radio engineering and computer technology.
Download Paper | Download Slides
Published in XI International Conference. "Engineering and Telecommunications – En&T 2024", 2025
The paper addresses the problem of analyzing the behavior of programs for RISC architectures based on binary execution traces. The profile generation algorithm was improved, and the resulting profile can be visualized in KCachegrind at a granularity down to linear sections of code. As a result, accurate application execution profiles were obtained on the SPEC CPU 2017 performance benchmarks.
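The idea of turning an execution trace into a Callgrind-style profile can be illustrated with a minimal sketch. It is not the algorithm from the papers: the trace layout (a list of hypothetical `(function, address)` records) and the helper name `trace_to_callgrind` are assumptions made for illustration. Only the `positions:`, `events:` and `fn=` lines and the `<address> <cost>` cost lines follow the Callgrind profile format.

```python
from collections import defaultdict


def trace_to_callgrind(trace):
    """Aggregate (function, address) trace records into Callgrind-style text.

    The trace layout is a hypothetical one chosen for this sketch.
    """
    # costs[function][address] = number of times that instruction was executed
    costs = defaultdict(lambda: defaultdict(int))
    for function, address in trace:
        costs[function][address] += 1

    # Header: positions are instruction addresses, the only event is Ir
    # (instructions executed).
    lines = ["# callgrind format", "positions: instr", "events: Ir", ""]
    for function in sorted(costs):
        lines.append(f"fn={function}")
        for address in sorted(costs[function]):
            lines.append(f"{address:#x} {costs[function][address]}")
        lines.append("")
    return "\n".join(lines)


# Hypothetical trace: the two instructions of "loop" are executed twice.
trace = [
    ("main", 0x1000),
    ("loop", 0x1004), ("loop", 0x1008),
    ("loop", 0x1004), ("loop", 0x1008),
    ("main", 0x100C),
]
profile = trace_to_callgrind(trace)
print(profile)
```

A real generator would also need call edges (`calls=` lines) and source or object file information, which this sketch leaves out.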
Published in 2025 IEEE International Conference on Cloud Computing Technology and Science (CloudCom), 2025
This paper proposes a novel matrix multiplication optimization for Huawei Ascend NPUs that offloads narrow MatMul computations from the underutilized Cube Unit to the Vector Unit using AscendC instructions. Applied to MLA inference in DeepSeek-V3, the method achieves a 20% mean performance gain in single-token processing by overlapping AIV and AIC execution.
Published in INFORMATION PROCESSES Electronic Scientific Journal ISSN: 1819-5822, 2026
Dynamic binary translation (DBT) for CPU simulation: a practical performance comparison of five JIT libraries (LLVM, Xbyak, AsmJit, GNU Lightning, MIR) using an RV32I simulator. Includes modular DBT prototyping, synthetic and algorithmic benchmarks, and integration insights. Provides actionable recommendations for selecting JIT tools for efficient DBT.
Published:
A comparison of modern JIT frameworks in the context of developing a high-performance architecture simulator (RISC-V) with Dynamic Binary Translation (DBT).
Undergraduate elective course, Moscow Institute of Physics and Technology, Department of Radio Engineering and Computer Technology, 1900
The course is devoted to software modeling.
During the course, students are introduced to various modeling techniques, from the simplest interpreter to a complex full-system simulator based on discrete-event simulation. In addition to the theoretical lectures, there are practical homework assignments in which students step by step develop their own RISC-V simulator, which is subsequently integrated into a course project.
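As an illustration of the "simplest interpreter" end of that range, here is a minimal sketch of a fetch-decode-execute loop for two RV32I instructions, ADDI and ADD. It is not taken from the course materials: the function names, the choice to store the program as a list of 32-bit words, and the absence of memory and branches are all simplifying assumptions. The field positions and opcodes follow the RV32I base instruction formats.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of `value` as a two's-complement number."""
    sign_bit = 1 << (bits - 1)
    return (value ^ sign_bit) - sign_bit


def run(program):
    """Execute a list of 32-bit RV32I words; only ADDI and ADD are supported."""
    regs = [0] * 32
    pc = 0
    while pc // 4 < len(program):
        insn = program[pc // 4]            # fetch
        opcode = insn & 0x7F               # decode the common fields
        rd = (insn >> 7) & 0x1F
        funct3 = (insn >> 12) & 0x7
        rs1 = (insn >> 15) & 0x1F
        rs2 = (insn >> 20) & 0x1F
        funct7 = insn >> 25
        if opcode == 0x13 and funct3 == 0:                    # ADDI
            result = regs[rs1] + sign_extend(insn >> 20, 12)
        elif opcode == 0x33 and funct3 == 0 and funct7 == 0:  # ADD
            result = regs[rs1] + regs[rs2]
        else:
            raise NotImplementedError(f"unsupported instruction {insn:#010x}")
        if rd != 0:                        # x0 is hard-wired to zero
            regs[rd] = result & 0xFFFFFFFF
        pc += 4
    return regs


program = [
    0x00500093,  # addi x1, x0, 5
    0x00700113,  # addi x2, x0, 7
    0x002081B3,  # add  x3, x1, x2
    0xFFF18213,  # addi x4, x3, -1
]
regs = run(program)
print(regs[3], regs[4])
```

Loads, stores, branches and a memory model would be the natural next steps before moving on to faster techniques.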