Intel has published release 1.2 of ControlFlag, a toolkit that detects errors and anomalies in source code using a machine learning system trained on a large volume of existing code. Unlike traditional static analyzers, ControlFlag does not apply ready-made rules, which can hardly anticipate every possible case; instead it relies on statistics about how language constructs are used across a large number of existing projects. The ControlFlag code is written in C++ and released under the MIT license.
The new release is notable for full support of anomaly detection and training based on typical code patterns for C++. Previous versions provided such support for C and PHP. The system is suited to finding various kinds of problems in code, from typos and incorrect combinations of types to anomalies in the conditional expressions of if statements and missing NULL checks on pointers. The system is trained by building a statistical model of an existing body of open source code in C, C++ and PHP published on GitHub and similar public repositories.
At the training stage, the system identifies typical patterns used to build structures in the code and constructs a syntax tree of the relationships between these patterns, reflecting the flow of code execution in the program. The result is a reference decision tree that combines the development experience of all the analyzed source code. The code under review goes through a similar process: its patterns are extracted and checked against the reference decision tree. Large discrepancies with neighboring branches indicate an anomaly in the pattern being checked.
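The idea can be illustrated with a toy sketch. This is not ControlFlag's implementation: it works on flat token strings rather than syntax trees, and the function names (`abstract_condition`, `train`, `check`), the tiny corpus and the `ratio` threshold are all assumptions made for illustration. Conditions from if statements are reduced to abstract patterns, their frequencies are counted, and a condition is flagged when a pattern that differs by a single token is far more common.

```python
import re
from collections import Counter

# Hypothetical tokenizer: identifiers, integers, two-character operators, then any symbol.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|==|!=|<=|>=|&&|\|\||\S")

def abstract_condition(cond):
    """Reduce a condition to a pattern, e.g. 'x == 7' becomes 'VAR == NUM'."""
    out = []
    for tok in TOKEN_RE.findall(cond):
        if tok == "NULL":
            out.append("NULL")
        elif re.fullmatch(r"[A-Za-z_]\w*", tok):
            out.append("VAR")
        elif tok.isdigit():
            out.append("NUM")
        else:
            out.append(tok)
    return " ".join(out)

def train(conditions):
    """Count how often each abstract pattern occurs in the training corpus."""
    return Counter(abstract_condition(c) for c in conditions)

def neighbors(pattern, model):
    """Known patterns of the same length that differ in exactly one token."""
    toks = pattern.split()
    found = []
    for other, count in model.items():
        other_toks = other.split()
        if len(other_toks) == len(toks):
            if sum(a != b for a, b in zip(toks, other_toks)) == 1:
                found.append((other, count))
    return found

def check(cond, model, ratio=10):
    """Return a much more common neighboring pattern, or None if cond looks normal."""
    pattern = abstract_condition(cond)
    seen = model.get(pattern, 0)
    near = neighbors(pattern, model)
    if not near:
        return None
    best, best_count = max(near, key=lambda pc: pc[1])
    # Assumed threshold: the neighbor must be at least `ratio` times more frequent.
    return best if best_count >= ratio * max(seen, 1) else None

# Made-up corpus standing in for conditions mined from open source projects.
corpus = (["count == 0"] * 40 + ["len != 10"] * 30 + ["ptr != NULL"] * 25
          + ["i < 8"] * 20 + ["flag = 1"])
model = train(corpus)

print(check("x = 7", model))   # VAR == NUM
print(check("x == 7", model))  # None
```

In this sketch the assignment `x = 7` is flagged because the nearly identical comparison pattern is forty times more frequent in the corpus, while `x == 7` passes because none of its neighbors dominates it. No hand-written rule about assignments in conditions is involved; the suggestion comes from the frequencies alone.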