Static Program Analysis Jun Ma majun@nju.edu.cn
Static Program Analysis: Overview
What is Static Analysis?
A program that takes programs as input and produces useful results.
Code or Binary -> Static Analyser -> Useful Results
Useful results:
- Transformed code (assembly, instrumented code, ...)
- Potential bugs (bad practices, null pointers, ...)
- Software artifacts (diagrams, architecture, ...)
Examples:
- Compilers (and optimization passes)
- Static checkers (e.g., -Wall, lint, ...)
- Useful results for SE practices
Categories of Static Program Analysis
Static Analysis:
- Lexical Analysis
- Syntax Analysis
- Semantic Analysis
Lexical Analysis
Treating a program as a sequence of symbols/tokens
Example: Empirical Study on Variable Naming
- What are the styles, abbreviations, ... of variable names?
- Are they correlated with bugs, code quality, ...?
You can study this by treating code as a tokenized text stream:
    "(a + b) * 2" =>
    [ (SYM, '('), (ID, 'a'), (BIN_OP, '+'), (ID, 'b'), (SYM, ')'), (BIN_OP, '*'), (INT, '2') ]
We are interested in the IDs.
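The tokenized view above can be sketched with a small regex-based lexer. This is a minimal illustration, not the tooling used in any particular study: the token classes (`SYM`, `ID`, `BIN_OP`, `INT`) are taken from the slide's example, while the regular expressions and the helper names `tokenize` and `identifiers` are assumptions made here for a toy expression language.

```python
import re

# Token classes named after the slide's example; the patterns are assumptions
# that only cover simple arithmetic expressions.
TOKEN_SPEC = [
    ("INT", r"\d+"),
    ("ID", r"[A-Za-z_]\w*"),
    ("BIN_OP", r"[+\-*/]"),
    ("SYM", r"[()]"),
    ("SKIP", r"\s+"),  # whitespace is recognized but not emitted
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source):
    """Turn source text into a list of (kind, text) pairs."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise ValueError(f"unexpected character {source[pos]!r} at offset {pos}")
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


def identifiers(source):
    """Keep only the ID tokens, which is what a naming study looks at."""
    return [text for kind, text in tokenize(source) if kind == "ID"]


print(tokenize("(a + b) * 2"))
print(identifiers("(a + b) * 2"))
```

A naming study would run something like `identifiers` over a whole code base and then analyze the resulting names (length, casing style, abbreviations) against bug or quality data.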
Example: Differencing Files
How to define "diffs" between two file versions?
The Edit Distance Approximation
[Figure: edit script between two character sequences; legend: Delete, Insert, Unchanged]
Myers, E.W. An O(ND) difference algorithm and its variations. Algorithmica 1, 251–266 (1986). https://doi.org/10.1007/BF01840446
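To make the Delete / Insert / Unchanged labels concrete, here is a minimal sketch that computes a shortest insert/delete edit script. Note that it uses the textbook quadratic dynamic program over longest common subsequences, not Myers' O(ND) algorithm; both minimize the same edit distance, but Myers' method is faster when the two inputs are similar. The function name `diff` and the input strings are illustrative assumptions.

```python
def diff(old, new):
    """Return a shortest edit script as (operation, element) pairs."""
    n, m = len(old), len(new)
    # lcs[i][j] = length of the longest common subsequence of old[i:] and new[j:]
    lcs = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if old[i] == new[j]:
                lcs[i][j] = lcs[i + 1][j + 1] + 1
            else:
                lcs[i][j] = max(lcs[i + 1][j], lcs[i][j + 1])

    # Walk the table from the start, keeping matches and otherwise taking
    # the step that preserves the longest remaining common subsequence.
    script = []
    i = j = 0
    while i < n and j < m:
        if old[i] == new[j]:
            script.append(("unchanged", old[i]))
            i += 1
            j += 1
        elif lcs[i + 1][j] >= lcs[i][j + 1]:
            script.append(("delete", old[i]))
            i += 1
        else:
            script.append(("insert", new[j]))
            j += 1
    script.extend(("delete", item) for item in old[i:])
    script.extend(("insert", item) for item in new[j:])
    return script


for operation, item in diff("abcabba", "cbabac"):
    print(operation, item)
```

The same function works on lists of lines instead of characters, which is how file diffs are usually computed.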
Is Edit Distance a Good Idea?
- Minimizing edit distance is a good hack
- It lacks a semantic explanation of what changed
- It does not work well for added indentation, renamed variables, ...
Open Problem: How to produce even more developer-friendly diffs?
You can work out a paper on this!
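The indentation and renaming limitations can be observed with Python's standard `difflib` module. This is a small sketch under assumptions: the helper `changed_line_count` and the sample snippets are made up for illustration, and stripping whitespace before comparing is only a crude lexical workaround, not a solution to the open problem.

```python
import difflib


def changed_line_count(old_lines, new_lines):
    """Count lines that a line-based diff reports as not equal."""
    matcher = difflib.SequenceMatcher(None, old_lines, new_lines, autojunk=False)
    return sum(
        max(i2 - i1, j2 - j1)
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    )


def strip_all(lines):
    """Drop leading and trailing whitespace from every line."""
    return [line.strip() for line in lines]


# Re-indenting one line: the line is reported as changed although
# only whitespace differs.
old = ["if (x) {", "y = 1;", "}"]
new = ["if (x) {", "    y = 1;", "}"]
print(changed_line_count(old, new))
print(changed_line_count(strip_all(old), strip_all(new)))

# Renaming a variable: every line that mentions it is reported as changed,
# with no indication that a single rename explains all of them.
before = ["total = a + b", "print(total)"]
after = ["result = a + b", "print(result)"]
print(changed_line_count(before, after))
```

In both cases the diff is minimal in terms of edit distance, yet it does not say what actually happened (a re-indent, a rename), which is the gap the open problem points at.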
Syntax Analysis on AST