# Semgrep
semgrep (syntactic grep) is an open-source tool for finding patterns in code. It's useful for preventing the use of known anti-patterns in a codebase or enforcing the correct use of secure-by-default frameworks (e.g. always use a project's sanitization method on user-provided data).
semgrep is fast and powerful; its grep-esque patterns are lifted into abstract syntax tree (AST) matchers. Compared to regexes, these patterns aren't affected by whitespace, comments, newlines, the order of keyword arguments, variable renaming, and other language nuances.
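
To make the difference between text matching and AST matching concrete, here is a minimal sketch that uses Python's standard `ast` module as a stand-in. It is not how semgrep is implemented; the helper names `regex_finds_eval` and `ast_finds_eval` and the sample strings are hypothetical and exist only for this illustration.

```python
import ast
import re


def regex_finds_eval(source: str) -> bool:
    """Naive text search: matches the literal text 'eval(' anywhere."""
    return re.search(r"eval\(", source) is not None


def ast_finds_eval(source: str) -> bool:
    """Structural search: true only if the parsed code calls a name `eval`."""
    tree = ast.parse(source)
    return any(
        isinstance(node, ast.Call)
        and isinstance(node.func, ast.Name)
        and node.func.id == "eval"
        for node in ast.walk(tree)
    )


# A real call, split across lines with a comment in the middle.
spaced_call = "result = eval (\n    user_input  # untrusted\n)\n"
# No call at all: the text only appears inside a comment.
only_comment = "x = 1  # never use eval(user_input) here\n"
```

On `spaced_call` the regex misses the call because of the extra space, while the AST search finds it. On `only_comment` the regex reports a false positive, while the AST search correctly finds nothing.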
Currently, the supported languages are: C, Go, Java, JavaScript, and Python.
## Configuration
There are two types of rules in Semgrep:
- Simple rules - expressed with a single `pattern`.
- Advanced rules - expressed with multiple patterns, like: X must be true AND Y must be too, or X but NOT Y, or X must occur inside a block of code that Y matches. These patterns are composed with the `patterns` keyword.
In `salus.yaml`, both simple and advanced rules can be specified with a path to a Semgrep YAML config file. In addition, simple rules can be specified directly in `salus.yaml`.
### Specifying path to Semgrep YAML config
In `salus.yaml`, you can specify a set of semgrep rules with a path to a Semgrep config file. You must specify
- `config` - a full Semgrep config file
- Either `required: true` or `forbidden: true`
  - If a found pattern is forbidden or if a not found pattern is required, then the scanner will fail and the `message` will be shown to the developer in the report.

In addition, you can optionally specify
- `exclude` - Skip any file or directory that matches this pattern. `--exclude='*.py'` will ignore the following: `foo.py`, `src/foo.py`, `foo.py/bar.sh`. `--exclude='tests'` will ignore `tests/foo.py` as well as `a/b/tests/c/foo.py`. Can be added multiple times.

For example:

    scanner_configs:
      Semgrep:
        matches:
          - config: semgrep_config_1.yaml
            forbidden: true
          - config: semgrep_config_2.yaml
            forbidden: true
            exclude:
              - tests
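
The `exclude` examples above suggest that a pattern is compared against every component of a path, not only the file name. The following sketch models that reading in Python. The function `is_excluded` is hypothetical and is not part of Salus or semgrep; the component-wise glob matching is an assumption inferred from the listed examples.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath
from typing import Iterable


def is_excluded(path: str, exclude_patterns: Iterable[str]) -> bool:
    """Return True if any component of `path` matches any exclude glob.

    Assumption: patterns apply to each directory or file name in the path,
    which is what the documented examples imply.
    """
    patterns = list(exclude_patterns)
    parts = PurePosixPath(path).parts
    return any(fnmatchcase(part, pattern) for part in parts for pattern in patterns)


# Paths taken from the examples in the text.
print(is_excluded("foo.py/bar.sh", ["*.py"]))        # a directory named foo.py
print(is_excluded("a/b/tests/c/foo.py", ["tests"]))  # a nested tests directory
print(is_excluded("src/test_utils.py", ["tests"]))   # no component is exactly "tests"
```

Under this model a directory named `foo.py` is skipped by `*.py` just like a file would be, which matches the `foo.py/bar.sh` example.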
Example `semgrep_config_1.yaml`. The rule says find all patterns of the form `$X == $X`, but exclude `0 == 0`.

    rules:
      - id: eqeq-always-true
        patterns:
          - pattern: $X == $X
          - pattern-not: 0 == 0
        message: "$X == $X is always true"
        languages: [python]
        severity: ERROR
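
To illustrate what combining `pattern` with `pattern-not` means, the sketch below approximates the `eqeq-always-true` rule with Python's standard `ast` module. It is an illustration of the idea only, not semgrep's matching engine; the function `find_eqeq_always_true` and the `sample` source are hypothetical.

```python
import ast
from typing import List


def find_eqeq_always_true(source: str) -> List[int]:
    """Return line numbers of `X == X` comparisons, skipping `0 == 0`."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        # Only consider simple two-operand `==` comparisons.
        if not isinstance(node, ast.Compare):
            continue
        if len(node.ops) != 1 or not isinstance(node.ops[0], ast.Eq):
            continue
        left, right = node.left, node.comparators[0]
        # pattern: $X == $X (both sides have the same structure).
        if ast.dump(left) != ast.dump(right):
            continue
        # pattern-not: 0 == 0 (skip the integer literal zero).
        if isinstance(left, ast.Constant) and type(left.value) is int and left.value == 0:
            continue
        findings.append(node.lineno)
    return sorted(findings)


sample = (
    "if user.id == user.id:  # line 1: flagged\n"
    "    pass\n"
    "if a == b:  # line 3: different operands\n"
    "    pass\n"
    "assert 0 == 0  # line 5: excluded by pattern-not\n"
    "ok = (x   ==   # line 6: flagged despite odd spacing\n"
    "      x)\n"
)

print(find_eqeq_always_true(sample))  # [1, 6]
```

The two conditions are combined with AND: a comparison is reported only if it matches the first pattern and does not match the second.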
Keywords in this file:
- `id` - Unique, descriptive identifier, cannot contain whitespace (required)
- `patterns` or `pattern` - One of `patterns`, `pattern`, or `pattern-regex` (required)
- `message` - Message shown if the rule is forbidden and found, or required and not found (optional)
- `languages` - Any of: c, go, java, javascript, or python (required)
- `severity` - One of: WARNING, ERROR (required)
### Adding simple rule directly (without Semgrep config file)
Simple rules that can be expressed with a single `pattern` can be directly specified in `salus.yaml`.
Each simple rule in `salus.yaml` must include
- `pattern` - the single pattern
- `forbidden: true` or `required: true`
- `language` - Any of: c, go, java, javascript, or python

The user can optionally provide
- `exclude` - Skip any file or directory that matches this pattern. `--exclude='*.py'` will ignore the following: `foo.py`, `src/foo.py`, `foo.py/bar.sh`. `--exclude='tests'` will ignore `tests/foo.py` as well as `a/b/tests/c/foo.py`. Can be added multiple times.
- `message` - Message shown if the rule is forbidden and found, or required and not found

Example:

    scanner_configs:
      Semgrep:
        matches:
          - pattern: $X == $X
            message: Useless equality check
            language: python
            forbidden: true
            exclude:
              - tests
          - pattern: $X.unsanitize(...)
            message: Don't call `unsanitize()` methods without careful review
            language: js
            forbidden: true
            exclude:
              - node_modules
          - pattern: $LOG_ENDPOINT = os.getenv("LOGGER_ENDPOINT", ...)
            message: All files need to get the dynamic logger. Please don't hardcode this.
            language: python
            required: true
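
The pass or fail outcome of each `matches` entry follows from two facts: whether the pattern was found, and whether the entry is `forbidden` or `required`. The sketch below spells out that decision in Python. The function `evaluate_match` and its fallback messages are hypothetical and do not reflect Salus's actual code.

```python
from typing import Optional


def evaluate_match(found: bool, forbidden: bool = False, required: bool = False,
                   message: str = "") -> Optional[str]:
    """Return the failure text for one `matches` entry, or None if it passes."""
    # The documentation asks for either `forbidden: true` or `required: true`.
    if forbidden == required:
        raise ValueError("set exactly one of forbidden or required")
    # A forbidden pattern fails when found; a required one fails when missing.
    failed = found if forbidden else not found
    if not failed:
        return None
    # `message` is optional; this generic fallback text is an assumption.
    default = "forbidden pattern found" if forbidden else "required pattern not found"
    return message or default


print(evaluate_match(found=True, forbidden=True, message="Useless equality check"))
print(evaluate_match(found=True, required=True))
```

The first call returns the configured message because a forbidden pattern was found. The second returns `None` because the required pattern is present.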
## Whitelisting Findings
Please see semgrep's ignoring findings documentation.
## Limitations of Semgrep
- There may be parser-related issues from Semgrep.
  - Parser-related issues will be displayed as warnings and do not cause salus to fail.
  - Salus will still show semgrep results from files that do not have parser issues.
- Salus semgrep currently does not support scanning against pre-built rules.
  - But we plan to support this in the near future!