Semgrep — Cairo 1.0 Support

AVNU
May 26, 2023

Cairo 1.0 has just been released and it will be a game-changer for Starknet. The latest version of the language brings a whole lot of new features as well as a completely revised syntax based on Rust, making it safer and more usable for developers.

Although StarkWare and the community have put a lot of effort into documentation, tooling, and IDE support, there is still a lack of useful tools to support the development of smart contracts in Cairo 1.0.

Therefore, at AVNU, we decided to add support for Cairo 1.0 in Semgrep, a static code analysis tool, to enhance the security tooling of Cairo 1.0. To the best of our knowledge, Cairo 1.0 is the second smart contract language supported by Semgrep after Solidity.

Semgrep

Code scanning at ludicrous speed

If you are coming from the realm of cybersecurity, I’m sure you are well acquainted with Semgrep and you can readily skip this section. On the other hand, if you’ve never heard about this amazing tool, buckle up!

Semgrep (Semantic-GREP) is an open-source static code analysis tool that has been built around the idea that most programming languages have very similar constructs and that we could combine language-specific parsers with a generic abstract syntax tree.

This architecture lets users parametrize the code analysis by writing rules in a pattern language whose syntax is independent of the language being analyzed.

Rule Example
rules:
  - id: unsafe_exec
    message: Call to unsafe function exec
    languages:
      - python
    severity: WARNING
    patterns:
      - pattern: exec(...)
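To make the rule above more concrete, here is a minimal sketch of what "match every call to exec, whatever its arguments" means at the syntax-tree level. It uses Python's standard `ast` module as a stand-in; it is not how Semgrep is implemented, and the names `find_exec_calls` and `SAMPLE` are made up for this illustration.

```python
import ast


def find_exec_calls(source: str) -> list[int]:
    """Return the line numbers of every direct call to exec() in `source`."""
    lines = []
    for node in ast.walk(ast.parse(source)):
        # A call whose callee is the bare name `exec`, with any arguments:
        # roughly what the pattern `exec(...)` expresses.
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id == "exec"
        ):
            lines.append(node.lineno)
    return sorted(lines)


# Hypothetical input: two real calls and one string that merely contains "exec(".
SAMPLE = '''\
user_input = "print(1)"
exec(user_input)
print("exec(not a call, just a string)")
exec("x = 1", {})
'''

print(find_exec_calls(SAMPLE))  # [2, 4]
```

Because the match is made on the syntax tree rather than on raw text, the string on line 3 is not reported, which is the main advantage over a plain grep.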

Abstracting the analysis from the underlying language makes Semgrep a unique tool that can be used in a wide range of situations. Indeed, while it is mostly used to check security issues, it can also be used to enforce best practices and coding conventions. This versatility makes it an extremely powerful and generic tool that should be part of any developer workflow.

In fact, because it is so generic, Semgrep can stand in for almost any code analysis tool. The days of maintaining three different code analyzers because your project uses three different languages are over. Semgrep is the single tool you need to rule them all, and major actors have already understood this.

If you are hooked, you can learn more about Semgrep by directly reading the official documentation which is very well-written and thorough.

Semgrep for Cairo 1.0

At AVNU, we recently added support for Cairo 1.0 to Semgrep; it shipped in version v1.22.0.

How to start

# Download semgrep
pip3 install semgrep

# Download rules
mkdir -p ~/.semgrep/rules && wget https://github.com/avnu-labs/semgrep-cairo-rules/releases/download/v0.0.1/cairo-rules.yaml -O ~/.semgrep/rules/cairo.yaml

# Run semgrep
semgrep -c ~/.semgrep/rules/cairo.yaml path/to/repo/to/scan

Once Semgrep is properly installed, you should be able to run the following command successfully from your terminal.

$> semgrep --version
> 1.21.0

To better understand how Semgrep works and how it can be used with Cairo, consider the following contract snippet.

use starknet::ContractAddress;

// ERC20 interface
trait ERC20{}

#[contract]
mod some_contract {
    // Storage variables

    #[external]
    fn transfer_token(from: ContractAddress, to: ContractAddress, amount: u256) -> bool {
        // Do some operations

        ERC20::transfer(erc20_address, from, to, amount);
        balances::write(from, balances::read(from) - amount);

        // Do some more operations

        return true;
    }
}

We now show two examples of rules that can be used to detect common issues found in smart contracts.

Unsafe Arithmetic

Numbers can overflow and it’s important to either check that an arithmetic operation is sound or to use a safe math library. Otherwise, this can have very harmful consequences for the contract.
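As a rough illustration of why an unchecked operation is dangerous, the Python sketch below models an unsigned 256-bit subtraction that wraps around silently and compares it with a checked version. This is only a model under that assumption: whether a particular Cairo type wraps or aborts is not something this sketch asserts, and `wrapping_sub` and `checked_sub` are hypothetical helper names.

```python
MODULUS = 2**256  # number of values representable in an unsigned 256-bit integer


def wrapping_sub(a: int, b: int) -> int:
    """Subtract as a silently wrapping 256-bit unsigned integer would."""
    return (a - b) % MODULUS


def checked_sub(a: int, b: int) -> int:
    """Subtract, but refuse to produce a result below zero."""
    if b > a:
        raise OverflowError("subtraction would underflow")
    return a - b


balance, amount = 5, 10

# Without a check, spending more than the balance yields an enormous balance.
print(wrapping_sub(balance, amount) == MODULUS - 5)  # True

# With a check, the same operation is rejected.
try:
    checked_sub(balance, amount)
except OverflowError as err:
    print("rejected:", err)
```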

Semgrep can easily flag unsafe arithmetic operations using, for example, the following naive rule:

rules:
  - id: unsafe math operator
    message: Call unsafe math operators on $X
    languages: [cairo]
    severity: ERROR
    pattern-either:
      - pattern: $X + $Y
      - pattern: $X += ...
      - pattern: $X - $Y
      - pattern: $X -= ...
      - pattern: $X * $Y
      - pattern: $X *= ...

Although this rule is very naive, it illustrates some of the most common constructs found in Semgrep: metavariables such as $X and $Y, the ellipsis operator (...), and pattern-either, which reports a finding as soon as any one of the listed patterns matches. If you want to know more about the exact meaning of these patterns, refer to the official Semgrep documentation.
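The way $X is bound and then reused in the message can be sketched in a few lines of Python. The example below is only an analogy: it runs Python's `ast` module on Python-flavoured stand-in code (`balances.read(sender)` instead of `balances::read(from)`), and `match_unsafe_math` is a hypothetical helper, not part of Semgrep.

```python
import ast

# Operators covered by the rule above: +, - and *.
UNSAFE_OPS = (ast.Add, ast.Sub, ast.Mult)


def match_unsafe_math(source: str) -> list[dict]:
    """Mimic `$X op $Y` and `$X op= ...`, recording what each metavariable binds to."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.BinOp) and isinstance(node.op, UNSAFE_OPS):
            findings.append({
                "line": node.lineno,
                "$X": ast.unparse(node.left),   # left operand binds to $X
                "$Y": ast.unparse(node.right),  # right operand binds to $Y
            })
        elif isinstance(node, ast.AugAssign) and isinstance(node.op, UNSAFE_OPS):
            # `$X += ...`: only the assignment target is bound; `...` matches anything.
            findings.append({"line": node.lineno, "$X": ast.unparse(node.target)})
    return sorted(findings, key=lambda finding: finding["line"])


# Hypothetical stand-in code; the division on the last line is not covered by the rule.
SAMPLE = '''\
balances.write(sender, balances.read(sender) - amount)
total += fee
ratio = a / b
'''

for finding in match_unsafe_math(SAMPLE):
    # The bound value of $X is substituted into the message template.
    print(f"{finding['line']}: Call unsafe math operators on {finding['$X']}")
```

Each finding carries its own bindings, which is why the message can name the exact expression that was matched.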

Running the rule on our example contract should yield the following result:

$> semgrep -c rules/unsafe_arithmetic.yaml example/contract_1.cairo
>
┌────────────────┐
│ 1 Code Finding │
└────────────────┘

example/contract_1.cairo
rules.unsafe math operator
Call unsafe math operators on balances::read(from)

14┆ balances::write(from, balances::read(from) - amount);

┌──────────────┐
│ Scan Summary │
└──────────────┘

Ran 1 rule on 1 file: 1 finding.
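If you want to consume findings from a script rather than read the text report, Semgrep can also emit JSON through its `--json` flag. The sketch below assumes the usual report layout (a top-level `results` list whose entries carry `check_id`, `path`, `start.line` and `extra.message`); the sample report is hand-written from the text output above and trimmed to those fields, and the helper names are hypothetical.

```python
import json
import subprocess


def build_command(config: str, target: str) -> list[str]:
    """Build the semgrep command line, asking for JSON instead of text."""
    return ["semgrep", "--config", config, "--json", target]


def run_semgrep(config: str, target: str) -> str:
    """Run semgrep and return its raw JSON report (requires semgrep on PATH)."""
    completed = subprocess.run(
        build_command(config, target), capture_output=True, text=True, check=False
    )
    return completed.stdout


def summarize(report_json: str) -> list[str]:
    """Turn a JSON report into one `path:line: [rule] message` string per finding."""
    report = json.loads(report_json)
    return [
        f"{r['path']}:{r['start']['line']}: [{r['check_id']}] {r['extra']['message']}"
        for r in report.get("results", [])
    ]


# Hand-written sample mirroring the text report above; not real tool output.
SAMPLE_REPORT = json.dumps({
    "results": [{
        "check_id": "rules.unsafe math operator",
        "path": "example/contract_1.cairo",
        "start": {"line": 14},
        "extra": {"message": "Call unsafe math operators on balances::read(from)"},
    }],
    "errors": [],
})

for line in summarize(SAMPLE_REPORT):
    print(line)
```

In a real pipeline you would pass the output of `run_semgrep("rules/unsafe_arithmetic.yaml", "example/contract_1.cairo")` to `summarize` and fail the build when the list is not empty.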

Reentrancy

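Reentrancy has caused millions of dollars in losses as well as many sleepless nights for smart contract users. While the issue can be very subtle to identify, there is a simple pattern to mitigate it: update the contract's own state before calling an external contract (often referred to as checks-effects-interactions).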

We can easily implement a generic Semgrep rule that can identify potential reentrancy issues.

rules:
  - id: reentrancy
    message: |
      Value mutated after call to external contract
    severity: ERROR
    mode: join
    join:
      rules:
        - id: external-contract-declaration
          languages: [cairo]
          pattern: |
            trait $SOME_CONTRACT {}
        - id: external-call
          languages: [cairo]
          pattern: |
            $SOME_CONTRACT::transfer(...);
            ...;
            $X::write(...);
      on:
        - 'external-contract-declaration.$SOME_CONTRACT == external-call.$SOME_CONTRACT'

This rule is a bit more complex than the previous one, but it illustrates one of the key features of Semgrep: metavariables. A metavariable is a Semgrep construct that binds information found when a pattern is matched. In the example above, if the pattern “trait $SOME_CONTRACT {}” is found, Semgrep stores the trait name under the metavariable “$SOME_CONTRACT”. This value can then be reused in other patterns.

Without going into the exact meaning of the syntax, the rule works as follows:

  1. First match two distinct patterns, referred to as “external-contract-declaration” and “external-call”.
  2. Use the metavariable “$SOME_CONTRACT” to create a join condition that allows us to detect an issue if and only if these two patterns occur jointly.
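The two steps above can be sketched in Python as two independent matchers followed by a join on the shared metavariable. This is a deliberately rough, text-based approximation using regular expressions (Semgrep works on syntax trees, not raw text), and all function names are hypothetical.

```python
import re

# Trimmed version of the example contract from earlier in this article.
CONTRACT = """\
trait ERC20{}

fn transfer_token() {
    ERC20::transfer(erc20_address, from, to, amount);
    balances::write(from, balances::read(from) - amount);
}
"""


def declared_traits(source: str) -> set[str]:
    """Step 1a: stand-in for the pattern `trait $SOME_CONTRACT {}`."""
    return set(re.findall(r"\btrait\s+(\w+)\s*\{\s*\}", source))


def calls_followed_by_write(source: str) -> list[tuple[str, str]]:
    """Step 1b: stand-in for `$SOME_CONTRACT::transfer(...); ...; $X::write(...);`."""
    pattern = re.compile(
        r"\b(\w+)::transfer\(.*?\);.*?\b(\w+)::write\(", re.DOTALL
    )
    return [(m.group(1), m.group(2)) for m in pattern.finditer(source)]


def reentrancy_candidates(source: str) -> list[tuple[str, str]]:
    """Step 2: keep a call only if its $SOME_CONTRACT equals a declared trait."""
    traits = declared_traits(source)
    return [(contract, target)
            for contract, target in calls_followed_by_write(source)
            if contract in traits]


print(reentrancy_candidates(CONTRACT))  # [('ERC20', 'balances')]
```

If the write happens before the external call, or if the call targets a name that was never declared as a trait, the join produces nothing and no finding is reported.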

The above two rules show the expressiveness and power of Semgrep and how they can be used to support secure smart contract development.

Conclusion

Security is a community endeavor

Semgrep is a widely used tool that can be easily integrated into your development workflow and can act as a first rampart against security vulnerabilities. However, its efficiency is highly dependent on the quality of the Cairo 1.0 parser as well as on the rules available, and this is why we must build a community around this tool.

If you are interested in participating in the development of the parser and the integration of Cairo 1.0 in Semgrep, feel free to reach out to us. The corresponding repository can be found here.

If you are interested in creating rules, you can reach out to us as well. We’ve provisioned a repository here where we plan on putting the rules before we move them to the Semgrep registry.

ℹ️ Last but not least, you can join the public Telegram group and help this initiative thrive.

Acknowledgments

This work would not have been possible without Yoann Padioleau from the Semgrep team, whose guidance and expertise have been invaluable during the whole process.

Author: Romain Jufer, Chief Scientist Officer at AVNU
