# AutoFix

Fix vulnerabilities automatically using LLMs.

## How to run?

You can run the patchflow as follows:

```bash
patchwork AutoFix
```

By default you will need to provide the `openai_api_key` and the `github_api_key`. You can pass them as arguments:

```bash
patchwork AutoFix openai_api_key=<Your_API_KEY> github_api_key=<Your_GH_Token>
```
## What it does?

The AutoFix patchflow scans your repository (by default with Semgrep) and uses `gpt-3.5-turbo` to generate the fix for the vulnerabilities found; you can check the default prompt template. The generated fixes are then committed to the repository under a new branch, and finally a pull request is created for the user to review and merge the changes.
## Use a different LLM

You can use a different LLM by pointing the patchflow at any OpenAI-compatible API endpoint and passing its key via the `openai_api_key` option.
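For example, to select a different OpenAI model (the `model` option name here is an assumption based on other patchwork patchflows; check your version's docs):

```bash
# Assumption: `model` selects the chat model used for fix generation
patchwork AutoFix model=gpt-4 openai_api_key=<Your_API_KEY> github_api_key=<Your_GH_Token>
```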
You can also use llama.cpp to run inference on CPU locally. Just install the `llama-cpp-python` package and run its OpenAI-compatible web server as described here, with the command:

```bash
python3 -m llama_cpp.server --hf_model_repo_id TheBloke/deepseek-coder-6.7B-instruct-GGUF --model 'deepseek-coder-6.7b-instruct.Q4_K_M.gguf' --chat_format chatml
```

Once the local server is running, set the client's base URL option so that requests go to the local server instead of the OpenAI API.
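A minimal sketch, assuming the base URL is passed as a `client_base_url` option (the option name is an assumption; the llama-cpp-python server listens on port 8000 by default):

```bash
# Assumption: `client_base_url` points the OpenAI client at the local
# llama.cpp server, which serves an OpenAI-compatible API on port 8000.
patchwork AutoFix client_base_url=http://localhost:8000/v1 github_api_key=<Your_GH_Token>
```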
## Context size

By default, we chunk the code to `context_size` tokens before passing it to the LLM. You can change the default value by setting the `context_size` option. Note that increasing the `context_size` doesn't necessarily lead to better fixes.
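For example (the token count here is illustrative, not the documented default):

```bash
# Pass larger chunks of code per LLM call; pick a size that fits
# within your model's context window.
patchwork AutoFix context_size=4096 openai_api_key=<Your_API_KEY> github_api_key=<Your_GH_Token>
```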
## Semgrep options

You can pass extra arguments to Semgrep with the `semgrep_extra_args` option. Alternatively, you can point the `sarif_file_path` option to an existing SARIF file, and AutoFix will use the information there to generate the fixes. Otherwise, we will do a scan using Semgrep.
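A sketch of both variants (the Semgrep arguments and file path are illustrative):

```bash
# Forward extra flags to the Semgrep scan, e.g. a specific ruleset
patchwork AutoFix semgrep_extra_args="--config p/python"

# Or skip the scan and fix the findings from an existing SARIF report
patchwork AutoFix sarif_file_path=./scan-results.sarif
```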
## Limit the number of fixes

By default, the patchflow fixes at most `vulnerability_limit` issues. This is to avoid making a large number of LLM calls and to keep the generated PR manageable. You can set the value to your preference, or set it to `-1` to fix all vulnerabilities found in the scan.
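For example:

```bash
# Cap the PR at five fixes (illustrative value)
patchwork AutoFix vulnerability_limit=5

# Or fix everything the scan found
patchwork AutoFix vulnerability_limit=-1
```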
## Manage the PRs

You can customize the name of the created branch by setting `branch_prefix`, or disable the creation of new branches with `disable_branch` (commits will then be made on the current branch). You can also disable PR creation with `disable_pr`, or force push commits to an existing PR with `force_pr_creation`.
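A sketch, assuming boolean options are passed as `true`/`false` (the exact flag syntax may vary by patchwork version):

```bash
# Commit fixes on branches named "autofix/..." but don't open a PR
patchwork AutoFix branch_prefix=autofix/ disable_pr=true
```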
## Use your own Semgrep rules

You can do `semgrep login` before running the AutoFix patchflow to get access to your own custom or pro rules if you have a Semgrep account (available for free). In addition, you can use any SAST scanner that outputs results in the standard SARIF format; just pass your scan results via the `sarif_file_path` option described above.
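For example (the report file name is illustrative):

```bash
# Authenticate with Semgrep once to enable your custom/pro rules
semgrep login

# Or feed AutoFix a SARIF report produced by another scanner
patchwork AutoFix sarif_file_path=./other-scanner-results.sarif
```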
## Customize the prompt

The default prompt template uses the variables `{{messageText}}` and `{{affectedCode}}`. They are generated by the steps within the AutoFix patchflow and replaced by the actual values during execution.
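A rough sketch of how these variables might appear in a custom template (this is not the actual default prompt):

```
Fix the vulnerability described below.

Issue:
{{messageText}}

Affected code:
{{affectedCode}}
```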
Also, remember to keep the output format as given in the default prompt, as it is used to extract information consumed by the steps that run after the model response is processed. The following output format is expected at the moment: