[Security Issue] Code Execution against underlying host system in Custom Component #1973
Comments
Hey @0xfatty, I'm here to help you with any bugs, questions, or contributions you have in mind. Let's get started!

To address the security vulnerability in the Custom Component feature of the Langflow framework that allows for arbitrary code execution when hosted online, you should implement input validation and sanitization. Additionally, you can leverage the existing structure of the [...]. Here is an example of how you can modify the [...]:

    import re
    from langflow.custom import CustomComponent
    from langflow.schema import Record
    from langflow.custom.eval import eval_custom_component_code
    from langflow.utils import validate

    class Component(CustomComponent):
        display_name = "Custom Component"
        description = "Use as a template to create your own component."
        documentation: str = "http://docs.langflow.org/components/custom"
        icon = "custom_components"

        def build_config(self):
            return {"param": {"display_name": "Parameter"}}

        def build(self, param: str) -> Record:
            # Basic input validation to prevent arbitrary code execution
            if not isinstance(param, str):
                raise ValueError("Parameter must be a string")
            # Example of sanitizing input to remove potentially dangerous characters
            sanitized_param = re.sub(r'[^\w\s]', '', param)
            # Validate and execute the custom component code safely
            if not self.code:
                raise ComponentCodeNullError(
                    status_code=400,
                    detail={"error": self.ERROR_CODE_NULL, "traceback": ""},
                )
            if not self._function_entrypoint_name:
                raise ComponentFunctionEntrypointNameNullError(
                    status_code=400,
                    detail={
                        "error": self.ERROR_FUNCTION_ENTRYPOINT_NAME_NULL,
                        "traceback": "",
                    },
                )
            custom_function = validate.create_function(self.code, self._function_entrypoint_name)
            result = custom_function(sanitized_param)
            return Record(data=result)

In this example:
This approach helps mitigate the risk of arbitrary code execution by ensuring that the input is safe before processing it and by using the existing validation mechanisms provided by the Langflow framework [1][2].
Thank you for your prompt response and for suggesting input validation and sanitization. However, I'd like to clarify that the core issue lies in the entire Langflow framework being hosted publicly, not in the safety of individual published components. The vulnerability arises because the framework allows users to create and execute their own custom components on a publicly accessible server. In this scenario, a malicious user can upload a custom component containing arbitrary and potentially harmful code, which then gets executed on the server.

Take the online playground provided by Langflow at https://huggingface.co/spaces/Langflow/Langflow as an example. With the same attack vector, I was able to execute OS commands against the underlying Docker container. Sample output
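To make the distinction above concrete, here is a minimal, benign sketch of why sanitizing the parameter does not constrain the component code itself. It does not use Langflow; `USER_SUPPLIED_CODE`, `load_entrypoint`, and `run_component` are hypothetical names, and the `exec`-based loader is only an assumption about how a server might turn submitted source into a callable.

```python
import re

# Stand-in for source text a user submits through the editor (hypothetical, benign).
USER_SUPPLIED_CODE = '''
import platform

def build(param):
    # Any statement is allowed here; this one only reads the host OS name.
    return {"param": param, "host_os": platform.system()}
'''


def load_entrypoint(source: str, name: str):
    """Compile and exec submitted source, then return the named function."""
    namespace = {}
    exec(compile(source, "<custom_component>", "exec"), namespace)
    return namespace[name]


def run_component(source: str, param: str):
    # Sanitizing the argument has no effect on what the function body does.
    sanitized = re.sub(r"[^\w\s]", "", param)
    return load_entrypoint(source, "build")(sanitized)


result = run_component(USER_SUPPLIED_CODE, "hello; world!")
```

The argument arrives stripped of punctuation, yet the submitted function still imports a module and queries the host, because the code, not the parameter, is the untrusted input.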
We also anticipated encountering similar issues when implementing Whisper in Langflow, specifically when converting voice file inputs to Base64 within the component. As you mentioned, if a malicious user gains access to a Langflow instance exposed to clients, they could include custom components containing such backdoor code (similar to the Hugging Face scenario you mentioned).

Given the recent pull request history showing an image output component, it seems likely that an input component field will be added soon. It would be ideal if file input and output handling were restricted solely to the input component level, preventing users from writing these operations directly in custom components. (Alternatively, Langflow could include a mechanism to directly validate the Base64 format.)

Additionally, I recently heard that the Langflow core team is refactoring to use Casbin for RBAC. Beyond debug/build modes, introducing Casbin will likely result in account-based API call control. This should be emphasized in the manual to caution users accordingly. (At the product level, the example scenario you mentioned should be used to highlight the importance of disabling the AUTO LOGIN flag.) Similarly,
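As a rough illustration of the Base64 validation idea mentioned above, the following sketch uses only the Python standard library. The function name `decode_strict_base64` and the 10 MiB limit are assumptions made for the example, not part of Langflow.

```python
import base64
import binascii


def decode_strict_base64(payload: str, max_bytes: int = 10 * 1024 * 1024) -> bytes:
    """Decode payload only if it is well-formed standard Base64 within a size limit."""
    try:
        # validate=True rejects characters outside the Base64 alphabet
        # instead of silently discarding them.
        raw = base64.b64decode(payload, validate=True)
    except (binascii.Error, ValueError) as exc:
        raise ValueError("payload is not valid Base64") from exc
    if len(raw) > max_bytes:
        raise ValueError("decoded payload exceeds size limit")
    return raw
```

A check like this only confirms the format of a file payload. It says nothing about what the decoded bytes contain, and it does not restrict what custom component code can do.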
Thank you for your detailed response and for acknowledging the potential risks associated with the [...]

File Input and Output Handling
Restricting file input and output operations to specific input components is a great step towards mitigating the risk of arbitrary code execution. By limiting these operations to controlled and validated components, you reduce the attack surface significantly. Validating the Base64 format directly at the component level is also a good way to ensure data integrity and security. However, Base64 is just one of several possible encodings. I think it would be better to have a mechanism that blocks outbound calls made by invoking this API.

Casbin for RBAC:
The refactoring to include Casbin for role-based access control (RBAC) is a positive move. Beyond debug/build modes, implementing account-based API call control will enhance security by ensuring that only authorized users can execute certain actions. Emphasizing this in the documentation, especially with a focus on disabling the [...]

Exec-like Code Risks:
Allowing exec-like code in custom components poses significant security risks, as it gives attackers a direct method to execute arbitrary commands. Even with input validation and sandboxing, the inherent risk of arbitrary code execution remains high. I recommend considering the following additional measures:
I appreciate the efforts being made to enhance Langflow's security and am available to provide further insights or assistance as needed.
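One possible pre-execution check for the exec-like code risk discussed in this thread is a static scan of submitted source with Python's `ast` module. The sketch below is an illustration under assumptions: `BLOCKED_MODULES`, `BLOCKED_CALLS`, and `find_risky_nodes` are hypothetical names, the denylists are not exhaustive, and nothing here reflects how Langflow actually validates code.

```python
import ast

# Hypothetical denylists for illustration only; not exhaustive.
BLOCKED_MODULES = {"os", "subprocess", "socket", "ctypes", "shutil"}
BLOCKED_CALLS = {"exec", "eval", "compile", "__import__", "open"}


def find_risky_nodes(source: str) -> list[str]:
    """Return descriptions of blocked imports and calls found in source."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        # Collect module names from both "import x" and "from x import y".
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [node.module or ""]
        else:
            names = []
        for name in names:
            if name.split(".")[0] in BLOCKED_MODULES:
                findings.append(f"line {node.lineno}: import of {name}")
        # Flag direct calls to blocked builtins such as eval(...).
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id in BLOCKED_CALLS:
                findings.append(f"line {node.lineno}: call to {node.func.id}()")
    return findings
```

A denylist scan like this is easy to bypass (for example through `getattr`, `importlib`, or attribute chains), so at best it is one layer of defense. It does not replace process isolation, network egress controls, or authentication on the endpoint.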
@yamonkjd I just added a commit publishing a Security policy page (#2000). This is to publish a CVE tracking number for this issue. I hope you can review and approve it, and publish an advisory for this.
lol, we are contributors to Langflow, not core members. I seem to have caused some confusion. @ogabrielluiz
CVE-2024-37014 has been assigned by MITRE to track this security issue.
Quick note
Langflow is a very interesting and useful framework for those who work on AI projects. Personally, I find it a cool project to learn from. It seems the Custom Component feature was launched to support user-defined Python scripts that use the library Langflow provides. This is a great feature. However, its documentation in [1] and [2] does not appear to mention any potential security issues, so the following finding was probably not anticipated.
[1] https://docs.langflow.org/components/custom
[2] https://docs.langflow.org/guidelines/custom-component
Describe the bug
The Custom Component feature allows users to provide their own Python scripts using the CustomComponent class provided by the Langflow library. This is excellent for local testing and experimentation. However, if the framework is hosted online, it creates a security issue: a bad actor can submit arbitrary Python code and gain code execution on the hosting server.
Impacted API
POST /api/v1/custom_component
Browser and Version
To Reproduce
Steps to reproduce the behavior:
1. In CustomComponent, within the Component class, provide the following Python function.
2. On Check & Save, the /api/v1/custom_component API is invoked to process the provided Python script, which then leads to OS command execution. The output will be Base64 encoded and sent to a malicious server.
Screenshots
Additional context
The vulnerability allows arbitrary code execution by injecting malicious code through the Custom Component feature. This could lead to significant security risks, including data theft, unauthorized access, and disruption of services, especially when the framework is hosted publicly.