feat(llmobs): llm datasets and experiments #12918
base: main
…dd-trace-py into jonathan.chavez/llm-experiments
# Derived values
def get_api_base_url() -> str:
    """Get the base URL for API requests."""
    if get_site().endswith("datadoghq.com"):
Check failure
Code scanning / CodeQL
Incomplete URL substring sanitization High
datadoghq.com
Copilot Autofix
AI 6 days ago
To fix the problem, we need to parse the URL and check the hostname properly instead of using a simple string method. This can be done using the urlparse function from the urllib.parse module. By extracting the hostname and then performing the check, we can ensure that the URL is correctly validated.
The best way to fix the problem without changing existing functionality is to modify the get_api_base_url and get_base_url functions to use urlparse for extracting and validating the hostname. This ensures that the check is performed on the actual hostname rather than any part of the URL string.
@@ -28,5 +28,8 @@
     """Get the base URL for API requests."""
-    if get_site().endswith("datadoghq.com"):
-        return f"https://api.{get_site()}"
-    elif get_site().endswith("datad0g.com"):
+    from urllib.parse import urlparse
+    site = get_site()
+    hostname = urlparse(f"https://{site}").hostname
+    if hostname and hostname.endswith("datadoghq.com"):
+        return f"https://api.{site}"
+    elif hostname and hostname.endswith("datad0g.com"):
         return "https://dd.datad0g.com"
@@ -36,7 +39,9 @@
     """Get the base URL for the LLM Observability UI."""
-    if get_site() == "datadoghq.com":
+    site = get_site()
+    hostname = urlparse(f"https://{site}").hostname
+    if hostname == "datadoghq.com":
         return "https://app.datadoghq.com"
-    elif get_site().endswith("datadoghq.com"):
-        return f"https://{get_site()}"
-    elif get_site() == "datad0g.com":
+    elif hostname and hostname.endswith("datadoghq.com"):
+        return f"https://{site}"
+    elif hostname == "datad0g.com":
         return "https://dd.datad0g.com"
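One caveat about the suggested change: a bare suffix test on the parsed hostname still accepts a look-alike host such as evildatadoghq.com, because the suffix does not have to start at a label boundary. A minimal sketch of a stricter check follows, assuming the site value is a bare host like the ones in this diff. The helper name host_in_domain is hypothetical and not part of the PR.

```python
from typing import Optional
from urllib.parse import urlparse


def host_in_domain(site: str, domain: str) -> bool:
    """Return True if `site` names `domain` itself or one of its subdomains.

    Hypothetical helper for illustration; it is not part of the PR.
    """
    # Parse with a scheme so urlparse treats the value as a network location.
    hostname: Optional[str] = urlparse(f"https://{site}").hostname
    if not hostname:
        return False
    # Require an exact match or a match on a dot boundary, so that
    # "evildatadoghq.com" does not pass as a datadoghq.com host.
    return hostname == domain or hostname.endswith("." + domain)
```

With this shape, host_in_domain("us3.datadoghq.com", "datadoghq.com") is accepted while host_in_domain("evildatadoghq.com", "datadoghq.com") is rejected.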
    """Get the base URL for API requests."""
    if get_site().endswith("datadoghq.com"):
        return f"https://api.{get_site()}"
    elif get_site().endswith("datad0g.com"):
Check failure
Code scanning / CodeQL
Incomplete URL substring sanitization High
datad0g.com
Copilot Autofix
AI 6 days ago
To fix the problem, we need to ensure that the URL is properly parsed and its hostname is checked in a secure manner. Instead of using a simple string comparison, we should use the urlparse function from the urllib.parse module to extract the hostname and then perform the check.
- Parse the URL using urlparse to extract the hostname.
- Check if the hostname ends with the allowed domain.
- Update the get_api_base_url and get_base_url functions to use this approach.
@@ -26,7 +26,11 @@
 # Derived values
+from urllib.parse import urlparse
+
 def get_api_base_url() -> str:
     """Get the base URL for API requests."""
-    if get_site().endswith("datadoghq.com"):
-        return f"https://api.{get_site()}"
-    elif get_site().endswith("datad0g.com"):
+    site = get_site()
+    hostname = urlparse(f"https://{site}").hostname
+    if hostname and hostname.endswith("datadoghq.com"):
+        return f"https://api.{hostname}"
+    elif hostname and hostname.endswith("datad0g.com"):
         return "https://dd.datad0g.com"
@@ -36,7 +40,9 @@
     """Get the base URL for the LLM Observability UI."""
-    if get_site() == "datadoghq.com":
+    site = get_site()
+    hostname = urlparse(f"https://{site}").hostname
+    if hostname == "datadoghq.com":
         return "https://app.datadoghq.com"
-    elif get_site().endswith("datadoghq.com"):
-        return f"https://{get_site()}"
-    elif get_site() == "datad0g.com":
+    elif hostname and hostname.endswith("datadoghq.com"):
+        return f"https://{hostname}"
+    elif hostname == "datad0g.com":
         return "https://dd.datad0g.com"
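The three steps listed in this suggestion can be sketched as one self-contained function. This is an illustration under stated assumptions, not the PR's code: the site is passed in as a parameter instead of being read through get_site(), the names api_base_url_for and _in_domain are hypothetical, and the ValueError for unrecognized sites is an assumption because the fallback branch is not visible in this diff.

```python
from urllib.parse import urlparse


def _in_domain(hostname: str, domain: str) -> bool:
    # Hypothetical helper: match the domain itself or a dot-separated subdomain.
    return hostname == domain or hostname.endswith("." + domain)


def api_base_url_for(site: str) -> str:
    """Sketch of the API base URL lookup with the hostname parsed first."""
    # Step 1: parse the value and extract the hostname.
    hostname = urlparse(f"https://{site}").hostname
    # Step 2: compare the hostname, not the raw string, with the allowed domains.
    if hostname and _in_domain(hostname, "datadoghq.com"):
        return f"https://api.{hostname}"
    if hostname and _in_domain(hostname, "datad0g.com"):
        return "https://dd.datad0g.com"
    # Assumption: the PR's real fallback is not shown, so this sketch raises.
    raise ValueError(f"unrecognized site: {site!r}")
```

Building the returned URL from the parsed hostname, as the second suggestion does, means any path or query text in the configured value never reaches the result.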
    """Get the base URL for the LLM Observability UI."""
    if get_site() == "datadoghq.com":
        return "https://app.datadoghq.com"
    elif get_site().endswith("datadoghq.com"):
Check failure
Code scanning / CodeQL
Incomplete URL substring sanitization High
datadoghq.com
Copilot Autofix
AI 6 days ago
To fix the problem, we need to ensure that the URL is properly parsed and the hostname is validated correctly. We will use the urlparse function from the urllib.parse module to parse the URL and then check if the hostname ends with the allowed domain. This approach ensures that the check is not bypassed by malicious URLs.
We will modify the get_base_url function to use urlparse for parsing the URL and validating the hostname.
@@ -1,3 +1,3 @@
 import os
-
+from urllib.parse import urlparse
 from .._llmobs import LLMObs
@@ -36,7 +36,9 @@
     """Get the base URL for the LLM Observability UI."""
-    if get_site() == "datadoghq.com":
+    site = get_site()
+    if site == "datadoghq.com":
         return "https://app.datadoghq.com"
-    elif get_site().endswith("datadoghq.com"):
-        return f"https://{get_site()}"
-    elif get_site() == "datad0g.com":
+    parsed_url = urlparse(f"https://{site}")
+    if parsed_url.hostname and parsed_url.hostname.endswith("datadoghq.com"):
+        return f"https://{site}"
+    elif site == "datad0g.com":
         return "https://dd.datad0g.com"
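To make the bypass concern concrete, the sketch below compares the original raw-string test with the hostname-based test on a value that carries the allowed domain in its path. Both function names are hypothetical, and the inputs are made-up examples, not values taken from the PR.

```python
from urllib.parse import urlparse

ALLOWED = "datadoghq.com"


def raw_suffix_check(site: str) -> bool:
    # The check CodeQL flagged: it looks at the whole string, path included.
    return site.endswith(ALLOWED)


def hostname_suffix_check(site: str) -> bool:
    # The suggested check: only the parsed hostname is compared.
    hostname = urlparse(f"https://{site}").hostname
    return bool(hostname) and hostname.endswith(ALLOWED)
```

For a value such as "evil.example/datadoghq.com", the raw check passes because the string ends with the allowed domain, while the hostname check fails because the parsed host is evil.example.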
Bootstrap import analysis
Comparison of import times between this PR and main.
Summary
The average import time in this PR is: 228 ± 2 ms.
The average import time in main is: 230 ± 2 ms.
The import time difference between this PR and main is: -2.17 ± 0.08 ms.
Import time breakdown
The following import paths have shrunk:
else:
    record = self._data[index].copy()
    record.pop("record_id", None)
    return record
Checklist
Reviewer Checklist